Deep Learning in the Wild

deepset: Shape the future of NLP

with Malte Pietsch

Can you introduce yourself and the company?

My name is Malte, I am CTO and co-founder of deepset. We are a company in the NLP space, working mostly on the open source project Haystack, which helps developers build semantic search pipelines.

How did you get started in the field of machine learning and deep learning in particular?

I studied a mix of computer science and statistics, so I think it was kind of inevitable that I ended up doing machine learning, because I have always really liked this mix of quantitative methods, engineering, and solving real-world problems.

I was interested in physics and moved more and more towards computer science. The one key event was spending a year at Carnegie Mellon University, where I soaked in the atmosphere on campus and witnessed deep learning on the rise, conquering computer vision tasks one after the other. I was working on time series data from the health care domain at the time and figured out that one of the most important signals we got was from the doctors' notes, not the original time series data, and that is how I got curious about NLP.

Did you get your start in NLP at the same time as you did with deep learning?

Basically, we started with very simple techniques, and even those yielded some very interesting results. This piqued my curiosity: we could apparently get that far with simple techniques such as bag-of-words and word2vec. I also saw what happened on the computer vision side and was convinced that deep learning was the future for NLP as well.

How did you leverage your research experience into your work at deepset?

The research I did sparked my interest in NLP, but I decided not to pursue academic research, because I really wanted to see NLP in production, so I started working for a couple of startups in Berlin. 
At some point we decided to build deepset. On the one hand, we saw that NLP was on the rise (2017) and could see early things working: transfer learning, for instance, had just come out, BERT and transformers came out a year later, and things got easier.
We also saw a gap between research and industry. When I talked to peers from industry and told them what we were doing and how we applied deep learning in production, we were met with a lot of skepticism that this would work at their scale, so they stuck with methods that had come out a decade earlier.
We thought there had to be something to lower the threshold for production deployment of deep learning methods. At the time it took a lot of engineering resources to make these models stable and fast.
But we believed in the power of NLP and in our ability to build tooling that bridges the gap from research to production, and that is why we started deepset.

Have you had success breaking into those industries with deep learning based methods? In my experience, as recently as a couple of years ago it proved fairly difficult to sell deep learning methods, as they were considered black boxes lacking interpretability.

For us it worked quite well, and things certainly became easier after transformers were released. We were able to use transfer learning to adapt to different industries, and that really started piquing people's interest. They started seeing the need for other models, especially for the use cases that we worked on, i.e. semantic search and question answering. Usually, when we talked to them, they had already experimented with deep learning techniques but were not really satisfied.
I would say this attitude has changed in the industry. The concerns right now are more about how to deploy these models in production, how to do MLOps, and how to adapt these models to their domain, rather than whether they should go for deep learning at all.

You maintain an open source project called Haystack, what is Haystack?

Haystack is a Python framework for developers to build semantic search pipelines. It allows you to build information extraction pipelines in a very modular way, from core building blocks that you can stick together to power full search pipelines. One of these building blocks can be, for instance, a retriever model that identifies the documents in a larger set that are relevant to your query, or a question answering model that you then connect to it, so that when a search query comes in you can define very granularly how it should flow through these blocks.
It is also tightly integrated with Hugging Face, so you can plug in models through Hugging Face as well. We use it in a more specialized area of NLP, semantic search, to cover everything end to end, from labelling to picking the right model to exposing the pipeline through a REST API.
So with Haystack we are able to cover the whole range of a production-grade pipeline, not just the neural network part of the model.
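
To make the idea of connectable building blocks more concrete, here is a minimal sketch in plain Python. It deliberately does not use Haystack's actual API: the names `Document`, `keyword_retriever`, `sentence_reader` and `Pipeline` are hypothetical, and simple word overlap stands in for the neural retriever and question answering models described above.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Document:
    doc_id: str
    text: str


def tokens(text: str) -> Set[str]:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def keyword_retriever(query: str, docs: List[Document], top_k: int = 2) -> List[Document]:
    """Toy retriever (assumption: word overlap instead of a neural model)."""
    query_words = tokens(query)
    scored = [(len(query_words & tokens(d.text)), d) for d in docs]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


def sentence_reader(query: str, docs: List[Document]) -> Dict[str, object]:
    """Toy reader: return the sentence that shares the most words with the query."""
    query_words = tokens(query)
    best: Dict[str, object] = {"answer": None, "doc_id": None, "score": 0}
    for d in docs:
        for sentence in re.split(r"(?<=[.!?])\s+", d.text):
            score = len(query_words & tokens(sentence))
            if score > best["score"]:
                best = {"answer": sentence, "doc_id": d.doc_id, "score": score}
    return best


class Pipeline:
    """Connects a retriever block to a reader block; either can be swapped out."""

    def __init__(self, retriever: Callable, reader: Callable):
        self.retriever = retriever
        self.reader = reader

    def run(self, query: str, docs: List[Document], top_k: int = 2) -> Dict[str, object]:
        candidates = self.retriever(query, docs, top_k)  # narrow down the document set
        return self.reader(query, candidates)            # extract an answer from the candidates
```

Because the pipeline only depends on the call signatures of its blocks, replacing the toy retriever with a dense one, or the toy reader with a real question answering model, would not change `Pipeline.run`.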

We noticed in our own client projects that while there is tooling for tasks such as document indexing and preprocessing, it is still a lot of work to connect them into a coherent whole, especially if you are working with domain experts for additional input such as labelling. That is what we set out to simplify.

Is your business case then mostly built around consulting or do you have productized services on top of Haystack?

We do not do consulting. We do integration work around open source with clients, say custom deployments, and help them with the few pieces that are missing. What we see more and more is that we are approached about solving the same kinds of problems, and that is why we are building out an enterprise platform / SaaS offering. It is not a traditional open core model with premium features on top of open source that you have to pay for, but rather a complementary platform that focuses on the workflows and the integration with domain experts. If you are interested in question answering, you first need to understand whether it works in your domain, and you need to come up with a good dataset, some questions and some answers, and for that you need domain experts from other departments.
Say you are in the pharma or finance industry: our product would bridge the gap between the data scientists and those domain experts, from early prototyping to production and monitoring of the system.

What kind of tech stack would someone in a deep learning engineering role work with at deepset?

The majority of the code base is Python, with some minor things in Rust. From a modelling perspective we are big fans of PyTorch, especially for training and experimentation; for deployment we sometimes use other runtimes, ONNX for example.
We use FastAPI for the API layer, and Kubernetes and AWS for deploying at scale. Lately we have been experimenting a little with Ray, a framework for distributed applications that works especially well with machine learning pipelines.

What have been some technical challenges you have had to overcome?

One big challenge we are constantly working on is how to make question answering scalable, mostly from a technical perspective in terms of speed and large datasets. We tackle that mostly with a mix of scouting the research that is out there and hands-on experimentation. Facebook's research on open domain QA in particular proved very useful for designing an efficient scaling mechanism.
But research is always just a starting point for generating good ideas; the hard part is then devising practical, stable and efficient ways to implement them. That is where benchmarking comes into play. We have a whole suite of models that we benchmark, so that we can identify bottlenecks in the pipeline to tweak and optimize, but also compare the different building blocks that we have in Haystack.
We are also striving to make benchmarking a more integral part of our development workflow, so we started looking into CML to integrate with our GitHub workflow and include benchmarks automatically with each pull request, which allows us to track whether something improved or deteriorated.
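
The per-pull-request check described here could look roughly like the following sketch. It is an illustration under stated assumptions, not deepset's actual tooling: the function names, the metric names and the 1% relative tolerance are all hypothetical, and how the report is posted to the pull request (for example by a CI tool such as CML) is left out.

```python
from typing import Dict


def compare_benchmarks(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    higher_is_better: Dict[str, bool],
    tolerance: float = 0.01,  # assumption: relative changes within 1% count as noise
) -> Dict[str, str]:
    """Label each baseline metric as 'improved', 'regressed', 'unchanged' or 'missing'."""
    report: Dict[str, str] = {}
    for name, base in baseline.items():
        if name not in candidate:
            report[name] = "missing"
            continue
        # Relative change with respect to the baseline run
        change = (candidate[name] - base) / base if base else 0.0
        if abs(change) <= tolerance:
            report[name] = "unchanged"
        elif (change > 0) == higher_is_better[name]:
            report[name] = "improved"
        else:
            report[name] = "regressed"
    return report


def has_regression(report: Dict[str, str]) -> bool:
    """True if any metric got worse or disappeared, e.g. to fail a CI job."""
    return any(status in ("regressed", "missing") for status in report.values())
```

Treating latency as "lower is better" and accuracy-style metrics as "higher is better" in one table keeps speed and quality regressions visible side by side.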
Another kind of challenge stems from our SaaS offering, where we are attempting to simplify the adaptation of semantic search or question answering to particular business use cases and industrial domains. A lot of that is related to the question of how to implement efficient transfer learning, but also to how you can simplify these workflows and, on a more basic level, how you can measure success: what are good metrics for such a search model?
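
As a point of reference for the metrics question, the sketch below shows a few measures that are commonly used for retrieval and extractive question answering: recall at k, reciprocal rank, and token-level F1. These are offered as illustrations, not as the metrics deepset settled on; the function names and the simple whitespace tokenization are assumptions.

```python
from collections import Counter
from typing import List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Share of the relevant document ids that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def token_f1(prediction: str, truth: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer string."""
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    # Multiset intersection counts each shared token at most as often as it occurs in both
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

Averaging `reciprocal_rank` over a set of labelled queries gives the mean reciprocal rank. None of these measures captures whether a domain expert would actually accept the answer, which is part of why the question of measuring success remains open.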

What excites you most about the future of deep learning? Where would you like to see the field going?

I am most excited about NLP, which is also why we focused on it. We believe that language is ultimately the API of the human, yet we still need to rely on messy workarounds such as wizards or certain query languages, because you cannot talk directly to the computer.
I am mostly looking forward to how information access will change with deep learning as chatbots and search come closer together, with search becoming more conversational and more about context and semantics. Eventually, natural conversational chatbots could go beyond the one hundred intents that you define and focus more on the long tail as you start integrating dense retrieval techniques, QA and so on.
So I am curious to see how these parts come together. Technically, I am most excited about generative models such as generative Q&A, which hold a lot of promise, although there are still some drawbacks. For instance, it is still not clear how to evaluate the merit of an answer to a question. Right now, this seems to be the biggest blocker for industry. If you can find a method that ties an answer to certain facts that can be checked and verified, that would lend it some trustworthiness.

How can someone without a PhD in deep learning break into the industry and work in research? What do you look for on a candidate’s resume?

One way to stand out is definitely open source contributions to relevant projects. This is very easy to verify and a transparent way to show excitement about and commitment to the field. It really demonstrates whether deep learning is just an interest that you occasionally read about or whether you are dedicated enough to actually build something. When I look at a resume, I try to find out whether this is a person who is more on the theoretical side, following the latest research and trying out new models, or someone who is more practically inclined and wants to make things work.
The latter type of person has a problem to solve and finds the right methods to apply and a way to engineer around it and that is an exceedingly important property to have in industry.

What is the background of your team, how varied is it?

We have a wild mix of people, from computational linguists and classical linguists who studied Latin and Greek and eventually ended up coding, to PhDs who focused on NLP in academia, as well as people with a very strong engineering background who got interested in ML and NLP during their careers.
It is incredibly important to have people of diverse backgrounds that follow the same vision and come together to build it. 

What makes deepset a great place to work and what makes it stand out from other employers?

We work on some of the most interesting NLP and engineering problems. We want our solutions to run at scale, and that comes with a great need for innovation. Being on the front lines of NLP yet in close proximity to industry sets us apart from a lot of other places that focus on either research or an industry niche.
Secondly, there is our open source work: you see the impact of your work directly and you have other developers using it immediately. Last but not least, we are very close to industry, seeing problems as they arise when developers use our projects. We know what it takes to bring semantic search and Q&A to the next level.