Question Answering API

What is Question Answering?

Question answering means answering a question with the help of a context. The NLP model does not answer the question from its own knowledge. Instead, it uses a piece of text (the "context") that you pass along with your question, and looks in that text for the elements that answer it.

For example, imagine you want to ask the following question:

Who is the French president?

Let's also say that you have the following context:

French president Emmanuel Macron said the country was at war with an invisible, elusive enemy, and the measures were unprecedented, but circumstances demanded them.

The question answering model will return the answer, Emmanuel Macron, along with its likelihood (a score from 0 to 1) and the position of the answer in the context.
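To make "position of the answer in the context" concrete, here is a minimal sketch in Python. It does not run any model: it only locates the expected answer in the example context. The helper name `locate_answer` is hypothetical, and treating the position as a pair of character offsets is an assumption made for illustration.

```python
from typing import Tuple


def locate_answer(context: str, answer: str) -> Tuple[int, int]:
    """Return the (start, end) character offsets of `answer` inside `context`."""
    start = context.find(answer)
    if start == -1:
        raise ValueError("answer not found in context")
    return start, start + len(answer)


context = (
    "French president Emmanuel Macron said the country was at war with an "
    "invisible, elusive enemy, and the measures were unprecedented, but "
    "circumstances demanded them."
)

start, end = locate_answer(context, "Emmanuel Macron")
print(start, end)          # 17 32
print(context[start:end])  # Emmanuel Macron
```

Slicing the context with the two offsets gives back the answer text, which is handy if you want to highlight the answer in the original document.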

Why Use Question Answering?

Question answering is useful in many real-world applications. Here are a couple of examples.

Contract Questions

Chatbots are used more and more every day, both to answer customer questions and to answer questions from internal collaborators. Imagine that a customer asks a legal question about their contract. You could use a question answering model for that and pass the contract as the context.

Product Questions

Here's another chatbot-related example. Imagine that a collaborator has a technical question about a product. Why not provide them with a natural language interface and make their life easier?

Question Answering with Hugging Face Transformers

Hugging Face Transformers is an amazing library that was released recently. It runs on either PyTorch or TensorFlow, depending on the model you're using. Transformers have clearly helped deep learning NLP make great progress in terms of accuracy. However, this accuracy improvement comes at a cost: transformers are extremely demanding in terms of resources.

Hugging Face also hosts a central repository that gathers many of the newest open-source transformer-based NLP models. One of them, Deepset's Roberta Base Squad 2, is well suited for question answering.
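As an illustration, here is a minimal sketch of how this model could be queried locally with the Transformers `pipeline` helper. It assumes that `transformers` and a backend such as PyTorch are installed, and that the model is published on the Hugging Face hub as `deepset/roberta-base-squad2`. The helper names `load_qa_pipeline` and `ask` are hypothetical.

```python
from typing import Any, Callable, Dict

# Any callable that accepts question=... and context=... and returns a dict.
QaCallable = Callable[..., Dict[str, Any]]


def load_qa_pipeline(model_name: str = "deepset/roberta-base-squad2") -> QaCallable:
    """Load a question answering pipeline (the model is downloaded on first use)."""
    # Imported lazily so the rest of this file works without transformers installed.
    from transformers import pipeline

    return pipeline("question-answering", model=model_name)


def ask(qa: QaCallable, question: str, context: str) -> Dict[str, Any]:
    """Run one question against one context and keep the four usual result fields."""
    result = qa(question=question, context=context)
    return {key: result[key] for key in ("answer", "score", "start", "end")}


# Usage (needs the model download, so it is left commented out here):
# qa = load_qa_pipeline()
# print(ask(qa, "Who is the French president?",
#           "French president Emmanuel Macron said the country was at war."))
```

Keeping `ask` independent from the loading step means you can swap the real pipeline for a stub in tests, which matters because loading the model is slow and memory hungry.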

Question Answering API

Building an inference API for question answering is a necessary step as soon as you want to use question answering in production. But keep in mind that building such an API is not necessarily easy. First, you need to code the API (the easy part). Then, you need to build a highly available, fast, and scalable infrastructure to serve your models under the hood (the hardest part). Machine learning models consume a lot of resources (memory, disk space, CPU, GPU...), which makes it hard to achieve high availability and low latency at the same time.
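To give an idea of the "easy part", here is a minimal sketch of an HTTP endpoint built with the Python standard library only. The `/question`-style JSON payload (`question` and `context` fields), the function names, and the `Model` type are all hypothetical choices for this example. It runs a single process and does nothing about availability, speed, or scaling, which is precisely the hard part.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Callable, Dict, Tuple

# Any callable taking (question, context) and returning a JSON-serializable dict.
Model = Callable[[str, str], Dict[str, Any]]


def handle_question(model: Model, raw_body: bytes) -> Tuple[int, Dict[str, Any]]:
    """Validate the request body and run the model; return (HTTP status, payload)."""
    try:
        payload = json.loads(raw_body)
        question, context = payload["question"], payload["context"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected JSON with 'question' and 'context'"}
    return 200, model(question, context)


def make_handler(model: Model) -> type:
    """Build a request handler class that answers POST requests with `model`."""

    class QuestionHandler(BaseHTTPRequestHandler):
        def do_POST(self) -> None:
            length = int(self.headers.get("Content-Length", 0))
            status, payload = handle_question(model, self.rfile.read(length))
            body = json.dumps(payload).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return QuestionHandler


def make_server(model: Model, host: str = "127.0.0.1", port: int = 8000) -> HTTPServer:
    """Create (but do not start) a single-process server wrapping the model."""
    return HTTPServer((host, port), make_handler(model))
```

Calling `make_server(my_model).serve_forever()` would start serving, where `my_model` is whatever function wraps your question answering model.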

Leveraging such an API is very interesting because it is completely decoupled from the rest of your stack (microservice architecture), so you can easily scale it independently and ensure high availability of your models through redundancy. An API is also the way to go in terms of language interoperability. Most machine learning frameworks are developed in Python, but it's likely that you want to access them from other languages like JavaScript, Go, Ruby... In such a situation, an API is a great solution.

NLP Cloud's Question Answering API

NLP Cloud offers a question answering API that lets you perform question answering out of the box, based on Deepset's Roberta Base Squad 2 model from Hugging Face Transformers, with excellent performance. The response time (latency) is very good for this model.
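Here is a rough sketch of what a call to such an API could look like from Python, using only the standard library. The endpoint URL pattern, the `Token` authorization scheme, and the `question` and `context` field names are assumptions made for this example, and the function names are hypothetical, so check everything against the documentation before relying on it.

```python
import json
import urllib.request
from typing import Any, Dict

# ASSUMPTION: URL pattern of the endpoint; verify it in the NLP Cloud documentation.
API_URL = "https://api.nlpcloud.io/v1/roberta-base-squad2/question"


def build_question_request(token: str, question: str, context: str) -> urllib.request.Request:
    """Prepare (without sending) a POST request carrying the question and context."""
    body = json.dumps({"question": question, "context": context}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            # ASSUMPTION: token-based authorization header.
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask_nlp_cloud(token: str, question: str, context: str, timeout: float = 30.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response."""
    request = build_question_request(token, question, context)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

Because the request is plain HTTP with a JSON body, the same call can be written in JavaScript, Go, Ruby, or any other language with an HTTP client.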

For more details, see our documentation about question answering.

Testing question answering locally is one thing, but using it reliably in production is another. With NLP Cloud, you can do both!

As with all our NLP models, you can use question answering for free, up to 3 API requests per minute.
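If you script your calls on the free plan, you may want to throttle them on the client side. Below is a minimal sketch of a sliding-window limiter set to 3 calls per 60 seconds. The class name and its methods are hypothetical, and the way the server actually counts requests is an assumption, so treat this as a courtesy throttle rather than a guarantee.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any window of `period` seconds."""

    def __init__(
        self,
        max_calls: int = 3,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget calls that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self) -> None:
        """Register that a call is being made now."""
        self._calls.append(self.clock())
```

Before each request, call `time.sleep(limiter.wait_time())` and then `limiter.record()`, so that a fourth request within the same minute waits instead of being rejected.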