Text Classification API

What is Text Classification?

Text classification is the process of assigning one or several labels to a block of text.

Let's say you have the following block of text:

Perseverance is just getting started, and already has provided some of the most iconic visuals in space exploration history. It reinforces the remarkable level of engineering and precision that is required to build and fly a vehicle to the Red Planet.

Let's also say that you have the following labels: space, science, and food.

Now the question is: which of these labels best apply to this block of text? The answer is space and science, of course.

Why Use Text Classification?

Text classification is useful in many situations. Here are a few examples.

Sort Incoming Messages

Are you flooded with incoming messages at work? Properly labelling these messages in advance can definitely make you more productive. You could know in advance which messages are advertising and which ones are customer requests, for example.

Detect Urgency

Some customer requests must be addressed as a priority. It can be very useful to detect them as they arrive so that you can address them right away.

Leads Qualification

Let's say you are looking for companies in the automotive field. You could scan websites and only keep those that have the "automotive" label applied.

Economic Intelligence

You might want to monitor new content from various sources and categorize it accordingly. Text classification is the right way to do so.

Text Classification with Hugging Face Transformers

Hugging Face Transformers is an amazing open-source library. It runs on top of either PyTorch or TensorFlow, depending on the model you're using. Transformers have clearly helped deep learning NLP make great progress in terms of accuracy. However, this accuracy improvement comes at a cost: transformers are extremely demanding in terms of resources.

Hugging Face is also a central repository gathering many of the newest open-source transformer-based NLP models. Two of them, Facebook's Bart Large MNLI (for English) and Joe Davison's XLM Roberta Large XNLI (for non-English languages), are well suited to text classification in many languages.
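As a rough illustration, here is a minimal sketch of how the Perseverance example above could be classified with the Transformers zero-shot classification pipeline and the Bart Large MNLI model. The helper names (labels_above, classify) and the 0.5 threshold are assumptions made for this example, not part of any library, and the first call downloads a large model.

```python
from typing import Dict, List


def labels_above(result: Dict[str, list], threshold: float = 0.5) -> List[str]:
    """Keep the labels whose score reaches the threshold (0.5 is an arbitrary choice)."""
    return [
        label
        for label, score in zip(result["labels"], result["scores"])
        if score >= threshold
    ]


def classify(text: str, candidate_labels: List[str]) -> Dict[str, list]:
    """Run zero-shot classification with the facebook/bart-large-mnli model.

    The import is local so that labels_above stays usable without the
    transformers package installed.
    """
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
    # multi_label=True scores each label independently, so several labels can apply.
    return classifier(text, candidate_labels, multi_label=True)


# Example call (downloads the model weights on first use):
# result = classify(
#     "Perseverance is just getting started, and already has provided some of "
#     "the most iconic visuals in space exploration history.",
#     ["space", "science", "food"],
# )
# print(labels_above(result))
```

The pipeline returns the candidate labels together with one score per label, so the application decides which threshold makes a label "apply".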

Text Classification Inference API

Building an inference API for text classification is a necessary step as soon as you want to use text classification in production. But keep in mind that building such an API is not necessarily easy. First because you need to code the API (the easy part), but also because you need to build a highly available, fast, and scalable infrastructure to serve your models under the hood (the hardest part). Machine learning models consume a lot of resources (memory, disk space, CPU, GPU...), which makes it hard to achieve high availability and low latency at the same time.

Such an API is very useful because it is completely decoupled from the rest of your stack (microservice architecture), so you can easily scale it independently and ensure high availability of your models through redundancy. An API is also the way to go in terms of language interoperability. Most machine learning frameworks are developed in Python, but it's likely that you want to access them from other languages like JavaScript, Go, Ruby... In such a situation, an API is a great solution.
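To make the "easy part" concrete, here is a minimal sketch of a JSON-over-HTTP classification service built with the Python standard library only. Everything in it is an assumption made for illustration: the request shape ("text" and "labels"), the handler names, and above all the classify function, which is a toy keyword matcher standing in for a real transformer model. It says nothing about the hard part (availability, scaling, latency).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, List, Tuple


def classify(text: str, labels: List[str]) -> Dict[str, list]:
    """Hypothetical stand-in model: 1.0 if the label appears in the text, else 0.0."""
    lowered = text.lower()
    scores = [1.0 if label.lower() in lowered else 0.0 for label in labels]
    return {"labels": labels, "scores": scores}


def handle_request(body: bytes) -> Tuple[int, bytes]:
    """Turn a raw JSON request body into (HTTP status, JSON response body)."""
    error = json.dumps({"error": "expected JSON with 'text' and 'labels'"}).encode()
    try:
        payload = json.loads(body)
        text, labels = payload["text"], payload["labels"]
    except (ValueError, KeyError, TypeError):
        return 400, error
    if not isinstance(text, str) or not isinstance(labels, list):
        return 400, error
    if not all(isinstance(label, str) for label in labels):
        return 400, error
    return 200, json.dumps(classify(text, labels)).encode()


class ClassificationHandler(BaseHTTPRequestHandler):
    """Accepts POST requests from any language that can speak HTTP and JSON."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_request(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port: int = 8000) -> None:
    """Serve on localhost until interrupted (single process, no redundancy)."""
    HTTPServer(("127.0.0.1", port), ClassificationHandler).serve_forever()
```

Keeping the request handling in a plain function (handle_request) separate from the HTTP handler makes it easy to test without opening a socket, and the model behind classify can be swapped without touching the clients.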

NLP Cloud's Text Classification API

NLP Cloud offers a text classification API that lets you perform text classification out of the box, based on two Hugging Face Transformers models, Facebook's Bart Large MNLI and Joe Davison's XLM Roberta Large XNLI, with excellent performance. The response time (latency) is very good for these models. You can either use these pre-trained models or upload your own transformer-based custom models!

For more details, see our documentation about text classification.

Testing text classification locally is one thing, but using it reliably in production is another. With NLP Cloud you can do both!

As with all our NLP models, you can use text classification for free, up to 3 API requests per minute.
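If you call the API from a script on the free plan, a small client-side throttle can help you stay under that limit. The sketch below is one possible approach, not an official client feature: the RateLimiter class and its parameters are hypothetical names chosen for this example, and how the service itself counts requests may differ.

```python
import time
from collections import deque
from typing import Callable, Deque


class RateLimiter:
    """Sliding-window limiter: at most max_calls per period (in seconds)."""

    def __init__(
        self,
        max_calls: int = 3,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock  # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._calls: Deque[float] = deque()

    def wait(self) -> float:
        """Block until a call is allowed, record it, and return the seconds slept."""
        now = self._clock()
        # Forget calls that have already left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        delay = 0.0
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            delay = self.period - (now - self._calls[0])
            self._sleep(delay)
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
        return delay
```

Call limiter.wait() right before each API request; the first three calls in a window go through immediately and the fourth waits until the oldest one is a minute old.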