Text Classification API

What is Text Classification?

Text classification is the process of assigning one or more categories to a block of text. Optionally, you can ask the AI to choose from a list of candidate categories that you provide beforehand.

Let's say you have the following block of text:

Perseverance is just getting started, and already has provided some of the most iconic visuals in space exploration history. It reinforces the remarkable level of engineering and precision that is required to build and fly a vehicle to the Red Planet.

Let's also say that you have the following categories: space, science, and food.

Now the question is: which of these categories best apply to this block of text? The answer is space and science, of course.

If you don't suggest any candidate categories, the AI will suggest the best category possible based on the data it was trained on.

Why Use Text Classification?

Text classification is useful in many situations. Here are a few examples.

Sort Incoming Messages

Are you flooded with incoming messages at work? Properly labelling these messages in advance can definitely make you more productive. For example, you could know in advance which messages are advertising and which ones are customer requests.

Detect Urgency

Some customer requests must be addressed as a priority. In that case, it is very useful to detect them in advance and address them right away.

Lead Qualification

Let's say you are looking for companies in the automotive field. You could scan websites and only keep those that have the "automotive" label applied.

Economic Intelligence

You might want to monitor new content from various sources and categorize it accordingly. Text classification is the right way to do so.

Text Classification with Hugging Face Transformers and GPT-J

Hugging Face Transformers is an amazing library that was released recently. It is based on either PyTorch or TensorFlow, depending on the model you're using. Transformers have clearly helped deep learning Natural Language Processing make great progress in terms of accuracy. However, this accuracy improvement comes at a cost: transformer models are extremely demanding in terms of resources.

Hugging Face also hosts a central repository gathering the newest open-source transformer-based Natural Language Processing models. Two of them, Facebook's Bart Large MNLI (for English) and Joe Davison's XLM Roberta Large XNLI (for non-English languages), are well suited to text classification in many languages.
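
As a rough illustration, the space, science, and food example from the beginning of this article could be reproduced with the zero-shot classification pipeline from Transformers and the Bart Large MNLI model. This is only a minimal sketch: it assumes the transformers package and a PyTorch or TensorFlow backend are installed, the helper names (build_classifier, select_labels, classify) and the 0.5 threshold are illustrative choices and not part of any library, and the first call downloads a large model.

```python
from typing import Dict, List


def build_classifier(model_name: str = "facebook/bart-large-mnli"):
    """Load a zero-shot classification pipeline (downloads the model on first use)."""
    # Imported lazily so select_labels() stays usable without transformers installed.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_name)


def select_labels(result: Dict[str, List], threshold: float = 0.5) -> List[str]:
    """Keep the labels whose score reaches the threshold.

    The 0.5 default is an arbitrary, illustrative cut-off.
    """
    return [
        label
        for label, score in zip(result["labels"], result["scores"])
        if score >= threshold
    ]


def classify(text: str, candidate_labels: List[str]) -> List[str]:
    """Score every candidate label for the text and return the ones that apply."""
    classifier = build_classifier()
    # multi_label=True scores each label independently, so several can apply at once.
    result = classifier(text, candidate_labels=candidate_labels, multi_label=True)
    return select_labels(result)
```

Calling classify(text, ["space", "science", "food"]) with the Perseverance paragraph would run the model locally. For non-English text, the same sketch should work by passing the identifier of the XLM Roberta Large XNLI model (joeddav/xlm-roberta-large-xnli on the Hugging Face repository at the time of writing) to build_classifier.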

For more advanced results, it is possible to perform text classification with GPT-J too. It gives great results, even when no input labels are provided.

Text Classification Inference API

Building an inference API for text classification is a necessary step as soon as you want to use text classification in production. But keep in mind that building such an API is not necessarily easy. You need to code the API (the easy part), but you also need to build a highly available, fast, and scalable infrastructure to serve your models under the hood (the hardest part). Machine learning models consume a lot of resources (memory, disk space, CPU, GPU...), which makes it hard to achieve high availability and low latency at the same time.
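
To give an idea of the "easy part", here is a minimal sketch of an HTTP endpoint written with the Python standard library only. Everything in it is an assumption made for illustration: the JSON shape ("text" and "labels" in, "labels" and "scores" out), the function names, and above all the classify placeholder, which only checks whether a label appears in the text and stands in for a real model. It does not address the hard part (availability, scaling, GPU serving).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, List, Tuple


def classify(text: str, labels: List[str]) -> Dict[str, List]:
    """Placeholder model: scores a label 1.0 if it appears in the text, else 0.0.

    A real service would call a loaded transformer model here instead.
    """
    lowered = text.lower()
    scores = [1.0 if label.lower() in lowered else 0.0 for label in labels]
    return {"labels": labels, "scores": scores}


def handle_payload(raw_body: bytes) -> Tuple[int, dict]:
    """Validate a JSON request body and return (HTTP status, response dict)."""
    try:
        payload = json.loads(raw_body)
        text, labels = payload["text"], payload["labels"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected JSON with 'text' and 'labels'"}
    if not isinstance(text, str) or not isinstance(labels, list):
        return 400, {"error": "'text' must be a string and 'labels' a list"}
    if not all(isinstance(label, str) for label in labels):
        return 400, {"error": "every label must be a string"}
    return 200, classify(text, labels)


class ClassificationHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the announced number of bytes, then answer in JSON.
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_payload(self.rfile.read(length))
        encoded = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(encoded)))
        self.end_headers()
        self.wfile.write(encoded)


def serve(port: int = 8000) -> None:
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ClassificationHandler).serve_forever()
```

Keeping the validation in handle_payload, separate from the HTTP handler, makes the request logic testable without opening a socket. A single-process server like this one is only suitable for local experiments.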

Using such an API is attractive because it is completely decoupled from the rest of your stack (microservice architecture), so you can easily scale it independently and ensure high availability of your models through redundancy. An API is also the way to go in terms of language interoperability. Most machine learning frameworks are developed in Python, but you will likely want to access them from other languages like JavaScript, Go, Ruby... In such situations, an API is a great solution.
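
Because the model sits behind plain HTTP and JSON, a client only needs an HTTP library, whatever the language. The sketch below uses Python for consistency, but the same request could be sent from JavaScript, Go, or Ruby. The URL, the token header format, and the JSON field names are hypothetical placeholders, not the endpoint of any real service; check the documentation of the API you are calling for the actual values.

```python
import json
import urllib.request
from typing import List

# Hypothetical endpoint, used only for illustration.
API_URL = "https://api.example.com/v1/classification"


def build_request(
    text: str, labels: List[str], token: str, url: str = API_URL
) -> urllib.request.Request:
    """Build a JSON POST request carrying the text and the candidate labels."""
    body = json.dumps({"text": text, "labels": labels}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # The "Token ..." scheme is an assumption; APIs differ on this point.
            "Authorization": f"Token {token}",
        },
        method="POST",
    )


def classify_remote(
    text: str, labels: List[str], token: str, timeout: float = 10.0
) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    request = build_request(text, labels, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Nothing on the client side depends on the machine learning framework used by the server, which is the interoperability benefit described above.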

NLP Cloud's Text Classification API

NLP Cloud offers a text classification API that lets you perform text classification out of the box, based on Facebook's Bart Large MNLI model and Joe Davison's XLM Roberta Large XNLI model from Hugging Face Transformers, and on GPT-J, with excellent performance. The response time (latency) is very good for these models. You can use these pre-trained models, train your own models, or upload your own custom models!

For more details, see our documentation about text classification here.

Testing text classification locally is one thing, but using it reliably in production is another. With NLP Cloud, you can do both!

As with all our Natural Language Processing models, you can use text classification for free, up to 3 API requests per minute.
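
If you script calls against a limit of 3 requests per minute, a small client-side throttle can help you stay under it. The following is a minimal sketch under stated assumptions: the RequestThrottle class is a hypothetical helper, not part of any NLP Cloud client, and it assumes the limit behaves like a rolling 60-second window, which may not match how the server actually counts requests.

```python
import time
from collections import deque
from typing import Callable, Deque


class RequestThrottle:
    """Client-side throttle: at most `max_calls` calls per `period` seconds."""

    def __init__(
        self,
        max_calls: int = 3,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._calls: Deque[float] = deque()

    def wait(self) -> float:
        """Block until a call is allowed and return the number of seconds waited."""
        now = self._clock()
        # Drop timestamps that have left the rolling window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        waited = 0.0
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            waited = self.period - (now - self._calls[0])
            self._sleep(waited)
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
        return waited
```

Calling throttle.wait() right before each API request is enough: the first three calls return immediately, and later ones pause until the window frees up.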