OpenAI's GPT-3 model gives great results, but it is not open-source and OpenAI's API is very expensive. GPT-Neo and GPT-J are two great open-source alternatives to GPT-3. How do they compare to GPT-3?
GPT-3 was released in May 2020 by the OpenAI research lab, based in San Francisco. As of this writing, it is the biggest Natural Language Processing model ever created, with 175 billion parameters! See GPT-3's website here.
It can be used for various Natural Language Processing use cases, but it especially excels at text generation: give the model a small piece of text and an expected text length, and let it generate the rest of the text for you. The stunning accuracy of GPT-3 opens up tons of new AI possibilities, like automatic marketing content creation, advanced chatbots, medical question answering, and much more. For more examples of what you can do with GPT-3, please see their documentation.
Problem: the model can only be used as a black box through an expensive paid API, and only by selected customers (new customers have to join a waitlist). GPT-3's source code is not public, and Microsoft holds an exclusive license to it.
OpenAI's API offers 4 GPT-3 models with different numbers of parameters: Ada, Babbage, Curie, and Davinci. OpenAI doesn't say how many parameters each model contains, but some estimates have been made (see here), and it seems that Ada contains roughly 350 million parameters, Babbage 1.3 billion, Curie 6.7 billion, and Davinci 175 billion.
The more parameters, the better the accuracy, but also the slower the model and the higher the price. Pricing is per token, and you can consider that 100 tokens are roughly equivalent to 75 words. OpenAI counts both the tokens you send in the input request and the tokens generated by the model. On Davinci, for example, the price for 1,000 tokens is $0.06.
Example: sending a piece of text made up of 1,000 tokens (roughly 750 words) and asking the model to generate 200 tokens will cost you $0.072. If you send 20 requests per minute to their API around the clock, it will cost you roughly $64,280 per month (counting a 31-day month)...
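The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this post, the Davinci price is the one quoted above, and the 31-day month and the round-the-clock request rate are assumptions, so the monthly total lands close to the figure quoted above rather than matching it to the dollar.

```python
# Davinci price per 1,000 tokens, as quoted in the text above.
DAVINCI_PRICE_PER_1K_TOKENS = 0.06


def request_cost(input_tokens, output_tokens, price_per_1k):
    """Cost of one request: input tokens and generated tokens are both billed."""
    return (input_tokens + output_tokens) / 1000 * price_per_1k


def monthly_cost(cost_per_request, requests_per_minute, days=31):
    """Cost of a steady request rate sustained 24 hours a day (assumed 31-day month)."""
    requests_per_month = requests_per_minute * 60 * 24 * days
    return cost_per_request * requests_per_month


per_request = request_cost(1000, 200, DAVINCI_PRICE_PER_1K_TOKENS)
per_month = monthly_cost(per_request, requests_per_minute=20)

print(f"Cost per request: ${per_request:.3f}")
print(f"Cost per month:   ${per_month:,.2f}")
```

Changing `days` to 30 or lowering `requests_per_minute` shows how quickly the bill moves with usage.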
GPT-Neo was released in March 2021, and GPT-J in June 2021, as open-source models, both created by EleutherAI (a collective of researchers working to open source AI).
GPT-Neo has 3 versions: 125 million parameters, 1.3 billion parameters (equivalent to GPT-3 Babbage), and 2.7 billion parameters. GPT-J has 6 billion parameters, which makes it the most advanced open-source Natural Language Processing model as of this writing. It is a direct equivalent of GPT-3 Curie.
The tests made on these models show great performance. When generating text with GPT-J, it is almost impossible to tell whether the text has been written by a human or a machine...
GPT-Neo and GPT-J are open-source Natural Language Processing models, so everybody can download them and use them. Well, in theory...
GPT-J, for example, needs around 25GB of RAM and many CPU cores to run. On CPUs, GPT-J is painfully slow though, so it is much better to perform inference with GPT-J on a GPU. As GPT-J needs around 25GB of GPU VRAM, it does not fit in most of the standard NVIDIA GPUs on the market today (which have either 8GB or 16GB of VRAM at most).
These hardware requirements make it very impractical to test GPT-J and GPT-Neo, let alone use them reliably for inference in production with high availability and scalability in mind.
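To get a feel for where a figure like 25GB comes from, here is a rough back-of-the-envelope sketch. It assumes the weights are stored as 32-bit floats (4 bytes per parameter), which the text above does not state, and it counts the weights only, ignoring activations and framework overhead. The function name is hypothetical.

```python
def weights_memory_gb(num_parameters, bytes_per_parameter=4):
    """Memory needed to hold the model weights alone, in GB (10^9 bytes).

    bytes_per_parameter=4 assumes 32-bit floats; this is an assumption,
    not something stated in the article.
    """
    return num_parameters * bytes_per_parameter / 1e9


GPT_J_PARAMETERS = 6_000_000_000  # "6 billion parameters", as stated above

gpt_j_gb = weights_memory_gb(GPT_J_PARAMETERS)
print(f"GPT-J weights: about {gpt_j_gb:.0f} GB")

# Compare against the GPU sizes mentioned in the text.
for vram_gb in (8, 16):
    print(f"Fits in a {vram_gb} GB GPU: {gpt_j_gb <= vram_gb}")
```

Under these assumptions the weights alone already come close to the 25GB mentioned above, before any memory is spent on the actual computation.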
So if you simply want to try GPT-J and GPT-Neo, or use them for real in your production application, we recommend that you use an existing Natural Language Processing API like NLP Cloud. As far as we know, NLP Cloud is the only API offering GPT-J as of this writing (see NLP Cloud's text generation API here)! You can also fine-tune GPT-J, so the model is perfectly tailored to your use case.
The GPT-3 API is very expensive. In contrast, NLP Cloud tried to make their GPT-J API as affordable as possible, despite the very high computation costs required on the server side. Let's do the math.
Imagine that you want to perform text generation with GPT-3 Curie. You want to pass an input of 1000 tokens and generate 200 tokens. You want to perform 3 requests per minute.
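Each request uses 1,200 tokens, Curie costs $0.006 per 1,000 tokens, and 3 requests per minute add up to 133,920 requests over a 31-day month. The price per month would be (1200/1000) x 0.006 x 133,920 = $964/month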
Now the same thing with GPT-J on NLP Cloud:
On NLP Cloud, the plan for 3 requests per minute on GPT-J costs $29/month on CPU or $99/month on GPU, no matter the number of tokens.
As you can see, the price difference is quite significant.
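For readers who want to plug in their own numbers, the comparison can be sketched as follows. The prices are the ones quoted above and may have changed since; the variable names and the 31-day month are assumptions made for this illustration.

```python
# Figures taken from the text above; they may no longer be current.
CURIE_PRICE_PER_1K_TOKENS = 0.006
FLAT_PLANS = {"CPU": 29, "GPU": 99}  # flat monthly prices in dollars

tokens_per_request = 1000 + 200         # input tokens + generated tokens
requests_per_month = 3 * 60 * 24 * 31   # 3 requests per minute, 31-day month

# Per-token pricing: cost grows with both request volume and text length.
curie_monthly = (
    tokens_per_request / 1000 * CURIE_PRICE_PER_1K_TOKENS * requests_per_month
)

print(f"GPT-3 Curie: ${curie_monthly:,.0f}/month")
for hardware, price in FLAT_PLANS.items():
    ratio = curie_monthly / price
    print(f"GPT-J on {hardware}: ${price}/month ({ratio:.1f}x cheaper)")
```

Because the flat plans do not depend on the number of tokens, the gap widens further as requests get longer.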
GPT-3 is an amazing model that really changed the Natural Language Processing game. For the first time in Natural Language Processing history, it's almost impossible to say whether the generated content comes from a human or a machine, which leads many companies to integrate GPT-3 into their products or their internal workflows.
However, sadly, GPT-3 is far from easily accessible... EleutherAI did a great job of designing open-source alternatives to GPT-3. GPT-J is the best of these alternatives as of this writing.
If you want to use GPT-J, don't hesitate to give it a try on the NLP Cloud API (try it here)!
CTO at NLPCloud.io