Effectively using GPT-J and GPT-Neo, the GPT-3 open-source alternatives, with few-shot learning

GPT-J and GPT-Neo, the open-source alternatives to GPT-3, are among the best NLP models as of this writing. But using them effectively can take practice. Few-shot learning is an NLP technique that works very well with these models.

GPT-J and GPT-Neo

GPT-Neo and GPT-J are both open-source NLP models, created by EleutherAI (a collective of researchers working to open source AI).

GPT-J has 6 billion parameters, which makes it the most advanced open-source NLP model as of this writing. It is a direct alternative to OpenAI's proprietary GPT-3 Curie.

These models are very versatile. They can be used for text generation, sentiment analysis, classification, machine translation... However, using them effectively sometimes takes practice.

GPT-J and GPT-Neo are both available on the NLP Cloud API. Below are examples obtained with the GPT-J endpoint of NLP Cloud, using cURL on the command line. If you want to copy and paste the examples, don't forget to add your own API token. Also, the prompts are shown with real line breaks for readability; you will need to replace them with \n for cURL to send them as a single JSON string.
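For example, here is a minimal, copy-pasteable sketch of such a request, with the prompt's line breaks already written as \n (the token and the short prompt are just placeholders):

curl "https://api.nlpcloud.io/v1/gpt-j/generation" \
    -H "Authorization: Token {your_token}" \
    -X POST \
    -d '{"text":"Message: I love your API, it is simple and so fast!\nSentiment:","max_length":50}'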

Few-Shot Learning

Few-shot learning consists of helping a machine learning model make predictions with only a couple of examples. There is no need to train a new model here: models like GPT-J and GPT-Neo are so large that they can easily adapt to many contexts without being re-trained.

Giving the model just a few examples dramatically increases its accuracy.

In NLP, the idea is to pass these examples along with your text input. See the examples below!

Sentiment Analysis with GPT-J

curl "https://api.nlpcloud.io/v1/gpt-j/generation" -H "Authorization: Token {your_token}" -X POST -d '{
    "text":"Message: Support has been terrible for 2 weeks...
            Sentiment: Negative
            ###
            Message: I love your API, it is simple and so fast!
            Sentiment: Positive
            ###
            Message: GPT-J has been released 2 months ago.
            Sentiment: Neutral
            ###
            Message: The reactivity of your team has been amazing, thanks!
            Sentiment:",
    "max_length": 100,
    "end_sequence": "###"
}'

Output:

{"generated_text":"Message: Support has been terrible for 2 weeks...
                Sentiment: Negative
                ###
                Message: I love your API, it is simple and so fast!
                Sentiment: Positive
                ###
                Message: GPT-J has been released 2 months ago.
                Sentiment: Neutral
                ###
                Message: The reactivity of your team has been amazing, thanks!
                Sentiment: Positive
                ###"}

As you can see, giving three properly formatted examples first leads GPT-J to understand that we want to perform sentiment analysis, and its prediction for the last message is correct.

### is an arbitrary delimiter that helps GPT-J tell the different sections apart. We could just as well use something else, like --- or simply a new line. We then set "end_sequence": "###", an NLP Cloud parameter that tells GPT-J to stop generating content after the next ###.
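For instance, here is a sketch of the same request with --- used as the delimiter instead; the only requirement is that end_sequence matches the delimiter used in the prompt:

curl "https://api.nlpcloud.io/v1/gpt-j/generation" -H "Authorization: Token {your_token}" -X POST -d '{
    "text":"Message: Support has been terrible for 2 weeks...
            Sentiment: Negative
            ---
            Message: I love your API, it is simple and so fast!
            Sentiment: Positive
            ---
            Message: The reactivity of your team has been amazing, thanks!
            Sentiment:",
    "max_length": 100,
    "end_sequence": "---"
}'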

HTML Code Generation with GPT-J

curl "https://api.nlpcloud.io/v1/gpt-j/generation" -H "Authorization: Token your_token" -X POST -d '{
    "text":"description: a red button that says stop
            code: <button style=color:white; background-color:red;>Stop</button>
            ###
            description: a blue box that contains yellow circles with red borders
            code: <div style=background-color: blue; padding: 20px;><div style=background-color: yellow; border: 5px solid red; border-radius: 50%; padding: 20px; width: 100px; height: 100px;>
            ###
            description: a Headline saying Welcome to AI
            code:",
    "max_length": 500,
    "end_sequence": "###"
}'

Output:

{"generated_text": "description: a red button that says stop
                    code: <button style=color: white; background-color: red;>Stop</button>
                    ###
                    description: a blue box that contains yellow circles with red borders
                    code: <div style=background-color: blue; padding: 20px;><div style=background-color: yellow; border: 5px solid red; border-radius: 50%; padding: 20px; width: 100px; height: 100px;>
                    ###
                    description: a Headline saying Welcome to AI
                    code: <h1 style=color: white;>Welcome to AI</h1>
                    ###"}

Code generation with GPT-J really is amazing. This is partly because GPT-J has been trained on huge code bases.

SQL Code Generation with GPT-J

curl "https://api.nlpcloud.io/v1/gpt-j/generation" -H "Authorization: Token your_token" -X POST -d '{
    "text":"Question: Fetch the companies that have less than five people in it.
            Answer: SELECT COMPANY, COUNT(EMPLOYEE_ID) FROM Employee GROUP BY COMPANY HAVING COUNT(EMPLOYEE_ID) < 5;
            ###
            Question: Show all companies along with the number of employees in each department
            Answer: SELECT COMPANY, COUNT(COMPANY) FROM Employee GROUP BY COMPANY;
            ###
            Question: Show the last record of the Employee table
            Answer: SELECT * FROM Employee ORDER BY LAST_NAME DESC LIMIT 1;
            ###
            Question: Fetch the three max employees from the Employee table;
            Answer:",
    "max_length": 500,
    "end_sequence": "###"
}'

Output:

{"generated_text":"Question: Fetch the companies that have less than five people in it.
                    Answer: SELECT COMPANY, COUNT(EMPLOYEE_ID) FROM Employee GROUP BY COMPANY HAVING COUNT(EMPLOYEE_ID) < 5;
                    ###
                    Question: Show all companies along with the number of employees in each department
                    Answer: SELECT COMPANY, COUNT(COMPANY) FROM Employee GROUP BY COMPANY;
                    ###
                    Question: Show the last record of the Employee table
                    Answer: SELECT * FROM Employee ORDER BY LAST_NAME DESC LIMIT 1;
                    ###
                    Question: Fetch the three max employees from the Employee table;
                    Answer: SELECT * FROM Employee ORDER BY ID DESC LIMIT 3;
                    ###"}

Automatic SQL generation works very well with GPT-J.

Question Answering with GPT-J

curl "https://api.nlpcloud.io/v1/gpt-j/generation" -H "Authorization: Token your_token" -X POST -d '{
    "text":"Context: NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage NLP in production.
            Question: When was NLP Cloud founded?
            Answer: 2021
            ###
            Context: NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.
            Question: What did NLP Cloud develop?
            Answer: API
            ###
            Context: The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.
            Question: Which plan is recommended for GPT-J?
            Answer:",
    "max_length": 500,
    "end_sequence": "###"
}'

Output:

{"generated_text":"Context: NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage NLP in production.
                    Question: When was NLP Cloud founded?
                    Answer: 2021
                    ###
                    Context: NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.
                    Question: What did NLP Cloud develop?
                    Answer: API
                    ###
                    Context: The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.
                    Question: Which plan is recommended for GPT-J?
                    Answer: GPU plan
                    ###"}

Question answering can also be achieved with dedicated NLP models, but perhaps not with the same level of accuracy.

Tutorial Creation with GPT-J

curl "https://api.nlpcloud.io/v1/gpt-j/generation" -H "Authorization: Token your_token" -X POST -d '{
    "text":"Here a tutorial how to make a cake.
        
            1. Take some flour.
            2. Take some sugar.",
    "max_length": 500
}'

Output:

{"generated_text":"Here a tutorial how to make a cake.
        
                    1. Take some flour.
                    2. Take some sugar.
                    3. Take some butter.
                    4. Take some eggs.
                    5. Take some water.
                    6. Take some baking powder.
                    7. Take some vanilla.
                    8. Mix all together.
                    9. Bake in a pan.
                    10. Enjoy.
                    
                    Well, that's it. You can make this for your birthday or a party or you can even make it for your kids. They will love this."}

As you can see, it is quite impressive how GPT-J automatically follows your initial formatting, and the generated content itself is also very good.

Spell Correction with GPT-J

curl "https://api.nlpcloud.io/v1/gpt-j/generation" -H "Authorization: Token your_token" -X POST -d '{
    "text":"I love goin to the beach.
            Correction: I love going to the beach.
            ###
            Let me hav it!
            Correction: Let me have it!
            ###
            It have too many drawbacks.
            Correction: It has too many drawbacks.
            ###
            I do not wan to go
            Correction:",
    "max_length": 500,
    "end_sequence": "###"
}'

Output:

{"generated_text":"I love goin to the beach.
                    Correction: I love going to the beach.
                    ###
                    Let me hav it!
                    Correction: Let me have it!
                    ###
                    It have too many drawbacks.
                    Correction: It has too many drawbacks.
                    ###
                    I do not wan to go
                    Correction: I do not want to go."}

Spell checking and correction work well!

Machine Translation with GPT-J

curl "https://api.nlpcloud.io/v1/gpt-j/generation" -H "Authorization: Token your_token" -X POST -d '{
    "text":"Hugging Face a révolutionné le NLP.
            Translation: Hugging Face revolutionized NLP.
            ###
            Cela est incroyable!
            Correction: This is unbelievable!
            ###
            Désolé je ne peux pas.
            Correction: Sorry but I cannot.
            ###
            NLP Cloud permet de deployer le NLP en production facilement.
            Correction",
    "max_length": 500,
    "end_sequence": "###"
}'

Output:

{"generated_text":"Hugging Face a révolutionné le NLP.
                    Translation: Hugging Face revolutionized NLP.
                    ###
                    Cela est incroyable!
                    Correction: This is unbelievable!
                    ###
                    Désolé je ne peux pas.
                    Correction: Sorry but I cannot.
                    ###
                    NLP Cloud permet de deployer le NLP en production facilement.
                    Correction: NLP Cloud makes it easy to deploy NLP to production.
                    ###"}

Machine translation works great too.

Tweet Generation with GPT-J

curl "https://api.nlpcloud.io/v1/gpt-j/generation" -H "Authorization: Token your_token" -X POST -d '{
    "text":"keyword: markets
            tweet: Take feedback from nature and markets, not from people
            ###
            keyword: children
            tweet: Maybe we die so we can come back as children.
            ###
            keyword: startups
            tweet: Startups should not worry about how to put out fires, they should worry about how to start them.
            ###
            keyword: NLP
            tweet:",
    "max_length": 500,
    "end_sequence": "###"
}'

Output:

{"generated_text":"key: markets
                    tweet: Take feedback from nature and markets, not from people
                    ###
                    keyword: children
                    tweet: Maybe we die so we can come back as children.
                    ###
                    keyword: startups
                    tweet: Startups should not worry about how to put out fires, they should worry about how to start them.
                    ###
                    keyword: NLP
                    tweet: People want a way to get the benefits of NLP without paying for it.
                    ###"}

A fun way to generate short tweets based on a keyword.

Conclusion

As you can see, few-shot learning is a great technique that helps GPT-J and GPT-Neo achieve almost any NLP task! The key is to pass a properly formatted context as part of your input text.

Even for simple text generation, it is recommended to pass as much context as possible, in order to help the model.

We hope you found this useful! If you have any questions about how to make the most of these models, please don't hesitate to ask us.

Julien Salinas
CTO at NLPCloud.io