There has been a lot of hype about GPT-3 since it appeared, and everybody is excited about its results. It doesn’t need fine-tuning for a specific task, which is quite an impressive result. It is also very good at writing poems, code, and SQL queries, all at the same time. Isn’t that amazing?
Still, one question remains: how does GPT-3 do this, and how did its creators achieve such results?
The answer is simple: it is huge, and it is trained against an enormous text corpus.
And the result is the best language model people have ever produced.
But do we really understand what this means?
A language model is nothing but a tool that calculates, for every word in the vocabulary, the probability of that word standing in the place of the missing word in a sentence. Nothing less, but also nothing more than that.
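To make that definition concrete, here is a minimal sketch of what “probability of a word in a given place” means. This is a toy bigram model built on a made-up six-word corpus, nothing remotely like GPT-3’s transformer architecture, but the core idea of a language model is the same: a probability distribution over the vocabulary.

```python
from collections import Counter, defaultdict

# Toy training corpus (made up for illustration).
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_probs(prev):
    """Probability of every observed word appearing right after `prev`."""
    counts = follows[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

# In this corpus, four different words follow "the", each once:
print(next_word_probs("the"))
# → {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
```

The probabilities always sum to one; the model has no notion of a word being “right”, only of it being more or less likely.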
That means that everything GPT-3 and the rest of the language models produce is simply the most probable text for the place where it is asked, based on the training text corpora. The keyword here is “probable”, not “exactly what is needed”.
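A hedged sketch of what “producing the most probable text” can look like: a toy bigram model (the corpus and function names below are invented for illustration; GPT-3 works at a vastly larger scale) that greedily appends the likeliest next word, one step at a time.

```python
from collections import Counter, defaultdict

# Made-up toy corpus for illustration.
corpus = "i like coffee in the morning and i like tea in the evening".split()

# Count which word follows which.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def continue_text(start, steps):
    """Greedily append the most probable next word, `steps` times."""
    words = [start]
    for _ in range(steps):
        counts = follows[words[-1]]
        if not counts:
            break  # no continuation observed in the training data
        words.append(max(counts, key=counts.get))
    return " ".join(words)

print(continue_text("i", 5))  # → "i like coffee in the morning"
```

Note that the output is fluent and plausible, yet it is nothing more than the statistically likeliest continuation of the training data; the model never checks whether it is the text you actually needed.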
GPT-3 writing poems? I bet they are beautiful.
But GPT-3 writing code and SQL queries? I doubt that will pass in a production environment. How about writing unit tests with GPT-3 for testing code written by GPT-3? What GPT-3 can produce here is the most probable ending of a started code segment or SQL query. It will “probably” be correct with respect to the programming/querying language syntax, and the compiler/interpreter will tell us if it is. But will it be semantically correct? All we can say is “maybe”.
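The gap between syntactically valid and semantically correct is easy to demonstrate. The function below is a hypothetical example (not actual GPT-3 output) of a plausible-looking completion: the interpreter accepts it without complaint, but only a semantic check such as a unit test reveals that it is wrong.

```python
def average(numbers):
    # A plausible-looking completion: syntactically valid Python,
    # but semantically wrong -- it divides by a hard-coded 2
    # instead of len(numbers).
    return sum(numbers) / 2

# The interpreter is perfectly happy: this compiles and runs.
result = average([2, 4, 6])

# Only a semantic check reveals the bug: we expect 4.0, but get 6.0.
print(result)         # → 6.0
print(result == 4.0)  # → False
```

A compiler or interpreter can only ever confirm the first half of correctness; the second half still needs a human, or a test the human wrote.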
GPT-3 is nothing more than a very successful random text generator. Let’s see what GPT-5 will be able to do.