Using prompt-based methods with Transformers holds great promise for businesses lacking large quantities of labelled data and training resources. However, fundamental weaknesses currently prevent their safe and successful use in real-world applications.
Natural language processing (NLP) and large language models are helping businesses break new barriers in analytics, process automation and business transformation. Emerging applications like AI language generation and question answering have advanced leaps and bounds with the advent of models like GPT-3, even if businesses are still searching for appropriate real-world use cases.
Transformers - a revolutionary machine learning architecture - have been the not-so-secret sauce behind these advances, greatly enhancing the performance of NLP models, especially on large data sets. Yet nothing in AI ever remains static. The research community is constantly finding new ways to enhance Transformers and get more value from them through faster, more efficient training.
The traditional approach has been to take a pre-trained Transformer language model and fine-tune it on the specific task desired. More recently, however, prompt-based learning - or prompting - has been rising in popularity.
Our resident research scientist Harshil Shah has written a great piece on the state of prompt-based learning, covering its advantages and disadvantages and offering guidance on whether or not it’s the right training method for your business or use case.
What is prompt-based learning?
Prompt-based learning is an emerging group of ML model training methods. In prompting, users directly specify the task they want completed in natural language for the pre-trained language model to interpret and complete. This contrasts with traditional Transformer training methods where models are first pre-trained using unlabelled data and then fine-tuned, using labelled data, for the desired downstream task.
A prompt is essentially an instruction written in natural language by the user for the model to execute or complete. Depending on the complexity of the task being trained for, several prompts may be required. Creating the best prompt, or series of prompts, for the desired task is a process known as ‘prompt engineering’.
For example, if you wanted to use prompt-based learning to train a model to accurately analyse the sentiment of the hotel review ‘no reason to stay’, you could add a prompt to the sentence to get ‘no reason to stay. It was [blank]’. You would reasonably expect the model to generate ‘terrible’ more often than ‘great’.
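The hotel review example above can be sketched in code. The snippet below is a minimal illustration of cloze-style prompting: a template wraps the review, a ‘verbaliser’ maps candidate fill-in words to sentiment labels, and the candidate the model scores highest wins. The `toy_log_probs` function is a hypothetical stand-in for a real language model’s scores; in practice you would query a pre-trained masked or causal language model instead.

```python
# Cloze-style prompting sketch for sentiment analysis.
# `toy_log_probs` is a hypothetical stand-in for a real LM's output;
# the template and verbaliser words are illustrative assumptions.

PROMPT_TEMPLATE = "{review} It was [blank]."

# Verbaliser: maps candidate words for the [blank] slot to labels.
VERBALISER = {"terrible": "negative", "great": "positive"}

def toy_log_probs(prompt: str, candidate: str) -> float:
    """Hypothetical scoring function. A real model would return the
    log-probability of `candidate` filling the [blank] slot."""
    scores = {
        ("no reason to stay.", "terrible"): -0.4,
        ("no reason to stay.", "great"): -2.1,
    }
    review = prompt.split(" It was")[0]
    return scores.get((review, candidate), -5.0)

def classify(review: str) -> str:
    """Build the prompt and return the label of the best-scoring word."""
    prompt = PROMPT_TEMPLATE.format(review=review)
    best = max(VERBALISER, key=lambda word: toy_log_probs(prompt, word))
    return VERBALISER[best]

print(classify("no reason to stay."))  # -> negative
```

Because the model generates ‘terrible’ with higher probability than ‘great’, the verbaliser resolves the review to a negative sentiment without any task-specific fine-tuning.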
Why is prompt-based learning important?
Prompt-based learning has numerous advantages over the traditional pre-train, fine-tune paradigm. The biggest advantage is that prompting generally works well with small amounts of labelled data. With GPT-3, for example, it’s possible to achieve strong performance on certain tasks with only one labelled example.
Indeed, research has shown that a single prompt may be comparable to training with 100 conventional data points. This suggests that prompting could enable a massive advance in training efficiency, meaning less cost, less energy expended and faster time to value with AI models. This makes prompt-based learning a tantalising prospect for many businesses seeking to leverage and train their own NLP models.
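To make the one-labelled-example idea concrete, here is a minimal sketch of how a few-shot prompt is assembled: the labelled examples and the unlabelled query are concatenated into a single prompt for a generative model to complete. The `Review:`/`Sentiment:` format and the blank-line separator are illustrative assumptions, not a fixed standard.

```python
# Sketch of few-shot prompt construction. The field names and
# separator are assumptions chosen for readability; real prompt
# formats vary and are themselves a prompt-engineering decision.

def build_few_shot_prompt(examples, query):
    """Concatenate labelled (text, label) pairs and an unlabelled
    query into one prompt, ending where the model should continue."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

examples = [("Loved the room and the staff.", "positive")]
prompt = build_few_shot_prompt(examples, "no reason to stay.")
print(prompt)
```

A generative model given this prompt would be expected to continue it with a sentiment word, turning a single labelled example into a working classifier with no gradient updates at all.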
Should I use prompt-based learning?
Prompting has many advantages, but it isn’t quite the silver bullet that will solve all challenges associated with model training. In time, however, it will likely create its own paradigm in NLP development.
Prompting shows strong performance with a small number of labelled training examples. This could be highly beneficial in applications where there is very little structured and annotated data, or where there is a lack of subject matter experts available to train the model.
However, prompt-based methods are often prone to hallucination and to producing biased and offensive outputs. Furthermore, they often rely on hand-designed prompts, which in turn depend on time-consuming prompt engineering for their success. To date, these factors make them risky and inefficient to use in real-world business settings.
Prompt-based learning holds great potential, but we have so far only glimpsed its green shoots. Much work and research remains before it can become practical for real-world business use.
At Re:infer, we are actively working to make prompt-based methods safe and efficient to use. For our full, initial overview of prompting, its advantages, disadvantages and applications, read Harshil’s comprehensive, technical analysis of the topic.