ChatGPT was introduced in November 2022, sparking a wave of generative AI adoption. Before its release, mainly academics, researchers, and domain experts knew about GenAI and the myriad possibilities it brings. By August 2023, McKinsey reported that one-third of surveyed users had already used generative AI to enhance their work.
With the publication of the paper ‘Attention Is All You Need’ in 2017, the transformer architecture and its attention mechanism paved the way for a plethora of large language models (LLMs) developed by both research groups and companies, as shown in the figure; some of these are open sourced and available at https://huggingface.co/models.
Previously, users needed to learn various programming languages such as C, C++, Python, Java, or SQL to tell the computer what they wanted to do, which created a barrier to entry for anyone trying to use AI models.
Now users can simply state what they want to accomplish in natural language via an AI prompt. This breakthrough has been a turning point in adoption and is indeed helping democratize AI.
ChatGPT has gained enormous popularity, and practically everyone is now looking for ways to use it to improve their productivity. Its popularity stems from the fact that it can take natural language (NL) input from the user and return appropriate responses in a variety of formats such as natural language, code, etc.
For the adoption of these models to continue to grow, the output they generate has to be of acceptable quality. The quality of the responses that a GenAI application like ChatGPT returns depends not only on the underlying model but also on the prompts the user provides.
As end users, we have limited levers to change the underlying models (settings such as max response length, temperature, top-p, frequency penalty, and presence penalty), but we can shape the response by learning how to prompt the model effectively.
Users will often notice that the same prompt elicits different responses on different runs. To keep responses more or less consistent while experimenting with prompts, start by setting the model’s ‘temperature’ parameter to 0.
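As a concrete illustration, the settings above can be collected into a single request. This is a minimal sketch assuming an OpenAI-style chat-completions API; the model name and prompt are placeholders, not recommendations:

```python
# Sketch of the tunable request settings discussed above, assuming an
# OpenAI-style chat-completions API. Model name and prompt are placeholders.
request = {
    "model": "gpt-3.5-turbo",  # placeholder model name
    "messages": [
        {"role": "user", "content": "Summarize the text below in one sentence."}
    ],
    "max_tokens": 256,       # cap on the length of the response
    "temperature": 0,        # 0 => near-deterministic, repeatable output
    "top_p": 1,              # nucleus sampling; leave at 1 while tuning temperature
    "frequency_penalty": 0,  # values > 0 discourage verbatim repetition
    "presence_penalty": 0,   # values > 0 encourage introducing new topics
}
# The same dict would typically be passed as keyword arguments, e.g.:
# client.chat.completions.create(**request)
print(request["temperature"])
```

With temperature pinned at 0, differences between runs come mostly from the prompt itself, which makes prompt experiments easier to compare.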
Before we deep-dive into how to create a good prompt, let’s start by defining what a ‘prompt’ is. ‘Prompts’ are simply the instructions a user gives an application like ChatGPT to tell it what to do (for example, summarize a text or write a joke).
Next, the term ‘prompt engineering’ describes the process of iteratively refining prompts so that users of these applications can improve the quality of the generated responses.
Let’s also define a few additional terms that the user might encounter when using GenAI applications like ChatGPT and developing their own custom prompts.
While the output seems coherent in the majority of cases, one of the major issues with putting these models into production is that the generated output can be completely inaccurate or ungrounded (fabricated, with no citation). To prevent inaccurate responses, prompts can include grounding data that provides additional context.
To do so, an AI prompt engineer should include the contextual data in the prompt so that the model can use it to generate an appropriate output. Unlike earlier models like GPT-3.5, which had a context window of 4,096 tokens, the latest GPT-4 Turbo model has a rather large context window of 128K tokens (tokens are essentially pieces of words; a token is typically equivalent to about 3/4 of an English word).
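The 3/4-of-a-word rule of thumb can be turned into a rough budget check before sending grounding data to a model. This is only an approximation for illustration; exact counts require the model’s actual tokenizer (for example, the tiktoken library):

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4/3 tokens per English word."""
    words = len(text.split())
    return round(words * 4 / 3)

def fits_context(text: str, context_window: int = 128_000) -> bool:
    """Check whether text roughly fits within a model's context window."""
    return estimate_tokens(text) <= context_window

print(estimate_tokens("one two three"))  # 3 words -> ~4 tokens
```

A check like this helps decide whether grounding documents must be trimmed or chunked before they are placed into the prompt.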
Editing the system prompt can also help reduce model hallucinations and ungrounded responses.
For example: ‘You are an AI assistant that helps people find information; use only the information provided to generate a response.’ If needed, the user can add further text to prevent ungrounded responses, such as: ‘If there isn't enough information below, say you don't know. Do not generate answers that don't use the sources below.’
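Putting these pieces together, a grounded request can be assembled as in the sketch below; the helper name and example sources are illustrative, assuming a chat-style messages format:

```python
def build_grounded_messages(question: str, sources: str) -> list:
    """Assemble a chat 'messages' list that grounds the model in supplied sources.

    Illustrative helper: the system prompt combines the grounding instructions
    quoted above with the source text the model is allowed to use.
    """
    system_prompt = (
        "You are an AI assistant that helps people find information. "
        "Use only the information provided to generate a response. "
        "If there isn't enough information below, say you don't know. "
        "Do not generate answers that don't use the sources below.\n\n"
        f"Sources:\n{sources}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]

messages = build_grounded_messages(
    "When was the product launched?",
    "Internal FAQ: The product launched in March 2021.",
)
print(messages[0]["role"])  # system
```

The resulting list is what would be passed as the `messages` parameter of a chat-completions call, keeping the grounding rules and source text in the system role and the user’s question separate.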
Users must also bear in mind that these LLMs are pre-trained on large datasets (often open data from the internet and other sources) up to a specific point in time, so they lack knowledge of anything outside that training dataset. For example, if the training data runs through 2022 and the user asks about an event from 2023, the model tends to give inaccurate information unless additional data is provided. In that case, the user can add the more recent data to the context.
As time goes on, newer models and model generations will bring richer capabilities, but also unique quirks and trade-offs in cost and complexity. Often, the latest and more expensive GPT-4 models are not needed for a given use case, so it is worth spending some time comparing the output of GPT-3-class models against the newer ones before finalizing the model to be used in production.
Before putting these prompts into production systems, it is also good to be aware of the vast domain of ‘prompt hacking’ and how these systems need to be safeguarded against malicious users.
‘Prompt hacking’ describes a type of attack that exploits the vulnerabilities of LLMs by manipulating their inputs or prompts. Unlike traditional hacking, which typically exploits software vulnerabilities, prompt hacking relies on carefully crafted prompts that deceive the LLM into performing unintended actions; common forms include prompt injection, prompt leaking, and jailbreaking.
Prompt engineering as a domain is still evolving; it is a fresh and exciting field to explore. It presents an opportunity for data science experts and non-technical people alike by bringing natural language into human-machine interaction.
To learn more about common AI terminology, refer to https://news.microsoft.com/10-ai-terms and for more information about the mathematical principles behind these LLMs, refer to Generative AI for Beginners.