Sep 21, 2023 9 min read prompt-engineering

Mastering Effective Prompting Part 1

This is the first part of a short series of articles I wrote about effective prompting. This part introduces a few concepts that will help you understand how LLMs work and their limitations. After that, I explain two very simple techniques that will make your chatbot give you more relevant information.

In part two, I will move on to more advanced techniques, namely Chain of Thought, prompt templating, and techniques I use to control the chat flow. Stay tuned for that one!

Introduction: How you ask matters

"Will AI steal my job?" This is a question people ponder all the time nowadays. My personal opinion is that AI won't take your job, but people who know how to use AI will. If your experience with an AI assistant like ChatGPT was "Meh, it does not really work that well," I can almost guarantee that the issue was in the prompt. Prompting is akin to "googling": the result will be suboptimal if you don't know what you are doing. Think about how you used Google search the first time and how you use it now. How has the way you search changed the results? My experience with prompting is that it matters a lot. In many ways, prompting AI effectively can be even more nuanced than crafting the perfect query on search engines like Google.

I apply LLMs in my work, which means I must be up to date with the latest research on LLM capabilities. The software I build heavily uses the technology, and a lot of the work is writing prompt templates. A month ago, I showed one of my friends how the software works. In his words, it "blew my mind." Another friend of mine had the same reaction a week later. Both people are highly technical and have tried and used AI assistants like chatbots and coding assistants. Capabilities of my product aside, what was extraordinary for them were the prompts and interactions. I figured some things I was doing might not be intuitive, but seeing them in action made my friends reconsider using AI in their work.

In this series, I want to show you examples of prompting techniques that will elevate your experience from 'Meh' to 'Oh my god 🤯'

Why Short Questions to Chatbots Might Fail

I used ChatGPT as a search engine for a while, asking simple questions. With a search engine, you typically would not input a paragraph. Doing so often returns no matching results. So, search engines want you to be brief but specific. Indeed, when using a search engine, you are not asking questions. You are providing keywords to search for on the web. LLMs are a completely different story. LLMs and, by extension, chatbots benefit from the context and specific instructions you provide.

To understand why, I will give you a very high-level overview of how an LLM works, which hopefully will help you gain some intuition about how to use it and explain its shortcomings.

(Very) Brief Intro Into LLMs

First, an LLM does not see the text you give to the chatbot. It sees tokens. A token is simply a number that references a character like a, a word like apple, or part of a word like do in doing. These references make up the LLM's vocabulary, which is built from the training data. This is important because it explains, for example, why your chatbot does not handle spelling errors well, or why it struggles with manipulating made-up or misspelled words. It is also essential to understand that this vocabulary is derived automatically from data, and humans do not hand-pick it.
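
To make the idea of tokens concrete, here is a toy sketch in TypeScript. The vocabulary, its IDs, and the greedy matching rule are all made up for illustration; a real tokenizer has a vocabulary of many thousands of entries and uses a different algorithm. The sketch only shows that text becomes a list of numbers, and that a misspelled word falls apart into more, smaller pieces.

```typescript
// Hypothetical toy vocabulary: token text -> token ID. Not a real model's vocabulary.
const toyVocab = new Map<string, number>([
  ["apple", 1],
  ["do", 2],
  ["ing", 3],
  [" ", 4],
  ["a", 5],
  ["p", 6],
  ["l", 7],
  ["e", 8],
]);

// Greedy longest-match tokenizer: at each position, take the longest
// vocabulary entry that matches. Unknown characters map to ID 0.
function tokenize(text: string): number[] {
  const ids: number[] = [];
  let pos = 0;
  while (pos < text.length) {
    let matched = false;
    for (let len = text.length - pos; len > 0; len--) {
      const id = toyVocab.get(text.slice(pos, pos + len));
      if (id !== undefined) {
        ids.push(id);
        pos += len;
        matched = true;
        break;
      }
    }
    if (!matched) {
      ids.push(0); // unknown character
      pos += 1;
    }
  }
  return ids;
}

const known = tokenize("doing apple");      // [2, 3, 4, 1]
const misspelled = tokenize("doing appel"); // [2, 3, 4, 5, 6, 6, 8, 7]
```

The correctly spelled word is a single token, while the misspelled one becomes five single-character tokens, so the model has much less to "recognize".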

The way an LLM generates text is that it "simply" tries to predict the next token. It takes your prompt, turns it into tokens, and then tries to predict which number (token) comes after the ones you have provided. It does this one token at a time. I want to emphasize that it also considers the tokens it has already generated.
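
The generation loop can be sketched in a few lines. The "model" here is a hypothetical lookup table that only looks at the last word, which is nothing like a real LLM (a real model scores every token in its vocabulary using the whole context). What the sketch does show accurately is the loop itself: predict one token, append it to the context, and predict again.

```typescript
// Hypothetical stand-in for a model: maps the last word to the next word.
const nextWord = new Map<string, string>([
  ["the", "cat"],
  ["cat", "sat"],
  ["sat", "down"],
]);
const END = "<end>";

function predictNext(context: string[]): string {
  const last = context[context.length - 1] ?? "";
  return nextWord.get(last) ?? END;
}

// Generate one token at a time. Each generated token is appended to the
// context, so it influences every prediction that follows.
function generate(promptTokens: string[], maxNewTokens: number): string[] {
  const context = [...promptTokens];
  for (let i = 0; i < maxNewTokens; i++) {
    const token = predictNext(context);
    if (token === END) break;
    context.push(token);
  }
  return context;
}

const generated = generate(["the"], 10); // ["the", "cat", "sat", "down"]
```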

It does not have up-to-date information. The only things the model knows are a) what it was trained on (which can be a year or more out of date), b) what you have provided in your prompt, and c) what it has generated while answering your prompt.

The last thing I want to explain is the "Context Window." Simply put, you cannot give an LLM an unlimited prompt. The limit depends on the model and varies dramatically. Now that you know what a token is, you can more or less understand what it means when you read that GPT-4 has an 8,192-token limit. It means your prompt AND the model's answer (produced during what is called inference) cannot exceed that limit together. So, for the sake of this article, the "Context Window" is the absolute limit on the number of tokens an LLM can process at once.
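
Here is a small sketch of that arithmetic. It assumes roughly four characters per token, which is a common rule of thumb for English text and not an exact count; for real numbers you would use the tokenizer of the model you are calling. The function and constant names are mine, for illustration only.

```typescript
// Assumption: about 4 characters per token for English text (rule of thumb only).
const CHARS_PER_TOKEN = 4;
const CONTEXT_WINDOW = 8192; // the GPT-4 limit quoted above

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Tokens left for the answer once the prompt has used its share of the window.
function tokensLeftForAnswer(promptText: string, contextWindow: number = CONTEXT_WINDOW): number {
  return Math.max(0, contextWindow - estimateTokens(promptText));
}

const shortPrompt = "How do I get carrots?";            // 21 characters, about 6 tokens
const leftForAnswer = tokensLeftForAnswer(shortPrompt); // 8192 - 6 = 8186
```

A very long prompt leaves little or no room for the answer, which is why pasting in a huge document can result in a cut-off or rejected request.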

Key Takeaways:

LLMs (transformer-based) don't "see" text. They operate on tokens.
LLMs consider their own generated text.
LLMs do not have up-to-date information unless you provide it to them.
A prompt has a hard limit, and that limit includes the answer (inference) the LLM gives you.

The Art of Providing Context: Improving AI Communication

When creating a prompt, context is everything. The more relevant information you provide, the more relevant the answer you will receive. Let's look at an example of an interaction with ChatGPT. I began with a vague question: How do I get carrots?, which resulted in an equally ambiguous answer ranging from "online grocery shopping" to "foraging." Then, I narrowed the question, eventually getting precise information, including the names of stores where I actually buy carrots.

Here is the mental model for you. Imagine the chatbot to be one of those goofy, super smart characters like Sheldon from "The Big Bang Theory." It is clueless about your needs, and it does not care about you one bit. However, it likes puzzles, and unlike Sheldon, it really wants to answer the question. So, if you ask it an ambiguous question, it will answer it literally, sometimes resulting in laughter from the audience.

The more specific your question is, the more context you need to provide. You can also be specific about what you expect it to do, and in most cases, it will do it. Here is another example. First, I ask it to write a function without defining anything specific. So, the resulting code is in Python. For the second prompt, I gave precise instructions about the language, coding style, etc. It is also worth noting that it does not always do what you ask: in that example, I explicitly asked it not to show how to use the code, and it did so anyway. We will see how to fix that in the section on few-shot inference.

As mentioned before, the "Context Window" is limited. If you have ever had a very long chat with ChatGPT, you might have noticed that it "forgets" some information you gave earlier. I do not believe OpenAI ever shared details on how ChatGPT manages context, so I cannot tell you how it works. I can only assume they are using some summarization to hold a running summary of the chat when it rolls over the limit. However it is managed, you must be aware of this limit. If you need long conversations, restate the constraints you want to impose and the relevant context you want to give.
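
If you build your own chat on top of an LLM, one simple way to live within the limit is to always re-send your instructions and keep only the most recent messages that fit. The sketch below is my own illustration of that idea, not a description of how ChatGPT manages context (which, as noted, is not public). The type, the function names, and the four-characters-per-token estimate are all assumptions.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  text: string;
}

// Assumption: about 4 characters per token, a rough rule of thumb.
function roughTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the pinned instructions, then add messages from newest to oldest
// until the token budget is used up. Older messages are dropped.
function fitToWindow(
  instructions: string,
  chatHistory: ChatMessage[],
  tokenBudget: number,
): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = roughTokenCount(instructions);
  for (const message of [...chatHistory].reverse()) {
    const cost = roughTokenCount(message.text);
    if (used + cost > tokenBudget) break;
    kept.unshift(message);
    used += cost;
  }
  // Re-state the instructions first so they are never "forgotten".
  return [{ role: "user", text: instructions }, ...kept];
}

const chatHistory: ChatMessage[] = [
  { role: "user", text: "a".repeat(400) },      // about 100 tokens
  { role: "assistant", text: "b".repeat(400) }, // about 100 tokens
  { role: "user", text: "c".repeat(40) },       // about 10 tokens
];

// With a budget of 120 tokens, the oldest message no longer fits and is dropped.
const trimmed = fitToWindow("Answer in one sentence.", chatHistory, 120);
```

In a regular ChatGPT conversation you do the same thing by hand: paste your instructions again when the chatbot starts to drift.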

Key Takeaways:

Context is everything. The more specific you are, the more relevant the result.
Your chat with ChatGPT is endless, but the "Context Window" is not. "Remind" your chatbot what you want if your chat is long. Remember, all text counts, not only yours.
Bonus advice that follows from the previous point: you can and should control how the chatbot answers you. Here is an example of such a chat. In that example, I set the format first and then ask short questions, getting the answer in exactly the same format every time.

Prompting Techniques: Personas, Few-Shot Inference

While interacting with an LLM might feel like chatting with a friend, it is closer to programming. Just as developers employ certain techniques to optimize code, we can apply techniques to optimize our interactions with AI. Do not be alarmed. There is nothing about programming here, and you do not need to know how to code to use them.

Using Personas for Tailored Responses

A persona, in the context of interacting with an AI like ChatGPT, is a predefined character or role we assign to the chatbot. It is like giving the AI a temporary identity that shapes the nature of its responses.

I had to incorporate a company this year, and not being a lawyer myself made it challenging to go over all the paperwork my lawyer sent me for a signature. I do not suggest you use ChatGPT instead of a lawyer. Always consult with a professional for legal matters. In my case, I wanted to understand everything I was signing, and it usually takes me hours to go over a contract of a few pages, like an employment agreement. With incorporation, I had around 80 pages. It would have taken me a week, and by the middle of that work, I would have given up on the idea of entrepreneurship. Nothing is worth that much misery. So I asked ChatGPT to assume a persona and explain the text I was giving it.

For obvious reasons, I will not share that chat with you, but here is what my first message looked like:

You are a corporate lawyer in BC, Canada, specializing in tech startups. 
I am building a tech startup in BC, Canada. 
My lawyer already crafted the incorporation documents for me. I want to review and understand them. My background is in software engineering, and I have a very limited understanding of the legal system and lingo.
I will be giving you excerpts from the documents, and you will:
1. "Translate" it to me. Explain what it is about, explain any legal terms present in the text
2. Highlight any possible issues in the text that can put me or my company in trouble now or later.
...

As you might guess, this chat was very long, and ChatGPT kept "forgetting" what it was doing. All I had to do was paste this instruction again when that happened and keep going.

By assigning a persona, we give a chatbot a context and perspective. This can lead to more nuanced, tailored, and insightful answers that align closely with what we seek.
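
If you reuse personas often, the structure of such a message can be captured in a small helper. This is only a sketch: the Persona shape and the buildPersonaPrompt function are names I made up for illustration, and the example values are shortened from the message shown earlier.

```typescript
interface Persona {
  role: string;       // who the chatbot should be
  background: string; // who you are and what you need
  taskIntro: string;  // sentence that introduces the numbered tasks
  tasks: string[];    // what the chatbot should do with each input
}

// Assemble the persona message: identity first, then your context, then the tasks.
function buildPersonaPrompt(persona: Persona): string {
  const numberedTasks = persona.tasks.map((task, index) => `${index + 1}. ${task}`);
  return [
    `You are ${persona.role}.`,
    persona.background,
    persona.taskIntro,
    ...numberedTasks,
  ].join("\n");
}

const lawyerPrompt = buildPersonaPrompt({
  role: "a corporate lawyer in BC, Canada, specializing in tech startups",
  background:
    "I am building a tech startup in BC, Canada, and have a very limited understanding of legal lingo.",
  taskIntro: "I will be giving you excerpts from the documents, and you will:",
  tasks: [
    "Explain what the text is about and any legal terms in it.",
    "Highlight any possible issues that can put me or my company in trouble.",
  ],
});
```

Keeping the prompt in one place also makes it easy to paste it again when a long chat starts to lose track of the persona.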

Your personas can range from You are an experienced baker (yes, I use ChatGPT for recipes) to You are a software quality assurance engineer. You are thorough, you analyze requirements, always consider all the edge cases, and provide deep insights about user experience.

Of all the techniques I describe in this article, personas are the one I use in every situation. A persona does not have to be long. I might just say, "You are a TypeScript developer," so I know whom I am talking to 😉

Key Takeaways:

Start with a clear introduction of the persona.
Define the background, expertise, and any relevant limitations.
Outline the specific tasks or questions you have for the persona.
If necessary, remind the chatbot of the persona in extended conversations.

Few-Shot Inference: Guiding with Examples

In simple words, "Few-Shot Inference" is when you provide examples of the desired output. This technique is also known as "in-context learning" because the model picks up the pattern from the context you give it (we have already discussed the "Context Window"), without any retraining. You can treat "shot" as "example", so it really means "a few examples".

The GPT models used in ChatGPT are already very good at following instructions. However, even those models sometimes fail to follow your instructions exactly the way you want. You have seen an example above, where the chatbot failed to follow some of my instructions. Let us amend that prompt with "One-Shot Inference" to help it understand how we want the output to look. Here is the resulting chat for this prompt:

<same instructions as before>. For example:
User: Function that adds two numbers
AI:
/**
 * This function adds two numbers together.
 *
 * @param {number} num1 The first number.
 * @param {number} num2 The second number.
 * @returns {number} The sum of the two numbers.
 */
function addNumbers(num1: number, num2: number): number {
  return num1 + num2;
}

As a result, the response from the chatbot is precisely what I am looking for. You do not have to use "User" and "AI" to describe input and output. As long as it is clear from the context, the model will understand you.

The example in your prompt does not have to be complex or exhaustive. As you can see in my prompt, I gave it the dumbest example that makes sense, and it still figured it out. It needs to understand the structure you want, so if you fail to explain it or explaining it is too hard, give an example.

If what you are trying to do is more complex, try a few examples ("few shot"), but I would not recommend going over 2-3 with GPT models. In my experience, if it did not figure out what you want after three examples, adding more will just eat up the tokens. In such cases, you must use other techniques to achieve better results. It is also worth experimenting with examples to gain better intuition about how much information is enough.
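
The same structure can be assembled programmatically. The sketch below is a minimal illustration under my own assumptions: the Example type, the buildFewShotPrompt function, and the instruction text are hypothetical, and the default cap of three examples simply mirrors the recommendation above.

```typescript
interface Example {
  input: string;  // what the user would ask
  output: string; // the answer you want, in the exact format you want
}

// Build a prompt: instructions, then each example as a User/AI pair,
// then the real request with the "AI:" slot left open for the model.
function buildFewShotPrompt(
  instructions: string,
  examples: Example[],
  request: string,
  maxExamples: number = 3,
): string {
  const shots = examples
    .slice(0, maxExamples)
    .map((example) => `User: ${example.input}\nAI:\n${example.output}`);
  return [`${instructions} For example:`, ...shots, `User: ${request}`, "AI:"].join("\n");
}

const fewShotPrompt = buildFewShotPrompt(
  // Hypothetical instructions, standing in for the ones used in the chat.
  "Write a TypeScript function with a JSDoc comment. Do not show how to use it.",
  [
    {
      input: "Function that adds two numbers",
      output:
        "function addNumbers(num1: number, num2: number): number {\n  return num1 + num2;\n}",
    },
  ],
  "Function that reverses a string",
);
```

Ending the prompt with an open "AI:" line nudges the model to continue in the same shape as the example.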

At the end of the day, what better way is there to disambiguate something than providing an example? I think this technique is the most intuitive and the one that makes the chatbot obey your structure the most.

Key Takeaways:

Use simple, clear examples.
Limit the number of examples to 2-3.
If the AI struggles after multiple examples, consider rephrasing or simplifying the request.

Conclusion

Hopefully, the information here will help you realize that there is much more to interacting with your chatbot than just asking one-sentence questions.

In the next part, we will move on to more elaborate techniques: Chain of Thought, which provides more insight into how the model "thinks", and ways to reuse prompts. I will also demonstrate how you can assume complete control over a chat conversation.

Thank you for reading Part One of "Mastering Effective Prompting." Your feedback is valuable, so please let me know your thoughts. What techniques are you using when interacting with a chatbot?