Skip to main content

Prompt engineering is all about crafting inputs—whether text, code, or images—that guide AI models like OpenAI’s large language models or Stable Diffusion to produce specific, high-quality results. In this blog post, we’ll cover the basics, explore various techniques, and touch on challenges like prompt hijacking.

What is Prompt Engineering?

Think of prompt engineering as the art of writing the perfect instructions to get AI to do what you want. It’s an iterative process, similar to tweaking and refining data in machine learning, where you test and adjust until you get the right outcome.

Why Does It Matter?

Prompts help steer AI models, like the GPT series, by providing clear directions. Without them, models might produce errors or irrelevant results. Effective prompt engineering makes AI output more reliable and closer to what you actually need.

Examples

Let’s take some examples to understand the idea very well.

Text Generation: A small tweak in a prompt can change the outcome. For instance, asking GPT-3.5 “813 * 99” might give you a wrong answer, but adding a question mark shows you’re asking for a calculation, and the model gets it right (Figure 1).

Figure 1: Text Generation Prompting Example

Image Generation: Changing a prompt from “Strawberry Panda” to “Strawberry Panda Steampunk” can shift the style dramatically, as seen in Figures 2 and 3.

Figure 2: Strawberry panda (generated by Bing Image Creator)
Figure 3: Strawberry panda steampunk (generated by Bing Image Creator)

Prompt engineering isn’t a one-size-fits-all process—it’s a mix of creativity and strategy, tailored to the task and model you’re working with.

Basics of Prompt Engineering

Prompt engineering isn’t just about asking an AI model to do something; it’s about asking in the right way to get the best results. The quality of your prompt heavily influences the model’s output, and this blog post breaks down the essential elements that make up an effective prompt.

At its core, a prompt typically includes six key elements:

  • Instructions: What you want the model to do, like a task or question.
  • Primary Content: The main information the model should focus on.
  • Input Examples: Templates or formatting rules that help the model understand what you’re looking for.
  • Output Examples: Desired qualities of the output or specific templates for it to follow.
  • Cue: Context or hints that guide the model’s response.
  • Supporting Content: Additional information that can influence the output, different from the main content.
Figure 4 visually outlines the elements in a prompt.

When constructing a prompt, the sequence matters, especially with models like GPT. Clear instructions should come first, followed by other details, as this setup typically yields higher-quality results.

Practical Example:

Consider a prompt that asks for name suggestions for a new pet salon:

„Suggest three names for a new pet salon business. The names should evoke positive emotions and highlight features like professionalism, friendliness, and personalized service. Consider using rhymes, puns, or positive adjectives.“

This prompt starts with clear instructions, followed by primary content (the type of business), and includes cues about the desired tone and style.

The Process of Prompt Engineering

Prompt engineering is similar to building a machine learning model—it involves a lot of trial and error. You test different aspects of prompts, evaluate the results, and refine them as needed. This process, often referred to as “PromptOps,” is like MLOps but focuses on the operational aspects of prompt engineering, such as testing, evaluation, deployment, and monitoring.

Figure 5 illustrates the prompt engineering process.

To make prompt engineering more efficient, especially at scale, new tools like Prompt Flow and LangChain are emerging, helping to build and manage LLM-powered applications.

Evaluating and Refining Prompts

Once you’ve crafted a prompt, it’s essential to analyze the output and make adjustments. This can involve:

  • Adding or removing keywords: To guide the model towards more specific responses.
  • Rephrasing: To clarify the intent or focus.
  • Rearranging words: To improve grammatical structure and clarity.
  • Splitting prompts: To break down complex queries into more focused questions.
  • Adjusting model parameters: Tweaking settings like temperature or frequency penalty to refine the output.

Here are a few examples of prompt refinement in different areas:

Data Analysis:
Original: „Analyze sales data.“
Modified: „Generate a concise report detailing quarterly sales trends over the last two years, focusing on top-performing products.“

Email Drafts:
Original:
„Draft an email about the meeting.“
Rephrased: „Compose a professional email summarizing key decisions from the recent strategic planning meeting.“

For enterprises, the precision of prompts is crucial as they directly impact business outcomes. Carefully engineered prompts can extract valuable insights and drive better decision-making.

In-Context Learning and In-Context Prompting

In-context learning is a fascinating departure from traditional machine learning methods. Unlike standard ML models that require large datasets of labeled examples for training, in-context learning allows a model to pick up a new task using just a few examples provided in a prompt during inference. What’s particularly intriguing is that the model wasn’t specifically pre-trained to learn this way—it’s an emergent property that we don’t fully understand yet.

Traditional machine learning models often demand rigid prompt structures. If the input doesn’t match the expected format precisely, the model might fail to deliver the desired output. This rigidity was a common issue with many early chatbots before the advent of large language models (LLMs). In contrast, in-context learning allows a model to adapt quickly to new information or tasks using minimal examples, as illustrated in Figure 6.

Figure 6 shows an example of in-context learning.

This technique has several advantages over traditional ML approaches:

  • No Need for Labeled Data: In-context learning doesn’t require pre-labeled data, making it invaluable in situations where labeled data is scarce or costly.
  • Flexibility: It allows us to teach an LLM various tasks without the need for retraining.

For instance, suppose you want the model to convert temperatures from Celsius to Fahrenheit. By providing a few examples within the prompt, as shown in Figure 7, you can guide the model to perform the conversion correctly.

Figure 7 demonstrates in-context learning with a temperature conversion example.

When we discuss prompt engineering, we’re often talking about in-context prompting. This is a technique where we use prompts to direct the output of generative AI models. It involves giving the model a description of the desired task along with examples of the desired output.

While in-context learning and in-context prompting are closely related, they address different aspects:

  • In-context Learning: The model adapts to new tasks or information based on the provided context without needing extensive retraining.
  • In-context Prompting: The model understands and generates responses based on the provided context using flexible and natural inputs.

Both concepts hinge on the idea of context—one focuses on learning from it, and the other on generating responses based on it.

Prompt Engineering Techniques

Prompt engineering is a versatile tool that adapts across various model types. Depending on the model and API you use, the way you format your input may vary. For instance, when working with OpenAI’s GPT models, two main APIs support prompt engineering:

  • Chat Completion API: This API is used with GPT-3.5-Turbo and GPT-4 models. These models expect input as an array of dictionaries, simulating a chat-like transcript.
  • Completion API: This API is used with older GPT-3 models and takes input as a simple text string. While GPT-3.5-Turbo can also use this API, the Chat Completion API is recommended.

System Message

The Chat Completion API, used by newer models, is ideal for setting up context, instructions, examples, and cues through a system message. For instance, you can instruct the model to answer with „I don’t know“ when unsure or refuse to answer off-topic queries.

Here’s a simple example using a system message to limit the model’s conversation scope to pets:

Running this code will show how the model sticks to the given scope, as seen in Figure 8.

Figure 8: System message for prompt engineering

Entity Extraction to Structured Output

Building on this concept, you can also instruct the model to output information in a specific format, such as JSON. For example:

When you run this, you might see unwanted fields in the output. By refining the prompt to be more explicit, such as instructing the model not to add extra fields, you can achieve the desired structured output. This approach is demonstrated in Figure 9.

Figure 9: Entity extraction to structured output example

Zero-Shot, Few-Shot, and Many-Shot Learning

These concepts refer to the model’s ability to perform tasks with varying amounts of prior examples:

  • Zero-Shot Learning: The model performs a task without having seen any specific examples during training. For instance, translating a sentence from English to Spanish without prior translation examples.
  • Few-Shot Learning: The model is given a few examples of a task, helping it understand how to perform it. For instance, defining imaginary words after being given a couple of examples.
  • Many-Shot Learning: The model is provided with many examples, which could range from tens to hundreds, to improve its understanding of more complex tasks.

Use Clear Syntax

Clear syntax involves using proper punctuation, words, and formatting to enhance the model’s understanding. Here are some tips:

  • Clear Intent: Use explicit words and verbs, as if explaining to a child.
  • Structure: Include the format you want in the response, such as lists or JSON schemas.
  • Separators: Use symbols like „###“ or „—“ to distinguish different prompt sections.
  • Grammar: Pay attention to grammar and punctuation, which help the model recognize sentence boundaries.
  • Headings and Subheadings: Organize your prompt with headings and bullet points.

Table 1 shows examples of prompts with varying clarity.

TaskOriginal PromptBetter Prompt
Translate a sentence from English to FrenchTranslate thisTranslate the following English sentence into French: “…”
Summarize a news articleSummarize this articleWrite a summary of this news article’s main points and key details in three sentences or less. Use your own words.
Table 1: Example of Prompt clarity

Making In-Context Learning Work

In-context learning is the technique of providing the model with examples during inference to guide its understanding of the task. The structure, distribution, and format of these examples significantly affect the model’s performance. Even if the labels are not entirely accurate, how they are presented can make a difference.

Reasoning – Chain of Thought (CoT)

Chain of Thought (CoT) is a technique that improves a model’s reasoning by guiding it through a sequence of prompts that build upon each other. For example, when explaining a complex topic like photosynthesis, each prompt refines the model’s response, leading to a deeper understanding.

  • Zero-Shot CoT: Adding phrases like „Take a step-by-step approach“ in the prompt encourages the model to reason logically through a problem.

This method helps break down complex tasks into manageable steps, enhancing the model’s performance on tasks like question answering, translation, and code generation.

Image Prompting

Image prompting is a specific form of prompt engineering aimed at guiding an image generation model to produce a desired visual output. An image prompt typically consists of three main components:

  • Image Content: The subject or scene of the image, such as „a panda on a couch“ or „a city at sunset.“
  • Art Form and Style: The aesthetic appearance of the image, like „watercolor painting“ or „pixel art.“
  • Additional Details: Further specifications, such as „the panda is sleeping“ or „the city has a futuristic vibe.“

The general structure of an image prompt is as follows:

  • [Main subject of the image, description of action, state, mood]
  • [Art form, art style, artist references, if any]
  • [Additional settings, such as lighting, colors, framing]

For instance, consider the following prompt: „strawberry panda on mars, waving, happy mood.“ Using DALLE-3 to generate an image with this prompt might produce an image similar to the one shown in Figure 10 below.

Figure 10: Bing Create: strawberry panda on mars, waving, happy mood

By refining the prompt with more details, such as „strawberry panda on mars, waving, happy mood, earth in the distant background, realistic, colorful, 8k,“ the generated image becomes even more specific, as illustrated in Figure 11.

Figure 11: Bing Create: strawberry panda on mars, waving, happy mood, earth in the distant background, realistic, colorful, 8k

In this example, adding elements like „earth in the background“ and specifying „realistic, colorful, 8k“ significantly influences the final output. The inclusion of „8k“ enhances the image’s detail, though it does not necessarily affect the resolution.

While the permutations and combinations for image prompts are vast and dependent on the AI model being used, here are some key aspects to consider:

  • Art Medium: Options include drawing, painting, ink, origami, mosaic, pottery, etc.
  • Camera: Lens perspective, camera settings, etc.
  • Display and Resolution: Factors influencing visual quality.
  • Lighting: Types, intensity, and effects.
  • Material: Whether it’s metal, cloth, glass, wood, or other substances.

Image prompting is a powerful tool for generating diverse and visually stunning images from text descriptions. However, it’s essential to remember that this process is not deterministic—meaning that the same prompt might yield different results each time it’s used. This variability stems from the model’s inherent randomness and creative capabilities, which can lead to novel but sometimes unexpected outputs.

Users should keep the following considerations in mind when working with image prompts:

  • Experimentation: Try different prompts and parameters. Small changes in wording or added details can significantly impact the quality and relevance of the generated images.
  • Critical Evaluation: Always critically assess the generated images. Don’t blindly accept them as accurate or realistic representations of the prompt. Check for errors, inconsistencies, or artifacts that may suggest a mismatch with the intended output. Be mindful of the ethical and social implications, especially when dealing with sensitive topics.
  • Supplemental Sources: Rely on additional sources of information or feedback beyond the AI-generated images. Consult existing images, data, experts, or peers to verify, enhance, or complement the generated content.

This approach ensures that the generated images align more closely with your expectations and are of high quality.

Prompt Hijacking

Prompt injection is a newly emerging attack vector specific to Large Language Models (LLMs), allowing attackers to manipulate the output of these models. This type of attack is particularly concerning as LLMs are increasingly being integrated with “plug-ins” that enable them to access up-to-date information, perform complex calculations, or generate graphical content. Prompt injection attacks can be classified into two main types: direct and indirect.

Direct Prompt Injection

In this form, a malicious user enters a text prompt into an LLM or chatbot designed to overwrite existing system prompts, causing the LLM or chatbot to perform unauthorized actions. For example, as shown in Figure 12, the attacker might instruct the chatbot to ignore moderation guidelines and generate unrestricted outputs.

Figure 12: Prompt Injection attack example

Indirect Prompt Injection

This occurs when a malicious user manipulates the data source of the LLM, such as a website, to influence the input and output of the LLM or chatbot. For instance, an attacker could embed a malicious prompt on a webpage that an LLM scans, leading the model to produce harmful content, as in the example of instructing a chatbot to generate and send malware via email.

Common Examples of Prompt Injection Attacks:

  • A malicious user uses direct prompt injection to override system prompts and force the LLM to return private or dangerous information.
  • A user leverages an LLM to summarize a webpage containing an indirect prompt injection, potentially leading the LLM to extract sensitive information.
  • A user activates a plugin connected to a bank or similar service, where rogue instructions embedded in a visited website exploit this plugin, resulting in unauthorized purchases.
  • A document containing a prompt injection is uploaded, instructing the LLM to falsely claim that the document is excellent, affecting the summary generated by internal users.
  • Rogue content embedded in a website exploits other plugins to deceive users.

Prompt injection is a continuous challenge, akin to a cat-and-mouse game. While some of the simpler attacks are being mitigated, often through AI classifiers or improved steerability of underlying models like GPT-4, the risk remains significant. Figure 13 illustrates a Bing prompt injection mitigation example.

Figure 13: Bing prompt injection mitigation example

Best Practices to Mitigate Prompt Injection Attacks:

  • Implement Prompt Engineering Best Practices: Use correct delimiters, provide clear instructions and examples, and ensure high-quality data.
  • Use Classifiers: Detect and filter out malicious prompts or inputs before they are fed to the LLM.
  • Sanitize User Input: Remove or escape any special characters or symbols that could be used to inject malicious instructions.
  • Filter the Output: Check for anomalies such as unexpected content, format, or length, and use classifiers to detect and filter out malicious outputs.
  • Monitor Model Outputs: Regularly review model outputs for signs of compromise or manipulation, and set up automated alerts for suspicious activity.
  • Use Parameterized Queries: Prevent user input from modifying the chatbot’s prompt by using placeholders or variables instead of directly concatenating input with the prompt.
  • Secure Sensitive Information: Encrypt and securely store any sensitive information that the chatbot might access to prevent leakage through prompt injection attacks.

Conclusion

Prompt engineering is a transformative approach in the realm of AI, enabling more refined and effective interactions with Large Language Models (LLMs). As we’ve explored, the technique involves a careful balance of creativity, specificity, and technical precision, all while navigating inherent challenges such as model limitations, token constraints, and the risk of overfitting. Additionally, the emerging threat of prompt hijacking underscores the importance of robust security practices in this evolving field.

Mastering prompt engineering requires an understanding of both the strengths and vulnerabilities of LLMs. By adhering to best practices—such as being specific in instructions, breaking down complex tasks, and using varied prompts—AI practitioners can unlock the full potential of LLMs while mitigating risks. However, as AI continues to advance, the landscape of prompt engineering will evolve, bringing new challenges and opportunities.

In essence, prompt engineering is not just about optimizing AI outputs but also about shaping the future of human-computer interaction. By refining our approach to crafting prompts, we can ensure that AI systems become more reliable, secure, and aligned with our intentions, paving the way for innovative applications across industries.