Prompting Guide
What is a Prompt?
A prompt is the instruction provided to an AI model, such as GPT-4, to generate a response. For voice agents, it defines how they interpret user intent and deliver accurate, context-aware answers. Crafting effective prompts is essential to ensure clear, efficient, and engaging interactions, as poorly designed prompts can lead to confusion or irrelevant responses.
This guide will equip you with the principles and best practices to design prompts perfectly aligned with your objectives.
What is a Good Prompt?
A good prompt is concise, clear, and structured to guide the AI effectively.
- Conciseness and Clarity: Short, straightforward sentences are essential to avoid ambiguity, especially for voice agents operating in real-time. Precise instructions ensure the AI understands the task fully and minimize the risk of irrelevant or incomplete responses.
- Structure: Well-organized prompts help the model identify the distinct elements of the script, improving coherence and accuracy.
Example
Bad Prompt:
“Ask the client for his name and ask him if his available tomorrow, if not ask him for the day after tomorrow, and after the meeting is booked ask for his number”
Improved Prompt:
“First, ask the client for their name. Then, check if they are available tomorrow. If they’re not, ask about their availability the day after. Once the meeting is scheduled, request their phone number.”
Rounded Recommended Format
Global Settings Configuration for the Voice Agent
In the Global Settings section, we recommend configuring the voice agent according to the following plan:
Role
This defines the voice agent’s identity and the persona it should adopt. Whether the agent is acting as a helpful assistant, a customer service representative, or a more specific character, this sets the foundation for how the agent interacts with users.
Personality
Here, you define the tone of the agent. For example, it might be friendly, polite, professional, persuasive when necessary, or even empathetic depending on the context. The personality shapes how the agent’s responses feel to the user.
Style of Language
This refers to how the agent should communicate, including the formality of the language. Should the agent use formal language (like “vous” in French), be more casual, or perhaps use informal language with a conversational style? Specify if there are any specific phrasing preferences or any phrasing types to avoid.
Conversation Context
This section sets the stage for the conversation. For example, if the conversation occurs on a specific date, or if the agent is responding to a client call for a particular purpose (e.g., scheduling a meeting or answering a query), this context will ensure the agent tailors its responses accordingly.
These sections must be distinct, with a marker indicating the start of each new section, for example: #### Role:
These Global Settings establish the framework for your voice agent. The instructions provided here will apply to all prompts associated with the agent, ensuring consistency and alignment throughout its interactions.
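As an illustration, the four sections above can be assembled into a single Global Settings prompt with a small helper. This is a minimal sketch: the function name `build_global_prompt` and the sample section texts (a dental clinic scenario) are hypothetical, and only the section names and the `#### Name:` marker come from this guide.

```python
def build_global_prompt(sections: dict[str, str]) -> str:
    """Join named sections into one prompt, each introduced by a '#### Name:' marker."""
    parts = []
    for name, body in sections.items():
        parts.append(f"#### {name}:\n{body.strip()}")
    # A blank line between sections keeps them visually distinct.
    return "\n\n".join(parts)


# Sample content is invented for illustration only.
global_prompt = build_global_prompt({
    "Role": "You are a scheduling assistant for a dental clinic.",
    "Personality": "Friendly, polite, and professional.",
    "Style of Language": "Use formal but warm language. Avoid jargon.",
    "Conversation Context": "The client is calling to book an appointment.",
})

print(global_prompt)
```

Keeping the sections in a dictionary makes it easy to edit one section without touching the others.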
Task-Specific Prompting Format for the Voice Agent
For each task your voice agent is responsible for, use the following format to ensure clarity and consistency in its actions:
Objective
Define the overall goal that the agent should accomplish in this specific task. This is the broader outcome the agent is working towards, such as scheduling an appointment, answering a query, or confirming details.
Instructions
Provide a step-by-step breakdown of the actions the agent must take to achieve the objective. This includes things to say and actions to perform. Ensure the instructions are clear and precise to guide the agent effectively through the task.
In Case of Misunderstanding (Optional)
Explain what the agent should do if it encounters confusion or a misunderstanding from the user. This could involve asking for clarification, rephrasing, or providing a fallback response. Anticipating potential misunderstandings helps ensure that the agent can handle unexpected situations smoothly.
These sections must be distinct, with a marker indicating the start of each new section, for example: #### Objective:
By following this structure, you ensure that the voice agent remains focused on the objective, can perform tasks efficiently, and is prepared to handle any uncertainties that may arise during the conversation.
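The task format can be sketched the same way. The example below is an assumption-laden illustration, not a Rounded API: the `TaskPrompt` class and its field names are hypothetical, and only the three section titles and the `####` marker are taken from this guide. The optional section is omitted from the output when it is not provided.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskPrompt:
    """Hypothetical container for one task's prompt sections."""
    objective: str
    instructions: list[str]
    misunderstanding: Optional[str] = None  # optional section

    def render(self) -> str:
        # Number the steps so their order is explicit.
        steps = "\n".join(
            f"{i}. {step}" for i, step in enumerate(self.instructions, start=1)
        )
        parts = [
            f"#### Objective:\n{self.objective}",
            f"#### Instructions:\n{steps}",
        ]
        if self.misunderstanding:
            parts.append(
                f"#### In Case of Misunderstanding:\n{self.misunderstanding}"
            )
        return "\n\n".join(parts)


booking = TaskPrompt(
    objective="Schedule a meeting with the client.",
    instructions=[
        "Ask the client for their name.",
        "Check if they are available tomorrow.",
        "If not, ask about the day after.",
    ],
    misunderstanding="Politely ask the client to repeat their answer.",
)

print(booking.render())
```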
Separate Tasks and Use Step-by-Step Interactions
To ensure the voice agent operates smoothly and doesn’t overwhelm the user by asking multiple questions at once, it’s essential to clearly separate tasks and present them in a logical, step-by-step manner.
Best Practices
- List Tasks Distinctly: Structure each task or question as a separate instruction in the prompt. This avoids confusion and helps the agent prioritize and execute tasks in the correct order.
- Include “Wait for User Response” (If Needed): If the agent doesn’t naturally pause for user input, explicitly include a directive like “Wait for user response before proceeding.” This ensures the interaction feels natural and interactive, allowing the user time to respond. However, if the agent already handles pauses naturally, this step can be omitted.
- Logical Flow: Arrange tasks in a sequence that mirrors a typical human conversation, progressing naturally from one topic to the next.
Example
Instead of:
“Ask the user for their name, email, and phone number.”
Use this approach:
- Ask the user’s name: “What’s your name?”
- Wait for user response (if necessary).
- Ask for their email: “Can you share your email address, please?”
- Wait for user response (if necessary).
- Ask for their phone number: “Finally, what’s your phone number?”
By applying this method, the interaction becomes more user-friendly and reduces the risk of overwhelming the user or causing confusion.
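If prompts are generated programmatically, the step-by-step layout above can be produced from a list of questions. The following is a minimal sketch; the helper `to_step_list` and its `add_waits` flag are hypothetical names. The flag mirrors the "if necessary" caveat: set it to `False` when the agent already pauses naturally.

```python
WAIT_DIRECTIVE = "Wait for user response before proceeding."


def to_step_list(questions: list[str], add_waits: bool = True) -> str:
    """Render one question per line, optionally separated by a wait directive."""
    lines = []
    for index, question in enumerate(questions):
        lines.append(f"- {question}")
        is_last = index == len(questions) - 1
        # No wait directive is added after the final question.
        if add_waits and not is_last:
            lines.append(f"- {WAIT_DIRECTIVE}")
    return "\n".join(lines)


steps = to_step_list([
    "Ask the user's name: \"What's your name?\"",
    "Ask for their email: \"Can you share your email address, please?\"",
    "Ask for their phone number: \"Finally, what's your phone number?\"",
])

print(steps)
```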
Best Practices Guides
Single-Prompt vs Multi-Prompt Approach
When designing voice agents, two prompting approaches are commonly used: the single-prompt approach and the multi-prompt approach.
Single-Prompt Approach
The single-prompt approach involves using a single, complete prompt to define the entire voice agent. It is simple to implement and works well for simpler agents, but it comes with limitations for more complex agents. As the prompt grows in length, it can become harder to maintain consistent responses, instructions can become less precise, and function calls may become less reliable. This can lead to answers deviating from expectations, ultimately affecting the user experience.
Multi-Prompt Approach
The multi-prompt approach involves structuring the agent as a tree of prompts, where each node contains its own specific prompt, custom function call instructions, and clear logic for transitioning between nodes. This method is ideal for more sophisticated agents because it improves accuracy, modularity, and maintainability. By isolating each task in its own specific prompt, the agent can execute more targeted and consistent actions, while also making it easier to update and manage interactions.
When to Use Each Approach
- Single-Prompt Approach: Use for simple agents with limited tasks, where managing a single prompt does not interfere with efficiency.
- Multi-Prompt Approach: Preferred for more complex voice agents, where smooth transitions between tasks and greater control are needed. It allows for better handling of varied scenarios, providing greater flexibility and simplifying long-term maintenance.
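To make the multi-prompt idea concrete, the tree of prompts can be pictured as named nodes, each with its own prompt, tool list, and transition rules. The sketch below is purely illustrative: `PromptNode`, `next_node`, the outcome labels, and the tool name `check_availability` are all hypothetical and do not describe how Rounded stores agents internally.

```python
from dataclasses import dataclass, field


@dataclass
class PromptNode:
    """One node of the prompt tree (hypothetical structure)."""
    name: str
    prompt: str
    tools: list[str] = field(default_factory=list)
    # Maps an outcome label to the name of the next node.
    transitions: dict[str, str] = field(default_factory=dict)


def next_node(nodes: dict[str, PromptNode], current: str, outcome: str) -> str:
    """Return the next node name, staying on the current node if nothing matches."""
    return nodes[current].transitions.get(outcome, current)


nodes = {
    "greeting": PromptNode(
        name="greeting",
        prompt="Greet the caller and ask for their name.",
        transitions={"name_given": "booking"},
    ),
    "booking": PromptNode(
        name="booking",
        prompt="Check if the caller is available tomorrow, then the day after.",
        tools=["check_availability"],
        transitions={"booked": "wrap_up"},
    ),
    "wrap_up": PromptNode(
        name="wrap_up",
        prompt="Request the caller's phone number and say goodbye.",
    ),
}

print(next_node(nodes, "greeting", "name_given"))
```

Because each node carries only the prompt and tools for its own task, a change to the booking step does not require touching the greeting or wrap-up prompts.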
Frequent Testing and Iterations
Regularly testing the voice agent in different situations is crucial to ensure the prompt is effective. It’s important to put the agent in challenging scenarios to evaluate how it handles unexpected or complex interactions. This helps identify weaknesses or areas for improvement. By adjusting the prompt based on these tests, you can refine its performance until the agent consistently achieves a satisfactory success rate—completing tasks autonomously without requiring human intervention. Frequent testing and iteration are vital to ensure the agent responds reliably, adapts to various situations, and continues to meet user expectations over time.
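One simple way to track progress across iterations is to compute a success rate over a fixed set of test scenarios. The sketch below assumes a hypothetical `run_scenario` callable that returns `True` when the agent completes a scenario without human intervention; `fake_run` merely stands in for a real test harness, and the scenario descriptions are invented.

```python
from typing import Callable


def success_rate(scenarios: list[str], run_scenario: Callable[[str], bool]) -> float:
    """Fraction of scenarios the agent completes autonomously."""
    if not scenarios:
        raise ValueError("at least one scenario is required")
    passed = sum(1 for scenario in scenarios if run_scenario(scenario))
    return passed / len(scenarios)


def fake_run(scenario: str) -> bool:
    # Stand-in for a real call: pretend the agent fails when interrupted.
    return "interrupts" not in scenario


scenarios = [
    "caller books a meeting for tomorrow",
    "caller is unavailable tomorrow",
    "caller interrupts the agent mid-sentence",
    "caller gives an incomplete phone number",
]

rate = success_rate(scenarios, fake_run)
print(f"Success rate: {rate:.0%}")
```

Re-running the same scenario list after each prompt change makes it easier to see whether an adjustment helped or introduced a regression.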
Prepare a Custom Message for Misunderstandings
It’s important to prepare a clear and polite response for cases where the voice agent misunderstands the user or is unsure what they are asking. Instead of giving an inappropriate or vague answer, the agent should use a phrase like, “Hmm… I’m sorry, I didn’t quite catch that. Could you please repeat?” This keeps the agent professional and avoids confusion. A pre-defined message for these situations helps maintain a smooth conversation flow and improves the overall user experience, as it gives users the chance to clarify their requests without feeling frustrated.
Optimize Number and Date Pronunciation
If the voice agent struggles to pronounce numbers correctly, consider writing them out in full (e.g., “twenty-four” instead of “24”) to reduce diction errors.
For dates or phone numbers, provide specific pronunciation instructions. When reading French phone numbers, the agent should:
- Read digits in pairs (e.g., “quarante-deux” for 42), unless the pair begins with a zero, in which case each digit should be pronounced separately (e.g., 05 as “zéro cinq”).
- Insert a short pause between each pair of digits for better clarity.
Example:
The phone number 06 05 42 should be read as: “zéro six, [pause], zéro cinq, [pause], quarante-deux.”
This ensures clear communication and minimizes misunderstandings, contributing to a professional and seamless user experience.
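If phone numbers are injected into the prompt as variables, the two rules above can also be applied in code before the text reaches the agent. The sketch below is one possible implementation, not a Rounded feature: the function names are hypothetical, the `[pause]` marker simply copies the example above, and the number words use the traditional spellings (for instance “vingt et un”).

```python
# French words for 0-19, indexed by value.
UNITS = [
    "zéro", "un", "deux", "trois", "quatre", "cinq", "six", "sept", "huit",
    "neuf", "dix", "onze", "douze", "treize", "quatorze", "quinze", "seize",
    "dix-sept", "dix-huit", "dix-neuf",
]
TENS = {2: "vingt", 3: "trente", 4: "quarante", 5: "cinquante", 6: "soixante"}


def two_digit_french(n: int) -> str:
    """Spell a number from 0 to 99 in French."""
    if not 0 <= n <= 99:
        raise ValueError("expected a value between 0 and 99")
    if n < 20:
        return UNITS[n]
    if n < 70:
        tens, unit = divmod(n, 10)
        if unit == 0:
            return TENS[tens]
        if unit == 1:
            return f"{TENS[tens]} et un"
        return f"{TENS[tens]}-{UNITS[unit]}"
    if n < 80:
        # 70-79 are built on "soixante" plus 10-19.
        return "soixante et onze" if n == 71 else f"soixante-{UNITS[n - 60]}"
    # 80-99 are built on "quatre-vingt" plus 0-19.
    return "quatre-vingts" if n == 80 else f"quatre-vingt-{UNITS[n - 80]}"


def spell_french_phone(number: str) -> str:
    """Read a phone number in pairs, digit by digit when a pair starts with zero."""
    digits = "".join(ch for ch in number if ch in "0123456789")
    if len(digits) % 2 != 0:
        raise ValueError("expected an even number of digits")
    spoken = []
    for i in range(0, len(digits), 2):
        pair = digits[i:i + 2]
        if pair[0] == "0":
            spoken.append(" ".join(UNITS[int(d)] for d in pair))
        else:
            spoken.append(two_digit_french(int(pair)))
    return ", [pause], ".join(spoken)


print(spell_french_phone("06 05 42"))
```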
Add Human-Like Elements to the Agent’s Speech
To make the voice agent sound more natural and relatable, incorporate human-like speech patterns into its responses. These subtle elements can enhance the conversational experience:
- Hesitations: Include phrases like “Euuuh…” to simulate natural thinking pauses.
- Slight Stutters: For example, “I-I-I would say” can make the agent feel less robotic.
- Ellipses for Pauses: Use “…” to create natural breaks in speech and convey a thoughtful tone.
- Strategic Punctuation: Employ commas, exclamation points, or question marks to shape the agent’s tone and express emotions appropriately.
These touches make the agent’s voice feel more dynamic and human, improving user engagement and creating a more authentic conversational experience.
Spell Words Phonetically for Better Pronunciation
To ensure the voice agent pronounces words correctly, don’t hesitate to spell them phonetically, reflecting how they sound rather than their standard written form. This can be particularly useful for proper nouns, uncommon words, or regional terms.
Example:
Instead of writing “Montpellier,” which might be mispronounced, use “Mon-Peu-Lié.”
This approach guarantees clearer and more accurate pronunciation, improving the user experience and avoiding confusion, especially for critical information like names or locations.
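When the same hard-to-pronounce words appear in many prompts, a small lookup table can apply the phonetic spellings consistently. This is a minimal sketch using Python's standard `re` module; the table `PHONETIC_SPELLINGS` and the helper name are hypothetical, and the only entry is the example from this guide.

```python
import re

# Maps the written form to the phonetic spelling the agent should read.
PHONETIC_SPELLINGS = {
    "Montpellier": "Mon-Peu-Lié",
}


def apply_phonetic_spellings(text: str, spellings: dict[str, str]) -> str:
    """Replace whole-word occurrences of each entry with its phonetic spelling."""
    for word, phonetic in spellings.items():
        pattern = rf"\b{re.escape(word)}\b"
        # A function replacement avoids any special handling of backslashes.
        text = re.sub(pattern, lambda _match, p=phonetic: p, text)
    return text


print(apply_phonetic_spellings("Our office is in Montpellier.", PHONETIC_SPELLINGS))
```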
Adjust the Temperature
Temperature in a large language model (LLM) controls the randomness of its responses. A low temperature (close to 0) results in more predictable and focused answers, while a high temperature (closer to 1) generates more creative and varied responses.
Adjusting the temperature can help tailor the agent’s behavior:
- Lower Temperatures: Ideal for consistency and accuracy.
- Higher Temperatures: Better for creativity and flexibility.
Rounded lets you customize this setting to optimize your agent’s responses for different situations.
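For intuition, temperature is commonly described as dividing the model's raw scores (logits) by the temperature before converting them into probabilities. The sketch below shows that general formulation on three made-up scores; it is not a description of how Rounded or any specific model applies the setting, and the function name is hypothetical.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores into probabilities after scaling by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0 in this sketch")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate words
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.0)

print("temperature 0.2:", [round(p, 3) for p in low])
print("temperature 1.0:", [round(p, 3) for p in high])
```

At the lower temperature almost all of the probability goes to the top-scoring candidate, which is why responses become more predictable; at the higher temperature the alternatives keep a meaningful share.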
Utilizing Tools Effectively
Voice agents on the Rounded studio can leverage external tools and APIs to perform specific actions, improving efficiency and user experience. It’s essential to clearly specify when and how these tools should be invoked, ensuring seamless integration into conversations.
To maximize the efficiency and clarity of your agent’s workflow, consider the following best practices when utilizing tools:
- Meaningful Tool Names: Always give your tools meaningful names that clearly describe their functionality. This helps the model understand the purpose of each tool and reduces ambiguity during task execution.
- Comprehensive Tool Descriptions: Provide detailed descriptions for each tool to ensure the model understands their functionality and usage. This helps the model make informed decisions and execute tasks more effectively.
- Clear Parameter Names and Descriptions: Provide clear names and descriptions for each parameter to ensure the model understands their functionality and usage. This helps the model provide the correct information to the tool.
- Limit the Number of Tools per Task: Avoid using too many tools within a single task to prevent confusing the model. Tools are often linked to specific phases of the conversation, and an overload can disrupt the flow and effectiveness of the agent.
- Strategic Tool Usage: Use tools to provide additional variables that can be utilized in subsequent tasks. Tools can also assist the agent in choosing the next appropriate task based on the conversation’s context and extracted data.
- Organize Tools by Conversation Phases: Structure your tools in a way that aligns with different phases of the conversation. This organization helps maintain clarity and ensures that each tool is used in the appropriate context.
- Instructions in the API Response: Use the `instructions` field at the root of the API response to provide instructions to the agent after the tool call, instead of providing the agent with the whole API response.
- Deterministic Response Message: Use the `agent_message` field at the root of the API response to provide a deterministic response to the agent after the tool call.
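The practices above can be illustrated with a tool definition and the response its backend might return. Everything in this sketch is hypothetical except the two root-level field names, `instructions` and `agent_message`, which come from this guide: the tool name, the schema layout, the `slots` field, and the choice of which field to return in each case are assumptions made for the example. It also assumes that `agent_message` holds the wording the agent delivers.

```python
import json

# Hypothetical tool definition: the layout is illustrative, not Rounded's schema.
check_availability_tool = {
    "name": "check_availability",
    "description": (
        "Look up open appointment slots for a given date. "
        "Call it after the caller has proposed a date."
    ),
    "parameters": {
        "date": {
            "type": "string",
            "description": "Requested date in YYYY-MM-DD format.",
        },
    },
}


def build_tool_response(slots: list[str]) -> dict:
    """Build the API response returned to the agent after the tool call."""
    if slots:
        # `instructions` steers the agent instead of exposing the raw payload.
        return {
            "instructions": (
                "Offer these slots one at a time and ask which one suits "
                "the caller: " + ", ".join(slots)
            ),
            "slots": slots,  # extra data, hypothetical
        }
    # `agent_message` carries a fixed, predictable message.
    return {
        "agent_message": "I'm sorry, there is no availability on that date.",
    }


print(json.dumps(build_tool_response(["09:00", "14:30"]), indent=2))
print(json.dumps(build_tool_response([]), indent=2))
```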
For more detailed information on configuring and managing tools, refer to the Tools section.