Conversation Management
Addis AI supports maintaining conversation context through the `conversation_history` parameter, enabling continuous, contextual conversations.
Overview and Use Cases
Conversation management enables:
- Multi-turn conversations that maintain context
- Stateful chatbots with memory of previous exchanges
- Reference resolution (understanding "it", "that", etc.)
- Follow-up questions and clarifications
- Personalized user experiences that evolve over time
- User information retention without repeated collection
Conversation History Format
The `conversation_history` parameter should be an array of message objects, each with a `role` and a `content` field:
{
"prompt": "What's the capital of Ethiopia again?",
"target_language": "am",
"conversation_history": [
{
"role": "user",
"content": "What is the capital of Ethiopia?"
},
{
"role": "assistant",
"content": "The capital of Ethiopia is Addis Ababa."
}
],
"generation_config": { "temperature": 0.7 }
}
Roles
- `"role": "user"` - messages from the human user
- `"role": "assistant"` - messages from the AI (automatically mapped to `"model"` for the underlying model)
System Prompts and Conversation History
When no history is provided, the system automatically includes a language-specific system prompt that directs the AI to respond in the target language with appropriate cultural context.
When you provide a conversation history:
- The system does not automatically add the default system prompt to the conversation
- The history you provide is used exactly as given
- Role mappings are preserved with "assistant" roles mapped to "model" for the underlying API
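The role mapping described above can be sketched as a small client-side helper; `toModelRole` is an illustrative name, not part of the Addis AI API:

```javascript
// Illustrative sketch of the role mapping described above:
// "assistant" messages are sent to the underlying model as "model",
// while "user" messages pass through unchanged.
function toModelRole(role) {
  return role === "assistant" ? "model" : role;
}

const history = [
  { role: "user", content: "What is the capital of Ethiopia?" },
  { role: "assistant", content: "The capital of Ethiopia is Addis Ababa." },
];

const mapped = history.map((msg) => ({ ...msg, role: toModelRole(msg.role) }));
```

In practice the API performs this mapping for you; the sketch only shows what happens to your roles under the hood.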
If you want to include a system prompt with your history, you can add it as the first message with `"role": "assistant"`:
{
"conversation_history": [
{
"role": "assistant",
"content": "You are Addis AI, an AI assistant that specializes in Ethiopian languages..."
},
{
"role": "user",
"content": "Hello!"
},
{
"role": "assistant",
"content": "ሰላም! እንደምን አሉ? እኔ አዲስ AI ነኝ።"
}
]
}
Context Retention
The conversation history allows the AI to:
- Remember facts mentioned earlier in the conversation
- Understand references to previously discussed topics
- Build upon earlier responses
- Provide consistent information throughout the conversation
- Personalize responses based on established user identity
Example of Context Retention
User: "My name is Abebe and I work as a teacher."
Assistant: "Nice to meet you, Abebe! How long have you been a teacher?"
User: "For 5 years. I teach mathematics."
Assistant: "That's great! Mathematics is an important subject. Do you teach at a primary or secondary school?"
User: "What was my name again?"
Assistant: "Your name is Abebe."
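The transcript above corresponds to the following request payload: the first four exchanges become the `conversation_history`, and the final question becomes the `prompt`. A minimal sketch:

```javascript
// The transcript above, encoded as the conversation_history the client
// would send along with the final "What was my name again?" prompt.
const conversationHistory = [
  { role: "user", content: "My name is Abebe and I work as a teacher." },
  { role: "assistant", content: "Nice to meet you, Abebe! How long have you been a teacher?" },
  { role: "user", content: "For 5 years. I teach mathematics." },
  { role: "assistant", content: "That's great! Mathematics is an important subject. Do you teach at a primary or secondary school?" },
];

const requestBody = {
  prompt: "What was my name again?",
  target_language: "am",
  conversation_history: conversationHistory,
};
```

Because the history contains the earlier introduction, the model can resolve "my name" to "Abebe".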
Multi-turn Conversations
A typical multi-turn conversation flow:
1. Initial Request (no history):
{
"prompt": "What is the capital of Ethiopia?",
"target_language": "am"
}
2. Response:
{
"response_text": "The capital of Ethiopia is Addis Ababa."
}
3. Follow-up (with history):
{
"prompt": "When was it founded?",
"target_language": "am",
"conversation_history": [
{
"role": "user",
"content": "What is the capital of Ethiopia?"
},
{
"role": "assistant",
"content": "The capital of Ethiopia is Addis Ababa."
}
]
}
4. Response:
{
"response_text": "Addis Ababa was founded in 1886 by Emperor Menelik II."
}
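Between steps, the client is responsible for growing the history itself: after each response, append the user prompt and the assistant reply before the next request. A minimal helper (`appendTurn` is an illustrative name) might look like:

```javascript
// Append one completed exchange (user prompt + assistant reply) to the
// history, returning a new array so existing state is not mutated.
function appendTurn(history, userText, assistantText) {
  return [
    ...history,
    { role: "user", content: userText },
    { role: "assistant", content: assistantText },
  ];
}

let history = [];
history = appendTurn(
  history,
  "What is the capital of Ethiopia?",
  "The capital of Ethiopia is Addis Ababa.",
);
// history now holds the first turn, ready to send with the follow-up prompt.
```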
Building Conversation History with Streaming
When using streaming mode (`"stream": true`), clients should:
- Collect all chunks of the response
- Identify when the final chunk is received (check for `"is_last_chunk": true` or a non-null `finish_reason`)
- Concatenate the text from all chunks
- Add the complete response to the conversation history for future requests
// existingHistory and setConversationHistory come from your app's state.
const prompt = "Tell me more";
let responseText = "";
let isComplete = false;
fetch("https://api.addisassistant.com/api/v1/chat_generate", {
method: "POST",
headers: {
"Content-Type": "application/json",
"X-API-Key": "YOUR_API_KEY",
},
body: JSON.stringify({
prompt: prompt,
target_language: "am",
conversation_history: existingHistory,
generation_config: { stream: true, temperature: 0.7 },
}),
}).then((response) => {
const reader = response.body.getReader();
function readChunk() {
return reader.read().then(({ done, value }) => {
if (done) return;
const chunk = new TextDecoder().decode(value);
// Note: a network read may end mid-line; production code should buffer
// partial lines and re-join them before parsing.
const lines = chunk.split("\n").filter((line) => line.trim());
for (const line of lines) {
try {
const data = JSON.parse(line);
responseText += data.response_text || "";
if (data.is_last_chunk || data.finish_reason) {
isComplete = true;
// Only the complete, concatenated response goes into the history.
const newHistory = [
...existingHistory,
{ role: "user", content: prompt },
{ role: "assistant", content: responseText },
];
setConversationHistory(newHistory);
}
} catch (e) {
console.error("Error parsing chunk:", e);
}
}
return readChunk();
});
}
return readChunk();
});
:::caution
Chat streaming functionality is currently in BETA and not recommended for production environments. For production use, we recommend using non-streaming mode for chat interactions.
:::
Advanced: Conversation History with Attachments
For multimedia conversations, use the `parts` array format instead of `content`:
{
"conversation_history": [
{
"role": "user",
"parts": [
{
"fileData": {
"fileUri": "gs://.../file1.png",
"mimeType": "image/png"
}
},
{ "text": "What is in this image?" }
]
},
{
"role": "assistant",
"parts": [
{ "text": "This image shows the Entoto Mountains near Addis Ababa." }
]
},
{
"role": "user",
"parts": [{ "text": "Is this in Ethiopia?" }]
}
]
}
Mixing Content and Parts Formats
For backward compatibility, the system supports both the `content` field and the `parts` array format. However, we recommend using the `parts` format when working with attachments.
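If you store plain `content` messages but want to send everything in the `parts` format, a small normalizer can convert them on the way out; `toPartsFormat` is an illustrative helper, not part of the API:

```javascript
// Illustrative normalizer: converts a content-style message into the
// equivalent parts-style message; messages already using parts pass through.
function toPartsFormat(message) {
  if (message.parts) return message;
  return { role: message.role, parts: [{ text: message.content }] };
}

const normalized = [
  { role: "user", content: "Is this in Ethiopia?" },
  { role: "assistant", parts: [{ text: "Yes, the Entoto Mountains are in Ethiopia." }] },
].map(toPartsFormat);
```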
Token Management
Each message in the conversation history consumes tokens, which count toward your usage limits and can affect response generation:
- Token Limits: The total tokens in the conversation history, current prompt, and response must fit within the model's context window
- Truncation Strategies:
- Keep the most recent N turns (typically 10-20)
- Retain only the most relevant messages
- Summarize older parts of the conversation in a system message
Example Token Management
function manageConversationTokens(history, maxTurns = 10) {
// A turn is one user message plus one assistant reply, i.e. 2 messages.
const maxMessages = maxTurns * 2;
if (history.length > maxMessages) {
return history.slice(-maxMessages);
}
return history;
}
const truncatedHistory = manageConversationTokens(conversationHistory);
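If you prefer to truncate by estimated size rather than turn count, a rough character-based heuristic can serve as a sketch. The ~4 characters-per-token ratio below is an assumption for illustration; actual tokenization varies by model and language (Amharic text in particular may tokenize differently):

```javascript
// Rough heuristic: assume ~4 characters per token. This is an illustrative
// assumption, not the model's real tokenizer.
function estimateTokens(history) {
  const chars = history.reduce((sum, msg) => sum + (msg.content || "").length, 0);
  return Math.ceil(chars / 4);
}

// Drop the oldest messages until the estimate fits within the budget.
function truncateToTokenBudget(history, maxTokens) {
  const trimmed = [...history];
  while (trimmed.length > 0 && estimateTokens(trimmed) > maxTokens) {
    trimmed.shift();
  }
  return trimmed;
}
```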
Best Practices
1. Clear Session Boundaries
- Start with an empty history array for new conversations
- Don't mix conversations from different sessions
- Consider implementing session timeouts (e.g., reset after 30 minutes of inactivity)
2. History Management
- Store the history on the client side securely
- Limit history length to prevent exceeding token limits (consider keeping the most recent 10-20 turns)
- Include only essential exchanges that provide context
- Consider implementing a "forget" feature to let users clear history
3. Stream Handling
- For streaming responses, collect all chunks before adding to history
- Don't add partial responses to history if an error occurs
- Implement appropriate UI feedback during streaming
4. Security and Privacy
- Sanitize sensitive information before adding to conversation history
- Consider implementing timeouts for conversation sessions
- Provide clear privacy policies about conversation retention
- Don't store personal identifiable information (PII) in conversations unless necessary
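The session-timeout suggestion above can be sketched as a small check run before each request; `SESSION_TIMEOUT_MS` and `refreshSession` are illustrative names, and the injectable `nowMs` parameter exists only to make the helper easy to test:

```javascript
// Reset the conversation if the user has been inactive longer than the
// timeout (30 minutes here, matching the suggestion above).
const SESSION_TIMEOUT_MS = 30 * 60 * 1000;

function refreshSession(history, lastActivityMs, nowMs = Date.now()) {
  if (nowMs - lastActivityMs > SESSION_TIMEOUT_MS) {
    return []; // stale session: start a fresh conversation
  }
  return history;
}
```

Call it with the timestamp of the user's last message before building each request body, and persist the new timestamp after every exchange.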
Full Example: React Application with Conversation Management
import React, { useState } from "react";
function ChatApp() {
const [messages, setMessages] = useState([]);
const [inputText, setInputText] = useState("");
const [isLoading, setIsLoading] = useState(false);
const getConversationHistory = () => {
return messages.map((msg) => ({
role: msg.sender,
content: msg.text,
}));
};
const sendMessage = async () => {
if (!inputText.trim()) return;
const userMessage = { sender: "user", text: inputText };
setMessages((prev) => [...prev, userMessage]);
setInputText("");
setIsLoading(true);
try {
const history = getConversationHistory();
const response = await fetch(
"https://api.addisassistant.com/api/v1/chat_generate",
{
method: "POST",
headers: {
"Content-Type": "application/json",
"X-API-Key": "YOUR_API_KEY",
},
body: JSON.stringify({
prompt: userMessage.text,
target_language: "am",
conversation_history: history,
generation_config: {
temperature: 0.7,
},
}),
},
);
const data = await response.json();
setMessages((prev) => [
...prev,
{ sender: "assistant", text: data.response_text },
]);
} catch (error) {
console.error("Error sending message:", error);
setMessages((prev) => [
...prev,
{
sender: "assistant",
text: "Sorry, there was an error processing your request.",
},
]);
} finally {
setIsLoading(false);
}
};
const clearHistory = () => {
setMessages([]);
};
return (
<div className="chat-container">
<div className="chat-header">
<h2>Addis AI Chat</h2>
<button onClick={clearHistory}>Clear Chat</button>
</div>
<div className="chat-messages">
{messages.length === 0 ? (
<div className="empty-state">Start a conversation with Addis AI</div>
) : (
messages.map((msg, index) => (
<div key={index} className={`message ${msg.sender}`}>
{msg.text}
</div>
))
)}
{isLoading && <div className="message loading">...</div>}
</div>
<div className="chat-input">
<input
type="text"
value={inputText}
onChange={(e) => setInputText(e.target.value)}
onKeyDown={(e) => e.key === "Enter" && sendMessage()}
placeholder="Type your message..."
disabled={isLoading}
/>
<button onClick={sendMessage} disabled={isLoading}>
Send
</button>
</div>
</div>
);
}
export default ChatApp;