AddisAI Realtime API Documentation

Welcome to the AddisAI Realtime API! This documentation provides everything you need to integrate your application with our WebSocket service for real-time, voice-based conversations with our AI.

Getting Started

To get started, you'll need an API key. If you don't have one, you can create one at https://platform.addisassistant.com/apikeys.

WebSocket API

Our WebSocket API allows for real-time, two-way audio streaming between your application and our servers.

Connecting

To connect to our WebSocket server, use the following URL format:

```text
wss://relay.addisassistant.com/ws?apiKey=YOUR_API_KEY
```

Replace YOUR_API_KEY with your actual API key.

Authentication

Authentication is handled via the apiKey query parameter in the WebSocket URL. If the API key is missing or invalid, the connection will be rejected.
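
For example, here is a minimal connection check in JavaScript. The close code and reason sent for a rejected key aren't specified above, so the logging below is a sketch rather than documented behavior; inspect the actual close event your application receives.

```javascript
const apiKey = 'YOUR_API_KEY';
const socket = new WebSocket(`wss://relay.addisassistant.com/ws?apiKey=${apiKey}`);

socket.onopen = () => {
  console.log('Authenticated and connected');
};

socket.onclose = (event) => {
  // A missing or invalid API key causes the server to reject the connection.
  // The exact close code is not documented here, so log it for debugging.
  console.log(`Connection closed (code ${event.code}): ${event.reason || 'no reason given'}`);
};
```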

Audio Format

The API expects audio to be sent in the following format:
  • Encoding: PCM
  • Sample Rate: 16000 Hz
  • Bit Depth: 16-bit
  • Channels: 1 (mono)
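
Browser microphone capture via the Web Audio API yields 32-bit float samples, so they must be converted to 16-bit PCM before sending. Below is a minimal conversion sketch; the helper name floatTo16BitPCM is ours, not part of the API.

```javascript
// Convert Float32 samples in [-1, 1] to 16-bit signed PCM.
// floatTo16BitPCM is a hypothetical helper name, not part of the AddisAI API.
function floatTo16BitPCM(float32Samples) {
  const int16 = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i])); // clamp to avoid overflow
    int16[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
  }
  return int16;
}
```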

Messaging

All messages sent and received over the WebSocket connection are in JSON format.

Client to Server

To send audio to the server, you must send a JSON object containing a data field. The data field should contain the raw PCM audio data, base64 encoded. Example:

```json
{
  "data": "BASE64_ENCODED_PCM_AUDIO_DATA"
}
```
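
For example, here is a sketch of base64-encoding a 16-bit PCM chunk (an Int16Array, e.g. from the conversion helper above) and sending it in this shape; sendAudioChunk is a hypothetical helper, not part of the API.

```javascript
// pcmChunk is an Int16Array of 16 kHz mono samples.
function sendAudioChunk(socket, pcmChunk) {
  const bytes = new Uint8Array(pcmChunk.buffer);
  // btoa expects a binary string, so build one from the raw bytes.
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  socket.send(JSON.stringify({ data: btoa(binary) }));
}
```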

Server to Client

The server will send various types of messages to the client. Your application should be prepared to handle the following message formats.

1. Status Messages

Status messages provide information about the state of the connection.

```json
{
  "type": "status",
  "message": "Ready to start conversation"
}
```
2. AI Audio Response

This message contains the AI's audio response. The audio data is base64 encoded.

```json
{
  "serverContent": {
    "modelTurn": {
      "parts": [
        {
          "inlineData": {
            "data": "BASE64_ENCODED_AUDIO_RESPONSE_DATA"
          }
        }
      ]
    }
  }
}
```
3. Turn Complete & Usage Metadata

When the AI has finished its turn, the server will send a message with turnComplete set to true, along with usage metadata.

```json
{
  "serverContent": {
    "turnComplete": true
  },
  "usageMetadata": {
    "totalBilledAudioDurationSeconds": 5.2
  }
}
```
4. Warning Messages

The server may send warning messages, for example when your account balance is low.

```json
{
  "type": "warning",
  "message": "Your wallet balance is low. Please top up to avoid service interruption."
}
```
5. Error Messages

If an error occurs, the server will send an error message.

```json
{
  "error": {
    "message": "AI service error",
    "status": 500,
    "timestamp": "2025-07-15T10:00:00.000Z"
  }
}
```
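
A single onmessage handler can dispatch on these shapes. The sketch below assumes the five formats above are the complete set; anything unrecognized is logged rather than dropped silently.

```javascript
socket.onmessage = (event) => {
  const msg = JSON.parse(event.data);

  if (msg.type === 'status') {
    console.log('Status:', msg.message);
  } else if (msg.type === 'warning') {
    console.warn('Warning:', msg.message);
  } else if (msg.error) {
    console.error(`Error ${msg.error.status}:`, msg.error.message);
  } else if (msg.serverContent?.turnComplete) {
    const usage = msg.usageMetadata;
    console.log('Turn complete. Billed seconds:',
      usage ? usage.totalBilledAudioDurationSeconds : 'n/a');
  } else if (msg.serverContent?.modelTurn?.parts) {
    for (const part of msg.serverContent.modelTurn.parts) {
      if (part.inlineData?.data) {
        playAudio(part.inlineData.data); // base64-encoded PCM; see the playback sketch below
      }
    }
  } else {
    console.log('Unrecognized message:', msg);
  }
};
```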

Integration Examples

Here are some examples of how to connect to our WebSocket API and handle audio streaming.

JavaScript (Browser)

This example demonstrates how to capture microphone audio, encode it, and send it to the WebSocket server.
```javascript
const apiKey = 'YOUR_API_KEY';
const socket = new WebSocket(`wss://relay.addisassistant.com/ws?apiKey=${apiKey}`);

let audioContext;
let scriptProcessorNode;
let mediaStream;

socket.onopen = () => {
  console.log('Connected to AddisAI Realtime API');
  // Start capturing audio once connected
  startCapturingAudio();
};

socket.onmessage = (event) => {
  const message = JSON.parse(event.data);
  if (message.type === 'status') {
    console.log('Status:', message.message);
  } else if (message.serverContent?.modelTurn?.parts?.[0]?.inlineData) {
    // Handle incoming audio from the AI (base64-encoded PCM)
    const audioData = message.serverContent.modelTurn.parts[0].inlineData.data;
    // You'll need a function to play back the raw PCM audio
    playAudio(audioData);
  } else if (message.error) {
    console.error('Error from server:', message.error);
  }
};

socket.onclose = (event) => {
  console.log(`Connection closed: ${event.reason || 'Unknown reason'}`);
  stopCapturingAudio();
};

socket.onerror = (error) => {
  console.error('WebSocket error:', error);
};

function startCapturingAudio() {
  navigator.mediaDevices.getUserMedia({ audio: true, video: false })
    .then(stream => {
      mediaStream = stream;
      audioContext = new (window.AudioContext || window.webkitAudioContext)({ sampleRate: 16000 });
      const source = audioContext.createMediaStreamSource(stream);
      scriptProcessorNode = audioContext.createScriptProcessor(4096, 1, 1);
      scriptProcessorNode.onaudioprocess = (audioProcessingEvent) => {
        const inputBuffer = audioProcessingEvent.inputBuffer;
        // getChannelData returns 32-bit floats; convert to 16-bit PCM first
        const floatData = inputBuffer.getChannelData(0);
        const pcmData = new Int16Array(floatData.length);
        for (let i = 0; i < floatData.length; i++) {
          const s = Math.max(-1, Math.min(1, floatData[i]));
          pcmData[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
        }
        // Base64-encode the 16-bit PCM bytes
        const bytes = new Uint8Array(pcmData.buffer);
        let binary = '';
        for (let i = 0; i < bytes.length; i++) {
          binary += String.fromCharCode(bytes[i]);
        }
        const base64Data = btoa(binary);
        if (socket.readyState === WebSocket.OPEN) {
          socket.send(JSON.stringify({ data: base64Data }));
        }
      };
      source.connect(scriptProcessorNode);
      scriptProcessorNode.connect(audioContext.destination);
    })
    .catch(err => {
      console.error('Error accessing microphone:', err);
    });
}

function stopCapturingAudio() {
  if (mediaStream) {
    mediaStream.getTracks().forEach(track => track.stop());
  }
  if (scriptProcessorNode) {
    scriptProcessorNode.disconnect();
  }
  if (audioContext) {
    audioContext.close();
  }
}

// This is a placeholder for audio playback.
// You would need a more robust implementation to handle raw PCM audio playback.
function playAudio(base64AudioData) {
  console.log("Received audio data to play.");
  // In a real application, you would decode the base64 data
  // and use the Web Audio API to play it back.
}
```
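
For a working version of the playAudio placeholder, the Web Audio API can play raw PCM directly. The sketch below assumes the response audio is 16-bit PCM mono like the input; the 24000 Hz playback rate is our assumption, since the response format isn't specified above, so adjust it if playback sounds too fast or too slow.

```javascript
// A minimal playback sketch. Assumes 16-bit PCM mono; the 24000 Hz sample
// rate is an assumption, since the response audio format is not documented above.
const playbackContext = new (window.AudioContext || window.webkitAudioContext)();

function playAudio(base64AudioData) {
  // Decode base64 to raw bytes, then reinterpret as 16-bit samples.
  const binary = atob(base64AudioData);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  const int16 = new Int16Array(bytes.buffer);

  // Convert 16-bit PCM back to floats in [-1, 1] for the Web Audio API.
  const float32 = new Float32Array(int16.length);
  for (let i = 0; i < int16.length; i++) {
    float32[i] = int16[i] / 0x8000;
  }

  const buffer = playbackContext.createBuffer(1, float32.length, 24000);
  buffer.copyToChannel(float32, 0);

  const sourceNode = playbackContext.createBufferSource();
  sourceNode.buffer = buffer;
  sourceNode.connect(playbackContext.destination);
  sourceNode.start();
}
```

Note that this plays each chunk as soon as it arrives; a production implementation would queue incoming chunks and schedule each start() at the end of the previous buffer to avoid gaps or overlap.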

Python

This example uses the websockets library to connect to the API. You would need to integrate this with a library like PyAudio to capture microphone input.
```python
import asyncio
import websockets
import json
import base64

# This is a placeholder for your audio input.
# In a real application, you would capture this from a microphone.
def get_audio_data():
    # Replace this with actual audio data (16-bit PCM, 16 kHz, mono).
    # For this example, we'll use a chunk of silence (all-zero bytes).
    dummy_audio_chunk = b'\x00' * 8000
    return base64.b64encode(dummy_audio_chunk).decode('utf-8')

async def addisai_realtime():
    api_key = 'YOUR_API_KEY'
    uri = f"wss://relay.addisassistant.com/ws?apiKey={api_key}"
    async with websockets.connect(uri) as websocket:
        print("Connected to AddisAI Realtime API")

        async def send_audio():
            while True:
                audio_data = get_audio_data()
                await websocket.send(json.dumps({"data": audio_data}))
                await asyncio.sleep(0.5)  # Send audio every 500 ms

        async def receive_messages():
            async for message in websocket:
                response = json.loads(message)
                print(f"< {response}")

        # Run send and receive tasks concurrently
        send_task = asyncio.create_task(send_audio())
        receive_task = asyncio.create_task(receive_messages())
        await asyncio.gather(send_task, receive_task)

if __name__ == "__main__":
    try:
        asyncio.run(addisai_realtime())
    except KeyboardInterrupt:
        print("Interrupted by user")
```

Support

If you have any questions or need assistance, please contact our support team.