Chat API

Stream conversational responses with tool calling support via the Chat API.

Overview

The Chat API enables streaming conversations with agents that support:

  • Real-time streaming: NDJSON stream of delta updates
  • Tool calling: Agents can invoke external tools
  • Reasoning traces: See the agent’s thought process
  • Token usage tracking: Monitor costs in real time

Stream Chat

Send a message and stream the response.

POST /api/chat

Request Body

{
  "sessionId": "session_abc123",
  "message": "What's the weather in San Francisco?",
  "params": {
    "temperature": 0.7,
    "maxTokens": 1000
  }
}

Field      Type    Required  Description
sessionId  string  Yes       Active session ID
message    string  Yes       User message to send
params     object  No        Optional agent parameters
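
As a rough illustration of the field table above, the request body could be assembled like this. This is a minimal sketch: `build_chat_request` is a hypothetical helper, not part of the API, and the client-side checks for empty values are an assumption about what "required" means in practice.

```python
import json
from typing import Optional


def build_chat_request(session_id: str, message: str,
                       params: Optional[dict] = None) -> dict:
    """Build a body for POST /api/chat (hypothetical helper).

    'params' is optional, so it is left out entirely when not supplied.
    """
    if not session_id:
        raise ValueError("sessionId is required")
    if not message:
        raise ValueError("message is required")
    body = {"sessionId": session_id, "message": message}
    if params is not None:
        body["params"] = params
    return body


if __name__ == "__main__":
    payload = build_chat_request(
        "session_abc123",
        "What's the weather in San Francisco?",
        {"temperature": 0.7, "maxTokens": 1000},
    )
    print(json.dumps(payload, indent=2))
```

Omitting `params` rather than sending an empty object keeps the request identical to the minimal form used in the cURL example.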

Response Format

The response is an NDJSON (Newline Delimited JSON) stream. Each line is a separate JSON event with a type field that identifies one of the event types listed below.

Event Types

1. assistant_delta - Streamed response chunks

{
  "type": "assistant_delta",
  "delta": "The weather in San Francisco",
  "timestamp": "2024-01-15T10:30:00.000Z"
}

2. tool_call - Agent invokes a tool

{
  "type": "tool_call",
  "toolCall": {
    "id": "call_abc123",
    "name": "get_weather",
    "arguments": {
      "location": "San Francisco, CA"
    }
  },
  "timestamp": "2024-01-15T10:30:01.000Z"
}

3. tool_result - Tool execution result

{
  "type": "tool_result",
  "toolResult": {
    "id": "call_abc123",
    "output": {
      "temperature": 68,
      "conditions": "sunny",
      "humidity": 65
    }
  },
  "timestamp": "2024-01-15T10:30:02.000Z"
}

4. reasoning_delta - Agent’s reasoning (if enabled)

{
  "type": "reasoning_delta",
  "delta": "I need to fetch the current weather data...",
  "timestamp": "2024-01-15T10:30:00.500Z"
}

5. done - Stream completion with metrics

{
  "type": "done",
  "usage": {
    "promptTokens": 150,
    "completionTokens": 85,
    "totalTokens": 235
  },
  "costUsd": "0.0024",
  "timestamp": "2024-01-15T10:30:05.000Z"
}

6. error - Error during streaming

{
  "type": "error",
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Too many requests"
  },
  "timestamp": "2024-01-15T10:30:00.000Z"
}
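
The six event types can be folded into a single per-turn record. The sketch below is one possible way to do that; `ChatTurn` and `apply_event` are hypothetical names, and pairing a tool_result with its tool_call through the shared id is an assumption based on the sample events above, which both use call_abc123.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChatTurn:
    """Accumulated state for one streamed response (hypothetical structure)."""
    text: str = ""
    reasoning: str = ""
    tool_calls: dict = field(default_factory=dict)  # call id -> name/arguments/output
    usage: Optional[dict] = None
    cost_usd: Optional[str] = None
    error: Optional[dict] = None


def apply_event(turn: ChatTurn, event: dict) -> ChatTurn:
    """Update the turn in place from one parsed NDJSON event."""
    kind = event.get("type")
    if kind == "assistant_delta":
        turn.text += event["delta"]
    elif kind == "reasoning_delta":
        turn.reasoning += event["delta"]
    elif kind == "tool_call":
        call = event["toolCall"]
        turn.tool_calls[call["id"]] = {
            "name": call["name"],
            "arguments": call["arguments"],
            "output": None,
        }
    elif kind == "tool_result":
        result = event["toolResult"]
        # Assumption: the result id equals the id of the call it answers.
        entry = turn.tool_calls.setdefault(
            result["id"], {"name": None, "arguments": None, "output": None}
        )
        entry["output"] = result["output"]
    elif kind == "done":
        turn.usage = event.get("usage")
        turn.cost_usd = event.get("costUsd")
    elif kind == "error":
        turn.error = event["error"]
    # Unknown event types are ignored so newer servers do not break the client.
    return turn
```

Calling `apply_event` once per parsed line leaves the full reply in `turn.text` and the final metrics in `turn.usage` and `turn.cost_usd` after the done event arrives.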

Example Requests

cURL (Basic)

curl -X POST https://agents.zazmic.com/api/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "sessionId": "session_abc123",
    "message": "Hello, agent!"
  }'

JavaScript (Browser)

const API_KEY = 'YOUR_API_KEY'; // Placeholder: replace with your key

async function streamChat(sessionId, message) {
  const response = await fetch('https://agents.zazmic.com/api/chat', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ sessionId, message })
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() || ''; // Keep incomplete line in buffer

    for (const line of lines) {
      if (line.trim()) {
        const event = JSON.parse(line);
        handleEvent(event);
      }
    }
  }

  // Process a final line that arrived without a trailing newline
  if (buffer.trim()) {
    handleEvent(JSON.parse(buffer));
  }
}

function handleEvent(event) {
  switch (event.type) {
    case 'assistant_delta':
      console.log('Assistant:', event.delta);
      break;
    case 'tool_call':
      console.log('Tool called:', event.toolCall.name);
      break;
    case 'tool_result':
      console.log('Tool result:', event.toolResult.output);
      break;
    case 'done':
      console.log('Cost:', event.costUsd);
      break;
    case 'error':
      console.error('Error:', event.error.message);
      break;
  }
}

// Usage
await streamChat('session_abc123', 'What is 2+2?');

Python

import json

import requests

API_KEY = "YOUR_API_KEY"  # Placeholder: replace with your key


def stream_chat(session_id, message):
    response = requests.post(
        "https://agents.zazmic.com/api/chat",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json"
        },
        json={
            "sessionId": session_id,
            "message": message
        },
        stream=True
    )

    # iter_lines() yields one complete NDJSON line at a time
    for line in response.iter_lines():
        if line:
            event = json.loads(line.decode('utf-8'))
            handle_event(event)


def handle_event(event):
    event_type = event['type']

    if event_type == 'assistant_delta':
        print(event['delta'], end='', flush=True)
    elif event_type == 'tool_call':
        print(f"\n[Tool: {event['toolCall']['name']}]")
    elif event_type == 'tool_result':
        print(f"[Result: {event['toolResult']['output']}]")
    elif event_type == 'done':
        print(f"\n\nCost: ${event['costUsd']}")
    elif event_type == 'error':
        print(f"\nError: {event['error']['message']}")


# Usage
stream_chat('session_abc123', 'What is 2+2?')

Complete Chat Example

Here’s a full example showing session creation and chat:

// Runs under Node.js 18+ (global fetch); process.stdout is Node-only
const BASE_URL = 'https://agents.zazmic.com';
const API_KEY = 'YOUR_API_KEY'; // Placeholder: replace with your key

async function completeChat() {
  // 1. Create session
  const sessionResponse = await fetch(`${BASE_URL}/api/sessions`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ agentId: 'agent_123' })
  });

  const { sessionId } = await sessionResponse.json();
  console.log('Session created:', sessionId);

  // 2. Send message and stream response
  let fullResponse = '';

  const chatResponse = await fetch(`${BASE_URL}/api/chat`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      sessionId,
      message: 'Explain quantum computing'
    })
  });

  const reader = chatResponse.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() || ''; // Keep incomplete line in buffer

    for (const line of lines) {
      if (line.trim()) {
        const event = JSON.parse(line);

        if (event.type === 'assistant_delta') {
          fullResponse += event.delta;
          process.stdout.write(event.delta);
        } else if (event.type === 'done') {
          console.log(`\n\nTokens: ${event.usage.totalTokens}`);
          console.log(`Cost: $${event.costUsd}`);
        }
      }
    }
  }

  return { sessionId, response: fullResponse };
}

completeChat();

Billing

Chat requests use component-based pricing:

  • Input tokens: Charged per 1K tokens
  • Output tokens: Charged per 1K tokens
  • Tool calls: Additional charge per invocation
  • Reasoning tokens: Charged separately (if enabled)

Costs are calculated in real time and returned in the costUsd field of the done event, alongside the token counts in usage.
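
Because costUsd arrives as a string ("0.0024" in the sample done event), summing it across requests is safer with decimal arithmetic than with floats. The following is a minimal sketch; `summarize_spend` is a hypothetical helper that only reads fields shown in the done event.

```python
from decimal import Decimal


def summarize_spend(done_events: list) -> dict:
    """Total the cost and token counts over a list of 'done' events."""
    total_cost = Decimal("0")
    total_tokens = 0
    for event in done_events:
        if event.get("type") != "done":
            continue  # Ignore anything that is not a completion event
        # Parse the string directly so no float rounding is introduced.
        total_cost += Decimal(event["costUsd"])
        total_tokens += event["usage"]["totalTokens"]
    return {"costUsd": total_cost, "totalTokens": total_tokens}


if __name__ == "__main__":
    history = [
        {"type": "done", "costUsd": "0.0024",
         "usage": {"promptTokens": 150, "completionTokens": 85, "totalTokens": 235}},
        {"type": "done", "costUsd": "0.0011",
         "usage": {"promptTokens": 60, "completionTokens": 40, "totalTokens": 100}},
    ]
    print(summarize_spend(history))
```

The second entry in `history` is made-up sample data used only to show the accumulation.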

Error Responses

Session Not Found

{
  "type": "error",
  "error": {
    "code": "SESSION_NOT_FOUND",
    "message": "Session not found or expired"
  }
}

Rate Limit Exceeded

{
  "type": "error",
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "You have exceeded the rate limit of 10 requests per minute"
  }
}

Insufficient Balance

{
  "type": "error",
  "error": {
    "code": "INSUFFICIENT_BALANCE",
    "message": "Insufficient balance to complete this request"
  }
}
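
A client may want to react differently to each of the three codes above. The sketch below shows one possible policy; the mapping from code to action, the `Action` names, and the backoff parameters are all assumptions for illustration, not documented server behavior.

```python
from enum import Enum


class Action(Enum):
    RETRY_LATER = "retry_later"   # Wait, then resend the same request
    NEW_SESSION = "new_session"   # Create a fresh session, then resend
    STOP = "stop"                 # Surface the error to the user


# Assumed policy: only the three codes documented above are recognized.
_ACTIONS = {
    "RATE_LIMIT_EXCEEDED": Action.RETRY_LATER,
    "SESSION_NOT_FOUND": Action.NEW_SESSION,
    "INSUFFICIENT_BALANCE": Action.STOP,
}


def action_for(event: dict) -> Action:
    """Pick a recovery action for an 'error' event; unknown codes stop."""
    code = event.get("error", {}).get("code")
    return _ACTIONS.get(code, Action.STOP)


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff schedule in seconds, capped at 'cap'."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


if __name__ == "__main__":
    err = {"type": "error",
           "error": {"code": "RATE_LIMIT_EXCEEDED", "message": "Too many requests"}}
    print(action_for(err), backoff_delays(4))
```

Defaulting unknown codes to `Action.STOP` avoids retry loops on errors the client does not understand.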

Best Practices

Buffer Management

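Network chunks do not align with line boundaries, so a chunk can end in the middle of a JSON line. Always maintain a buffer for incomplete lines:

const decoder = new TextDecoder();
let buffer = '';

// `stream` is an async iterable of byte chunks, so it needs `for await`
for await (const chunk of stream) {
  buffer += decoder.decode(chunk, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || ''; // Keep incomplete line

  for (const line of lines) {
    if (line.trim()) {
      processLine(line);
    }
  }
}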
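
The same buffering idea can be sketched in Python for clients that read raw byte chunks instead of using iter_lines(). `NDJSONBuffer` is a hypothetical helper; it uses the standard library's incremental UTF-8 decoder so that a multi-byte character split across two chunks is still decoded correctly.

```python
import codecs
import json


class NDJSONBuffer:
    """Turn arbitrary byte chunks into parsed NDJSON events."""

    def __init__(self):
        # The incremental decoder holds back partial multi-byte sequences.
        self._decoder = codecs.getincrementaldecoder("utf-8")()
        self._pending = ""

    def feed(self, chunk: bytes) -> list:
        """Add a chunk and return every event whose line is now complete."""
        self._pending += self._decoder.decode(chunk)
        # Everything before the last newline is complete; the rest is kept.
        *complete, self._pending = self._pending.split("\n")
        return [json.loads(line) for line in complete if line.strip()]

    def flush(self) -> list:
        """Call at end of stream to parse a final line with no newline."""
        tail = self._pending + self._decoder.decode(b"", final=True)
        self._pending = ""
        return [json.loads(tail)] if tail.strip() else []


if __name__ == "__main__":
    buf = NDJSONBuffer()
    print(buf.feed(b'{"type": "assistant_delta", "del'))
    print(buf.feed(b'ta": "Hi"}\n{"type": "done"}'))
    print(buf.flush())
```

With the requests library, chunks could come from `response.iter_content(chunk_size=None)`, feeding each one to `feed()` and calling `flush()` once the loop ends.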

Error Handling

import json

import requests


def stream_chat_safe(session_id, message):
    try:
        response = requests.post(
            "https://agents.zazmic.com/api/chat",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"sessionId": session_id, "message": message},
            stream=True,
            # Applies to connecting and to each read, not to the whole stream
            timeout=60
        )

        # Raise for HTTP-level failures (4xx/5xx) before reading the stream
        response.raise_for_status()

        for line in response.iter_lines():
            if line:
                event = json.loads(line)

                # Errors can also arrive mid-stream as an event
                if event['type'] == 'error':
                    raise Exception(event['error']['message'])

                handle_event(event)

    except requests.exceptions.Timeout:
        print("Request timed out")
    except requests.exceptions.HTTPError as e:
        print(f"HTTP error: {e}")
    except Exception as e:
        print(f"Error: {e}")

UI Integration

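// React component example
import { useState } from 'react';

function ChatComponent({ sessionId }) {
  const [messages, setMessages] = useState([]);
  const [streaming, setStreaming] = useState(false);

  async function sendMessage(message) {
    setStreaming(true);
    let currentMessage = '';

    // Append the user's message and an empty assistant placeholder, so the
    // update below replaces the placeholder instead of an earlier message
    setMessages(prev => [
      ...prev,
      { role: 'user', content: message },
      { role: 'assistant', content: '' }
    ]);

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { /* ... */ },
        body: JSON.stringify({ sessionId, message })
      });

      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // Buffer partial lines; a chunk can end mid-line
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop() || '';

        for (const line of lines) {
          if (line.trim()) {
            const event = JSON.parse(line);

            if (event.type === 'assistant_delta') {
              currentMessage += event.delta;
              const content = currentMessage; // Snapshot for the state updater
              setMessages(prev => [
                ...prev.slice(0, -1),
                { role: 'assistant', content }
              ]);
            }
          }
        }
      }
    } finally {
      // Reset the flag even if the request or parsing throws
      setStreaming(false);
    }
  }

  return (/* UI JSX */);
}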

Next Steps