Stream a chat response
The response is sent as text/event-stream (Server-Sent Events). Each frame contains
an event field (the event type) and a data field (a JSON payload). Consume text_delta
events to render the assistant's reply progressively.
Event types in order:
stream_start: stream opened; includes model and provider info
step_start: reasoning step started
text_start: text block started
text_delta: incremental text chunk; read data.delta for the token
text_complete: text block finished
step_finish: reasoning step finished
stream_end: stream closed; includes finish_reason and token usage
Example stream:
event: stream_start
data: {"id":"evt_01KPTDR1BWRTJRY7W56MBCGF0R","timestamp":1776855811,"model":"grok-4-1-fast-non-reasoning","provider":"xai","metadata":null}
event: step_start
data: {"id":"evt_01KPTDR1BWRTJRY7W56MBCGF0S","timestamp":1776855811}
event: text_start
data: {"id":"evt_01KPTDR1BWRTJRY7W56MBCGF0T","timestamp":1776855811,"message_id":"evt_abc"}
event: text_delta
data: {"id":"evt_01KPTDR1BXRXHBK6DFSZ3W32H6","timestamp":1776855811,"delta":"Hello","message_id":"evt_def"}
event: text_delta
data: {"id":"evt_01KPTDR1CXK80DJ647VEF68ATW","timestamp":1776855811,"delta":"!","message_id":"evt_klm"}
event: text_complete
data: {"id":"evt_01KPTDR1Q0Z330AKTGKFXYRQ7G","timestamp":1776855811,"message_id":"evt_xyz"}
event: step_finish
data: {"id":"evt_01KPTDR1Q1PD7FG8F9DS2GCAVJ","timestamp":1776855811}
event: stream_end
data: {"id":"evt_01KPTDR1Q1PD7FG8F9DS2GCAVK","timestamp":1776855811,"finish_reason":"Stop","usage":{"prompt_tokens":12,"completion_tokens":8,"cache_write_input_tokens":null,"cache_read_input_tokens":null,"thought_tokens":null},"citations":null}
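The frames above can be parsed with a few lines of Python. The following is a minimal sketch, not a reference client: the helper name iter_sse_events is hypothetical, and it assumes every frame carries exactly one single-line data field, as in the example stream. General SSE allows multi-line data fields, which this sketch does not handle.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_type, payload) pairs from SSE text lines.

    Assumption: one `event:` line followed by one single-line `data:` line
    per frame, matching the example stream in this section.
    """
    event_type = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:") and event_type is not None:
            payload = json.loads(line[len("data:"):].strip())
            yield event_type, payload
            event_type = None
        # Blank separator lines and any other fields are ignored.


# Three frames copied from the example stream above.
sample = [
    'event: text_delta',
    'data: {"id":"evt_01KPTDR1BXRXHBK6DFSZ3W32H6","timestamp":1776855811,"delta":"Hello","message_id":"evt_def"}',
    'event: text_delta',
    'data: {"id":"evt_01KPTDR1CXK80DJ647VEF68ATW","timestamp":1776855811,"delta":"!","message_id":"evt_klm"}',
    'event: text_complete',
    'data: {"id":"evt_01KPTDR1Q0Z330AKTGKFXYRQ7G","timestamp":1776855811,"message_id":"evt_xyz"}',
]

# Concatenate the delta of every text_delta event to rebuild the reply.
reply = "".join(
    data["delta"]
    for event, data in iter_sse_events(sample)
    if event == "text_delta"
)
print(reply)  # Hello!
```

Keeping the parser separate from the rendering logic means the same generator can be fed from a file, a test fixture, or a live HTTP response decoded line by line.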
Note: Use curl -N (no-buffer) or an EventSource client to see tokens arrive in real time. Postman buffers SSE until the connection closes.
Authorizations
The access token received from the authorization server in the OAuth 2.0 flow.
Headers
Tenant identifier. Send the Tenant ID in the X-Tenant header to scope API requests to a specific tenant.
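As an illustration of how the access token and the X-Tenant header might be attached to a request, here is a hedged sketch using only the Python standard library. Several details are assumptions that this page does not confirm: the Bearer scheme in the Authorization header, the POST method, the endpoint URL, and the shape of the JSON body are all placeholders. The helper name build_stream_request is hypothetical.

```python
import json
import urllib.request


def build_stream_request(
    url: str, access_token: str, tenant_id: str, body: dict
) -> urllib.request.Request:
    """Build a request that asks for an SSE response scoped to one tenant."""
    headers = {
        # Assumption: the OAuth 2.0 access token is sent as a Bearer token.
        "Authorization": f"Bearer {access_token}",
        # Tenant ID, as described in the Headers section.
        "X-Tenant": tenant_id,
        "Accept": "text/event-stream",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",  # assumption: the endpoint accepts POST with a JSON body
    )


request = build_stream_request(
    "https://api.example.com/chat/stream",  # hypothetical URL
    "ACCESS_TOKEN",
    "TENANT_ID",
    {"message": "Hi"},  # hypothetical body; see the Body section for the real schema
)

# Sending it would look roughly like this (not executed here):
# with urllib.request.urlopen(request) as response:
#     for raw_line in response:          # lines arrive as the server emits them
#         print(raw_line.decode("utf-8"), end="")
```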
Body
Response
SSE stream. Listen for text_delta events and read data.delta to render the response progressively.
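Progressive rendering can be sketched as a small loop over already-parsed events. This is one possible shape under stated assumptions, not the API's official client: render_stream and its on_text callback are hypothetical names, and the input is assumed to be an iterable of (event_type, payload) pairs. The sample payloads below are trimmed versions of the example stream.

```python
from typing import Callable, Iterable, Tuple


def render_stream(
    events: Iterable[Tuple[str, dict]], on_text: Callable[[str], None]
) -> dict:
    """Pass each text_delta chunk to on_text and summarize the stream."""
    text_parts = []
    summary = {"text": "", "finish_reason": None, "usage": None}
    for event_type, data in events:
        if event_type == "text_delta":
            chunk = data.get("delta", "")
            text_parts.append(chunk)
            on_text(chunk)  # render immediately, before the stream ends
        elif event_type == "stream_end":
            summary["finish_reason"] = data.get("finish_reason")
            summary["usage"] = data.get("usage")
            break  # stream_end is the last event in the documented order
        # Other event types (stream_start, step_start, ...) are skipped here.
    summary["text"] = "".join(text_parts)
    return summary


# Payloads trimmed from the example stream for brevity.
events = [
    ("stream_start", {"model": "grok-4-1-fast-non-reasoning", "provider": "xai"}),
    ("text_delta", {"delta": "Hello"}),
    ("text_delta", {"delta": "!"}),
    ("stream_end", {"finish_reason": "Stop",
                    "usage": {"prompt_tokens": 12, "completion_tokens": 8}}),
]

shown = []  # stands in for a UI; collects chunks in arrival order
result = render_stream(events, shown.append)
print(result["text"], result["finish_reason"])  # Hello! Stop
```

In a terminal client, the callback could be `lambda chunk: print(chunk, end="", flush=True)` so each token appears as soon as it is received.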