The `stream()` method is used to make streaming chat completion requests to the Edgee AI Gateway. It returns an `AsyncGenerator<StreamChunk>` that yields response chunks as they arrive from the API.
Arguments
The `stream()` method accepts two arguments:
| Parameter | Type | Description |
|---|---|---|
| `model` | `string` | The model identifier to use (e.g., `"openai/gpt-4o"`) |
| `input` | `string \| InputObject` | The input for the completion. Can be a simple string or a structured `InputObject` |
Input Types
String Input
When `input` is a string, it’s automatically converted to a user message:
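A sketch of this conversion (illustrative only, not the SDK’s actual internals), using the `Message` and `InputObject` shapes described below:

```typescript
// Illustrative sketch of the string-to-message normalization.
type Message = { role: string; content: string };
type InputObject = { messages: Message[] };

function normalizeInput(input: string | InputObject): InputObject {
  // A plain string becomes a single user message.
  return typeof input === "string"
    ? { messages: [{ role: "user", content: input }] }
    : input;
}
```

So passing `"Hello"` is equivalent to passing `{ messages: [{ role: "user", content: "Hello" }] }`.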
InputObject
When `input` is an `InputObject`, you have full control over the conversation:
| Property | Type | Description |
|---|---|---|
| `messages` | `Message[]` | Array of conversation messages |
| `tools` | `Tool[]` | Array of function tools available to the model |
| `tool_choice` | `ToolChoice` | Controls which tool (if any) the model should call. See the Tools documentation for details |
For details about the `Message` type, see the Send Method documentation.
For details about Tool and ToolChoice types, see the Tools documentation.
Example - Streaming with Messages:
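The following sketch is self-contained: `mockStream` stands in for the real call, which would be `client.stream("openai/gpt-4o", input)` on an initialized client. The message array and accumulation loop are the parts that carry over to real use.

```typescript
// Illustrative: mockStream stands in for client.stream("openai/gpt-4o", input).
type StreamChunk = { text: string | null; finishReason?: string | null };

const input = {
  messages: [
    { role: "system", content: "You are a concise assistant." },
    { role: "user", content: "Name the capital of France." },
  ],
};

async function* mockStream(_input: typeof input): AsyncGenerator<StreamChunk> {
  yield { text: "The capital " };
  yield { text: "is Paris." };
  yield { text: null, finishReason: "stop" };  // final, metadata-only chunk
}

async function run(): Promise<string> {
  let full = "";
  for await (const chunk of mockStream(input)) {
    if (chunk.text) full += chunk.text;  // accumulate incremental content
  }
  return full;
}
```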
Return Value
The `stream()` method returns an `AsyncGenerator<StreamChunk>`. Each chunk contains incremental updates to the response.
StreamChunk Object
Each chunk yielded by the generator has the following structure:

| Property | Type | Description |
|---|---|---|
| `choices` | `StreamChoice[]` | Array of streaming choices (typically one) |
StreamChoice Object
Each choice in the `choices` array contains:
| Property | Type | Description |
|---|---|---|
| `index` | `number` | The index of this choice in the array |
| `delta` | `StreamDelta` | The incremental update to the message |
| `finish_reason` | `string \| null \| undefined` | Reason why the generation stopped. Only present in the final chunk. Possible values: `"stop"`, `"length"`, `"tool_calls"`, `"content_filter"`, or `null` |
StreamDelta Object
The `delta` object contains incremental updates:
| Property | Type | Description |
|---|---|---|
| `role` | `string \| undefined` | The role of the message (typically `"assistant"`). Only present in the first chunk |
| `content` | `string \| undefined` | Incremental text content. Each chunk contains a portion of the full response |
| `tool_calls` | `ToolCall[] \| undefined` | Array of tool calls (if any). See the Tools documentation for details |
Convenience Properties
The `StreamChunk` class provides convenience getters for easier access:
| Property | Type | Description |
|---|---|---|
| `text` | `string \| null` | Shortcut to `choices[0].delta.content` - the incremental text content |
| `role` | `string \| null` | Shortcut to `choices[0].delta.role` - the message role (first chunk only) |
| `finishReason` | `string \| null` | Shortcut to `choices[0].finish_reason` - the finish reason (final chunk only) |
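A sketch of how these getters map onto the raw chunk structure (illustrative, not the SDK’s actual implementation):

```typescript
// Illustrative model of the convenience getters over the raw structure.
type StreamDelta = { role?: string; content?: string };
type StreamChoice = { index: number; delta: StreamDelta; finish_reason?: string | null };

class StreamChunk {
  constructor(public choices: StreamChoice[]) {}

  get text(): string | null {
    return this.choices[0]?.delta.content ?? null;
  }
  get role(): string | null {
    return this.choices[0]?.delta.role ?? null;
  }
  get finishReason(): string | null {
    return this.choices[0]?.finish_reason ?? null;
  }
}
```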
Understanding Streaming Behavior
Chunk Structure
- First chunk: Contains `role` (typically `"assistant"`) and may contain initial `content`
- Content chunks: Contain incremental `content` updates
- Final chunk: Contains `finish_reason` indicating why generation stopped
Finish Reasons
| Value | Description |
|---|---|
| `"stop"` | Model generated a complete response and stopped naturally |
| `"length"` | Response was cut off due to the token limit |
| `"tool_calls"` | Model requested tool/function calls |
| `"content_filter"` | Content was filtered by safety systems |
| `null` | Generation is still in progress (not the final chunk) |
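A simple way to branch on the final chunk’s finish reason (the handler and its messages are illustrative):

```typescript
// Illustrative: map each finish reason to a human-readable outcome.
function describeFinish(reason: string | null): string {
  switch (reason) {
    case "stop": return "completed normally";
    case "length": return "truncated at the token limit";
    case "tool_calls": return "model requested tool calls";
    case "content_filter": return "content was filtered";
    default: return "still in progress";  // null or undefined
  }
}
```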
Empty Chunks
Some chunks may not contain `content`. This is normal and can happen when:
- The chunk only contains metadata (role, finish_reason)
- The chunk is part of tool call processing
- Network buffering creates empty chunks
Always check `chunk.text` before using it:
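A minimal guard, sketched with a reduced chunk shape for illustration:

```typescript
// Illustrative: only append when the chunk actually carries text.
type Chunk = { text: string | null };

function appendText(buffer: string, chunk: Chunk): string {
  // Metadata-only or empty chunks leave the buffer unchanged.
  return chunk.text ? buffer + chunk.text : buffer;
}
```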
Error Handling
The `stream()` method can throw errors:
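Because errors can surface mid-iteration, wrap the `for await` loop itself in `try`/`catch`; content received before the failure is still available. A self-contained sketch, where `fakeStream` stands in for `client.stream()` and fails partway through:

```typescript
// Illustrative: fakeStream stands in for client.stream() and fails mid-stream.
async function* fakeStream(): AsyncGenerator<{ text: string | null }> {
  yield { text: "partial " };
  throw new Error("network interrupted");
}

async function consume(): Promise<{ text: string; error: string | null }> {
  let text = "";
  try {
    for await (const chunk of fakeStream()) {
      if (chunk.text) text += chunk.text;
    }
    return { text, error: null };
  } catch (err) {
    // Text accumulated before the failure is preserved.
    return { text, error: (err as Error).message };
  }
}
```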