API Streaming

Learn how to stream predictions back to your front end


Streaming in Tailwinds delivers tokens to your application in real time as they are generated, enhancing the responsiveness and user experience of your AI applications. This guide walks you through configuring and using API streaming with Tailwinds.

How Streaming Works

When streaming is enabled for a prediction request, Tailwinds sends tokens as server-sent events (SSE) as soon as they are generated, rather than waiting for the complete response. This approach provides a more dynamic and interactive experience for users.
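Streaming is enabled on a per-request basis by setting the streaming flag in the prediction request body:

{
  "question": "Hello world!",
  "streaming": true
}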

Configuring Streaming

Here's how you can implement streaming using Python's requests library:

import json
import requests

def stream_prediction(chatflow_id, question):
    url = f"https://your-tailwinds-instance.com/api/v1/predictions/{chatflow_id}"
    payload = {
        "question": question,
        "streaming": True
    }
    headers = {
        "Content-Type": "application/json"
    }

    with requests.post(url, json=payload, headers=headers, stream=True) as response:
        response.raise_for_status()
        current_event = None
        for line in response.iter_lines():
            if not line:
                continue  # blank lines separate SSE messages
            decoded_line = line.decode('utf-8')
            if decoded_line.startswith('event: '):
                # Remember which event the next data line belongs to
                current_event = decoded_line[len('event: '):].strip()
                if current_event == 'end':
                    print("\nStream ended")
                    break
            elif decoded_line.startswith('data: '):
                data = decoded_line[len('data: '):]
                if current_event == 'token':
                    print(data, end='', flush=True)
                elif current_event == 'metadata':
                    # chatId, messageId, etc. arrive after all tokens
                    metadata = json.loads(data)
                elif current_event == 'error':
                    print(f"\nError: {data}")
                    break

# Usage
stream_prediction("your-chatflow-id", "Hello world!")

Understanding the Event Stream

A prediction's event stream consists of the following event types:

| Event | Description |
| --- | --- |
| start | Indicates the start of streaming |
| token | Emitted when a new token is available |
| error | Emitted if an error occurs during prediction |
| end | Signals the end of the prediction stream |
| metadata | Contains chatId, messageId, etc. Sent after all tokens and before the end event |
| sourceDocuments | Emitted when the flow returns sources from a vector store |
| usedTools | Emitted when the flow uses tools during prediction |
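Putting these together, the full stream for a short prediction might look like the following (schematic; the token text and metadata values are illustrative, and the exact payloads of the start and end events depend on your flow):

event: start
data: ...

event: token
data: Once upon

event: token
data:  a time...

event: metadata
data: {"chatId": "...", "messageId": "..."}

event: end
data: ...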

Example of a Token Event

event: token
data: Once upon a time...

Best Practices

  1. Error Handling: Always implement proper error handling to manage potential issues during streaming.

  2. Buffering: Consider implementing a buffer on the client-side to smooth out the display of incoming tokens.

  3. Timeout Management: Set appropriate timeouts to handle cases where the stream might unexpectedly end.

  4. User Interface: Design your UI to gracefully handle incoming streamed data, providing a smooth experience for the end-user.
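As a concrete illustration of points 2 and 3, here is a minimal sketch that builds on the stream_prediction example above. The flush_interval and timeout values are illustrative choices, not Tailwinds requirements:

import time
import requests

def stream_with_buffer(chatflow_id, question, flush_interval=0.1):
    url = f"https://your-tailwinds-instance.com/api/v1/predictions/{chatflow_id}"
    payload = {"question": question, "streaming": True}
    headers = {"Content-Type": "application/json"}

    # timeout=(connect, read): fail fast if the server is unreachable,
    # and give up if the stream stalls for more than 60 seconds.
    with requests.post(url, json=payload, headers=headers,
                       stream=True, timeout=(5, 60)) as response:
        response.raise_for_status()
        buffer = []
        last_flush = time.monotonic()
        current_event = None
        for line in response.iter_lines():
            if not line:
                continue
            decoded = line.decode('utf-8')
            if decoded.startswith('event: '):
                current_event = decoded[len('event: '):].strip()
                if current_event == 'end':
                    break
            elif decoded.startswith('data: ') and current_event == 'token':
                buffer.append(decoded[len('data: '):])
                # Flush at a fixed interval rather than once per token,
                # which smooths the display of very fast streams.
                if time.monotonic() - last_flush >= flush_interval:
                    print(''.join(buffer), end='', flush=True)
                    buffer.clear()
                    last_flush = time.monotonic()
        if buffer:
            print(''.join(buffer), end='', flush=True)  # flush any remainder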
