Streaming in Tailwinds delivers tokens in real time as they are generated, making your AI applications feel more responsive and interactive. This guide walks you through configuring and using API streaming with Tailwinds.
How Streaming Works
When streaming is enabled for a prediction request, Tailwinds sends tokens as data-only server-sent events as soon as they are generated. This approach provides a more dynamic and interactive experience for users.
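As an illustration, a streamed response might look roughly like the following on the wire. The exact token payload shape shown here is an assumption based on the parsing logic later in this guide; consult your instance's API reference for the authoritative format:

```
data: {"token": "Hello"}

data: {"token": ", how"}

data: {"token": " can I help?"}

event: end
```

Each `data:` event carries a JSON payload containing the next token, and a terminal event signals that the stream is complete.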
Configuring Streaming
Here's how you can implement streaming using Python's requests library:
import json
import requests

def stream_prediction(chatflow_id, question):
    url = f"https://your-tailwinds-instance.com/api/v1/predictions/{chatflow_id}"
    payload = {
        "question": question,
        "streaming": True
    }
    headers = {
        "Content-Type": "application/json"
    }
    # stream=True keeps the connection open so tokens can be read as they arrive
    with requests.post(url, json=payload, headers=headers, stream=True) as response:
        response.raise_for_status()
        for line in response.iter_lines():
            if line:
                decoded_line = line.decode('utf-8')
                if decoded_line.startswith('data: '):
                    # Strip the "data: " prefix and parse the JSON payload
                    data = json.loads(decoded_line[6:])
                    if isinstance(data, dict) and 'token' in data:
                        print(data['token'], end='', flush=True)
                elif decoded_line.startswith('event: error'):
                    print(f"\nError: {decoded_line}")
                    break
                elif decoded_line == 'event: end':
                    print("\nStream ended")
                    break

# Usage
stream_prediction("your-chatflow-id", "Hello world!")
To enable streaming with cURL, set the streaming parameter to true in your JSON payload. Here's an example:
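A minimal sketch, reusing the placeholder instance URL and chatflow ID from the Python example above. The `-N` (`--no-buffer`) flag disables cURL's output buffering so tokens print as soon as they arrive:

```
curl -N https://your-tailwinds-instance.com/api/v1/predictions/your-chatflow-id \
  -H "Content-Type: application/json" \
  -d '{"question": "Hello world!", "streaming": true}'
```

Passing `-d` makes this a POST request, and the server-sent events are written to stdout as they are received.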