Assistants
Solution Overview
OpenAI Assistants provide a powerful business solution for integrating AI-driven interactions into applications and workflows. By defining specific roles, instructions, and tools, businesses can create tailored AI assistants capable of handling customer support, data analysis, content generation, and more. With structured thread-based conversations and API-driven automation, Assistants streamline complex processes, enhance user engagement, and improve operational efficiency. Whether used for internal productivity or customer-facing interactions, they offer a scalable way to incorporate advanced AI capabilities into any business environment.
Main Entities in Assistants' Workflow
Here is a brief explanation of how Messages, Threads, Assistants, Runs, and Tools fit together in the OpenAI Assistants workflow:
A Message is a single exchange in a conversation. It can be something a user sends to the Assistant (like a question or command) or something the Assistant replies with. Think of it as one text bubble in a chat.
A Thread is a collection of Messages that belong to the same conversation. It keeps all the back-and-forth exchanges together, so the Assistant can remember what was said earlier in that conversation. Each time a user starts a new discussion, a new Thread is created. There is no limit to the number of Messages you can store in a Thread. When the Messages exceed the model's context window, the Thread first truncates messages and then drops the ones it considers least important.
An Assistant is the AI itself—its personality, skills, and behavior. When you set up an Assistant, you define what it knows, how it should respond, and whether it can use extra Tools (like APIs or file uploads). One Assistant can be used across multiple Threads and users.
A Run is the process that executes the Assistant's response within a Thread. When a user sends a Message, a Run is created to generate the Assistant's reply based on the conversation history and its predefined behavior. Runs manage the execution flow, including any tool calls the Assistant might need to make. A Run goes through different statuses, such as queued, in_progress, completed, or failed, indicating its current state. Each Thread can have multiple Runs, ensuring smooth and structured interactions between the user and the Assistant.
A Tool is an additional capability that an Assistant can use to enhance its responses. Tools allow the Assistant to perform external actions, such as calling APIs, retrieving files, or running custom functions. (You may have already encountered this concept by using Function Calling when calling text models without creating Assistants.) When a Tool is enabled, the Assistant can decide when to use it based on the conversation context. If a Tool is required during a Run, the process pauses until the necessary output is provided. This makes Tools essential for handling complex tasks that go beyond simple text-based interactions. Currently, three Tool options are available:
Code Interpreter,
File Search,
Function Calling, which can call your custom functions.
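To make the Function Calling option more concrete, the sketch below shows how a custom function is typically described to an Assistant: a name, a description, and a JSON Schema for its parameters. The make_function_tool helper, the get_weather function, and its city parameter are hypothetical names chosen for illustration; they do not come from this guide.

```python
# Hypothetical example: get_weather and its "city" parameter are illustrative only.
def make_function_tool(name, description, properties, required):
    """Build a Function Calling entry for the `tools` list of an Assistant."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


weather_tool = make_function_tool(
    name="get_weather",
    description="Return the current weather for a city.",
    properties={"city": {"type": "string", "description": "City name, e.g. Paris"}},
    required=["city"],
)

# Built-in tools need only their type; custom functions also carry a schema.
tools = [{"type": "code_interpreter"}, {"type": "file_search"}, weather_tool]
```

A list like this could be passed as the tools argument when the Assistant is created.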
How to Use the Assistants API
Create an Assistant. To set up an Assistant, choose the AI model that will generate its responses. The selected model determines the Assistant's capabilities and response quality. Define the Assistant's role and behavior by providing instructions, which guide how it interacts with users. You can also attach files so the Assistant can draw on specific materials; this supplies reference content and does not retrain the model. Enabling tools like Code Interpreter, File Search, and Function Calling can further extend its functionality. The method returns the created Assistant object, whose unique ID you use later to link the Assistant with Threads and Runs.
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Write and run code to answer math questions.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4o",
)
Create a Thread where the interaction between the created Assistant and the user (a human) will take place. The method returns the created Thread object, whose ID you pass to later calls.
thread = client.beta.threads.create()
Create the first user message, either directly in the code or by passing it from a form. This message will be the first in your Thread. You must specify role: 'user' and the corresponding thread_id.
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="I need to solve the equation `3x + 11 = 14`. Can you help me?"
)
Create a Run to generate a response to the messages in the specified Thread using the specified Assistant. This is very similar to calling a model without the Assistants and Threads framework. If external Tool calls might be needed during processing, you must predefine the available Tools in the tools parameter and set tool_choice to 'auto'.
Note that Runs expire ten minutes after creation. Be sure to submit your Tool outputs before the 10-minute mark.
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="Please address the user as Jane Doe. The user has a premium account."
)
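When a Run needs the result of a Function Calling tool, it stops in the requires_action status and waits for your code to supply the outputs. The sketch below shows one way to answer such a Run, assuming the official openai Python SDK. The add function, the LOCAL_FUNCTIONS registry, and both helper names are hypothetical, and resolve_run is only defined here, not called.

```python
import json


# Hypothetical local function the Assistant may request; not part of this guide.
def add(a, b):
    return a + b


# Hypothetical registry mapping tool names to Python callables.
LOCAL_FUNCTIONS = {"add": add}


def build_tool_outputs(tool_calls, functions=LOCAL_FUNCTIONS):
    """Run each requested function and collect outputs in the shape the API expects."""
    outputs = []
    for call in tool_calls:
        # Arguments arrive as a JSON-encoded string
        args = json.loads(call.function.arguments)
        result = functions[call.function.name](**args)
        outputs.append({"tool_call_id": call.id, "output": json.dumps(result)})
    return outputs


def resolve_run(client, run, thread_id):
    """Submit tool outputs if the Run is waiting for them; otherwise return it unchanged."""
    if run.status != "requires_action":
        return run
    tool_calls = run.required_action.submit_tool_outputs.tool_calls
    return client.beta.threads.runs.submit_tool_outputs_and_poll(
        thread_id=thread_id,
        run_id=run.id,
        tool_outputs=build_tool_outputs(tool_calls),
    )
```

Calling resolve_run(client, run, thread.id) right after create_and_poll keeps the submission well inside the ten-minute window mentioned above.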
Example #1: Chat With Assistant (Without Streaming)
from openai import OpenAI

# Connect to OpenAI API
client = OpenAI(
    api_key="<YOUR_LAPLASAPI_KEY>",
    base_url="https://api.apilaplas.com/"
)

# Create an Assistant
my_assistant = client.beta.assistants.create(
    instructions="You are a helpful assistant.",
    name="AI Assistant",
    model="gpt-4o",  # Specify the model
)
assistant_id = my_assistant.id  # Store Assistant ID

thread = client.beta.threads.create()  # Create a new Thread
thread_id = thread.id  # Store the Thread ID
def send_message(user_message):
    """Send a message to the Assistant and print its full response."""
    if not user_message.strip():
        print("⚠️ Message cannot be empty!")
        return

    # Add the user's Message to the thread
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=user_message
    )

    # Start a new Run and wait for completion
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread_id,
        assistant_id=assistant_id,
        instructions="Keep responses concise and clear."
    )

    # Check if the Run was successful
    if run.status == "completed":
        # Retrieve messages from the thread, newest first
        messages = client.beta.threads.messages.list(thread_id=thread_id, order="desc")
        # The first Assistant Message in this ordering is the latest reply
        for message in messages.data:
            if message.role == "assistant":
                print()  # Add an empty line for spacing
                print(f"Assistant > {message.content[0].text.value}")
                return

    print("⚠️ Error: Failed to get a response from the Assistant.")
# Main chat loop
print("🤖 AI Assistant is ready! Type 'exit' to quit.")
while True:
    user_input = input("\nYou > ")
    if user_input.lower() in ["exit", "quit"]:
        print("👋 Chat session ended. See you next time!")
        break
    send_message(user_input)
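Example #1 reads message.content[0].text.value, which assumes the first content part of the reply is text. As a hedged alternative, the helper below joins every text part of a message and skips other part types, such as images. The name message_text is hypothetical; it is not an SDK function.

```python
def message_text(message):
    """Concatenate the text parts of a message, ignoring non-text parts."""
    parts = []
    for part in message.content:
        # Text parts have type "text" and carry their string in part.text.value
        if part.type == "text":
            parts.append(part.text.value)
    return "\n".join(parts)
```

In send_message, the print call could then use message_text(message) instead of indexing into the first content part.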
Example #2: Chat With Assistant (With Streaming)
from openai import OpenAI
from openai import AssistantEventHandler
from typing_extensions import override

# Connect to OpenAI API
client = OpenAI(
    api_key="<YOUR_LAPLASAPI_KEY>",
    base_url="https://api.apilaplas.com/"
)

# Create an assistant
my_assistant = client.beta.assistants.create(
    instructions="You are a helpful assistant.",
    name="AI Assistant",
    model="gpt-4o",  # Specify the model
)
assistant_id = my_assistant.id  # Store assistant ID

thread = client.beta.threads.create()  # Create a new thread
thread_id = thread.id  # Store the thread ID
# Event handler for streaming responses
class EventHandler(AssistantEventHandler):
    @override
    def on_text_created(self, text) -> None:
        print("\nAssistant >", end=" ", flush=True)

    @override
    def on_text_delta(self, delta, snapshot):
        print(delta.value, end="", flush=True)

    def on_tool_call_created(self, tool_call):
        print(f"\nAssistant > {tool_call.type}\n", flush=True)

    def on_tool_call_delta(self, delta, snapshot):
        if delta.type == 'code_interpreter':
            if delta.code_interpreter.input:
                print(delta.code_interpreter.input, end="", flush=True)
            if delta.code_interpreter.outputs:
                print("\n\noutput >", flush=True)
                for output in delta.code_interpreter.outputs:
                    if output.type == "logs":
                        print(f"\n{output.logs}", flush=True)
def send_message(user_message):
    """Send a message to the Assistant and display the response"""
    if not user_message.strip():
        print("⚠️ Message cannot be empty!")
        return

    # Add the user's Message to the Thread
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=user_message
    )

    # Start processing the Message with streaming output
    with client.beta.threads.runs.stream(
        thread_id=thread_id,
        assistant_id=assistant_id,
        instructions="Keep responses concise and clear.",
        event_handler=EventHandler(),
    ) as stream:
        stream.until_done()
# Main chat loop
print("🤖 AI Assistant is ready! Type 'exit' to quit.")
while True:
    user_input = input("\nYou > ")
    if user_input.lower() in ["exit", "quit"]:
        print("👋 Chat session ended. See you next time!")
        break
    send_message(user_input)
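When only the generated text matters, a custom event handler may be more than is needed. The sketch below assumes the text_deltas iterator that the openai Python SDK exposes on the stream object. The name stream_reply is hypothetical, and the function expects a client, a Thread ID, and an Assistant ID like the ones created in Example #2.

```python
def stream_reply(client, thread_id, assistant_id):
    """Print the Assistant's reply as it arrives and return the full text."""
    chunks = []
    with client.beta.threads.runs.stream(
        thread_id=thread_id,
        assistant_id=assistant_id,
    ) as stream:
        # text_deltas yields only the text fragments of the reply
        for text in stream.text_deltas:
            print(text, end="", flush=True)
            chunks.append(text)
    print()  # Finish the line once streaming ends
    return "".join(chunks)
```

This variant prints no tool-call details, so the EventHandler class remains the better fit when Code Interpreter output should be shown.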