
Building Chatbots with LLM APIs

Building a chatbot requires more than chaining API calls; it also means managing conversation state, preserving context across turns, handling user interruptions, and maintaining a consistent personality. A production chatbot combines prompt design, message history management, state machines, error recovery, and user interaction patterns. This article covers the architecture and implementation of chatbots ranging from simple single-turn Q&A to complex multi-turn systems with memory and personalization.

Single-Turn vs. Multi-Turn Chatbots

A single-turn chatbot processes one user input and returns one response (no history). Multi-turn chatbots maintain conversation context, allowing the model to understand references to prior exchanges. Single-turn is simpler but less conversational; multi-turn enables coherent dialogue:

from openai import OpenAI

client = OpenAI()

# Single-turn chatbot: each call is independent
def single_turn_chat(user_input):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a friendly Python tutor."},
            {"role": "user", "content": user_input}
        ]
    )
    return response.choices[0].message.content

result = single_turn_chat("What is a list?")
print(result)

# Multi-turn chatbot: maintains conversation history
class MultiTurnChatbot:
    def __init__(self, system_prompt):
        self.system_prompt = system_prompt
        self.messages = [
            {"role": "system", "content": system_prompt}
        ]

    def chat(self, user_input):
        """Process user input and return response."""
        self.messages.append({"role": "user", "content": user_input})

        # The full history is sent on every call; the API keeps no state
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=self.messages
        )

        assistant_response = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": assistant_response})

        return assistant_response

    def get_history(self):
        """Return conversation history (excluding system message)."""
        return self.messages[1:]  # Skip system prompt

    def reset(self):
        """Clear conversation history but keep system prompt."""
        self.messages = [self.messages[0]]

# Usage
bot = MultiTurnChatbot("You are a helpful Python tutor. Explain concepts clearly.")

response1 = bot.chat("What is a list?")
print(f"Bot: {response1}\n")

response2 = bot.chat("Can you show me an example?")
print(f"Bot: {response2}\n")

response3 = bot.chat("How is that different from a tuple?") # References prior context
print(f"Bot: {response3}")

Multi-turn bots produce more natural dialogue because the model sees the prior turns. The API itself is stateless, so the client must resend the full history with every request. The tradeoff is higher token consumption, which grows with each turn, and the added complexity of managing state.
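To make the token tradeoff concrete, the sketch below estimates the total prompt tokens a conversation sends when the whole history goes out with each request. The per-message sizes (`system_tokens`, `user_tokens`, `reply_tokens`) are illustrative assumptions, not measurements.

```python
def cumulative_prompt_tokens(turns, system_tokens=20, user_tokens=30, reply_tokens=150):
    """Estimate total prompt tokens sent over `turns` requests with full history.

    All sizes are assumed averages; real counts depend on the tokenizer and content.
    """
    total = 0
    history = system_tokens  # tokens already in the message list
    for _ in range(turns):
        history += user_tokens   # the new user message is appended
        total += history         # the whole history is sent as the prompt
        history += reply_tokens  # the assistant reply joins the history
    return total


for n in (1, 10, 50):
    print(n, cumulative_prompt_tokens(n))
```

Under these assumptions, 10 turns send 8,600 prompt tokens in total and 50 turns send 223,000: the total grows roughly with the square of the turn count, which is the main argument for bounding history.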

Stateful Chatbots with Personality and Constraints

Add personality and constraints by crafting detailed system prompts:

from openai import OpenAI
import json

client = OpenAI()

class PersonalityChatbot:
    def __init__(self, name, personality, constraints=None):
        self.name = name
        self.personality = personality
        self.constraints = constraints or []

        # Build the system prompt: identity, personality, then one constraint per line
        constraint_lines = "\n".join(f"- {c}" for c in self.constraints)
        system_prompt = f"You are {name}. {personality}\nConstraints:\n{constraint_lines}"

        self.messages = [{"role": "system", "content": system_prompt}]

    def chat(self, user_input, max_history=10):
        """Process user input, maintaining bounded history."""
        self.messages.append({"role": "user", "content": user_input})

        # Keep only the system message and recent exchanges
        if len(self.messages) > max_history + 1:
            self.messages = [self.messages[0]] + self.messages[-(max_history):]

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=self.messages,
            temperature=0.7  # Personality: creative but not random
        )

        assistant_response = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": assistant_response})

        return assistant_response

# Create a customer support chatbot
support_bot = PersonalityChatbot(
    name="SupportBot",
    personality="You are a friendly customer support agent for TechCorp. You are helpful, professional, and aim to resolve issues quickly.",
    constraints=[
        "Always offer solutions before escalating to a human agent",
        "Maintain a professional but warm tone",
        "Keep responses under 200 words",
        "If unsure, say 'Let me check with my team' rather than guessing"
    ]
)

print(support_bot.chat("My app keeps crashing when I log in"))
print()
print(support_bot.chat("I already tried restarting"))

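Personality and constraints shape behavior. The system prompt is the blueprint; tweak it to achieve different tones (friendly, formal, humorous). The chat method also bounds the history to the system message plus the most recent max_history messages, so the prompt stays a fixed size however long the conversation runs.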
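The history bound in the class above can also be written as a standalone function, which is easier to test. The sketch below is one possible variant, not the class's exact behavior: the name `trim_history` is hypothetical, and it adds one assumption of its own, namely that the kept window should start on a user message so no assistant reply is left without the question it answered.

```python
def trim_history(messages, max_history=10):
    """Return the system message plus at most `max_history` recent messages.

    Assumes messages[0] is the system message. The window is advanced until it
    starts on a user message, which avoids orphaned assistant replies.
    """
    system, rest = messages[0], messages[1:]
    if max_history <= 0:
        return [system]
    recent = rest[-max_history:]
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return [system] + recent


history = [{"role": "system", "content": "You are SupportBot."}]
for i in range(1, 4):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})
history.append({"role": "user", "content": "question 4"})

trimmed = trim_history(history, max_history=4)
print([m["content"] for m in trimmed])
```

With `max_history=4` the last four messages begin with "answer 2", so the function drops it and keeps "question 3" onward.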

Intent Detection and Routing

Complex chatbots route user input to different handlers based on intent (customer service, FAQ, escalation, etc.):

from openai import OpenAI
import json

client = OpenAI()

class IntentRoutingChatbot:
    INTENTS = [
        {"name": "FAQ", "description": "User asks a common question"},
        {"name": "COMPLAINT", "description": "User reports a problem or complaint"},
        {"name": "PURCHASE", "description": "User wants to buy a product"},
        {"name": "OTHER", "description": "Unrelated query"}
    ]

    def __init__(self):
        self.messages = []

    def detect_intent(self, user_input):
        """Use LLM to classify user intent."""
        intent_prompt = (
            "Classify the user input into one of these intents:\n"
            f"{json.dumps(self.INTENTS, indent=2)}\n\n"
            f'User input: "{user_input}"\n\n'
            "Respond with only the intent name (FAQ, COMPLAINT, PURCHASE, or OTHER)."
        )

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": intent_prompt}],
            max_tokens=10
        )

        # Fall back to OTHER if the model returns anything unexpected
        intent = response.choices[0].message.content.strip().upper()
        valid_intents = {item["name"] for item in self.INTENTS}
        return intent if intent in valid_intents else "OTHER"

    def handle_faq(self, user_input):
        """Handle FAQ intent."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "You are a helpful FAQ bot. Answer questions directly and concisely."},
                {"role": "user", "content": user_input}
            ]
        )
        return response.choices[0].message.content

    def handle_complaint(self, user_input):
        """Handle complaint intent."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "You are an empathetic support agent. Validate the user's concern and offer a solution."},
                {"role": "user", "content": user_input}
            ]
        )
        return response.choices[0].message.content

    def handle_purchase(self, user_input):
        """Handle purchase intent."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "You are a sales assistant. Guide the user through the purchase process."},
                {"role": "user", "content": user_input}
            ]
        )
        return response.choices[0].message.content

    def chat(self, user_input):
        """Route user input to appropriate handler."""
        intent = self.detect_intent(user_input)
        print(f"Detected intent: {intent}")

        if intent == "FAQ":
            return self.handle_faq(user_input)
        elif intent == "COMPLAINT":
            return self.handle_complaint(user_input)
        elif intent == "PURCHASE":
            return self.handle_purchase(user_input)
        else:
            return "I am not sure how to help with that. Can you rephrase?"

# Usage
router = IntentRoutingChatbot()

print(router.chat("How do I reset my password?"))
print()
print(router.chat("Your product broke after one day!"))
print()
print(router.chat("I want to buy the premium plan"))

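Intent routing allows specialized handling per category. For example, an FAQ handler might return structured answers, a complaint handler might trigger an escalation workflow, and a purchase handler might invoke payment logic. Each handler here is single-turn; add message history to a handler if its category needs follow-up questions.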
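As the number of intents grows, the if/elif chain can be replaced by a dispatch table. The sketch below shows the idea with stand-ins so it runs offline: `keyword_classifier` is a hypothetical placeholder for the LLM classification call, and the three handlers are stubs rather than real workflows.

```python
def normalize_intent(raw, valid, default="OTHER"):
    """Map a raw classifier reply such as ' faq.' onto a known intent name."""
    cleaned = raw.strip().strip(".\"'").upper()
    return cleaned if cleaned in valid else default


def route(user_input, classify, handlers, fallback):
    """Classify the input, then call the matching handler or the fallback."""
    intent = normalize_intent(classify(user_input), set(handlers))
    handler = handlers.get(intent, fallback)
    return intent, handler(user_input)


# Stub handlers; real ones would call the model or a backend service.
HANDLERS = {
    "FAQ": lambda text: f"[faq] {text}",
    "COMPLAINT": lambda text: f"[complaint] {text}",
    "PURCHASE": lambda text: f"[purchase] {text}",
}


def fallback(text):
    return "I am not sure how to help with that. Can you rephrase?"


def keyword_classifier(text):
    """Hypothetical offline stand-in for the LLM intent classifier."""
    lowered = text.lower()
    if "buy" in lowered or "purchase" in lowered:
        return "purchase"
    if "broke" in lowered or "crash" in lowered:
        return "Complaint."
    if "how do i" in lowered:
        return " faq "
    return "something else"


for message in ["How do I reset my password?", "I want to buy the premium plan", "What's the weather?"]:
    print(route(message, keyword_classifier, HANDLERS, fallback))
```

Adding an intent then means adding one dictionary entry, and the normalization step tolerates replies with stray whitespace, quotes, or a trailing period.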

Conversation Logging and Analytics

Log conversations for debugging, improvement, and compliance:

from openai import OpenAI
import json
import datetime

client = OpenAI()

class LoggingChatbot:
    def __init__(self, log_file="chatbot_logs.jsonl"):
        self.log_file = log_file
        self.system_message = {"role": "system", "content": "You are a helpful assistant."}
        # One history per session, so sessions never share context
        self.sessions = {}

    def chat(self, user_input, session_id="default"):
        """Process input and log the exchange."""
        messages = self.sessions.setdefault(session_id, [self.system_message])
        messages.append({"role": "user", "content": user_input})

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages
        )

        assistant_response = response.choices[0].message.content
        messages.append({"role": "assistant", "content": assistant_response})

        # Log the exchange with a timezone-aware UTC timestamp
        log_entry = {
            "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "session_id": session_id,
            "user_input": user_input,
            "assistant_response": assistant_response,
            "model": "gpt-4o-mini",
            "tokens_used": response.usage.total_tokens
        }

        # Append one JSON object per line (JSONL)
        with open(self.log_file, "a", encoding="utf-8") as f:
            f.write(json.dumps(log_entry) + "\n")

        return assistant_response

    def analyze_logs(self):
        """Analyze conversation logs."""
        interactions = 0
        total_tokens = 0

        try:
            with open(self.log_file, "r", encoding="utf-8") as f:
                for line in f:
                    log = json.loads(line)
                    interactions += 1
                    total_tokens += log["tokens_used"]
        except FileNotFoundError:
            pass  # No log file yet: report zero interactions

        print(f"Total interactions: {interactions}")
        print(f"Total tokens used: {total_tokens}")
        if interactions > 0:
            print(f"Avg tokens per interaction: {total_tokens / interactions:.1f}")

# Usage
bot = LoggingChatbot()

bot.chat("What is Python?", session_id="user_123")
bot.chat("Tell me more", session_id="user_123")
bot.chat("How do I learn it?", session_id="user_456")

bot.analyze_logs()

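Logging captures every interaction for analysis and debugging. JSONL (JSON Lines) stores one JSON object per line, so new entries can be appended cheaply and the file can be read back one line at a time without loading it all into memory.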
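Because each log line carries a session_id, the same file supports per-session analysis. The sketch below assumes the log fields used in this section (`session_id`, `tokens_used`); the function name `summarize_sessions` and the sample lines are illustrative. It reads an iterable of lines, so it works on an open file or an in-memory list.

```python
import json
from collections import defaultdict


def summarize_sessions(lines):
    """Aggregate interaction and token counts per session from JSONL lines."""
    stats = defaultdict(lambda: {"interactions": 0, "tokens": 0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        session = stats[entry["session_id"]]
        session["interactions"] += 1
        session["tokens"] += entry.get("tokens_used", 0)
    return dict(stats)


# Made-up sample lines in the same shape as the log entries above
sample_lines = [
    '{"session_id": "user_123", "tokens_used": 40}',
    '{"session_id": "user_123", "tokens_used": 60}',
    "",
    '{"session_id": "user_456", "tokens_used": 55}',
    '{"session_id": "user_456", "tokens_',  # truncated write
]

summary = summarize_sessions(sample_lines)
print(summary)
```

To run it on a real log, pass the open file object instead of the list, for example `with open("chatbot_logs.jsonl", encoding="utf-8") as f: summarize_sessions(f)`.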

Key Takeaways

  • Multi-turn chatbots maintain message history; single-turn chatbots process each input independently.
  • Craft detailed system prompts to define personality, constraints, and expertise.
  • Use intent detection to route complex queries to specialized handlers.
  • Bound conversation history to prevent token overflow and manage costs.
  • Log all interactions for debugging, improvement, and compliance.
  • Test edge cases: empty input, very long responses, abusive input, rapid-fire queries.

Frequently Asked Questions

How long can a conversation be?

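Context windows limit history length. GPT-4o and GPT-4o mini each have a 128,000-token context window, and a typical exchange uses 200–500 tokens, so the window holds roughly 250 to 600 exchanges. Because the full history is resent on every request, cost rises well before that limit is reached. Implement a rolling window that trims old messages as the conversation approaches the limit.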
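A rolling window can be driven by a token budget rather than a message count. The sketch below is a minimal version under a loud assumption: `estimate_tokens` uses a rough "about four characters per token" heuristic plus a small per-message overhead, which is only an approximation. A real tokenizer library would give accurate counts.

```python
def estimate_tokens(message):
    """Crude estimate (assumption): ~4 characters per token plus 4 tokens overhead."""
    return len(message["content"]) // 4 + 4


def fit_to_budget(messages, max_prompt_tokens):
    """Keep the system message and the newest messages that fit the budget.

    Assumes messages[0] is the system message, which is always kept.
    """
    system, rest = messages[0], messages[1:]
    budget = max_prompt_tokens - estimate_tokens(system)
    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if cost > budget:
            break  # older messages no longer fit
        kept.append(message)
        budget -= cost
    kept.reverse()  # restore chronological order
    return [system] + kept


conversation = [{"role": "system", "content": "x" * 40}]
for i in range(5):
    role = "user" if i % 2 == 0 else "assistant"
    conversation.append({"role": role, "content": f"{i}" + "y" * 35})

window = fit_to_budget(conversation, max_prompt_tokens=45)
print([m["content"][0] for m in window[1:]])
```

In this example the system message is estimated at 14 tokens and each other message at 13, so a 45-token budget keeps only the two newest messages.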

Should I store conversations in a database?

Yes, for production systems. Store conversations in a database (PostgreSQL, MongoDB) instead of files for scalability, querying, and backup. Privacy laws such as GDPR and CCPA apply to stored conversations, so encrypt sensitive data and make sure a user's records can be found and deleted on request.
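The shape of a database-backed store can be sketched with Python's built-in sqlite3 module, used here only as a stand-in so the example runs without a server. The table name, column names, and helper functions are all hypothetical; a PostgreSQL or MongoDB version would follow the same pattern with a different driver.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open (or create) a message store; ':memory:' keeps it in RAM for the demo."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT NOT NULL,
               role TEXT NOT NULL,
               content TEXT NOT NULL,
               created_at TEXT NOT NULL
           )"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_messages_session ON messages (session_id, id)"
    )
    return conn


def save_message(conn, session_id, role, content):
    """Insert one message; parameters are bound, never formatted into the SQL."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (session_id, role, content, created_at) VALUES (?, ?, ?, ?)",
            (session_id, role, content, datetime.now(timezone.utc).isoformat()),
        )


def load_history(conn, session_id):
    """Return a session's messages in insertion order, ready to send to the API."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in rows]


conn = open_store()
save_message(conn, "user_123", "user", "What is Python?")
save_message(conn, "user_123", "assistant", "A programming language.")
save_message(conn, "user_456", "user", "How do I learn it?")
print(load_history(conn, "user_123"))
```

Keying every row by session_id makes it straightforward to reload one user's history after a restart, and to delete it with a single statement if the user asks.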

Can I pre-train a chatbot on domain-specific knowledge?

Not directly with the API. Instead, use retrieval-augmented generation (RAG, covered in Article 10) to inject context. Alternatively, fine-tune a model on your data (expensive and complex).

How do I handle toxic or abusive input?

Add a moderation step: use the OpenAI Moderation API to flag harmful content before it reaches the chat model. Reject flagged inputs and respond with a canned message ("I cannot help with that request").
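One way to wire this in is a gate function that checks the input first and only calls the chatbot if the check passes. In the sketch below, `moderated_chat` and `openai_is_flagged` are hypothetical names. The moderation check is passed in as a parameter so the gate can be tried offline with a stub; the OpenAI-backed check needs the `openai` package and an API key, and is not called in the demo.

```python
REFUSAL = "I cannot help with that request."


def openai_is_flagged(text):
    """Ask the OpenAI Moderation API whether `text` is flagged.

    Requires the `openai` package and an API key; not exercised in the demo below.
    """
    from openai import OpenAI  # imported lazily so the sketch runs offline

    result = OpenAI().moderations.create(model="omni-moderation-latest", input=text)
    return result.results[0].flagged


def moderated_chat(user_input, respond, is_flagged=openai_is_flagged):
    """Return a canned refusal for flagged input; otherwise call `respond`."""
    if is_flagged(user_input):
        return REFUSAL
    return respond(user_input)


# Offline demo with stand-ins for the moderation check and the chatbot
def stub_is_flagged(text):
    return "idiot" in text.lower()


def stub_respond(text):
    return f"echo: {text}"


print(moderated_chat("How do I reset my password?", stub_respond, stub_is_flagged))
print(moderated_chat("You idiot bot", stub_respond, stub_is_flagged))
```

In production, `respond` would be a method such as a chatbot's chat function, and the default `is_flagged` would perform the real API call.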

Can I export conversation history?

Yes. JSONL logs can be converted to CSV, JSON, or other formats. Always anonymize user data before export to protect privacy.
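A minimal export sketch follows, assuming the log fields used in this section. The function names and the salt value are illustrative. Note that replacing session ids with a salted hash is pseudonymization only: the message text itself may still contain personal data and would need separate scrubbing.

```python
import csv
import hashlib
import io
import json


def pseudonymize(session_id, salt):
    """Replace a session id with a short salted SHA-256 digest."""
    return hashlib.sha256((salt + session_id).encode("utf-8")).hexdigest()[:12]


def jsonl_to_csv(jsonl_lines, out, salt):
    """Write log entries to `out` as CSV and return the number of rows written."""
    fields = ["timestamp", "session", "user_input", "assistant_response", "tokens_used"]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    rows = 0
    for line in jsonl_lines:
        if not line.strip():
            continue  # skip blank lines
        entry = json.loads(line)
        writer.writerow({
            "timestamp": entry["timestamp"],
            "session": pseudonymize(entry["session_id"], salt),
            "user_input": entry["user_input"],
            "assistant_response": entry["assistant_response"],
            "tokens_used": entry["tokens_used"],
        })
        rows += 1
    return rows


# Made-up sample entries in the same shape as the log
sample = [
    json.dumps({"timestamp": "2024-01-01T00:00:00+00:00", "session_id": "user_123",
                "user_input": "What is Python?", "assistant_response": "A programming language.",
                "tokens_used": 42}),
    json.dumps({"timestamp": "2024-01-01T00:01:00+00:00", "session_id": "user_456",
                "user_input": "How do I learn it?", "assistant_response": "Start with a tutorial, then build something.",
                "tokens_used": 57}),
]

buffer = io.StringIO()
exported = jsonl_to_csv(sample, buffer, salt="example-salt")
print(buffer.getvalue())
```

When writing to a real file, open it with `newline=""` as the csv module documentation recommends, and keep the salt secret so the digests cannot be recomputed from known session ids.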

Further Reading