Module 1: What Are AI Agents?
The Agent Paradigm
Introduction: From Chatbots to Autonomous Agents
Welcome to Module 1. Before we write any agent code, we need to build a clear mental model of what AI agents are, how they differ from the chatbots and automation scripts you may have already built, and why the agent paradigm is reshaping software development.
By the end of this module, you will:
- Understand the sense-reason-act cycle that defines all agents
- Know the difference between agents, chatbots, and traditional automation
- Recognize the ReAct pattern and why it works
- Have a framework for deciding when to use agents vs simpler approaches
- Build your first agent loop in pure Python
1.1 The Agent Paradigm: Sense, Reason, Act
What Makes Something an "Agent"?
An AI agent is a system that can perceive its environment, reason about what to do, and take action to achieve a goal. This three-part cycle is the foundation of every agent, from the simplest script to the most complex multi-agent system.
Sense (Perception)
The agent receives input from its environment. This could be:
- A user's message or request
- Data from an API or database
- An event trigger (new email, file upload, scheduled task)
- Results from a previous action
Reason (Decision Making)
The agent uses a large language model to think through the problem:
- What is the user asking for?
- What information do I need?
- Which tools should I use?
- What is the best sequence of actions?
Act (Execution)
The agent takes action in the real world:
- Calling an API to fetch data
- Writing to a database
- Sending an email or message
- Generating a document or report
Then the cycle repeats. The agent observes the result of its action, reasons about whether the task is complete, and decides what to do next.
```python
# The agent cycle in pseudocode
while not task_complete:
    observation = sense(environment)     # What happened?
    thought = reason(observation, goal)  # What should I do?
    result = act(thought)                # Do it
    environment.update(result)           # Observe the outcome
```
A Concrete Example
Imagine you ask an agent: "What is the weather in Tokyo and should I pack an umbrella?"
Cycle 1:
- Sense: User asks about Tokyo weather
- Reason: "I need current weather data. I should use my weather API tool."
- Act: Calls `get_weather(city="Tokyo")` and receives `{"temp": 18, "condition": "rain", "humidity": 85}`
Cycle 2:
- Sense: Weather data shows rain in Tokyo
- Reason: "It is raining in Tokyo. The user should pack an umbrella. I have enough information to respond."
- Act: Returns "It is currently 18C and raining in Tokyo with 85% humidity. Yes, you should definitely pack an umbrella."
Two cycles, one tool call, and a helpful answer. That is an agent at work.
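The two cycles above can be sketched in plain Python. In this minimal sketch the "reasoning" step is hard-coded as an `if` check, whereas a real agent delegates it to an LLM, and `get_weather` is a stub rather than a real API:

```python
# Hypothetical stub standing in for a real weather API
def get_weather(city: str) -> dict:
    return {"temp": 18, "condition": "rain", "humidity": 85}

def answer_weather_question(city: str) -> str:
    # Cycle 1 -- Sense: the user's question. Reason: we need data. Act: call the tool.
    data = get_weather(city)
    # Cycle 2 -- Sense: the tool result. Reason: rain means umbrella. Act: respond.
    advice = ("Yes, you should pack an umbrella."
              if data["condition"] == "rain"
              else "No umbrella needed.")
    return f"It is currently {data['temp']}C with {data['condition']} in {city}. {advice}"

print(answer_weather_question("Tokyo"))
```

The hard-coded `if` is exactly the part an LLM replaces: deciding what the observation means and what to do next.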
1.2 Agents vs Chatbots vs Automation
The Three Paradigms
Understanding the differences between these three approaches is critical for making good architectural decisions.
Chatbots: Talk Only
A chatbot receives text input and produces text output. It cannot interact with external systems.
```python
# A chatbot — can only generate text
from openai import OpenAI

client = OpenAI()

def chatbot(user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message}
        ]
    )
    return response.choices[0].message.content

# The chatbot can only respond with text
print(chatbot("What is the weather in Tokyo?"))
# "I don't have access to real-time weather data..."
```
The chatbot admits it cannot help because it has no tools. It can only talk.
Traditional Automation: Fixed Rules, No Reasoning
Automation scripts follow predetermined logic. They can interact with external systems but cannot adapt to novel situations.
```python
# Traditional automation — fixed logic, no reasoning
def process_order(order: dict) -> str:
    if order["total"] > 100:
        apply_discount(order, 0.10)
    if order["shipping"] == "express":
        charge_express_fee(order)
    send_confirmation_email(order)
    return "Order processed"
```
This works perfectly for known scenarios but breaks when something unexpected happens. What if the customer has a complaint attached to the order? What if they are asking for a combination that does not fit any rule?
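To see the failure mode concretely, here is a self-contained variant of the same script with the helper calls inlined as stand-ins. An order arrives carrying a complaint, a field no rule inspects, and the script "succeeds" while silently ignoring it:

```python
def process_order(order: dict) -> str:
    # Same fixed rules as above; nothing here inspects a complaint
    if order["total"] > 100:
        order["total"] *= 0.90   # stand-in for apply_discount
    if order.get("shipping") == "express":
        order["total"] += 15     # stand-in for charge_express_fee
    return "Order processed"

# A novel situation the rules never anticipated
order = {
    "total": 150,
    "shipping": "standard",
    "complaint": "Last order arrived damaged, please check before shipping.",
}
print(process_order(order))  # "Order processed" -- the complaint is silently dropped
```

An agent given the same order could notice the complaint, reason that it changes how the order should be handled, and escalate or respond accordingly.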
Agents: Reasoning + Action
An agent combines the reasoning capability of an LLM with the ability to use tools and take action.
```python
# An agent — can reason AND take action
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    }
]

def agent(user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message}
        ],
        tools=tools
    )
    # The agent decides whether to use a tool
    message = response.choices[0].message
    if message.tool_calls:
        # Agent chose to act — execute the tool
        # (execute_tool is a placeholder; Section 1.3 shows a full loop)
        tool_call = message.tool_calls[0]
        result = execute_tool(tool_call)
        return f"Based on the data: {result}"
    else:
        # Agent chose to respond directly
        return message.content
```
The agent decides whether it needs a tool. It is not following a script — it is reasoning about the best approach.
Comparison Table
| Capability | Chatbot | Automation | Agent |
|---|---|---|---|
| Text generation | Yes | No | Yes |
| Tool usage | No | Yes (hardcoded) | Yes (dynamic) |
| Reasoning | Limited | None | Yes |
| Adapts to novel input | Somewhat | No | Yes |
| Autonomous action | No | Yes (fixed paths) | Yes (flexible) |
| Error recovery | No | Limited | Yes |
1.3 The ReAct Pattern
Reasoning + Acting = ReAct
The ReAct pattern (Reasoning and Acting) is the most widely used framework for building AI agents. Introduced by Yao et al. in 2022, it formalizes the agent loop into a structured cycle of Thought, Action, and Observation.
How ReAct Works:
```text
Thought: I need to find the current stock price of Apple.
Action: search_stock(symbol="AAPL")
Observation: AAPL is trading at $178.50, up 1.2% today.

Thought: I have the stock price. The user also asked about the P/E ratio.
         I should look that up too.
Action: get_financials(symbol="AAPL", metric="pe_ratio")
Observation: AAPL P/E ratio is 28.5.

Thought: I now have both pieces of information the user requested.
         I can provide a complete answer.
Action: respond("Apple (AAPL) is trading at $178.50, up 1.2% today.
        The current P/E ratio is 28.5.")
```
Each step is explicit. The agent thinks before it acts, and it observes the result before thinking again. This makes the agent's reasoning transparent and debuggable.
Why ReAct Works So Well
1. Transparent Reasoning
You can see exactly why the agent made each decision. This is critical for debugging and trust.
2. Error Recovery
If a tool call fails, the agent can reason about the error and try an alternative approach:
```text
Thought: I need the weather in Tokyo.
Action: get_weather(city="Tokyo")
Observation: Error — API rate limit exceeded.

Thought: The weather API is rate-limited. I should try the backup
         weather service instead.
Action: get_weather_backup(city="Tokyo")
Observation: Tokyo: 18°C, rain, 85% humidity.
```
3. Multi-Step Problem Solving
Complex tasks naturally decompose into a sequence of thought-action-observation cycles. The agent builds up context with each step.
Implementing ReAct in Python
Here is a simplified ReAct loop:
```python
import json

from openai import OpenAI

client = OpenAI()

def react_agent(task: str, tools: dict, max_steps: int = 5) -> str:
    """A simple ReAct agent loop."""
    messages = [
        {
            "role": "system",
            "content": (
                "You are a helpful assistant. For each step, provide your "
                "reasoning as 'Thought:', then specify an action. When you "
                "have the final answer, respond directly to the user."
            )
        },
        {"role": "user", "content": task}
    ]

    for step in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages,
            tools=format_tools(tools)  # converts {name: fn} to OpenAI tool schemas
        )
        message = response.choices[0].message
        messages.append(message)

        # If no tool calls, the agent is done
        if not message.tool_calls:
            return message.content

        # Execute each tool call
        for tool_call in message.tool_calls:
            func_name = tool_call.function.name
            func_args = json.loads(tool_call.function.arguments)
            print(f"  Action: {func_name}({func_args})")

            # Execute the tool
            result = tools[func_name](**func_args)
            print(f"  Observation: {result}")

            # Feed the result back to the agent
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(result)
            })

    return "Max steps reached without completing the task."
```
This is the skeleton of every agent you will build in this course. The details will change — we will use LangChain and LangGraph to make this more robust — but the pattern remains the same.
1.4 When to Use Agents vs Simple LLM Calls
The Decision Framework
Not every problem needs an agent. Using an agent when a simple LLM call would suffice adds unnecessary complexity, cost, and latency. Here is a practical framework for deciding.
Use a simple LLM call when:
- The task requires only text generation (summarization, translation, writing)
- No external data or tools are needed
- The output is a single response with no follow-up actions
- Latency must be minimal (single round-trip)
```python
# Simple LLM call — perfect for text-only tasks
from openai import OpenAI

client = OpenAI()

def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Summarize the following text."},
            {"role": "user", "content": text}
        ]
    )
    return response.choices[0].message.content
```
Use an agent when:
- The task requires external data (APIs, databases, web search)
- Multiple steps are needed with decisions between them
- The path to completion depends on intermediate results
- Error recovery and adaptation are important
- Human approval is needed before certain actions
```python
# Agent — needed when reasoning + action are required
# "Research this company and draft an investment memo"
# Requires: web search, data analysis, document generation, approval
```
The Complexity Spectrum
```text
Simple ────────────────────────────────────────────────── Complex

Single LLM    Chain of     Tool-using    Multi-step     Multi-agent
call          prompts      agent         agent          system

"Translate    "Summarize   "Look up      "Research      "Team of agents
this text"    and then     the stock     competitors,   coordinate to
              format"      price"        analyze,       write a report"
                                         draft memo"
```
Rule of thumb: Start with the simplest approach that solves the problem. Upgrade to agents only when you need reasoning, tools, or multi-step execution.
Cost and Latency Considerations
Every agent step involves at least one LLM call. A five-step agent workflow might make five or more API calls, each costing money and adding latency.
| Approach | API Calls | Typical Latency | Typical Cost |
|---|---|---|---|
| Single LLM call | 1 | 1-3 seconds | $0.01-0.05 |
| Simple chain | 2-3 | 3-8 seconds | $0.03-0.15 |
| Tool-using agent | 3-7 | 5-20 seconds | $0.05-0.50 |
| Multi-agent system | 10-30+ | 30-120 seconds | $0.50-5.00+ |
This is why the decision framework matters. Do not build a multi-agent system when a single LLM call will do.
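Back-of-the-envelope estimates like the ones in the table are easy to compute before committing to an architecture. The per-call constants below are illustrative assumptions, not current pricing:

```python
# Rough cost/latency estimate for an agent run.
# Both constants are illustrative assumptions, not real pricing.
COST_PER_CALL = 0.03     # assumed average dollars per LLM call
LATENCY_PER_CALL = 2.5   # assumed average seconds per LLM call

def estimate(steps: int) -> tuple:
    """Each agent step makes at least one LLM call, so cost and latency scale with steps."""
    return steps * COST_PER_CALL, steps * LATENCY_PER_CALL

for label, steps in [("single call", 1), ("tool-using agent", 5), ("multi-agent system", 20)]:
    cost, secs = estimate(steps)
    print(f"{label}: ~${cost:.2f}, ~{secs:.0f}s")
```

Even with generous assumptions, the multi-agent run costs an order of magnitude more than the single call, which is the quantitative version of the rule of thumb above.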
Project: Your First Agent Loop in Python
Let's build a working agent from scratch using the OpenAI API. No frameworks, no abstractions — just Python and the raw API.
Setup
```shell
# Create a new project directory
mkdir my-first-agent && cd my-first-agent

# Create a virtual environment
python -m venv venv
source venv/bin/activate   # Mac/Linux
# venv\Scripts\activate    # Windows

# Install dependencies
pip install openai python-dotenv
```
Create a `.env` file:
```text
OPENAI_API_KEY=your_api_key_here
```
The Code
Create `agent.py`:
```python
import json

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI()

# Define the tools our agent can use

def get_weather(city: str) -> str:
    """Simulate getting weather data."""
    weather_data = {
        "Tokyo": {"temp": 18, "condition": "Rainy", "humidity": 85},
        "London": {"temp": 12, "condition": "Cloudy", "humidity": 70},
        "New York": {"temp": 25, "condition": "Sunny", "humidity": 45},
    }
    data = weather_data.get(city, {"temp": 20, "condition": "Unknown", "humidity": 50})
    return json.dumps({"city": city, **data})

def calculate(expression: str) -> str:
    """Safely evaluate a math expression."""
    try:
        # Only allow safe math operations
        allowed_chars = set("0123456789+-*/.() ")
        if all(c in allowed_chars for c in expression):
            result = eval(expression)
            return json.dumps({"expression": expression, "result": result})
        return json.dumps({"error": "Invalid expression"})
    except Exception as e:
        return json.dumps({"error": str(e)})

# Map function names to implementations
available_tools = {
    "get_weather": get_weather,
    "calculate": calculate,
}

# Define tool schemas for the LLM
tool_schemas = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The city name (e.g., Tokyo, London)"
                    }
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a mathematical expression",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "The math expression (e.g., '2 + 2', '100 * 0.15')"
                    }
                },
                "required": ["expression"]
            }
        }
    }
]

def run_agent(user_input: str, max_steps: int = 5) -> str:
    """Run the agent loop with ReAct-style reasoning."""
    messages = [
        {
            "role": "system",
            "content": (
                "You are a helpful assistant with access to tools. "
                "Think step by step about what the user needs. "
                "Use tools when you need external data or calculations. "
                "When you have enough information, respond directly."
            )
        },
        {"role": "user", "content": user_input}
    ]

    for step in range(max_steps):
        print(f"\n--- Step {step + 1} ---")
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages,
            tools=tool_schemas,
        )
        message = response.choices[0].message
        messages.append(message)

        # If the agent responds without tool calls, we are done
        if not message.tool_calls:
            print(f"Agent: {message.content}")
            return message.content

        # Process each tool call
        for tool_call in message.tool_calls:
            func_name = tool_call.function.name
            func_args = json.loads(tool_call.function.arguments)
            print(f"Tool call: {func_name}({func_args})")

            # Execute the tool
            if func_name in available_tools:
                result = available_tools[func_name](**func_args)
            else:
                result = json.dumps({"error": f"Unknown tool: {func_name}"})
            print(f"Result: {result}")

            # Feed the result back to the conversation
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })

    return "Agent reached maximum steps without completing the task."

def main():
    print("AI Agent Started. Type 'exit' to quit.\n")
    while True:
        user_input = input("You: ")
        if user_input.lower() == "exit":
            print("Goodbye!")
            break
        run_agent(user_input)
        print()

if __name__ == "__main__":
    main()
```
Run It
```shell
python agent.py
```
Try These Prompts
```text
You: What is the weather in Tokyo?
You: What is 15% of 850?
You: Compare the weather in Tokyo and New York, and tell me which is warmer.
```
What Just Happened?
You built a working AI agent that:
- Reasons about which tools to use based on the user's question
- Calls tools to get real data (weather, calculations)
- Observes the results and incorporates them into its response
- Chains multiple tool calls when the task requires it
- Loops until it has enough information to respond
This is the ReAct pattern in action. Everything you build from here will be a more sophisticated version of this same loop.
Key Takeaways
- Agents follow the sense-reason-act cycle — they perceive, think, and execute autonomously
- Chatbots can only talk; automation follows fixed rules; agents combine reasoning with action
- The ReAct pattern (Thought, Action, Observation) is the foundation of modern AI agents
- Start simple — use agents only when you need reasoning, tools, or multi-step execution
- Every agent is a loop: receive input, reason, act, observe, repeat until done
Exercises
Before moving to Module 2, try these extensions to your agent:
- Add a new tool: Create a `search_contacts` tool that looks up contact information from a dictionary. Test it with prompts like "What is Alice's email?"
- Add conversation memory: Modify `main()` to keep `messages` across turns so the agent remembers previous questions in the same session.
- Add error handling: What happens if a tool call fails? Modify `run_agent` to handle exceptions gracefully and let the agent try an alternative approach.
- Experiment with the system prompt: Change the system message to give the agent a specific personality or role (e.g., "You are a travel advisor"). How does this change its behavior?
- Count the cost: Add a counter that tracks how many API calls the agent makes per task. Try asking simple questions vs complex ones and compare.
Next up: Module 2, where we explore the Python AI agent stack — LangChain, LangGraph, and the tools that make building agents dramatically easier.

