Module 1: The "AI Engineer" Mindset (JavaScript Edition)
Theory & Setup
Introduction: Thinking Like an AI Engineer
Welcome to Module 1. Before we write a single line of code, we need to shift how we think about building software.
Traditional software development is deterministic: given the same input, you get the same output every time. AI agents are probabilistic: they make decisions, reason about problems, and their outputs can vary.
This module will help you understand what makes agents different, why JavaScript is an excellent choice for building them, and how to set up your development environment for success.
1.1 Beyond the Chatbot
The Evolution of AI Interfaces
Phase 1: The Chatbot (2020-2022)
Early LLM applications were essentially fancy text completion systems:
- User types a message
- AI generates a response
- Conversation continues
This was impressive, but limited. The AI could only talk. It couldn't do anything.
Phase 2: The Agent (2023-Present)
An AI agent can:
- Reason: Break down complex tasks into steps
- Use Tools: Call functions to interact with external systems
- Take Action: Execute tasks autonomously
- Remember: Maintain context across conversations
- Adapt: Learn from feedback and adjust behavior
What Makes Something an "Agent"?
An agent has three core capabilities:
1. Reasoning (The Brain)
// The agent can think through problems step by step
"To research Tesla's earnings:
1. I need to search for recent earnings reports
2. Extract key financial metrics
3. Analyze trends vs previous quarters
4. Summarize findings"
2. Tools (The Hands)
// The agent can use functions to interact with the world
type Tools = {
  searchWeb: (query: string) => SearchResult[]
  readDocument: (url: string) => string
  sendEmail: (to: string, content: string) => void
}
3. Action (The Execution)
// The agent can autonomously execute tasks
async function executeResearch() {
  const results = await searchWeb("Tesla Q4 earnings")
  const report = await readDocument(results[0].url)
  const analysis = await analyzeWithLLM(report)
  await sendEmail("team@company.com", analysis)
}
The Agent Loop
Every agent follows a variation of this pattern:
1. RECEIVE task
2. THINK about what needs to be done
3. DECIDE which tool to use (if any)
4. EXECUTE the tool
5. OBSERVE the result
6. REPEAT steps 2-5 until task is complete
7. RESPOND with final output
This is called the ReAct pattern (Reason, Act, Observe), and we'll implement it in Module 3.
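The loop can be sketched as a toy, dependency-free program. Here a hard-coded decide function stands in for the LLM's reasoning step, and tools holds a single stubbed tool; decide, runAgent, and the tool's behavior are illustrative assumptions, but the control flow mirrors steps 1-7 above:

```typescript
// Toy agent loop. A real agent would ask an LLM to produce each Decision;
// here decide() is hard-coded so the loop can run without any API.
type Decision =
  | { type: 'tool'; name: string; input: string }
  | { type: 'respond'; text: string }

// THINK + DECIDE: pick a tool until we have an observation, then respond.
function decide(task: string, observations: string[]): Decision {
  if (observations.length === 0) {
    return { type: 'tool', name: 'searchWeb', input: task }
  }
  return { type: 'respond', text: `Summary of findings: ${observations.join('; ')}` }
}

// Stub tool implementations (the "hands").
const tools: Record<string, (input: string) => string> = {
  searchWeb: (query) => `3 results for "${query}"`
}

// RECEIVE a task, then EXECUTE/OBSERVE/REPEAT until the agent RESPONDs.
function runAgent(task: string): string {
  const observations: string[] = []
  while (true) {
    const decision = decide(task, observations)
    if (decision.type === 'respond') return decision.text
    observations.push(tools[decision.name](decision.input))
  }
}
```

Swapping the stub decide for a real LLM call is essentially what Module 3 will do.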
1.2 The Stack: Why Node.js Wins for Orchestration
The Python Dominance Myth
Most AI courses teach Python because:
- PyTorch and TensorFlow are Python-first
- Jupyter notebooks are popular for ML research
- That's what everyone else is teaching
But here's what they don't tell you: you're not training models, you're using APIs.
Why JavaScript is Perfect for AI Agents
1. You're Building Applications, Not Training Models
Modern AI development is about orchestration:
- Calling LLM APIs (OpenAI, Anthropic, etc.)
- Integrating with existing web services
- Building user interfaces
- Managing state and workflows
JavaScript excels at all of these.
2. The Best Full-Stack Experience
// Backend (API Route)
export async function POST(req: Request) {
  const { message } = await req.json()
  const response = await agent.run(message)
  return Response.json(response)
}

// Frontend (React Component)
function Chat() {
  const { messages, append } = useChat()
  return <ChatInterface messages={messages} onSend={append} />
}
One language, one codebase, seamless data flow.
3. The JavaScript AI Ecosystem is Thriving
- Vercel AI SDK: Best-in-class streaming, tool calling, and generative UI
- LangGraph.js: Stateful agent orchestration
- LangChain.js: Extensive chain and RAG utilities
- Zod: Runtime type validation perfect for LLM outputs
- Next.js: Server and client in perfect harmony
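Why does the Zod bullet matter? An LLM returns a string, so any structured data it produces must be validated at runtime before you trust it. To keep this sketch dependency-free, a hand-rolled type guard plays the role a Zod schema's safeParse would; EarningsSummary and parseEarnings are illustrative names:

```typescript
// The structured shape we asked the model to produce as JSON.
interface EarningsSummary {
  company: string
  revenueBillions: number
}

// Validate an LLM's raw string reply; return null on any malformed output.
function parseEarnings(raw: string): EarningsSummary | null {
  try {
    const data = JSON.parse(raw)
    if (
      typeof data === 'object' && data !== null &&
      typeof data.company === 'string' &&
      typeof data.revenueBillions === 'number'
    ) {
      return { company: data.company, revenueBillions: data.revenueBillions }
    }
    return null // valid JSON, wrong shape
  } catch {
    return null // not JSON at all
  }
}
```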
4. Real-Time & Streaming Native
JavaScript was built for async operations and streaming:
// Streaming AI responses
const stream = await streamText({
  model: openai('gpt-4'),
  messages
})

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk) // Real-time output
}
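You can practice this consumption pattern without an API key, since for await works over any AsyncIterable. In this sketch, fakeTextStream is a hypothetical stand-in for the SDK's textStream:

```typescript
// A fake token stream: an async generator yielding chunks like an LLM would.
async function* fakeTextStream(): AsyncGenerator<string> {
  for (const chunk of ['Types ', 'dance ', 'in ', 'the ', 'code']) {
    yield chunk
  }
}

// Consume chunks as they arrive, exactly as with stream.textStream.
async function collect(stream: AsyncIterable<string>): Promise<string> {
  let text = ''
  for await (const chunk of stream) {
    text += chunk // in a real UI, render each chunk immediately
  }
  return text
}
```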
5. Deployment Advantage
- Vercel: Deploy in seconds, scale automatically
- Cloudflare Workers: Edge computing for global agents
- Serverless: No infrastructure management
When You Might Actually Need Python
- Training custom models from scratch
- Heavy numerical computing
- Working with specific Python-only libraries
For 95% of AI agent work, JavaScript is the better choice.
1.3 The Brain: Connecting to Models
Understanding Large Language Models (LLMs)
An LLM is essentially a very sophisticated text prediction system:
- It has been trained on massive amounts of text
- It can generate human-like responses
- It can follow instructions
- It can reason about problems (to a degree)
Popular LLMs:
- OpenAI GPT-4: Most capable for complex reasoning
- Anthropic Claude: Excellent for long context and analysis
- Google Gemini: Strong at multimodal tasks
- Meta Llama: Open source option
The Vercel AI SDK: Your Gateway to LLMs
The Vercel AI SDK provides a unified interface to multiple LLM providers:
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { anthropic } from '@ai-sdk/anthropic'
// Use any model with the same interface
const response1 = await generateText({
  model: openai('gpt-4-turbo'),
  prompt: 'Explain quantum computing'
})

const response2 = await generateText({
  model: anthropic('claude-3-sonnet'),
  prompt: 'Explain quantum computing'
})
Two Core Functions: generateText vs streamText
generateText: Wait for complete response
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
const { text } = await generateText({
  model: openai('gpt-4-turbo'),
  prompt: 'Write a haiku about TypeScript'
})

console.log(text)
// Output (all at once):
// "Types dance in the code
// Errors caught before runtime
// JavaScript refined"
streamText: Get response as it's generated
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'
const { textStream } = await streamText({
  model: openai('gpt-4-turbo'),
  prompt: 'Write a haiku about TypeScript'
})

for await (const chunk of textStream) {
  process.stdout.write(chunk) // Types... dance... in... the... code...
}
Message-Based Conversations
For chat-like interactions, use the messages array:
const { text } = await generateText({
  model: openai('gpt-4-turbo'),
  messages: [
    {
      role: 'system',
      content: 'You are a helpful financial analyst.'
    },
    {
      role: 'user',
      content: 'What was Apple\'s revenue last quarter?'
    }
  ]
})
Message Roles:
- system: Instructions for the AI's behavior
- user: The human's input
- assistant: The AI's previous responses
- tool: Results from tool executions (we'll cover this in Module 2)
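A small helper makes it easy to keep this array consistent as the conversation grows. ChatMessage, createHistory, and addTurn below are illustrative names, not SDK exports (the ai package ships its own message types):

```typescript
// Local message shape mirroring the { role, content } objects the SDK expects.
type Role = 'system' | 'user' | 'assistant' | 'tool'

interface ChatMessage {
  role: Role
  content: string
}

// Start every conversation with a single system message.
function createHistory(systemPrompt: string): ChatMessage[] {
  return [{ role: 'system', content: systemPrompt }]
}

// Record one complete exchange: the user's input and the model's reply.
function addTurn(history: ChatMessage[], userInput: string, reply: string): void {
  history.push({ role: 'user', content: userInput })
  history.push({ role: 'assistant', content: reply })
}
```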
Project: "The One-File Agent"
Let's build your first AI agent in a single TypeScript file.
Setup
mkdir my-first-agent
cd my-first-agent
npm init -y
npm install ai @ai-sdk/openai dotenv
npm install -D typescript @types/node tsx
npx tsc --init
Create .env:
OPENAI_API_KEY=your_api_key_here
The Code
Create agent.ts:
import { generateText, type CoreMessage } from 'ai'
import { openai } from '@ai-sdk/openai'
import * as dotenv from 'dotenv'
import * as readline from 'readline'

dotenv.config()

const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout
})

async function ask(question: string): Promise<string> {
  return new Promise((resolve) => {
    rl.question(question, resolve)
  })
}

async function main() {
  console.log('🤖 AI Agent Started. Type "exit" to quit.\n')

  // Typed with the SDK's CoreMessage instead of any[]
  const conversationHistory: CoreMessage[] = [
    {
      role: 'system',
      content: 'You are a helpful AI assistant. Be concise and friendly.'
    }
  ]

  while (true) {
    const userInput = await ask('You: ')

    if (userInput.toLowerCase() === 'exit') {
      console.log('Goodbye!')
      rl.close()
      break
    }

    conversationHistory.push({ role: 'user', content: userInput })

    console.log('Agent: Thinking...')

    const { text } = await generateText({
      model: openai('gpt-4-turbo'),
      messages: conversationHistory
    })

    console.log(`Agent: ${text}\n`)

    conversationHistory.push({ role: 'assistant', content: text })
  }
}

main()
Run It
npx tsx agent.ts
What Just Happened?
You built a functional AI agent that:
- Maintains conversation history
- Sends context to the LLM
- Displays responses
- Continues the conversation
This is the foundation. Everything else builds on this simple loop.
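One limitation worth noticing: conversationHistory grows without bound, and every model has a finite context window. A minimal sketch of one mitigation, dropping the oldest turns while always keeping the system message (trimHistory is an illustrative helper, not an SDK function):

```typescript
// Same { role, content } shape the agent's history uses.
interface Msg {
  role: 'system' | 'user' | 'assistant'
  content: string
}

// Keep the system prompt plus only the most recent messages.
function trimHistory(history: Msg[], maxMessages: number): Msg[] {
  if (history.length <= maxMessages) return history
  const [system, ...rest] = history
  return [system, ...rest.slice(rest.length - (maxMessages - 1))]
}
```

In production you would usually budget by tokens rather than message count, but the shape of the solution is the same.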
Key Takeaways
- Agents are different from chatbots—they can reason, use tools, and take action
- JavaScript is excellent for building AI agents (you're orchestrating, not training)
- The Vercel AI SDK provides a clean, type-safe interface to any LLM
- Every agent follows a loop: think, decide, act, observe, repeat
Exercise: Extend Your Agent
Before moving to Module 2, try modifying your one-file agent:
- Add a system message that changes the agent's personality
- Implement streaming instead of waiting for full responses
- Add a command like /reset that clears conversation history
- Experiment with different models (try Claude or Gemini)
Next up: Module 2, where we give the agent "hands" by implementing tool calling.

