AI-Powered Notebooks and Chained Workflows
Your notebook is now the richest AI surface you have. Jupyter, VS Code, Hex, and Deepnote all ship with AI features that can write cells for you, explain output, generate charts, and even run multi-step analyses from a single natural-language request. Chained workflows go further: the AI automatically runs a sequence of steps (query → clean → analyze → summarize) without you prompting at each one.
This lesson shows how to use AI inside notebooks productively and how to chain multiple AI steps into repeatable analyst workflows.
What You'll Learn
- The main AI notebook surfaces and what each one is best for
- Effective in-notebook prompting habits
- Building chained workflows for repeated analyses
- When multi-step AI agents are worth it — and when they are not
AI Notebook Surfaces You Should Know
Jupyter AI (open-source): Provides %%ai cell magics and a chat panel. Works with Claude, GPT, Gemini, and local models. Good for data scientists who live in Jupyter.
GitHub Copilot in VS Code notebooks: Inline cell completions, "explain this cell," and "fix this error" actions. Best if your notebooks live in VS Code alongside your code repo.
Hex AI: Commercial notebook with built-in AI that can generate entire cells from natural-language descriptions, suggest next steps, and summarize results. Strong for shared analyst work.
Deepnote AI: Similar to Hex — natural-language to SQL and Python, AI-generated narrative, and shared collaborative notebooks.
ChatGPT Advanced Data Analysis: A sandboxed notebook experience inside ChatGPT. You upload a CSV and chat; it writes and runs Python under the hood.
Claude with the analysis tool: Similar to ChatGPT's code interpreter, running Python and producing charts inline.
For most analysts, Hex or Deepnote is the best fit for shared, collaborative work, and the ChatGPT / Claude sandboxes shine for one-off analyses where you do not want to set up a local environment.
In-Notebook Prompting Habits
The notebook is a streaming environment — you see intermediate results. Use that to your advantage.
1. Write the plan as a markdown cell first
Before asking AI to generate code, write the analysis plan:
## Goal
Understand Q1 2026 churn drivers
## Steps
1. Load orders and customers for 2025-10-01 to 2026-03-31
2. Define churn = 90 days since last order
3. Compute RFM per customer at churn boundary
4. Cross-tab churn vs segment
5. Logistic regression: churn ~ segment + RFM
6. Summarize top 3 drivers
Then ask the AI to write code for one step at a time, in a new cell each time. This gives you an auditable, modular notebook.
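For example, asking the AI for step 2 of the plan above might produce a cell like the following. This is a sketch, not the lesson's canonical code: the inline sample data and the customer_id / order_date column names are illustrative assumptions.

```python
import pandas as pd

# Step 2 of the plan: churn = no order in the 90 days before the analysis date.
# In a real notebook, `orders` would come from the step-1 load cell; here it is
# a tiny inline sample so the cell runs on its own.
ANALYSIS_DATE = pd.Timestamp("2026-03-31")
CHURN_WINDOW_DAYS = 90

orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "order_date": pd.to_datetime(
        ["2026-03-15", "2025-11-02", "2025-12-20", "2026-02-01"]
    ),
})

# Days since each customer's most recent order, measured at the churn boundary.
last_order = orders.groupby("customer_id")["order_date"].max()
days_since = (ANALYSIS_DATE - last_order).dt.days
churned = (days_since > CHURN_WINDOW_DAYS).rename("churned")
```

Because each plan step lives in its own cell, you can rerun and inspect this one in isolation before letting the AI generate step 3.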
2. Narrate what each cell does
After every cell, add a markdown cell with a one-sentence summary:
"This cell loads orders from 2025-10-01 to 2026-03-31 and filters to completed orders only. Expected row count: ~4.8M."
Ask the AI to generate these summaries for you:
"For each code cell in this notebook, write a one-line markdown summary to go above it."
Future-you will thank present-you.
3. Ask "what would a senior analyst check here?"
After running a cell, ask:
"I just ran this groupby aggregation. What checks should a senior analyst do on this output before trusting it? Give me the specific assertions (Python code) to add."
You will get useful checks: no nulls in the primary key, row count matches expectation, totals reconcile to a known value.
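The assertions the AI hands back tend to look like this. The DataFrame, the expected row count, and the reconciliation total below are all placeholders to swap for your own values.

```python
import pandas as pd

# Hypothetical output of the groupby: revenue per segment.
result = pd.DataFrame({
    "segment": ["consumer", "smb", "enterprise"],
    "revenue": [120_000.0, 80_000.0, 300_000.0],
})

# 1. No nulls or duplicates in the primary key.
assert result["segment"].notna().all(), "null segment in output"
assert result["segment"].is_unique, "duplicate segment rows"

# 2. Row count matches expectation (one row per known segment).
assert len(result) == 3, f"expected 3 segments, got {len(result)}"

# 3. Totals reconcile to a known value, within 1% tolerance.
KNOWN_TOTAL = 500_000.0  # placeholder: e.g. the finance dashboard's figure
assert abs(result["revenue"].sum() - KNOWN_TOTAL) / KNOWN_TOTAL < 0.01
```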
4. Debug by asking for hypotheses
When a cell errors or produces something weird, ask:
"This cell was supposed to return 1,847 rows but returned 0. Given the upstream cells (paste output), list the top 5 causes ranked by likelihood and how to check each."
This is faster than guessing.
Chained Workflows
A chained workflow is a pipeline where the output of one AI step feeds the next. For analysts, common chains look like:
1. SQL generation → 2. Execute on warehouse → 3. Pandas summary → 4. Chart → 5. Narrative
You can build these in three ways:
Approach A: Manual chaining in a notebook
Just run each step yourself, copy the output, and feed it to the next AI prompt. This is the simplest and easiest to debug. Start here.
Approach B: Templated notebook chains
Build a notebook with parameterized cells that run end-to-end. Ask AI to generate the template:
"Create a Jupyter notebook template for a weekly customer retention report. Parameters: start_date, end_date, cohort_definition. The notebook should:
- Load data from Snowflake using the provided dates
- Compute the cohort retention matrix
- Plot retention curves
- Generate an AI narrative describing the top 3 observations
- Export a PDF report
Use papermill-style parameterization. Each section should be a separate cell so I can re-run individual steps."
Once you have the template, a weekly run is as simple as changing two dates.
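Concretely, papermill-style parameterization means the template's first code cell is tagged "parameters" and holds defaults that each run overrides. The variable names mirror the prompt; the default values are illustrative.

```python
# Parameters cell — tag this cell "parameters" in Jupyter so papermill can
# inject overrides at run time. These defaults cover an ad-hoc manual run.
start_date = "2026-01-01"
end_date = "2026-03-31"
cohort_definition = "first_order_month"
```

A weekly run then becomes a one-liner that overrides the dates, along the lines of `papermill retention_template.ipynb runs/2026-04-06.ipynb -p start_date 2026-01-06 -p end_date 2026-04-06`.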
Approach C: Multi-step agents
Tools like LangChain, AutoGPT, CrewAI, OpenAI Assistants, and Claude's tool use allow you to chain AI calls programmatically — one step picks what to do next based on the output of the previous step.
For analyst workflows, the useful pattern is:
Agent 1: Question Parser
Given a user question in English, break it into:
- Metric
- Filters
- Time period
- Expected output shape
Agent 2: SQL Writer
Given the parsed question and the schema, write SQL.
Agent 3: Query Runner
Execute the SQL against the warehouse and return a DataFrame.
Agent 4: Result Checker
Verify the DataFrame has reasonable values. Flag any anomalies.
Agent 5: Narrative Writer
Summarize the result in one paragraph for a stakeholder.
Frameworks like LangChain make this straightforward to build. Whether the complexity is worth it depends on how often you run the workflow.
When NOT to use agents
Resist the temptation to agent-ify everything:
- If a workflow runs once a week, a parameterized notebook is better than an agent
- If the output requires nuanced judgment, keep a human in the loop
- If the data is regulated, agents that act autonomously create audit problems
- Debugging a failing 5-step agent is painful; debugging a failing cell is easy
Reach for agents when the workflow runs daily or more, the inputs are well-bounded, and the cost of a wrong output is low (internal metrics, not board-reported numbers).
Example: Weekly Retention Report Chain
Here is a concrete chain worth building:
- Trigger: Every Monday at 6am
- Step 1 — Query: Pull last 90 days of user events from the warehouse
- Step 2 — Clean: Apply standard cleaning: dedupe, dtype coerce, filter test accounts
- Step 3 — Compute: Cohort retention matrix for last 6 monthly cohorts
- Step 4 — Chart: Retention curves plotted as a PNG
- Step 5 — Narrate: AI-generated 150-word summary of what changed vs last week
- Step 6 — Deliver: Email to stakeholders with chart and narrative
You can build this as a parameterized Jupyter notebook run by cron, or as a proper agentic pipeline. Start simple, add complexity when needed.
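Step 3, the cohort retention matrix, is the computational heart of this chain. A pandas sketch, assuming an events table with user_id and event_date columns (the inline sample data is illustrative):

```python
import pandas as pd

# Step 3 sketch: monthly cohort retention matrix from a user-event table.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "event_date": pd.to_datetime(
        ["2026-01-05", "2026-02-03", "2026-03-10",
         "2026-01-20", "2026-02-14", "2026-02-01"]
    ),
})

# Each user's cohort is the month of their first event.
events["event_month"] = events["event_date"].dt.to_period("M")
events["cohort"] = events.groupby("user_id")["event_month"].transform("min")
events["months_since"] = (events["event_month"] - events["cohort"]).apply(lambda d: d.n)

# Retention = distinct active users per (cohort, month offset) / cohort size.
cohort_sizes = events.groupby("cohort")["user_id"].nunique()
active = events.groupby(["cohort", "months_since"])["user_id"].nunique()
retention = (active / cohort_sizes).unstack(fill_value=0)
```

The resulting matrix has one row per cohort and one column per month offset, which is exactly the shape steps 4 and 5 (chart, narrative) consume.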
Validating Chain Output
Every chained workflow needs validation. For the retention report:
- Is the total user count within 5% of last week? If not, alert.
- Did the query return the expected number of cohorts? If not, skip.
- Is the narrative length reasonable (50-250 words)? If not, regenerate.
- Does the chart render correctly? If not, fall back to static.
These checks prevent the chain from confidently sending wrong numbers when something upstream breaks.
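The first three checks above can be packaged as a small guard function that runs before the delivery step. The thresholds mirror the list; the function name and signature are illustrative.

```python
def validate_report(user_count: int, last_week_count: int,
                    n_cohorts: int, narrative: str) -> list[str]:
    """Return a list of problems; an empty list means the report may ship."""
    problems = []
    # Total user count within 5% of last week? If not, alert.
    if abs(user_count - last_week_count) / last_week_count > 0.05:
        problems.append(f"user count moved >5%: {last_week_count} -> {user_count}")
    # Expected number of cohorts? If not, skip the send.
    if n_cohorts != 6:
        problems.append(f"expected 6 cohorts, got {n_cohorts}")
    # Narrative length reasonable? If not, regenerate.
    words = len(narrative.split())
    if not 50 <= words <= 250:
        problems.append(f"narrative is {words} words, outside 50-250")
    return problems

# A healthy run (counts within 5%, 6 cohorts, 150-word narrative) passes.
issues = validate_report(10_300, 10_000, 6, "word " * 150)
```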
Notebook-to-Production
When a notebook becomes critical (runs weekly, stakeholders depend on it), productionize:
- Move SQL into a dbt model
- Move pandas logic into a Python script or a dbt Python model
- Schedule via Airflow, Prefect, Dagster, or GitHub Actions
- Keep the notebook for exploration, move the scheduled run to pipeline code
AI helps with this migration:
"Convert this Jupyter notebook into a production Python script with:
- Proper logging (use the logging module, INFO for progress, ERROR for failures)
- Retry logic on warehouse calls (3 attempts with exponential backoff)
- Input validation (assert expected row counts)
- Output artifacts to a specified S3 bucket
- A Dagster asset decorator so it runs in our pipeline
Paste notebook code: {...}"
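The retry requirement from that prompt, for instance, comes out as a small stdlib-only helper. This is a sketch: the 3-attempt / exponential-backoff constants come from the prompt, while the helper name and the flaky_query example are illustrative.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("retention_report")

def with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Call fn(); on failure, retry with exponential backoff (1s, 2s, 4s...)."""
    for attempt in range(1, attempts + 1):
        try:
            result = fn()
            log.info("call succeeded on attempt %d", attempt)
            return result
        except Exception as exc:
            log.error("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Example: a warehouse call that fails once, then succeeds on the retry.
calls = {"n": 0}
def flaky_query():
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("warehouse timeout")
    return [("2026-03-30", 12450)]

rows = with_retries(flaky_query, base_delay=0.01)
```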
Key Takeaways
- Pick the notebook surface that matches your environment (Jupyter AI, Hex, Deepnote, ChatGPT ADA)
- Write the plan first, generate code one step at a time, narrate each cell
- Chain workflows manually before automating — manual chains are easy to debug
- Multi-step agents pay off only for high-frequency, low-risk workflows
- Add validation checks to every chained workflow to catch upstream breakage
- When a notebook becomes critical, productionize — AI helps with the migration