AI-Powered Notebooks and Chained Workflows
Your notebook is now the richest AI surface you have. Jupyter, VS Code, Hex, and Deepnote all ship with AI features that can write cells for you, explain output, generate charts, and even run multi-step analyses from a single natural-language request. Chained workflows go further: the AI automatically runs a sequence of steps (query → clean → analyze → summarize) without you prompting at each one.
This lesson shows how to use AI inside notebooks productively and how to chain multiple AI steps into repeatable analyst workflows.
What You'll Learn
- The main AI notebook surfaces and what each one is best for
- Effective in-notebook prompting habits
- Building chained workflows for repeated analyses
- When multi-step AI agents are worth it — and when they are not
AI Notebook Surfaces You Should Know
Jupyter AI (open-source): Provides %%ai cell magics and a chat panel. Works with Claude, GPT, Gemini, and local models. Good for data scientists who live in Jupyter.
GitHub Copilot in VS Code notebooks: Inline cell completions, "explain this cell," and "fix this error" actions. Best if your notebooks live in VS Code alongside your code repo.
Hex AI: Commercial notebook with built-in AI that can generate entire cells from natural-language descriptions, suggest next steps, and summarize results. Strong for shared analyst work.
Deepnote AI: Similar to Hex — natural-language to SQL and Python, AI-generated narrative, and shared collaborative notebooks.
ChatGPT Advanced Data Analysis: A sandboxed notebook experience inside ChatGPT. You upload a CSV and chat; it writes and runs Python under the hood.
Claude with the analysis tool: Similar to ChatGPT's code interpreter, running Python and producing charts inline.
For most analysts, Hex or Deepnote is the best fit for shared, collaborative work, and the ChatGPT / Claude sandboxes shine for one-off analyses where you do not want to set up a local environment.
In-Notebook Prompting Habits
The notebook is a streaming environment — you see intermediate results. Use that to your advantage.
1. Write the plan as a markdown cell first
Before asking AI to generate code, write the analysis plan:
## Goal
Understand Q1 2026 churn drivers
## Steps
1. Load orders and customers for 2025-10-01 to 2026-03-31
2. Define churn = 90 days since last order
3. Compute RFM per customer at churn boundary
4. Cross-tab churn vs segment
5. Logistic regression: churn ~ segment + RFM
6. Summarize top 3 drivers
Then ask the AI to write code for one step at a time, in a new cell each time. This gives you an auditable, modular notebook.
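For example, asking the AI for step 2 of the plan above might produce a cell like the following. This is a sketch, not the lesson's canonical code: the inline sample data and the customer_id / order_date column names are illustrative assumptions.

```python
import pandas as pd

# Step 2 of the plan: churn = no order in the 90 days before the analysis date.
# In a real notebook, `orders` would come from the step-1 load cell; here it is
# a tiny inline sample so the cell runs on its own.
ANALYSIS_DATE = pd.Timestamp("2026-03-31")
CHURN_WINDOW_DAYS = 90

orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "order_date": pd.to_datetime(
        ["2026-03-15", "2025-11-02", "2025-12-20", "2026-02-01"]
    ),
})

# Days since each customer's most recent order, measured at the churn boundary.
last_order = orders.groupby("customer_id")["order_date"].max()
days_since = (ANALYSIS_DATE - last_order).dt.days
churned = (days_since > CHURN_WINDOW_DAYS).rename("churned")
```

Because each plan step lives in its own cell, you can rerun and inspect this one in isolation before letting the AI generate step 3.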
2. Narrate what each cell does
After every cell, add a markdown cell with a one-sentence summary:
"This cell loads orders from 2025-10-01 to 2026-03-31 and filters to completed orders only. Expected row count: ~4.8M."
Ask the AI to generate these summaries for you:
"For each code cell in this notebook, write a one-line markdown summary to go above it."
Future-you will thank present-you.
3. Ask "what would a senior analyst check here?"
After running a cell, ask:
"I just ran this groupby aggregation. What checks should a senior analyst do on this output before trusting it? Give me the specific assertions (Python code) to add."
You will get useful checks: no nulls in the primary key, row count matches expectation, totals reconcile to a known value.
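The assertions the AI hands back tend to look like this. The DataFrame, the expected row count, and the reconciliation total below are all placeholders to swap for your own values.

```python
import pandas as pd

# Hypothetical output of the groupby: revenue per segment.
result = pd.DataFrame({
    "segment": ["consumer", "smb", "enterprise"],
    "revenue": [120_000.0, 80_000.0, 300_000.0],
})

# 1. No nulls or duplicates in the primary key.
assert result["segment"].notna().all(), "null segment in output"
assert result["segment"].is_unique, "duplicate segment rows"

# 2. Row count matches expectation (one row per known segment).
assert len(result) == 3, f"expected 3 segments, got {len(result)}"

# 3. Totals reconcile to a known value, within 1% tolerance.
KNOWN_TOTAL = 500_000.0  # placeholder: e.g. the finance dashboard's figure
assert abs(result["revenue"].sum() - KNOWN_TOTAL) / KNOWN_TOTAL < 0.01
```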
4. Debug by asking for hypotheses
When a cell errors or produces something weird, ask:
"This cell was supposed to return 1,847 rows but returned 0. Given the upstream cells (paste output), list the top 5 causes ranked by likelihood and how to check each."
This is faster than guessing.
Chained Workflows
A chained workflow is a pipeline where the output of one AI step feeds the next. For analysts, common chains look like:
1. SQL generation → 2. Execute on warehouse → 3. Pandas summary → 4. Chart → 5. Narrative
You can build these in three ways:
Approach A: Manual chaining in a notebook
Just run each step yourself, copy the output, and feed it to the next AI prompt. This is the simplest and easiest to debug. Start here.
Approach B: Templated notebook chains
Build a notebook with parameterized cells that run end-to-end. Ask AI to generate the template:
"Create a Jupyter notebook template for a weekly customer retention report. Parameters: start_date, end_date, cohort_definition. The notebook should:
- Load data from Snowflake using the provided dates
- Compute the cohort retention matrix
- Plot retention curves
- Generate an AI narrative describing the top 3 observations
- Export a PDF report
Use papermill-style parameterization. Each section should be a separate cell so I can re-run individual steps."
Once you have the template, a weekly run is as simple as changing two dates.
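Concretely, papermill-style parameterization means the template's first code cell is tagged "parameters" and holds defaults that each run overrides. The variable names mirror the prompt; the default values are illustrative.

```python
# Parameters cell — tag this cell "parameters" in Jupyter so papermill can
# inject overrides at run time. These defaults cover an ad-hoc manual run.
start_date = "2026-01-01"
end_date = "2026-03-31"
cohort_definition = "first_order_month"
```

A weekly run then becomes a one-liner that overrides the dates, along the lines of `papermill retention_template.ipynb runs/2026-04-06.ipynb -p start_date 2026-01-06 -p end_date 2026-04-06`.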
Approach C: Multi-step agents
Tools like LangChain, AutoGPT, CrewAI, OpenAI Assistants, and Claude's tool use allow you to chain AI calls programmatically — one step picks what to do next based on the output of the previous step.
For analyst workflows, the useful pattern is:
Agent 1: Question Parser
Given a user question in English, break it into:
- Metric
- Filters
- Time period
- Expected output shape
Agent 2: SQL Writer
Given the parsed question and the schema, write SQL.
Agent 3: Query Runner
Execute the SQL against the warehouse and return a DataFrame.
Agent 4: Result Checker
Verify the DataFrame has reasonable values. Flag any anomalies.
Agent 5: Narrative Writer
Summarize the result in one paragraph for a stakeholder.
Frameworks like LangChain make this straightforward to build. Whether the complexity is worth it depends on how often you run the workflow.
When NOT to use agents
Resist the temptation to agent-ify everything:
- If a workflow runs once a week, a parameterized notebook is better than an agent
- If the output requires nuanced judgment, keep a human in the loop
- If the data is regulated, agents that act autonomously create audit problems
- Debugging a failing 5-step agent is painful; debugging a failing cell is easy
Reach for agents when the workflow runs daily or more, the inputs are well-bounded, and the cost of a wrong output is low (internal metrics, not board-reported numbers).
Example: Weekly Retention Report Chain
Here is a concrete chain worth building:
- Trigger: Every Monday at 6am
- Step 1 — Query: Pull last 90 days of user events from the warehouse
- Step 2 — Clean: Apply standard cleaning: dedupe, dtype coerce, filter test accounts
- Step 3 — Compute: Cohort retention matrix for last 6 monthly cohorts
- Step 4 — Chart: Retention curves plotted as a PNG
- Step 5 — Narrate: AI-generated 150-word summary of what changed vs last week
- Step 6 — Deliver: Email to stakeholders with chart and narrative
You can build this as a parameterized Jupyter notebook run by cron, or as a proper agentic pipeline. Start simple, add complexity when needed.
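Step 3, the cohort retention matrix, is the computational heart of this chain. A pandas sketch, assuming an events table with user_id and event_date columns (the inline sample data is illustrative):

```python
import pandas as pd

# Step 3 sketch: monthly cohort retention matrix from a user-event table.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "event_date": pd.to_datetime(
        ["2026-01-05", "2026-02-03", "2026-03-10",
         "2026-01-20", "2026-02-14", "2026-02-01"]
    ),
})

# Each user's cohort is the month of their first event.
events["event_month"] = events["event_date"].dt.to_period("M")
events["cohort"] = events.groupby("user_id")["event_month"].transform("min")
events["months_since"] = (events["event_month"] - events["cohort"]).apply(lambda d: d.n)

# Retention = distinct active users per (cohort, month offset) / cohort size.
cohort_sizes = events.groupby("cohort")["user_id"].nunique()
active = events.groupby(["cohort", "months_since"])["user_id"].nunique()
retention = (active / cohort_sizes).unstack(fill_value=0)
```

The resulting matrix has one row per cohort and one column per month offset, which is exactly the shape steps 4 and 5 (chart, narrative) consume.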
Validating Chain Output
Every chained workflow needs validation. For the retention report:
- Is the total user count within 5% of last week? If not, alert.
- Did the query return the expected number of cohorts? If not, skip.
- Is the narrative length reasonable (50-250 words)? If not, regenerate.
- Does the chart render correctly? If not, fall back to static.
These checks prevent the chain from confidently sending wrong numbers when something upstream breaks.
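The first three checks above can be packaged as a small guard function that runs before the delivery step. The thresholds mirror the list; the function name and signature are illustrative.

```python
def validate_report(user_count: int, last_week_count: int,
                    n_cohorts: int, narrative: str) -> list[str]:
    """Return a list of problems; an empty list means the report may ship."""
    problems = []
    # Total user count within 5% of last week? If not, alert.
    if abs(user_count - last_week_count) / last_week_count > 0.05:
        problems.append(f"user count moved >5%: {last_week_count} -> {user_count}")
    # Expected number of cohorts? If not, skip the send.
    if n_cohorts != 6:
        problems.append(f"expected 6 cohorts, got {n_cohorts}")
    # Narrative length reasonable? If not, regenerate.
    words = len(narrative.split())
    if not 50 <= words <= 250:
        problems.append(f"narrative is {words} words, outside 50-250")
    return problems

# A healthy run (counts within 5%, 6 cohorts, 150-word narrative) passes.
issues = validate_report(10_300, 10_000, 6, "word " * 150)
```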
Notebook-to-Production
When a notebook becomes critical (runs weekly, stakeholders depend on it), productionize:
- Move SQL into a dbt model
- Move pandas logic into a Python script or a dbt Python model
- Schedule via Airflow, Prefect, Dagster, or GitHub Actions
- Keep the notebook for exploration, move the scheduled run to pipeline code
AI helps with this migration:
"Convert this Jupyter notebook into a production Python script with:
- Proper logging (use the logging module, INFO for progress, ERROR for failures)
- Retry logic on warehouse calls (3 attempts with exponential backoff)
- Input validation (assert expected row counts)
- Output artifacts to a specified S3 bucket
- A Dagster asset decorator so it runs in our pipeline
Paste notebook code: {...}"
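The retry requirement from that prompt, for instance, comes out as a small stdlib-only helper. This is a sketch: the 3-attempt / exponential-backoff constants come from the prompt, while the helper name and the flaky_query example are illustrative.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("retention_report")

def with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Call fn(); on failure, retry with exponential backoff (1s, 2s, 4s...)."""
    for attempt in range(1, attempts + 1):
        try:
            result = fn()
            log.info("call succeeded on attempt %d", attempt)
            return result
        except Exception as exc:
            log.error("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Example: a warehouse call that fails once, then succeeds on the retry.
calls = {"n": 0}
def flaky_query():
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("warehouse timeout")
    return [("2026-03-30", 12450)]

rows = with_retries(flaky_query, base_delay=0.01)
```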
Key Takeaways
- Pick the notebook surface that matches your environment (Jupyter AI, Hex, Deepnote, ChatGPT ADA)
- Write the plan first, generate code one step at a time, narrate each cell
- Chain workflows manually before automating — manual chains are easy to debug
- Multi-step agents pay off only for high-frequency, low-risk workflows
- Add validation checks to every chained workflow to catch upstream breakage
- When a notebook becomes critical, productionize — AI helps with the migration