Justin Langseth

Chief Technology Officer
LinkedIn
At Snowflake, Justin helped launch the data marketplace and worked on the AI strategy. Before that, he co-founded and led several companies, including Zoomdata and Clarabridge. He holds 51 technology patents related to data sharing, protection, and analysis. He graduated from MIT with a degree in Management of Information Technology.
October 20, 2025

Blueprints: How We Teach Agents to Work the Way Data Engineers Do

Justin Langseth
Chief Technology Officer
Stay in the Fast Lane
News and product updates in Agentic AI for enterprise data teams.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

TL;DR: Most data agents fail in production not because they can't code, but because they don't know how to work. They skip ambiguity checks, pick the wrong source tables, and build confidently in the wrong direction. At Genesis, we solve this with Blueprints: structured, repeatable methodologies that guide agents through multi-stage data tasks the same way an experienced engineer would — clarifying, validating, pausing for human input, and only then generating code. Once a team runs a Blueprint a few times, it becomes their own versioned artifact in Git. This post is Part 1 of a three-part series. Parts 2 and 3 cover Context Management and Progressive Tool Use.

Why Agents Need More Than a Prompt

I'm Justin Langseth, CTO at Genesis, and I've spent most of my career building data systems — from large-scale analytics platforms to agents that can actually reason about data.

Over the past 2 years, my team and I have been working on what we call data agents: autonomous systems that can build, fix, and extend data pipelines on their own.

From the outside, that sounds simple — connect a language model to a few APIs and let it run. But the hard problems appear once you try to make those agents operate for hours, stay coherent, and integrate with real enterprise stacks.

The issues we've faced; context overload, tool sprawl, forgotten state, are universal to anyone working with agentic systems. I started writing this series to show what happens behind the scenes: how we structure agent workflows through blueprints, manage their memory so they can think clearly, and regulate tool use so they don't drown in complexity. These are lessons learned the hard way, and I'm sharing them because every data engineering team will face the same obstacles eventually — we just happened to get there first.

This is the first post in a three-part series about how we're making agentic systems actually work at Genesis. The next two will cover Context Management and Progressive Tool Use. But none of that makes sense until you understand what a Blueprint is.

The Problem: Confidence Without Procedure

When people first see a data agent run, the first reaction is, "Wow, it can code." That's true, but it can also code the wrong thing, very confidently.

Ask a model to build a reporting table and it will give you something plausible. The problem is, it doesn't know how the work gets done. It doesn't understand the order of operations, where ambiguity hides, when to pause, or when to check with a human.

Experienced data engineers do all that instinctively. They follow a quiet, repeatable process. So our question became: how do we give an agent that same sense of procedure?

What a Blueprint Is

A Blueprint is our answer. It's not a prompt template or a script; it's a living methodology that tells an agent how to move through a multi-stage task.

Think of it like the binder every data team wishes they had: When someone asks for a new dashboard, here's how we handle it. Step 1 — clarify the metric. Step 2 — look for candidate tables. Step 3 — check lineage. Step 4 — get sign-off before coding.

Except in our case, the binder isn't for people — it's for agents. The Blueprint sits behind the agent's reasoning loop and defines what happens at each phase: what information it needs, which tools are relevant, what questions to ask, and what triggers a pause.

You can see Blueprints in action in our Genesis Walkthrough #3: Using a Blueprint to Launch a Mission.

A Real Example: Building a Sales Report

Let's take something every data team does and every language model struggles with: building a sales report.

A human data engineer hears "sales" and immediately starts unpacking it. Does this mean gross? Net? Adjusted? Is it by order date or invoice date? They know there are probably a dozen candidate tables across Snowflake, Databricks, maybe some legacy warehouse, all shaped a little differently.

A model doesn't know any of that. It grabs the first thing called sales_data and starts writing SQL.

How Blueprints Guide Agent Workflows. The Blueprint Transformation Journey

The Blueprint Transformation Journey

In our Field Mapping Blueprint, the agent follows a defined flow that mirrors how engineers actually think:

  1. Clarify the request. Identify every ambiguous term and restate the goal in plain language.
  2. Search for candidate sources. Use metadata, lineage, and naming conventions to surface possible tables and fields.
  3. Evaluate ambiguity. When multiple matches exist, summarize them — don't decide yet.
  4. Pause for feedback. Generate concise, directed questions for two audiences:
    • Business users: "When you say 'sales,' do you mean booked or invoiced?"
    • Data owners: "Between these three tables, which one is authoritative?"
  5. Update the plan. Integrate the answers, document the reasoning, and store it in Markdown for traceability.
  6. Generate the code. Translate the confirmed mapping into dbt, Snowpark, or Databricks logic.

At any step, the agent can stop and wait for humans, minutes or days, before continuing. That's deliberate. It's the digital version of a mid-level engineer walking over to someone's desk to double-check before building the wrong thing.

How Blueprints Evolve With Your Team

Each Blueprint covers one repeatable process: mapping, pipeline build, failure analysis, QA. When we deliver them, they work out of the box. But the real value starts once a customer runs them a few times.

We call this the safety-driving phase. The agent runs the Blueprint while a human watches and occasionally grabs the wheel — adding notes, clarifying edge cases, maybe rewriting a step or two. After a few drives, the agent has captured enough local knowledge to fork the Blueprint into that company's own version.

That fork becomes a living artifact: the company's data-engineering methodology rendered in code and versioned in Git. It's their intellectual property now, the way they build things, distilled into a repeatable sequence both humans and agents can follow.

For teams already doing this kind of work manually, read our post on The Evolution of Data Work: Introducing Agentic Data Engineering for the bigger picture on how we think about this shift.

The Foundation for Everything Else

Agents are powerful but forgetful. They lose context fast. Blueprints provide the skeleton the rest of the system hangs on. Context Management (which I cover in Part 2) keeps their memory clean; Blueprints tell them what to remember and when it matters.

Without structure, you get speed without reliability. With it, you get a system that behaves like a mid-level engineer — cautious, methodical, aware of what it doesn't know.

That's the goal. Not to replace data teams, but to encode their judgment so every run, every agent, and every project gets a little smarter.

To see how this applies in a concrete pipeline scenario, check out our walkthrough on how Genesis automates data pipeline development in hours.

Frequently Asked Questions

What is a Blueprint in the context of Genesis data agents? A Blueprint is a structured methodology that tells a data agent how to move through a multi-stage task, step by step, in the same sequence an experienced engineer would follow. It defines what information to gather, which tools to use, when to pause for human input, and what to do with the answers. It's not a prompt template; it's closer to a documented process playbook, except it runs inside an agent's reasoning loop.

How is a Blueprint different from a standard prompt or script? A script executes in a fixed sequence without adapting. A prompt gives the model general direction but no procedural guardrails. A Blueprint is structured enough to enforce order of operations and checkpoints, but flexible enough to branch based on what the agent discovers along the way, including stopping entirely until a human responds.

Can a Blueprint be customized for a specific company's workflow? Yes, and that's one of the core design goals. Blueprints ship with sensible defaults, but as a team runs them and adds their own context — edge cases, naming conventions, preferred data sources — the Blueprint forks into a company-specific version. That version lives in Git as the team's own intellectual property.

What kinds of tasks do Blueprints cover? Genesis ships Blueprints for common data engineering processes: field mapping, pipeline builds, failure analysis, QA, and more. The Field Mapping Blueprint described in this post handles the full cycle from ambiguous request to validated code generation in dbt, Snowpark, or Databricks.

What happens when an agent using a Blueprint encounters something it doesn't know? The Blueprint has built-in pause points. Rather than guessing, the agent generates targeted questions for the right audience — business users for definitional clarity, data owners for source-of-truth decisions — and waits for answers before continuing. This prevents the most common failure mode: confident code built on wrong assumptions.

How do Blueprints relate to Context Management and Progressive Tool Use? Blueprints define the process structure. Context Management ensures the agent keeps only relevant information in working memory as it executes that structure over long runs. Progressive Tool Use controls which tools are active at each stage so the agent isn't overwhelmed by unnecessary options. All three work together.

Is this approach specific to Snowflake or Databricks environments? No. While the examples in this post reference Snowflake and Databricks (common enterprise stacks), the Blueprint approach is platform-agnostic. Genesis integrates with a wide range of data warehouses and transformation tools. See our deployments page for the full list.

This is Part 1 of a three-part series. Continue reading: Part 2: Context Management: The Hardest Problem in Long-Running Agents and Part 3: Progressive Tool Use.

Want to learn more? Get in touch!

Experience what Genesis can do for your team.
Request a Demo
Stay in the Fast Lane
News and product updates in Agentic AI for enterprise data teams.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Keep Reading

Genesis Walkthrough #1: Exploring an S3 Bucket with Genesis Agents
Genesis Walkthrough #2: Loading data from S3 into Snowflake with Genesis
The Evolution of Data Work: Introducing Agentic Data Engineering
Stefan Williams, Snowflake & Matt Glickman, Genesis Computing | Snowflake Summit 2025
View All Articles
April 27, 2026
From "Something's Broken" to Root Cause in 5 Minutes
No items found.
No items found.
April 23, 2026
40 Minutes to Reverse-Engineer a Legacy Data Warehouse (Including the Ghost Artifacts Nobody Knew Existed)
No items found.
No items found.
April 22, 2026
From Raw Claims Data to a Live Analytics Dashboard in 7 Minutes
No items found.
No items found.
April 20, 2026
Meet Genesis Twin: The Digital Twin That Ends the Monday Morning Data Fire Drill
No items found.
No items found.
April 9, 2026
Super Data Science: ML & AI Podcast with Jon Krohn
Matt Glickman
April 8, 2026
Connecting Data Sources in Genesis
Todd Beauchene
Promotional banner for Genesis Computing
March 31, 2026
How Genesis Automates Synthetic Data Generation for Databricks Dev Environments in Under 34 Minutes
Todd Beauchene
March 19, 2026
The Death of Traditional BI - Part 1
Genesis Computing
March 11, 2026
AI Agent Builds dbt Analytics Schema in 30 Minutes
Todd Beauchene
February 26, 2026
Genesis Bronze, Silver, Gold Agentic Data Engineering: From Dashboard Sketch to Production Pipeline
Genesis Computing
February 19, 2026
How Genesis Automates Data Pipeline Development in Hours
Genesis Computing
February 12, 2026
3 cortex Codes Running in Parallel?
Justin Langseth
February 10, 2026
Powering Up Cortex Code with Genesis Superpowers
Matt Glickman
February 2, 2026
Automate Dashboard Creation with Genesis
Justin Langseth
January 27, 2026
Using AI Agents to Generate Synthetic Data
Justin Langseth
January 12, 2026
The Junior Data Engineer is Now an AI Agent
Matt Glickman
December 22, 2025
From Requirements to Production Pipelines With Genesis Missions
Genesis Computing
December 4, 2025
20 Years at Goldman Taught Me How to Manage People. Turns Out, Managing AI Agents Isn't That Different.
Anton Gorshkov
December 2, 2025
A CEO's Perspective on the Shift to AI Agents
Genesis Computing
December 2, 2025
Genesis Walkthrough #1: Exploring an S3 Bucket with Genesis Agents
Todd Beauchene
December 2, 2025
Genesis Walkthrough #2: Loading data from S3 into Snowflake with Genesis
Todd Beauchene
December 2, 2025
Genesis Walkthrough #3: Using a Blueprint to launch a mission
Todd Beauchene
December 2, 2025
Genesis Walkthrough #4: Genesis Mission prompt for required information
Todd Beauchene
December 2, 2025
Genesis Walkthrough #5: Checking in on a running mission
Todd Beauchene
December 2, 2025
Genesis Walkthrough #6: Mission document flow
Todd Beauchene
December 2, 2025
Genesis Walkthrough #7: Exploring Mission Results
Todd Beauchene
December 2, 2025
Genesis Walkthrough #8: DBT Engineering Blueprint
Todd Beauchene
November 7, 2025
Exploring Genesis UI: Agents & Their Tool
Todd Beauchene
November 7, 2025
Launching the Genesis App through the Snowflake Marketplace
Todd Beauchene
November 7, 2025
Exploring Mission Features in Genesis UI
Todd Beauchene
November 6, 2025
How Hard Could It Be? A Tale of Building an Enterprise Agentic Data Engineering Platform
Anton Gorshkov
November 4, 2025
Better Together: Genesis and Snowflake Cortex Agents API Integration
Genesis Computing
October 31, 2025
Exploring Genesis UI: Agent Workflows
Todd Beauchene
October 27, 2025
Agent Server [1/3]: Where Enterprise AI Agents Live, Work, and Scale
Justin Langseth
October 27, 2025
Agent Server [2/3]: Where Should Your Agent Server Run?
Justin Langseth
October 27, 2025
Agent Server [3/3]: Agent Access Control Explained: RBAC, Caller Limits, and Safer A2A
Justin Langseth
October 26, 2025
Delivering on agentic potential: how can financial services firms develop agents to add real value?
No items found.
No items found.
October 20, 2025
Blueprints: How We Teach Agents to Work the Way Data Engineers Do
Justin Langseth
October 20, 2025
Context Management: The Hardest Problem in Long-Running Agents
Justin Langseth
October 20, 2025
Progressive Tool Use
Genesis Computing
August 22, 2025
Your Data Backlog Isn't Just a List — It's a Risk Ledger
Genesis Computing
August 14, 2025
The Future of Data Engineering: From Months to Hours with Agentic AI
Genesis Computing
Matt Glickman gives an interview at Snowflake Summit 2025
June 27, 2025
Ex-Snowflake execs launch Genesis Computing to ease data pipeline burnout with AI agents
No items found.
No items found.
June 25, 2025
GXS Uses Autonomous AI Agents to Speed Data Engineering from Months to Hours
No items found.
No items found.
June 5, 2025
Enterprise AI Data Agents: Automating Bronze Layer to Snowflake dbt Pipelines
No items found.
No items found.
June 4, 2025
Stefan Williams, Snowflake & Matt Glickman, Genesis Computing | Snowflake Summit 2025
No items found.
No items found.
The Evolution of Data Work: Introducing Agentic Data Engineering
Matt Glickman
Justin Langseth