June 18, 2026

Tokenflation Is a Symptom. The Cure Is Context-Aware AI Architecture

Genesis Computing

TL;DR: Banks pushed AI adoption hard and it worked, but token costs are now outpacing the economics underneath them, a gap the industry calls tokenflation. Routing, FinOps tracking, and owned compute all manage the symptom. The real fix is architectural: agents that do not pay to relearn the data environment on every run.

For two years, the message inside large enterprises was simple: use AI everywhere, for everything. Then it worked. And working, it turns out, carries a price tag few organizations had modeled.

The numbers coming out of banking this quarter are a useful early warning for anyone running AI at scale. On its most recent earnings call, Royal Bank of Canada disclosed that its AI token usage, the meter that runs with every model call, had climbed 500% year over year. At JPMorgan, the chief data and analytics officer of the Payments division told Semafor that some employees are now spending more on tokens than they earn in salary. Commonwealth Bank of Australia CEO Matt Comyn put the mechanism plainly at an industry conference, noting that once a workload adds reasoning, tool use, and large context windows, token costs stop scaling linearly.

The industry has started calling this tokenflation: the gap that opens up when AI adoption succeeds faster than the economics underneath it.

Why Agentic Work Breaks the Budget

The reason the bills are unpredictable is structural. In early rollouts, costs stayed modest because tasks were simple: a summary here, a draft there. But as tools become agentic, a single prompt no longer maps to a single answer. It can quietly trigger hours of autonomous work.

Task type	Typical token range	What drives the cost
Simple summarization	Roughly 1,000 tokens	One pass, no tool calls, fixed input length
Multi-step reasoning with tool use	100,000+ tokens for comparable output length	Repeated reasoning passes, tool calls, context reloaded at each step
Long-running agentic task (CIBC's CAI 2.0 pilot)	10 million to 100 million tokens per task	Hours of autonomous work, full context re-established repeatedly across a long session

Deloitte makes the same point to technology buyers directly: AI spend is structurally volatile and nonlinear by design, and a workload's token footprint is the cumulative result of decisions made across the stack, including model selection, context length, and orchestration. A deployment that looks cheap at pilot volume produces a very different bill once it hits production complexity.
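To see why the footprint can grow faster than the step count, consider a rough sketch in which an agent re-sends its accumulated context on every step. The function name and the per-step figures are illustrative assumptions, not measurements from any bank or vendor.

```python
def total_input_tokens(steps: int, base_context: int, added_per_step: int) -> int:
    """Total input tokens when the whole context is re-sent at every step.

    Step k (0-indexed) sends the base context plus everything the
    previous k steps appended (tool output, intermediate reasoning).
    """
    return sum(base_context + k * added_per_step for k in range(steps))


# Hypothetical figures: 1,000-token starting context, 2,000 tokens appended per step.
for steps in (1, 10, 100):
    total = total_input_tokens(steps, base_context=1_000, added_per_step=2_000)
    print(f"{steps:>3} steps -> {total:,} input tokens")
```

With these made-up inputs, one step costs 1,000 tokens, ten steps cost 100,000, and a hundred steps cost 10 million: each tenfold increase in steps multiplies the total a hundredfold. The inputs were chosen so the totals land on the same orders of magnitude as the ranges quoted above.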

The Fixes Banks Are Reaching For, and Their Ceiling

Faced with a runaway meter, the instinct is to manage behavior, and banks are getting inventive about it, as reported by Evident Insights' Banking Brief:

CIBC is taking model choice out of users' hands in a new, more agentic version of its internal AI platform. The system classifies each prompt by task type and auto-routes it to whichever model clears the bar most economically, rather than reaching for the most expensive tool when a cheaper one will do the job.
TD Bank has stood up an AI FinOps function to track usage trends, and is coaching managers to treat tokens like any other line-item expense: using AI because the task calls for it, not just because the option exists.
PNC is going furthest, building out its own compute so it is not renting every token from an outside lab, with the added benefit that external models never get near its data.
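The routing approach described for CIBC can be pictured with a small sketch: classify the prompt, then pick the cheapest model rated capable of that task type. Everything here is hypothetical, including the model names, prices, capability tiers, and the keyword classifier; nothing above describes how CIBC's classifier actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                   # hypothetical model name
    price_per_1k_tokens: float  # hypothetical price in dollars
    tier: int                   # higher tier = more capable (assumed ranking)


# Hypothetical catalog; a real one would come from vendor price lists.
CATALOG = [
    Model("small", 0.001, tier=1),
    Model("medium", 0.01, tier=2),
    Model("frontier", 0.10, tier=3),
]

# Assumed minimum tier needed for each task type.
REQUIRED_TIER = {"summarize": 1, "analyze": 2, "agentic": 3}


def classify(prompt: str) -> str:
    """Toy keyword classifier standing in for a real task classifier."""
    text = prompt.lower()
    if "summarize" in text or "summary" in text:
        return "summarize"
    if "plan" in text or "migrate" in text:
        return "agentic"
    return "analyze"


def route(prompt: str) -> Model:
    """Return the cheapest model whose tier meets the task's requirement."""
    needed = REQUIRED_TIER[classify(prompt)]
    eligible = [m for m in CATALOG if m.tier >= needed]
    return min(eligible, key=lambda m: m.price_per_1k_tokens)


print(route("Summarize this memo").name)         # small
print(route("Migrate the billing tables").name)  # frontier
```

Even in toy form, the router is extra machinery: the catalog, the tier table, and the classifier all have to be built and kept current.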

Every one of these moves is rational. But notice what they share: each one bolts a new cost onto the architecture to manage a cost the architecture keeps generating by default. Rationing, routing, monitoring, and re-platforming are all overhead. Someone has to build the router, staff the FinOps team, stand up the GPUs. All of it treats the symptom.

It does not help that the labs keep changing the math. Per-token pricing has been drifting upward, and the newest top-tier models cost meaningfully more per token than the generation before them. Renting every token, for every task, indefinitely, is not a strategy. It is a liability with a subscription fee attached.

In Data Engineering, the Hidden Cost Is Context

Here is the part the headline numbers leave out: in data and analytics work, much of what an agent burns tokens on is not the task itself; it is re-learning the environment in order to do the task. On every run, the agent re-reads schemas, re-traces lineage, re-checks governance rules, and reloads business context into its window before it can take a single useful action. The data estate gets paid for again and again, as context. Matt Comyn, Commonwealth Bank of Australia's CEO, named context as one of the three things that break linear scaling. In data engineering specifically, context is not an occasional input; it is the largest and most repetitive one.

This sits on top of a problem most enterprises already have. Gartner's Data Engineering 2.0 research found that 74% of data and analytics leaders say their current practices cannot effectively support AI use cases, and only 10% believe they can meet AI project timelines. We covered Genesis's recognition in that same Gartner research here. Throwing token-hungry agents at a data estate no one has mapped does not fix that gap. It just puts a meter on it.

No router or spending cap touches this particular cost. A team can route a task to a cheaper model and still pay the context tax on every single invocation.
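A back-of-the-envelope sketch shows why. Assume each run spends some tokens re-establishing context and some on the task itself; routing changes the price per token but not the token count. All figures and names below are illustrative assumptions.

```python
def run_cost(context_tokens: int, task_tokens: int, price_per_1k: float) -> float:
    """Dollar cost of one run: context and task tokens are billed alike."""
    return (context_tokens + task_tokens) * price_per_1k / 1000


def context_share(context_tokens: int, task_tokens: int) -> float:
    """Fraction of a run's tokens spent on context rather than the task."""
    return context_tokens / (context_tokens + task_tokens)


# Hypothetical run: 80,000 tokens of schemas, lineage, and rules; 20,000 of task work.
CONTEXT, TASK = 80_000, 20_000

expensive = run_cost(CONTEXT, TASK, price_per_1k=0.10)  # hypothetical frontier price
cheap = run_cost(CONTEXT, TASK, price_per_1k=0.01)      # hypothetical routed price

print(f"frontier model: ${expensive:.2f} per run")
print(f"routed model:   ${cheap:.2f} per run")
print(f"context share either way: {context_share(CONTEXT, TASK):.0%}")
```

In this made-up case routing cuts the bill tenfold, yet 80% of what remains is still context.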

What Changes When the Knowledge Layer Is Persistent

Genesis was built on a different premise: an enterprise's institutional knowledge, meaning how its data is structured, how it connects, how it is governed, and what it means to the business, should be modeled once and made durable, not reconstructed on every run. The Genesis Context Graph puts that premise into practice. It gives agents contextual awareness of enterprise systems, workflows, governance policies, and business semantics, autonomously extracted and then refined by human experts. It turns institutional knowledge that was previously scattered across the data estate into a persistent operational asset. Agents do not rediscover the environment each time they wake up. They operate against a standing map of it. We go deeper on how this works in our post on context-first agent design.
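The general pattern of modeling context once and reusing it can be sketched as a small persistent cache. This is a generic illustration, not the Genesis Context Graph's implementation; the class name, the JSON file format, and the `discover` callback are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


class PersistentContext:
    """Loads environment context from disk, discovering it only when missing."""

    def __init__(self, path: Path, discover: Callable[[], dict]):
        self.path = path
        self.discover = discover  # expensive step, e.g. reading schemas and lineage

    def load(self) -> dict:
        if self.path.exists():
            # Later runs: reuse the stored map instead of rediscovering it.
            return json.loads(self.path.read_text())
        context = self.discover()
        self.path.write_text(json.dumps(context))
        return context


discovery_runs = 0


def discover_environment() -> dict:
    """Stand-in for an expensive crawl of schemas, lineage, and rules."""
    global discovery_runs
    discovery_runs += 1
    return {"tables": ["orders", "customers"], "pii_columns": ["customers.email"]}


with tempfile.TemporaryDirectory() as tmp:
    store = PersistentContext(Path(tmp) / "context.json", discover_environment)
    first = store.load()   # discovers and persists
    second = store.load()  # reads from disk
    print(discovery_runs, first == second)  # 1 True
```

However many times `load` is called, the discovery step executes once. A production version would also need invalidation when schemas or policies change, which this sketch omits.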

Two design choices compound that advantage.

Native deployment. Genesis agents run inside an organization's own cloud environment, whether that is Snowflake, Databricks, AWS, Google BigQuery, Azure, or on-prem Kubernetes, wherever the data already lives. Nothing is shipped out to an external orchestration layer to be processed and shipped back. This is precisely the instinct PNC is now building by hand: keep the work close to the data, and keep the data in-house. Genesis ships that as the default, not as a retrofit.
Pretrained, purpose-built agents. Because the agents are built specifically for data engineering, teams are not paying frontier-model premiums to brute-force a specialized task through a general assistant with heavy prompting and heavy token usage. It is the same right-size-the-tool logic CIBC is now applying manually through its task classifier, except here it lives in the architecture itself rather than in a routing rule someone has to build and maintain.

The proof point here is, fittingly, a lean data team rather than a bank with an unlimited budget. When a major acquisition tripled the migration backlog at GrowthZone, the company's Director of Data Services shelved a $400,000 hiring plan and deployed Genesis instead, taking the same four-person team from roughly 10 migrations a year to between 30 and 50. Read the full GrowthZone case study here. The story is not about token rationing. It is about what happens when a system stops paying to re-learn what it already knew.

Be Honest About What This Does and Does Not Do

Tokens are not free, and there is no point pretending otherwise. Agentic data engineering consumes compute, and anyone who says it does not is selling something. The point of naming tokenflation is not that spend should be zero; it’s that spend should be proportional to the value created, predictable enough to plan around, and free of the waste that comes from re-deriving the same context over and over.

Every bank in this story is, one workaround at a time, reinventing what a native, context-aware, purpose-built agent platform already provides. The faster an organization grows its AI footprint, the sooner that math catches up with it.

If tokenflation is showing up in a data estate, the real question is not how to ration the way out of it; it is whether the agents doing the work are paying to learn the environment every single time they run.

Frequently Asked Questions

What is tokenflation?

Tokenflation refers to AI token costs rising faster than expected as adoption scales, because agentic workflows consume far more tokens per task than simple chat-style use cases. Banks including RBC and JPMorgan have reported sharp increases in token usage and spending as they moved from basic AI tools to more autonomous, multi-step agents.

Why do AI token costs rise faster than usage?

Costs rise nonlinearly because agentic tasks involve reasoning steps, tool calls, and large context windows, none of which scale one-to-one with the number of requests. A multi-step task with tool use can consume 100,000 or more tokens, and a long-running agentic task can consume 10 million to 100 million, compared to roughly a thousand for a simple summarization.

Can model routing alone fix rising AI token costs?

Model routing reduces the cost of using an expensive model for a simple task, but it does not address the cost of an agent re-learning an enterprise's data environment, including schemas, lineage, and governance rules, on every single run. That context-reloading cost persists regardless of which model handles the request.

What is the Genesis Context Graph?

The Genesis Context Graph is a persistent map of an enterprise's data systems, workflows, governance policies, and business semantics. It is built once, refined by human experts, and referenced by agents on every run, so agents do not need to rediscover the environment from scratch each time.

How did GrowthZone scale its data migrations without new hires?

GrowthZone's four-person data engineering team faced a tripling of migration volume after an acquisition. Rather than hiring two to three additional engineers at an estimated $300,000 to $450,000 a year, the company deployed Genesis and increased annual migration capacity from roughly 10 migrations to between 30 and 50, using the same headcount.
