mattwood.blog

The Field and The Frontier

Two dynamics are running simultaneously in AI, and they are easy to confuse because they are measured on different axes. One tracks what it costs to keep pushing the frontier. The other tracks the price of buying a fixed level of capability. Read together they look contradictory. Read apart they explain much of where enterprise value is going to come from.

At the frontier, budgets are rising. Not because any fixed unit of capability is getting more expensive, but because the definition of frontier work keeps moving. Each generation absorbs more reasoning, more tokens, more tool use, more context, more evaluation. The frontier never holds capability fixed; it spends every efficiency gain on greater ambition. So in practical terms frontier budgets do not shrink, even as the price of any specific capability falls, because what we ask of the frontier expands to fill the new envelope.

Behind the frontier, prices are collapsing. For any fixed level of capability, the price of achieving it is falling at rates that take a moment to absorb: not incrementally, but by orders of magnitude, year over year. Both dynamics hold at the same time. The first attracts nearly all the attention. The second is where much of the enterprise value will be created this year.

The numbers are not subtle. Epoch AI and Artificial Analysis track price per million tokens for models that hit specific benchmark thresholds: fixed-capability price curves. Benchmarks are imperfect proxies for work, but they are useful here because they hold a capability threshold constant while price moves. The price of GPT-3.5 Turbo-level performance on general knowledge has fallen by roughly 9x per year, the slowest rate in the data. The price of GPT-4-level performance on PhD-level science questions has fallen by roughly 40x per year: models that cost $30 per million tokens in early 2023 had equivalents below $1 by mid-2024, and below $0.10 by early 2025. Some tiers fall even faster, though on fewer data points. The 9x and 40x rates alone are enough to make the point. This is not an efficiency gain at the margin. It is a category change in the price of benchmark-equivalent inference.
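To make the compounding concrete, here is a minimal sketch in Python. It assumes a smooth, constant annual decline, which is an idealization of the tracked price curves, and it reads "mid-2024" as 1.5 years and "early 2025" as 2 years after early 2023. The function names are illustrative and are not taken from Epoch AI or Artificial Analysis.

```python
def price_after(start_price: float, annual_factor: float, years: float) -> float:
    """Price after `years`, if price divides by `annual_factor` every year.

    Assumes a constant compounding rate; real price curves are lumpier.
    """
    return start_price / (annual_factor ** years)


def implied_annual_factor(start_price: float, end_price: float, years: float) -> float:
    """Annual decline factor implied by two price points `years` apart."""
    return (start_price / end_price) ** (1.0 / years)


# Figures from the text: $30 per million tokens in early 2023, falling ~40x per year.
# The elapsed times (1.5 and 2.0 years) are assumptions about the quoted dates.
mid_2024 = price_after(30.0, 40.0, 1.5)
early_2025 = price_after(30.0, 40.0, 2.0)

# The slowest tier in the text, ~9x per year, applied to a normalized price of 1.0.
slow_tier_two_years = price_after(1.0, 9.0, 2.0)

print(f"40x/yr after 1.5 years: ${mid_2024:.3f}")
print(f"40x/yr after 2.0 years: ${early_2025:.3f}")
print(f"9x/yr after 2.0 years, normalized: {slow_tier_two_years:.4f}")
```

Under those assumptions, $30 at 40x per year lands near $0.12 after 1.5 years and near $0.02 after 2 years, consistent with the "below $1" and "below $0.10" thresholds quoted above. Even the slowest tier leaves a little over 1% of the starting price after two years.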

The mechanism is straightforward. As the frontier advances, the techniques that produced yesterday's leading models become available for optimization. Smaller models are trained to match larger predecessors; distillation and architectural improvements compress capability into fewer parameters; open-weight models proliferate; competition pushes commodity inference toward utility economics. Any specific capability level becomes radically cheaper over time, even as the frontier itself keeps moving forward.

For most of the history of this curve, the decline was interesting but practically irrelevant. The cheapest models were not good enough to perform useful knowledge work at the standard organizations need. That changed at roughly GPT-4-class capability, first for a widening class of professional knowledge work: summarization, drafting, classification, synthesis, coding assistance, analysis, decision support. Once models cleared that bar on those tasks, falling price stopped being background noise and became deployable surface area.

Below the threshold, cheap AI is a curiosity. Above it, cheap AI is a material. Materials behave differently from tools. A tool gets picked up and put down. A material gets embedded: in infrastructure, in process, in the operating assumptions of a business. Cheap enough and capable enough, intelligence stops being an application you reach for and starts being a substrate you build on, closer to electricity or bandwidth than to software.

This is the field: the broad space behind the frontier where capabilities are no longer novel, but are cheap enough and available enough to be built into ordinary work. In almost every large enterprise the pattern is the same: experimentation is broad, production deployment is narrow, and process-level orchestration is rare. Forrester's 2025 automation predictions estimate that genAI will orchestrate less than 1% of core business processes, even as it affects process design, development, and data integration. The shortfall is not for lack of trying. It is that very few organizations have worked systematically across the whole operating surface of the business, rather than in scattered pilots. The field is wide open not because no one is looking, but because almost no one is covering it with discipline.

The hard problem has changed. Models have been accessible via API for some time; what is new is that capable inference is now cheap enough to be economically viable across workloads many organizations have not attempted yet, and that threshold drops every few months, pulling more organizations and more use cases into range. When capable inference was expensive, the model was the system. Now that it is cheap and getting cheaper, the model is a component, and the system is the work.

You can hear the shift in what customers ask. The question used to be: which model is best? The question now is: how do I select and chain models so the right one meets the right workload? That is the field thinking out loud. Customers have started doing the real work: proving capability against their own data, their own processes, and their own risk.

A workflow that saves an analyst fifteen minutes is not worth $5 in inference cost. At $0.05, the calculation changes. At $0.005, it changes again. Most untouched workflows were passed over not because automation was impossible, but because the whole system was not worth building: model cost was only one line in a budget that also included integration, exception handling, and ownership. Falling inference cost does not erase those other lines. It changes the return enough that many more workflows become worth instrumenting.
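A rough sketch of that calculation, in Python. Only the fifteen minutes saved and the three inference prices come from the text. The analyst's hourly cost, the per-run overhead for exception handling, and the one-time build cost are hypothetical placeholders chosen to show the shape of the trade-off, not estimates for any real workflow.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Workflow:
    minutes_saved: float      # analyst time saved per run (from the text: 15)
    hourly_cost: float        # hypothetical loaded cost of an analyst hour
    overhead_per_run: float   # hypothetical review / exception-handling cost per run
    build_cost: float         # hypothetical one-time integration and ownership cost

    def net_value_per_run(self, inference_cost: float) -> float:
        """Labour saved minus inference cost and per-run overhead."""
        labour_saved = self.minutes_saved / 60 * self.hourly_cost
        return labour_saved - inference_cost - self.overhead_per_run

    def runs_to_break_even(self, inference_cost: float) -> Optional[int]:
        """Runs needed to repay the build cost, or None if each run loses money."""
        net = self.net_value_per_run(inference_cost)
        if net <= 0:
            return None
        return math.ceil(self.build_cost / net)


# All values except minutes_saved are illustrative assumptions.
workflow = Workflow(minutes_saved=15, hourly_cost=40.0,
                    overhead_per_run=4.0, build_cost=50_000.0)

for cost in (5.0, 0.05, 0.005):  # inference prices named in the text
    print(f"inference ${cost}: break-even after {workflow.runs_to_break_even(cost)} runs")
```

With these placeholder numbers, the drop from $5 to $0.05 cuts the break-even point from 50,000 runs to about 8,400, while the further drop to $0.005 barely moves it. Once inference is cheap, the other budget lines dominate, which is the point of the paragraph above.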

Three postures follow. Frontier exploration: learn what just became possible, because capabilities emerge unevenly and the only way to map the edge is to probe it. Frontier-first development: build at frontier prices on the assumption that price reduction follows capability, so that what is frontier when you start is field by the time you reach scale. Field-first deployment: survey the workflows you already run, and where the return is real, instrument them, evaluate them honestly, and deploy across a far wider surface than most organizations have attempted. The discipline is in choosing well, not in automating everything that is technically feasible. Most organizations are over-indexed on the first two. The third is where the untapped value sits.

Which points at who wins. When the model is a component, access alone stops being the differentiator. The advantage moves to two kinds of machinery. The first is technical: routing, evaluation, observability, cost controls, the plumbing that matches workloads to models and keeps them honest in production. The second is organizational, and harder to buy off a shelf: proprietary data, process knowledge, clear ownership, and the authority to change how work actually gets done. The first is necessary. The second is what separates the organizations that pull ahead from the ones that run impressive pilots.
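As one illustration of the technical machinery, here is a minimal routing sketch in Python: pick the cheapest model whose score on your own evaluation set clears the bar for a given workload, subject to an optional price cap. The model names, prices, scores, and workload labels are all hypothetical, and a production router would also need fallbacks, logging, and re-evaluation as models and prices change.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ModelOption:
    name: str                          # hypothetical model tier
    price_per_million_tokens: float    # hypothetical price in dollars
    eval_scores: Dict[str, float]      # workload -> score on your own evaluation set


def route(workload: str, min_score: float, options: List[ModelOption],
          max_price: Optional[float] = None) -> Optional[ModelOption]:
    """Return the cheapest option that meets `min_score` on `workload`.

    Models with no recorded score for the workload are treated as failing.
    Returns None when nothing qualifies, so the caller can escalate or skip.
    """
    eligible = [
        m for m in options
        if m.eval_scores.get(workload, 0.0) >= min_score
        and (max_price is None or m.price_per_million_tokens <= max_price)
    ]
    return min(eligible, key=lambda m: m.price_per_million_tokens, default=None)


# Hypothetical catalog; scores would come from evaluation on local data.
catalog = [
    ModelOption("small", 0.10, {"classification": 0.93, "analysis": 0.61}),
    ModelOption("mid", 1.00, {"classification": 0.95, "analysis": 0.82}),
    ModelOption("frontier", 15.00, {"classification": 0.97, "analysis": 0.94}),
]

cheap_enough = route("classification", 0.90, catalog)
needs_more = route("analysis", 0.90, catalog)
over_budget = route("analysis", 0.90, catalog, max_price=5.00)
```

The design choice is deliberate: the router never asks which model is best, only which is the cheapest that passes. Here classification goes to the small tier, analysis goes to the frontier tier, and analysis under a $5 cap returns nothing, which is a signal to revisit the threshold, the budget, or the workflow.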

None of these postures compete. The frontier creates capability; the price curve makes that capability ordinary. Today's frontier is next year's field.

For most enterprises, the most valuable work in AI over the next several years will be the patient, unglamorous business of proving and deploying known capabilities against local data, local workflows, and local risk, at prices that finally make the attempt worthwhile.

The frontier gets explored. The field gets cultivated.