The AI-driven cost crisis Wall Street has not started pricing yet | Official Website of Louis Velazquez, entrepreneur, finance guy and tech innovator

Every technology boom eventually arrives at the same uncomfortable moment: when the question stops being who is growing fastest and starts being who can actually afford to keep growing. For the AI software industry, that moment may be arriving faster than investors expected.

The numbers that triggered the conversation are not subtle. The four largest US technology companies alone (Alphabet, Amazon, Meta, and Microsoft) are forecast to spend $650 billion on AI infrastructure in 2026, according to Bloomberg. Wall Street analysts at Evercore and Bank of America are already projecting that total hyperscaler AI capex could cross $1 trillion in 2027, according to CNBC.

That is the top of the stack. Below it, the companies running on that infrastructure are facing a different version of the same pressure. According to CloudZero’s report, average monthly AI spending among enterprise software companies jumped 36% year over year, from $62,964 to $85,521, while the share planning to spend more than $100,000 per month more than doubled, from 20% to 45%.

Only 51% of organizations can confidently calculate the return on that spending. The capital is moving fast. Clarity about whether it is working has not kept pace.

The problem with paying for AI at scale

The economics of AI software are structurally different from the economics of traditional software in one important way: inference is not free. Every time an AI system responds to a query, processes a document, routes a conversation, or completes a task, it consumes compute. That consumption has a cost, and as enterprise AI usage grows, so does the bill.

“Traditional software companies build a product once and distribute it at near-zero marginal cost. An AI-native software company builds a product that has to pay a compute toll every time a customer uses it,” Nimrod Ron, CEO of CX OS provider Callers.ai, told TheStreet in an interview.

That distinction was easy to ignore when AI adoption was early and usage was low. It becomes much harder to ignore when enterprise customers are running AI workflows at the scale the CloudZero data suggests they now are.

The consequence is already visible in vendor behavior across the industry. Some AI vendors have reportedly started repricing contracts mid-cycle as infrastructure costs exceed the assumptions in their original pricing. Others have publicly committed to pricing stability despite the same cost increases. The gap between those two responses comes down to infrastructure, not pricing strategy.

What is actually happening when an AI vendor reprices a contract

The interpretation emerging from inside the sector is pointed. “When an AI vendor reprices a contract mid-cycle, it is usually not a commercial decision. It is an infrastructure confession,” Ron told TheStreet. “It means the company built its product on a fixed dependency to one or two LLM providers and had no structural way to absorb cost increases as usage scaled. The customer is absorbing the consequences of an architecture decision the vendor made years earlier.”

That reframes what investors should actually be looking at. A vendor that reprices mid-contract is doing more than chasing margin. It is revealing that its infrastructure was not designed to handle the cost curve that comes with scaling, and the customer facing the new bill is effectively paying for that architectural decision.

The alternative, building infrastructure that routes dynamically across multiple LLM providers in real time rather than locking into a single dependency, is more expensive and more complex to build upfront. But it provides a structural hedge against any single provider’s pricing decisions.
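
To make the routing idea concrete, here is a minimal sketch of cost-based provider selection. The provider names and per-token prices are hypothetical and purely illustrative, and a production router would also have to weigh answer quality, latency, and rate limits, none of which are modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str                   # hypothetical provider label
    price_per_1k_tokens: float  # assumed USD price, illustrative only
    available: bool = True


def route(providers: list[Provider]) -> Provider:
    """Pick the cheapest provider that is currently available."""
    candidates = [p for p in providers if p.available]
    if not candidates:
        raise RuntimeError("no provider available")
    return min(candidates, key=lambda p: p.price_per_1k_tokens)


def request_cost(provider: Provider, tokens: int) -> float:
    """Cost in USD of a single request that uses `tokens` tokens."""
    return provider.price_per_1k_tokens * tokens / 1000


# Assumed starting prices for three interchangeable providers.
providers = [
    Provider("provider_a", 0.030),
    Provider("provider_b", 0.020),
    Provider("provider_c", 0.025),
]
chosen = route(providers)

# Same providers after provider_b doubles its price (hypothetical scenario).
repriced = [
    Provider("provider_a", 0.030),
    Provider("provider_b", 0.040),
    Provider("provider_c", 0.025),
]
chosen_after = route(repriced)

print(chosen.name, chosen_after.name)
```

In this toy scenario the router moves traffic away from the provider that raised prices, so the vendor's unit cost rises from $0.020 to $0.025 per thousand tokens rather than to $0.040. A vendor locked to that single provider would absorb the full increase or pass it on to customers.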

Companies that made that investment early are now in a fundamentally different position than those that did not, and the repricing behavior now visible across the industry is one of the first places that difference surfaces.

Why investors should be tracking gross margin trend rather than revenue growth

The investment implications of this split are still early but increasingly visible. For most of the AI boom, investors have evaluated software companies on revenue growth, net revenue retention, and enterprise customer counts. Those metrics remain important, but they do not reveal what happens to the economics of the business as usage scales.

“The market has been evaluating AI software companies largely on revenue growth and net retention. Those are lagging indicators,” Ron added. “What investors are starting to ask is: what is your gross margin trajectory as inference costs rise? That question leads directly to infrastructure design.”

By the time a static-dependency AI company’s revenue growth slows because repricing damaged customer retention, investors watching only the top line will already be behind. The margin signal arrives first. It arrives in the cost of goods sold line, in gross margin compression, in the gap between revenue growth and free cash flow generation.
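
A small worked example shows how the margin signal can diverge from the top line. The quarterly figures below are invented for illustration and do not describe any real company. The half-point tolerance used to call a trend "stable" is likewise an assumption.

```python
def gross_margin_pct(revenue: float, cogs: float) -> float:
    """Gross margin as a percentage of revenue."""
    return 100 * (revenue - cogs) / revenue


def margin_trend(margins: list[float], tolerance_pts: float = 0.5) -> str:
    """Classify the change between the first and last period, in percentage points."""
    change = margins[-1] - margins[0]
    if change > tolerance_pts:
        return "expanding"
    if change < -tolerance_pts:
        return "compressing"
    return "stable"


# Hypothetical quarterly figures in $ millions: (revenue, cost of goods sold).
quarters = [(100, 30), (115, 38), (130, 47), (150, 60)]

margins = [gross_margin_pct(revenue, cogs) for revenue, cogs in quarters]
revenue_growth_pct = 100 * (quarters[-1][0] - quarters[0][0]) / quarters[0][0]

print(f"revenue growth over the period: {revenue_growth_pct:.0f}%")
print("gross margin by quarter:", [round(m, 1) for m in margins])
print("trend:", margin_trend(margins))
```

In this made-up series, revenue grows 50% over four quarters while gross margin falls from 70% to 60%. An investor watching only revenue would see a healthy business, while the cost of goods sold line already shows the compression.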

What different infrastructure choices are producing in 2026

The AI software industry contains very different companies, and the next phase of deployment is starting to reveal which ones are which. The cost pressure has already hit at the hyperscaler level: Meta’s free cash flow dropped from $26 billion in Q1 2025 to just $1.2 billion in Q1 2026, in part because of higher AI component costs including memory pricing, according to CNBC.

If companies at Meta’s scale and margin profile are feeling it, the effect on smaller AI-native software vendors, which have thinner unit economics and less pricing power, is likely to be harder. The infrastructure decisions that individual companies made in 2023 and 2024 are going to produce very different income statements in 2026 and 2027.

The vendors that invested in dynamic routing infrastructure are entering a period of increasing volume with a cost structure that improves as usage grows. The more conversations, transactions, or inferences they process, the more arbitrage opportunity they have across providers, and the more their per-unit cost tends to fall. The vendors that built on fixed LLM dependencies are entering the same period with a cost structure that can move in the opposite direction: as usage grows, so does exposure to provider pricing.

The conversational AI and AI agent sectors are facing this pressure most acutely because their core product is inference-heavy by design. Every customer interaction is a compute event.

A conversational AI company with a million active users is processing potentially hundreds of millions of inference calls per month. At that volume, a difference of a few cents per thousand tokens between a well-optimized routing architecture and a single-provider dependency translates directly into points of gross margin. At scale, those points determine whether a business compounds value or erodes it.
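
The arithmetic behind that claim can be sketched with back-of-the-envelope numbers. Every input below is an assumption chosen for illustration: 200 million inference calls per month, 1,000 tokens per call, a two-cent gap per thousand tokens, and $50 million in monthly revenue. None of these figures come from the reporting above.

```python
def margin_points_at_stake(
    calls_per_month: int,
    tokens_per_call: int,
    price_gap_per_1k_tokens: float,
    monthly_revenue: float,
) -> tuple[float, float]:
    """Return (extra monthly cost in USD, gross margin points that cost represents)."""
    thousands_of_tokens = calls_per_month * tokens_per_call / 1000
    extra_cost = thousands_of_tokens * price_gap_per_1k_tokens
    points = 100 * extra_cost / monthly_revenue
    return extra_cost, points


# All inputs are hypothetical.
extra_cost, points = margin_points_at_stake(
    calls_per_month=200_000_000,
    tokens_per_call=1_000,
    price_gap_per_1k_tokens=0.02,
    monthly_revenue=50_000_000,
)

print(f"extra inference cost: ${extra_cost:,.0f} per month")
print(f"gross margin at stake: {points:.1f} points")
```

Under these assumptions, a two-cent gap per thousand tokens adds about $4 million a month in cost, or roughly eight points of gross margin. Different volumes or prices change the result proportionally.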

The enterprise software market has started scrutinizing AI vendors the way it once scrutinized industrial companies. Capital intensity, cost structure, and operating leverage now carry as much weight as logo count and net revenue retention.

For investors evaluating AI software companies in 2026, the useful questions are increasingly specific: What percentage of cost of goods sold is tied to third-party LLM inference? Does the architecture allow for dynamic provider routing, or is the product locked to a fixed model stack? Has gross margin been stable, expanding, or compressing as enterprise usage has scaled over the past four quarters?

Those questions do not appear in most equity research reports on AI software companies today. The repricing behavior now becoming visible across the industry suggests they probably should.