114-year-old tech behemoth takes biggest swing at AI yet

IBM (IBM) felt mostly like an afterthought in AI. Not anymore. 

The 114-year-old tech giant is pursuing a simple plan: build a robust AI platform that can be delivered anywhere and run on the mission-critical systems that keep sensitive data safe.

Investors have noticed and rewarded IBM, with its stock surging nearly 30% year to date.

In tandem with its AI endeavors, IBM’s quantum agenda kicked up a notch this year. 

The company has mapped out a distinctive path toward fault-tolerant quantum computing, anchored by its new “Nighthawk” hardware and quantum tools geared toward real error-correction progress.

Now IBM has struck a fresh blow in AI infrastructure that could supercharge its entire strategy: the company has lined up a new partner focused purely on inference speed and cost. For perspective, that’s the part of AI that determines whether agentic apps, contact centers, fraud checks, and supply chains feel real-time or not.

If the execution matches the talk, this might be the throughput and latency upgrade that turns IBM’s AI “plumbing” into a much bigger sales flywheel.

A fresh partnership aims to cut inference costs and latency, turning IBM’s platform-and-plumbing strategy into real throughput gains.

Bloomberg/Getty Images

IBM expands AI infrastructure push with new Groq partnership

IBM’s push into AI infrastructure just got its biggest upgrade yet.

In an interview with Bloomberg Technology, IBM Senior Vice President of Software Rob Thomas and Groq CEO Jonathan Ross laid out how the new partnership reshapes the way enterprises deploy and scale AI, and how quickly they can make it pay off.

The IBM-Groq partnership marks a major inflection point in AI infrastructure, pairing IBM’s enterprise reach and its potent watsonx ecosystem with Groq’s ultra-fast inference hardware, known as LPUs, or Language Processing Units.


As Ross puts it, that’s like going from a dial-up connection to broadband.

Groq’s chips are built for AI inference, often referred to as the “thinking” phase, and run it up to 5x faster at nearly 20% of the cost of current GPU setups. The result: lower latency, lower costs, and real-time AI for call centers, supply chains, and agentic workloads.

IBM plans to distribute Groq’s technology through its sales force and integrate it into watsonx and IBM Cloud. “We’ve already seen clients getting an impact to how they’re deploying AI because of this integration,” Thomas said.


Groq gets immediate access to IBM’s colossal enterprise client base, while IBM gets a way to accelerate its own AI “book of business,” which stands at a whopping $7.5 billion.

Financially, the move fits in remarkably well with IBM’s monetization flywheel.

Routing that $7.5 billion in projects through faster, more cost-effective inference could convert backlog to sales sooner, while the revenue-sharing model delivers incremental gains as deployments scale up.

AI clearly has a cost problem, and this deal takes direct aim at it, positioning IBM as a critical plumber of enterprise AI.

IBM leans into its AI plumbing, and the numbers show it’s working

IBM’s AI story isn’t exactly about chasing the flashiest model or the biggest hype cycle.

Instead, the tech giant is quietly building the plumbing that lets enterprise AI run: governance, infrastructure, hybrid cloud, and integration. And the numbers are starting to back it up.


In Q2 2025, IBM’s generative-AI “book of business,” the running total of its signed software and consulting deals, jumped to $7.5 billion, up sharply from earlier in the year.

Revenue climbed to $17 billion, up 8% year over year. Software jumped 10%, while infrastructure rose 14% as AI projects pulled through licenses and mainframe upgrades.

Moreover, IBM’s Granite models are deliberately designed to be efficient and are open-sourced under the Apache 2.0 license. Granite 3.0 and 3.2 add multimodal and reasoning variants along with lighter forecasting models, all living inside watsonx, IBM’s full-stack AI suite for managing data, risk, and compliance.

Then comes Red Hat, IBM’s dependable distribution arm. 

With OpenShift AI, the company aims to push and run any model on any hardware in any cloud, letting clients deploy across Nvidia, AMD, or internal systems.

That flexibility helps keep enterprise data where it belongs while cutting inference costs. Underneath it all sits IBM’s mainframe and hybrid-cloud core, integrating on-chip AI inferencing with watsonx orchestration.
