From 900,000 Nodes to AI Inference: How Gradient Built Its Open Intelligence Stack

From 900,000 Nodes to AI Inference: How Gradient Built Its Open Intelligence Stack

May 1, 2026
6 min read
distributed-aillm-inferenceopen-sourcedecentralized-computellm-routingreinforcement-learningcrypto-ai

A few days before this conversation, Alex Mirran stood in front of a room in Austin and demoed a local routing proxy called UncommonRoute. The event was Closton, organized around the global open Claude community, and the crowd ranged from founders with barely an MVP to engineers from Dell, Meta, and Google. That breadth, Mirran says, reflects where the AI industry sits right now.

Listen on your favorite platform

View full episode details

Mirran is the Business Development lead at Gradient, an AI research and development lab whose products sit at the intersection of distributed systems and inference. He spent years in the Filecoin and IPFS communities, mined Ethereum during the proof-of-work era, and came to AI through that same lens of decentralized infrastructure. His path to Gradient, and Gradient's own origin story, follows a similar arc.

Starting with 900,000 Nodes

Before Gradient became an AI lab, it was a content delivery network. The company's CEO Eric came from Sequoia Capital China and had previously co-founded and exited a live streaming company in the Web3 space. The other co-founder, Yuan, came from Helium Network, where he led growth on the original team. Together they built a decentralized CDN that at one point reached roughly 900,000 nodes globally. That infrastructure background shaped everything that followed.

Over time, Gradient's team saw more opportunity in applying their distributed systems expertise to AI workloads, specifically inference and reinforcement learning. The engineering team shifted focus, the lab published four or five research papers, and the question became which of those research directions to commercialize. The answer led to the products Gradient ships today: Commonstack, UncommonRoute, Parallax, and Echo.

"I Don't Have Time to Look at Evals"

Commonstack is Gradient's cloud inference product, structured similarly to OpenRouter. It aggregates frontier models from various hosting providers, including Novita, under a single API key and a single API format. Swapping models is a matter of changing a parameter, not rewriting integration code.

The value proposition is partly convenience and partly cost. At the time of the conversation, Claude Opus 4.6 was priced at five dollars per million input tokens and twenty-five dollars per million output tokens on the platform. Qwen 3.6, which Mirran described as "super, super good and super strong," was available at around one dollar forty. MiniMax M2.7 came in at thirty cents input and one dollar twenty output. For tasks where top-tier reasoning is not required, the difference in spend is substantial.

"The value of having all the models in one place is that you can build your system to route to these different ones based on the tasks. And then we also do some cool stuff with our routing, so when we notice a provider giving us latency or giving us 429 rate limiting errors or even it just goes down, we switch you to a different provider of the same model."

Mirran keeps a running spreadsheet of what models people say they are using for their businesses, because the answer changes week to week. He described a recurring conversation at meetups and customer calls:

"I have been just keeping a spreadsheet with all the answers that I get. And it's been actually a really cool discussion to have with people who just they're like I don't have time to look at evals, I don't have time to keep track. Can you just tell me like I need a really good reasoning model that's not too expensive?"

That question is part of what motivated UncommonRoute.

A Local Proxy That Routes for You

UncommonRoute is an open-source local proxy that sits in front of any inference API and makes model selection decisions automatically. It looks at three signals before routing a request: the conversational structure, semantic similarity to known task patterns, and feature complexity. Based on those signals, it assigns the request to a model and fires it off.

Gradient benchmarked the tool using its own internal system, Commonstack Bench, and reported cost savings of up to 82% compared to routing all requests to a single frontier model. The tool supports three modes: auto for a balanced approach, fast for cost-first routing, and best for quality-first routing.

Mirran was careful to distinguish UncommonRoute from Commonstack's built-in reliability routing. Commonstack monitors latency, rate limiting errors, and provider downtime and switches providers automatically. UncommonRoute is the additional layer that selects among models based on task complexity. Some developers, he acknowledged, prefer to choose their own models and would not want an automated layer making that call for them. Commonstack does not include that automated selection by default.

"It's Day Zero" for Crypto and AI

One recurring theme was the state of the AI and blockchain intersection. Mirran, who came up in the Filecoin and Ethereum communities, was candid about where things stand. He identified three areas he thinks have genuine long-term value: compute aggregation and incentivization, verification of outputs in zero-trust environments using encrypted enclaves, and agent payments over crypto rails.

On compute, he pointed to EigenLayer's Darkroom effort, which aggregates Mac minis to put idle compute to work, and to patterns like Bittensor's subnet model, which allow specialized clusters to form and dissolve around specific workloads. Gradient is thinking about a similar model, where clients could spin up their own cluster for a training run and then spin it back down.

On agent payments, he was more measured. Protocols like MPPP from Tempo and what he described as XRO2 from Coinbase are attempting to standardize how agents interact with commerce using stablecoins and smart contracts. The concept of programming agent spending limits through smart contracts is interesting to Mirran, but he sees it as genuinely early. "I think it's just really early days," he said, "and I think everyone's trying to understand product-market fit for certain use cases."

For now, Gradient is focused on customers with AI problems rather than on integrating blockchain functionality into its stack. The intention to build on-chain mechanisms over time is present, but the company is not leading with it.

Echo and Continuous Learning

The part of Gradient's roadmap Mirran described as the most significant upcoming direction is the commercialization of Echo, the company's distributed reinforcement learning framework. Gradient published papers on both Echo and Echo 2, and the team is now working on turning it into a standalone cloud platform and integrating it into Commonstack as a continuous learning feature.

The framing behind that direction draws on a Sequoia Capital paper Mirran referenced, which argued that customers should pay for outcomes rather than tools. He used TurboTax as an example: you might pay ten thousand dollars for the software, but you would pay a hundred thousand for someone to use it to close your books. The inference API, in this framing, is the tool. A continuously improving model trained on a customer's specific workload is closer to the outcome.

Gradient is looking for early partners to test the post-training pipeline, particularly teams with specific vertical needs. Mirran mentioned blockchain use cases explicitly, including market making, bridging, and trading, as areas where post-training on open-source models might produce something meaningfully better than routing requests to a general-purpose frontier model.

Gradient's open-source tools, including Parallax and the Echo framework, remain free to use, and the company describes its approach as local-first with data ownership in mind. Whether the commercial layer around Commonstack and the forthcoming Echo platform will carry the same ethos forward is a question the roadmap has not yet fully answered, though Mirran said to watch the company's Twitter for movement on crypto and on-chain functionality.

Share this post

Share on X