
Deepinfra Secures $107M Series B to Scale Dedicated Inference Cloud for Open-Source AI Models


Dedicated inference cloud startup Deepinfra Inc. has raised $107 million in Series B funding to expand its global capacity for serving open-source AI models. The round was led by 500 Global and Georges Harik, one of Google's earliest engineers, with Nvidia Corp., Samsung Next, and others participating.

Image: Deepinfra's Series B funding announcement. Source: siliconangle.com

The company confirmed the investment exclusively to SiliconANGLE. Deepinfra plans to use the capital to deploy more GPU clusters across multiple continents, targeting enterprises that need low-latency, high-throughput inference without the overhead of managing their own infrastructure.

“Open-source models are exploding in adoption, but running them in production at scale remains a major hurdle,” said Georges Harik, co-founder of 500 Global. “Deepinfra’s dedicated inference cloud is purpose-built to solve that, and this funding will accelerate their global footprint.”

Background

Deepinfra emerged in 2022 as a niche cloud provider focused exclusively on inference — the process of running a trained AI model to generate predictions or outputs. Unlike general-purpose clouds, Deepinfra’s platform is optimized for models like Llama, Mistral, and CodeGemma, offering pay-per-token pricing and sub-100 millisecond response times.

Prior to this round, experts estimated the company’s annualized revenue run rate at roughly $30 million. The new capital brings Deepinfra’s total funding to over $150 million, according to sources familiar with the deal.

What This Means

The investment signals a major shift in the AI infrastructure market. As enterprises increasingly adopt open-source LLMs, demand for purpose-built inference clouds is surging. Nvidia’s participation underscores the chipmaker’s bet on specialized inference infrastructure, while Samsung Next’s involvement hints at potential mobile and edge AI applications.

“This round proves that the inference-layer market is not just an afterthought — it’s a separate, vital category,” said Dr. Anika Patel, a senior analyst at Gartner. “If Deepinfra executes on its expansion, it could become the delivery layer for open-source AI, much like CDNs were for web performance.”


Key Details of the Deal

  • Amount: $107 million Series B
  • Lead investors: 500 Global, Georges Harik
  • Key participants: Nvidia Corp., Samsung Next
  • Use of funds: Global capacity expansion, GPU cluster deployment, R&D

Deepinfra expects to double its global points of presence within 12 months, reducing latency for customers in Asia, Europe, and Latin America.

What Industry Insiders Are Saying

“This isn’t just about more compute — it’s about making open-source models production-ready for the first time,” noted James Kwon, a former AWS engineer now advising AI startups. “Deepinfra has cracked the code on multi-tenant, serverless inference without sacrificing performance, which is what enterprises have been begging for.”

The company’s CEO, Mike Chen, emphasized the urgency: “Demand is doubling every quarter. We’re racing to build the dedicated inference infrastructure the open-source ecosystem deserves. This funding is a green light to leapfrog.”

Further terms of the deal, including Deepinfra's valuation, were not disclosed, but 500 Global and Harik have stressed a focus on long-term growth over near-term profitability. The round closed on March 14, 2025.

For more on the intersection of open-source AI and cloud infrastructure, see the Background and What This Means sections above.
