Intel Delivers Open, Scalable AI Performance in MLPerf Inference v6.0

MLPerf Inference v6.0 benchmarks showcase Intel Xeon 6 and Intel Arc Pro B-Series GPUs delivering powerful, low-latency AI inference for workstations and edge systems

What’s New: Today, MLCommons released its latest MLPerf Inference v6.0 benchmarks, showcasing results across four key benchmarks for Intel GPU Systems. Intel’s AI systems featured Intel® Xeon® 6 CPUs and Intel® Arc™ Pro B70 graphics, demonstrating accessible AI workload solutions across high-end workstation, data center and edge applications.

The results show a four-GPU Intel Arc Pro B70/B65 system delivers 128GB of VRAM to run 120B-parameter models with high concurrency, with the Arc Pro B70 providing up to 1.8x higher inference performance than the Arc Pro B60¹. Software optimizations, delivered in an open, containerized software stack, efficiently scale inference performance from single-node to multi-GPU enterprise deployments, yielding up to 1.18x higher performance on the same Intel Arc Pro B60 hardware versus MLPerf v5.1².

“The combination of Intel Xeon 6 and Intel’s Arc Pro B-Series GPUs represents our investment to expand customer choice and value, offering real-world solutions that address both LLM models and traditional machine learning workloads, with leading performance and incredible value for graphics professionals and AI developers worldwide.”

- Anil Nanduri, Intel vice president, AI Products and GTM, Intel Data Center Group

Why It Matters: As demand for AI inference grows, the professional compute market is going through a major transition in which graphics creators and AI developers seek performance and value without compromising data privacy or incurring heavy subscription costs tied to proprietary AI models.

Intel GPU Systems, featuring the newly launched Intel Arc Pro B70/B65 GPUs, are designed to meet the needs of modern AI inference, providing an all-in-one inference platform that combines full-stack validated hardware and software. With enhanced memory capacity, they aim to simplify adoption through a containerized solution built for Linux environments, optimized to deliver strong inference performance with multi-GPU scaling and PCIe peer-to-peer (P2P) data transfers, and designed to include enterprise-class reliability and manageability features such as ECC, SR-IOV, telemetry and remote firmware updates. For example, compared with comparable competitor GPU solutions, the Intel Arc Pro B70 can handle significantly larger models and context windows in multi-GPU setups, providing up to 1.6x the KV cache capacity when running larger models.

AI inference is increasingly defined not only by GPU throughput but also by CPU-accelerated system performance. The CPU shapes overall cluster efficiency and total cost of ownership, and is responsible for critical functions such as memory management, task orchestration and workload distribution, while ensuring the security, reliability and operational continuity essential to modern AI infrastructure.

Intel continues to be the only server processor vendor to submit stand-alone CPU results for MLPerf Inference benchmarks, underscoring its leadership and strong commitment to advancing AI inference across both CPU-centric and accelerator-centric platforms. As the most widely used host CPU in AI-accelerated systems, with over half of MLPerf v6.0 submissions powered by Xeon, Intel further reinforces its position at the core of the industry’s AI infrastructure. This leadership extends to the silicon itself: Intel Xeon 6 processors with P-cores delivered up to a 1.9x generational performance gain in MLPerf Inference v5.1, while built-in AI acceleration technologies such as Intel AMX and Intel AVX-512 allow workloads like LLM inference, fine-tuning and classical machine learning to run efficiently without the need for dedicated accelerator hardware.
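As a concrete illustration (not part of the MLPerf submission itself), on a Linux host you can verify whether the CPU advertises the AMX and AVX-512 instruction-set extensions mentioned above by inspecting the feature flags the kernel reports in /proc/cpuinfo; this is a minimal sketch of that check:

```python
# Minimal, illustrative sketch: check whether a Linux host's CPU
# advertises AMX and AVX-512 support via the /proc/cpuinfo feature
# flags ("amx_tile" and "avx512f" are the kernel's flag names).
import os

def parse_flags(cpuinfo_text: str) -> set:
    """Return the set of CPU feature flags from /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

if os.path.exists("/proc/cpuinfo"):  # Linux only
    with open("/proc/cpuinfo") as f:
        flags = parse_flags(f.read())
    print("AVX-512 (avx512f):", "avx512f" in flags)
    print("AMX tiles (amx_tile):", "amx_tile" in flags)
```

Workloads can then fall back to a non-accelerated code path when the flags are absent; frameworks with oneDNN backends perform this kind of dispatch automatically.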

More Context: MLPerf Inference v6.0 Results

Notices & Disclaimers

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. Visit MLCommons for more details. No product or component can be absolutely secure.

1 Based on the MLPerf Inference v6.0 benchmark. The Intel Arc Pro B60 system used for performance claims is configured with an Intel Xeon 698X, 2x Arc Pro B60 Dual GPU cards (4x Arc Pro B60 equivalent), and 8 sticks of 16GB DDR5 6400 MT/s memory.

The Intel Arc Pro B70 system used for performance claims is configured with an Intel Xeon 698X, 4x Arc Pro B70 GPU cards, and 8 sticks of 16GB DDR5 6400 MT/s memory, as of February 2026.

2 Based on the MLPerf Inference v6.0 and prior v5.1 benchmarks. The Intel Arc Pro B60 system used for performance claims is configured with an Intel Xeon 698X, 4x Arc Pro B60 Dual GPU cards (8x Arc Pro B60 equivalent), and 8 sticks of 16GB DDR5 6400 MT/s memory, as of February 2026.

Your costs and results may vary.

Intel technologies may require enabled hardware, software or service activation.