Intel Arc Pro B60 GPUs deliver up to 4x better performance-per-dollar than NVIDIA's L40S in MLPerf Inference v5.1 benchmarks.
Intel has published new benchmarks of its Project Battlematrix workstation featuring Intel Arc Pro B60 GPUs through MLCommons' latest MLPerf Inference v5.1 release. The results demonstrate the performance of Intel GPU Systems featuring Intel Xeon with P-cores and Intel Arc Pro B60 graphics across six key benchmarks. In Llama 8B inference workloads, the Intel Arc Pro B60 delivers performance-per-dollar advantages of up to 1.25x compared to the NVIDIA RTX PRO 6000 and up to 4x compared to the NVIDIA L40S.
More specifically, in Llama 3.1 (8B Datacenter) inference workloads, the Intel Arc Pro B60 "Project Battlematrix" solution generates 6472.37 samples/s in offline mode and 5348.45 queries/s in server mode, compared to 1642.22 samples/s and 1207.14 queries/s on the NVIDIA L40S. While the RTX PRO 6000 "Blackwell" GPU delivers faster performance overall, Intel claims up to 25% better performance per dollar with its solution, addressing the cost factor that has limited adoption of high-performance inference platforms.
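To put the quoted figures in perspective, the raw throughput ratios can be worked out directly. The following is a minimal Python sketch, not Intel's methodology: the dictionary keys, the `throughput_ratio` and `perf_per_dollar` helpers, and the "server" label for the Queries/s figures are assumptions made for illustration. System prices are not given in the article, so `perf_per_dollar` takes the cost as a parameter instead of hard-coding one.

```python
# Throughput figures quoted in the article for Llama 3.1 8B (Datacenter),
# MLPerf Inference v5.1. "offline" is in samples/s; "server" is in queries/s
# (assumed here to be the MLPerf Server scenario).
RESULTS = {
    "intel_b60_battlematrix": {"offline": 6472.37, "server": 5348.45},
    "nvidia_l40s": {"offline": 1642.22, "server": 1207.14},
}


def throughput_ratio(system_a: str, system_b: str, scenario: str) -> float:
    """Return how many times faster system_a is than system_b in a scenario."""
    return RESULTS[system_a][scenario] / RESULTS[system_b][scenario]


def perf_per_dollar(throughput: float, system_cost_usd: float) -> float:
    """Throughput per US dollar. The cost is a caller-supplied value
    (hypothetical parameter; the article does not state prices)."""
    if system_cost_usd <= 0:
        raise ValueError("system_cost_usd must be positive")
    return throughput / system_cost_usd


offline_ratio = throughput_ratio("intel_b60_battlematrix", "nvidia_l40s", "offline")
server_ratio = throughput_ratio("intel_b60_battlematrix", "nvidia_l40s", "server")

print(f"Offline throughput ratio: {offline_ratio:.2f}x")  # about 3.94x
print(f"Server throughput ratio:  {server_ratio:.2f}x")   # about 4.43x
```

These are throughput ratios only. Turning them into a performance-per-dollar comparison requires the actual price of each submitted configuration, which would be passed to `perf_per_dollar` for both systems before dividing one result by the other.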
Until now, professionals seeking platforms capable of delivering high inference performance while maintaining data privacy and avoiding heavy subscription costs tied to proprietary AI models have faced limited options. Intel's Project Battlematrix is designed to address this gap, providing an all-in-one inference platform combining validated hardware and software to meet the needs of modern AI inference deployment for large language models.
The system includes a containerized solution built for Linux environments, optimized for multi-GPU scaling and PCIe P2P data transfers, along with enterprise-class reliability and manageability features such as ECC, SR-IOV, telemetry, and remote firmware updates. CPUs play a vital orchestration role in these systems, handling preprocessing, transmission, and overall system coordination. Intel Xeon has established itself as the preferred CPU for hosting and managing AI workloads in GPU-powered systems through sustained improvements in CPU-based AI performance over the past four years.
Intel remains the only vendor submitting server CPU results to MLPerf, underscoring its commitment to accelerating AI inference across both CPU and accelerator architectures. The Intel Xeon 6 with P-cores achieved a 1.9x generation-over-generation performance improvement in MLPerf Inference v5.1, further demonstrating the company's leadership in CPU-based AI inference.