Hardware Refresh to Bolster Enterprise Edge Offering; AI PC Portfolio and Ecosystem Shows Promise |
NEWS |
Intel’s Chief Executive Officer (CEO), Pat Gelsinger, used his April 9 keynote to launch the data center Gaudi 3 workhorse, which has been engineered from the ground up for training and inference on generative Artificial Intelligence (AI) workloads. It will be offered as a mezzanine card and 8-unit baseboard—a modular subsystem ready for large scale-out in data centers, which should ease Original Equipment Manufacturers’ (OEMs) migration from other systems, and enable high-volume shipments. The 6th generation of Xeon data center Central Processing Units (CPUs), manufactured on the in-house Intel 3 node, was also unveiled, with versions made up of performance and efficiency cores. Support for the enterprise AI story came from a host of on-stage partner testimonials, from those who have put their faith in the platform for inferencing workloads, as well as sold-out Gaudi capacity on Intel's developer cloud, which signals significant interest in the hardware.
On the client side, Intel has dedicated significant marketing resources to its AI PC hardware, and its claims suggest this has paid off, with over 5 million units shipped to date and a forecast 40 million shipments by the end of this year. This is underpinned by some 230 OEM and Independent Hardware Vendor (IHV) partnerships, and over 100 Independent Software Vendor (ISV) partners developing AI applications, which have resulted in 500 models optimized for Intel’s heterogeneous Core Ultra Systems in Package (SiPs). Energy efficiency received renewed focus for the next line of AI PC chipsets, as did additional AI compute, to keep up with Microsoft’s requirements for Copilot on-device.
Economics, Physics, and Land: Intel's Theory Guiding Its Bet on Enterprise Edge |
IMPACT |
ABI Research concurs with the tenet of Intel's edge story, which has been distilled into three areas: the laws governing economics (networking costs), physics (latency), and land (data sovereignty). The edge focus also reflects the fact that the company is behind in the high-end data center AI chip race: NVIDIA has pushed the envelope with innovation and ever more performant architectures, most recently with Blackwell. NVIDIA has also diverted resources from its edge proposition to its data center portfolio, as explored in ABI Research’s GTC 2024 whitepaper, which coincides with a renewed focus by Intel on the enterprise edge. Thus, as Intel cannot outcompete NVIDIA’s systems in training frontier models, it is betting on increasing demand from enterprise customers for AI solutions at the enterprise edge, most of which will involve less demanding inferencing and fine-tuning workloads. To address the hitherto low success rate of its customers’ edge AI deployments, and as part of its journey from semiconductor vendor to a systems-vertical player, Intel is offering less complex systems-level designs and solutions that address specific pain points, to be powered by its Gaudi and Xeon processors. The goal is to simplify enterprise edge AI deployments, brought together with the Tiber Edge Platform to build, run, and scale AI solutions with cloud-like simplicity.
By responding to customers’ pain points, the goal is to sell Intel’s diverse hardware portfolio, from CPUs and Graphics Processing Units (GPUs) to accelerators and Field Programmable Gate Arrays (FPGAs), to unlock productivity gains and innovation at the enterprise level. The “killer” use case that underpins this proposition is Retrieval Augmented Generation (RAG). Gelsinger’s bet on RAG rests heavily on the assumption that businesses are, and will remain, reluctant to allow proprietary data to migrate beyond their private servers for training domain-specific models on third-party platforms. RAG uses on-device or on-premises compute to answer natural language prompts with referenced, Large Language Model (LLM)-augmented results for productivity AI applications, including domain-specific “experts,” a use case that Intel’s partners like Infosys corroborated. The claim is that smaller (e.g., sub-13B parameter) LLMs can scale beyond frontier models and be deployed on enterprises’ vast internal datasets with data sovereignty unscathed.
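To make the RAG pattern described above concrete, the following is a minimal sketch in Python, not Intel's or any partner's implementation. It assumes a tiny, hypothetical in-house document store (`DOCUMENTS`), uses a simple bag-of-words cosine similarity in place of a real embedding model, and stops at building the referenced prompt that would be handed to a locally hosted LLM. All names (`retrieve`, `build_prompt`, the document IDs) are illustrative assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical internal documents; in a RAG deployment these never leave
# the enterprise's own servers.
DOCUMENTS = {
    "hr-001": "Employees accrue twenty days of paid leave per year.",
    "it-014": "Reset a forgotten password through the internal IT portal.",
    "fin-203": "Expense reports must be submitted within thirty days of travel.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the IDs of up to k documents most similar to the query.

    A stand-in for a vector search; documents with no overlap are dropped.
    """
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(text))), doc_id)
        for doc_id, text in documents.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Assemble a prompt that carries the retrieved sources and their IDs."""
    doc_ids = retrieve(query, documents, k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    # The resulting prompt would be passed to an on-premises LLM.
    print(build_prompt("How do I reset my password?", DOCUMENTS))
```

The design point is that the model is never trained on the proprietary data: only the retrieved passages are placed in the prompt at query time, which is why the approach is compatible with the data sovereignty argument above.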
Further (Software) Unification Will Cement the Value of Intel's AI Offering—RAG and Beyond |
RECOMMENDATIONS |
Intel should address several issues to further its journey to systems-level vertical player. The first becomes clear when contrasting the transition from Gaudi 2 to Gaudi 3 with the relative simplicity of upgrading even single NVIDIA accelerators in a system. Gaudi is not part of Intel’s edge software platform, Tiber, and the Gaudi 2 Application Programming Interface (API) would require extensive and time-consuming porting to be compatible with Gaudi 3. Future transitions of Gaudi hardware should be made as simple as the plug-and-play upgrades developers are accustomed to with NVIDIA, and the Gaudi portfolio should form part of the end-to-end Tiber Edge Platform to allow developers to scale applications in the same environment as Intel’s other hardware.
Second, Intel has placed great emphasis on the RAG solutions it sees as the next stage in the enterprise generative AI journey. But what comes beyond RAG is unknown, and whatever AI workloads emerge at the next stage could render CPUs, a core component of Intel’s revenue, less relevant. Intel must continue to invest in its hardware accelerator business and diligently acquire startups developing compelling applications for meaningful horizontal enterprise use cases to remain a relevant systems-level player. Intel’s acquisitions of Granulate and cnvrg.io are examples of value-adding portfolio growth.
Third, Intel should continue to promote the Unified Acceleration (UXL) Foundation, which is developing a hardware-agnostic toolkit that would, in theory, enable developers to write an application once and deploy it on a range of AI hardware. This framework will create greater openness, which will be beneficial for Intel chips on workloads where they claim to offer lower Total Cost of Ownership (TCO) compared to NVIDIA’s H100 and H200, for example. Furthermore, the foundation should accelerate efforts to create a tool to convert software written for CUDA into code capable of running on other companies’ AI chips, which would incentivize developers to migrate away from NVIDIA platforms. This particularly applies to inferencing workloads at the enterprise edge, where NVIDIA’s chips are less competitive from a TCO standpoint. Whether we see a future where NVIDIA owns the data center and Intel the enterprise edge depends on many factors, but a software tool bridging CUDA code to Intel’s edge accelerated hardware would catalyze such a scenario.
Finally, on-device AI PC applications beyond RAG are lacking, and consumer and enterprise customers will need tangible benefits to make the investment in premium hardware and software. Gelsinger’s claim that 2024 is the time for a hardware refresh may stack up against the 2025 Windows 10 end of life, but to drive sales, especially in the desktop segment where alternative chipsets will remain for some time, killer productivity AI applications will need to be marketed. Maintaining a deep level of congruence with Microsoft on developing the AI PC is vital; although the two have worked together on defining the AI PC, rumors that the productivity application dubbed “AI Explorer” might not feature on the first batch of Intel’s Core Ultra AI PCs are problematic. Intel must remain at the forefront of any Windows AI PC developments.