Intel has announced the release of its fourth-generation Xeon Scalable server processors, codenamed Sapphire Rapids. They will be available both in regular variants and in Xeon Max variants with onboard HBM2e memory. In addition, the manufacturer announced Data Center GPU Max server accelerators, codenamed Ponte Vecchio.
Intel’s portfolio has been expanded with 52 fourth-generation Xeon Scalable server processors. While AMD’s competing EPYC Genoa chips offer up to 96 cores, Intel’s new products top out at 60 physical cores. According to Intel, however, the increased core count over third-generation Xeon Scalable (Ice Lake) delivers up to a 53% gain in compute performance.
One of the main features of Sapphire Rapids is a set of dedicated hardware blocks that speed up certain types of tasks, such as data movement, compression, encryption, and data analytics. The company calls them “accelerators”.
According to Intel, the average performance per watt of Sapphire Rapids is up to 2.9 times higher than that of its predecessors; in AI-related tasks, performance increases by up to 10 times, and in analytics workloads it triples.
Xeon Sapphire Rapids, built on the Intel 7 process technology, adds support for the PCIe 5.0 interface, DDR5-4000, DDR5-4400 and DDR5-4800 RAM, and the CXL 1.1 bus.
As mentioned above, the fourth-generation Xeon Scalable series comprises 52 models. Some are designed for general workloads; others are specialized for liquid-cooled systems, network infrastructure, HPC, databases, or cloud computing. Sapphire Rapids processors span Max, Platinum, Gold, Silver and Bronze tiers with different core counts.
The new products support AVX-512 instructions, Deep Learning Boost (DL Boost) and Advanced Matrix Extensions (AMX). The latter significantly increases processor performance in tasks related to AI algorithms and machine learning.
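On Linux, the presence of these instruction-set extensions can be checked via the CPU feature flags the kernel exposes in /proc/cpuinfo. A minimal sketch (the sample flag string below is illustrative, not captured from a real Sapphire Rapids system):

```python
def supported_extensions(flags_line: str) -> dict:
    """Map Sapphire Rapids ISA extensions to their /proc/cpuinfo flag names."""
    wanted = {
        "AVX-512": "avx512f",            # foundation subset of AVX-512
        "DL Boost (VNNI)": "avx512_vnni",
        "AMX tiles": "amx_tile",         # Advanced Matrix Extensions
        "AMX INT8": "amx_int8",
        "AMX BF16": "amx_bf16",
    }
    flags = set(flags_line.split())
    return {name: flag in flags for name, flag in wanted.items()}

# Illustrative flags excerpt, as it might appear on a 4th-gen Xeon:
sample = "fpu sse2 avx2 avx512f avx512_vnni amx_bf16 amx_tile amx_int8"
print(supported_extensions(sample))
```

On a real system the `flags` line would come from reading /proc/cpuinfo; older Xeons would simply report the AMX entries as missing.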
As before, fourth-generation Xeon Scalable supports system configurations with one, two, four and eight processor sockets; AMD’s Genoa, by comparison, scales only to two sockets. Competing solutions, however, offer more PCIe 5.0 lanes – up to 128, versus up to 80 on Intel’s processors.
The manufacturer points out that fourth-generation Xeon Scalable supports up to 1.5 TB of eight-channel DDR5-4800 memory per processor socket, with up to two DIMMs per channel (running at DDR5-4400 in that configuration). AMD Genoa, by comparison, supports up to 6 TB of DDR5-4800 RAM across 12 channels.
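The capacity figures follow directly from channel count, DIMMs per channel, and module size. A quick sanity check (the per-module capacities are inferred from the totals above, not stated by either vendor here):

```python
def max_memory_gb(channels: int, dimms_per_channel: int, dimm_gb: int) -> int:
    """Maximum RAM per socket = channels x DIMMs per channel x module size."""
    return channels * dimms_per_channel * dimm_gb

# Intel Sapphire Rapids: 8 channels, 2 DIMMs each; 96 GB modules give 1.5 TB
intel = max_memory_gb(8, 2, 96)
# AMD Genoa: 12 channels, 2 DIMMs each; 256 GB modules give 6 TB
amd = max_memory_gb(12, 2, 256)
print(intel / 1024, amd / 1024)  # -> 1.5 6.0
```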
Prices for the new 4th Gen Intel Xeon Scalable processors range from $415 for the eight-core Xeon Bronze 3408U (up to 1.9 GHz, 22.5 MB of L3 cache, 125 W TDP) to $17,000 for the 60-core Xeon Platinum 8490H (120 threads, 112.5 MB of L3 cache, 350 W TDP). For comparison, AMD Genoa has a similar power consumption of 360 W – though in that case we are talking about a 96-core model.
Processor prices depend not only on core count but also on how many of the aforementioned hardware acceleration units are active. Customers can choose processor models based on their needs and, if required, enable the missing acceleration blocks through the Intel On Demand licensing program. The company has not yet announced pricing for these licenses, but they will be available through, for example, OEM server builders, with activation handled in software.
The whole idea of on-demand activation of the acceleration blocks boils down to this: when purchasing a given fourth-generation Xeon Scalable model, the buyer pays only for the accelerators they need at the moment, rather than overpaying for unused functionality. If the need arises later, the remaining functions can be paid for and enabled.
The hardware acceleration blocks in Sapphire Rapids processors come in four types:
- Data Streaming Accelerator (DSA) – improves data-movement performance by offloading data copy and transformation operations from the CPU cores;
- Dynamic Load Balancer (DLB) – prioritizes packets and dynamically redistributes network traffic between CPU cores as system load fluctuates;
- In-Memory Analytics Accelerator (IAA) – accelerates analytics tasks and offloads CPU cores, speeding up database queries and other functions;
- QuickAssist Technology (QAT) – speeds up cryptography and compression/decompression tasks. Previously this hardware block was part of the chipset; Intel has used it for a long time, and it enjoys extensive software support.
Intel introduced the Xeon Max series processors last fall. These are Sapphire Rapids chips equipped with 64 GB of HBM2e memory. The series offers 32 to 56 cores with up to 112 threads and a TDP of 350 W.
These processors are designed for use in fluid dynamics, climate and weather forecasting, AI and neural-network training, big-data analytics, in-memory databases, and more.
Key features of the Xeon Max series include support for the PCIe 5.0 and CXL 1.1 interfaces. The HBM2e memory can serve either as an additional cache or as additional RAM; alternatively, a Xeon Max server can forgo RAM modules entirely, with the system relying solely on HBM. The main competitors of Intel Xeon Max will be AMD EPYC Milan-X processors, which use 3D V-Cache technology to enlarge the cache.
Along with the fourth-generation Intel Xeon Scalable processors, including the Sapphire Rapids HBM models, the manufacturer officially announced the availability of the new Data Center GPU Max compute accelerators, codenamed Ponte Vecchio, which were also introduced a few months ago.
These accelerators come both as regular PCIe expansion cards and as OAM modules. Intel claims they are up to 2.4x faster than NVIDIA’s A100 accelerators. You can read more about Data Center GPU Max in our previous article.