At CES 2023, AMD revealed the Instinct MI300, an APU accelerator for next-generation data centers that combines CPU, GPU, and memory in one package, dramatically shortening the data paths to memory and the CPU-GPU links that would otherwise cross PCIe, improving both performance and efficiency.
The accelerator is a chiplet design built from 13 dies in a 3D stack: 24 Zen 4 CPU cores, CDNA 3 GPU compute, and 8 HBM3 memory stacks, mixing 5nm and 6nm IP, for a total of 128GB of HBM3 memory and 146 billion transistors. It is slated to ship in the second half of 2023.
With more transistors than Intel's 100-billion-transistor Ponte Vecchio, the Instinct MI300 is the largest chip AMD has ever put into production. Photos of Dr. Lisa Su holding the Instinct MI300 show a package that covers more than half her palm, strikingly large for a single chip.
AMD said the design comprises nine 5nm chiplets on a 3D stack (following the pattern of earlier parts, likely 3 CPU and 6 GPU dies) and four 6nm base dies, surrounded by packaged HBM memory stacks, for a total of 146 billion transistors. AMD also said the card's AI performance is far higher than that of the previous generation (MI250X).
AMD has disclosed only this much so far. The production chip will arrive in the second half of 2023, by which point it may face competitors such as NVIDIA's Grace and Hopper GPUs, though it should land somewhat earlier than Intel's Falcon Shores.
Judging from the MI300 samples AMD representatives showed, the nine compute dies sit on an active base design that lets them communicate not only between the I/O tiles but also with the memory controllers that interface to the HBM3 stacks. This yields enormous data throughput and lets the CPU and GPU process the same data in memory simultaneously (zero copy), saving power, improving performance, and simplifying programming.
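The zero-copy idea, two processors working on one physical buffer instead of shipping copies back and forth, can be illustrated with ordinary host code. The sketch below is only an analogy, not AMD's actual programming model: it uses Python's standard-library `multiprocessing.shared_memory` so that two processes attach to the same memory block, and a write by one is immediately visible to the other with no copy made.

```python
from multiprocessing import Process, shared_memory

def gpu_like_worker(name: str) -> None:
    # Attach to the existing block by name -- no data is copied.
    shm = shared_memory.SharedMemory(name=name)
    # "Process" the data in place: double every byte.
    for i in range(shm.size):
        shm.buf[i] = shm.buf[i] * 2
    shm.close()

if __name__ == "__main__":
    # The "CPU" side creates one buffer that both sides will share.
    shm = shared_memory.SharedMemory(create=True, size=4)
    shm.buf[:4] = bytes([1, 2, 3, 4])

    # The "GPU" (here, a second process) mutates the same physical pages.
    p = Process(target=gpu_like_worker, args=(shm.name,))
    p.start()
    p.join()

    # The creator sees the worker's writes without any transfer step.
    print(list(shm.buf[:4]))  # -> [2, 4, 6, 8]

    shm.close()
    shm.unlink()
```

On the MI300 the equivalent sharing happens in hardware between the CPU and GPU dies over one HBM3 pool, so even the page-mapping bookkeeping this sketch relies on is unnecessary.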
AMD claims the Instinct MI300 delivers 8x the AI performance and 5x the performance per watt of the MI250 accelerator (based on sparse FP8 benchmarks), and that by cutting the training time of very large AI models such as ChatGPT and DALL-E from months to weeks, it can save millions of dollars in electricity costs.
Notably, the Instinct MI300 will power the upcoming next-generation two-exaflop El Capitan supercomputer in the U.S., which is expected to be the world's fastest supercomputer when its deployment completes in 2023.