Nvidia Announces DGX GH200 Supercomputer with 256 GH200 Chips

NVIDIA made a number of major announcements at Computex Taipei 2023 today, most notably that its Grace Hopper superchips are now in full production. These chips are the core components of NVIDIA’s new DGX GH200 artificial intelligence supercomputing platform and MGX systems, which are designed to handle massive generative AI workloads. NVIDIA also announced its new Spectrum-X Ethernet networking platform, optimized for AI servers and supercomputing clusters.

The Grace Hopper superchip is an integrated CPU+GPU solution developed by NVIDIA on the Arm architecture. It combines a 72-core Grace CPU, a Hopper GPU, 96GB of HBM3 and 512GB of LPDDR5X in the same package, for a total of 200 billion transistors. This combination provides up to 1 TB/s of data bandwidth between the CPU and GPU, a significant advantage for certain memory-constrained workloads.

The DGX GH200 artificial intelligence supercomputing platform is NVIDIA’s system and reference architecture for the highest-end artificial intelligence and high-performance computing workloads. The current DGX A100 system can only combine eight A100 GPUs as a single unit. Given the explosive growth of generative artificial intelligence, NVIDIA’s customers urgently need larger, more powerful systems. The DGX GH200 is designed to provide maximum throughput and scalability by using NVIDIA’s custom NVLink Switch chip, avoiding the limitations of standard cluster connectivity options such as InfiniBand and Ethernet.

Full details of the DGX GH200 have not been disclosed, but it has been confirmed that NVIDIA is using a new NVLink Switch system containing 36 NVLink switches to connect 256 GH200 Grace Hopper chips and 144TB of shared memory into a single unit. NVIDIA CEO Jen-Hsun Huang described the combined GH200 chips as “a giant GPU”. This is the first time NVIDIA has used the NVLink Switch topology to build an entire supercomputer cluster, which NVIDIA says provides 10x more GPU-to-GPU and 7x more CPU-to-GPU bandwidth than previous-generation systems. It is also designed to deliver interconnect power efficiency 5x better than the competition and up to 128 TB/s of bisection bandwidth. The system has 150 miles (about 241.4 kilometers) of fiber and weighs 40,000 pounds, but presents itself as a single GPU. The 256 Grace Hopper superchips bring the DGX GH200’s “AI performance” to one exaflop (10^18 floating-point operations per second).
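The headline figures above can be broken down per chip with some simple arithmetic. The following is a minimal sketch, not an NVIDIA tool: the function names are hypothetical, and it assumes binary units (1 TB = 1024 GB) and that the one-exaflop figure is spread evenly across all 256 chips.

```python
# Figures quoted in the article for one DGX GH200 system.
CHIPS = 256
SHARED_MEMORY_TB = 144
AI_PERFORMANCE_EXAFLOPS = 1.0

# Assumption (not stated in the article): binary units, 1 TB = 1024 GB.
GB_PER_TB = 1024
KM_PER_MILE = 1.609344  # exact definition of the international mile


def memory_per_chip_gb(total_tb=SHARED_MEMORY_TB, chips=CHIPS):
    """Shared memory attributable to each superchip, in GB."""
    return total_tb * GB_PER_TB / chips


def petaflops_per_chip(exaflops=AI_PERFORMANCE_EXAFLOPS, chips=CHIPS):
    """Even split of the quoted AI performance; 1 exaflop = 1000 petaflops."""
    return exaflops * 1000 / chips


def miles_to_km(miles):
    """Convert miles to kilometers."""
    return miles * KM_PER_MILE


if __name__ == "__main__":
    print(f"Memory per chip: {memory_per_chip_gb():.0f} GB")
    print(f"AI performance per chip: {petaflops_per_chip():.2f} petaflops")
    print(f"Fiber length: {miles_to_km(150):.1f} km")
```

Under these assumptions the shared memory works out to 576 GB per chip, somewhat below the 608 GB (96GB of HBM3 plus 512GB of LPDDR5X) listed for a single superchip. The article does not say how the 144TB figure is counted, so the gap is left unexplained here.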

NVIDIA will provide a reference blueprint of the DGX GH200 to its major customers Google, Meta and Microsoft, and will also use the system as a reference architecture design for cloud service providers and hyperscale data centers. NVIDIA itself will also deploy a new NVIDIA Helios supercomputer consisting of four DGX GH200 systems for its own R&D efforts. The four systems have a total of 1024 Grace Hopper chips and are connected using NVIDIA’s Quantum-2 InfiniBand 400 Gb/s network.
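The Helios chip count follows directly from the per-system figures. The sketch below is illustrative only: the `DgxGh200` class and `cluster_totals` function are hypothetical names, and the memory total is a derived sum rather than a figure from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DgxGh200:
    """Per-system figures quoted in the article."""
    chips: int = 256
    shared_memory_tb: int = 144


def cluster_totals(systems: int, unit: DgxGh200 = DgxGh200()) -> dict:
    """Sum chips and memory across several DGX GH200 systems."""
    return {
        "chips": systems * unit.chips,
        "memory_tb": systems * unit.shared_memory_tb,
    }


# Helios is described as four DGX GH200 systems.
helios = cluster_totals(4)
print(helios)
```

The chip total matches the 1024 quoted for Helios. The summed memory should not be read as one shared pool, since the four systems are linked over InfiniBand rather than NVLink.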

NVIDIA DGX targets the highest-end systems and HGX targets hyperscale data centers, while the new MGX systems fall in between; DGX and HGX will coexist with MGX. NVIDIA’s OEM partners face new challenges in designing servers for AI data centers, which can slow down design and deployment. The new MGX reference architecture is intended to accelerate this process, offering more than 100 reference designs.

The MGX system consists of modular designs that cover NVIDIA’s CPUs, GPUs, DPUs and networking systems, and also includes designs based on common x86 and Arm processors. NVIDIA offers both air-cooled and liquid-cooled design options to suit a variety of application scenarios. ASUS, Gigabyte, ASRock Rack and Pegatron will all use the MGX reference architecture to develop systems that will be available later this year and into early next year.

As for the new Spectrum-X networking platform, NVIDIA describes it as “high-performance Ethernet for artificial intelligence”. The Spectrum-X design uses NVIDIA’s 51 Tb/s Spectrum-4 400 GbE Ethernet switch and NVIDIA’s BlueField-3 DPU, paired with software and SDKs that enable developers to adapt the system to the unique needs of AI workloads.

NVIDIA claims that, unlike other Ethernet-based systems, Spectrum-X is lossless, which improves QoS and reduces latency. It also features new adaptive routing technology, which is particularly useful in multi-tenant environments.

Threza Gabriel
https://www.techgoing.com
TechGoing is a global tech media outlet that brings you the latest technology stories, covering smartphones, electric vehicles, smart home devices, gaming, wearable gadgets, and trending tech.