Micron DDR5 with 4th Generation AMD EPYC Processor Official Benchmarks

According to a Micron release, Micron and AMD are establishing a joint server lab in Austin to shorten server memory validation time and to test workloads together during product validation and launch. Micron’s DDR5 memory for data centers and fourth-generation AMD EPYC™ processors are now shipping and have been officially benchmarked on several common high-performance computing (HPC) workloads.

Supercomputers have long been responsible for high-performance computing workloads. These large-scale, data-intensive workloads process terabytes of data across millions of parallel operations to solve difficult real-world problems, such as weather and climate prediction; earthquake modeling; chemical, physical and biological analysis; and more.

With advances in computer architecture, such workloads are now often hosted on very large scale-out clusters of high-performance servers. These clusters need the right combination of compute, architecture, memory and storage infrastructure to meet the scalability, low-latency and performance needs of critical workloads. However, as server CPU performance and throughput continue to grow, DDR4 cannot provide enough memory bandwidth to feed the growing number of high-performance cores.

To alleviate this bottleneck, Micron DDR5 memory has been paired with fourth-generation AMD EPYC processors based on the Zen 4 server architecture, so that memory bandwidth better matches what the CPU can consume and meets the performance and efficiency demands of data-intensive workloads. IT House has learned that Micron ran industry benchmarks of the latest 96-core AMD Zen 4 CPUs with Micron DDR5 on high-performance computing workloads, with all results showing roughly a two-fold performance improvement.

STREAM (note 1) is a common benchmarking tool used to measure the sustained memory bandwidth of high-performance computing systems.
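To make the measurement concrete, here is a minimal sketch of Triad, one of the STREAM kernels, and of how a timing is converted into a bandwidth figure. It is written in pure Python for readability, so the number it prints reflects interpreter overhead rather than hardware memory bandwidth; the benchmark used in the test is the Fortran code listed below. The function names and the array size are illustrative assumptions, not part of Micron's setup.

```python
import time
from array import array


def triad(a, b, c, scalar):
    """STREAM Triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bandwidth_gbs(n, seconds, bytes_per_element=8):
    """Bandwidth as STREAM counts it for Triad: two reads and one write per element."""
    return 3 * n * bytes_per_element / seconds / 1e9


def run_triad(n=200_000, scalar=3.0):
    # Illustrative array size; a real run uses arrays far larger than the CPU caches.
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    start = time.perf_counter()
    triad(a, b, c, scalar)
    elapsed = time.perf_counter() - start
    return a, triad_bandwidth_gbs(n, elapsed)


if __name__ == "__main__":
    result, gbs = run_triad()
    print(f"a[0] = {result[0]}, apparent bandwidth = {gbs:.3f} GB/s")
```

The key point is the accounting: each Triad iteration moves 24 bytes (three 8-byte values), so bandwidth is 24 bytes times the array length divided by the elapsed time.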

The software stack used for this workload

● Alma 9 Linux kernel 5.14

● STREAM.f, the version released on November 29, 2021

Test Setup

● DDR4 system with 3rd generation 64-core 3.7 GHz AMD EPYC processor; DDR4 3200 MHz system (note 2) with fully populated RDIMM memory slots, 64GB total

● DDR5 system with 4th generation 96-core 3.7 GHz AMD EPYC processor; DDR5 4800 MHz system (note 3) with fully populated RDIMM memory slots, 64GB total

Test Results

● The DDR5 system doubles memory bandwidth per socket, reaching 378 GB/s

● The results mean customers can run larger artificial intelligence/machine learning (AI/ML) projects or take advantage of DDR5’s increased memory bandwidth for more high-performance computing.
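For context on the 378 GB/s figure, the theoretical peak bandwidth of one CPU socket can be estimated as channels × transfer rate × bytes per transfer. The sketch below assumes 8 DDR4 channels for the 3rd generation processor and 12 DDR5 channels for the 4th generation processor, each with a 64-bit (8-byte) data path. These channel counts are not stated in the article; they are taken from AMD's public specifications, and the helper name is illustrative.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth of one socket in GB/s."""
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Assumed channel counts (not given in the article).
ddr4_peak = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)
ddr5_peak = peak_bandwidth_gbs(channels=12, mega_transfers_per_s=4800)

measured_ddr5 = 378.0  # GB/s, the STREAM result reported in the article

if __name__ == "__main__":
    print(f"DDR4-3200 peak: {ddr4_peak:.1f} GB/s")
    print(f"DDR5-4800 peak: {ddr5_peak:.1f} GB/s")
    print(f"Peak ratio DDR5/DDR4: {ddr5_peak / ddr4_peak:.2f}")
    print(f"Measured DDR5 as share of peak: {measured_ddr5 / ddr5_peak:.0%}")
```

Under these assumptions the theoretical peaks are 204.8 GB/s and 460.8 GB/s, so the reported 378 GB/s would be about 82% of the DDR5 peak, and a doubling over DDR4 is plausible from the added channels and the higher transfer rate together.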

The HPC code used in this test is specific to weather and climate: the Weather Research and Forecasting (WRF) model. WRF performs well on traditional HPC architectures that offer strong floating-point performance, high memory bandwidth and low-latency networks. The test case covers the continental United States (CONUS) at a horizontal resolution of 2.5 km.

The software stack used for this workload

● Alma 9 Linux kernel 5.14

● WRF 2.3.5 & 4.3.3

● Open MPI v4.1.1

Test Setup

● DDR4 system with 3rd generation 64-core 3.7 GHz AMD EPYC processor; DDR4 3200 MHz system (note 2) with fully populated RDIMM memory slots, 64GB total

● DDR5 system with 4th generation 96-core 3.7 GHz AMD EPYC processor; DDR5 4800 MHz system (note 3) with fully populated RDIMM memory slots, 64GB total

Test Results

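● Micron DDR5 with the 4th generation AMD EPYC processor measured 1.3567 time steps/second versus 2.8533 time steps/second for the DDR4 system, as reported (note 4). Because the DDR5 figure is the lower of the two, the values appear to be seconds per time step rather than steps per second, which would make the DDR5 system about 2.1 times faster.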

Faster runs mean that larger datasets can be used or more models can be run for weather prediction, which in turn improves prediction accuracy.

OpenFOAM is an open-source software package for computational fluid dynamics (CFD) and a common HPC workload, used in a wide range of industries to help reduce development time and costs. From consumer product design to aerospace design, OpenFOAM can simulate physical interactions in diverse applications, including the turbulence around a motorcycle windshield.

In this simulation, OpenFOAM can calculate the steady airflow around the motorcycle and rider. OpenFOAM can perform load-balanced calculations based on a user-specified number of processes, thus breaking the mesh into multiple parts and assigning them to different processes to solve. Once the solution is complete, the mesh and solution are reassembled into a single domain.
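The split, solve and reassemble pattern described above can be sketched as follows. This is a toy illustration only, not OpenFOAM's actual decomposition or solver: the cells are numbered along a single axis, the "solver" is a placeholder, and all function names are hypothetical.

```python
def decompose(n_cells, n_procs):
    """Split cell indices 0..n_cells-1 into n_procs contiguous, near-equal ranges."""
    base, extra = divmod(n_cells, n_procs)
    parts, start = [], 0
    for rank in range(n_procs):
        # The first `extra` ranks take one additional cell to balance the load.
        size = base + (1 if rank < extra else 0)
        parts.append(range(start, start + size))
        start += size
    return parts


def solve_part(cells):
    # Placeholder "solver": a real CFD solver would iterate on the flow equations.
    return {cell: float(cell) ** 2 for cell in cells}


def reassemble(partial_solutions, n_cells):
    """Merge per-process results back into one field over the whole domain."""
    field = [None] * n_cells
    for partial in partial_solutions:
        for cell, value in partial.items():
            field[cell] = value
    return field


if __name__ == "__main__":
    n_cells, n_procs = 10, 4  # illustrative sizes
    parts = decompose(n_cells, n_procs)
    partials = [solve_part(p) for p in parts]  # one call per process
    field = reassemble(partials, n_cells)
    print([len(p) for p in parts])
    print(field)
```

With 10 cells and 4 processes, the parts hold 3, 3, 2 and 2 cells. In a real run each part is solved by a separate MPI process, and neighbouring parts exchange boundary values at every iteration, which is one reason memory bandwidth and core count both matter for this workload.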

The software stack used for this workload

● OpenFOAM CFD software (version 8), where the motorcycle mesh size is 600 x 240 x 240

● Alma 9 Linux kernel 5.14

● Open MPI v4.1.1

Test Setup

● DDR4 system with 3rd generation 64-core 3.7 GHz AMD EPYC processor; DDR4 3200 MHz system (note 2) with fully populated RDIMM memory slots, 64GB total

● DDR5 system with 4th generation 96-core 3.7 GHz AMD EPYC processor; DDR5 4800 MHz system (note 3) with fully populated RDIMM memory slots, 64GB total

Test Results

Test results show that Micron’s DDR5 product portfolio improves the performance of OpenFOAM (note 5), one of the top five high-performance computing software platforms, with a large open-source community. It is widely used in universities and R&D centers, and it benefits from highly parallel operation on high-bandwidth memory and high-core-count CPUs.

CP2K is an open-source quantum chemistry tool used for many applications, including simulations of solid-state and biological systems, and it provides a common framework for different modeling approaches. The test case is a density functional theory (DFT) calculation on water (H2O), with a total of 6,144 atoms (2,048 water molecules) in the simulation box.

The software stack used for this workload

● CP2K benchmark inputs H2O-DFT-LS.NREP4 and H2O-DFT-LS

● Alma 9 Linux kernel 5.14

Test Setup

● DDR4 system with 3rd generation 64-core 3.7 GHz AMD EPYC processor; DDR4 3200 MHz system (note 2) with fully populated RDIMM memory slots, 64GB total

● DDR5 system with 4th generation 96-core 3.7 GHz AMD EPYC processor; DDR5 4800 MHz system (note 3) with fully populated RDIMM memory slots, 64GB total

Test Results

Test results show that Micron’s DDR5 portfolio roughly doubles molecular dynamics performance; the release cites a factor of 2.1, while the runtimes in note 6 work out to about 2.0. The performance of such workloads also improved significantly as the number of cores and memory bandwidth increased.

Summary

These are preliminary results as only a small number of high-performance computing workloads have been tested. Combining high-performance, high-bandwidth memory with the latest server processors, such as fourth-generation AMD EPYC processors, opens up new possibilities for HPC customers.

1 STREAM benchmark configured with a vector size of 2.5 billion, running on a single AMD CPU system

2 The DDR4 system uses a 64-core AMD EPYC 7763 processor with its RDIMM memory slots fully populated with DDR4-3200 MHz memory, 64GB

3 The DDR5 system uses a 96-core AMD EPYC 9654 processor with its RDIMM memory slots fully populated with DDR5-4800 MHz memory, for a total of 64GB

4 WRF on the 12.5 km CONUS case ran for 929 seconds on the DDR4 system and 287 seconds on the DDR5 system (both including input/output time). The WRF configuration for this test was the 2.5 km CONUS case, and the reported result was 1.3567 time steps/second for DDR5, compared to 2.8533 time steps/second for DDR4.

5 For OpenFOAM, three mesh variants were run:

5a: 100 x 40 x 40 mesh, runtime of 1,144 seconds on the DDR4 system and 478 seconds on the DDR5 system

5b: 108 x 46 x 46 mesh, runtime of 1,633 seconds on the DDR4 system and 698 seconds on the DDR5 system

5c: 130 x 52 x 52 mesh, runtime of 2,522 seconds on the DDR4 system and 1,091 seconds on the DDR5 system

6 The molecular dynamics workload ran for 2,519 seconds on the DDR4 system and 1,242 seconds on the DDR5 system
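The speedups implied by the runtimes in notes 4 to 6 can be checked with a few lines of arithmetic. The sketch below simply divides the DDR4 runtime by the DDR5 runtime for each case; the dictionary labels are shorthand chosen for this example.

```python
# (DDR4 seconds, DDR5 seconds), copied from notes 4, 5 and 6.
runtimes = {
    "WRF 12.5 km CONUS": (929, 287),
    "OpenFOAM 5a": (1144, 478),
    "OpenFOAM 5b": (1633, 698),
    "OpenFOAM 5c": (2522, 1091),
    "CP2K molecular dynamics": (2519, 1242),
}


def speedup(ddr4_seconds, ddr5_seconds):
    """How many times faster the DDR5 system finished the same job."""
    return ddr4_seconds / ddr5_seconds


if __name__ == "__main__":
    for name, (ddr4, ddr5) in runtimes.items():
        print(f"{name}: {speedup(ddr4, ddr5):.2f}x")
```

Rounded to two decimals, the ratios are 3.24 for WRF at 12.5 km, 2.39, 2.34 and 2.31 for the three OpenFOAM variants, and 2.03 for the molecular dynamics run. Note that the two systems differ in core count as well as memory, so these ratios reflect the whole platform rather than DDR5 alone.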