At its Cloud Next conference today, Google Cloud announced that A3 virtual machine instances will launch next month. Google Cloud first unveiled the A3 instance at the I/O developer conference in May this year. Its headline feature is the Nvidia H100 Tensor Core GPU, aimed at generative AI and large language model workloads.
According to earlier reports, each A3 instance combines 4th Gen Intel Xeon Scalable processors, 2 TB of DDR5-4800 memory, and eight Nvidia H100 “Hopper” GPUs, which reach a bisection bandwidth of 3.6 TBps through NVLink 4.0 and NVSwitch.
Google Cloud describes the new A3 supercomputer as “dedicated to training and serving the most demanding AI models that power today’s generative AI and large-scale language model innovations.” According to reports, the supercomputer can deliver 26 exaFlops of AI performance.
At today’s event, Google Cloud also introduced the new TPU v5e, which it calls its most cost-effective and accessible cloud TPU to date. TPUs are custom ASICs designed to accelerate AI and ML workloads.
SDxCentral reports that the TPU v5e delivers twice the training performance per dollar and 2.5 times the inference performance per dollar of its predecessor.
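To put the performance-per-dollar figures in concrete terms, the short Python sketch below converts each multiplier into the relative cost of running the same workload. It is only an illustration: the function name `relative_cost` is hypothetical, and it assumes the reported multipliers apply uniformly to the whole workload and that no other costs change.

```python
def relative_cost(perf_per_dollar_gain: float) -> float:
    """Return the cost of a fixed workload as a fraction of the predecessor's cost.

    Assumption (not from the report): the gain applies uniformly to the
    entire workload, so cost scales as the inverse of performance per dollar.
    """
    if perf_per_dollar_gain <= 0:
        raise ValueError("performance-per-dollar gain must be positive")
    return 1.0 / perf_per_dollar_gain


# Multipliers as reported for TPU v5e versus its predecessor.
training_cost = relative_cost(2.0)
inference_cost = relative_cost(2.5)

print(f"Training:  {training_cost:.0%} of the predecessor's cost")
print(f"Inference: {inference_cost:.0%} of the predecessor's cost")
```

Under these assumptions, a training job would cost about half as much and an inference workload about 40 percent as much as on the previous generation. Real savings depend on the model, utilization, and pricing.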