On Wednesday, local time, chipmakers Intel, ARM and NVIDIA jointly released a draft specification for a common interchange format for artificial intelligence, which aims to make AI processing faster and more efficient. In the draft, the three companies recommend the 8-bit FP8 floating-point format for AI systems, saying it has the potential to optimize hardware memory usage and thereby accelerate AI. The format is suitable for both AI training and inference, helping developers build faster and more efficient AI systems.
Figure 1 – Language model AI training (from: NVIDIA)
When developing an AI system, the key problem facing data scientists is not only collecting large amounts of data to train the system. They must also choose a format to represent the system's weights, the values the AI learns from its training data that determine the quality of its predictions. Weights are what allow a system like GPT-3 to generate entire paragraphs from a sentence-long prompt, and allow DALL-E 2 to generate a realistic portrait from a short text prompt.
Common formats for AI system weights are half-precision floating point (FP16), which uses 16 bits per value, and single-precision floating point (FP32), which uses 32. Half-precision and lower-precision formats reduce the memory needed to train and run an AI system, speed up computation, and can even cut bandwidth and power consumption. But with fewer bits than single precision, accuracy can suffer.
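The tradeoff is easy to see with NumPy. In this minimal sketch, a one-million-element random array is just an illustrative stand-in for a model's weight tensor:

```python
import numpy as np

# Illustrative stand-in for a model's weight tensor: one million values.
rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal(1_000_000).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

# Half precision halves the memory footprint.
print(weights_fp32.nbytes)  # 4000000 bytes
print(weights_fp16.nbytes)  # 2000000 bytes

# The cost: FP16 keeps only 10 mantissa bits (vs. 23 in FP32), so
# each stored weight carries a small rounding error.
error = np.abs(weights_fp32 - weights_fp16.astype(np.float32))
print(float(error.max()))
```

Halving the bytes per weight also halves the data moved between memory and the processor, which is where much of the bandwidth and power saving comes from.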
However, many companies in the industry, including Intel, ARM and Nvidia, see the 8-bit FP8 floating-point format as the best choice. In a blog post, Sasha Narasimhan, director of product marketing at NVIDIA, noted that FP8 offers accuracy comparable to half precision while delivering "significant" speedups in use cases such as computer vision and image-generation systems.
Figure 2 – Language model AI inference
NVIDIA, ARM and Intel say they will make the FP8 format an open standard that other companies can use without a license, and the three companies describe FP8 in detail in a white paper. Narasimhan said the specification will be submitted to the IEEE standards body to determine whether FP8 can become a common standard for the artificial intelligence industry.
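The white paper specifies two 8-bit encodings, E4M3 and E5M2. E5M2 keeps FP16's sign and five exponent bits but only the top two of its ten mantissa bits, so its effect can be simulated by rounding an FP16 bit pattern. The sketch below (hypothetical helper name; round-half-up, ignoring NaN and ties-to-even for brevity) shows how values snap to the coarser 8-bit grid:

```python
import numpy as np

def quantize_e5m2(x):
    """Simulate FP8 E5M2 rounding via the FP16 bit pattern.

    E5M2 shares FP16's sign bit and 5 exponent bits but keeps only
    the top 2 of FP16's 10 mantissa bits. Rounding here is half-up;
    NaN and overflow handling are omitted for brevity.
    """
    bits = np.asarray(x, dtype=np.float16).view(np.uint16)
    # Add half of the dropped range, then clear the low 8 mantissa bits.
    bits = (bits + np.uint16(0x0080)) & np.uint16(0xFF00)
    return bits.view(np.float16).astype(np.float32)

print(quantize_e5m2(1.0))  # 1.0  (exactly representable in E5M2)
print(quantize_e5m2(1.2))  # 1.25 (snapped to the nearest E5M2 value)
```

The coarse grid is why the format suits weights and activations, whose statistics tolerate rounding, rather than computations that need exact values.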
Narasimhan said, “We believe that a common interchange format will lead to rapid advances in hardware and software platforms, improving interoperability and thus advancing AI computing.”
Of course, the three companies' push to make FP8 a universal interchange format is also driven by their own chip work: NVIDIA's Hopper-architecture GH100 GPU already supports the FP8 format, and Intel's Gaudi2 AI training chip supports it as well.
But a common FP8 format would also benefit competitors such as SambaNova, AMD, Groq, IBM, Graphcore and Cerebras, all of which have experimented with or adopted FP8 in developing AI systems. Simon Knowles, co-founder and CTO of AI chipmaker Graphcore, wrote in a blog post in July that "the advent of 8-bit floating point numbers brings huge advantages to AI computing in terms of processing performance and efficiency." Knowles also called it "an opportunity" for the industry to settle on a "single open standard" rather than a mix of competing formats.