ChatGPT has taken the world by storm: technology giants have entered the race, and the large-model-based generative AI behind it has become a major direction of industry investment.
A so-called "large model" is usually trained on a large unlabeled dataset using self-supervised learning. For new application scenarios, developers then need only fine-tune the model, or perform secondary training with a small amount of data, to adapt it to the task at hand.
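The pretrain-then-fine-tune workflow described above can be illustrated with a deliberately tiny sketch. This is a toy analogy, not how an LLM is actually built: the "pretrained" feature extractor here is just the principal components of plentiful unlabeled data, and "fine-tuning" fits only a small task head on a handful of labeled examples; all names and data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- "Pretraining": self-supervised, no labels needed ----------------
# Toy stand-in for the expensive stage: learn general-purpose features
# from a large pool of unlabeled data (here, via PCA on 10,000 rows).
X_unlabeled = rng.normal(size=(10_000, 20))
_, _, Vt = np.linalg.svd(X_unlabeled, full_matrices=False)
encoder = Vt[:5].T                    # frozen "pretrained" map: 20 -> 5

# --- "Fine-tuning": cheap secondary training on a small labeled set --
X_small = rng.normal(size=(100, 20))
y_small = X_small @ rng.normal(size=20) > 0   # toy binary labels
feats = X_small @ encoder
# Only the small task head is trained; the pretrained encoder is reused.
head, *_ = np.linalg.lstsq(feats, y_small.astype(float), rcond=None)

preds = (feats @ head) > 0.5
print(f"fine-tuned head trained on {len(X_small)} labeled examples")
```

The point of the sketch is the cost asymmetry the article describes: the expensive, label-free stage happens once, while each new application only pays for the small supervised step on top.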
However, training large general-purpose models is very costly. According to the report "How much computing power does ChatGPT need", a single training run of GPT-3 is estimated to cost around $1.4 million, and for some larger LLMs (Large Language Models) the training cost ranges from $2 million to $12 million. With ChatGPT averaging 13 million unique visitors in January, serving them would require more than 30,000 Nvidia A100 GPUs, implying an initial investment of about $800 million and roughly $50,000 per day in electricity.
If ChatGPT in its current form were deployed to handle every Google search, 512,820.51 A100 HGX servers, with a total of 4,102,568 A100 GPUs, would be required; these servers and the networking alone represent over $100 billion in capital expenditure.
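The figures above can be sanity-checked with a short back-of-envelope script. All inputs are the article's own numbers; the only outside assumption is 8 GPUs per server, the standard Nvidia HGX A100 configuration (the implied total lands within rounding of the quoted 4,102,568).

```python
# Back-of-envelope check of the cost figures quoted in the text.
gpus_chatgpt = 30_000            # A100s for ~13M daily visitors
capex_chatgpt = 800_000_000      # initial investment, USD
power_cost_per_day = 50_000      # electricity, USD/day

print(f"Capex per GPU:       ${capex_chatgpt / gpus_chatgpt:,.0f}")
print(f"Electricity per GPU: ${power_cost_per_day / gpus_chatgpt:.2f}/day")

# Google-scale deployment quoted in the article.
servers_google = 512_820.51
gpus_per_server = 8              # assumption: HGX A100 node (8x A100)
print(f"Implied GPU count:   {servers_google * gpus_per_server:,.0f}")
```

The per-GPU figures (about $26,667 of capex and $1.67/day of electricity) show why the article treats inference at search-engine scale, not just training, as the dominant cost.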
According to Guosun Securities, training costs in the millions to tens of millions of dollars on the public cloud are not cheap, but they remain within an acceptable range for global technology majors such as Google.