This year, Nvidia once again solidified its position as a leader in the AI chip industry with the unveiling of the Blackwell B200 GPU, which the company bills as the most powerful AI chip to date, promising major gains in performance and efficiency across a wide range of AI workloads.
According to Nvidia, the B200 GPU delivers 20 petaflops of FP4 compute from its 208 billion transistors. The GB200 "superchip," which pairs two B200 GPUs with a single Grace CPU, promises 30 times the performance of its predecessor, the H100, on LLM inference workloads, while Nvidia claims it cuts cost and energy consumption for those workloads by up to 25 times.
One of the key advances in the Blackwell B200 is its second-generation transformer engine, which doubles compute, bandwidth, and supported model size by representing each parameter with four bits (FP4) instead of eight (FP8). That halved precision is where the chip's 20 petaflops of FP4 throughput comes from. Nvidia has also introduced a next-generation NVLink switch that allows up to 576 GPUs to communicate with one another, providing 1.8 terabytes per second of bidirectional bandwidth per GPU.
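To see the memory side of that trade-off, here is a minimal sketch. Note the assumptions: this uses plain 4-bit integer quantization for illustration, not Nvidia's actual FP4 floating-point format, and the function names are hypothetical.

```python
# Illustrative sketch only -- not Nvidia's FP4 format. Real FP4 is a 4-bit
# floating-point type; plain 4-bit integers are used here just to show why
# halving the bits per parameter doubles how many parameters fit in the
# same memory.
import numpy as np

def quantize_to_4bit(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric quantization: map floats onto the 4-bit range [-8, 7]."""
    scale = float(np.abs(weights).max()) / 7.0
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def pack_nibbles(q: np.ndarray) -> np.ndarray:
    """Pack pairs of 4-bit values into bytes: two parameters per byte."""
    lo = (q[0::2] & 0x0F).astype(np.uint8)
    hi = (q[1::2] & 0x0F).astype(np.uint8)
    return (hi << 4) | lo

weights = np.random.randn(1024).astype(np.float32)  # 4096 bytes as fp32
q, scale = quantize_to_4bit(weights)
packed = pack_nibbles(q)
print(packed.nbytes)  # 512 bytes: half the footprint of int8, an eighth of fp32
```

The same halving applies to arithmetic and data movement, which is why a 4-bit datapath can double effective compute and bandwidth relative to an 8-bit one.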
Nvidia CEO Jensen Huang said that training a 1.8 trillion parameter model previously required 8,000 Hopper GPUs drawing 15 megawatts of power; with Blackwell, the same job needs only 2,000 GPUs consuming four megawatts. On a GPT-3 benchmark (175 billion parameters), Nvidia says the GB200 delivers seven times the performance of an H100 and four times the training speed.
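A quick back-of-the-envelope check on those quoted figures shows where the savings actually come from:

```python
# Sanity check on Nvidia's quoted numbers for training a 1.8T-parameter
# model: 8,000 Hopper GPUs at 15 MW vs. 2,000 Blackwell GPUs at 4 MW.
hopper_gpus, hopper_mw = 8_000, 15.0
blackwell_gpus, blackwell_mw = 2_000, 4.0

gpu_reduction = hopper_gpus / blackwell_gpus    # 4.0x fewer GPUs
power_reduction = hopper_mw / blackwell_mw      # 3.75x less total power

# Average power per GPU (megawatts -> watts), system overhead included
hopper_w_per_gpu = hopper_mw * 1e6 / hopper_gpus          # 1875 W
blackwell_w_per_gpu = blackwell_mw * 1e6 / blackwell_gpus  # 2000 W

print(gpu_reduction, power_reduction)          # 4.0 3.75
print(hopper_w_per_gpu, blackwell_w_per_gpu)   # 1875.0 2000.0
```

Notably, per-GPU power is roughly the same in both cases; the efficiency gain comes from each Blackwell GPU doing about four times the work, so a quarter as many are needed.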
The company anticipates high demand for the Blackwell B200, with cloud providers including Amazon, Google, Microsoft, and Oracle planning to offer NVL72 racks in their services. These racks can host massive AI models: Nvidia says a single rack can support a model of up to 27 trillion parameters.
Alongside the GPU, Nvidia also introduced the DGX SuperPOD for DGX GB200, which combines eight DGX GB200 systems for a total of 288 CPUs, 576 GPUs, and 240TB of memory, delivering 11.5 exaflops of FP4 computing power.
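The totals are consistent with the GB200 superchip's one-CPU-to-two-GPU pairing; the per-system figures below are simply inferred by dividing the quoted totals by eight:

```python
# Sanity check on the DGX SuperPOD totals: eight DGX GB200 systems.
# Per-system figures are inferred from the quoted totals, not from a
# Nvidia spec sheet.
systems = 8
total_cpus, total_gpus, total_memory_tb = 288, 576, 240

cpus_per_system = total_cpus // systems            # 36 Grace CPUs
gpus_per_system = total_gpus // systems            # 72 B200 GPUs
memory_per_system_tb = total_memory_tb / systems   # 30 TB

# Each GB200 superchip pairs 1 Grace CPU with 2 B200 GPUs,
# so the CPU:GPU ratio across the pod should be exactly 1:2.
assert total_gpus == 2 * total_cpus
print(cpus_per_system, gpus_per_system, memory_per_system_tb)  # 36 72 30.0
```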
By pushing the boundaries of AI chip technology, Nvidia is poised to maintain its position at the forefront of the industry. The Blackwell B200 GPU represents a significant leap forward in AI computing capabilities, offering unmatched performance and efficiency for demanding AI workloads.