Jensen Huang, president of Nvidia, holding the Grace Hopper Superchip CPU used for generative AI at the Supermicro keynote presentation during Computex 2023.
Walid Berrazeg | Lightrocket | Getty Images
Nvidia on Monday unveiled the H200, a graphics processing unit designed for training and deploying the kinds of artificial intelligence models that are powering the generative AI boom.
The new GPU is an upgrade from the H100, the chip OpenAI used to train its most advanced large language model, GPT-4. Big companies, startups and government agencies are all vying for a limited supply of the chips.
H100 chips cost between $25,000 and $40,000, according to an estimate from Raymond James, and hundreds of them working together are needed to create the largest models in a process called “training.”
Excitement over Nvidia’s AI GPUs has supercharged the company’s stock, which is up more than 230% so far in 2023. Nvidia expects around $16 billion of revenue for its fiscal third quarter, up 170% from a year ago.
The key improvement with the H200 is that it includes 141GB of next-generation “HBM3e” memory that will help the chip perform “inference,” or using a large model after it’s trained to generate text, images or predictions.
Nvidia said the H200 will generate output nearly twice as fast as the H100, based on a test using Meta’s Llama 2 LLM.
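For readers unfamiliar with the term, “inference” simply means running an already-trained model to produce new output. Here is a minimal sketch of what that looks like in practice, assuming the Hugging Face transformers library and access to Meta’s gated Llama 2 weights; the model name, prompt and generation length are illustrative, not the benchmark setup Nvidia used.

```python
# A minimal sketch of "inference" on a Llama 2 model (illustrative assumptions:
# Hugging Face transformers installed, access granted to Meta's gated weights).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision roughly halves GPU memory use
    device_map="auto",          # spread layers across whatever GPUs are available
)

prompt = "Explain in one sentence what GPU inference means."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=80)  # this step is "inference"
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```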
The H200, which is expected to ship in the second quarter of 2024, will compete with AMD’s MI300X GPU. AMD’s chip, like the H200, has additional memory over its predecessors, which helps fit big models on the hardware to run inference.
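To see why memory capacity is the lever here, consider the weights alone: their footprint is roughly parameter count times bytes per parameter. The back-of-the-envelope sketch below assumes a 70-billion-parameter model stored at 16-bit precision; the figures are illustrative arithmetic, not vendor benchmarks.

```python
# Back-of-the-envelope memory arithmetic for model weights (illustrative only;
# real deployments also need room for the KV cache and activations).
def weight_memory_gb(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory needed just to hold the model weights."""
    return num_params * bytes_per_param / 1e9

params = 70e9  # e.g. a Llama 2 70B-class model
print(f"{weight_memory_gb(params):.0f} GB of weights at FP16")  # ~140 GB
# That exceeds a single 80GB H100, so the weights must be split across at least
# two GPUs. A 141GB H200 can hold them on one chip, which is why more memory
# per GPU lets the same model run on fewer chips.
```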
Nvidia H200 chips in an eight-GPU Nvidia HGX system.
Nvidia
Nvidia said the H200 will be compatible with the H100, meaning that AI companies that are already training with the prior model won’t need to change their server systems or software to use the new version.
Nvidia says it will be available in four-GPU or eight-GPU server configurations in the company’s HGX complete systems, as well as in a chip called GH200, which pairs the H200 GPU with an Arm-based processor.
However, the H200 may not hold the crown of the fastest Nvidia AI chip for long.
While companies like Nvidia offer many different configurations of their chips, new semiconductors often take a big step forward about every two years, when manufacturers move to a different architecture that unlocks more significant performance gains than adding memory or other smaller optimizations. Both the H100 and H200 are based on Nvidia’s Hopper architecture.
In October, Nvidia told investors that it would move from a two-year architecture cadence to a one-year release pattern due to high demand for its GPUs. The company displayed a slide suggesting it will announce and release its B100 chip, based on the forthcoming Blackwell architecture, in 2024.