Google accompanied the latest launch of its Gemini AI models with the newest version of its flagship tensor processing unit (TPU) for AI training and inference, in what appears to be an attempt to take on Nvidia's market-leading GPUs.
TPU v5p – Google's most powerful purpose-built AI accelerator – has been deployed to power the company's 'AI hypercomputer'. This is a supercomputing architecture built specifically to run AI applications, rather than the scientific workloads supercomputers usually run, for which TPUs are unsuited.
The latest version of the TPU has 8,960 chips per pod (the unit that makes up the system), versus 4,096 in v4, and it is four times more scalable in terms of total available FLOPS per pod. The new pods provide a throughput of 4,800 Gbps and carry 95GB of high-bandwidth memory (HBM), versus 32GB of HBM in TPU v4.
Nvidia H100 vs Google TPU v5p: Which is faster?
Unlike Nvidia, which sells its GPUs to other companies, Google's custom TPUs remain in-house for use across its own products and services. Google's TPUs have long been used to power its services, including Gmail, YouTube and Android, and the latest version has also been used to train Gemini.
Google's v5p TPUs are up to 2.8x faster at training large language models than TPU v4 and offer 2.1x better value for money. Although the intermediate version, TPU v5e, launched earlier this year, offers the best value for money of the three, it is only up to 1.9 times faster than TPU v4, making TPU v5p the most powerful.
It is even powerful enough to rival Nvidia's highly sought-after H100 GPU, one of the best graphics cards on the market for AI workloads. That component is four times faster at training workloads than Nvidia's A100 GPU, according to the company's own data.
Google's TPU v4, meanwhile, is estimated to be between 1.2 and 1.7 times faster than the A100, according to research published in April. Very rough calculations therefore suggest the TPU v5p is roughly 3.4 to 4.8 times faster than the A100 – putting it on par with, or ahead of, the H100 – although more detailed benchmarking is needed before any conclusions can be drawn.
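The back-of-envelope estimate above simply chains the two claimed ratios together. A minimal sketch, using only the figures quoted in this article (vendor and researcher claims, not independent benchmarks):

```python
# Rough estimate of TPU v5p training speed relative to Nvidia's A100,
# chaining the ratios quoted in the article. These are claimed figures,
# not measured benchmark results.
v5p_vs_v4 = 2.8           # TPU v5p vs TPU v4 (LLM training, per Google)
v4_vs_a100 = (1.2, 1.7)   # TPU v4 vs A100 range (per the April research)

# v5p/A100 = (v5p/v4) * (v4/A100)
low, high = (v5p_vs_v4 * r for r in v4_vs_a100)
print(f"TPU v5p vs A100: roughly {low:.1f}x to {high:.1f}x")
# → TPU v5p vs A100: roughly 3.4x to 4.8x
```

Multiplying speedups like this assumes the workloads behind each ratio are comparable, which is exactly why the article cautions that proper benchmarking is still needed.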