TPUv3 paper at CACM

By: David Kanter (dkanter.delete@this.realworldtech.com), June 21, 2020 10:32 am
Room: Moderated Discussions
CACM just published a paper by Google's TPUv3 team, which was a really interesting read:

https://cacm.acm.org/magazines/2020/7/245702-a-domain-specific-supercomputer-for-training-deep-neural-networks/fulltext

There are some additional microarchitectural details, which I appreciated. It reiterates their emphasis on system-level integration (of the router) and gives some details on the interconnect.

The performance comparisons are also fascinating. Their section on energy efficiency is hard to interpret because the energy per FLOP for BF16 is so much lower than for FP64. However, it's intriguing that a single TPU cluster on a real workload exceeds the FLOP/s of even the world's largest supercomputers on Linpack...which is a rubbish workload.

While CNNs like AlphaZero are dense compute, they still have a variety of matrix shapes, which makes them much more challenging than regular Linpack.
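
To make the matrix-shape point a bit more concrete, here is a rough sketch of how arithmetic intensity (FLOPs per byte moved) changes with the shape of a matrix multiply. The shapes below are made up for illustration and are not taken from the paper, and the byte count assumes BF16 operands with each matrix touched exactly once, which is an idealization.

```python
# Rough sketch: arithmetic intensity of C[M,N] = A[M,K] @ B[K,N].
# Assumptions (not from the paper): BF16 operands (2 bytes each),
# every matrix read or written exactly once, hypothetical shapes.

BYTES_PER_BF16 = 2


def arithmetic_intensity(m: int, k: int, n: int,
                         bytes_per_elem: int = BYTES_PER_BF16) -> float:
    """Return FLOPs per byte for an (m x k) by (k x n) matmul."""
    flops = 2 * m * k * n                      # one multiply + one add per term
    elems_moved = m * k + k * n + m * n        # read A, read B, write C
    return flops / (elems_moved * bytes_per_elem)


# Hypothetical shapes: one large square (Linpack-like) and two skinny ones.
shapes = {
    "square 4096x4096x4096": (4096, 4096, 4096),
    "skinny 8x4096x4096": (8, 4096, 4096),
    "small 64x256x64": (64, 256, 64),
}

if __name__ == "__main__":
    for name, (m, k, n) in shapes.items():
        print(f"{name}: {arithmetic_intensity(m, k, n):.1f} FLOP/byte")
```

Under these assumptions the large square case has far more FLOPs per byte than the skinny ones, so keeping the hardware busy across a mix of shapes is harder than on one big dense factorization.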

Anyway, it's a good read.

David