Performance of H100 SXM vs H100 PCIe

By: Matt Hughes, May 27, 2022 3:17 am
Room: Moderated Discussions
The peak performance specs of NVIDIA’s H100 SXM are only 25% higher than those of NVIDIA’s H100 PCIe, even though the SXM version uses 700W versus 350W for the PCIe version. The SXM version has faster HBM DRAM (3 TB/s vs 2 TB/s) and faster NVLink (900 GB/s vs 600 GB/s), but that isn’t enough to explain the power difference.

Why would NVIDIA double the thermal design power to get only 25% more peak performance?
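To put a number on that, here is a rough sketch of the arithmetic using only the figures quoted above. Peak performance is normalized to the PCIe card (1.00 vs 1.25), and the dictionary keys and the ratio() helper are just names made up for this illustration, not anything from the datasheet.

```python
# Figures quoted in the post. Peak performance is normalized to the PCIe
# card (an assumption made for illustration, not a datasheet value).
pcie = {"tdp_w": 350, "rel_peak": 1.00, "hbm_tb_s": 2.0, "nvlink_gb_s": 600}
sxm = {"tdp_w": 700, "rel_peak": 1.25, "hbm_tb_s": 3.0, "nvlink_gb_s": 900}


def ratio(key):
    """Return the SXM value divided by the PCIe value for one spec."""
    return sxm[key] / pcie[key]


power_ratio = ratio("tdp_w")        # SXM TDP relative to PCIe TDP
peak_ratio = ratio("rel_peak")      # SXM peak performance relative to PCIe
perf_per_watt_ratio = peak_ratio / power_ratio  # peak perf per watt, SXM vs PCIe

print(f"power ratio:           {power_ratio:.3f}")
print(f"peak perf ratio:       {peak_ratio:.3f}")
print(f"peak perf/W ratio:     {perf_per_watt_ratio:.3f}")
print(f"HBM bandwidth ratio:   {ratio('hbm_tb_s'):.3f}")
print(f"NVLink bandwidth ratio:{ratio('nvlink_gb_s'):.3f}")
```

Taken at face value, the SXM part's peak performance per watt of TDP comes out at 0.625 of the PCIe part's, while its HBM and NVLink bandwidths are each 1.5x.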

The only explanation I can think of is that the SXM version has much more than 25% more sustained performance than the PCIe version. Can anyone think of a different explanation?

What is a meaningful way for vendors to specify the performance of power-limited chips for each operation type (such as FP64 matrix operations)? Should vendors provide both a peak spec and a sustained spec or is there a better method? Under what conditions should the sustained spec be measured (HBM active, PCIe active, NVLink active, ...)? Should the sustained spec for each operation type have a minimum and a maximum value? Benchmarks are difficult to interpret and do not reveal the sustained performance of each operation type.

The H100 datasheet is here: