FP64 in Apple GPU

By: Mark Roulo (nothanks.delete@this.xxx.com),
Room: Moderated Discussions
Mark Heath (none.delete@this.none.none) on June 4, 2024 6:32 am wrote:
> none (none.delete@this.none.com) on June 3, 2024 6:22 am wrote:
> > Mark Heath (none.delete@this.none.none) on June 3, 2024 6:16 am wrote:
> > [...]
> > > Why wouldn't HPC use this GPU instead of SME, especially since
> > > the GPU has more DRAM/SLC bandwidth than a P-core cluster?
> >
> > That wouldn't work for the HPC workloads that need FP64. That's why high-end GPUs for HPC have
> > much more FP64 performance than high-end consumer GPUs.
>
> You're right that some calculations need FP64. One example I know about is the final iterations
> of the self-consistent field calculation in quantum chemistry. The initial iterations
> can be done in FP32 but the last few iterations need to be in FP64. Admittedly, quantum
> chemistry calculations are not something a lot of Apple customers would do.
>
> I don't know how much faith to have in a website named CPU Monkey, but if this website is correct,
> Apple's M3 Max GPU has 3.55 TFLOPs of FP64. Two P-clusters of SME/AMX provide 1 TFLOPs of FP64.
> It therefore appears that Apple's GPU has significantly more FP64 performance than the SME/AMX
> units. The ratio of FP32 to FP64 performance in both the GPU and SME/AMX units is 4. This suggests
> Apple is using four FP32 multipliers to make one FP64 multiplier.
>
> https://www.cpu-monkey.com/en/igpu-apple_m3_max_40_core
>

Googling suggests that Apple GPUs don't have FP64 hardware support at all. I see at least one GitHub project that provides FP64 emulation.
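To illustrate what emulation on FP32-only hardware can look like: one well-known technique is "float-float" (double-single) arithmetic, where each value is stored as an unevaluated sum of two FP32 numbers. I have not checked whether the GitHub project works this way, so treat this as a rough sketch of the general idea, not a description of that project. The helper names (f32, two_sum, ff_add, to_ff, from_ff) are made up for this example, and FP32 is simulated on the CPU by rounding through struct. Note that float-float gives roughly 48 significand bits with the FP32 exponent range, so it is not true IEEE FP64.

```python
import struct

def f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Knuth's error-free addition: returns (s, err) with a + b == s + err.
    Every intermediate result is rounded to FP32."""
    s = f32(a + b)
    bb = f32(s - a)
    err = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, err

def ff_add(x, y):
    """Add two float-float numbers, each given as a (hi, lo) pair of FP32 values."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))   # fold in the low-order parts
    hi = f32(s + e)
    lo = f32(e - f32(hi - s))       # renormalize so that |lo| is small relative to hi
    return hi, lo

def to_ff(x):
    """Split an FP64 value into a (hi, lo) pair of FP32 values."""
    hi = f32(x)
    lo = f32(x - hi)
    return hi, lo

def from_ff(p):
    """Collapse a (hi, lo) pair back to a single FP64 value for inspection."""
    return p[0] + p[1]

if __name__ == "__main__":
    # Plain FP32 cannot represent 1 + 1e-8; the small term is rounded away.
    print("FP32:       ", f32(1.0 + 1e-8))
    # The float-float pair keeps the small term in its low word.
    print("float-float:", from_ff(ff_add(to_ff(1.0), to_ff(1e-8))))
```

Each float-float add costs on the order of ten or more FP32 operations, so throughput obtained this way would be far below a 1:4 FP64:FP32 ratio.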

I can't find any definitive answer from an Apple site. There might be an answer somewhere here: https://rosenzweig.io/
