By: Michael S (already5chosen.delete@this.yahoo.com), August 17, 2020 2:32 am
Room: Moderated Discussions
Adrian (a.delete@this.acm.org) on August 17, 2020 1:01 am wrote:
> Crystal S. Diamond (cdiamond.delete@this.diamondgirls.com) on August 16, 2020 10:20 pm wrote:
> > Here it is boys...
> >
> > https://www.hardwareluxx.de/index.php/news/hardware/prozessoren/53864-ibm-power10-bietet-30-kerne-mit-smt8-pcie-5-und-omi-memory.amp.html?__twitter_impression=true
>
> So IBM also jumps on the bandwagon of adding ISA extensions for matrix
> computations, like NVIDIA, Intel Sapphire Rapids, future ARM etc.
>
> New Processor Core Architectures in the IBM POWER10 processor with an embedded Matrix
> Math Accelerator which is extrapolated to provide 10x, 15x and 20x faster AI inference
> for FP32, BFloat16 and INT8 calculations per socket respectively than the IBM POWER9
> processor to infuse AI into business applications and drive greater insights.
>
IBM appears to do it on FP32 inputs. Is that the case for the rest of them?
A 4x4 FP32 matmul would be useful not just for deep learning.
The problem is that 2048 bits per core per cycle is not very impressive for such a massive core, which is more comparable to 2-4 cores from other manufacturers.
A64FX and Cascade Lake already do 1024 bits per core without special instructions.
I wonder why they expose their engine as an outer product.
Of course, an outer product is very generic and likely fits well in the pipeline, since its latency is comparable to "normal" FP operations, but it seems to me that it makes good FLOPS gains impossible because of the RF write bottleneck.
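
To make the write-traffic concern concrete, here is a minimal sketch of a 4x4 FP32 matmul built from rank-1 (outer-product) updates. It is only an illustration of the arithmetic, not IBM's instruction semantics: the function names and the 4x4 tile size are assumptions chosen to match the numbers in this post.

```python
FP32_BITS = 32  # width of one single-precision element
TILE = 4        # assumed tile edge: a 4x4 accumulator (hypothetical choice)


def outer_product_accumulate(acc, x, y):
    """Rank-1 update: acc[i][j] += x[i] * y[j] for every cell of the tile."""
    for i in range(TILE):
        for j in range(TILE):
            acc[i][j] += x[i] * y[j]
    return acc


def matmul_4x4_via_outer_products(a, b):
    """C = A * B as the sum over k of outer(column k of A, row k of B)."""
    acc = [[0.0] * TILE for _ in range(TILE)]
    for k in range(TILE):
        col_k = [a[i][k] for i in range(TILE)]  # column k of A
        row_k = b[k]                            # row k of B
        outer_product_accumulate(acc, col_k, row_k)
    return acc


# Data touched by ONE outer-product step:
input_bits = 2 * TILE * FP32_BITS      # two 4-element vectors are read
result_bits = TILE * TILE * FP32_BITS  # all 16 accumulator cells are updated
```

Under these assumptions each step reads 256 bits of operands but updates 512 bits of results, so the result side is twice as wide as the input side. If the accumulator had to go through the ordinary register file on every step, that is where the write pressure would come from.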