Some 'reverse-engineering' of Zen

By: Poindexter (cherullo.delete@this.gmail.com), November 1, 2015 9:25 am
Room: Moderated Discussions
juanrga (nospam.delete@this.juanrga.com) on November 1, 2015 7:00 am wrote:
> Poindexter (cherullo.delete@this.gmail.com) on October 31, 2015 2:47 pm wrote:
> > > I am sure the reason for 4ALU+2AGU and 128bit FP pipes is not because it is the best possible
> > > configuration. The real reason? We only can speculate at this time. Maybe a cache bottleneck did
> > > make adding a third AGU useless, maybe the fourth ALU is here for symmetry reasons, maybe...
> >
> > I find it funny that you like to tout pipe numbers, but you never discuss
> > other architectural features that have direct impact in this discussion:
> > - MOV elimination
> > - Store-to-load forwarding
> > - Memory reordering and memory disambiguation
> > - Instruction fusing
>
> I would like to know how you think they affect the discussion. E.g., how read-modify
> or read-modify-write fusion reduce the number of loads and stores?

It may reduce the number of required uops, and thus change the optimal (whatever that means) pipeline ratio.

Now, about read-modify instructions: let's examine the throughput of GPR integer addition and multiplication, and of floating-point vector addition and multiplication.
In the integer case, Agner says that Haswell can sustain a throughput of 2 ADDs per clock, limited by the number of load AGUs, and a throughput of 1 IMUL per clock, limited by the sole integer multiplier on port 1.
Here, as far as we can tell, Zen will be able to sustain the same throughputs.

In the SSE/AVX case (128 bits), Haswell can sustain a throughput of 1 ADD(PS)(SD) per clock, limited by the sole vector adder on port 1, and 2 MUL(PS)(SD) per clock. Zen may be able to sustain a throughput of 2 per clock for both instructions.
Remember, these numbers assume that one source operand comes from memory.
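
The bounds quoted above can be reproduced with a simple bottleneck model: an instruction's sustained throughput is capped by whichever resource it needs that is scarcest. The sketch below is only an illustration. The Haswell unit counts are the ones cited in this post (plus the 4 ALUs assumed throughout this thread). The Zen entry is a guess based on the rumored 4ALU+2AGU layout; its single integer multiplier and its two vector adders and two vector multipliers are assumptions, not known facts. All names in the code are made up for the example.

```python
from fractions import Fraction

def sustained_throughput(units, demand):
    """Upper bound on instructions per clock: the scarcest resource wins.

    units  -- how many of each resource the core offers per clock
    demand -- how many of each resource one instruction consumes
    """
    return min(Fraction(units[r], n) for r, n in demand.items() if n > 0)

# Haswell figures as quoted in the post (from Agner's tables).
haswell = {"load_agu": 2, "int_alu": 4, "int_mul": 1, "vec_add": 1, "vec_mul": 2}

# ASSUMPTION: rumored Zen layout; multiplier and vector unit counts are guesses.
zen_guess = {"load_agu": 2, "int_alu": 4, "int_mul": 1, "vec_add": 2, "vec_mul": 2}

# Read-modify forms: one source operand comes from memory, so each needs a load.
instructions = {
    "ADD r, [m]": {"load_agu": 1, "int_alu": 1},
    "IMUL r, [m]": {"load_agu": 1, "int_mul": 1},
    "ADDPS x, [m]": {"load_agu": 1, "vec_add": 1},
    "MULPS x, [m]": {"load_agu": 1, "vec_mul": 1},
}

def table(core):
    """Throughput bound for every instruction on the given core description."""
    return {name: sustained_throughput(core, d) for name, d in instructions.items()}

if __name__ == "__main__":
    for label, core in (("Haswell", haswell), ("Zen (guess)", zen_guess)):
        print(label, {name: str(tp) for name, tp in table(core).items()})
```

This ignores everything except unit counts (latencies, port sharing, load/store width, front-end limits), so it only shows where the 2/1/1/2 and 2/1/2/2 figures come from, not how the real cores behave.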

As for read-modify-write, it generally doesn't apply to AVX code. For ADD(PS)(SD) and MUL(PS)(SD), only a source operand can be a memory reference, never the destination:
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-2a-manual.pdf#page=99
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-2a-manual.pdf#page=681
Thus read-modify-write is not a factor for HPC. On the integer side, Agner says that Haswell has a throughput of 1 read-modify-write ADD per clock, limited by the store data unit on port 4, and 1 IMUL per clock.
Zen may maintain a throughput of 2 ADDs per clock for brief periods, and then settle at 1 ADD per clock. It may also sustain 1 IMUL per clock.
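
One way to picture the "2 per clock for brief periods, then 1 per clock" behaviour is a store queue that fills faster than it drains. The toy simulation below is a sketch of that reading only: the issue width of 2 stands for the two rumored AGUs, the drain rate of 1 store per clock mirrors the Haswell limit quoted above, and the queue depth of 8 is a purely hypothetical number picked for illustration. None of these are known Zen parameters.

```python
def rmw_adds_per_clock(clocks, issue_width=2, store_drain=1, store_queue=8):
    """Simulate issue of ADD [m], r when stores drain slower than they issue.

    issue_width -- read-modify-write ADDs the core can start per clock
    store_drain -- stores written out to the cache per clock
    store_queue -- HYPOTHETICAL store-queue depth, for illustration only
    Returns a list with the number of ADDs issued in each clock.
    """
    pending = 0   # stores currently waiting in the queue
    issued = []
    for _ in range(clocks):
        # Oldest stores leave the queue first.
        pending -= min(pending, store_drain)
        # New ADDs are limited by issue width and by free queue entries.
        can_issue = min(issue_width, store_queue - pending)
        pending += can_issue   # every RMW ADD enqueues one store
        issued.append(can_issue)
    return issued

if __name__ == "__main__":
    print(rmw_adds_per_clock(10))
```

With these made-up parameters the model issues 2 ADDs per clock for the first 7 clocks, until the queue is full, and 1 per clock from then on. A deeper queue lengthens the burst but does not change the steady state, which is set by the drain rate.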

So Zen may be able to sustain pretty much the same throughputs on integer code, and even beat Haswell on 128-bit vector code.
Of course, there are many other factors, such as load/store unit width and latencies, but as far as issue goes, they look evenly matched for GPRs and SSE/AVX-128. For AVX-256, Haswell seems considerably stronger, and Skylake is in another category altogether.

> > We have absolutely no idea how Zen will fare in this regard. And it's not just whether
> > Zen implements those things or not, how they are implemented is also very important.
> > You also like to tout other server architectures pipeline ratios but:
> > - Never provided any connection between Haswell's increased IPC over Ivy Bridge to the third AGU.
>
> Do you still believe that IPC gains are not related to the new execution
> units and that Haswell would be better with 4ALU+2AGU?

Sure, they are related to the new units, and also to the other changes: the uop cache, the larger reorder buffer, reservation station, PRF, etc. Can you quantify the performance gain obtained strictly from the third AGU? Can you quantify the performance gain obtained strictly from the fourth ALU?

> > - You didn't realize that Jaguar can schedule HALF the number of loads that Bulldozer
> > can, and still enjoys comparable IPC (like you already stated in other forums).
>
> I already give you (in another forum) ratios of load:store for different workloads (from mobile to server)
> for both ARM and x86. Iff all the memory operations were loads then Bulldozer would have an advantage over
> Jaguar, but there are stores in code as well, and Bulldozer has only a slight advantage over Jaguar. Of course,
> having a slight advantage in this point, doesn't eliminate the other issues/bottlenecks of Bulldozer.

It goes to show that Bulldozer's 2 LD/ST units are FAR from saturated with 2 ALUs.

> > - Never discussed whether Zen implements a dedicated memory
> > scheduler, even more aggressive than Jaguar's. It
> > could schedule non-aliasing loads ahead of older, ready to issue stores, reducing the need of a third AGU.
> >
> > Regarding the FPU, you never mention that Zen's FPU doesn't share ports with the integer ALUs like
> > Haswell does. You never mention that Zen's FPU has more ports and units than Haswell's. You only seem
> > to care about maximum throughput (in the e-penis sense), which frankly, is not that interesting.
>
> In fact I did all what you say I didn't. You would read my discussion with certain AMDfanboy at forum where
> you also participate, when he wrote that Zen will be faster than Haswell because Zen has 8 integer+float
> execution pipes whereas Haswell only have 4 pipes. Not only he apparently believes that memory ports are
> unneeded and that ALUs and FP units are feed from air, but he also believes all pipes are the same and a
> 128bit FMUL pipe counts the same than a 256bit FMA pipe (that is how he says "4 better than 2").
>
> You would also read my analysis of the four half-pipes on Zen and why I expect
> performance on non-FMA code to be more close to Bulldozer than to IvyBridge.

Please, where can I find this analysis?
