By: juanrga (nospam.delete@this.juanrga.com), November 2, 2015 5:06 am
Room: Moderated Discussions
Maynard Handley (name99.delete@this.name99.org) on November 1, 2015 6:33 pm wrote:
> juanrga (nospam.delete@this.juanrga.com) on November 1, 2015 8:16 am wrote:
> > Poindexter (cherullo.delete@this.gmail.com) on October 31, 2015 5:37 pm wrote:
> > > lurker (lurker9000.delete@this.realemail.mail) on October 31, 2015 4:06 pm wrote:
> > > > Poindexter (cherullo.delete@this.gmail.com) on October 31, 2015 2:47 pm wrote:
> > > > > I find it funny that you like to tout pipe numbers, but you never discuss
> > > > > other architectural features that have direct impact in this discussion:
> > > > > - MOV elimination
> > > > > - Store-to-load forwarding
> > > > > - Memory reordering and memory disambiguation
> > > > > - Instruction fusing
> > > >
> > > > I think the point is that we don't really know the details on the rest of the architecture?
> > > > So I think it's not a problem if he focuses on the details we do know.
> > >
> > > Yeah, but to assert that the architecture is unbalanced just by looking at the number of ports,
> > > doing some back-of-the-envelope math about it, without access to a single simulation, is just a
> > > *bit* too far, don't you think? Makes it look like every AMD engineer is plain incompetent.
> > >
> > > > > - Never provided any connection between Haswell's increased IPC over Ivy Bridge to the third AGU.
> > > >
> > > > I have no idea if this is the reason for it, but in Haswell SMT works a lot better than in Ivy.
> > > > Say if you were doing multi-core compilation in Ivy, the whole system would just freeze up until
> > > > the work is completed. It seems to be a bit better in Haswell so perhaps that extra AGU helps?
> > >
> > > Sure it helps. Then again Haswell also got a new issue port for ALU operations, including a branch
> > > unit which might also help during SMT compilation. Which one is the most important? I don't know!
> > >
> > > > > Regarding the FPU, you never mention that Zen's FPU doesn't share ports with the integer ALUs like
> > > > > Haswell does. You never mention that Zen's FPU has more ports and units than Haswell's. You only seem
> > > > > to care about maximum throughput (in the e-penis sense), which frankly, is not that interesting.
> > > >
> > > > > Is it really that much of an advantage that the FPU doesn't share ports with integer ALUs? In SMT perhaps?
> > > > I guess the separate ports for the ADD and MUL units are an advantage in some workloads.
> > > > And to be fair maximum throughput is important in HPC, right? I
> > > > don't think it's that important in general workloads though.
> > >
> > > Having separate ports is certainly an advantage. Just like
> > > Haswell's third AGU is. The tricky part is how you
> > > quantify this advantage, how you compare those features. With the information we have, it's impossible.
> > >
> > > Throughput is important for HPC, that's true, but juan condemned Zen in all markets.
> >
> > > Because Zen looks inferior to the competition in every metric?
> >
> > > In HPC's case, the lack of AVX-512 has vastly more influence in Zen's ability to get into those
> > > HPC supercomputers than a missing AGU (then again, it wasn't long ago that he was saying that
> > > ARMs were better than Haswells when driving HPC GPUs, same perf, lower power, but I guess he
> > > changed his mind about it). We can't really conclude anything about any market.
> > >
> > > I just did some static instruction count on LAPACK (default package on Ubuntu) - it doesn't use
> > > > packed AVX instructions and it's full of LEAs (I think it's due to Fortran's calling convention). Looks
> > > just the kind of code where Zen's FPU can issue 4 instructions per cycle while the integer side
> > > is also on full tilt. Zen may be much better than Haswell for scientific computing.
> >
> > The ratio 1:2 for mem:FP is not the ideal for scientific computing. Zen will be inferior
> > to Haswell/Broadwell, Power8, and very inferior to Skylake (Xeon), KNL Phi, XIfx,...
> >
> > > It also may be great for office work, for laptops, games, all kinds of web servers, so on and so forth.
> >
> > > 90% of people don't need to purchase Zen for desktop office work, because it will not speed up applications over
> > > existing products, and for the rest there are other products in the market as well. Laptops don't use 8-core
> > > CPUs, and when the first Zen-based APUs arrive, Kaby Lake will be on the market and Cannonlake close to release.
>
> There's a bizarre sort of unreality to this whole discussion.
> 90% of people don't give a damn what CPU they buy. They go to Best Buy and choose the laptop that seems
> right to them, which is some combination of mostly price and the rest packaging and appearance.
> Which means how Zen does is driven by its cost and its strengths/weaknesses
> (and so where Dell, HP, Lenovo etc use it in their lineups).
>
> What the average person thinks of the CPU doesn't matter in the slightest. What matters is the performance/dollar
> (so can the manufacturers move it to higher tier machines) and the battery life under Windows 10
> (so can the manufacturers boast about that or have to hide it shamefully).
>
> The ONLY point at which 2 AGUs:4 ALUs matters is if they exceed Intel single-threaded performance
> and the 1% of people who actually care about CPUs want to investigate how they did it. Since exceeding
> Intel performance seems highly unlikely, this is not a point that actually matters.
> For everyone else, this is all driven by ECONOMICS, not functional unit ratios. Assuming basic competence
> (a hell of an assumption with AMD, but still), the CPU will get into HPC (or not) and laptops (or not)
> based on its price and its energy efficiency, the two variables that are not being discussed here.
I think Zen will be expensive because (i) it has many more transistors than Excavator, (ii) the FinFET node is not cheap, and (iii) AMD needs the money.
I think AMD will be behind in efficiency, unless it drops clocks enough to compensate.