By: juanrga (nospam.delete@this.juanrga.com), October 30, 2015 12:43 pm
Room: Moderated Discussions
lurker (lurker9000.delete@this.realemail.mail) on October 29, 2015 3:12 pm wrote:
> Regarding Zen performance, a guy who worked for AMD (at least his linkedin profile says that) and
> who, as he claims, worked on designing L2 cache for Zen and K12 said that their focus was to be
> competitive against Intel. He no longer works there but apparently his old colleague who still
> works there said Zen chips have already been tested and so far "it has met all expectation" and
> they "haven't found any significant bottlenecks". Apparently they haven't finalized the specifications
> for the clocks and TDP, but their partners in server market are "very excited".
Bulldozer engineers also claimed that their creation was "balanced" to avoid bottlenecks due to shared resources, while providing "high throughput" efficiently... Years later AMD admitted Bulldozer was a "fiasco", and Zen is a new design.
It happens that I also know people who have friends. One of them (who calls himself a "Red Team member" in a certain forum signature, and who uploads photos of himself meeting with senior AMD people, including former chip chief John Byrne) told some of us in public that the goal of the Zen team was "faster than Skylake" and that, if the goal was missed, Zen would at least be as fast.
Then it happens that another person (with some inside info and working for the media: he was one of the first to publish leaks about Zen) told me in private that a certain Zen engineer mentioned that the target was "100% higher IPC than Piledriver".
Some inside rumors claim that Zen tests put it behind Haswell and very far from the supposed targets. Moreover, it is not evident whether those tests were made on silicon or not (one source claims that Zen has not taped out yet, another source claims otherwise).
We also know that a certain AMD engineer made some claims about Zen in his LinkedIn profile and that those claims were later deleted from his account.
I am simply pointing out that this stuff is complex. Was Zen taped out before Keller's departure or wasn't it? Did Zen meet the team's expectations or didn't it? It depends on whom you ask.
> It's not much detail, but I think if there was a problem from having
> only 2 AGUs, it would count as a significant bottleneck.
> Also this is my first post ever, I just usually lurk here and this is the first
> time I have something useful to add to the discussion. Please no bully.
The point is not how many AGUs the core has. The point is the ratio of AGUs to the rest of the execution units. For instance:
- Excavator has 2 AGU and 2 ALU.
- Power 8 has four memory units (2 load/store plus 2 loads) and 2 integer units.
- SPARC X+ has 2 memory units and 2 integer units.
- Trace-7 has 2 memory units and 2 integer units. Trace-14 has 4 memory units and 4 integer units. Trace-28 has 8 memory units and 8 integer units.
Those designs look balanced. However, a hypothetical design with 6 ALUs and 1 AGU would be rather unbalanced, with most of the ALUs being idle most of the time.
Therefore 2 AGUs can be enough for some designs but not enough for others.
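To put numbers on the list above, here is a minimal Python sketch that computes which share of the (memory + integer) units are memory units. The unit counts are the ones quoted in this post; the names `designs` and `mem_share` are my own, and FP, branch and other units are ignored.

```python
from fractions import Fraction

# (memory/AGU units, integer/ALU units) as quoted in the list above.
designs = {
    "Excavator": (2, 2),
    "Power 8": (4, 2),
    "SPARC X+": (2, 2),
    "Trace-7": (2, 2),
    "Trace-14": (4, 4),
    "Trace-28": (8, 8),
    "hypothetical 6 ALU + 1 AGU": (1, 6),
}


def mem_share(mem_units, int_units):
    """Fraction of the (memory + integer) units that are memory units."""
    return Fraction(mem_units, mem_units + int_units)


# Compute the share for every design and print it as an exact fraction.
shares = {name: mem_share(m, a) for name, (m, a) in designs.items()}
for name, share in shares.items():
    print(f"{name:28s} mem share = {share}")
```

Most of the listed designs come out at 1/2 and Power 8 at 2/3, whereas the hypothetical 6 ALU + 1 AGU design comes out at only 1/7.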
It also depends on the ISA and on the type of workload.
For instance, about a third of the total ops of mobile applications for ARM are loads/stores. However, SPECFP for x86 requires about one half. Server applications for ARM require more loads/stores than mobile applications for the same ISA. Some HPC applications require more load/store ops than SPECFP. And so on.
Cyclone is a six-issue ARM core aimed at mobile; therefore, its 2 AGU + 4 ALU configuration looks balanced, since its share of memory units matches that load/store mix:
2/(2+4) = 1/3.
On the other hand, Vulcan, which is also six-issue but targets server applications, is a 3 ALU + 3 mem core with 2 load/store units and 1 store-data unit. This ARM server design is closer to a 1:1 ratio.
Zen's 4 ALU + 2 AGU configuration looks unbalanced to me for an x86 server/HPC architecture.
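The balance argument can be sketched as a crude throughput bound. This is only a back-of-the-envelope model under my own assumptions: every op occupies exactly one ALU or one memory unit for one cycle, a fixed fraction of ops are loads/stores, and caches, dependencies, FP units, x86 load-op fusion and front-end limits are all ignored. The function name `sustained_ops_per_cycle` is hypothetical; the unit counts and the 1/3 and 1/2 mixes are the ones quoted in this post.

```python
from fractions import Fraction


def sustained_ops_per_cycle(alus, mem_units, mem_fraction):
    """Upper bound on ops/cycle for a given load/store fraction.

    With T ops per cycle, T * mem_fraction must fit in the memory units
    and T * (1 - mem_fraction) must fit in the ALUs; the tighter of the
    two constraints sets the bound.
    """
    if not 0 < mem_fraction < 1:
        raise ValueError("mem_fraction must be strictly between 0 and 1")
    alu_limit = alus / (1 - mem_fraction)
    mem_limit = mem_units / mem_fraction
    return min(alu_limit, mem_limit)


# (ALUs, memory units) as quoted in the post.
cores = {"4 ALU + 2 mem": (4, 2), "3 ALU + 3 mem": (3, 3)}
# Load/store fractions quoted in the post.
mixes = {"mobile ARM (~1/3)": Fraction(1, 3), "SPECFP x86 (~1/2)": Fraction(1, 2)}

for core, (alus, mems) in cores.items():
    for mix, f in mixes.items():
        bound = sustained_ops_per_cycle(alus, mems, f)
        print(f"{core:14s} {mix:18s} bound = {float(bound):.1f} ops/cycle")
```

Under these assumptions a 4 ALU + 2 mem core reaches 6 ops/cycle at a one-third load/store mix but only 4 at one half (the ALUs sit half idle), while a 3 ALU + 3 mem core reaches 4.5 and 6 respectively.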