By: Peter Lewis (peter.delete@this.notyahoo.com), June 2, 2022 1:22 am
Room: Moderated Discussions
> Are you saying where they are now is the limit, or will the goalposts shift again in 5 years when they take the next step wider?
I’m not saying 6 instructions per clock is the limit for x86 decode. My guess is that both x86 and ARM will increase the number of instructions decoded per clock, but the increase will be faster for ARM. Because variable-length instructions are difficult to decode in parallel, x86 will favor wider vectors than ARM, getting more work out of each decoded instruction. This is something we already see today with 512-bit vector operations (two of them per clock) on x86 and 128-bit vector operations (four of them per clock) on Apple’s M1.
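To put rough numbers on that comparison, here is a quick back-of-the-envelope sketch in Python. It only multiplies the vector widths and per-clock counts quoted above; the function name is made up for illustration, and it says nothing about clock speed, memory bandwidth, or how often real code can actually fill those vectors.

```python
def vector_bits_per_clock(width_bits: int, ops_per_clock: int) -> int:
    """Peak vector bits processed per clock (hypothetical helper)."""
    return width_bits * ops_per_clock

# Figures taken from the paragraph above.
x86_bits = vector_bits_per_clock(512, 2)  # two 512-bit ops per clock
m1_bits = vector_bits_per_clock(128, 4)   # four 128-bit ops per clock

print(f"x86: {x86_bits} bits/clock from 2 instructions")
print(f"M1:  {m1_bits} bits/clock from 4 instructions")
print(f"Bits per vector instruction: x86 {x86_bits // 2}, M1 {m1_bits // 4}")
```

So by these figures x86 moves twice the peak vector bits per clock while issuing half as many vector instructions, which is the trade-off I was getting at.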
> Apple's M1 cores have lower power consumption because they are designed with power consumption
> having primacy over performance because phones are Apple's most profitable product.
You have to admit it is impressive that a 3.2 GHz M1 core is within 13% of the single thread performance of the fastest x86 core (Intel’s 5.2 GHz Golden Cove in Alder Lake), at least according to Geekbench 5.
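For a sense of what that means per clock, here is a small sketch. It assumes "within 13%" means the M1 score is exactly 13% below Golden Cove's, and that both cores actually run at the quoted frequencies for the whole benchmark; both are simplifying assumptions, not measured values.

```python
# Assumed inputs, taken from the sentence above.
m1_clock_ghz = 3.2
golden_cove_clock_ghz = 5.2
m1_relative_score = 0.87  # assumption: exactly 13% below Golden Cove's score of 1.0

# Score per GHz for each core, then the ratio between them.
m1_per_ghz = m1_relative_score / m1_clock_ghz
golden_cove_per_ghz = 1.0 / golden_cove_clock_ghz
per_clock_ratio = m1_per_ghz / golden_cove_per_ghz

print(f"M1 work per clock relative to Golden Cove: {per_clock_ratio:.2f}x")
```

Under those assumptions the M1 core is doing roughly 1.4 times as much work per clock.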
As you mentioned, Apple’s CPU cores are optimized for cell phones. In a cell phone, the goal is to do whatever computing is required as quickly as possible and then put the CPU back to sleep, shutting off leakage power. I can’t imagine cell phone apps using a lot of CPU cores, so I would guess cell phones will favor a small number of very wide CPU cores, in addition to many special-purpose computing blocks like they have today. The Mac Pro will eventually have a lot of those very wide CPU cores. It will be interesting to see how that compares to a Xeon W.
You asked what, if anything, will eventually replace x86. Here is one possibility: millcomputing.com