By: --- (---.delete@this.redheron.com), June 17, 2022 9:33 am
Room: Moderated Discussions
Sean M (sean.delete@this.none.com) on June 17, 2022 4:18 am wrote:
> --- (---.delete@this.redheron.com) on June 16, 2022 10:53 pm wrote:
> > Let me remind you once again that Apple has ambitions far grander
> > than most people here imagine. Apple was pretty much the platform
> > of choice for the STEM crowd when they were on Intel, and they
> > have every ambition to be back there (along with other goals).
>
> I’m interested in your comment above about Apple’s STEM ambitions. Paragraph
> [0048] of the patent you linked mentions support for page sizes up to 512
> MBytes. The L2 TLB in the M1 has 3072 entries, so that would cover 1.5 TBytes with 512
> MByte pages. The maximum DRAM size of the current Mac Pro is also 1.5 TBytes.
>
> Are you saying that since consumers don’t need 1.5 TBytes of DRAM, support for 512 MByte pages
> suggests Apple is interested in STEM applications? Or did you accidentally post a link to the wrong
> Apple patent (Reducing Translation Lookaside Buffer Searches for Splintered Pages)?
That's the patent I wanted.
BTW, one of the two companion patents filed at the same time strongly suggests that the L2 TLB in a "new" design (perhaps already the A15) will have 4096 entries (256 sets × 16 ways, as opposed to the current 256 sets × 12 ways).
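A quick sanity check on those numbers (my arithmetic, not anything stated in the patents): sets × ways gives the entry count, and entries × page size gives the maximum TLB reach.

    #include <stdio.h>

    int main(void) {
        /* L2 TLB geometry: entries = sets * ways */
        int m1_entries  = 256 * 12;            /* current M1: 3072 entries */
        int new_entries = 256 * 16;            /* companion patent: 4096 entries */

        long long huge = 512LL << 20;          /* 512 MByte pages */

        /* Maximum reach = entries * page size */
        printf("M1:  %d entries -> %.1f TB reach\n",
               m1_entries,  m1_entries  * (double)huge / (1LL << 40));
        printf("New: %d entries -> %.1f TB reach\n",
               new_entries, new_entries * (double)huge / (1LL << 40));
        return 0;
    }

So 3072 entries of 512 MByte pages reach exactly 1.5 TB (the current Mac Pro's DRAM ceiling, as you note), and the 4096-entry design reaches 2 TB.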
The point of interest is not just the huge pages, it's the RANGE of page sizes. Why would a company that has been happy with 16K pages for years now decide that
- it wants to support large pages
- and not just one large page size
- and (most specifically) why THOSE sizes...
What does this capability GET you...
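To make "what it gets you" concrete, here is the reach arithmetic across a range of page sizes, a minimal sketch assuming the M1's 3072-entry L2 TLB. The 32 MB and 512 MB sizes are the ARMv8 level-2 block sizes for the 16 KB and 64 KB translation granules; that they bracket the patent's range is my guess at why THOSE sizes, not something the patent spells out.

    #include <stdio.h>

    int main(void) {
        /* 16 KB is Apple's current page size; 32 MB and 512 MB are the
           ARMv8 level-2 block sizes for the 16 KB and 64 KB granules
           (my assumption about the sizes in play, for illustration). */
        long long  page[] = { 16LL << 10, 32LL << 20, 512LL << 20 };
        const char *name[] = { "16 KB", "32 MB", "512 MB" };
        const int  entries = 3072;             /* M1 L2 TLB */

        for (int i = 0; i < 3; i++) {
            double gb = entries * (double)page[i] / (1LL << 30);
            printf("%6s pages: reach = %10.2f GB\n", name[i], gb);
        }
        return 0;
    }

With 16 KB pages the whole L2 TLB covers under 50 MB, trivial next to a STEM working set; with 512 MB pages it covers 1.5 TB, i.e. every byte the largest current Mac Pro can hold, without a single TLB miss.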
> STEM applications need a lot of 64-bit floating point performance while cell phone applications don’t.
> How do you think Apple will reconcile this difference in requirements? Would it be by putting a lot of 64-bit
> floating point hardware in the AMX block and not including the AMX block in their cell phone chips? Since
> there is only one AMX block per four P-cores, how could that be competitive with dual AVX-512 units with
> fused multiply-adds in every core? Do you think Apple will make two different implementations of SVE2, with
> long hardware vectors in the M series and short hardware vectors in the A series? Does Apple currently support
> floating point fused multiply-adds in NEON, which is an optional ARM NEON extension? Other than this patent,
> have you seen other hints that suggest Apple is interested in STEM applications?
I already posted, a month or so ago, my analysis of recent AMX activity and how AMX is being modified from an outer-product/matrix-product engine into something that can perform a lot of general FP heavy lifting.
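For anyone who hasn't followed the AMX discussion: an outer-product engine accumulates Z += x·yᵀ, retiring N×N FMAs per step rather than the N of a conventional vector FMA unit, which is where the FP density comes from. A minimal scalar sketch in plain C (Apple's actual AMX ISA is undocumented; this just shows the operation, with a toy tile size):

    #include <stdio.h>

    #define N 4   /* toy tile; real AMX tiles are much larger */

    /* One outer-product-accumulate step: Z += x * y^T.
       A full matrix multiply is one such step per k index,
       so each step performs N*N FMAs. */
    static void outer_acc(double z[N][N], const double x[N], const double y[N])
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                z[i][j] += x[i] * y[j];
    }

    int main(void)
    {
        double z[N][N] = { { 0 } };
        const double x[N] = { 1, 2, 3, 4 };
        const double y[N] = { 1, 0.5, 0.25, 0.125 };

        outer_acc(z, x, y);

        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++)
                printf("%8.3f", z[i][j]);
            putchar('\n');
        }
        return 0;
    }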
I have no idea what Apple WILL do; I simply note what I consider to be strong patterns.
The TLB patents, the AMX patents, the new scalable coherency protocol, the packaging patents for substantial (and independent) scaling of cores and DRAM -- none of this work is being done to make faster phones and MacBook Airs. Even the use of Optane (or something similar) as a slow, persistent, but very dense DRAM augmentation, something I thought had ended around 2016, still seems to be ongoing, with a recent patent in that space.
I don't believe Apple wants to sell a supercomputer based on 2023 Apple Silicon (though they do want to sell a whole lot of these MacBook Pros to individuals, and Mac Studios to departments, on the strength of their being the best available "cheap" devices for Mathematica, or development, or running moderately large PDE solvers). And if you want hypervisors and Linux within which to run those apps, go right ahead.
And 2033 Apple Silicon for supercomputers??? I wouldn't rule it out.
But of course the nature of the non-visionary is to refuse to see something until it is literally before their eyes (and then to claim it was obvious all along and no big deal). Pro/Max and the Ultra were both completely unexpected, but still we get people insisting that the Ultra is "obviously" as large as Apple will go. Oh well.