By: --- (---.delete@this.redheron.com), June 16, 2022 11:18 pm
Room: Moderated Discussions
joema (joema4.delete.delete.delete@this.this.this.gmail.com) on June 16, 2022 9:13 pm wrote:
> Anon (lkasdfj.delete@this.fjdksalf.com) on June 16, 2022 3:06 pm wrote:
> >
> > ..For those scratching your heads, this dude's an armchair quarterback...
> > who believes the main reason Apple's chip architects put in many hardware codec
> > blocks was to accelerate single stream encode/decode
> > to a higher degree than a single block could on its own. He just *knows* it
> > must be possible to split the work into multiple threads so that
> > multiple codec "cores" can collaborate on it
>
> Actually Apple's own Compressor already does segmented rendering by
> transparently splitting a single stream into multiple segments which are encoded
> in parallel by multiple TranscoderService processes, then concatenated.
> Since ProRes is an all-intra codec there are no GOPs to complicate this.
>
> Anyone with an M1 Max or M1 Ultra system can try this by simply using
> Compressor's advanced preferences to enable additional Compressor instances.
>
> You can also transcode four input files in parallel, each handled by a
> Compressor instance. That doesn't scale either, as you increase the
> files and Compressor instances from 1 to 4.
>
> M1 Ultra with four ProRes encode/decode units is no faster than
> M1 Max (which has two) on single-file or multi-file ProRes transcoding, yet they aren't
> bottlenecked on CPU, GPU or I/O.
So ProRes is I-only. It certainly seems like it should be scalable.
(a) What are the numbers that make you believe that it's not DRAM or SSD bandwidth? I've no idea what your source looks like in terms of data rate.
(b) Have you submitted a bug report to Apple? I know bug reporting to Apple can be massively futile, but something as basic as this (trivial to reproduce, not subject to any sort of opinions about how it "should" behave) is often more successful.
I can reach out to my old friends in QuickTime, but doing so before a bug report has been submitted is likely to result in nothing but a "we can't talk about that".
A bug report may also bring to light (the point I made earlier) that people simply have not yet got around to it. There are even more basic ProRes issues in play like the exact set of input formats the HW encoders will accept (for example some HDR formats), and it may be that increasing that set is considered a more immediate priority than parallelization.