By: hobold (hobold.delete@this.vectorizer.org), October 11, 2020 12:32 pm
Room: Moderated Discussions
anon (anon.delete@this.ymous.org) on October 11, 2020 2:58 am wrote:
[...]
> Value prediction may predict previously unknown values, much like prefetchers can predict
> unknown addresses (e.g., Stride prefetcher and its counterpart Stride value predictor).
> In practice those "unseen before" values may well be addresses (or values feeding address
> generation e.g., loop induction variable) that the prefetcher would have gotten too.
>
The words "unseen before" mean something other than "unknown". Just like "prediction" does not mean the same as accurately foretelling the actual future.
[...]
> On latency, I would contend that in isolation, latency is always greater than 0 because you have
> to at least fetch the instruction. However with superscalar and OoO the latency of some instructions
> does not appear on the critical path. Moreover, for instructions that *do* appear on the critical
> path, superscalar fetch + some optimizations (move elimination, zero/one-idiom elimination and
> now memory bypassing) can cause instructions to appear as having 0 latency.
The word "appear" means something other than "be".
I am splitting hairs here. I get that I may appear hell-bent on having the final word on this. But I hate it when engineers begin to think in concepts that are rooted more in marketing than in observable phenomena. Knowing the future beforehand is possible only under very specific circumstances. Negative latency is possible only with the help of a time machine (in the macroscopic world). Thinking in incorrect terminology misleads the thinkers into making blatant mistakes.
Take my favourite example from recent history: the retire stage of a microprocessor. Its function and purpose are to create the limited appearance of strictly sequential program execution, despite the reality of dynamic out-of-order execution.
If one's mental concept is that the retire stage creates anything more than a limited appearance, then one might be tempted to implement further functionality in that same pipeline stage. If this line of thought ever leads to the implementation of security features in the retire stage ("we can remove this from the critical path!"), then one ends up with a processor that creates the limited appearance of security. A very real meltdown.
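For anyone who has repressed the memory, the problematic access pattern was roughly the following (shape only, deliberately not a working exploit; recovering the byte afterwards is the usual Flush+Reload cache timing exercise). The permission check on the first load is enforced architecturally only when that load retires; by then the dependent load has already left its mark in the cache:

```c
/* Illustrative shape of the Meltdown pattern, not a working exploit.
   The fault on the privileged load is raised only at retirement; the
   transiently executed dependent load has already touched one page of
   'probe', and that cache footprint survives the pipeline squash. */
static unsigned char probe[256 * 4096];

void transient_read(const unsigned char *privileged_ptr)
{
    unsigned char secret = *privileged_ptr;   /* faults -- but only at retire */
    (void)probe[secret * 4096];               /* executed transiently anyway  */
    /* Timing which page of 'probe' is now cached reveals 'secret'. */
}
```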