By: Felid (Felid.delete@this.mailinator.com), November 15, 2012 1:40 pm
Room: Moderated Discussions
Paul A. Clayton (paaronclayton.delete@this.gmail.com) on November 15, 2012 7:15 am wrote:
> Felid (Felid.delete@this.mailinator.com) on November 15, 2012 12:49 am wrote:
> [snip]
> If the fused operations are adjacent, there can be no additional uses of the mov's destination (given typical
> destructive [source and destination the same] x86 instructions). This, of course, means that preserving a
> register value by moving it to another location that is used much later would not allow this optimization,
> but that practice has been suboptimal for a while since one generally wants to exploit result forwarding.
>
> Even with move elimination in the renamer (which allows more cases to be handled), doing limited
> move elimination in the decoder can be beneficial (especially if one has a µop cache).
This requires double effort: more macrofusion rules and logic in the (pre)decoders, plus more renaming logic in the allocator. That is exactly how it is done in BD (mov+op fusion was added in 45 nm K10, and 0-clock moves in BD), but apparently not in IB.
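To make the decoder-side case concrete, here is a minimal sketch of adjacent mov+op fusion as described in the quoted text. It is only a toy model, not how any real decoder is built: the `Insn` and `Uop` types and the `fuse_mov_op` function are hypothetical names made up for illustration, and flags, memory operands, and operand sizes are ignored.

```python
from typing import List, NamedTuple, Optional


class Insn(NamedTuple):
    """Toy two-operand x86-style instruction (hypothetical model)."""
    op: str   # "mov" or a destructive ALU op such as "add"
    dst: str  # destination; for ALU ops it is also the first source
    src: str  # second source (or the only source, for mov)


class Uop(NamedTuple):
    """Toy three-operand internal operation (hypothetical model)."""
    op: str
    dst: str
    src1: str
    src2: Optional[str]


def fuse_mov_op(insns: List[Insn]) -> List[Uop]:
    """Fuse 'mov d, s' followed immediately by 'op d, t' into 'op d <- s, t'.

    Only adjacent pairs are considered, which is the assumption that makes
    the fusion safe: nothing can read the mov's destination in between.
    """
    uops: List[Uop] = []
    i = 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if (cur.op == "mov" and nxt is not None
                and nxt.op != "mov" and nxt.dst == cur.dst):
            # If the op also reads the mov's destination as its second
            # source, that value equals the mov's source at this point.
            src2 = cur.src if nxt.src == cur.dst else nxt.src
            uops.append(Uop(nxt.op, nxt.dst, cur.src, src2))
            i += 2
        elif cur.op == "mov":
            # Unfused move: left for the renamer (or an ALU) to handle.
            uops.append(Uop("mov", cur.dst, cur.src, None))
            i += 1
        else:
            # Plain destructive op: destination doubles as first source.
            uops.append(Uop(cur.op, cur.dst, cur.dst, cur.src))
            i += 1
    return uops


program = [
    Insn("mov", "r10", "r9"),
    Insn("add", "r10", "r11"),  # adjacent to the mov, same dst: fused
    Insn("mov", "r12", "r9"),
    Insn("sub", "r13", "r12"),  # different dst: the mov stays separate
]
fused = fuse_mov_op(program)
```

In this sketch the second mov is exactly the case that the decoder cannot handle and that would need move elimination in the renamer, hence the double effort.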