By: , June 3, 2013 9:59 am
Room: Moderated Discussions
rwessel (robertwessel.delete@this.yahoo.com) on June 3, 2013 12:09 am wrote:
> The load/store units *are* types of execution units - they happen to handle loads and stores to memory,
> rather than, say, arithmetic operations. Instructions get to them because the dispatcher sends
> them there. The store unit gets operands just like any other execution unit does, and the load unit
> sends results to registers, just like an arithmetic unit that generates results does.
>
> Nominally, when an instruction is issued OoO, and before its operands are ready, what it's waiting on is a
> prior instruction to write its result to a register, which it can then read from the register file. The many
> rename registers on modern OoO processors are part of the mechanism to be able to execute around dependencies
> caused by register reuse. But even that isn't quite enough: the path between a result being stored into a
> register and a dependent instruction reading that result from the register is far too long in most cases,
> and most fast processors (and not just OoO ones), implement a forwarding network that allows results to be
> transmitted from one unit to another directly, and in parallel to the update of the register.
>
> In the case of non-RISC machines, the operation of the load and store units is complicated by the fact that
> there are operations that are read-modify-update in nature. Exactly how those are handled is very dependent
> on the microarchitecture, but the earlier implementations all broke operations like "add memory,1" into several
> micro-ops (perhaps a load, an add and a store); more complex designs can compress that into fewer operations.
> Instructions that get split into multiple micro-ops need special handling on the back end, as you cannot architecturally
> take an exception in the “middle” of an instruction (at least on most machines).
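To make the micro-op splitting in the quote concrete, here is a rough Python sketch of how something like "add memory,1" might be cracked into a load, an add and a store. It is a toy model, not how any particular processor does it: the names `MicroOp`, `crack_add_mem_imm`, `execute` and `PageFault`, the temporaries `t0`/`t1`, and the dict standing in for memory are all made up for illustration. The one point it tries to show is that the store is buffered and only becomes visible once every micro-op of the instruction has completed, so a fault never leaves the instruction half done.

```python
from dataclasses import dataclass


class PageFault(Exception):
    """Hypothetical fault raised when an address is not mapped."""


@dataclass(frozen=True)
class MicroOp:
    kind: str       # "load", "add" or "store"
    dst: str = ""   # temporary written by a load or add
    src: str = ""   # temporary read by an add or store
    addr: int = 0   # memory address used by a load or store
    imm: int = 0    # immediate operand used by an add


def crack_add_mem_imm(addr, imm):
    """Split 'add [addr], imm' into load, add and store micro-ops."""
    return [
        MicroOp("load", dst="t0", addr=addr),
        MicroOp("add", dst="t1", src="t0", imm=imm),
        MicroOp("store", src="t1", addr=addr),
    ]


def execute(uops, memory):
    """Run all micro-ops of one instruction; commit only if none faults."""
    temps = {}           # temporaries, not architecturally visible
    pending_stores = {}  # buffered stores, not yet visible in memory
    for uop in uops:
        if uop.kind in ("load", "store") and uop.addr not in memory:
            # Nothing has been committed yet, so the instruction
            # appears never to have started.
            raise PageFault(f"unmapped address {uop.addr:#x}")
        if uop.kind == "load":
            temps[uop.dst] = memory[uop.addr]
        elif uop.kind == "add":
            temps[uop.dst] = temps[uop.src] + uop.imm
        elif uop.kind == "store":
            pending_stores[uop.addr] = temps[uop.src]
        else:
            raise ValueError(f"unknown micro-op kind {uop.kind!r}")
    # Retirement: the whole instruction becomes visible at once.
    memory.update(pending_stores)


memory = {0x100: 41}
execute(crack_add_mem_imm(0x100, 1), memory)  # memory[0x100] is now 42
```

Running the same instruction against an unmapped address (say `0x200`) raises `PageFault` and leaves `memory` exactly as it was, which is the "no exception in the middle" property in miniature.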
Thanks for yet another great post! Sorry for replying late; I was a bit busy the last couple of days.
- So the load and store units are also execution units. Thanks, that makes sense, as it would be weird to imagine a unit that just has things "pulled" to it with no controller to send or request resources.
- Well, that makes sense, but what about cases where two processes are not dependent on each other whatsoever? What about a situation where two threads are working on the same core, and one thread needs operand [b] and the other thread needs operand [k], and they have nothing to do with each other? How does the load unit load the required operands into the registers? What's the path, is what I'm asking.
Thanks for the reply!