By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), October 1, 2021 11:01 am
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on October 1, 2021 5:04 am wrote:
>
> Like, first instruction brings destination to [coarse] aligned boundary etc...
It could be even simpler.
The first instruction might not do anything about the actual copy at all.
It might just do pure bookkeeping functionality, like "check overlapping ranges" or "check if it's large enough and mutually aligned so that you can do cacheline level optimizations". Things like setting flags to say how to copy (kind of like how x86 uses the DF flag).
That would make the first instruction fairly uninteresting, and the second instruction would be the one that does all the repeating work (with the third instruction doing what? Maybe the final tail, maybe just some internal state cleanup?)
But if the restart happens on the second instruction, I don't know where the first instruction would squirrel away any state information it has determined. It would have to be in some architected register state, so that nested memory copies work (i.e. taking a page fault, doing another memory copy in the kernel or VMM).
So I personally think it would be best to always restart at the first instruction, exactly so that you could have magic micro-architectural hidden state. If you always restart at the first instruction, you could literally have hidden "previous read" buffers for the mutually unaligned case, hidden "do it with cache transfers" flags, or direction flags etc., and never expose your random microarchitectural choices anywhere else.
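To make the hazard concrete, here is a minimal toy model in Python, not a description of any real ISA. Everything in it is an assumption made for illustration: the `ToyCore` class, its `setup` and `body` "instructions", the `hidden_backward` flag, and the `regs` dictionary standing in for architected registers. The fault handler runs its own copy on the same core and clobbers the hidden flag, so resuming at the second instruction goes wrong, while restarting at the first one recomputes the flag from architected state.

```python
class ToyCore:
    """Hypothetical core with a two-instruction copy: setup() then body()."""

    def __init__(self):
        # Microarchitectural flag: NOT part of the state saved on a fault.
        self.hidden_backward = False

    def setup(self, regs):
        # Pure bookkeeping: copy backward if dst overlaps the tail of src.
        src, dst, count = regs["src"], regs["dst"], regs["count"]
        self.hidden_backward = src < dst < src + count

    def body(self, mem, regs, max_bytes=None):
        # Copy until done, or until max_bytes (a simulated fault) runs out.
        # All progress is kept in the architected regs dictionary.
        budget = regs["count"] if max_bytes is None else max_bytes
        while regs["count"] > 0 and budget > 0:
            if self.hidden_backward:
                i = regs["count"] - 1
                mem[regs["dst"] + i] = mem[regs["src"] + i]
            else:
                mem[regs["dst"]] = mem[regs["src"]]
                regs["src"] += 1
                regs["dst"] += 1
            regs["count"] -= 1
            budget -= 1


def run(restart_at_setup):
    core = ToyCore()
    mem = bytearray(b"abcdef__WXYZ____")
    outer = {"src": 0, "dst": 2, "count": 6}  # overlapping: needs backward

    core.setup(outer)
    core.body(mem, outer, max_bytes=3)        # "page fault" after 3 bytes

    # The fault handler does its own non-overlapping copy on the same core.
    inner = {"src": 8, "dst": 12, "count": 4}
    core.setup(inner)                         # clobbers the hidden flag
    core.body(mem, inner)

    if restart_at_setup:
        core.setup(outer)                     # recompute the hidden state
    core.body(mem, outer)                     # finish the outer copy
    return bytes(mem[:8])
```

In this model `run(True)` produces the correct overlapping copy, and `run(False)` does not, because the outer copy resumes with the direction chosen for the handler's copy. Making the flag part of `regs`, and having the handler save and restore it, would also fix it, at the cost of exposing that choice architecturally.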
And so it would allow you to migrate cleanly between different microarchitectures (either big.LITTLE or just VM migration) without any odd special cases.
VM migration is an interesting case, and having it happen in the middle of a big memory copy is not at all some kind of exceptionally unusual situation. So any model that does something special in the first instruction - and then exposes restarts on the second one - sounds a bit iffy to me.
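The migration argument can be sketched the same way. This is again only a toy model under stated assumptions: `plan`, `copy_some`, the per-core `chunk` size and the `regs` dictionary are all hypothetical names, and "migration" is modeled as simply dropping the hidden plan and keeping the architected registers.

```python
def plan(chunk, regs):
    """'First instruction': derive a hidden, per-core plan from architected state only."""
    aligned = regs["src"] % chunk == 0 and regs["dst"] % chunk == 0
    return {"step": chunk if aligned else 1}


def copy_some(mem, regs, hidden, budget):
    """'Second instruction': copy up to `budget` bytes; progress goes into regs."""
    while regs["count"] > 0 and budget > 0:
        n = min(hidden["step"], regs["count"], budget)
        s, d = regs["src"], regs["dst"]
        mem[d:d + n] = mem[s:s + n]           # ranges do not overlap here
        regs["src"] += n
        regs["dst"] += n
        regs["count"] -= n
        budget -= n


def migrate_mid_copy():
    mem = bytearray(range(64)) + bytearray(64)
    regs = {"src": 0, "dst": 64, "count": 64}

    hidden = plan(16, regs)                   # core A: 16-byte chunks
    copy_some(mem, regs, hidden, budget=24)   # interrupted part-way through

    # Migration: only regs travel. The hidden plan is dropped, and the new
    # core re-runs the first instruction to build its own.
    hidden = plan(4, regs)                    # core B: 4-byte chunks
    copy_some(mem, regs, hidden, budget=64)
    return mem, regs
```

Because the restart point is the first instruction, nothing about core A's plan ever has to be understood by core B. If the restart were exposed on the second instruction, the plan would have to be part of the migrated state.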
IOW, restart at the first instruction really seems like the technically correct solution.
This is something the x86 "rep movs" got right. No odd partial instruction restart cases.
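For comparison, a rough Python model of the architectural behaviour of "rep movsb": each iteration copies one byte from [RSI] to [RDI], steps both pointers up or down according to DF, and decrements RCX, so an interrupted copy is fully described by those registers and the instruction is simply executed again. The `rep_movsb` function, the `regs` dictionary and the `max_iters` interruption knob are illustrative assumptions, and the model ignores segments, address-size variants and fast-string optimizations.

```python
def rep_movsb(mem, regs, max_iters=None):
    """Run until RCX == 0, or stop early after max_iters iterations.

    Returns True when the copy is complete, False when "interrupted".
    On interruption the registers hold all the progress, so the caller
    just executes the same instruction again.
    """
    step = -1 if regs["DF"] else 1
    done = 0
    while regs["RCX"] != 0:
        if max_iters is not None and done == max_iters:
            return False                      # RIP still points at rep movsb
        mem[regs["RDI"]] = mem[regs["RSI"]]
        regs["RSI"] += step
        regs["RDI"] += step
        regs["RCX"] -= 1
        done += 1
    return True


def demo():
    mem = bytearray(b"12345678" + bytes(8))
    regs = {"RSI": 0, "RDI": 8, "RCX": 8, "DF": 0}
    finished = rep_movsb(mem, regs, max_iters=3)   # interrupted after 3 bytes
    snapshot = dict(regs)
    while not finished:
        finished = rep_movsb(mem, regs)            # restart the same instruction
    return snapshot, regs, bytes(mem[8:])
```

There is no separate setup instruction whose results need to survive the interruption: the restart point and the start point are the same.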
Of course, "rep movs" has other problems, so..
Linus