By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), July 17, 2015 4:29 pm
Room: Moderated Discussions
Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on July 17, 2015 1:57 pm wrote:
>
> Has the ancient trick of prepadding with 2 bytes been forgotten? That allows for 8 byte alignment of
> both header and payload. Prepadding with 10 bytes would allow 16-byte alignment of the payload.
It doesn't necessarily work well in the general case. It breaks in the face of various encapsulation things. It can also be absolutely horrible when you have other (bigger) alignment concerns, like making sure PCI DMA writes from the network card are aligned to 16-byte boundaries or whatever. Some network cards can't even do unaligned packet writes etc etc.
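To make the encapsulation point concrete, here is a rough sketch of the offset arithmetic. It assumes the receive buffer itself starts on a suitably aligned boundary. The helper names and the `prepad` / `encap_len` parameters are made up for illustration and are not from any real driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Standard Ethernet header: 6-byte dst + 6-byte src + 2-byte ethertype. */
#define ETH_HLEN 14

/* Offset of the IP header from the start of the receive buffer.
 * `prepad` is the padding placed in front of the frame and `encap_len`
 * is the size of any extra encapsulation header; both are hypothetical
 * parameters used only for this illustration. */
static size_t ip_header_offset(size_t prepad, size_t encap_len)
{
    return prepad + ETH_HLEN + encap_len;
}

/* True if `offset` is a multiple of `alignment`. */
static bool is_aligned(size_t offset, size_t alignment)
{
    return offset % alignment == 0;
}
```

With a 2-byte prepad and no encapsulation, the IP header lands at offset 16. Add a 4-byte 802.1Q VLAN tag and it moves to 20, which is still 4-byte aligned but no longer 8-byte aligned. A hypothetical 6-byte encapsulation header would put it at 22, which is not even 4-byte aligned.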
That is, btw, one of the best examples of why special instructions (or worse, instruction sequences) for unaligned handling are completely broken. Because it is indeed often possible to set things up so that in practice, 100% of all accesses are aligned.
But that "100%" is still not a guarantee. It's just an "under normal circumstances, we have laid out the data structures so that all the important accesses are perfectly aligned, and you always hit the good case".
.. but then you have the odd cases when that doesn't work out. Either because of some encapsulation issue or because some particular hardware had other alignment concerns, or whatever. It may never ever happen for some particular common setup (like an important benchmark), but the unaligned case still needs to be handled correctly.
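As a sketch of what "handled correctly" can look like in portable C, one common idiom is to go through `memcpy` rather than dereferencing a possibly misaligned pointer. The function name `load_u32` is just an illustrative choice, and this is not meant as a description of what any particular kernel or driver does.

```c
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value in host byte order from an address that may not be
 * 4-byte aligned. Going through memcpy avoids the undefined behavior of
 * casting `p` to uint32_t* and dereferencing it when it is misaligned. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

On hardware that handles unaligned loads well, compilers will typically turn this into a single plain load, so the aligned case costs nothing extra and the unaligned case just works.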
And trapping doesn't work either. Well, it "works". But the problem is that it's so expensive that if there are situations where the unaligned case goes from "never happens" to "when you encapsulate the ethernet packets using xyz, it happens for every packet", you go from good performance to absolutely unacceptable performance.
The whole "aligned data is the usual case by far, but we cannot guarantee it absolutely" is not that unusual in the end.
Not that dissimilar from things like denormals in FP. They also "never" happen in practice. But it's usually something you can't absolutely guarantee, and when they do happen, they often end up happening a lot (i.e. once you see one, you often see thousands), and you can't afford to suck too much at it.
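A small sketch of why denormals tend to show up in bunches: once a decaying value drops below the smallest normal float, it stays denormal for many more steps before it finally reaches zero. This assumes IEEE 754 single precision with the default rounding mode and no flush-to-zero; the function name is made up for illustration.

```c
#include <float.h>
#include <math.h>

/* Halve a float repeatedly, starting from the smallest normal value,
 * and count how many of the results are subnormal (denormal) before
 * the value underflows to zero. `volatile` forces each result to be
 * stored as a real float rather than kept in wider precision. */
static int count_subnormal_halvings(void)
{
    volatile float x = FLT_MIN;
    int count = 0;

    while (x != 0.0f) {
        x = x * 0.5f;
        float cur = x;
        if (fpclassify(cur) == FP_SUBNORMAL)
            count++;
    }
    return count;
}
```

Under those assumptions every halving after the first lands in the denormal range, one for each of the 23 mantissa bits, so a single decaying value in a loop produces a whole run of denormal operations rather than just one.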
Linus