By: Jan Wassenberg (jan.wassenberg.delete@this.gmail.com), May 22, 2022 10:46 pm
Room: Moderated Discussions
Jörn Engel (joern.delete@this.purestorage.com) on May 22, 2022 2:49 pm wrote:
> If all your data is aligned, it doesn't matter. So the question is what you do about unaligned
> data. One approach is to have two instructions, load vs. loadu or something like that. Typing
> loadu is annoying, so by default everyone uses load. If you really legitimately need to handle
> unaligned data, you have to use loadu. Basically, you're leaving warts all over the code.
>
> Another approach is to only provide load, which is implemented using the unaligned variant.
> Now all code looks the same. If people unintentionally have unaligned data, performance
> suffers. And it can take quite a while until such mistakes are noticed.
>
> So, what is better? Warts all over the code and potential runtime fault in case
> of mistakes? Or hiding the distinction between aligned and unaligned and making
> it easier to hurt performance?
Yeah, tough question. To Charlie's point, it's infeasible to distinguish "unaligned and has to be that way" from "unaligned and could maybe be refactored" via tooling.
With 5 years of hindsight, I'm still pretty happy with our decision to have both Store and StoreU - it's a very small wart, and much cheaper to put those in when writing the loop than to go over every Store() later trying to find the ones that should have been aligned. Unaligned loads mainly hurt in kernels with little compute per load, such as a dot product; unaligned stores can be costly and are best avoided.
I agree that an argument/tag indicating aligned/unaligned would be too large a wart, though.
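To make the Store vs. StoreU split concrete, here is a minimal sketch in portable C++. Everything in it is an assumption for illustration: Vec4, IsAligned and the 16-byte requirement are hypothetical stand-ins, not any real library's API, and memcpy stands in for the actual vector store instructions. The point is only that the aligned variant can check its precondition in debug builds, while the unaligned variant documents at the call site that misalignment is intended.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 4-lane float vector; stands in for a real SIMD register type.
struct Vec4 {
  float lanes[4];
};

// Assumed alignment requirement for the aligned store (16 bytes here).
constexpr std::size_t kVecAlign = sizeof(Vec4);

inline bool IsAligned(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % kVecAlign == 0;
}

// Aligned store: the caller promises alignment. Debug builds verify it, which
// is how an accidental misalignment shows up early instead of as a slowdown.
inline void Store(const Vec4& v, float* to) {
  assert(IsAligned(to) && "Store needs an aligned pointer; use StoreU");
  std::memcpy(to, v.lanes, sizeof(v.lanes));
}

// Unaligned store: any float pointer is accepted. Spelling out the U marks
// the spots where unaligned access is deliberate.
inline void StoreU(const Vec4& v, float* to) {
  std::memcpy(to, v.lanes, sizeof(v.lanes));
}

// Writes one vector at the start of an aligned buffer, then a second one
// shifted by a single float. `buf` must be aligned and hold >= 5 floats.
inline void DemoStores(float* buf) {
  const Vec4 v{{1.0f, 2.0f, 3.0f, 4.0f}};
  Store(v, buf);       // aligned by the caller's contract
  StoreU(v, buf + 1);  // 4 bytes past an aligned address: must be StoreU
}
```

Calling DemoStores on an `alignas(16) float buf[8]` goes through both paths; swapping the second call to Store would trip the assertion in a debug build rather than silently running slower.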