By: Jan Wassenberg (jan.wassenberg.delete@this.gmail.com), May 30, 2022 10:07 pm
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on May 30, 2022 8:36 am wrote:
> Why would I possibly want to sort hashes?
A simple and good way to random shuffle large items [key = hash in upper part, index in lower] :)
> Heavily optimized non-specialized library is is either an overkill or not enough.
Fair enough, I understand this perspective, but it's difficult when there are lots of sort() users.
> As said in previous post, I don't think that either of the two is good when N is so large.
> But radix sort has much worse cache behaviour than quicksort so I fully believe that it suffers
> more impact when dataset no longer fits in L2$, esp. in parallel multithreading case.
Agreed.
> Only mergesort has predictably good cache access patterns.
> But yes, it is the worst algo in terms of taking advantage
> of lucky opportunities.
Everything is a tradeoff :)
> Table 12 ? Where?
Oops, my bad, I forgot the link: https://www.vldb.org/pvldb/vol15/p259-arman.pdf
> Why would I possibly want to sort hashes?
A simple and good way to random shuffle large items [key = hash in upper part, index in lower] :)
> Heavily optimized non-specialized library is is either an overkill or not enough.
Fair enough, I understand this perspective, but it's difficult when there are lots of sort() users.
> As said in previous post, I don't think that either of the two is good when N is so large.
> But radix sort has much worse cache behaviour than quicksort so I fully believe that it suffers
> more impact when dataset no longer fits in L2$, esp. in parallel multithreading case.
Agreed.
> Only mergesort has predictably good cache access patterns.
> But yes, it is the worst algo in terms of taking advantage
> of lucky opportunities.
Everything is a tradeoff :)
> Table 12 ? Where?
Oops, my bad, I forgot the link: https://www.vldb.org/pvldb/vol15/p259-arman.pdf