By: anonymou5 (no.delete@this.spam.com), June 6, 2022 8:07 am
Room: Moderated Discussions
> > So yes, Huffman will help for the super popular ones, but overall it
> > really won't: the numbers simply are against you.
>
> The sentence "Huffman will help for the super popular ones, but overall it really won't" is
> a mathematical contradiction. Are you sure you understand how Huffman/Shannon coding works?
>
> Huffman/Shannon codings don't work only in case the premise "there are super popular ones" isn't fulfilled.
> In other words, they do not work if the probability distribution of the characters/symbols in the data
> stream is approximately the same (such as: 8 symbols, probability 1/8=0.125 for each symbol).
>
> -atom
It's fairly easy to play with this one, no?
Give yourself a list of ~1-2k entries (depending on your ISA of choice), dial in
a distribution (as described: a few dozen popular ones + an endless long tail),
and see what your encoding choice gives you.
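
For anyone who wants to try it, here is a rough sketch of that experiment in Python. Everything in it is an assumption picked for illustration, not a measurement of any real ISA: 1500 entries, 32 "popular" ones sharing half the probability mass, and a flat tail sharing the rest. The names `make_distribution`, `huffman_lengths` and `compare` are made up for this post. It builds Huffman code lengths and compares the average length against a fixed-width encoding and the entropy bound.

```python
import heapq
import math


def make_distribution(n_total=1500, n_popular=32, popular_mass=0.5):
    """Hypothetical distribution: a few popular entries plus a flat long tail."""
    n_tail = n_total - n_popular
    popular = [popular_mass / n_popular] * n_popular
    tail = [(1.0 - popular_mass) / n_tail] * n_tail
    return popular + tail


def huffman_lengths(weights):
    """Return the Huffman code length in bits for each symbol index."""
    # Heap items: (subtree weight, tie-breaker, symbol indices in the subtree).
    heap = [(w, i, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    tie = len(weights)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths


def compare(probs):
    """Return (average Huffman bits, fixed-width bits, entropy in bits)."""
    lengths = huffman_lengths(probs)
    average = sum(p * n for p, n in zip(probs, lengths))
    fixed = math.ceil(math.log2(len(probs)))
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return average, fixed, entropy


if __name__ == "__main__":
    average, fixed, entropy = compare(make_distribution())
    print(f"fixed width : {fixed} bits")
    print(f"Huffman avg : {average:.2f} bits")
    print(f"entropy     : {entropy:.2f} bits")
```

Change `n_popular` and `popular_mass` to dial in whatever distribution you believe in, and see how far the Huffman average moves from the fixed width.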