By: ⚛ (0xe2.0x9a.0x9b.delete@this.gmail.com), June 6, 2022 11:08 am
Room: Moderated Discussions
anonymou5 (no.delete@this.spam.com) on June 6, 2022 8:07 am wrote:
> > > So yes, Huffman will help for the super popular ones, but overall it
> > > really won't: the numbers simply are against you.
> >
> > The sentence "Huffman will help for the super popular ones, but overall it really won't" is
> > a mathematical contradiction. Are you sure you understand how Huffman/Shannon coding works?
> >
> > Huffman/Shannon codings don't work only in case the premise "there are super popular ones" isn't fulfilled.
> > In other words, they do not work if the probability distribution of the characters/symbols in the data
> > stream is approximately the same (such as: 8 symbols, probability 1/8=0.125 for each symbol).
> >
> > -atom
>
> It's fairly easy to play with this one, no?
No. If the goal is an instruction encoding fairly close to the best possible one, there is almost nothing to play with here.
> Give yourself a list of ~1-2k entries (depending on your ISA of choice), dial in
> a distribution (as described: a few dozen popular ones + an endless long tail),
> and see what your encoding choice gives you.
Unless you mean something like dynamic Huffman coding, you are mistaken. By dynamic I mean, for example, a coding specific to a 4 KiB page of code, where the page descriptor specifies the Huffman tree of the variable-length ISA, selected for example from a predefined list of encoding options.
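For reference, the experiment described in the quote can be sketched in a few lines of Python. This is only a rough sketch under assumptions that do not come from the thread: a Zipf-like distribution stands in for "a few dozen popular ones + a long tail", the table has 1500 entries, and all function names are made up for illustration. It models a static Huffman code over opcodes only and ignores operands, alignment, and decoder cost.

```python
import heapq
import math


def huffman_lengths(probs):
    """Return the Huffman code length in bits for each symbol."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths


def average_bits(probs, lengths):
    """Expected code length in bits per symbol."""
    return sum(p * l for p, l in zip(probs, lengths))


def entropy(probs):
    """Shannon entropy in bits per symbol (lower bound for any prefix code)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def zipf(n, s=1.0):
    """Hypothetical long-tail distribution: weight of rank k is 1 / k**s."""
    weights = [1.0 / (k ** s) for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def compare(probs):
    """Return (fixed-width bits, Huffman average bits, entropy)."""
    fixed = math.ceil(math.log2(len(probs)))
    avg = average_bits(probs, huffman_lengths(probs))
    return fixed, avg, entropy(probs)


if __name__ == "__main__":
    for name, probs in [("uniform, 8 symbols", [1 / 8] * 8),
                        ("zipf, 1500 symbols", zipf(1500))]:
        fixed, avg, h = compare(probs)
        print(f"{name}: fixed={fixed} huffman={avg:.3f} entropy={h:.3f}")
```

In the uniform 8-symbol case the Huffman average equals the fixed 3-bit width, so there is nothing to gain. For any skewed distribution the Huffman average lies between the entropy and the entropy plus one bit, so the sketch only shows how large the gap to a fixed-width encoding becomes for the distribution you dial in.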
-atom