By: Marcus (m.delete@this.bitsnbites.eu), August 7, 2022 9:50 pm
Room: Moderated Discussions
Björn Ragnar Björnsson (bjorn.ragnar.delete@this.gmail.com) on August 7, 2022 5:14 pm wrote:
> Marcus (m.delete@this.bitsnbites.eu) on August 7, 2022 10:16 am wrote:
> > Adrian (a.delete@this.acm.org) on August 3, 2022 2:00 am wrote:
> > > Anon (no.delete@this.spam.com) on August 2, 2022 11:46 am wrote:
> > > >
> > > > Getting quality data is much harder than you think. Unlike you, I have seen a lot of empirical
> > > > data, often contradictory. The problem is that the data depends on compiler technology,
> > > > the existing ISAs, the workload, etc.; improve one side and the other becomes suboptimal.
> > > >
> > >
> > > [snip]
> > >
> > > After compiling, the static frequencies from the entire set of programs can be gathered trivially for any
> > > instruction feature, e.g. to count the instructions with the same mnemonic and sort them by frequency:
> > >
> > > objdump -d /usr/bin/* | cut -f3 | grep -oE "^[a-z]+" | sort | uniq -c | sort -n
> > >
> > >
> > > This is just the simplest example. Classifying the immediate constants by
> > > size ranges would require more sophisticated parsing of the disassembly output,
> > > but that would still not be difficult for a Python/Perl/AWK script.
> > >
> >
> > I have found that you can cover more useful/common immediate
> > value ranges by using "clever" (non-linear) encodings.
> > E.g. in MRISC32 (a 32-bit RISC ISA with fixed instruction
> > word size) I have a 15-bit immediate field for regular
> > arithmetic and logical operations etc. I decided to use one of the 15 bits to signify HI (1) or LO (0), and
> > position the rest of the 14 bits accordingly in the actual
> > 32-bit value (constant operand). Thus the following
> > constants are possible to express with the 15-bit field of the instruction word:
> >
> > MUL R7,R1,#-42 // LO (sign extend)
> > AND R8,R2,#0x7F800000 // HI
> > XOR R9,R3,#0x000FFFFF // HI (bit-extend LSB into lower part)
> >
> > And there's a similar encoding trick for loading larger immediate values (1+20 bits immediate
> > field), that can be used for loading many binary32 FP constants too, for instance.
> >
> > My point is that just gathering statistics on immediate value ranges (e.g. a histogram
> > of log2(abs(x))) may not be optimal when designing an ISA. You really need to look into
> > different use cases (arithmetic, bitwise, memory offsets, floating-point, etc) and take
> > into account possible encoding variants for different instruction classes.
> >
> > /Marcus
> >
>
> I hadn't seen MRISC32 before. At first glance it looks pretty good, a very respectable start.
> Congrats on that, Marcus. Probably good enough for embedded and IoT stuff. Lots left to do
> to make it usable in a modern general purpose computer. You probably don't need my advice but
> I'll give it anyway. When completing the spec in the general purpose direction take special
> note of Linus' numerous and detailed notes on memory ordering here at realworldtech.
>
>
Thank you! That's good advice. I'm not sure how far I'll take MRISC32 in those directions - it's kind of a prototype for MRISC64, which should be more suitable as a GP CPU.
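
For anyone who wants to play with the HI/LO immediate scheme quoted above, here is a rough Python sketch of how such a 15-bit field could be expanded and matched. It is only an illustration under assumptions: the function names are made up, and placing the HI/LO flag in the top bit of the field is a guess for the sake of the example, not something taken from the MRISC32 spec. The HI case puts the 14 payload bits in the upper part of the 32-bit word and copies the payload's LSB into the remaining 18 low bits, as described in the quoted post.

```python
MASK32 = 0xFFFFFFFF


def decode_imm15(field):
    """Expand a 15-bit immediate field into an unsigned 32-bit constant."""
    if not 0 <= field < (1 << 15):
        raise ValueError("field must fit in 15 bits")
    hi = (field >> 14) & 1      # ASSUMPTION: flag is the top bit of the field
    payload = field & 0x3FFF    # remaining 14 bits
    if hi:
        # HI: payload goes into bits 31..18, its LSB fills bits 17..0.
        low_fill = 0x3FFFF if (payload & 1) else 0
        return (payload << 18) | low_fill
    # LO: sign-extend the 14-bit payload to 32 bits.
    if payload & 0x2000:
        payload -= 0x4000
    return payload & MASK32


def encode_imm15(value):
    """Return a 15-bit field for a 32-bit constant, or None if it does not fit."""
    value &= MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    if -0x2000 <= signed <= 0x1FFF:
        return signed & 0x3FFF              # LO form
    field = (1 << 14) | (value >> 18)       # candidate HI form
    return field if decode_imm15(field) == value else None


if __name__ == "__main__":
    for constant in (-42, 0x7F800000, 0x000FFFFF, 0x12345678):
        field = encode_imm15(constant)
        shown = "not encodable" if field is None else hex(field)
        print(hex(constant & MASK32), "->", shown)
```

The three constants from the quoted example all fit, while an arbitrary value such as 0x12345678 does not and would need the larger load-immediate form (or a second instruction).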