Empirical data on ISA design parameters

By: Adrian (a.delete@this.acm.org), August 3, 2022 2:00 am
Room: Moderated Discussions
Anon (no.delete@this.spam.com) on August 2, 2022 11:46 am wrote:
>
> Getting quality data is much harder than you think, unlike you I have seen a lot of empirical
> data, often contradictory data, the problem is that the data depends on compiler technology,
> the existing ISAs, the workload, etc, improve one side and the other becomes suboptimal.
>


The dependence on compiler technology is the greatest difficulty in obtaining quality data.

Otherwise, the static frequencies of any feature of the instructions or of their encodings would be very easy to obtain from, e.g., the entire collection of packages of a Linux distribution. The dynamic frequencies take much more time to collect and also require test input data, but compiling instrumented code makes that possible.


However, such data obtained from compiled programs is only partially useful. There is a vicious circle in which the compilers avoid using certain instructions, certain addressing modes, certain ways of generating immediate constants etc., because those happen to have a poor implementation on the current CPUs.

Then the results of collecting such empirical data show that those features are not used by compilers, which is used to justify implementing them poorly in the next CPU designs as well, or omitting them completely when compatibility does not force keeping them.

So what the current compilers generate seldom proves how the best ISA would look.

The most accurate empirical data can be generated only when carefully translating by hand a set of representative programs into optimal sequences of machine instructions and then recording the frequencies of whatever is desired.

Of course, manual translation of an adequate number of programs takes too much time, so this is also not a solution.


Therefore I believe that a compromise must be made when an accurate comparison of different ISAs is intended.

One must select a reasonably large number of short programs that are considered typical for applications, in the spirit of the Livermore Loops. These must be translated by hand for every ISA that is compared. Gathering statistics over these would provide one set of empirical data.


The second set of empirical data, from compiled programs, is easier to gather.

For static frequencies that is quite trivial to do if you use some OS distribution that compiles everything from source code, e.g. a Gentoo Linux or a FreeBSD, which you could use in a VM or emulator to compile everything for a given target ISA and with given compiler flags.

After compiling, the static frequencies from the entire set of programs can be gathered trivially for any instruction feature, e.g. to count the instructions with the same mnemonic and sort them by frequency:

objdump -d /usr/bin/* | cut -f3 | grep -oE "^[a-z]+" | sort | uniq -c | sort -n


This is just the simplest example. Classifying the immediate constants by size ranges would require more sophisticated parsing of the disassembly output, but that would still not be difficult for a Python/Perl/AWK script.
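
As a rough illustration of the immediate-constant case, here is a minimal Python sketch. It assumes x86 disassembly from GNU objdump in AT&T syntax, where immediates appear as $0x8 or $-8; any other ISA or syntax would need a different regular expression. The function names and the bucket edges (8, 16, 32, 64 bits) are my own choices for the example, not anything standard. One known weakness: objdump often prints a negative immediate as a large sign-extended hex value, and the sketch only folds that back for the 64-bit case, so 32-bit operands with negative immediates will be overcounted in the larger buckets.

```python
import re
from collections import Counter

# Assumption: AT&T-syntax x86 output, immediates written as $0x1f or $-8.
IMM_RE = re.compile(r"\$(-?(?:0x[0-9a-fA-F]+|\d+))")

def parse_int(text):
    """Parse a decimal or 0x-prefixed hex literal with an optional minus sign."""
    negative = text.startswith("-")
    digits = text.lstrip("-")
    base = 16 if digits.lower().startswith("0x") else 10
    value = int(digits, base)
    return -value if negative else value

def signed_bits(value):
    """Smallest two's-complement width, in bits, that can hold value."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1

def bucket(bits, edges=(8, 16, 32, 64)):
    """Map a bit width to a size-range label (edges are illustrative)."""
    for edge in edges:
        if bits <= edge:
            return f"<= {edge} bits"
    return f"> {edges[-1]} bits"

def classify_immediates(lines):
    """Count immediates per size range in objdump -d output lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # header or label line, not an instruction
        for text in IMM_RE.findall(fields[2]):
            value = parse_int(text)
            # Assumption: values in [2^63, 2^64) are sign-extended negatives.
            if 2**63 <= value < 2**64:
                value -= 2**64
            counts[bucket(signed_bits(value))] += 1
    return counts
```

To use it on real data, one could pass sys.stdin to classify_immediates and pipe the output of objdump -d into the script, then print the counter sorted by frequency.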

For dynamic frequencies most work would be in generating representative test input data, to be able to run the programs for gathering the statistics.


How to weigh the results from the two sets of empirical data is unlikely to have an objective answer.
