Computations on Big Integers

By: none (none.delete@this.none.com), July 27, 2022 1:22 am
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on July 26, 2022 5:55 am wrote:
> Adrian (a.delete@this.acm.org) on July 26, 2022 2:53 am wrote:
> > Bill G (bill.delete@this.g.com) on July 26, 2022 1:40 am wrote:
> > > Thank you for the link to the GMPbench results. Intel Alder Lake (i5 12600K)
> > > is faster than Apple M1 on GMPbench by the percentages shown below:
> > >
> > > RSA encryption 78%
> > > Multiply 58%
> > > Divide 52%
> > > GMP-bench 45%
> > > Extended Greatest Common Divisor 23%
> > > Greatest Common Divisor 22%
> > > Pi computation 21%
> > >
> > > GMPbench is a single-threaded integer benchmark. If Alder Lake is running at its maximum turbo
> > > frequency during this benchmark, Alder Lake would have a 53% higher clock frequency than Apple
> > > M1 during this benchmark. In that case, Alder Lake and Apple M1 are doing about the same amount
> > > of work per clock for multiply and divide. Alder Lake is doing about 31% less work per clock
> > > than Apple M1 for the two greatest common divisor benchmarks and the pi computation.
> > >
> > > The web page you linked to says “Most operations in GMP up to quite large operand sizes
> > > would scale very well over multi-core systems, but poorly with SMT.” That seems to favor
> > > Xeon and EPYC processors, at least until the Mac Pro with Apple Silicon comes out.
> >
> >
> > The performance for big number computations is determined in large part
> > by the performance of the multiplication of big integer numbers.
> >
>
> I am afraid, the limiting factor is not # of multipliers, but # of dev hours spend on target-specific
> tuning. And the 2nd limiting factor is still not # of multipliers, but quality of compilers.
>
> Take, for example, division of relatively short Bigints, say 128 to 512 bits. Now take two relatively
> similar cores, Zen2 and Zen3. Zen2 has o.k. integer division. Zen3 has a bloody fast one.
> I wouldn't be surprised if that fact alone will lead to completely different optimal strategies for
> division of numbers in above-mentioned ranges.
>
> Now, what if we take something more seriously different, like Apple M1?
> Being ARM it has no "long division" at all and it's "short" 64-bit division is in the same class
> or little better as "long division" on Zen3. But it has something else - fully pipelined FP (double-precision
> division unit. Could it be used to speed up big-integer division for above-mentioned lengths? My
> answer is "I don't know before I try.". And "try" here does not mean something quick, but rather
> a week of four of hard work of writing and comparing several alternatives.

That's why we have libraries: factor out the optimization efforts :-)
Of course GMP will never have the fastest specialized code for particular purposes but the
library is carefully tuned and does a pretty good job.
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
Yitian 710 anonymous22021/10/20 08:57 PM
  Yitian 710 Adrian2021/10/21 12:20 AM
  Yitian 710 Wilco2021/10/21 03:47 AM
    Yitian 710 Rayla2021/10/21 05:52 AM
      Yitian 710 Wilco2021/10/21 11:59 AM
        Yitian 710 anon22021/10/21 05:16 PM
        Yitian 710 Wilco2022/07/16 12:21 PM
          Yitian 710 Anon2022/07/16 08:22 PM
            Yitian 710 Rayla2022/07/17 09:10 AM
              Yitian 710 Anon2022/07/17 12:04 PM
                Yitian 710 Rayla2022/07/17 12:08 PM
                  Yitian 710 Wilco2022/07/17 01:16 PM
                    Yitian 710 Anon2022/07/17 01:32 PM
                      Yitian 710 Wilco2022/07/17 02:22 PM
                        Yitian 710 Anon2022/07/17 02:47 PM
                          Yitian 710 Wilco2022/07/17 03:50 PM
                            Yitian 710 Anon2022/07/17 08:46 PM
                              Yitian 710 Wilco2022/07/18 03:01 AM
                                Yitian 710 Anon2022/07/19 11:21 AM
                                  Yitian 710 Wilco2022/07/19 06:15 PM
                                    Yitian 710 Anon2022/07/21 01:25 AM
                                      Yitian 710 none2022/07/21 01:49 AM
                                        Yitian 710 Anon2022/07/21 03:03 AM
                                          Yitian 710 none2022/07/21 04:34 AM
                                      Yitian 710 James2022/07/21 02:29 AM
                                        Yitian 710 Anon2022/07/21 03:05 AM
                                      Yitian 710 Wilco2022/07/21 04:31 AM
                                        Yitian 710 Anon2022/07/21 05:17 AM
                                          Yitian 710 Wilco2022/07/21 05:33 AM
                                            Yitian 710 Anon2022/07/21 05:50 AM
                                              Yitian 710 Wilco2022/07/21 06:07 AM
                                                Yitian 710 Anon2022/07/21 06:20 AM
                                                  Yitian 710 Wilco2022/07/21 10:02 AM
                                                    Yitian 710 Anon2022/07/21 10:22 AM
                    Yitian 710 Adrian2022/07/17 11:09 PM
                      Yitian 710 Wilco2022/07/18 01:15 AM
                        Yitian 710 Adrian2022/07/18 02:35 AM
          Yitian 710 Adrian2022/07/16 11:19 PM
            Computations on Big IntegersBill G2022/07/25 10:06 PM
              Computations on Big Integersnone2022/07/25 11:35 PM
                x86 MUL 64x64 Eric Fink2022/07/26 01:06 AM
                  x86 MUL 64x64 Adrian2022/07/26 02:27 AM
                  x86 MUL 64x64 none2022/07/26 02:38 AM
                    x86 MUL 64x64 Jörn Engel2022/07/26 10:17 AM
                      x86 MUL 64x64 Linus Torvalds2022/07/27 10:13 AM
                        x86 MUL 64x64 2022/07/28 09:40 AM
                        x86 MUL 64x64 Jörn Engel2022/07/28 10:18 AM
                          More than 3 registers per instruction-.-2022/07/28 07:01 PM
                            More than 3 registers per instructionAnon2022/07/28 10:39 PM
                            More than 3 registers per instructionJörn Engel2022/07/28 10:42 PM
                              More than 3 registers per instruction-.-2022/07/29 04:31 AM
                Computations on Big IntegersBill G2022/07/26 01:40 AM
                  Computations on Big Integersnone2022/07/26 02:17 AM
                    Computations on Big IntegersBill G2022/07/26 03:52 AM
                    Computations on Big Integers---2022/07/26 09:57 AM
                  Computations on Big IntegersAdrian2022/07/26 02:53 AM
                    Computations on Big IntegersBill G2022/07/26 03:39 AM
                      Computations on Big IntegersAdrian2022/07/26 04:21 AM
                    Computations on Big Integers in Apple AMX UnitsBill G2022/07/26 04:28 AM
                      Computations on Big Integers in Apple AMX UnitsAdrian2022/07/26 05:13 AM
                        TypoAdrian2022/07/26 05:20 AM
                          IEEE binary64 is 53 bits rather than 52. (NT)Michael S2022/07/26 05:34 AM
                            IEEE binary64 is 53 bits rather than 52.Adrian2022/07/26 07:32 AM
                              IEEE binary64 is 53 bits rather than 52.Michael S2022/07/26 10:02 AM
                                IEEE binary64 is 53 bits rather than 52.Adrian2022/07/27 06:58 AM
                                  IEEE binary64 is 53 bits rather than 52.none2022/07/27 07:14 AM
                                    IEEE binary64 is 53 bits rather than 52.Adrian2022/07/27 07:55 AM
                                      Thanks a lot for the link to the article! (NT)none2022/07/27 08:09 AM
                          TypozArchJon2022/07/26 09:51 AM
                            TypoMichael S2022/07/26 10:25 AM
                              TypozArchJon2022/07/26 11:52 AM
                                TypoMichael S2022/07/26 01:02 PM
                    Computations on Big IntegersMichael S2022/07/26 05:55 AM
                      Computations on Big IntegersAdrian2022/07/26 07:59 AM
                        IFMA and DivisionBill G2022/07/26 04:25 PM
                          IFMA and Divisionrwessel2022/07/26 08:16 PM
                          IFMA and DivisionAdrian2022/07/27 07:25 AM
                      Computations on Big Integersnone2022/07/27 01:22 AM
                    Big integer multiplication with vector IFMABill G2022/07/29 01:06 AM
                      Big integer multiplication with vector IFMAAdrian2022/07/29 01:35 AM
                        Big integer multiplication with vector IFMA-.-2022/07/29 04:32 AM
                          Big integer multiplication with vector IFMAAdrian2022/07/29 09:47 PM
                            Big integer multiplication with vector IFMAAnon2022/07/30 08:12 AM
                              Big integer multiplication with vector IFMAAdrian2022/07/30 09:27 AM
                                AVX-512 unfriendly to heter-performance coresPaul A. Clayton2022/07/31 03:20 PM
                                  AVX-512 unfriendly to heter-performance coresAnon2022/07/31 03:33 PM
                                    AVX-512 unfriendly to heter-performance coresanonymou52022/07/31 05:03 PM
                                  AVX-512 unfriendly to heter-performance coresBrett2022/07/31 07:26 PM
                                  AVX-512 unfriendly to heter-performance coresAdrian2022/08/01 01:45 AM
                                    Why can't E-cores have narrow/slow AVX-512? (NT)anonymous22022/08/01 03:37 PM
                                      Why can't E-cores have narrow/slow AVX-512?Ivan2022/08/02 12:09 AM
                                        Why can't E-cores have narrow/slow AVX-512?anonymou52022/08/02 10:13 AM
                                        Why can't E-cores have narrow/slow AVX-512?Dummond D. Slow2022/08/02 03:02 PM
                                    AVX-512 unfriendly to heter-performance coresPaul A. Clayton2022/08/02 01:19 PM
                                      AVX-512 unfriendly to heter-performance coresAnon2022/08/02 09:09 PM
                                      AVX-512 unfriendly to heter-performance coresAdrian2022/08/03 12:50 AM
                                        AVX-512 unfriendly to heter-performance coresAnon2022/08/03 09:15 AM
                                          AVX-512 unfriendly to heter-performance cores-.-2022/08/03 08:17 PM
                                            AVX-512 unfriendly to heter-performance coresAnon2022/08/03 09:02 PM
                        IFMA: empty promises from Intel as usualKent R2022/07/29 07:15 PM
                          No hype lasts foreverAnon2022/07/30 08:06 AM
                        Big integer multiplication with vector IFMAme2022/07/30 09:15 AM
                Computations on Big Integers---2022/07/26 09:48 AM
                  Computations on Big Integersnone2022/07/27 01:10 AM
                    Computations on Big Integers---2022/07/28 11:43 AM
                      Computations on Big Integers---2022/07/28 06:44 PM
              Computations on Big Integersdmcq2022/07/26 02:27 PM
                Computations on Big IntegersAdrian2022/07/27 08:15 AM
                  Computations on Big IntegersBrett2022/07/27 11:07 AM
      Yitian 710 Wes Felter2021/10/21 12:51 PM
        Yitian 710 Adrian2021/10/21 01:25 PM
    Yitian 710 Anon2021/10/21 06:08 AM
      Strange definition of the word single. (NT)anon22021/10/21 05:00 PM
        AMD Epyc uses chiplets. This is why "strange"?Mark Roulo2021/10/21 05:08 PM
          AMD Epyc uses chiplets. This is why "strange"?anon22021/10/21 05:34 PM
            Yeah. Blame spec.org, too, though!Mark Roulo2021/10/21 05:58 PM
              Yeah. Blame spec.org, too, though!anon22021/10/21 08:07 PM
                Yeah. Blame spec.org, too, though!Björn Ragnar Björnsson2022/07/17 06:23 AM
              Yeah. Blame spec.org, too, though!Rayla2022/07/17 09:13 AM
                Yeah. Blame spec.org, too, though!Anon2022/07/17 12:01 PM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell tangerine? 🍊