New article: AMD's Jaguar Microarchitecture

Article: AMD's Jaguar Microarchitecture
By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), April 4, 2014 1:00 pm
Room: Moderated Discussions
UnmaskedUnderflow (whoawhoawhoa.delete@this.whoa.com) on April 4, 2014 11:45 am wrote:
>
> I can't defend "denormals on by default". For legacy chips like x86/x87, that decision was made long ago
> and kept alive so 30-year old DOS/Fortran programs the govt owns will still work on upgrades with no recompile.
> Denormals in this case require a microtrap so they can 1.) respond to UNmasked specs via the 1985 IEEE-754
> requirement and 2.) to still send an FERR to the southbridge, as original FPs were not part of the main
> cpu. You'd think this and things like A20 bits would be gone by now, but they're not.

Yeah, I agree, it's largely historical baggage. That said, standards are good, and the costs they impose are better than the alternative. So you may not like all the details, but...

> FTZ/DAZ were added later. I wish they were on by default. I wish compilers forced them
> on by default. But such it is. Perhaps someone of your reputation could contact the
> ivory tower ISA greybeards and/or compilers to convince them so? I support that.

I'm too chicken to change the default in the kernel (because the few people who do actually use denorms would quite correctly blame me for breaking their code). Plus it's in a control register that user space has access to, and I know libc ends up messing with some other bits, so it's questionable whether the kernel defaults would even matter.

For similar reasons, I don't think libraries should change existing behavior.

But I think it would be lovely if the compiler people decided that "if you compile a program anew, you will get FTZ/DAZ behavior". Changing behavior for existing binaries is a nightmare, but when recompiling them with new libraries and compilers, I think everybody would be ready to accept a change in behavior for something like this.

However, even compiler people seem to not be willing to go there. I think icc has an option to turn on ftz/daz by the startup code, but I don't think it's on by default. And I don't think gcc even has the option.

I suspect one reason is that most good chips already do so well on denormals that it just doesn't matter. I haven't timed it myself, but I thought both Intel and AMD have no penalty at all or only a slight slowdown these days on their main cores. So the "it slows things down enormously" doesn't even happen on most cores, it only happens on the small ones.

Which makes a really rare problem be something that most compiler developers will never even see on the machines they use day-to-day. I can understand why they might not consider it a big deal. "Here's a nickel, kid, buy yourself a real computer".
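For readers who have not run into the two flags: the FTZ/DAZ behavior discussed above can be emulated roughly in software. The following is only a sketch in Python (which uses IEEE-754 doubles), not how the hardware or any compiler startup code does it. The helper names `is_denormal`, `flush` and `mul_ftz_daz` are made up for this illustration, and corner cases around rounding at the normal/denormal boundary may differ from real hardware.

```python
import math
import sys

# Smallest positive *normal* double; any nonzero value below it is a denormal.
SMALLEST_NORMAL = sys.float_info.min


def is_denormal(x):
    """True if x is nonzero but too small to be a normal double."""
    return x != 0.0 and abs(x) < SMALLEST_NORMAL


def flush(x):
    """Replace a denormal with a zero of the same sign; pass anything else through."""
    return math.copysign(0.0, x) if is_denormal(x) else x


def mul_ftz_daz(a, b):
    """Multiply with emulated DAZ (denormal inputs read as zero)
    and emulated FTZ (a denormal result written as zero)."""
    return flush(flush(a) * flush(b))
```

In plain Python `SMALLEST_NORMAL * 0.5` is a nonzero denormal, while `mul_ftz_daz(SMALLEST_NORMAL, 0.5)` gives zero. That difference is exactly the behavior change that code relying on denormals would notice.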

> For those who need denormals (HPC)...IBM makes several chips that support them in-line. The people who
> really need them (physicists, mathematicians, wave equations) know already and choose accordingly.

I'm pretty certain at least modern Intel Core CPU's handle denormal arithmetic with no penalty at all (not even a couple of extra cycles for fixing things up). So this is not "IBM makes these chips". This is literally "most desktop CPU's you buy today have no denormal penalty".

> For a small chip like that discussed here, denormal support is crippled on purpose. It,
> like any chip design, is a tradeoff. Small chips adding big hardware to denormal support
> is the wrong choice, now and always. Moore's Law -- fast denormals on phones?

I suspect that the fact that the big chips handle them so well is going to make it even less likely that software will change.

The good news is that denormals *are* rare. The fact that geekbench happened to hit it was just a fiasco, but that benchmark has other problems, so..

That said, they can and do happen. I remember some (helicopter flying?) game in the late 90's (maybe early 2000's) that caused tons and tons of denormal exceptions, and as a result we did really badly at it at Transmeta because we had thought that they never happen on real code.

It probably wasn't intentional there either, but just some random case where something got sufficiently close to zero just by happenstance, and the developers had never noticed nor cared.
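As an illustration of how that can happen (this is not the game's code, just a minimal Python sketch with a made-up function name), take a value that decays by a constant factor each step. It does not jump from "small" to zero: it passes through the denormal range first, and every operation on it during that stretch is one that a slow-denormal core has to trap on.

```python
import sys


def count_decay_steps(start=1.0, factor=0.5):
    """Decay `start` by `factor` until it reaches zero and count how many
    of the intermediate values were normal and how many were denormal."""
    smallest_normal = sys.float_info.min
    x = start
    normal_steps = 0
    denormal_steps = 0
    while x != 0.0:
        if abs(x) < smallest_normal:
            denormal_steps += 1
        else:
            normal_steps += 1
        x *= factor
    return normal_steps, denormal_steps


print(count_decay_steps())  # (1023, 52) with IEEE-754 doubles
```

Halving from 1.0 spends only 52 steps in the denormal range, but a decay factor closer to 1, or a value that hovers near zero, would stay there much longer.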

People used to think they wouldn't need floating point at *all* in the embedded space. I think your cellphones will end up handling denorms just fine some day too.

Linus