Article: AMD's Jaguar Microarchitecture
By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), April 4, 2014 1:00 pm
Room: Moderated Discussions
UnmaskedUnderflow (whoawhoawhoa.delete@this.whoa.com) on April 4, 2014 11:45 am wrote:
>
> I can't defend "denormals on by default". For legacy chips like x86/x87, that decision was made long ago
> and kept alive so 30-year old DOS/Fortran programs the govt owns will still work on upgrades with no recompile.
> Denormals in this case require a microtrap so they can 1.) respond to UNmasked specs via the 1985 IEEE-754
> requirement and 2.) to still send an FERR to the southbridge, as original FPs were not part of the main
> cpu. You'd think this and things like A20 bits would be gone by now, but they're not.
Yeah, I agree, it's largely historical baggage. That said, standards are good, and the costs they impose are better than the alternative. So you may not like all the details, but...
> FTZ/DAZ were added later. I wish they were on by default. I wish compilers forced them
> on by default. But such it is. Perhaps someone of your reputation could contact the
> ivory tower ISA greybeards and/or compilers to convince them so? I support that.
I'm too chicken to change the default in the kernel (because the few people who do actually use denormals would quite correctly blame me for breaking their code). Plus it's in a control register that user space has access to, and I know libc ends up messing with some of the other bits, so it's questionable whether the kernel defaults would even matter.
For similar reasons, I don't think libraries should change existing behavior.
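For reference, the control register in question on x86/SSE is MXCSR, and any code in the process can rewrite it. A minimal sketch, assuming the <xmmintrin.h> intrinsics (the MXCSR_FZ/MXCSR_DAZ masks are defined here just for illustration):

```c
#include <stdio.h>
#include <xmmintrin.h>   /* _mm_getcsr / _mm_setcsr (x86 with SSE assumed) */

/* Illustrative bit masks: in MXCSR, bit 15 is FZ (flush-to-zero) and
 * bit 6 is DAZ (denormals-are-zero). */
#define MXCSR_FZ  (1u << 15)
#define MXCSR_DAZ (1u << 6)

int main(void)
{
    unsigned int csr = _mm_getcsr();
    printf("MXCSR = %#x (FTZ %s, DAZ %s)\n", csr,
           (csr & MXCSR_FZ)  ? "on" : "off",
           (csr & MXCSR_DAZ) ? "on" : "off");

    /* Any library in the process can do this behind your back,
     * which is why a kernel default only goes so far. */
    _mm_setcsr(csr | MXCSR_FZ | MXCSR_DAZ);
    return 0;
}
```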
But I think it would be lovely if the compiler people decided that "if you compile a program anew, you will get FTZ/DAZ behavior". Changing behavior for existing binaries is a nightmare, but when recompiling them with new libraries and compilers, I think everybody would be ready to accept a change in behavior for something like this.
However, even the compiler people seem unwilling to go there. I think icc has an option to turn on FTZ/DAZ in the startup code, but I don't think it's on by default. And I don't think gcc even has the option.
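For what it's worth, the mechanics are trivial. Here is a sketch of what "the startup code turns it on" could look like, assuming GCC/Clang-style constructors and the SSE intrinsic headers; it's illustrative, not what any particular compiler actually ships:

```c
#include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>   /* _MM_SET_DENORMALS_ZERO_MODE */

/* Hypothetical startup hook: runs before main() and turns on FTZ/DAZ
 * in the initial thread's MXCSR.  A compiler's startup object could do
 * roughly this if the default ever changed. */
__attribute__((constructor))
static void enable_ftz_daz(void)
{
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
}
```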
I suspect one reason is that most good chips already handle denormals so well that it just doesn't matter. I haven't timed it myself, but I believe both Intel and AMD have no penalty at all, or only a slight slowdown, on their main cores these days. So the "it slows things down enormously" problem doesn't even happen on most cores; it only happens on the small ones.
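If anyone did want to time it, a hypothetical micro-benchmark along these lines would do (plain C, assuming POSIX clock_gettime(); the recurrence keeps a denormal seed pinned in the denormal range):

```c
#include <stdio.h>
#include <float.h>
#include <time.h>

/* Hypothetical micro-benchmark: x = x*0.5 + seed converges to 2*seed, so a
 * seed below FLT_MIN keeps every multiply and add on denormal operands,
 * while a normal seed gives the baseline. */
static double time_loop(float seed)
{
    struct timespec t0, t1;
    volatile float x = seed;              /* volatile: keep the loop honest */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < 100000000L; i++)
        x = x * 0.5f + seed;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
}

int main(void)
{
    printf("normal:   %.3f s\n", time_loop(1.0f));
    printf("denormal: %.3f s\n", time_loop(FLT_MIN / 4.0f));  /* below FLT_MIN */
    return 0;
}
```

Build it with something like cc -O2 and compare the two lines; the gap between them, if any, is the whole story.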
Which makes a really rare problem something that most compiler developers will never even see on the machines they use day to day. I can understand why they might not consider it a big deal. "Here's a nickel, kid, buy yourself a real computer".
> For those who need denormals (HPC)...IBM makes several chips that support them in-line. The people who
> really need them (physicists, mathematicians, wave equations) know already and choose accordingly.
I'm pretty certain that at least modern Intel Core CPUs handle denormal arithmetic with no penalty at all (not even a couple of extra cycles for fixing things up). So this is not "IBM makes these chips". This is literally "most desktop CPUs you buy today have no denormal penalty".
> For a small chip like that discussed here, denormal support is crippled on purpose. It,
> like any chip design, is a tradeoff. Small chips adding big hardware to denormal support
> is the wrong choice, now and always. Moore's Law -- fast denormals on phones?
I suspect that the fact that the big chips handle them so well is going to make it even less likely that software will change.
The good news is that denormals *are* rare. The fact that geekbench happened to hit them was just a fiasco, but that benchmark has other problems, so...
That said, they can and do happen. I remember some (helicopter flying?) game in the late 90's (maybe early 2000's) that caused tons and tons of denormal exceptions, and as a result we did really badly on it at Transmeta, because we had assumed they never happen in real code.
It probably wasn't intentional there either; it was just some random case where something got sufficiently close to zero by happenstance, and the developers had never noticed or cared.
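It's easy to see how that happens by accident: anything with an exponential decay spends a long stretch below FLT_MIN on its way toward zero. A toy single-precision illustration (the 0.1% damping factor is made up, but the shape is generic):

```c
#include <stdio.h>
#include <float.h>

/* Toy illustration: a value damped by 0.1% per step (think a fading echo or
 * a damped spring in a game loop) drops below FLT_MIN after roughly 87,000
 * steps and then keeps producing denormal results for tens of thousands more. */
int main(void)
{
    float x = 1.0f;
    long denormal_steps = 0;

    for (long step = 0; step < 150000; step++) {
        x *= 0.999f;
        if (x != 0.0f && x < FLT_MIN)    /* result is a positive denormal */
            denormal_steps++;
    }
    printf("%ld of 150000 steps produced denormal results (final x = %g)\n",
           denormal_steps, (double)x);
    return 0;
}
```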
People used to think they wouldn't need floating point at *all* in the embedded space. I think your cellphones will end up handling denorms just fine some day too.
Linus