Article: AMD's Jaguar Microarchitecture
By: computational_scientist (brian.bj.parker99.delete@this.gmail.com), April 6, 2014 6:33 am
Room: Moderated Discussions
Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on April 6, 2014 5:29 am wrote:
> computational_scientist (brian.bj.parker99.delete@this.gmail.com) on April 5, 2014 6:50 pm wrote:
> > Linus Torvalds (torvalds.delete@this.linux-foundation.org) on April 4, 2014 4:56 pm wrote:
> > > TREZA (no.delete@this.ema.il) on April 4, 2014 3:18 pm wrote:
> > > >
> > > > Denormals are what made IEEE FP superior to VAX. It is the correct way of doing FP math!
> > >
> > > Don't get me wrong - I actually personally like denormals, and think they are
> > > definitely required for serious FP math, but the thing is, 99% of FP use isn't
> > > really serious, and most people don't really understand FP anyway.
> > >
> > > The kind of people who really understand FP and do serious math using it (to the point of
> > > caring about the order of operations, never mind denormals), those kinds of people really
> > > do know what they are doing, and sometimes (although not always) really do want denormals.
> > >
> > > But there's the other side of FP math, which really just wants good enough values quickly. There are a lot
> > > of people who are ok with single-precision and no denormals. And yes, quite often 24 bits of precision is
> > > not enough, and they decide they actually need double precision in order to avoid odd visual artifacts.
> > >
> > > Yeah, I'm talking about things like games.
> > >
> > > And the thing is, the defaults tend to be the wrong way around. People who don't know what they
> > > are doing with floating point basically *never* need denormals. You will generally hit other issues
> > > long before you hit the "oops, I lost precision because I didn't have denormals". But exactly
> > > *because* they don't know about floating point, they also don't know to disable them.
> > >
> > > So I think that from a hardware designer standpoint, it actually
> > > would make more sense if the default was "denormals
> > > are zero" (or "flush to zero"), because the people who do want denormals also know about them.
> > > So while
> > > I like denormals, the onus on disabling denormals when not needed is kind of the wrong way around.
> >
> > This is back to front reasoning: physicists, mathematicians and most computational scientists don't
> > know anything about the details of floating point and use the defaults like everyone else.
> > Even simple assumptions like x==y iff x-y==0 don't always hold in FTZ mode; default denormal mode
> > is the simplest floating point model mathematically. By contrast, people running gaming benchmarks
> > who know about these things can easily enable FTZ to speed them up for marketing purposes.
>
> You can't rely on x==y iff x-y==0 even with denormals. IEEE floating point is extremely complex
> with lots of odd non-intuitive corner cases. Very few mathematical identities work on IEEE,
> with or without denormals. Denormals are a major pain in algorithms, for example you need
> special cases to avoid the catastrophic loss of precision (and the infinities/NaNs caused
> by calculations with a denormal) as well as the huge slowdown on many CPUs.
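To make the x==y iff x-y==0 point concrete, here is a minimal sketch (my own illustration, assuming an x86-64 compiler using SSE scalar math without -ffast-math; link with -lm for nextafter; the FTZ switch is the _MM_SET_FLUSH_ZERO_MODE intrinsic from <xmmintrin.h>):

#include <stdio.h>
#include <float.h>
#include <math.h>
#include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE */

int main(void) {
    /* Adjacent doubles at the bottom of the normal range: x != y,
       and their difference (1 ulp = 2^-1074) is subnormal.
       volatile stops the compiler from folding the comparisons. */
    volatile double x = DBL_MIN;
    volatile double y = nextafter(DBL_MIN, 1.0);

    /* Default gradual underflow: the identity holds (prints 0, 0). */
    printf("default: x==y %d, x-y==0 %d\n", x == y, (x - y) == 0.0);

    /* Flush-to-zero: the subnormal difference is flushed, so two
       unequal numbers now differ by exactly zero (prints 0, 1). */
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    printf("FTZ:     x==y %d, x-y==0 %d\n", x == y, (x - y) == 0.0);
    return 0;
}

Note that with finite operands and gradual underflow the identity does hold; the corner cases above involve infinities and NaNs, not denormals.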
>
> > http://www.cs.berkeley.edu/~wkahan/19July10.pdf
> > has some interesting history and rationale for denormals.
> > (There are several readable summaries of the rationale for other IEEE754
> > features on Kahan's web site that are well worth reading).
> >
> > >
> > > But it's too late, and big CPUs largely do already handle them right
> > > and don't care, so it's mostly a problem for the embedded space.
> >
> > Using the denormal benchmark at http://charm.cs.uiuc.edu/subnormal/ , I see an (acceptable) 8x slowdown
> > of denormals on my 2.3 GHz Intel Core i7 MacBook Pro in SSE mode and an (unacceptable) 53x slowdown
> > in x87 mode (which is particularly egregious as x87 mode is needed for precise numerical work).
> > FMAC instructions make fast denormal processing easy to implement,
> > which is probably why Jaguar's denormal handling is fast.
>
> 8x slowdown is still unacceptable in my book - 10% is acceptable. POWER and ARM are at that level.
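The slowdown is easy to measure without the full benchmark; here is a minimal sketch (the loop structure and constants are mine, not the UIUC benchmark's; built normally this exercises the SSE path on x86-64, while gcc's -m32 -mfpmath=387 exercises the x87 path):

#include <stdio.h>
#include <float.h>
#include <time.h>

/* Time a multiply loop: starting from DBL_MIN, every iteration first
   produces and then consumes a subnormal; starting from 1.0, all
   values stay normal. volatile keeps the loop from being folded away. */
static double timed(double x0, long iters) {
    volatile double v = x0;
    clock_t t0 = clock();
    for (long i = 0; i < iters; i++) {
        v *= 0.5;   /* DBL_MIN/2 is subnormal (and exact) */
        v *= 2.0;   /* subnormal operand, back to x0 */
    }
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void) {
    const long N = 100000000;
    double normal    = timed(1.0, N);
    double subnormal = timed(DBL_MIN, N);
    printf("normal %.2fs, subnormal %.2fs, slowdown %.1fx\n",
           normal, subnormal, subnormal / normal);
    return 0;
}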
>
> > It is a shame that as the importance of computational methods to society and the need for accurate
> > and reliable floating point have increased in *absolute* terms over the last decades, the *relative*
> > decline in usage compared with multimedia applications has led to floating point hardware capabilities
> > being degraded to meet gamers' needs. Processor speed, memory size, even screen resolution have
> > all increased monotonically over the last decades; floating point precision, range and reliability
> > are the only features that have actually decreased... very unfortunate.
>
> The fact is the contrary is actually true. The non-IEEE compliant 80-bit x87 is finally dead
The 80-bit extended precision format has always been a recommended format of IEEE 754
(and it is far from dead; in any case, how can decreasing precision and range be progress?)
>- a huge
> step forward for floating point. Programming languages and compilers adopted IEEE and by default provide
> IEEE compliant optimizations as well as user-selectable adventurous FP optimizations. Libraries have improved
> hugely as well, providing far more accurate math functions (
The improvements in FP support in C99 and in the libraries are indeed good progress.
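As a sketch of two of those C99 additions: <fenv.h> gives portable control of the rounding mode, and fma() computes a*b+c with a single rounding (compile with -lm, and with -frounding-math if the compiler folds the divisions across fesetround):

#include <stdio.h>
#include <math.h>
#include <float.h>
#include <fenv.h>

int main(void) {
    /* C99 <fenv.h>: switch rounding modes from portable C. */
    volatile double a = 1.0, b = 3.0;
    fesetround(FE_UPWARD);
    double up = a / b;
    fesetround(FE_TONEAREST);
    double nearest = a / b;
    printf("rounding modes differ: %d\n", up != nearest);   /* 1 */

    /* C99 fma(): computes x*x exactly, then rounds once. Subtracting
       the rounded product recovers its rounding error exactly. */
    double x = 1.0 + DBL_EPSILON;
    double err = fma(x, x, -(x * x));   /* DBL_EPSILON^2, not 0 */
    printf("rounding error of x*x: %g\n", err);
    return 0;
}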
>the norm is now). 128-bit IEEE is becoming available in hardware.
Really? I haven't seen any evidence that 128-bit IEEE support is planned for x86 (or ARM, or anything else except IBM mainframes). If it were, then the x87 extended precision format truly could be made redundant.
The 80-bit IEEE extended precision format was actually explicitly designed to be upgraded from 80 to 128 bits (without any software changes required); it has the same exponent width as 128-bit IEEE and, indeed, in current ABIs is actually spilled as 128 bits in memory.
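A quick check of that claim (assuming gcc or clang on x86-64, where long double is the 80-bit x87 format):

#include <stdio.h>
#include <float.h>

int main(void) {
    /* x87 extended precision: 64-bit significand... */
    printf("LDBL_MANT_DIG = %d\n", LDBL_MANT_DIG);       /* 64 */
    /* ...and the same 15-bit exponent range as IEEE binary128. */
    printf("LDBL_MAX_EXP  = %d\n", LDBL_MAX_EXP);        /* 16384 */
    /* The x86-64 ABI pads the 10 data bytes to 16 when stored. */
    printf("sizeof(long double) = %zu\n", sizeof(long double));  /* 16 */
    return 0;
}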
>So we've made huge progress in the last decades.
>
> Wilco
>