New article: AMD's Jaguar Microarchitecture

Article: AMD's Jaguar Microarchitecture
By: Wilco (Wilco.Dijkstra.delete@this.ntlworld.com), April 7, 2014 3:12 am
Room: Moderated Discussions
computational_scientist (brian.bj.parker99.delete@this.gmail.com) on April 6, 2014 7:33 am wrote:
> Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on April 6, 2014 5:29 am wrote:
> > computational_scientist (brian.bj.parker99.delete@this.gmail.com) on April 5, 2014 6:50 pm wrote:
> > > Linus Torvalds (torvalds.delete@this.linux-foundation.org) on April 4, 2014 4:56 pm wrote:
> > > > TREZA (no.delete@this.ema.il) on April 4, 2014 3:18 pm wrote:
> > > > >
> > > > > Denormals is what made IEEE FP superior to VAX. It is the correct way of doing FP math!
> > > >
> > > > Don't get me wrong - I actually personally like denormals, and think they are
> > > > definitely required for serious FP math, but the thing is, 99% of FP use isn't
> > > > really serious, and most people don't really understand FP anyway.
> > > >
> > > > The kind of people who really understand FP and do serious math using it (to the point of
> > > > caring about the order of operations, never mind denormals), those kinds of people really
> > > > do know what they are doing, and sometimes (although not always) really do want denormals.
> > > >
> > > > But there's the other side of FP math, which really just wants good enough values quickly. There are a lot
> > > > of people who are ok with single-precision and no denormals. And yes, quite often 24 bits of precision is
> > > > not enough, and they decide they actually need double precision in order to avoid odd visual artifacts.
> > > >
> > > > Yeah, I'm talking about things like games.
> > > >
> > > > And the thing is, the defaults tend to be the wrong way around. People who don't know what they
> > > > are doing with floating point basically *never* need denormals. You will generally hit other issues
> > > > long before you hit the "oops, I lost precision because I didn't have denormals". But exactly
> > > > *because* they don't know about floating point, they also don't know to disable them.
> > > >
> > > > So I think that from a hardware designer standpoint, it actually would make more sense
> > > > if the default was "denormals are zero" (or "flush to zero"), because the people who do
> > > > want denormals also know about them. So while I like denormals, the onus on disabling
> > > > denormals when not needed is kind of the wrong way around.
> > >
> > > This is back to front reasoning: physicists, mathematicians and most computational scientists don't
> > > know anything about the details of floating point and use the defaults like everyone else.
> > > Even simple assumptions like x==y iff x-y==0 don't always hold in FTZ mode; default denormal mode
> > > is the simplest floating point model mathematically. By contrast, people running gaming benchmarks
> > > who know about these things can easily enable FTZ to speed them up for marketing purposes.
> >
> > You can't rely on x==y iff x-y==0 even with denormals. IEEE floating point is extremely complex
> > with lots of odd non-intuitive corner cases. Very few mathematical identities work on IEEE,
> > with or without denormals. Denormals are a major pain in algorithms, for example you need
> > special cases to avoid the catastrophic loss of precision (and the infinities/NaNs caused
> > by calculations with a denormal) as well as the huge slowdown on many CPUs.
> >
> > > http://www.cs.berkeley.edu/~wkahan/19July10.pdf
> > > has some interesting history and rationale for denormals.
> > > (There are several readable summaries of the rationale for other IEEE754
> > > features on Kahan's web site that are well worth reading).
> > >
> > > >
> > > > But it's too late, and big CPU's largely do already handle them right
> > > > and don't care, so it's mostly a problem for the embedded space.
> > >
> > > Using the denormal benchmark at http://charm.cs.uiuc.edu/subnormal/ , I see an (acceptable) 8x slowdown
> > > of denormals on my 2.3 GHz Intel Core i7 macbook pro in SSE mode and an (unacceptable) 53x slowdown
> > > in x87 mode (which is particularly egregious as x87 mode is needed for precise numerical work).
> > > FMAC instructions make fast denormal processing easy to implement,
> > > which is probably why Jaguar's denormal handling is slow.
> >
> > 8x slowdown is still unacceptable in my book - 10% is acceptable. POWER and ARM are at that level.
> >
> > > It is a shame that as the importance of computational methods to society and the need for accurate
> > > and reliable floating point has increased in *absolute* terms over the last decades, the *relative*
> > > decreased usage compared with multimedia applications has led to floating point hardware capabilities
> > > being degraded to meet gamers' needs. Processor speed, memory size, even screen resolution, have
> > > all increased monotonically over the last decades- floating point precision, range and reliability
> > > is the only feature that has actually decreased... very unfortunate.
> >
> > The fact is the contrary is actually true. The non-IEEE compliant 80-bit x87 is finally dead
>
> The 80-bit extended precision format has always been a recommended format of IEEE 754
> (and is far from dead; in any case how can decreasing precision and range be progress?)

The IEEE standard doesn't recommend 80-bit formats at all. The formats it specifies all have widths that are powers of 2. You're allowed to define extended formats in any way you like, but they remain internal formats which are not standardized in any way.

x87 is dead because of SSE, which finally supports correct IEEE arithmetic. Yes, you can still use x87 if you like, but it's not widely supported (e.g. try using long double in VC++ and see how far you get).

> >- a huge
> > step forward for floating point. Programming languages and compilers adopted IEEE and by default provide
> > IEEE compliant optimizations as well as user selectable adventurous FP optimizations. Libraries have improved
> > hugely as well, providing far more accurate math functions (
>
> The improvements in fp in C99 and libraries is indeed good progress.
>
> >the norm is now 128-bit IEEE is becoming available in hardware.
>
> Really? I haven't seen any evidence that 128-bit IEEE support is planned for x86 (or ARM or anything else
> except IBM mainframes). If it was then x87 extended precision format truly could be made redundant.

128-bit format support is already available in various compilers and libraries, and SPARC/POWER support it in the ISA. When x86/ARM will follow is a good question (as FP beyond 64-bit is a niche), but if you need it, it is available today.

> The 80-bit IEEE extended precision format was actually explicitly designed to be upgraded from
> 80 to 128 bits (without any software changes required)- it has the same exponent width as 128-bit
> IEEE and, indeed, in current designs is actually spilled as 128 bits in memory.

The 80-bit x87 format is in no way binary compatible with 128-bit quad precision.
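
To make that concrete, here is a rough sketch (not anything from a real library) that builds the bit patterns for 1.0 by hand in both formats. The field widths are the standard ones: x87 extended has a 15-bit exponent and a 64-bit significand with an explicit integer bit, while binary128 has a 15-bit exponent and 112 stored fraction bits with an implicit integer bit. The helper names are made up for illustration, and the two ways of placing the 80-bit pattern in a 128-bit slot are assumptions about what "upgrading" might mean.

```python
# Illustrative sketch only: helper names are hypothetical, not a library API.

BIAS = 0x3FFF  # both formats use a 15-bit exponent with bias 16383


def x87_extended_bits(sign, biased_exp, significand64):
    """80-bit x87: 1 sign bit, 15 exponent bits, 64-bit significand
    whose top bit is an explicitly stored integer bit."""
    assert 0 <= biased_exp < (1 << 15) and 0 <= significand64 < (1 << 64)
    return (sign << 79) | (biased_exp << 64) | significand64


def binary128_bits(sign, biased_exp, fraction112):
    """IEEE binary128: 1 sign bit, 15 exponent bits, 112 fraction bits;
    the integer bit is implicit and not stored."""
    assert 0 <= biased_exp < (1 << 15) and 0 <= fraction112 < (1 << 112)
    return (sign << 127) | (biased_exp << 112) | fraction112


def binary128_normal_value(bits):
    """Decode a positive normal binary128 pattern to a Python float
    (enough precision for this demonstration only)."""
    exp = (bits >> 112) & 0x7FFF
    frac = bits & ((1 << 112) - 1)
    assert 0 < exp < 0x7FFF, "only normal numbers handled in this sketch"
    return (1 + frac / (1 << 112)) * 2.0 ** (exp - BIAS)


one_x87 = x87_extended_bits(0, BIAS, 1 << 63)  # integer bit set, fraction 0
one_quad = binary128_bits(0, BIAS, 0)          # fraction 0, integer bit implied

# Assumption 1: the 80 bits sit in the low end of the slot, padding above.
low_aligned = one_x87
# Assumption 2: the 80 bits sit in the high end of the slot, zeros below.
high_aligned = one_x87 << 48

print(f"x87 1.0       : {one_x87:020x}")
print(f"binary128 1.0 : {one_quad:032x}")
print(f"x87 low slot  : {low_aligned:032x}")
print(f"x87 high slot : {high_aligned:032x}")
print("high slot read as binary128:", binary128_normal_value(high_aligned))
```

Neither placement gives the binary128 encoding of 1.0. With the sign and exponent fields lined up, the explicit integer bit lands in the top fraction bit, so the pattern reads as 1.5 rather than 1.0.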

Wilco