Article: AMD's Jaguar Microarchitecture
By: UnmaskedUnderflow (unmasked.delete@this.unmasked.org), April 7, 2014 11:26 am
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on April 7, 2014 5:39 am wrote:
> Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on April 7, 2014 3:27 am wrote:
> > >
> > > The real problem is more exotic - x87 precision control can reduce the precision of
> > > mantissa, but it can't reduce the range of exponent. So, results of x87 computations
> > > with single or double precision remain the same as specified by IEEE only as long as
> > > you stay within official range. Which sounds nearly impossible in practice.
> >
> > You can store to memory after every operation to get the exponent right - this is still
> > not IEEE compliant as denormals suffer from double rounding. Of course doing this causes
> > another performance penalty but at least it gives more consistent results than variables
> > whose values suddenly change due to needing to be spilled to memory by the compiler.
> >
> >
> > Basically you cannot get IEEE results from x87. Quite ironic since
> > x87 was supposed to be the first IEEE implementation...
> >
>
This part of the conversation is mixing up details. If you set x87 precision control to DP or SP, the operation rounds correctly, once, to that precision internally, and the result that lands in memory is correct.
Double rounding happens ONLY if you set precision to EP and then store the result to memory as DP. You are turning it into two operations -- the operation, then the store/convert -- and each one rounds. That's the famous "double-round" bit. Per operation, that is exactly IEEE compliant.
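To make the double-round case concrete, here is a rough sketch in Python. It does not drive real x87 hardware; it only models round-to-nearest-even at a given mantissa width using exact rationals, with an unbounded exponent. The helper name round_to_bits and the test value x are my own assumptions for illustration. The value is picked so that it sits just above a DP halfway point, but rounding to the 64-bit EP mantissa first lands it exactly on that halfway point.

```python
from fractions import Fraction


def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round a positive rational to `bits` significant bits, ties-to-even.

    Hypothetical helper: the exponent range is unbounded, so this models
    only mantissa rounding, not overflow, underflow, or denormals.
    """
    if x <= 0:
        raise ValueError("sketch handles positive values only")
    # Find e with 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    ulp = Fraction(2) ** (e - bits + 1)
    # round() on a Fraction rounds half to even, which is what we want.
    return round(x / ulp) * ulp


# Exact "infinitely precise" result: slightly above the midpoint between
# the adjacent doubles 1 and 1 + 2**-52.
x = 1 + Fraction(1, 2**53) + Fraction(1, 2**65)

# One rounding: precision control set to DP (53-bit mantissa).
direct = round_to_bits(x, 53)

# Two roundings: operation at EP (64-bit mantissa), then store as DP.
twice = round_to_bits(round_to_bits(x, 64), 53)

print(direct == 1 + Fraction(1, 2**52))  # True: rounded up, as IEEE DP requires
print(twice == 1)                        # True: the tie after EP rounding went to even
```

Each individual rounding step above is correct on its own; the mismatch comes purely from doing two of them.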
>
> > > > The Motorola 68881/2 did not have that problem, IIRC.
> > > > (The whole x87 instruction set is a joke anyway)
> >
> > And ARM's FPA did get it right too. The broken stack implementation is another idiotic aspect of x87 indeed.
> >
> > Wilco
> >
1.) The x87 stack has been renamed for as long as renaming has been around. Hardware-wise, it just presents the rules of a stack up front. If you must throw rocks, throw them at having only 8 architectural registers.
2.) ARM is masked (FP exceptions). x87/x86 must support unmasked for legacy reasons. I can't emphasize enough what that means. (Maybe I should put it in my name.) It means keeping the unrounded, infinitely precise result around. The implementations are not equivalent... but that is certainly in the realm of "who gives a crap" for 99.99999% of the community.