By: Michael S (already5chosen.delete@this.yahoo.com), November 17, 2012 2:45 pm
Room: Moderated Discussions
Felid (Felid.delete@this.mailinator.com) on November 17, 2012 1:44 pm wrote:
> Michael S (already5chosen.delete@this.yahoo.com) on November 17, 2012 12:50 pm wrote:
> > DIV r64 is even more interesting:
> >
> > Latency/Reciprocal Throughput
> >            SNB          IVB
> > DIV r64    80-90/???    35-45/23
> >
> > So, long integer division on IVB is not just partially pipelined,
> > but they somehow managed to cut worst case latency in half.
> > Looks like they now apply a different algorithm. Or, maybe, just extended to 128b/64b
> > an old "two bits at a time" algorithm that was in use for 64b/32b division since P5,
>
> IVB's scalar divider is almost non-pipelined. Latency is 1-2 clocks more than reciprocal throughput
> and depends heavily on the number of significant bits (position of the leftmost «1») of the divisor. For
> 32 bits it's from 26.5 (full division) to 9 (1/1) clocks. For 64 bits, 94.6 to 22.2.
So, the reference manual is lying?
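For anyone who wants to check the numbers themselves, here is a rough sketch of how the operand dependence could be measured. It is not the method used for either set of figures above. The helper names (significant_bits, div_chain, seconds_per_div) are made up for illustration, and a plain C 64-bit division normally compiles to DIV r64 with the upper half of the dividend zeroed, so it does not exercise the full 128b/64b case.

```c
#include <stdint.h>
#include <time.h>

/* Number of significant bits: position of the leftmost 1, counted from 1.
   Returns 0 for x == 0. (Hypothetical helper, for illustration only.) */
static int significant_bits(uint64_t x)
{
    int n = 0;
    while (x != 0) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Dependent chain of divisions: every dividend is built from the previous
   quotient, so the loop time is bound by latency rather than throughput.
   OR-ing the original dividend back in keeps the number of significant
   bits of the dividend constant across iterations. */
static uint64_t div_chain(uint64_t dividend, uint64_t divisor, long iters)
{
    /* volatile: stops the compiler from replacing the division with a
       multiplication by a precomputed reciprocal */
    volatile uint64_t d = divisor;
    uint64_t x = dividend;
    for (long i = 0; i < iters; i++) {
        x = (x / d) | dividend;
    }
    return x;
}

/* Average CPU time per division in seconds. Converting this to clocks
   requires knowing the actual core frequency during the run, which this
   sketch does not attempt. */
static double seconds_per_div(uint64_t dividend, uint64_t divisor, long iters)
{
    clock_t t0 = clock();
    volatile uint64_t sink = div_chain(dividend, divisor, iters);
    clock_t t1 = clock();
    (void)sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC / (double)iters;
}
```

Calling seconds_per_div with the same dividend and divisors of different significant_bits (say 1 versus a value with bit 63 set) should show whether latency really moves with operand size on a given core. Loop overhead and frequency scaling are not accounted for, so treat the result as a relative comparison only.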



