By: Anon (no.delete@this.email.com), September 29, 2015 2:23 pm
Room: Moderated Discussions
SHK (no.delete@this.mail.com) on September 29, 2015 1:04 pm wrote:
> Tim McCaffrey (timcaffrey.delete@this.aol.com) on September 29, 2015 12:18 pm wrote:
> > SHK (no.delete@this.mail.com) on September 29, 2015 6:38 am wrote:
> > > Finally Intel has released the new updated version (-031) of the Optimization Manual:
> > > http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
> > >
> > > There're some new details on Skylake:
> > >
> > > * Front end now has 5 decoders from the usual 4.
> > > * Micro-ops cache can deliver 6 m-ops/cycle instead of 4.
> > > * loop-buffer size is now 64 m-ops
> > > * bigger OoO structures (but no official numbers cited, IIRC ROB size is 224 entries, RS size is 97)
> > > * page split load penalties from 100 cycles to 5 (that's an improvement!)
> > > * longer idle time for the PAUSE instruction
> > > * faster L3, 2-cycles per line now
> > >
> > > That's what I've noticed from a quick browse; I have yet to dig into the instruction latency tables.
> > >
> >
> > They cut the performance of MMX instructions in half (compared to Broadwell). They suggest
> > that you use AVX2 instead, except MMX is probably used mostly by 32 bit programs, and I don't
> > think AVX2 is available in 32 bit mode. Not that I think this is a big deal, just interesting.
> >
> > - Tim
>
> Well, I think it's wise; hopefully x87 will follow the same path soon. IIRC AVX2 is available
> in 32-bit mode, but I cannot imagine why someone would target 32-bit x86 when x86-64 is more than
> 10 years old. 16 registers are not much, 8 is a nightmare: way too many push/pop.
>
Because they have existing, required dusty-deck 32-bit libraries they must work with?
Seriously, that is very often a limitation.
Microsoft makes it doubly hard: all sorts of OS functionality gets tied to the mode you are running in (for example, codecs).