By: SHK (no.delete@this.mail.com),
Room: Moderated Discussions
Intel has finally released the updated version (-031) of the Optimization Manual:
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
There are some new details on Skylake:
* The front end now has 5 decoders, up from the usual 4.
* The micro-op cache can deliver 6 m-ops/cycle instead of 4.
* The loop buffer size is now 64 m-ops.
* Bigger OoO structures (no official numbers are cited; IIRC the ROB has 224 entries and the RS has 97).
* The page-split load penalty drops from 100 cycles to 5 (that's an improvement!).
* Longer idle time for the PAUSE instruction.
* Faster L3, now 2 cycles per line.
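For context on the page-split item: a load is a page split when its bytes straddle two pages. Below is a rough sketch of that check, assuming 4 KiB pages; `crosses_page` and `PAGE_SIZE` are hypothetical names used for illustration, not anything taken from the manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, the usual x86 base page size. */
#define PAGE_SIZE 4096u

/* Returns true if a load of `size` bytes starting at `addr`
   touches two different pages (i.e. it is a page-split load). */
bool crosses_page(uintptr_t addr, size_t size)
{
    assert(size > 0 && size <= PAGE_SIZE);
    uintptr_t first_page = addr / PAGE_SIZE;
    uintptr_t last_page  = (addr + size - 1) / PAGE_SIZE;
    return first_page != last_page;
}
```

An 8-byte load at offset 4088 ends exactly at the page boundary and does not split; the same load at offset 4089 does.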
That's what I've noticed from a quick skim; I have yet to dig into the instruction latency tables.
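On the PAUSE item: if each PAUSE idles longer, a spin-wait loop bounded by an iteration count may end up waiting longer in wall-clock terms than it did on older cores. Here is a minimal sketch of such a loop, assuming C11 atomics and the `_mm_pause` intrinsic on x86; `spin_until_set`, `cpu_relax`, and `max_spins` are hypothetical names, and the non-x86 fallback is just a no-op.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>              /* _mm_pause() */
#define cpu_relax() _mm_pause()
#else
#define cpu_relax() ((void)0)       /* no-op fallback on other architectures */
#endif

/* Spin until *flag is observed true, or until max_spins PAUSE
   iterations have been executed. Returns true if the flag was seen set. */
bool spin_until_set(atomic_bool *flag, unsigned max_spins)
{
    assert(flag != NULL);
    for (unsigned i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(flag, memory_order_acquire))
            return true;
        cpu_relax();                /* hint to the core that this is a spin-wait */
    }
    /* One last check after the spin budget is used up. */
    return atomic_load_explicit(flag, memory_order_acquire);
}
```

A loop like this ties its timeout to the cost of PAUSE, so a budget picked on an older core would probably need re-tuning, or a time-based bound instead of an iteration count.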


