By: anon (anon.delete@this.anon.com), May 7, 2013 4:53 am
Room: Moderated Discussions
David Kanter (dkanter.delete@this.realworldtech.com) on May 6, 2013 2:30 pm wrote:
> Silvermont is Intel’s first CPU core tailored for power efficient applications such as smartphones,
> tablets, and microservers. The 22nm microarchitecture features updated instruction set extensions,
> full out-of-order execution with a tightly coupled L2 cache, aggressive power management, and a new
> high performance SoC fabric. These enhancements deliver tremendous performance and frequency gains
> over the aging Atom core, putting Intel’s mobile strategy in a more competitive position.
>
> My detailed look at the microarchitecture is online: http://www.realworldtech.com/silvermont/
>
> Comments and questions are welcome, let the discussion begin!
I'm surprised there is no mention of any kind of cached decode work. There does not seem to be a decoded instruction cache or a loop buffer of any form (or can the instruction queue act as a loop buffer?). Is there any pre-decoding going on in the L1 icache?
I'm surprised because the A15 has a small loop buffer, even though most people seem to say that ARM decoders should be simpler than x86 decoders. And Silvermont's decoders seem to be much more capable and parallel than the previous Atom's. Perhaps they have made some significant gains in x86 decoding efficiency?