By: anon (anon.delete@this.anon.com), May 7, 2013 6:08 am
Room: Moderated Discussions
none (none.delete@this.none.com) on May 7, 2013 4:59 am wrote:
> anon (anon.delete@this.anon.com) on May 7, 2013 4:53 am wrote:
> [...]
> > I'm surprised there does not seem to be mention of any kind of cached decode work. It does not
> > seem like there is a decoded instruction cache or loop buffer of any form (or can the instruction
> > queue act as a loop buffer?). Is there any pre-decoding going on in the L1 icache?
>
> Page 3 says this:
> "The 32-entry instruction queue separates the front-end of the pipeline from the
> out-of-order machinery and also functions as a loop cache. When executing out of
> the loop cache, the entire front-end is clock gated to reduce power consumption."
Thank you. My apologies, I missed that.
>
> > I'm surprised because A15 has a small loop buffer, even though most people seem to say that ARM decoders
> > should be simpler than x86. And Silvermont's decoders seem to be much more capable and parallel
> > than the previous Atom's. Perhaps they have made some significant gains in x86 decoding efficiency?
>
> cf. the above quote: the loop buffer's main purpose is to decrease
> power consumption. The same applies to Cortex-A15.
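
To make the power argument concrete, here is a rough toy model of what the quoted passage describes. It is only a sketch under my own assumptions: that a loop is captured once its body fits in the 32-entry queue, that the first iteration still goes through the front-end, and that every later iteration replays from the queue with the front-end gated. The function name and the capture rule are hypothetical; the article does not say how loop detection actually works.

```python
def simulate_loop(body_len, iterations, queue_entries=32):
    """Toy model: return (frontend_fetches, loop_cache_replays) for one loop.

    Assumption (not from the article): a loop body that fits in the
    instruction queue is captured after its first pass, and all later
    iterations are replayed from the queue with the front-end gated.
    """
    if body_len <= 0 or iterations <= 0:
        raise ValueError("body_len and iterations must be positive")

    if body_len > queue_entries:
        # Body does not fit: every instruction of every iteration
        # has to be fetched and decoded by the front-end.
        return body_len * iterations, 0

    # First iteration fills the queue through the front-end;
    # the remaining iterations are served from the loop cache.
    return body_len, body_len * (iterations - 1)


if __name__ == "__main__":
    for body in (8, 32, 40):
        fetched, replayed = simulate_loop(body, iterations=100)
        print(f"body={body:2d}  front-end={fetched:5d}  loop cache={replayed:5d}")
```

In this model the decoders do no work for a small loop after its first pass, which is where the power saving would come from. It does not show any throughput gain, which matches the point above that the buffer's main purpose is power.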