New Silicon Insider Article

Article: Escape From the Planet of x86
By: Bill Todd (billtodd.delete@this.metrocast.net), June 19, 2003 5:00 pm
Room: Moderated Discussions
David Wang (dwang@realworldtech.com) on 6/19/03 wrote:
---------------------------
>Bill Todd (billtodd@metrocast.net) on 6/19/03 wrote:
>---------------------------
>>mas (mas769@hotmail.com) on 6/19/03 wrote:
>>---------------------------
>>>Bill Todd (billtodd@metrocast.net) on 6/19/03 wrote:
>>>---------------------------
>>>>Alberto (albertobu@libero.it) on 6/18/03 wrote:
>>>>
>>>>...
>>>>
>>>>>If you read:
>>>>>http://www.intel.com/design/itanium2/download/14_4_slides_r31_nsn.htm
>>>>
>>>>One more interesting tidbit that I noticed in that presentation was the L3 latency
>>>>of 14 clock cycles. My recollection is that McKinley's L3 latency was 12 cycles
>>>>(though it may have been more in some situations - ISTR the range 12 - 15 cycles
>>>>being mentioned once): does this indicate a slightly sub-linear improvement in L3 performance for Madison?
>>>>
>>>
>>>Well, it all depends on how that offsets against the improvements, which are a 50%
>>>bandwidth improvement in all the caches and a doubling of the set associativity
>>>of the L3 in particular (12 -> 24). My WAG is that overall the cache structure has been improved, clock for clock.
>>
>>A 50% improvement in bandwidth would seem to translate to a 0% improvement 'clock
>>for clock'. And ISTR seeing some moderately authoritative source for a rule of
>>thumb that 8-way associativity yielded performance close enough to full associativity
>>that further effort was not justified (which would seem to make sense, given that
>>for the colder data in the cache random replacement works about as well as LRU replacement)
>>- so if going to 24-way cost any of that increased latency the trade-off would seem questionable.
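
(Just to make the 'clock for clock' arithmetic explicit - a rough Python sketch, taking the 32 GB/s per-level figure from the ISSCC excerpt quoted further down as the 1 GHz McKinley baseline, and assuming the 50% bandwidth increase arrives together with the 50% clock increase:)

# Does 50% more bandwidth at a 50% higher clock move any more bytes per cycle?
def bytes_per_cycle(bandwidth_gbytes_per_s, clock_ghz):
    return bandwidth_gbytes_per_s / clock_ghz

mckinley = bytes_per_cycle(32.0, 1.0)        # 32 GB/s at 1.0 GHz -> 32 bytes/cycle
madison = bytes_per_cycle(32.0 * 1.5, 1.5)   # 48 GB/s at 1.5 GHz -> still 32 bytes/cycle

print(mckinley, madison)                     # 32.0 32.0 -> 0% gain clock for clock

Same bytes per cycle either way, which is all I meant by a 0% improvement 'clock for clock'.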
>>
>>However, there's no getting around the fact that at least for the high-end product
>>the L3 cache size doubled: that certainly helps on average (though one could argue
>>that it would have needed to increase *some* in size just to compensate for the
>>fact that memory latency remained the same, so whether it's enough to make overall
>>performance increase linearly with the clock rate remains to be seen).
>>
>>My point was that if the size increase came at the expense of increased latency
>>(in terms of clock cycles, not absolute) then it was more of a mixed blessing than would otherwise have been the case.
>
>Size increases always come at the expense of increased latency. If you want to
>hang more bits on the same wordline or the same bitline, array access will be slower.
>If you have more banks/segments/arrays, then getting access to any individual bank/segment/array would take longer.
>
>There's a monkey wrench in this comparison in that there's a process change involved,
>and there's more than just "more cache" that impacted the (cycle count) latency
>of the L3 cache. The L3 cache is actually faster in wall-clock terms (15 cycles at
>1.5 GHz = 10 ns vs. 12 cycles at 1 GHz = 12 ns); it just didn't get sped up as much as the rest of the chip.
>
>http://cpus.hp.com/technical_references/isscc_2002/isscc_2002_1.shtml
>
>-----------------------------------------------------------
>
>2) The memory system incorporates 3 levels of caching optimized for low latency,
>high bandwidth and high density respectively. The pre-validated, 4 port 16KB L1D
>cache [1] is tightly coupled to the integer units to achieve the half cycle load.
>As a result, the less latency sensitive FPU directly interfaces to the L2D cache
>[1] with 4 82b load ports (6 cycle latency) and 2 82b store ports. The 3MB, 12 cycle
>latency L3 cache [1] is implemented with 135 separate "subarrays" that enable high
>density and the ability to conform to the irregular shape of the processor core
>with flexible subarray placement. Each level of on-chip cache has matched bandwidths
>at 32GB/s across the hierarchy (figure 20.6.3).
>----------------------------------------------------------
>
>L1 is optimized for latency, L2 is optimized for bandwidth, and L3 is optimized for
>density. Slightly longer latency for the L3 should be a good tradeoff when it gets you an even larger cache.
>
>There are ways to keep the latency of larger caches from increasing in cycle count,
>but they all involve trading off die area for larger cells/drivers/repeaters/sense
>units. Since L3 design calls for density optimization, it does not seem to be worth
>it to trade off area/transistor-count/power to keep latency at the same 12 cycles.

All that is fine and dandy, but irrelevant to the question I asked, which was whether L3 latency had scaled rather less than linearly with clock rate. AFAICT the answer is a simple 'yes', though still qualified by my impression that the L3 latency specs for McKinley gave a 12 - 15 cycle range whereas those for Madison seem to give a flat 14 cycle figure.
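
To put numbers on that - a rough Python sketch, using the 12-cycle McKinley figure at 1 GHz and the flat 14-cycle Madison figure at 1.5 GHz discussed above:

# Wall-clock L3 latency, and how its improvement compares with the clock bump.
def latency_ns(cycles, clock_ghz):
    return cycles / clock_ghz

mckinley_ns = latency_ns(12, 1.0)            # 12.0 ns
madison_ns = latency_ns(14, 1.5)             # ~9.33 ns

latency_speedup = mckinley_ns / madison_ns   # ~1.29x
clock_speedup = 1.5 / 1.0                    # 1.5x

print(round(latency_speedup, 2), clock_speedup)   # 1.29 1.5 -> sub-linear

A roughly 29% wall-clock improvement against a 50% clock improvement is what I mean by 'rather less than linearly'.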

But we'll see soon enough just how linearly Madison performance scales with clock rate on various benchmarks. It seemed to do pretty well on SPECweb99_SSL, though it was aided by use of a newer version of Zeus (and my impression, from comments by someone who should know, is that this often makes a non-negligible difference). TPC-C scaling was less linear. Unless significant compiler advances have occurred since McKinley's SPECint scores were posted, I'm beginning to suspect that Madison at 1.5 GHz will have difficulty getting much above 1200, despite the doubling in L3 cache size.
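
For what it's worth, one way to sanity-check a figure like 1200 is pure clock scaling from McKinley's score - a rough Python sketch, where the 800 baseline is a round placeholder rather than the actual published SPECint result:

# Best case from clock scaling alone: the score grows linearly with the clock,
# and in practice less, since memory latency stays the same in wall-clock terms.
def linear_scaling_ceiling(base_score, base_clock_ghz, new_clock_ghz):
    return base_score * (new_clock_ghz / base_clock_ghz)

mckinley_specint = 800   # placeholder - plug in McKinley's actual 1 GHz result
print(linear_scaling_ceiling(mckinley_specint, 1.0, 1.5))   # 1200.0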

- bill
