number of sockets is wrong metric (was: New article: 8 socket commodity servers)

Article: 8-Socket Commodity Servers: Flourish or Perish?
By: David Kanter (dkanter.delete@this.realworldtech.com), March 14, 2010 12:56 pm
Room: Moderated Discussions
longtimelurker (rwt@nospam.maibaums.net) on 3/14/10 wrote:
---------------------------
>Vincent Diepeveen (diep@xs4all.nl) on 3/9/10 wrote:
>
>>
>>I'm not an AMD fanboy, but really this goes too far.
>>
>>You realize it's start 2010 and you write about a cheap solution that's there for
>>many years now total unrivalled i price by intel: "in theory".
>>
>
>I have to agree that the article seems very intel-biased.

The focus was unarguably on the impact of Nehalem-EX in the 8S server market.

The main reason is that AMD has essentially withdrawn from the 8S market. Magny-Cours is not available for 8 sockets, only 4. Partly this is because they would run out of bits in the probe filter with 2X the number of CPUs. So while AMD has been a viable option for a while, they really aren't any more.

In theory, there could be 8S systems based on Istanbul... but why would you want to do that instead of using Magny-Cours?

>8S-Opteron was a viable
>platform if you really needed the number of memory channels and most importantly
>the total memory size. They didn't scale even close to linearly compared with 4S,
Yes, that was definitely another issue.

>but were still significantly faster. If you just care about number of cores and
>can parallelize well, you would build a cluster of 1S or 2S machines anyway.

That's a different sort of parallelism. Multiple processes in the same address space are different from multiple boxes over Ethernet. I agree that most folks who can cluster will... but not all. For instance, Oracle's clustering is strongly limited; I don't believe there are production instances of RAC with more than ~20 boxes. So they need bigger individual boxes.

>Which brings me to the actual point I'm trying to make: Isn't the number of sockets
>an antiquated metric? What you are really interested in with those big boxes is
>the total number and throughput of the memory channels and that they're reasonably low latency from any core.

Not just that, but also the I/O. I've actually been meaning to write up something comparing the memory capacity, bandwidth and I/O for various systems as they scale.

To some extent I addressed this indirectly: as the number of cores per socket grows, the need for sockets may decrease.

>IBM showed off Nehalem systems at Cebit that used external QPI links to attach
>extra enclosures just full of memory, and with Magny-Cours AMD just stuffed significantly
>more bandwidth and lower latency into 4S than they previously had at 8S. Not sure
>if there'll be an 8S Magny-Cours, but it'll likely suffer from similar problems
>as the previous Opteron 8000. Maybe someone will build a 6S, the Chips are called Opteron 6000 after all.

Note that the way the probe filter works in Magny-Cours requires that if a line is held in cache, it must have an entry in the probe filter. Thus if you run out of probe filter space, you may have to evict a line from cache. This is also the way that Intel's snoop filter worked for the Blackford and Seaburg chipsets. Intel found that it's important to have a snoop filter that covers at least 2X your caches (e.g. a system with 12MB total of cache should have 24MB of snoop filter coverage) to account for associativity conflicts.
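The inclusion requirement described above can be illustrated with a toy model. This is only a sketch: the class name `InclusiveFilter`, the modulo set indexing, and the oldest-first replacement are all assumptions made for illustration, not a description of AMD's probe filter or Intel's snoop filter.

```python
# Toy model of an inclusive probe/snoop filter (hypothetical, for illustration).
class InclusiveFilter:
    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # Each set tracks up to `ways` line addresses, oldest first.
        self.sets = [[] for _ in range(num_sets)]
        self.cached = set()        # lines currently held in some cache
        self.forced_evictions = 0  # lines dropped only to free a filter entry

    def cache_line(self, line_addr):
        """Record that `line_addr` is now cached somewhere in the system."""
        if line_addr in self.cached:
            return
        entries = self.sets[line_addr % self.num_sets]
        if len(entries) == self.ways:
            # The filter set is full: drop its oldest entry, which forces
            # that line out of cache so that inclusion still holds.
            victim = entries.pop(0)
            self.cached.discard(victim)
            self.forced_evictions += 1
        entries.append(line_addr)
        self.cached.add(line_addr)


# 4 sets x 2 ways = 8 filter entries, but three lines that collide in set 0.
f = InclusiveFilter(num_sets=4, ways=2)
for addr in (0, 4, 8):
    f.cache_line(addr)
print(f.forced_evictions, sorted(f.cached))
```

Even though the filter has 8 entries and only 3 lines are in use, the third line forces an eviction because all three map to the same set. That is the kind of associativity conflict the 2X rule of thumb is meant to absorb.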

AMD's probe filter coverage is actually slightly less than the overall cache capacity for a single chip (let alone the whole system). A single Istanbul has 268,288 cache lines across L1, L2 and 5MB of L3 (remember 1MB is removed for the probe filter). The probe filter can index 262,144 lines, so the coverage is 98%.

This is unfortunate, since it means that a given home node (i.e. a single Istanbul chip) can only have 16MB of its local memory cached. If you end up with uneven memory accesses and hot spots, you'll be in trouble since the probe filter may prevent some lines from being cached.
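The coverage arithmetic can be reproduced in a few lines. This is a sketch that takes the two line counts quoted above at face value; the 268,288 total is not rederived from per-level cache sizes, since the post does not give that breakdown, and the 64-byte line size is an assumption not stated in the post.

```python
LINE_BYTES = 64  # assumed cache line size; not stated in the post

FILTER_LINES = 262_144  # lines the probe filter can index, as quoted
CACHE_LINES = 268_288   # total cache lines per chip, as quoted (not verified)


def coverage_ratio(filter_lines, cache_lines):
    """Fraction of the cache's lines that the filter can track at once."""
    return filter_lines / cache_lines


def covered_bytes(filter_lines, line_bytes=LINE_BYTES):
    """Amount of a home node's memory that can be cached at once."""
    return filter_lines * line_bytes


ratio = coverage_ratio(FILTER_LINES, CACHE_LINES)
covered_mib = covered_bytes(FILTER_LINES) / 2**20

print(f"coverage: {ratio:.1%}")
print(f"cacheable memory per home node: {covered_mib:.0f} MiB")
print(f"meets the 2X rule of thumb: {ratio >= 2.0}")
```

With the quoted inputs the ratio comes out just under 98%, and 262,144 lines of 64 bytes is where the 16MB figure comes from. Swapping in a different cache line total changes the ratio but not the 16MB limit.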

Doubling the socket count to 8 would increase the number of remote accesses, and hence increase the number of cache lines needed for an effective probe filter. Additionally, the size of each probe filter entry would grow to account for the additional sockets.
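To make the entry-size point concrete, here is a rough sketch of how an owner field might grow. It assumes each entry names a single owner with a binary-encoded node ID, and that each Magny-Cours package counts as two coherent nodes (one per die); the actual entry layout is not described in this post, so `owner_field_bits` and `DIES_PER_PACKAGE` are purely illustrative.

```python
DIES_PER_PACKAGE = 2  # assumption: each Magny-Cours package is two nodes


def owner_field_bits(nodes):
    """Bits needed for a binary-encoded ID naming one of `nodes` owners."""
    return max(1, (nodes - 1).bit_length())


for sockets in (4, 8):
    nodes = sockets * DIES_PER_PACKAGE
    print(f"{sockets}S: {nodes} nodes, {owner_field_bits(nodes)} owner bits")
```

Under these assumptions a 4S system needs 3 owner bits and an 8S system needs 4, so an entry format sized for 8 nodes has no spare encoding for a 16-node system.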

I can't see a way for Magny-Cours to work effectively in 8S.


Now in all fairness, Nehalem-EX has no snoop filter (except perhaps for 3rd party chipsets). However, there are more QPI (formerly CSI) links, so the connectivity is better and more sockets are within one hop. It will be interesting to see how Nehalem-EX scales on regular chipsets vs. proprietary ones...

David