Back on track...

Article: A Preview of Intel's Bensley Platform (Part II)
By: Dean Kent (dkent.delete@this.realworldtech.com), December 5, 2005 12:35 pm
Room: Moderated Discussions
Leonov (chrissinger@vigin.net) on 12/5/05 wrote:
---------------------------
>
>As far as I am concerned it is never wrong to disclose information when benchmarking
>a system and that (as far as I can see) is all that is being asked for.

Let's step back for a moment and look at the bigger picture.

For those who haven't ever actually tried, the problems involved with benchmarking/reviewing are tremendously difficult to solve. I made a contention a number of years ago that the attempt by publications to 'benchmark' processors/systems is a (next to) impossible task. The reason is that a true benchmark is one that represents *your* usage, not someone else's. The best you can do for review purposes is a reasonable approximation.

First, you generally get notified of a new system/component a few months in advance of its release. You get your hands on it a few weeks before release. Time is your enemy, if you want to have a relevant and timely review.

Basically you have two choices for benchmarking: standardized benchmarks (SPEC, TPC, etc.) and application benchmarks (Cinebench, Maya, Fluent, Cadalyst, Pro/Engineer, etc.). Yeah, you can use synthetic benchmarks to test various things, but the problem is relating them to actual performance in the 'real world'.

The benefit of the standardized benchmarks is that some group of (supposedly) knowledgeable people spent a lot of time and effort to profile hundreds of applications and pick the most representative (along with representative data sets). However, the drawback is that these are generally not complete applications, but 'representative samples'. In addition, these groups have to be funded and staffed, and that usually means by the manufacturers who have the most to gain by this effort. Therefore, we have suspicions about how these efforts have been influenced.

The benefit of application benchmarks is that you have some ability to run what users are actually running. The drawbacks are many, however. You have to identify data sets that are realistic and representative; you have to become familiar with the software and create an automated script; and you have to research how applicable each one is to the system/component being benchmarked (is the app typically used in that market segment?).

Then there is the cost. You have to buy a license for the standardized benchmarks, or you have to buy a license for the applications. Sometimes (as with some SPEC benchmarks) you have to buy the license for both. Sure, you can get open source applications, but you still have to do the research to find out how suitable they are (how popular, how representative, how well written, etc.)

Then, after you have selected the benchmarks you may have to compile them. Which compiler? Intel's, or Pathscale, which has a relationship with AMD? Maybe Microsoft - which means another license to pay for... or use Linux/gcc.. but is it really representative? For some, yes. For others, no. Then you have to run them - more than once if you are being thorough. In many cases this is hours of runtime. You might have to profile them (with VTune), gather the results and analyze them. This isn't so hard with a dozen benchmarks, but try it with several dozen...

Finally, you have to figure out whether two benchmarks are really testing the same thing. In other words, benchmarking is really all about finding out the bottlenecks in a system/component. Two applications that use essentially the same resources will have the same bottleneck - so running both of them is somewhat redundant. What you ideally want to do is identify benchmarks that use the system resources sufficiently differently that you can actually identify the weak/strong parts of the system.
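
To make the redundancy point concrete, here is a minimal sketch of one way to spot it: if two benchmarks rank and scale a set of systems almost identically, they are probably hitting the same bottleneck. The benchmark names, the scores and the 0.95 threshold are all made up for illustration, and correlation across systems is only a rough stand-in for real profiling.

```python
import math
from itertools import combinations

# Hypothetical scores: each benchmark run on the same five systems (higher is better).
scores = {
    "render_a": [100, 120, 150, 170, 200],
    "render_b": [98, 123, 149, 172, 205],
    "db_load": [100, 101, 140, 99, 142],
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def redundant_pairs(scores, threshold=0.95):
    """Return (name_a, name_b, r) for benchmark pairs that track each other closely."""
    pairs = []
    for a, b in combinations(sorted(scores), 2):
        r = pearson(scores[a], scores[b])
        if r >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs

for a, b, r in redundant_pairs(scores):
    print(f"{a} and {b} look redundant (r = {r})")
```

With these invented numbers only the two rendering benchmarks are flagged, so you would keep one of them plus the database load.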

No matter how well you choose, set up and analyze, however, there is *always* someone complaining that one or more benchmarks used is somehow inappropriate for various reasons. When enough complaints have been filed, virtually all benchmarks have been disqualified for one reason or another.

The 'solution' to this problem has typically been 'open source', because anyone can download/research/compile them and verify what they consist of. Unfortunately, the vast majority of people (including reviewers) can't or won't do this - so the 'benefit' isn't really much of a benefit. My contention is that if you have the expertise and are willing to take the effort to identify what is in an open source benchmark, you probably have the ability to identify what a 'closed source' benchmark is doing via profiling/disassembling/etc. - if you *really* want to know.

In addition, it really doesn't solve the problem for those who *do* use 'closed source' applications and want to know how those will perform - even if they are optimized.

So what does a reviewer do? He/she wrings his/her hands, begs for licenses/data sets and takes whatever assistance he/she can get from the manufacturer. Reviewing an AMD part? - try getting Intel to provide you with assistance for obtaining realistic benchmarks, and vice-versa. So where do you *think* you will get assistance? They may provide you with some benchmarks, which you *know* will not include anything that doesn't make them look good - but you hope that these will become part of a suite of benchmarks you can use on other systems. But you may not have the ability to test the application on those other systems before doing your current review, so you don't know for sure how applicable it is. Then you get questioned about it in a way that might make you look bad.

So, maybe you just look at what everyone else is doing and use what they use hoping that they did their due diligence. Or maybe you take suggestions from regular readers, hoping that they have a clue. Or maybe you spin your wheels doing so much research that you don't even get a review out... jeopardizing your chances of getting another part to review. Or maybe you just take the manufacturer's recommendations because that is what you have time for.

In short - reviews suck. :-) But there isn't much alternative unless you are independently wealthy (and then you probably don't care).

Were I to have the time/money/ability, I would do it this way...

Spend a lot of time, effort and money to understand the system, components and tools you are using. Run all of the standardized, synthetic and application benchmarks possible on various systems (vary memory, hard drives, graphics, and other components as much as possible). Run software analyzers on each setup. Create a database of results that is sortable by many different criteria. Spend the time to analyze the results and try to group applications and benchmarks by their profiles (which have a large memory footprint, which are CPU bound, which are I/O bound, which are primarily integer vs floating point, etc., etc). In this way, when a benchmark is run, you could go to the database and find out what other applications/benchmarks have a similar resource usage/profile and use that to estimate how the benchmark applies to *your* situation, if at all. You could see the outliers, and weed them out of reviews. You could make realistic comparisons of which systems/components are comparable in performance. You could fairly quickly see the strengths/weaknesses of various setups.
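
As a rough sketch of the lookup described above, assume each application/benchmark in the database is reduced to a small vector of resource emphasis, and "similar" means close in that space. The entries, the four dimensions and the numbers are hypothetical; a real profile would come from the analyzer runs, not from guesses like these.

```python
import math

# Hypothetical profiles: relative emphasis on (cpu, memory footprint, I/O, floating point),
# each on a 0..1 scale. None of these are measured values.
profiles = {
    "bench_fp": (0.9, 0.2, 0.1, 0.9),
    "bench_mem": (0.4, 0.9, 0.2, 0.3),
    "bench_io": (0.2, 0.3, 0.9, 0.1),
    "my_renderer": (0.8, 0.3, 0.1, 0.8),
}

def nearest(profiles, target, k=2):
    """Return the k entries whose profiles are closest to `target` (Euclidean distance)."""
    others = [name for name in profiles if name != target]
    others.sort(key=lambda name: math.dist(profiles[name], profiles[target]))
    return others[:k]

# Which published benchmark says the most about *my* application?
print(nearest(profiles, "my_renderer", k=1))
```

The same distance measure could be used the other way around, to flag a benchmark that is far from everything a reader actually runs.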

Today, reviews are only marginally useful to the consumer (though extremely useful to the manufacturer), and so discussing whether one benchmark is 'skewed' or whether a manufacturer influences the results (which all will given the right opportunity) is pretty useless, IMO.

I stopped doing reviews a while back because of the frustration about what I *couldn't* do. I started the Benchmark Examiner articles to try and focus on some of the problems I mentioned (but life got in the way - big time). I'd like to do some reviews again, and I very well might - but I'll probably be frustrated, dissatisfied and stressed about them much more than anyone reading them will be. I'd also like to do benchmark evaluations, create a benchmark database and several other related tasks... but time is my enemy, of course.

Wouldn't it be great if the open source concept could be applied here - farm out a task, and then put the results into the database? Have at least three people run the same tests, and if there are variations, try it again until you get consistent results... Then time and money wouldn't be such a problem... ;-)
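
A minimal sketch of that acceptance rule, assuming "consistent" means the submitted scores vary by no more than a few percent. The 3% tolerance and the function name are assumptions for illustration, not an established standard.

```python
from statistics import mean, pstdev

def is_consistent(results, tolerance=0.03):
    """Accept a set of submitted scores only if at least three runs agree.

    Agreement is measured as the coefficient of variation
    (standard deviation divided by the mean) staying within `tolerance`.
    """
    if len(results) < 3:
        return False  # not enough independent runs yet
    return pstdev(results) / mean(results) <= tolerance

# Three volunteers report the same test (hypothetical scores).
print(is_consistent([100.0, 101.0, 99.5]))   # close agreement
print(is_consistent([100.0, 120.0, 99.0]))   # one outlier, so ask for a rerun
```

Anything that fails the check goes back out to be run again rather than into the database.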

Regards,
Dean


>
>L