Logic and Reason in Benchmarking

Searching for Solutions

The complexity of benchmarking becomes a bit clearer when we look at a definition of the term ‘system’. A dictionary might define it as “Any organized assembly of resources and procedures united and regulated by interaction or interdependence to accomplish a set of specific functions.” The terms ‘interaction’ and ‘interdependence’ are critical: they add a great deal of complexity to the exercise, which in turn drives up the cost of accurate analysis. We really need to look at computer systems as being composed of layers: the hardware, operating system, drivers, applications, data and the user. Each of these layers contributes something to overall performance, including the way individual users actually operate the system.

In a large business environment, it may be cost-effective to bring in experts who can accurately profile and measure the specific workload for that business. For the average user, however, the cost would be prohibitive, so the measurements are necessarily going to be more generic in nature. Making them generic allows the cost to be spread among a greater number of people, but it also makes the results less applicable to any one user. The greater the percentage of the population being profiled, the more generalized the benchmark becomes and the less it applies to any one user. This is the dilemma.

Might the answer be to do research and categorize usage by various market segments, and then build benchmarks for each? Once again, the cost factor raises its ugly head. Certainly no single benchmark can provide all of the answers, but randomly throwing dozens of benchmarks at a component or system provides little more information. If a dozen benchmarks are run, nine of which are games, and seven of those use the same game engine, of what use is that to those who don’t play games? By carefully placing existing benchmarks into specific categories, perhaps we can at least provide some better information for analyzing results. This is one of the goals for this column.
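To make the dozen-benchmark example concrete, here is a minimal sketch of how one might tally a suite by category before trusting its overall result. The benchmark names, category labels and engine labels are all hypothetical, chosen only to match the nine-games, seven-on-one-engine scenario described above.

```python
from collections import Counter

# Hypothetical suite of 12 benchmarks as (name, category, engine).
# All names and labels are illustrative assumptions, not real products.
suite = (
    [(f"game_{i}", "games", "engine_a") for i in range(1, 8)]
    + [("game_8", "games", "engine_b"), ("game_9", "games", "engine_c")]
    + [
        ("office_1", "office", None),
        ("encode_1", "media", None),
        ("compile_1", "development", None),
    ]
)


def category_shares(benchmarks):
    """Return the fraction of the suite that falls into each category."""
    counts = Counter(category for _, category, _ in benchmarks)
    total = len(benchmarks)
    return {category: n / total for category, n in counts.items()}


def largest_engine_share(benchmarks):
    """Return (engine, fraction of the whole suite) for the most common engine."""
    counts = Counter(engine for _, _, engine in benchmarks if engine is not None)
    engine, n = counts.most_common(1)[0]
    return engine, n / len(benchmarks)


shares = category_shares(suite)
engine, engine_share = largest_engine_share(suite)

print(shares)
print(engine, round(engine_share, 2))
```

In this made-up suite, games account for three quarters of the tests and a single engine for more than half, so an unweighted average would mostly describe how that one engine behaves. Reporting results per category is one simple way to keep that skew visible.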

There is money in benchmarking, to be sure, but not from the consumer. Vendors and publications are the source of revenue, and this is where the danger lies, because the target audience for benchmark developers is not the end user. Since end users are generally not willing to pay the price of real benchmarking, they are left with whatever the vendors and publications are willing to pay for, and in some cases specifically ask for. Industry consortiums such as SPEC and BAPCo are supposed to minimize these problems, but politics and money are always going to be driving forces.

In the end, “There Ain’t No Such Thing As A Free Lunch”. If we, the consumers, want accurate benchmarks, we must pay either with our dollars or with our diligence. Uncovering every bug and corruption issue is probably far beyond my capabilities, but what I should be able to do is look at the issues, evaluate them as objectively and honestly as I am able, run my own tests and present the facts as clearly and completely as possible. In some cases, the conclusion may very well be “I just don’t know”, because there isn’t enough information available. To my mind, though, the effort is necessary to even begin making the situation any better.

