Some Background
For those who may not be familiar with it, COSBI (Comprehensive Open Source Benchmark Initiative) is Van Smith’s effort to wrest control of benchmark design from corporate interests and place it in the hands of the user community (please refer to Van’s Hardware Journal for more information). While this is certainly a laudable goal, it is also fraught with potential problems. Thus, when Van’s Hardware Journal published some preliminary results of the Quick CPU Test (here and here), a question that had been nagging me for quite some time was brought to the fore: just how much of an effect do different compilers have on the performance of a program, particularly across platforms?
The issue for me is that I need to have some kind of frame of reference in order to really understand something. Without a foundation, or basis, on which to compare things, I am left wondering what it all means. For example, if a manufacturer tells me that their new electric car will run for 6 hours on a single charge, I don’t really know what that means for me. Is that good? How does it compare to my gasoline powered car? I can answer the question by thinking about how much time I spend driving my car each week, and how many times I fill up with gas. This gives me a common frame of reference, and now I can feel comfortable in evaluating what ‘6 hours on a charge’ really means in terms of cost and convenience.
With my very limited experience with Windows programming and cross-platform compilers, I found myself wondering how to interpret the results. Those with a lot of experience likely have a good ‘gut feel’ for such things, but asking around turned up some differences of opinion. Some said Delphi is as good as any C compiler, and others said not. Some suggested that certain compilers would produce better performing code on specific platforms than others.
I am not comfortable with assertions that cannot be verified with some hard facts, and this seemed to be one of those situations. Many people have opinions, but nobody seemed to be able to back them up with solid evidence… so I continued to press for answers.
I did find a few references on the web on Delphi vs. C, and there seemed to be little difference in the resulting executables when the programmer optimized the code properly (see this page for an example), but the question of whether certain compilers favor one architecture over another remained open. Thanks to Brian Neal (see this post in Ace’s Hardware Forum), this second question could be answered, at least to some degree.
On the website referenced in Brian’s post, there are executables generated by a variety of compilers, all solving the same problem with essentially the same algorithm. For those interested in seeing whether these are really optimally implemented, the source is provided for each, but that wasn’t really my interest. I just wanted to see whether each executable would hold the same performance relative to the others across all platforms.
Test Setup, Results and Conclusion
What I did was to set up three systems using as many of the same components as possible:
- Matrox G550
- Matrox 5.12.01.1200 driver
- 1024×768, 16-bit color
- IBM 75GXP 45GB HDD
- Windows 2000 SP2
The platform specific components were:
- AOpen AK77Pro(A)-133, Duron 1.2GHz (133MHz x 9), Crucial Technology PC2100 DDR, CL2
- AOpen AX34-U, PIII 1.2GHz (CuMine, 133MHz x 9), Crucial Technology PC133 SDRAM, CL2
- AOpen AX4BS, P4 1.2GHz (Willamette, 100MHz x 12), Crucial Technology PC133 SDRAM, CL2
The point of making these systems as similar as possible was not to identify whether one platform performed better than another on a per-clock basis, but simply to make things as ‘equal’ as possible. I did run a test with each executable on an Athlon XP (at 1.2GHz), and the results were almost identical to the Duron results, so I didn’t bother to duplicate the entire set. I also might have used a DDR or DRDRAM based P4, but again, my intention here was not to compare the platforms, but the compilers. It just so happens that I had this P4 system set up and ready to perform the tests (as part of another benchmark analysis I am performing).
I then ran 6 of the executables provided using the external timer program available from the same site. The parameter I used for all runs was to generate 10,000 digits of pi. The executables I chose were from the following compilers:
- Borland C++ Builder 6 (bcbcpp.exe)
- Borland Delphi 6 Update 2 (delphipi.exe)
- Metrowerks CodeWarrior 6 (cwc.exe and cwcpp.exe)
- Microsoft Visual Studio .NET (vsc.exe and vscpp.exe)
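The internals of the external timer program are not described here, so the following is only a minimal sketch of the general idea: measure wall-clock time around a child process. The function name `time_command`, the choice of milliseconds, and the stand-in workload are all assumptions made for illustration; the actual command line of the pi executables is not given in this article.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv as a child process and return elapsed wall-clock milliseconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    # Stand-in workload (hypothetical): a trivial Python child process.
    # A real run would substitute one of the pi executables and its
    # digit-count argument.
    stand_in = [sys.executable, "-c", "sum(range(100000))"]
    runs = [time_command(stand_in) for _ in range(10)]
    print(sorted(round(r) for r in runs))
```

Timing from outside the process includes process start-up cost, which is one reason to keep the workload long relative to that overhead.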
I ran each one 10 times to gauge the consistency of the timer, threw out the highest and lowest scores, and averaged the 8 remaining scores. Generally, these 8 scores were pretty close. The complete table of results is included at the bottom of this article.
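The averaging step can be sketched in a few lines. The helper name `trimmed_average` is an assumption for illustration; the ten scores are the Duron runs of bcbcpp.exe copied from the complete table of results at the end of this article.

```python
def trimmed_average(scores):
    """Drop one highest and one lowest score, then average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores")
    kept = sorted(scores)[1:-1]
    return sum(kept) / len(kept)


# Ten Duron runs of bcbcpp.exe, copied from the complete table of results.
duron_bcbcpp = [19658, 19658, 19658, 19668, 19668,
                19668, 19669, 19669, 19688, 19728]

print(round(trimmed_average(duron_bcbcpp)))  # 19668
```

The rounded value matches the Duron entry for bcbcpp.exe in the summary table.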
Here are the results:

| | bcbcpp.exe | cwc.exe | cwcpp.exe | DelphiPi.exe | vsc.exe | vscpp.exe |
|---|---|---|---|---|---|---|
| PIII | 17316 | 21347 | 23448 | 23062 | 14525 | 15633 |
| Duron | 19668 | 24281 | 23260 | 28044 | 18075 | 17870 |
| P4 | 35209 | 42283 | 40949 | 37123 | 27199 | 33328 |
Several things seem evident to me…
- Every executable except one (cwcpp.exe, which was marginally faster on the Duron) performed better on the PIII than on the other two platforms. The question is whether this is a compiler issue, or because the programmer used a coding style that favors the PIII over the others.
- The Delphi scores are generally worse than the Microsoft C and Borland C results, but this may also be due to the programmer being more familiar with optimizing C code than Delphi. One of the links earlier in this article seems to show that good programming style is more important than which compiler is used, at least between Microsoft C and Borland Delphi. It would be interesting to see some similar tests between various C compilers.
- The Metrowerks compiler seems to produce code that the P4 really doesn’t like at all. The question, of course, is whether this is inherent in the compiler, or whether there were some differences in flag settings when making the executables.
- The P4 scores are much worse for all compilers using this code. What seems interesting is that Delphi has the smallest relative spread across the three platforms and the smallest relative slowdown from PIII to P4 (about 1.6x), but the largest absolute gap between PIII and K7 (the Duron).
- The Microsoft compiler seems to produce the best performing code for all platforms, and in most cases it is significantly faster. This may explain why Microsoft’s compiler seems to be the one used by the majority of commercial developers. Though Intel’s compiler is regarded as the best in this regard, it also seems to have many problems; however, it has recently been suggested that version 6 is much better. In the meantime, there seem to be a few complaints that Microsoft’s compilers are getting a little worse, performance-wise.
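One way to check these observations is to express each executable's averaged score as a multiple of its PIII score. The sketch below does that with the figures from the results table; the dictionary layout and the helper name `relative_to` are assumptions for illustration, and lower scores are read as better, as in the discussion above.

```python
# Averaged scores from the results table (lower is read as better).
results = {
    "bcbcpp.exe":   {"PIII": 17316, "Duron": 19668, "P4": 35209},
    "cwc.exe":      {"PIII": 21347, "Duron": 24281, "P4": 42283},
    "cwcpp.exe":    {"PIII": 23448, "Duron": 23260, "P4": 40949},
    "DelphiPi.exe": {"PIII": 23062, "Duron": 28044, "P4": 37123},
    "vsc.exe":      {"PIII": 14525, "Duron": 18075, "P4": 27199},
    "vscpp.exe":    {"PIII": 15633, "Duron": 17870, "P4": 33328},
}


def relative_to(baseline, scores):
    """Express each platform's score as a multiple of the baseline platform."""
    return {platform: score / scores[baseline]
            for platform, score in scores.items()}


for exe, scores in results.items():
    ratios = relative_to("PIII", scores)
    print(f"{exe:13s} Duron {ratios['Duron']:.2f}x  P4 {ratios['P4']:.2f}x")
```

On these figures, DelphiPi.exe has the smallest PIII-to-P4 ratio (about 1.61x), vscpp.exe the largest (about 2.13x), and cwcpp.exe is the only executable whose Duron ratio falls below 1.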
As is the case in many intellectual endeavors, answers seem to beget more questions.
With only one example, it is difficult to come to any definite conclusions about the relative performance of executables produced by these compilers. I wonder how many would be interested in participating in an ‘optimization’ contest to prove their programming skills, and to promote their favorite compiler? It does appear that a good programmer can wring about the same performance from a Delphi program as from one written in C, though we still don’t know how such optimizations affect the cross-platform performance. Another interesting contest, perhaps?
As for COSBI, only time will tell whether the initiative will achieve its intended goal, but the question in my mind of whether Delphi is good enough to create a reasonable benchmark seems to have been answered. Also, while I have seen a few other compiler comparisons, all of them ignore the cross-platform issue. This wasn’t an issue even two years ago, but it certainly is today.
Complete Table of Results
Results sorted from lowest to highest (not in the order they were actually generated)
| | bcbcpp.exe | cwc.exe | cwcpp.exe | DelphiPi.exe | vsc.exe | vscpp.exe |
|---|---|---|---|---|---|---|
| Duron | 19658 | 24275 | 23253 | 28021 | 18056 | 17865 |
| | 19658 | 24275 | 23253 | 28030 | 18065 | 17865 |
| | 19658 | 24275 | 23253 | 28031 | 18066 | 17866 |
| | 19668 | 24275 | 23253 | 28040 | 18075 | 17866 |
| | 19668 | 24285 | 23263 | 28040 | 18076 | 17866 |
| | 19668 | 24285 | 23263 | 28041 | 18076 | 17866 |
| | 19669 | 24285 | 23264 | 28051 | 18076 | 17875 |
| | 19669 | 24285 | 23264 | 28060 | 18076 | 17876 |
| | 19688 | 24285 | 23264 | 28060 | 18086 | 17876 |
| | 19728 | 24315 | 23274 | 28071 | 18086 | 17915 |
| P4 | 35172 | 42250 | 40875 | 37109 | 27187 | 33313 |
| | 35188 | 42250 | 40891 | 37109 | 27187 | 33313 |
| | 35203 | 42281 | 40922 | 37110 | 27187 | 33328 |
| | 35203 | 42281 | 40938 | 37110 | 27188 | 33328 |
| | 35203 | 42281 | 40953 | 37125 | 27188 | 33328 |
| | 35218 | 42282 | 40953 | 37125 | 27188 | 33328 |
| | 35219 | 42297 | 40968 | 37125 | 27203 | 33328 |
| | 35219 | 42297 | 40984 | 37125 | 27219 | 33329 |
| | 35219 | 42297 | 40985 | 37156 | 27234 | 33344 |
| | 35265 | 42312 | 41015 | 37156 | 27235 | 33359 |
| PIII | 17305 | 21321 | 23433 | 23053 | 14520 | 15623 |
| | 17315 | 21321 | 23434 | 23053 | 14520 | 15623 |
| | 17315 | 21321 | 23443 | 23053 | 14521 | 15632 |
| | 17315 | 21330 | 23443 | 23053 | 14521 | 15632 |
| | 17315 | 21341 | 23444 | 23063 | 14521 | 15632 |
| | 17315 | 21361 | 23444 | 23063 | 14521 | 15633 |
| | 17315 | 21361 | 23444 | 23064 | 14521 | 15633 |
| | 17315 | 21371 | 23444 | 23073 | 14521 | 15633 |
| | 17325 | 21371 | 23484 | 23073 | 14551 | 15633 |
| | 17405 | 21381 | 23504 | 23103 | 14561 | 15673 |