Overclocking For Performance

Results of Overclocking the CPU

When the CPU multiplier is increased, the processor speed goes up without affecting the system bus speed. This method may improve throughput, but will also increase the power requirements and therefore increase the heat generated. This in turn may shorten the life of the processor, increase the chances of data corruption and failed operations, and may even damage the motherboard, depending upon how much current is drawn or how much heat is generated.

It should be fairly obvious that merely increasing the CPU multiplier yields diminishing returns as the multiplier goes up. This is because the delivery of data (via the memory bus) stays constant while the processor's demand for data increases. The degree to which this effect occurs depends greatly on the hit ratio in L1 and L2 cache, as well as on where that cache resides and how closely its speed matches that of the processor. For systems with the L2 cache tied to the system bus (such as Socket 7 based systems), the L2 cache and memory bus speeds do not increase at all when the multiplier is increased, which means that the diminishing returns are most noticeable here. For Pentium Pro and Pentium II based systems, the mismatch will be much less, because the L2 cache speed is tied directly to the processor speed rather than the system bus speed. Even with a full speed cache, however, the performance increase declines as the multiplier increases, simply because it is impossible to achieve a 100% hit ratio in cache.
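The reasoning above can be sketched with a toy model. It assumes, purely for illustration, that run time splits into a part that scales with the processor clock and a part spent waiting on the memory bus that does not scale at all. The 30% memory share and the name `relative_speed` are hypothetical and are not measurements from this article.

```python
def relative_speed(multiplier, base_multiplier=2.0, memory_fraction=0.30):
    """Speed relative to the base multiplier under a simple two-part model.

    memory_fraction is the share of run time at the base multiplier that is
    spent waiting on the memory bus (an assumed value, not a measurement).
    """
    cpu_fraction = 1.0 - memory_fraction
    # CPU-bound time shrinks as the clock rises; memory-bound time does not.
    run_time = cpu_fraction * (base_multiplier / multiplier) + memory_fraction
    return 1.0 / run_time


multipliers = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
speeds = [relative_speed(m) for m in multipliers]

# Percent gain of each setting over the one before it.
step_gains = [(new / old - 1.0) * 100 for old, new in zip(speeds, speeds[1:])]

for m, s in zip(multipliers, speeds):
    print(f"{m:.1f}x: relative speed {s:.2f}")
print("gain per step (%):", [round(g, 1) for g in step_gains])
```

In this model each step adds less than the one before it, and a larger `memory_fraction` (a lower cache hit ratio, or a slower cache) makes the curve flatten sooner.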

The following charts illustrate this point. Three processors were chosen for this illustration based upon their ability to accept various multipliers, their ability to overclock, and their differences in cache implementation. The other components were chosen to be as similar as possible to each other. For the Socket 7 tests, the following configuration was used: AOpen AX59Pro with 512K cache, 32MB Crucial Technology PC100 SDRAM, W.D. 34300 UDMA HDD, Toshiba XM-5704B CDROM and a Diamond Stealth II S220 with v4.10.01.101 drivers. The Pentium II tests used the same components, except that an ABIT BX6 motherboard was used. The Winstone 98 Business suite was used to generate the performance scores. To test the Pentium Pro, the memory had to be EDO (60ns) and the motherboard was an M Tech R651. All other components were identical to those used in the Pentium II and Socket 7 tests.

Winstone 98 Scores for various multipliers at 66MHz
| Intel PII (Klamath) | Winstone98 Business (Win95) | Winstone98 Business (WinNT) | High-End Winstone (WinNT) | Quake 1.06 (DOS 7) |
|---|---|---|---|---|
| 2.0×66 (133MHz) | 11.4 | 11.8 | 12.7 | 9.3 fps |
| 2.5×66 (166MHz) | 15.9 | 16.0 | 15.9 | 10.6 fps |
| 3.0×66 (200MHz) | 17.9 | 17.4 | 17.5 | 11.8 fps |
| 3.5×66 (233MHz) | 19.5 | 18.8 | 18.3 | 12.6 fps |
| 4.0×66 (266MHz) | 20.9 | 19.9 | 19.4 | 13.2 fps |
| 4.5×66 (300MHz) | 22.1 | 20.8 | 20.0 | 13.4 fps |

Winstone 98 Scores for various multipliers at 66MHz
| Intel Pentium Pro | Winstone98 Business (Win95) | Winstone98 Business (WinNT) | High-End Winstone (WinNT) | Quake 1.06 (DOS 7) |
|---|---|---|---|---|
| 2.0×66 (133MHz) | 13.4 | 14.9 | 15.1 | 8.7 fps |
| 2.5×66 (166MHz) | 15.5 | 16.5 | 16.7 | 10.6 fps |
| 3.0×66 (200MHz) | 17.3 | 18.0 | 17.8 | 11.0 fps |
| 3.5×66 (233MHz) | 18.8 | 18.9 | 18.8 | 11.4 fps |

Winstone 98 Scores for various multipliers at 66MHz
| AMD K6-2 | Winstone98 Business (Win95) | Winstone98 Business (WinNT) | High-End Winstone (WinNT) | Quake 1.06 (DOS 7) |
|---|---|---|---|---|
| 2.0×66 (133MHz) | 14.5 | 15.0 | 14.7 | 9.9 fps |
| 2.5×66 (166MHz) | 16.1 | 16.6 | 16.0 | 11.1 fps |
| 3.0×66 (200MHz) | 17.5 | 17.5 | 17.1 | 12.1 fps |
| 3.5×66 (233MHz) | 18.4 | 18.3 | 17.8 | 12.7 fps |
| 4.0×66 (266MHz) | 19.4 | 19.2 | 18.5 | 13.5 fps |
| 4.5×66 (300MHz) | 20.0 | 19.6 | 19.3 | 13.8 fps |
| 5.0×66 (333MHz) | 20.6 | 20.3 | 19.9 | 14.2 fps |
| 5.5×66 (366MHz) | DNF | 20.3 | 19.5 | 14.4 fps |

Note that in almost every case the percentage increase in throughput over the prior setting shrinks as the multiplier increases. The size and speed of the cache also affect the performance improvement. While the numbers are not published here, tests with the Pentium Classic showed a smaller improvement than the AMD K6-2 using all the same hardware and software. This is because of its smaller L1 cache (16KB vs 64KB for the AMD K6-2): data must be retrieved from L2 cache more often, so the speed mismatch between processor and memory matters more. The worst case is the K6-2 at 5.5x, where the difference from 5.0x is essentially non-existent.
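The step-to-step percentages can be recomputed directly from the table. The short sketch below does so for the AMD K6-2 Winstone98 Business (Win95) column. The 5.5x row is left out because that run did not finish, and the helper name `step_gains` is just an illustrative choice.

```python
# Winstone98 Business (Win95) scores for the AMD K6-2, copied from the table.
k6_2_win95 = {2.0: 14.5, 2.5: 16.1, 3.0: 17.5, 3.5: 18.4,
              4.0: 19.4, 4.5: 20.0, 5.0: 20.6}


def step_gains(scores):
    """Percent gain of each multiplier setting over the previous one."""
    ordered = sorted(scores.items())
    return {
        mult: (score / prev_score - 1.0) * 100
        for (_, prev_score), (mult, score) in zip(ordered, ordered[1:])
    }


gains = step_gains(k6_2_win95)
for mult, gain in gains.items():
    print(f"{mult:.1f}x: +{gain:.1f}% over the previous setting")
```

The gains fall from about 11% at 2.5x to about 3% at 5.0x. The one exception in this column is the step to 4.0x, which gains slightly more than the step to 3.5x, hence "almost every case".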

The Pentium II and Pentium Pro suffered less from this effect because their L2 caches run at half and full CPU speed, respectively. However, even with a full speed L2 cache the percentage improvement dropped a bit with each successive multiplier increase, even when measured relative to the percentage increase in processor speed. For example, at 166MHz (a 24.8% increase over 133MHz) the Pentium Pro's performance increase was about 15.7% (roughly 63% of the processor speed increase), whereas at 233MHz (a 16.5% increase over 200MHz) the performance improvement was 8.7% (53% of the processor speed increase). This is apparently due to the cache hit ratio, which can never be 100%. Had the L2 cache on the Pentium Pro been 512K instead of 256K, the effect might have been lessened.
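The Pentium Pro comparison above is simple arithmetic on the table, sketched below for the Winstone98 Business (Win95) column. It uses the nominal clock speeds from the table labels and the unrounded score ratios, so the last digit may differ slightly from figures quoted in the text. The function names are illustrative only.

```python
# Pentium Pro, Winstone98 Business (Win95), from the table on this page.
clock_mhz = {2.0: 133, 2.5: 166, 3.0: 200, 3.5: 233}
score = {2.0: 13.4, 2.5: 15.5, 3.0: 17.3, 3.5: 18.8}


def gain_pct(new, old):
    """Percent increase of new over old."""
    return (new / old - 1.0) * 100


def scaling(prev_mult, mult):
    """Return (clock gain %, score gain %, score gain as % of clock gain)."""
    clock_gain = gain_pct(clock_mhz[mult], clock_mhz[prev_mult])
    score_gain = gain_pct(score[mult], score[prev_mult])
    return clock_gain, score_gain, score_gain / clock_gain * 100


for prev_mult, mult in [(2.0, 2.5), (3.0, 3.5)]:
    clock_gain, score_gain, share = scaling(prev_mult, mult)
    print(f"{clock_mhz[prev_mult]} -> {clock_mhz[mult]} MHz: "
          f"clock +{clock_gain:.1f}%, score +{score_gain:.1f}%, "
          f"{share:.0f}% of the clock gain")
```

The share of the clock increase that shows up as measured performance falls between the two steps, which is the effect the paragraph describes.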

One of the ‘rules-of-thumb’ for performance tuning is that approximately a 10% improvement in throughput must be achieved before it becomes noticeable to the user. At a multiplier of 3.5 and up, every one of these configurations improved less than 10% over the previous multiplier. Eventually, it becomes necessary to push the processor several ‘steps’ to gain enough improvement to be of any real benefit. What is actually happening here is that the processor is ‘spinning’ faster, but the data is not being delivered any faster (or not keeping up with the increased processor speed) so the processor, in effect, becomes less efficient.
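As a rough illustration of the 10% rule of thumb, the sketch below counts how many multiplier steps the Pentium II needs before the Winstone98 Business (Win95) score has risen by at least 10% from a given starting point. The scores come from the table on this page. The helper `steps_to_notice` and its treatment of the threshold are assumptions made for the example.

```python
# Winstone98 Business (Win95) scores for the Intel PII (Klamath), from this page.
pii_win95 = [(2.0, 11.4), (2.5, 15.9), (3.0, 17.9),
             (3.5, 19.5), (4.0, 20.9), (4.5, 22.1)]

NOTICEABLE_PCT = 10.0  # the rule-of-thumb threshold quoted above


def steps_to_notice(scores, start_index, threshold=NOTICEABLE_PCT):
    """Count multiplier steps from start_index until the gain reaches threshold.

    Returns None if no tested setting reaches it.
    """
    _, base = scores[start_index]
    for steps, (_, value) in enumerate(scores[start_index + 1:], start=1):
        if (value / base - 1.0) * 100 >= threshold:
            return steps
    return None


for index, (mult, _) in enumerate(pii_win95[:-1]):
    print(f"starting at {mult:.1f}x: {steps_to_notice(pii_win95, index)} step(s)")
```

Starting from 2.0x or 2.5x a single step is enough, from 3.0x or 3.5x it takes two, and from 4.0x none of the tested settings reaches the threshold.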

It should now be apparent that the optimum design would be to have the memory bus running at the same speed as the processor, though currently this is not possible. The introduction of Super Socket 7 and P-II chipsets has, however, provided a way to at least minimize the mismatch. Because of issues regarding PCI and AGP bus speeds with current chipsets, the tests and recommendations here include only 66MHz and 100MHz bus speeds (see previous page for results of overclocking the PCI bus).

