A Better Crystal Ball


A Model of Performance Scaling with Processor Clock Frequency

Let’s consider an ideal processor frequency scaling experiment. We measure the performance of a computer system on a specific benchmark, change only the clock frequency, and measure again. The executable is kept the same (built with the same compiler version, compiler flags, libraries, etc.) and so is the hardware (same chipset, memory configuration, timings, etc.). The first experiment measures performance P1 at frequency F1; the second measures performance P2 at frequency F2.

P1 = F1 / (CPI1 * N1)
and P2 = F2 / (CPI2 * N2)

We will assume the program is not influenced by real-time factors, so the number of instructions executed is the same in both runs: N1 = N2 = N. Let’s break CPI (cycles per instruction) down into two components. The architectural component is independent of the average off-chip memory access time measured in processor clock cycles. The off-chip component directly tracks that access time:

CPI = CPI_architectural + CPI_off-chip

Let CPI_architectural = CPI * (1 – m)
and CPI_off-chip = CPI * m

Here m is a parameter that varies between 0 and 1 and indicates the degree to which a program is limited by off-chip memory accesses. When m = 0, changes in the average memory access time in processor clock cycles do not affect CPI, and performance scales perfectly with increases in processor clock frequency. This situation happens with small benchmarks like Dhrystone, whose code and data fit within a processor’s on-chip caches. When m = 1, performance is entirely limited by off-chip memory accesses, and therefore performance does not vary with clock frequency.

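Now let’s express CPI1 and CPI2 in terms of the CPI measured at F1. The off-chip access time is fixed in absolute terms, so when measured in clock cycles it grows in proportion to F2 / F1: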

CPI1 = CPI * (1 – m) + CPI * m
and CPI2 = CPI * (1 – m) + CPI * m * F2 / F1
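To make the two CPI expressions concrete, here is a minimal Python sketch. The function name and the sample numbers are hypothetical and chosen only for illustration; they are not measurements from any real system.

```python
def cpi_at_frequency(cpi1, m, f1, f2):
    """Estimate CPI at frequency f2 from the CPI measured at f1.

    m is the fraction of cpi1 spent waiting on off-chip memory (0 <= m <= 1).
    """
    architectural = cpi1 * (1.0 - m)   # does not change with clock frequency
    off_chip = cpi1 * m * (f2 / f1)    # same latency in time, more cycles at a faster clock
    return architectural + off_chip


# Hypothetical example: CPI of 2.0 at 2.0 GHz, m = 0.25, clock raised to 3.0 GHz
print(cpi_at_frequency(2.0, 0.25, 2.0, 3.0))  # prints 2.25
```

With m = 0 the function returns the original CPI unchanged, and with m = 1 the CPI grows by the full factor F2 / F1, which exactly cancels the frequency increase.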

Now we can express performance as:

P1 = F1 / (CPI * (1 – m) * N + CPI * m * N)
and P2 = F2 / (CPI * (1 – m) * N + CPI * m * F2 / F1 * N)

If we define K = 1 / (CPI * N) then we can simplify the two equations to

P1 = K * F1 / ((1 – m) + m)
and P2 = K * F2 / ((1 – m) + m * F2 / F1)

Notice that the ((1 – m) + m) term in the denominator of the expression for P1 is simply 1. Therefore K = P1 / F1 and:

P2 = P1 * (F2 / F1) / ((1 – m) + m * F2 / F1)

To solve for m, we need to have performance measurements at two different frequencies (P1, F1, P2, F2) with all other software and hardware factors held constant. Doing the math, the equation for m is:

m = (P1 / P2 * F2 / F1 – 1) / (F2 / F1 – 1)
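The algebra behind this is short: rearranging the expression for P2 gives (1 – m) + m * F2 / F1 = P1 / P2 * F2 / F1, and collecting the m terms gives 1 + m * (F2 / F1 – 1) = P1 / P2 * F2 / F1. A small Python sketch of the calculation follows. The function name and the benchmark scores are made up for illustration.

```python
def memory_bound_fraction(p1, f1, p2, f2):
    """Solve for m from two measurements (p1 at f1, p2 at f2).

    Everything except the clock frequency is assumed to be held constant.
    """
    if f1 == f2:
        raise ValueError("the two measurements must use different frequencies")
    ratio = f2 / f1
    return ((p1 / p2) * ratio - 1.0) / (ratio - 1.0)


# Hypothetical example: a score of 100 at 2.0 GHz and 125 at 3.0 GHz
m = memory_bound_fraction(100.0, 2.0, 125.0, 3.0)
print(round(m, 3))  # prints 0.4
```

A result outside the range 0 to 1 would suggest measurement noise or that something other than the clock frequency changed between the two runs.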

So now we have a model to predict performance Px at frequency Fx:

Px = P1 * (Fx / F1) / ((1 – m) + m * Fx / F1)
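As a quick sanity check on the model, this Python sketch evaluates Px for a few values of m. The function name and the numbers are hypothetical and only illustrate the shape of the curve.

```python
def predict_performance(p1, f1, m, fx):
    """Predict performance at frequency fx from a baseline (p1 at f1) and m."""
    ratio = fx / f1
    return p1 * ratio / ((1.0 - m) + m * ratio)


# Hypothetical baseline: a score of 100 at 2.0 GHz, clock doubled to 4.0 GHz
for m in (0.0, 0.4, 1.0):
    print(m, round(predict_performance(100.0, 2.0, m, 4.0), 1))
# prints:
# 0.0 200.0
# 0.4 142.9
# 1.0 100.0
```

The two extremes match the earlier description: with m = 0 performance doubles along with the clock, and with m = 1 it does not move at all. For values in between, the formula implies that Px approaches P1 / m as Fx grows without bound, so under this model a program with m = 0.4 can never exceed 2.5 times its baseline performance through clock frequency alone.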

