Unless otherwise stated here, all tests are the arithmetic average of five runs.
Cachemem 2.6: bandwidth figures are the average of three runs; latency figures are the worst-case run. Variance is usually only 0.2MB/s (bandwidth) or 1 clock cycle (latency).
The memory test is run 15 times, and the modal and peak scores are reported as the final result. If there is more than one modal result, the modal scores are averaged. The peak score is the maximum of the combined bandwidth tests; if there is more than one peak, I average the integer and floating-point results and record those averages as the integer and floating-point scores.
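The single-stream case of this procedure can be sketched as follows. The run scores here are hypothetical, purely to illustrate the modal and peak selection:

```python
from statistics import mean, multimode

# Hypothetical bandwidth scores (MB/s) from 15 runs of the memory test.
runs = [812, 815, 812, 810, 812, 815, 812, 811, 815, 812, 815, 812, 815, 812, 815]

modes = multimode(runs)     # every value that occurs most often
modal_score = mean(modes)   # if there is a tie, average the modal scores
peak_score = max(runs)      # the peak is simply the highest run

print(f"modal: {modal_score} MB/s, peak: {peak_score} MB/s")
```

The tie-breaking case (averaging separate integer and floating-point results) would follow the same pattern, applied per stream.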
Why do I take the modal score? From the SANDRA FAQ:
…”Q: How is Sandra’s Memory Benchmark different from STREAM?
A: STREAM 2.0 uses static data (about 12M) – Sandra uses dynamic data (around 40-60% of physical system RAM). This means that on computers with fast memory Sandra may yield lower results than STREAM. It’s not feasible to make Sandra use static RAM – since Sandra is much more than a benchmark, thus it would needlessly use memory.
A major difference is that Sandra’s algorithm is multi-threaded on SMP systems. This works by splitting the arrays and letting each thread work on its own bit. Sandra creates a thread for each CPU in the system and assigns each thread to an individual CPU.
Another difference is the aggressive use of scheduling/overlapping of instructions in order to maximise memory throughput even on "slower" processors. The loops should always be memory bound rather than CPU bound on all modern processors.
The other major difference is the use of alignment. Sandra dynamically changes the alignment of streams until it finds the best combination, then it repeatedly tests it to estimate the maximum throughput of the system. You can change the alignment in STREAM and recompile – but generally it is set to 0 (i.e. no).“…
I take this to mean that there can be variance from run to run. When SANDRA returns the same value in multiple runs, that value must therefore be either a viable result or the most likely one – which is why I report the modal score.
The Evolva documentation states that the demo should be run for a few iterations to allow the scores to “settle down”. Each test was allowed to loop five times and the figures recorded were the minimum and average frame rates.
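Recording those figures amounts to taking the minimum and mean of the looped results. A minimal sketch, assuming one score per loop iteration (the numbers are hypothetical):

```python
from statistics import mean

# Hypothetical frame rates (fps) from the five loops of the Evolva demo.
fps = [41.2, 43.8, 44.1, 44.0, 43.9]

print(f"minimum: {min(fps):.1f} fps, average: {mean(fps):.1f} fps")
```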
Winstone recommended testing procedures are:
- run five times,
- discard the first run,
- report the highest score,
- repeat the test if variance is greater than 3%.
My test methodology is:
- run five times
- average scores
- repeat test if variance is greater than 4%
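The retest rule above can be sketched as follows. Note an assumption: "variance" here is read as the spread of the scores as a percentage of their mean ((max − min) / mean), not the statistical variance, and the scores themselves are hypothetical:

```python
from statistics import mean

# Hypothetical scores from five runs of a benchmark.
scores = [30.1, 31.0, 30.8, 30.9, 31.1]

# Assumption: "variance" means spread relative to the mean, in percent.
spread_pct = (max(scores) - min(scores)) / mean(scores) * 100

if spread_pct > 4:
    print("spread exceeds 4% -- repeat the test")
else:
    print(f"accepted: average score {mean(scores):.1f}")
```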