By: none (none.delete@this.none.com), October 3, 2015 1:04 am
Room: Moderated Discussions
Wilco (Wilco.Dijkstra.delete@this.ntlworld.com) on October 2, 2015 5:06 pm wrote:
[...]
> GCC does do a lot of function calls. Not sure whether there are performance counters that can count
> load vs LDP, but a static count should give a reasonable idea anyway given GCC is not loop heavy.
You don't need performance counters on real hardware for this kind of measurement; you can
use a fast simulator.
For 403.gcc compiled this way:
gcc-linaro-4.9-2015.02-3-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc -DSPEC_CPU_LP64 -DSPEC_CPU -Ofast -mcpu=cortex-a57 -static
The 9 inputs total ~947B instructions. Among them 201B are loads and 74B are stores.
Among these ld/st, ~36B are LDP and ~38B are STP. Most of them are memset/memcpy and
function prologues/epilogues.
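To put those counts in proportion, here is a small sketch of the arithmetic. It assumes the LDP count is included in the 201B loads and the STP count in the 74B stores (my reading of "among these ld/st"), and that each pair instruction is counted once. The variable names are only illustrative.

```python
# Dynamic instruction counts quoted above, in billions (approximate).
TOTAL = 947
LOADS = 201
STORES = 74
LDP = 36   # assumed to be a subset of LOADS
STP = 38   # assumed to be a subset of STORES


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal."""
    return round(100.0 * part / whole, 1)


shares = {
    "loads / all instructions": pct(LOADS, TOTAL),
    "stores / all instructions": pct(STORES, TOTAL),
    "LDP / loads": pct(LDP, LOADS),
    "STP / stores": pct(STP, STORES),
}

for name, value in shares.items():
    print(f"{name}: {value}%")
```

Under those assumptions, roughly 18% of the loads are LDP and about half of the stores are STP.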
> For clearing there is a special clear instruction - current cores clear 64-128
> bytes per instruction as fast as L1 cache can write back into L2.
There are ~22B dc zva in 403.gcc (assuming the instruction clears 64 bytes at a time).
gcc loves clearing memory :-)
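For a sense of scale, a rough sketch of what that implies, assuming ~22B is the number of executed dc zva instructions and each one clears 64 bytes (the real block size depends on the core):

```python
# Both values are assumptions taken from the figures above.
DC_ZVA_COUNT = 22 * 10**9   # ~22B executed dc zva instructions
BYTES_PER_ZVA = 64          # assumed block size cleared per instruction

total_bytes = DC_ZVA_COUNT * BYTES_PER_ZVA
print(f"{total_bytes / 1e12:.2f} TB zeroed across the 9 inputs")
```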