By: Gabriele Svelto (gabriele.svelto.delete@this.gmail.com), February 5, 2021 5:30 am
Room: Moderated Discussions
See this interesting post on MSRC's blog:
Building Faster AMD64 Memset Routines
In a nutshell, the author replaced a significant number of unpredictable branches with overlapping stores whose addresses are calculated in branch-less code. The result takes good advantage of the very good store buffer implementations and the efficient handling of unaligned accesses on modern x86 processors.
I found the article particularly interesting because in my testing of other highly data-dependent leaf functions - such as memcpy() and string manipulation functions - branch mispredictions always stood out as the biggest contributor to the time spent there. I definitely prefer this approach to the jump-happy versions that are more common (like the one in glibc, for example).
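To make the idea concrete, here is a minimal sketch of the overlapping-store trick in portable C. It is not the code from the MSRC article: the function name fill_8_to_31, the 8-byte store width and the 8..31 size range are my own choices for illustration. It relies on the assumption that the compiler lowers each fixed-size memcpy() to a single unaligned 8-byte store, which is what mainstream compilers typically do with optimization enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill n bytes (8 <= n <= 31) with `value` using four
 * 8-byte stores and no branch on n. Some stores overlap or hit the same
 * address twice; that is intentional. */
static void fill_8_to_31(unsigned char *dst, unsigned char value, size_t n)
{
    assert(n >= 8 && n <= 31);

    /* Replicate the byte into all eight lanes of a 64-bit word. */
    uint64_t pattern = 0x0101010101010101ULL * value;

    /* off is 0 for n in 8..15 and 8 for n in 16..31, computed from bit 4
     * of n instead of a conditional jump. */
    size_t off = (n & 16) >> 1;

    unsigned char *head = dst;         /* first 8 bytes of the range */
    unsigned char *tail = dst + n - 8; /* last 8 bytes of the range  */

    memcpy(head, &pattern, 8);         /* covers [0, 8)              */
    memcpy(tail, &pattern, 8);         /* covers [n-8, n)            */
    memcpy(head + off, &pattern, 8);   /* [8, 16) when n >= 16       */
    memcpy(tail - off, &pattern, 8);   /* [n-16, n-8) when n >= 16   */
}
```

For n below 16 the last two stores simply repeat the first two; for larger n they fill the middle. A full memset() would still need a dispatch on the size class plus a loop for large sizes, so this only shows the part that removes the data-dependent branches within one size class.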
Topic | Posted By | Date |
---|---|---|
An interesting approach to memset() | Gabriele Svelto | 2021/02/05 05:30 AM |
An interesting approach to memset() | Adrian | 2021/02/05 06:21 AM |
An interesting approach to memset() | rwessel | 2021/02/05 06:56 AM |
An interesting approach to memset() | Foo_ | 2021/02/05 09:40 AM |
An interesting approach to memset() | Wilco | 2021/02/05 02:15 PM |
An interesting approach to memset() | Anon | 2021/02/05 02:32 PM |
An interesting approach to memset() | Wilco | 2021/02/06 10:15 AM |
An interesting approach to memset() | Linus Torvalds | 2021/02/05 02:39 PM |
An interesting approach to memset() | Jörn Engel | 2021/02/07 02:48 PM |
An interesting approach to memset() | Linus Torvalds | 2021/02/07 03:14 PM |
An interesting approach to memset() | Jörn Engel | 2021/02/07 04:00 PM |
An interesting approach to memset() | Jörn Engel | 2021/02/08 11:32 AM |
An interesting approach to memset() | Andrey | 2021/02/08 01:46 PM |
An interesting approach to memset() | Andrey | 2021/02/08 02:22 PM |
An interesting approach to memset() | Jörn Engel | 2021/02/08 02:39 PM |
An interesting approach to memset() | Carson | 2021/02/07 07:02 PM |
An interesting approach to memset() | Mark Roulo | 2021/02/07 07:31 PM |
An interesting approach to memset() | Dummond D. Slow | 2021/02/07 08:02 PM |
An interesting approach to memset() | anon2 | 2021/02/07 10:22 PM |
An interesting approach to memset() | Anon | 2021/02/08 01:16 AM |
An interesting approach to memset() | anon2 | 2021/02/08 04:20 AM |
An interesting approach to memset() | Dummond D. Slow | 2021/02/08 06:50 PM |
An interesting approach to memset() | anon2 | 2021/02/09 06:04 PM |
An interesting approach to memset() | gallier2 | 2021/02/08 01:19 AM |
An interesting approach to memset() | anon2 | 2021/02/08 04:23 AM |
An interesting approach to memset() | Michael S | 2021/02/08 05:17 AM |
An interesting approach to memset() | Anon | 2021/02/08 06:43 AM |
An interesting approach to memset() | Adrian | 2021/02/08 10:34 AM |
An interesting approach to memset() | Jouni Osmala | 2021/02/08 03:37 AM |
An interesting approach to memset() | Anon | 2021/02/08 06:49 AM |
An interesting approach to memset() | anonymou5 | 2021/02/08 11:02 AM |
An interesting approach to memset() | Anon | 2021/02/08 11:18 AM |
An interesting approach to memset() | anonymou5 | 2021/02/08 12:28 PM |
An interesting approach to memset() | Linus Torvalds | 2021/02/08 01:01 PM |
An interesting approach to memset() | anonymou5 | 2021/02/08 07:17 PM |
An interesting approach to memset() | Chester | 2021/02/09 09:15 AM |
An interesting approach to memset() | Michael S | 2021/02/08 05:29 AM |
An interesting approach to memset() | j | 2021/02/08 09:01 AM |
An interesting approach to memset() | Linus Torvalds | 2021/02/08 12:18 PM |
An interesting approach to memset() | wumpus | 2021/02/08 11:25 AM |
quibble | Carlie Coats | 2021/02/09 09:23 AM |
quibble | Michael S | 2021/02/09 09:47 AM |
quibble | rwessel | 2021/02/09 07:56 PM |
quibble | Mark Roulo | 2021/02/11 10:40 AM |
quibble | rwessel | 2021/02/11 12:17 PM |
quibble | dmcq | 2021/02/12 03:37 AM |
AVX512 vs. SVE? | RA | 2021/02/08 02:40 AM |
AVX512 vs. SVE? | dmcq | 2021/02/08 03:34 AM |
AVX512 vs. SVE? | Doug S | 2021/02/08 10:36 AM |
AVX512 vs. SVE? | Michael S | 2021/02/08 12:03 PM |
AVX512 vs. SVE? | dmcq | 2021/02/08 12:05 PM |
AVX512 vs. SVE? | Jukka Larja | 2021/02/09 06:05 AM |
AVX512 vs. SVE? | Michael S | 2021/02/09 09:52 AM |
AVX512 vs. SVE? | -.- | 2021/02/09 06:58 PM |
AVX512 vs. SVE? | none | 2021/02/09 04:20 AM |
AVX512 vs. SVE? | Jörn Engel | 2021/02/09 10:18 AM |
AVX512 vs. SVE? | Wilco | 2021/02/09 02:56 PM |
AVX512 vs. SVE? | Jörn Engel | 2021/02/09 04:24 PM |
AVX512 vs. SVE? | -.- | 2021/02/09 06:37 PM |
AVX512 vs. SVE? | Jörn Engel | 2021/02/08 11:06 AM |
AVX512 vs. SVE? | anon | 2021/02/09 12:35 PM |
An interesting approach to memset() | Romain Dolbeau | 2021/02/09 12:43 AM |
An interesting approach to memset() | Jörn Engel | 2021/02/09 10:10 AM |