"manual memcpy" and modern compilers

By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), June 1, 2017 10:39 am
Room: Moderated Discussions
Jouni Osmala (a.delete@this.b.c) on May 31, 2017 11:42 pm wrote:
>
> It wouldn't really make sense having capability to do two of those in a cycle. Unaligned access in reality
> is two loads that hardware makes look like one load for software.

No. Really no.

This kind of incorrect thinking is why people used to say that "unaligned accesses are bad" (you know who you are - ARM fanbois did that all the time here on this forum, right up until ARM started doing unaligned accesses, and now those same fanbois talk about how great they are and how they can help compilers generate better code).

Unaligned loads remain one single load, in most cases. They don't turn into two loads just for being unaligned.

They turn into two loads when they cross a cache fetch boundary (note the difference between cache fetch boundary and cacheline size - the two are not necessarily the same). Even then, they aren't really two "full" loads 99% of the time - unless they actually cross a page boundary they need a single TLB lookup, and a single address calculation (with the "next cache fetch" being a simpler second one).
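To make the distinction concrete, here is a minimal C sketch of the three separate questions: is the access unaligned, does it cross a cache fetch boundary, and does it cross a page boundary. The function names are hypothetical, and the block sizes you pass in (say 32 bytes for the fetch width and 4096 bytes for the page) are assumptions for illustration, not properties of any particular core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the access is not naturally aligned to its own size. */
static bool is_unaligned(uintptr_t addr, size_t size)
{
    assert(size > 0);
    return addr % size != 0;
}

/* True if the `size`-byte access starting at `addr` touches two different
 * naturally aligned blocks of `block` bytes. `block` must be a power of two.
 * Pass the cache fetch width to test for a fetch-boundary crossing, or the
 * page size to test for a page crossing. */
static bool crosses_block(uintptr_t addr, size_t size, size_t block)
{
    assert(size > 0 && block > 0 && (block & (block - 1)) == 0);
    uintptr_t mask  = ~(uintptr_t)(block - 1);
    uintptr_t first = addr & mask;               /* block holding the first byte */
    uintptr_t last  = (addr + size - 1) & mask;  /* block holding the last byte  */
    return first != last;
}
```

With an assumed 32-byte fetch width, a 4-byte read at address 0x1001 is unaligned but stays inside one fetch block, while the same read at 0x101e crosses into the next block and still stays inside one 4096-byte page.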

Of course, if the size of the read is the same as the cache fetch width, then yes, the two concepts ("unaligned" and "crosses a cache fetch boundary") end up being exactly the same.

But quite often the two are wildly different. A regular 32-bit unaligned "int" read will be one single load 90% of the time if your cache interface is 32 bytes.
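The 90% figure can be checked by counting. The sketch below assumes the start offset of the load is spread uniformly over the byte positions of a fetch block, which is a simplification; the helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of byte offsets within one `width`-byte fetch block at which a
 * `size`-byte load fits entirely inside that block (so it stays one load). */
static size_t offsets_within_block(size_t size, size_t width)
{
    assert(size > 0 && size <= width);
    size_t fits = 0;
    for (size_t off = 0; off < width; off++) {
        if (off + size <= width)   /* last byte still inside the block */
            fits++;
    }
    return fits;
}
```

For a 4-byte load and a 32-byte fetch width, 29 of the 32 possible offsets stay inside the block (only offsets 29, 30 and 31 spill over), which is roughly the 90% mentioned above. When the load is as wide as the fetch block, only offset 0 fits, which is the case where "unaligned" and "crosses a cache fetch boundary" coincide.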

Of course, with bigger vector registers, it gets much less likely, and often the initial implementation of a new vector register might well match the cache fetch size on that uarch. But even then it's a mistake to think that unaligned accesses don't help: you still want your instruction set to handle them gracefully, and you still want to encourage people to use them over trying to manually align things, because guess what? That just means that you have a good avenue for improvement in hw for the future.

If you encourage software people to think that your hardware is stupid and cannot handle unaligned accesses, you also end up screwing that future potential upside when you make your cache interface wider - now people will do two accesses in software, even if a single hardware access could have worked.
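On the software side, one common way to ask for "just do the unaligned access" in portable C is a fixed-size memcpy into a local variable. This is only a sketch with hypothetical helper names; whether the compiler turns it into a single load or store depends on the target and optimization level, although current compilers commonly do so on targets that permit unaligned access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned address. The fixed-size
 * memcpy avoids the undefined behavior of dereferencing a misaligned
 * uint32_t pointer and leaves the choice of instructions to the compiler. */
static uint32_t load_u32_unaligned(const void *p)
{
    assert(p != NULL);
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Matching store to a possibly unaligned address. */
static void store_u32_unaligned(void *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

Code written this way does not bake in two software accesses, so it can benefit if a later core handles the access as one.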

Linus
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
Is K12 still alive?Heikki Kultala2017/05/11 10:34 PM
  It never made senseSomeone2017/05/12 12:58 AM
    It never made sensejuanrga2017/05/12 05:02 AM
      It never made senseMichael S2017/05/12 05:47 AM
      It never made senseanon.12017/05/12 08:19 AM
        It never made sensewumpus2017/05/12 04:57 PM
          It never made senseanon.12017/05/12 06:37 PM
            It never made sensewumpus2017/05/13 07:52 AM
              It never made senseanon.12017/05/13 06:29 PM
                It never made senseDavid Kanter2017/05/14 12:41 AM
                  It never made sensejuanrga2017/05/14 05:23 AM
                    It never made sensebakaneko2017/05/14 05:56 AM
                  It never made senseanon.12017/05/14 08:36 AM
                Hierofalcon ?Michael S2017/05/14 01:15 AM
                  Hierofalcon ?anyone2017/05/15 10:05 AM
        It never made sensejuanrga2017/05/12 07:11 PM
          It never made senseanon.12017/05/13 06:59 AM
            It never made sensejuanrga2017/05/14 04:35 AM
              It never made senseanon.12017/05/14 09:26 AM
                It never made sensejuanrga2017/05/14 04:47 PM
                  It never made senseanon.12017/05/14 05:49 PM
                    It never made sensejuanrga2017/05/17 05:10 AM
                      It never made senseanon.12017/05/18 09:11 AM
                        It never made sensejuanrga2017/05/20 03:10 AM
                          It never made senseanon.12017/05/20 09:40 AM
                            It never made senseBrett2017/05/20 11:08 AM
                              It never made sensewumpus2017/05/20 12:27 PM
                                It never made senseMichael S2017/05/20 01:49 PM
                            It never made senseanon.12017/05/20 04:19 PM
                              It never made senseBrett2017/05/20 05:44 PM
                                It never made senseanon.12017/05/20 06:22 PM
                                  It never made senseBrett2017/05/20 07:08 PM
                                    It never made senseanon.12017/05/20 07:35 PM
                                    It never made senseJouni Osmala2017/05/21 08:45 AM
                                      It never made senseBrett2017/05/21 12:28 PM
                                        It never made senseJouni Osmala2017/05/22 01:07 AM
                                          It never made senseMichael S2017/05/22 01:27 AM
                                      It never made senseMaynard Handley2017/05/21 08:09 PM
                                        It never made senseAndreas2017/05/23 05:03 AM
                                          It never made senseMaynard Handley2017/05/23 09:37 AM
                                            It never made senseAndreas2017/05/24 05:11 AM
                              It never made sensedmcq2017/05/20 05:45 PM
                                It never made senseanon.12017/05/20 06:24 PM
                                  It never made senseanon.12017/05/20 07:43 PM
                                    It never made sensedmcq2017/05/21 11:34 AM
                                    It never made senseblue2017/05/21 01:29 PM
                                      It never made senseblue2017/05/21 01:30 PM
                                  It never made senseMaynard Handley2017/05/21 08:12 PM
                                  To all! Snip your citations. It's annoying as hell asit is!!! (NT)gallier22017/05/22 12:48 AM
                              Bogus ICC comparisonWilco2017/05/21 04:06 AM
                                Bogus ICC comparisonanon.12017/05/21 08:09 AM
                                  Bogus ICC comparisonMichael S2017/05/21 09:11 AM
                                  Bogus ICC comparisonDavid Kanter2017/05/21 12:42 PM
                                    Bogus ICC comparisonAnne O'Nonymous2017/05/22 04:14 AM
                                      Bogus ICC comparisonslacker2017/05/22 05:21 AM
                                        Bogus ICC comparisonAnne O'Nymous2017/05/23 11:26 AM
                                    Bogus ICC comparisondmcq2017/05/22 05:55 AM
                                      Bogus ICC comparisonanon.12017/05/22 11:59 AM
                                        Bogus ICC comparisonWilco2017/05/22 01:15 PM
                                    Bogus ICC comparisonanon.12017/05/22 11:44 AM
                                      Bogus ICC comparisonWilco2017/05/22 12:55 PM
                                Just look at the 403.gcc resultsDoug S2017/05/21 12:24 PM
                                  Just look at the 403.gcc resultsMaynard Handley2017/05/21 08:17 PM
                                    Just look at the 403.gcc resultsDoug S2017/05/21 10:14 PM
                                      Just look at the 403.gcc resultsdmcq2017/05/22 06:08 AM
                            It never made sensejuanrga2017/05/21 05:46 AM
                              It never made senseanon.12017/05/21 07:57 AM
                                It never made senseanon.12017/05/21 08:32 AM
                              It never made senseAnne O'Nonymous2017/05/22 04:11 AM
                required PRF sizeHeikki Kultala2017/05/14 08:59 PM
                  required PRF sizeWilco2017/05/15 02:18 AM
                    required PRF sizeMichael S2017/05/15 03:05 AM
                      required PRF sizeanon.12017/05/15 06:57 AM
                        required PRF sizeWilco2017/05/15 02:46 PM
                          required PRF sizeanon.12017/05/15 06:30 PM
                            required PRF sizeWilco2017/05/16 03:50 AM
                              required PRF sizeMichael S2017/05/16 04:23 AM
                              required PRF sizeanon.12017/05/16 06:57 AM
                                required PRF sizeRicardo B2017/05/16 09:10 AM
                                  required PRF sizeanon.12017/05/16 11:56 AM
                                    Thanks! (NT)Ricardo B2017/05/16 03:51 PM
                                    required PRF sizeJouni Osmala2017/05/16 10:03 PM
                                      required PRF sizeanon.12017/05/17 12:04 AM
                                  required PRF sizeMaynard Handley2017/05/16 04:56 PM
                              required PRF sizeanon.12017/05/16 08:21 AM
                    required PRF sizeLinus B Torvalds2017/05/15 10:11 AM
                      required PRF sizeMichael S2017/05/15 11:20 AM
                        required PRF sizeLinus B Torvalds2017/05/15 03:49 PM
                          required PRF sizeJouni Osmala2017/05/17 06:04 AM
                      Load-op usageWilco2017/05/15 04:29 PM
                        Load-op usageanon52017/05/15 06:05 PM
                          Load-op usageWilco2017/05/16 05:15 PM
                            Load-op usageMichael S2017/05/17 01:00 AM
                              Load-op usageWilco2017/05/17 03:02 AM
                                could it be C vs C++? (NT)Michael S2017/05/17 03:46 AM
                                Load-op usageGabriele Svelto2017/05/17 05:27 AM
                                  Load-op usageGian-Carlo Pascutto2017/05/17 08:53 AM
                                    Use perf top?Travis2017/05/17 01:21 PM
                                      Use perf top?Wilco2017/05/17 04:23 PM
                                        Use perf top?Travis2017/05/17 06:12 PM
                                          Use perf top?Seni2017/05/17 09:13 PM
                                            Use perf top?Wilco2017/05/18 03:37 AM
                                              Compiled on Skylake? (NT)Michael S2017/05/18 04:16 AM
                                              Use perf top?Gabriele Svelto2017/05/18 05:19 AM
                                                Use perf top?octoploid2017/05/18 05:48 AM
                                                  Use perf top?Gabriele Svelto2017/05/18 09:33 AM
                                                    Use perf top?octoploid2017/05/18 10:51 AM
                                                      Use perf top?Gabriele Svelto2017/05/18 01:12 PM
                                                        Use perf top?octoploid2017/05/18 01:29 PM
                                                          Use perf top?Gian-Carlo Pascutto2017/05/22 08:21 AM
                                                            Use perf top?octoploid2017/05/22 09:01 AM
                                                              Use perf top?Gian-Carlo Pascutto2017/05/22 10:21 AM
                                                                Use perf top?octoploid2017/05/22 10:34 AM
                                                                  Use perf top?Gian-Carlo Pascutto2017/05/22 10:53 AM
                                                                    Use perf top?octoploid2017/05/23 03:54 AM
                                                                      Use perf top?rwessel2017/05/23 08:58 AM
                                                                        Use perf top?octoploid2017/05/23 09:09 AM
                                                                          Use perf top?Megol2017/05/24 05:04 AM
                                                                            Use perf top?octoploid2017/05/24 05:24 AM
                                                                              Use perf top?Gian-Carlo Pascutto2017/05/24 06:53 AM
                                                                                Use perf top?octoploid2017/05/24 07:01 AM
                                                                              Use perf top?Megol2017/05/25 01:24 PM
                                          Use perf top?Wilco2017/05/18 03:20 AM
                                            Use perf top?Travis2017/05/18 02:24 PM
                                              Use perf top?Wilco2017/05/18 04:50 PM
                                                Use perf top?Travis2017/05/18 07:34 PM
                            Load-op usageMichael S2017/05/17 01:21 AM
                              Load-op usageWilco2017/05/17 03:20 AM
                                Load-op usageLinus B Torvalds2017/05/17 09:29 AM
                                  Load-op usageLinus B Torvalds2017/05/17 02:45 PM
                        Load-op usageanon.12017/05/15 06:36 PM
                          Load-op usageMichael S2017/05/16 01:27 AM
                            Load-op usageanon.12017/05/16 07:52 AM
                              Load-op usageanon.12017/05/16 07:58 AM
                              Load-op usageMichael S2017/05/17 12:52 AM
                                Load-op usageanon.12017/05/17 07:03 AM
                                  Load-op usageMichael S2017/05/17 07:24 AM
                                    Load-op usageanon.12017/05/17 11:53 PM
                                      Load-op usageMichael S2017/05/18 12:48 AM
                        Load-op usageLinus B Torvalds2017/05/16 09:01 AM
                          Load-op usageLinus B Torvalds2017/05/16 09:17 AM
                          Load-op usage_Arthur2017/05/17 05:11 PM
                            Load-op usageMichael S2017/05/18 02:50 AM
                            Load-op usageLinus B Torvalds2017/05/18 10:03 AM
                              Load-op usageoctoploid2017/05/18 11:45 AM
                                Load-op usageLinus B Torvalds2017/05/18 12:28 PM
                  required PRF sizeanon.12017/05/15 07:44 AM
                    required PRF sizeslacker2017/05/15 05:20 PM
                      required PRF sizeanon.12017/05/15 07:48 PM
                        required PRF sizeslacker2017/05/15 09:52 PM
                          Fixed linkslacker2017/05/15 09:54 PM
                          required PRF sizeanon.12017/05/16 07:56 AM
          It never made senseanon.12017/05/13 08:03 AM
            It never made senseanon.12017/05/13 08:31 AM
              It never made sensenobody in particular2017/05/13 09:02 AM
              It never made senseGabriele Svelto2017/05/13 09:05 AM
                It never made senseanon.12017/05/13 11:07 AM
                It never made senseAaron Spink2017/05/13 05:18 PM
              It never made senseDavid Hess2017/05/13 07:28 PM
                It never made senseBrett2017/05/13 10:25 PM
                It never made senseanon.12017/05/13 11:44 PM
                  It never made senseNiels Jørgen Kruse2017/05/14 02:37 AM
                    It never made senseanon.12017/05/14 09:45 AM
                      It never made senseNiels Jørgen Kruse2017/05/14 01:06 PM
                    It never made senseMaynard Handley2017/05/16 04:46 AM
                      It never made senseNiels Jørgen Kruse2017/05/16 10:24 PM
                  It never made sensejuanrga2017/05/14 05:02 AM
                    It never made sensenobody in particular2017/05/14 05:31 AM
                      It never made sensejuanrga2017/05/14 02:36 PM
                        It never made sensenobody in particular2017/05/14 03:50 PM
                          It never made sensejuanrga2017/05/14 05:36 PM
                            You're discussing two dead-in-the-water architecturesdefault2017/05/15 02:52 PM
                              You're discussing two dead-in-the-water architecturesblue2017/05/15 07:14 PM
                              You're discussing two dead-in-the-water architecturesjuanrga2017/05/17 04:52 AM
                    It never made senseanon.12017/05/14 08:27 AM
                      It never made senseMichael S2017/05/14 08:54 AM
                        It never made senseanon.12017/05/14 09:40 AM
                      It never made sensejuanrga2017/05/14 03:09 PM
                        It never made sensenobody in particular2017/05/14 03:51 PM
                        It never made senseMichael S2017/05/14 03:56 PM
                        It never made senseanon.12017/05/14 05:54 PM
                  It never made senseDavid Hess2017/05/14 11:02 AM
                    It never made senseBrett2017/05/14 01:24 PM
                      It never made senseMichael S2017/05/15 04:55 AM
                        It never made senseAnon2017/05/15 04:14 PM
                          It never made senseMichael S2017/05/16 02:21 AM
                            It never made sensehobel2017/05/16 08:42 AM
                      It never made senseDavid Hess2017/05/15 06:33 AM
                    It never made sensewumpus2017/05/14 03:08 PM
                      It never made senseDavid Hess2017/05/15 06:23 AM
            It never made sensejuanrga2017/05/14 04:49 AM
              It never made senseAaron Spink2017/05/14 04:58 AM
    It never made senseHeikki Kultala2017/05/12 11:47 AM
      It never made senseAaron Spink2017/05/13 05:20 PM
    It never made senseWes Felter2017/05/12 01:18 PM
      It never made senseanon.12017/05/12 06:32 PM
  Is K12 still alive?juanrga2017/05/12 04:49 AM
    Is K12 still alive?Heikki Kultala2017/05/12 11:31 AM
      Is K12 still alive?who me?2017/05/17 07:39 PM
        Is K12 still alive?juanrga2017/05/18 02:44 AM
        Is K12 still alive?dmcq2017/05/22 06:19 AM
          Is K12 still alive?Foo_2017/05/22 07:56 AM
            Is K12 still alive?David Kanter2017/05/22 02:42 PM
              Is K12 still alive?Linus B Torvalds2017/05/22 07:45 PM
                Is K12 still alive?Michael_S2017/05/22 11:34 PM
                Is K12 still alive?David Kanter2017/05/23 09:17 AM
                  Is K12 still alive?Linus B Torvalds2017/05/23 10:29 AM
                    Is K12 still alive?octoploid2017/05/23 11:25 AM
                      slow AVX-512 memcpy/memsetEric Bron2017/05/23 12:48 PM
                        slow AVX-512 memcpy/memsetLinus B Torvalds2017/05/23 01:51 PM
                          slow AVX-512 memcpy/memsetEric Bron2017/05/23 02:05 PM
                            slow AVX-512 memcpy/memsetLinus B Torvalds2017/05/23 02:43 PM
                              slow AVX-512 memcpy/memsetEric Bron2017/05/23 02:59 PM
                                KNL code generator vs 2014Michael S2017/05/24 12:57 AM
                                  KNL code generator vs 2014Eric Bron2017/05/24 04:21 AM
                                  KNL code generator vs 2014anon.5122017/05/24 04:03 PM
                                    KNL code generator vs 2014Michael S2017/05/25 08:32 AM
                                  food for thoughtEric Bron2017/05/24 04:57 PM
                                    icc 17 on godbolt disagreeMichael S2017/05/25 01:45 AM
                                      Sorry, I posted SKX code twiceMichael S2017/05/25 01:48 AM
                                         stall 2 - are KNL VPUs really OoO?Michael S2017/05/25 02:27 AM
                                      which version of icc 17 ? (NT)Eric Bron2017/05/25 03:50 AM
                                        17.0.0Michael S2017/05/25 03:52 AM
                                          17.0.0Eric Bron2017/05/25 04:13 AM
                                          17.0.0Eric Bron2017/05/25 04:24 AM
                                            17.0.0Michael S2017/05/25 05:29 AM
                                              17.0.0Eric Bron2017/05/25 05:43 AM
                                                17.0.0Michael S2017/05/25 08:40 AM
                                                  strange 256-bit code with icc v7.0.4Eric Bron2017/05/25 10:51 AM
                                              17.0.0Eric Bron2017/05/25 05:54 AM
                                          fixed exampleEric Bron2017/05/25 04:57 AM
                              slow AVX-512 memcpy/memsetTravis2017/05/23 03:57 PM
                                correction: has NOT been the caseTravis2017/05/23 03:58 PM
                              slow AVX-512 memcpy/memsetanon2017/05/24 06:00 AM
                                slow AVX-512 memcpy/memsetTravis2017/05/24 02:27 PM
                                  slow AVX-512 memcpy/memsetanon2017/05/25 02:16 AM
                                    slow AVX-512 memcpy/memsetTravis2017/05/25 05:02 PM
                            slow AVX-512 memcpy/memsetGabriele Svelto2017/05/24 05:12 AM
                          slow AVX-512 memcpy/memsetDoug S2017/05/23 02:35 PM
                            slow AVX-512 memcpy/memsetLinus B Torvalds2017/05/23 03:07 PM
                              Dedicated mem* instructionsDoug S2017/05/23 11:17 PM
                                Dedicated mem* instructionsLinus Torvalds2017/05/24 01:21 AM
                                  Dedicated mem* instructionsLinus Torvalds2017/05/24 08:16 AM
                                    Dedicated mem* instructionsanon2017/05/24 09:52 AM
                                      Dedicated mem* instructionsLinus Torvalds2017/05/24 11:31 AM
                                        Should mem copy/fill/move be an instruction or a coprocessor with asychronous instructions? (NT)TEMLIB2017/05/24 12:52 PM
                                          asynchronous co-processors are evil (NT)Michael S2017/05/24 12:57 PM
                                          Should mem copy/fill/move be an instruction or a coprocessor with asychronous instructions?David Hess2017/05/24 03:52 PM
                                          Should mem copy/fill/move be an instruction or a coprocessor with asychronous instructions?Travis2017/05/24 03:55 PM
                                            Should mem copy/fill/move be an instruction or a coprocessor with asychronous instructions?TEMLIB2017/05/24 04:29 PM
                                        Dedicated mem* instructionsanon2017/05/24 08:39 PM
                                        AVX-512 and XOPYuhong Bao2017/05/24 11:19 PM
                                          128-bit vs 256-bit vectors in cryptoYuhong Bao2017/05/31 11:37 AM
                                    Dedicated mem* instructionsDoug S2017/05/24 12:37 PM
                                      Dedicated mem* instructionsMichael S2017/05/24 12:55 PM
                                        Dedicated mem* instructionsDoug S2017/05/24 02:35 PM
                                          Dedicated mem* instructionsLinus Torvalds2017/05/24 03:41 PM
                                            Dedicated mem* instructionsTravis2017/05/24 04:20 PM
                                              Dedicated mem* instructionsLinus Torvalds2017/05/25 10:54 AM
                                  Dedicated mem* instructionsGabriele Svelto2017/05/25 04:05 PM
                                Immediate lengths for mem* instructionsPaul A. Clayton2017/05/26 04:55 AM
                              slow AVX-512 memcpy/memsetTravis2017/05/24 03:41 PM
                                ucode branch predictionDavid Kanter2017/05/24 05:45 PM
                          Then why use even AVX2 for memcpy?Mark Roulo2017/05/23 04:30 PM
                            Then why use even AVX2 for memcpy?Linus B Torvalds2017/05/23 10:08 PM
                              Danke (NT).Mark Roulo2017/05/24 11:52 AM
                            It's all about the length of the memcpy.Heikki Kultala2017/05/23 10:18 PM
                              It's all about the length of the memcpy.Heikki Kultala2017/05/23 10:26 PM
                              It's all about the length of the memcpy.Yoav2017/05/24 01:08 AM
                              It's all about the length of the memcpy.Michael S2017/05/24 01:37 AM
                              It's all about the length of the memcpy.Megol2017/05/24 03:39 AM
                              It's all about the length of the memcpy.Gabriele Svelto2017/05/24 05:17 AM
                                It's all about the length of the memcpy.Travis2017/05/24 02:46 PM
                                  It's all about the length of the memcpy.Gabriele Svelto2017/05/25 04:24 AM
                                    It's all about the length of the memcpy.octoploid2017/05/25 04:45 AM
                                      Forgot , but you get the idea (NT)octoploid2017/05/25 05:12 AM
                                        Forgot to add a pre tag but you get the idea (NT)octoploid2017/05/25 05:14 AM
                                      It's all about the length of the memcpy.Gabriele Svelto2017/05/25 03:37 PM
                                        It's all about the length of the memcpy.Wilco2017/05/25 03:48 PM
                                          It's all about the length of the memcpy.Gabriele Svelto2017/05/25 04:07 PM
                                            It's all about the length of the memcpy.Wilco2017/05/26 02:47 AM
                                              "manual memcpy" and modern compilersHeikki Kultala2017/05/27 11:27 PM
                                                "manual memcpy" and modern compilersLinus Torvalds2017/05/29 08:30 PM
                                                  "manual memcpy" and modern compilersTravis2017/05/29 09:32 PM
                                                    "manual memcpy" and modern compilersLinus Torvalds2017/05/30 10:54 AM
                                                      "manual memcpy" and modern compilersJason Creighton2017/05/30 12:33 PM
                                                        "manual memcpy" and modern compilersWilco2017/05/30 08:29 PM
                                                      "manual memcpy" and modern compilersTravis2017/05/30 08:23 PM
                                                        "manual memcpy" and modern compilersWilco2017/05/30 08:34 PM
                                                          "manual memcpy" and modern compilersoctoploid2017/05/30 09:46 PM
                                                            "manual memcpy" and modern compilersWilco2017/05/31 02:28 AM
                                                              "manual memcpy" and modern compilersoctoploid2017/05/31 03:14 AM
                                                                "manual memcpy" and modern compilersWilco2017/05/31 02:42 PM
                                                                "manual memcpy" and modern compilersTravis2017/05/31 06:40 PM
                                                                  "manual memcpy" and modern compilersJouni Osmala2017/05/31 11:42 PM
                                                                    "manual memcpy" and modern compilersLinus Torvalds2017/06/01 10:39 AM
                                                                      "manual memcpy" and modern compilersTravis2017/06/01 04:30 PM
                                                                        "manual memcpy" and modern compilersoctoploid2017/06/02 01:26 AM
                                                                          "manual memcpy" and modern compilersoctoploid2017/06/02 01:27 AM
                                                                            "manual memcpy" and modern compilersTravis2017/06/02 12:18 PM
                                                                              "manual memcpy" and modern compilersTravis2017/06/02 12:40 PM
                                                                          "manual memcpy" and modern compilersoctoploid2017/06/02 03:29 AM
                                                                            "manual memcpy" and modern compilersGiGNiC2017/06/02 05:23 AM
                                                                            "manual memcpy" and modern compilersTravis2017/06/02 07:56 PM
                                                                          "manual memcpy" and modern compilersTravis2017/06/02 02:05 PM
                                                                            "manual memcpy" and modern compilersLinus Torvalds2017/06/02 03:48 PM
                                                                              "manual memcpy" and modern compilersTravis2017/06/02 04:50 PM
                                                                                "manual memcpy" and modern compilersgiovanni deretta2017/06/03 01:43 PM
                                                                                  "manual memcpy" and modern compilersDavid Kanter2017/06/04 10:04 AM
                                                                                  "manual memcpy" and modern compilersTravis2017/06/04 01:53 PM
                                                                                    "manual memcpy" and modern compilersDavid Kanter2017/06/04 09:03 PM
                                                                                      memory renamingTravis2017/06/06 11:52 AM
                                                                                        memory renaminganon.12017/06/07 08:06 PM
                                                                                          memory renaminganon.12017/06/07 08:54 PM
                                                                          "manual memcpy" and modern compilersTravis2017/06/02 08:21 PM
                                                                            "manual memcpy" and modern compilersoctoploid2017/06/02 09:31 PM