Moving copy to DRAM doesn't help for small copies

By: --- (---.delete@this.redheron.com), October 2, 2021 6:32 pm
Room: Moderated Discussions
Mark Roulo (nothanks.delete@this.xxx.com) on October 2, 2021 3:59 pm wrote:
> --- (---.delete@this.redheron.com) on October 2, 2021 3:12 pm wrote:
> ...snip ...
>
> > Oh, one more thing. Yet another way to solve the problem is not by DMA but by moving the task
> > (copy or flood-fill) up to the memory controller. Again much more feasible to the extent that
> > the memory controller has access to coherency tags. This allows you to do the job by routing
> > from DRAM to controller to DRAM, bypassing NoC and everything else, so lower power.
> >
> > The ultimate, of course is to do the job purely within the
> > DRAM. Onur Mutlu has published details of how this
> > could realistically be added to existing DRAM, but as far
> > as I know no-one has yet done so. Every year we have
> > some excitement about PIM around Hot Chips, then it all goes away and another year passes with no actually
> > purchasable PIM hardware. Even Apple, as far as we all
> > know, uses vanilla DRAM, and their temporary stake in
> > Toshiba Memory was apparently just a bit of financial engineering, not a prelude to bespoke DRAM :-(
>
> You are solving the problem of BULK memcpy.
>

(a)

Yes indeed. Because that was the subject Linus considered. As in, from my text that you snipped,
>>> I agree with your primary point, that obsessing over super-bulk transfers is not the first thing to worry
>>> about.




> If code is copying (or zero-ing) 5 - 100 bytes there is a very good chance that the copied or
> zero-d memory is going to be used immediately (for some values of immediately). Pushing the copy
> or zero to the DRAM is pretty much the wrong thing to do for either performance or power.
>
> NOTE: Bulk memory zero-ing (as might be useful for managed languages such as Java)
> might make a lot of sense. You could set up for 'free' memory ahead of time.
>
> Or maybe not. In theory the data could be zero-d as it was read
> into the caches so the bulk zero-ing in DRAM might be pointless.


(b)

Most of this discussion seems to assume (for aesthetic reasons, nothing else) that only a single solution should exist, and that that solution should be judged only in light of its use by a programming language.

But there are multiple use cases, many of which are outside the domain of a programming language. These include, e.g.,
- wiping a page by the OS (e.g. for security reasons) AND
- the copy part of copy-on-write of a page
both of which may lend themselves to unorthodox mechanisms.


OF COURSE most copies (within a language) are small, most such copies probably want the copied data present in cache, and such copies are optimally handled either by existing instructions (with nice alignment and known sizes) or by *very simple* augmenting instructions.

But if your mind is wandering into the space of "let's do it via DMA", the issue is not that that's an empty space; it's that that's much less likely to be a job you want generated automatically by the compiler using weird instructions. Rather, that's a task that you will call via an API.
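
To make the split concrete, here is a rough sketch in C of what "call it by API" might look like. Everything named here is hypothetical: `bulk_copy_offload` merely stands in for whatever DMA engine or memory-controller mechanism might exist (it falls back to `memcpy` so the sketch runs anywhere), `ASSUMED_PAGE_SIZE` is an assumed 4 KiB page, and `copy_dispatch` is just an illustrative name. The point is only that small or unaligned copies stay on the ordinary in-cache path, while whole, page-aligned copies are the ones an explicit API could route elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 4 KiB pages. A real system would query the page size. */
#define ASSUMED_PAGE_SIZE 4096u

/* Counts how often the (hypothetical) offload path was taken. */
static int bulk_copy_calls = 0;

/* Hypothetical offload hook. A real implementation might program a DMA
 * engine or ask the memory controller to do the copy; this stand-in
 * simply calls memcpy so the example is runnable. */
static void bulk_copy_offload(void *dst, const void *src, size_t n)
{
    bulk_copy_calls++;
    memcpy(dst, src, n);
}

/* Illustrative dispatch: only whole, page-aligned, non-overlapping
 * copies go to the offload path. Everything else (the common small
 * case) uses the normal CPU copy and ends up in cache. */
static void copy_dispatch(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    int aligned = ((uintptr_t)dst % ASSUMED_PAGE_SIZE == 0) &&
                  ((uintptr_t)src % ASSUMED_PAGE_SIZE == 0);

    if (aligned && n >= ASSUMED_PAGE_SIZE && n % ASSUMED_PAGE_SIZE == 0)
        bulk_copy_offload(dst, src, n);   /* e.g. the copy half of COW */
    else
        memcpy(dst, src, n);              /* small copy, wanted in cache */
}
```

An OS doing the copy half of copy-on-write would call something like `copy_dispatch(new_page, old_page, ASSUMED_PAGE_SIZE)` explicitly; the compiler never has to guess, and the 5 - 100 byte copies never go near the offload path.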