By: Maynard Handley (name99.delete@this.name99.org), August 19, 2020 12:05 pm
Room: Moderated Discussions
dmcq (dmcq.delete@this.fano.co.uk) on August 19, 2020 11:08 am wrote:
> Maynard Handley (name99.delete@this.name99.org) on August 19, 2020 10:02 am wrote:
> > dmcq (dmcq.delete@this.fano.co.uk) on August 19, 2020 4:08 am wrote:
> > > Maynard Handley (name99.delete@this.name99.org) on August 18, 2020 9:17 pm wrote:
> > > > anon2 (anon.delete@this.anon.com) on August 18, 2020 7:04 pm wrote:
> > > > > Maynard Handley (name99.delete@this.name99.org) on August 18, 2020 3:42 pm wrote:
> > > > > > dmcq (dmcq.delete@this.fano.co.uk) on August 18, 2020 1:00 pm wrote:
> > > > > > > hobold (hobold.delete@this.vectorizer.org) on August 18, 2020 11:58 am wrote:
> > > > > > > > Michael S (already5chosen.delete@this.yahoo.com) on August 18, 2020 4:48 am wrote:
> > > > > > > > [...]
> > > > > > > > > BTW, Power ISA™ Version 3.1 manual is here https://ibm.ent.box.com/s/hhjfw0x0lrbtyzmiaffnbxh2fuo0fog0
> > > > > > > > >
> > > > > > > >
> > > > > > > > Thanks for the link!
> > > > > > > >
> > > > > > > > Looks like they are really serious about autovectorization. Quite a few SIMD
> > > > > > > > bits to round out functionality, and more interoperation between scalar and
> > > > > > > > SIMD. They even have "vector centrifuge" now ... that is ... interesting.
> > > > > > >
> > > > > > > Gah! You can't split a prefixed instruction over a 64 byte boundary.
> > > > > > > I wonder why they felt it necessary to impose that.
> > > > > >
> > > > > > For precisely the reasons I discussed!
> > > > > > If you have that restriction, then (along with other things being
> > > > > > easier) an instruction is never split across two cache lines.
> > > > > > Which means pre-decode can mark instructions as pairs.
> > > > > > Which means that, yes, it IS possible to track a branch into the middle of a pair
> > > > > > of instructions and behave appropriately (which, IMHO, should be to fault).
> > > > > >
> > > > >
> > > > > Exactly as I said here https://www.realworldtech.com/forum/?threadid=194561&curpostid=194635
> > > > > (if that is indeed the reason for it -- predecode does seem likely).
> > > > >
> > > > > Interesting how quickly the x86 fan club are to insist none of the failings of x86 are
> > > > > actually failings at all, clearly without having thought through even the very basics.
> > > >
> > > > Yes, I think we are on the same page about this -- the pitfalls that are possible, and the
> > > > extent to which both the architectural design (eg the 64-byte restriction) and the implementation
> > > > (at the very least, pre-decode) prevent those pitfalls from being instantiated.
> > >
> > > Aren't literal values allowed in a program segment? If so they have to allow jumping to
> > > what seems to be the middle of an instruction. Unless they have some restrictions on the
> > > values of constants or where they can be placed. I think they were just making life a little
> > > easier for the hardware designers as they had enough other stuff to worry about.
> >
> > I don't understand what you are saying.
> > It doesn't matter HOW the branch is performed (PC relative, via count register, via link register, ...)
> > WHATEVER you do, to execute new code, that code has to PASS THROUGH the L1 cache.
> > And the L1 cache performs pre-decode, including marking instructions in various ways.
> > And one of those marks can be "invalid instruction -- second element of a pair".
> > At which point, somewhere in pipeline, that mark gets noticed and a fault is raised.
> >
> > Which part of this do you not understand?
>
> What you're saying would imply that the following is invalid
>
> data
> L1: word literal that looks like a prefix
> code
> L2: op code
Only if word literals share a 64B granule with code...
It wouldn't surprise me if there are linker alignment restrictions even today, pre-POWER10.
But is your starting point even correct? Islands of literals within the code stream are an ARM thing (maybe some other ISAs as well), but for PPC the way you are supposed to handle this is via the TOC. Does POWER even offer a PC-relative load? It's been years, but I seem to remember that you have to do a particular (non-trivial) contortion via the LR to get at the PC, and this is not something you'd want to do under normal conditions compared to just using the TOC the way it's supposed to be used.
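To make the pre-decode argument concrete, here is a rough sketch in Python of the marking scheme I have in mind. It is a toy model, not a description of what POWER10 actually does: the names (predecode_line, fetch, the "suffix"/"illegal" marks) are made up for illustration, and the fault-on-suffix policy is just my preference stated above. The one ISA detail it leans on is that a prefix word is identified by primary opcode 1 in its top six bits. Because no prefixed instruction may cross a 64-byte boundary, the first word of a line can never be a suffix, so each line can be scanned on its own.

```python
LINE_BYTES = 64      # granule that a prefixed instruction may not cross
WORD_BYTES = 4       # every Power instruction word is 4 bytes
PREFIX_OPCODE = 1    # assumption: primary opcode 1 identifies a prefix word


def is_prefix(word):
    """True if the top 6 bits (the primary opcode) mark a prefix word."""
    return (word >> 26) & 0x3F == PREFIX_OPCODE


def predecode_line(words):
    """Mark each word of one 64-byte line as 'insn', 'prefix', 'suffix' or 'illegal'.

    The scan starts at word 0, which is always an instruction boundary
    because a prefix/suffix pair can never straddle two lines.
    """
    if len(words) != LINE_BYTES // WORD_BYTES:
        raise ValueError("expected exactly one 64-byte line of 4-byte words")
    marks = []
    i = 0
    while i < len(words):
        if is_prefix(words[i]):
            if i + 1 == len(words):
                # A prefix in the last slot would need its suffix in the next line.
                marks.append("illegal")
                i += 1
            else:
                marks += ["prefix", "suffix"]
                i += 2
        else:
            marks.append("insn")
            i += 1
    return marks


def fetch(marks, byte_offset):
    """Model a branch target landing at byte_offset within the line."""
    mark = marks[byte_offset // WORD_BYTES]
    if mark in ("suffix", "illegal"):
        raise ValueError("fault: branch target is not a valid instruction start")
    return mark


if __name__ == "__main__":
    NOP = 0x60000000          # ori 0,0,0 (primary opcode 24)
    PREFIX_LIKE = 0x04000000  # a data literal whose top 6 bits happen to equal 1
    # dmcq's case: literal at L1, code at L2, both in the same 64-byte line.
    line = [PREFIX_LIKE, NOP] + [NOP] * 14
    print(predecode_line(line)[:3])
```

In this model dmcq's L1/L2 layout only goes wrong when the prefix-looking literal and the following opcode land in the same 64-byte line: the opcode at L2 gets marked as a suffix and a branch to it faults. Put the literal in a different line from the code (alignment, or simply keeping data in the TOC) and the problem disappears.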