By: Konrad Schwarz (konrad.schwarz.delete@this.siemens.com), July 21, 2015 5:24 am
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on July 21, 2015 3:11 am wrote:
> Which PowerPC barrier instruction?
> isync ? - certainly non-global
>
> eieio ? - does not appear to be global
>
> mbar ? looks like it does not exist on "big" IBM cores. Not sure about "biggish"
> Freescale cores, like e600. Anyway, programming note suggests that mbar is intended
> for memory-mapped I/O synchronization rather than SMP synchronization.
>
> sync ? In my copy of docs (PowerISA_v2.07) they list 18 legal forms of sync in m. Which one is global?
sync with L=0. mbar also qualifies, but it is only available in embedded environments. eieio is weaker.
See for example:
Programming Environments Manual for 32-Bit Implementations of the PowerPC™ Architecture,
5.2.6.1, Memory Access Ordering
[...] When a processor (P1) executes sync or eieio, a memory barrier is created that separates applicable
memory accesses into two groups, G1 and G2. G1 includes all applicable memory accesses
associated with instructions preceding the barrier-creating instruction, and G2 includes all
applicable memory accesses associated with instructions following the barrier-creating instruction.
The memory barrier ensures that all memory accesses in G1 are performed with respect to any processor
or mechanism, to the extent required by the associated memory coherence required attributes (that is, the
memory coherence required attribute, if any, associated with each access), before any memory accesses in
G2 are performed with respect to that processor or mechanism.
The ordering done by a memory barrier is said to be cumulative if it also orders memory accesses that are
performed by processors and mechanisms other than P1, as follows:
• G1 includes all applicable memory accesses by any such processor or mechanism that have been
performed with respect to P1 before the memory barrier is created.
• G2 includes all applicable memory accesses by any such processor or mechanism that are
performed after a load instruction executed by that processor or mechanism has returned the value
stored by a store that is in G2.
A memory barrier created by sync is cumulative and applies to all accesses except those associated with
fetching instructions following the sync. See the definition of eieio in Chapter 8, “Instruction Set,” for a
description of the corresponding properties of the memory barrier created by that instruction.
[...]
The instruction description of eieio basically says that eieio is not cumulative.
For Power ISA 2.06 Revision B, see the discussion under 1.7.1, Storage Access Ordering. The Synchronize instruction (4.4.3) with L=0, the default encoding, aka "heavyweight sync", is essentially equivalent to the sync instruction of the 32-bit manual quoted above.
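To make the cumulativity point concrete, here is a minimal sketch of the usual three-processor "write-to-read causality" test, written with C11 atomics rather than PowerPC assembly. The names (x, y, p0, p1, p2, wrc_observed_x) are mine and not from either manual. I am also assuming the common GCC/Clang behaviour of compiling atomic_thread_fence(memory_order_seq_cst) to a heavyweight sync on Power; check the generated code for your toolchain.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Shared locations; static storage, so both start at 0. */
static atomic_int x, y;

/* P0: the store whose visibility has to be propagated onwards. */
static void *p0(void *arg) {
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    return NULL;
}

/* P1: waits until it has seen P0's store, so that store has been
 * performed with respect to P1 before the barrier is created, which
 * is what puts it into G1 of a cumulative barrier. */
static void *p1(void *arg) {
    (void)arg;
    while (atomic_load_explicit(&x, memory_order_relaxed) == 0) { }
    atomic_thread_fence(memory_order_seq_cst); /* assumed: sync, L=0 */
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    return NULL;
}

/* P2: waits for P1's store to y, then reads x. The fence here keeps
 * the second load from being satisfied ahead of the first. */
static void *p2(void *arg) {
    int *seen = arg;
    while (atomic_load_explicit(&y, memory_order_relaxed) == 0) { }
    atomic_thread_fence(memory_order_seq_cst);
    *seen = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Runs the three threads once and returns the value of x seen by P2. */
int wrc_observed_x(void) {
    pthread_t t[3];
    void *(*body[3])(void *) = { p0, p1, p2 };
    int seen = -1;

    atomic_store(&x, 0);
    atomic_store(&y, 0);
    for (int i = 0; i < 3; i++) {
        int rc = pthread_create(&t[i], NULL, body[i], i == 2 ? &seen : NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return seen;
}
```

Under the C11 rules, wrc_observed_x() can only return 1: P2 has seen y == 1, so it must also see the store to x that P1 observed before its fence. That is the property the manual calls cumulative. A barrier that is not cumulative, which is how I read the eieio description, would not give P2 that guarantee for a store made by a third processor.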