The Memory Aliasing Problem
Memory aliasing is a relatively easy problem to understand. Out-of-order CPUs usually have many load and store operations in flight at the same time, but the CPU has to preserve the program order of loads and stores wherever they touch the same data. Intuitively, this problem is very similar to the cache coherency problem. Figure 8 shows the problem with aliasing: because they share the same address, instruction 9 cannot move before instruction 2 (broken red arrow), or it would read the wrong data. Unfortunately, the CPU does not know what address instruction 5 is storing to, so it is unclear whether instruction 9 can move around instruction 5 (solid black arrow). However, the CPU can safely move instruction 9 before instruction 8, as long as it stays after instruction 5, since there are no aliasing stores in between (solid blue arrow).
Figure 8 – Memory Aliasing Example
Memory disambiguation is the process of determining whether a pair of memory instructions (usually a load and a store) alias, that is, whether they access the same address. If they use different addresses, then they can be moved around each other. The problem is that to disambiguate a load, the memory system has to search the addresses of all in-flight store operations, which is hideously expensive. The P6 makes this a lot easier by splitting store instructions into two different uops, one to calculate the address and one to actually store the data. This way the store address can be known before the data is ready, and can be easily checked for alias problems. In the P6, the Memory Order Buffer (MOB) uses the following rules to avoid aliasing:
- All loads are delayed if a store is in-flight with an unknown address
- Loads cannot proceed ahead of an aliased store data uop
- A store cannot be moved in front of another store
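As a rough illustration, the first two rules can be written as a check that a load must pass before it is allowed to issue. This is a minimal sketch, not a description of the actual P6 logic: the `Store` record, the `load_may_issue` function, and the exact-address comparison (which ignores access sizes and partial overlaps) are all assumptions made for the example. It also assumes that a load may proceed once an aliasing store's data uop has executed. Rule 3 concerns store-to-store ordering and is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Store:
    """Hypothetical record for one older, in-flight store."""
    address: Optional[int]  # None until the store-address uop has executed
    data_ready: bool        # True once the store-data uop has executed


def load_may_issue(load_address: int, older_stores: Sequence[Store]) -> bool:
    """Return True if a load may issue under the conservative rules 1 and 2."""
    for store in older_stores:
        # Rule 1: any older store with an unknown address blocks the load.
        if store.address is None:
            return False
        # Rule 2: an aliasing store blocks the load until its data uop is done.
        if store.address == load_address and not store.data_ready:
            return False
    return True
```

For example, `load_may_issue(0x100, [Store(None, False)])` is false even though the store may well turn out to target a different address, which is exactly the pessimism that rule 1 introduces.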
The problem with this approach is that rule 1 is pessimistic and creates some false aliasing (i.e. it assumes aliasing even when there is none). Academic studies have shown that for an EV6-like processor with 512 in-flight instructions, more than 97% of loads and stores do not alias. For a more modest instruction window (no shipping design has more than about 200 instructions in flight), there would be even less aliasing. Therefore, it makes sense to remove rule 1 and simply assume that load/store pairs do not alias, while ensuring correct recovery when that assumption turns out to be wrong.
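The idea of assuming no aliasing and recovering on a mistake can be sketched with a toy model. Everything here is hypothetical and greatly simplified: memory is a plain dictionary, there is a single older store whose address is unknown when the load issues, and "recovery" is modeled as simply re-executing the load. Real hardware would also have to replay any instructions that consumed the wrong value.

```python
def run_speculative_load(memory, load_address, store_address, store_value):
    """Toy model: a load issues ahead of one older store with an unknown address.

    Returns (value_the_load_ends_up_with, was_replayed).
    """
    # The load issues early, assuming the older store does not alias.
    speculative_value = memory.get(load_address, 0)

    # Later, the store's address resolves and the store writes memory.
    memory[store_address] = store_value

    if store_address == load_address:
        # The assumption was wrong: discard the stale value and replay the load.
        return memory[load_address], True

    # The assumption held: the early value is correct and no time was lost.
    return speculative_value, False
```

In the common case the addresses differ and the load completes early at no cost; only in the rare aliasing case does the replay path run.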