The Memory Subsystem
OCTEON is a fully coherent system on a chip, so standard development tools and operating systems can be used. The entire device shares a 1MB, 8-way set associative L2 cache that writes back to the DDR2 memory controller. Cache lines are 128 bytes, and set locking and partitioning are supported to reduce performance variation for certain workloads. The L2 also includes the L1 tags, with valid/invalid bits, to keep track of coherency information.
Figure 3 – OCTEON Memory Subsystem
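To make the L2 figures above concrete, the short Python sketch below derives the number of lines and sets from the quoted capacity, associativity and line size, and splits an address into tag, index and offset. The function names are hypothetical, and the plain modulo indexing is an assumption made for illustration; the actual hardware may hash or otherwise permute the index bits.

```python
# Sketch only: geometry implied by the figures quoted in the text.
CACHE_BYTES = 1 * 1024 * 1024  # 1MB shared L2
WAYS = 8                       # 8-way set associative
LINE_BYTES = 128               # 128 byte cache lines


def l2_geometry(cache_bytes=CACHE_BYTES, ways=WAYS, line_bytes=LINE_BYTES):
    """Return line count, set count and address field widths."""
    lines = cache_bytes // line_bytes
    sets = lines // ways
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,  # log2 of a power of two
        "index_bits": sets.bit_length() - 1,
    }


def split_address(addr, geom=None):
    """Split a physical address into (tag, set index, byte offset).

    Assumes simple modulo indexing, which is an illustrative
    simplification and not a documented property of the device.
    """
    geom = geom or l2_geometry()
    offset = addr & ((1 << geom["offset_bits"]) - 1)
    index = (addr >> geom["offset_bits"]) & ((1 << geom["index_bits"]) - 1)
    tag = addr >> (geom["offset_bits"] + geom["index_bits"])
    return tag, index, offset


if __name__ == "__main__":
    print(l2_geometry())
    print(split_address(0x12345678))
```

Under these assumptions the cache holds 8192 lines arranged as 1024 sets, so an address uses 7 offset bits and 10 index bits, with the remainder forming the tag.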
Access to the L2 is controlled by a coherent crossbar bus with peak bandwidth of 320Gb/s at 600MHz. The crossbar has three components: a 64 bit command bus, a 256 bit request bus to provide data to the L1I and L1D caches, and a 128 bit store bus to send data to the L2 cache from the cores. Access to the command and store buses is arbitrated in a single cycle, using Manchester carry chains to pass an access token round-robin style until a requesting processor is encountered.
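The token-passing arbitration described above can be modeled in a few lines of Python. This is a behavioral sketch, not a description of the circuit: the loop stands in for what the carry chain resolves within a single cycle, the function names are hypothetical, and the choice to restart the token just past the last winner is an assumption about the fairness policy.

```python
def round_robin_grant(requests, token):
    """Return the index of the first requester at or after `token`.

    `requests` is a list of booleans, one per processor. The search
    wraps around, mimicking a token passed along a ring. Returns None
    when nobody is requesting.
    """
    n = len(requests)
    for step in range(n):
        candidate = (token + step) % n
        if requests[candidate]:
            return candidate
    return None


def arbitrate(requests, token):
    """Grant one requester and compute where the token starts next cycle.

    Assumption: the token moves to the position after the winner, so
    the most recently served processor gets the lowest priority.
    """
    grant = round_robin_grant(requests, token)
    next_token = token if grant is None else (grant + 1) % len(requests)
    return grant, next_token


if __name__ == "__main__":
    token = 0
    requests = [False, True, False, True]  # hypothetical 4-requester example
    for cycle in range(3):
        grant, token = arbitrate(requests, token)
        print(f"cycle {cycle}: grant={grant}, next token={token}")
```

With two processors requesting continuously, the grants alternate between them, which is the fairness property a round-robin scheme is meant to provide.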
The L2 cache connects to the DDR2 controller, with a 144 bit bus running at up to 400MHz (800MHz DDR), providing a peak 12.8GB/s bandwidth; that figure corresponds to 128 data bits, with the remaining 16 bits carrying ECC. Up to 4x4GB DIMMs are supported, for a maximum capacity of 16GB, although lower end products may not have as many DIMM slots. The memory bus can also be narrowed to 72 bits, giving even more flexibility across the product line.
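The peak bandwidth figure can be checked with simple arithmetic. The sketch below assumes the usual ECC arrangement of 8 check bits per 64 data bits, so a 144 bit bus carries 128 data bits and a 72 bit bus carries 64; that split is an assumption consistent with the quoted 12.8GB/s rather than something stated in the text, and the function name is hypothetical.

```python
def ddr_peak_bandwidth_gb_s(bus_bits, clock_mhz, ecc_bits_per_64=8):
    """Peak data bandwidth in GB/s (decimal) for a DDR interface.

    bus_bits: total bus width including ECC bits.
    clock_mhz: bus clock; DDR transfers data on both clock edges.
    ecc_bits_per_64: assumed ECC overhead per 64 data bits.
    """
    data_bits = bus_bits * 64 // (64 + ecc_bits_per_64)
    data_bytes = data_bits // 8
    mega_transfers = clock_mhz * 2           # two transfers per clock
    return data_bytes * mega_transfers / 1000  # MB/s to GB/s


if __name__ == "__main__":
    print(ddr_peak_bandwidth_gb_s(144, 400))  # full width bus
    print(ddr_peak_bandwidth_gb_s(72, 400))   # narrowed bus
    print(4 * 4, "GB maximum capacity")       # 4 DIMMs of 4GB each
```

At 400MHz the full width bus works out to 12.8GB/s, matching the figure above, and the narrowed 72 bit configuration would deliver half of that at the same clock.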
OCTEON also supports Reduced Latency DRAM (RLDRAM), a low latency specialty memory used in telecommunications, through a secondary memory controller. Two 18 bit interfaces connect to a dedicated DRAM controller that is shared by the cnMIPS64 cores and the Deterministic Finite Automata engines (DFAs, more on those in the next section). This memory is uncacheable and is generally used by the DFAs rather than the processors.