Power Saving Techniques
Power and heat have become first-order constraints in modern MPU design, and are equally important in the design of interconnects. Thus it should come as no surprise that CSI has a variety of power saving techniques, which tend to span both the physical and link layers.
The most obvious technique to reduce power is using reduced or low-power states, just as in a microprocessor. CSI incorporates at least two different power states which boost efficiency by offering various degrees of power reduction and wake-up penalties for each side of a link. The intermediate power state, L0s, saves some power, but has a relatively short wake-up time to accommodate brief periods of inactivity. There are several triggers for entering the L0s state; they can include software commands, an empty or near-empty transaction queue (at the transmitter), a protocol message from an agent, etc. When the CSI link enters the L0s state, an analog wake-up detection circuit is activated, which monitors for any signals that would trigger an exit from L0s. Additionally, a wake-up can be caused by flits entering the transaction queue.
During normal operation, even if one half of the link is inactive, it will still have to send idle flits to the other side to maintain flow control and provide acknowledgement that flits are being received correctly. In the L0s state, the link stops flow control temporarily and can shut down some, but not all, of the circuitry associated with the physical layer. Circuits are powered down based on whether or not they can be woken up within a predetermined period of time. This wake-up timer is configurable and the value likely depends upon factors such as the target market (mobile, desktop or server) and power source (AC versus battery). For instance, the bit lanes can generally be held in an electrical idle so they do not consume any power. However, the clock recovery circuits (receiver side PLLs or DLLs) must be kept active and periodically recalibrated. This ensures that when the link is activated, no physical layer initialization is required, which keeps the wake-up latency relatively low. Generally, a longer timer allows more circuitry to be powered down and so reduces power consumption in L0s, but could negatively impact performance. Intel’s patents indicate that the wake-up timer can be set as low as 20ns, or roughly 96-128 cycles.
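The cycle count quoted above can be checked with a short calculation. This is only a sketch: the transfer rates of 4.8 and 6.4 GT/s are an assumption inferred from the 96-128 cycle range, not figures stated in this section, and the function name is hypothetical.

```python
def wake_timer_cycles(timer_ns: float, rate_gt_per_s: float) -> float:
    """Transfer cycles that elapse while the wake-up timer runs.

    timer_ns is in nanoseconds and rate_gt_per_s in gigatransfers per second,
    so the 1e-9 and 1e9 factors cancel and the product is a cycle count.
    """
    return timer_ns * rate_gt_per_s


# Assumed link speeds (GT/s); not taken from the text above.
for rate in (4.8, 6.4):
    print(f"{rate} GT/s: about {round(wake_timer_cycles(20, rate))} cycles in 20 ns")
```

Under those assumed rates, a 20ns timer spans the lower and upper ends of the quoted range respectively.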
For more dramatic power savings, CSI links can be put into a deeper low-power state. The L1 state is optimized specifically for the lowest possible power, without regard for the wake-up latency, as it is intended to be used for prolonged idle periods. The biggest difference between the L0s and L1 states is that in the latter, the DLLs or PLLs used for clock recovery and skew compensation are turned off. This means that the physical layer of the link must be retrained when it is turned back on, which is fairly expensive in terms of latency, roughly 10us. However, the benefit is that the link barely dissipates any power when in the L1 state. Figure 3 shows a state diagram for the links, including various resets and the L1 and L0s states.
Figure 3 – CSI Initialization and Power Saving State Diagram
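The trade-off between the two states can be illustrated with a toy selection policy. This is a hypothetical sketch, not CSI's actual mechanism: the wake-up latencies reuse the 20ns lower bound and the roughly 10us retraining figure from the text, while the `choose_state` function and its ratio threshold are invented purely for illustration.

```python
from enum import Enum


class LinkState(Enum):
    L0 = "active"
    L0S = "intermediate power saving"
    L1 = "lowest power"


# Wake-up latencies in nanoseconds. L0s uses the 20 ns lower bound and
# L1 the ~10 us retraining cost mentioned in the text.
WAKE_LATENCY_NS = {LinkState.L0: 0, LinkState.L0S: 20, LinkState.L1: 10_000}


def choose_state(expected_idle_ns: float, idle_to_wake_ratio: float = 10.0) -> LinkState:
    """Pick the deepest state whose wake-up cost is small next to the idle period.

    idle_to_wake_ratio is a hypothetical tuning knob: a state qualifies only
    if the expected idle time is at least that many times its wake latency.
    """
    best = LinkState.L0
    for state in (LinkState.L0S, LinkState.L1):
        if expected_idle_ns >= idle_to_wake_ratio * WAKE_LATENCY_NS[state]:
            best = state
    return best


for idle in (50, 500, 1_000_000):
    print(f"{idle} ns idle -> {choose_state(idle).name}")
```

With this policy, brief gaps stay in the active state, short idle periods use L0s, and only prolonged idle periods justify paying the L1 retraining penalty.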
Another power saving trick for CSI addresses situations where the link is underutilized, but must remain operational. Intel’s engineers designed CSI so that the link width can be dynamically modulated. This is not too difficult, since the physical link between two CSI agents can vary between 5, 10 and 20 bits wide and the link layer must be able to efficiently accommodate each configuration. The only additional work is designing a mechanism to switch between full, half and quarter-width and ensuring that the link will operate correctly during and after a width change.
Note that width modulation is separate for each unidirectional portion of a link, so one direction might be wider to provide more bandwidth, while the opposite direction is mostly inactive. When the link layer is auto-negotiating, each CSI agent will keep track of the configurations supported by the other side (i.e. full width, half-width, quarter-width). Once the link has been established and is operating, each transmitter will periodically check to see if there is an opportunity to save power, or if more bandwidth is required.
If the link bandwidth is not being used, then the transmitter will select a narrower link configuration that is mutually supported and notify the receiver. Then the transmitter will modulate to a new width, and place the inactivated quadrants into the L0s or L1 power saving states, and the receiver will follow suit. One interesting twist is that the unused quadrants can be in distinct power states. For example, a full-width link could modulate down to a half-width link, putting quadrants 0 and 1 into the L1 state, and then later modulate down to a quarter-width link, putting quadrant 2 into the L0s state. In this situation, the link could respond to an immediate need for bandwidth by activating quadrant 2 quickly, while still saving a substantial amount of power.
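The narrowing example above can be modeled in a few lines. This is a minimal sketch that assumes a 20-lane link is split into four 5-lane quadrants; the class and method names are hypothetical, and the notification to the receiver is reduced to a comment.

```python
LANES_PER_QUADRANT = 5
VALID_WIDTHS = (5, 10, 20)  # quarter, half and full width


class TransmitDirection:
    """One unidirectional half of a link, tracked per quadrant."""

    def __init__(self) -> None:
        self.quadrants = ["L0"] * 4  # all four quadrants active

    def width(self) -> int:
        return LANES_PER_QUADRANT * self.quadrants.count("L0")

    def narrow(self, idle_quadrants: list[int], state: str) -> None:
        """Move the given active quadrants into L0s or L1."""
        if state not in ("L0s", "L1"):
            raise ValueError("inactive quadrants must enter L0s or L1")
        proposed = list(self.quadrants)
        for q in idle_quadrants:
            if proposed[q] != "L0":
                raise ValueError(f"quadrant {q} is already inactive")
            proposed[q] = state
        if LANES_PER_QUADRANT * proposed.count("L0") not in VALID_WIDTHS:
            raise ValueError("resulting width is not a supported configuration")
        # A real transmitter would notify the receiver before switching.
        self.quadrants = proposed


link = TransmitDirection()
link.narrow([0, 1], "L1")   # full width -> half width
link.narrow([2], "L0s")     # half width -> quarter width
print(link.quadrants, link.width())
```

Keeping a per-quadrant state, rather than a single state for the whole link, is what allows quadrant 2 to be reactivated quickly while quadrants 0 and 1 remain in L1.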
If more bandwidth is required, the process is slightly more complicated. First, the transmitter will wake up its own circuitry, and also send out a wake-up signal to the receiver. However, because the wake-up is not instantaneous, the transmitter will have to wait for a predetermined and configurable period of time. Once this period has passed, and the receiver is guaranteed to be awake, then the transmitter can finally modulate to a wider link, and start transmitting data at the higher bandwidth.
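The ordering of the widening steps can be sketched as a timeline. The function name, the step descriptions and the use of a simple nanosecond counter are illustrative assumptions; only the sequence itself (wake locally, signal the receiver, wait out the configured period, then widen) comes from the description above.

```python
def widen_link(current_lanes: int, target_lanes: int,
               wake_wait_ns: int, now_ns: int = 0) -> list[tuple[int, str]]:
    """Return the ordered (time_ns, action) steps a transmitter takes to widen a link."""
    if target_lanes not in (5, 10, 20) or target_lanes <= current_lanes:
        raise ValueError("target must be a supported width wider than the current one")
    ready_ns = now_ns + wake_wait_ns  # the receiver is assumed awake after the wait
    return [
        (now_ns, "wake local transmitter circuitry"),
        (now_ns, "send wake-up signal to receiver"),
        (ready_ns, f"modulate from {current_lanes} to {target_lanes} lanes"),
        (ready_ns, "resume transmission at the wider width"),
    ]


for when, action in widen_link(5, 10, wake_wait_ns=20):
    print(f"t={when} ns: {action}")
```

The fixed wait stands in for an explicit handshake: the transmitter never widens before the configured period has elapsed, so it does not need confirmation that the receiver has woken up.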
Most of the previously discussed power saving techniques are highly dynamic and difficult to predict. This means that engineers will naturally have to build in substantial guard-banding to guarantee correct operation. However, CSI also offers deterministic thermal throttling. When a CSI agent reaches a thermal stress point, such as exceeding TDP for a period of time, or exceeding a specific temperature on-die, the overheating agent will send a thermal management request to the other agents that it is connected to via CSI. The thermal management request typically includes a specified acknowledgement window and a sleep timer (these could be programmed into the BIOS, or dynamically set by the overheating agent). If the other agent responds affirmatively within the acknowledgement window, then both sides of the link will shut down for the specified sleep time. Using an acknowledgement window ensures that the other agent has the flexibility to finish in-flight transactions before de-activating the CSI link.
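The thermal throttling handshake can be sketched as follows. All names here are hypothetical, and the behavior when no acknowledgement arrives in time (the link simply stays active) is an assumption, since the description above does not cover that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThermalRequest:
    ack_window_ns: int  # how long the peer has to acknowledge
    sleep_ns: int       # how long both sides shut down if it does


def link_sleep_time(req: ThermalRequest, peer_ack_at_ns: Optional[int]) -> int:
    """Return how long the link sleeps, or 0 if it stays active.

    peer_ack_at_ns is the time, measured from the request, at which the peer
    acknowledged; None means no affirmative response was received.
    """
    if peer_ack_at_ns is not None and 0 <= peer_ack_at_ns <= req.ack_window_ns:
        return req.sleep_ns
    # Assumption: without a timely acknowledgement the link is left running.
    return 0


request = ThermalRequest(ack_window_ns=1_000, sleep_ns=50_000)
print(link_sleep_time(request, peer_ack_at_ns=400))    # acknowledged in time
print(link_sleep_time(request, peer_ack_at_ns=2_500))  # acknowledged too late
print(link_sleep_time(request, peer_ack_at_ns=None))   # no acknowledgement
```

Because the sleep duration is fixed by the request rather than by traffic conditions, the throttling period is known in advance, which is what makes this mechanism deterministic.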