Estimating the FO4 depth of the critical path: Part 1
In figure A6, we use a conservative estimate of the circuit for the barrel shifter. We show that the critical path of the circuit begins with a logic depth of 3 FO4 for the buffering of the control signal. We then add a logic depth of 3 FO4 per node, and the result is 18 FO4 delays through the shifter, in addition to the delays through the worst-case signal propagation path.
Estimating the FO4 depth of the critical path: Part 2
In figure A7, we use a slightly more aggressive estimate of the circuit for the barrel shifter. Here we use muxes with 2 FO4 delays instead of 3. We also increase the depth of the inverter buffering, since the alternative mux implementation shown in figure A4 requires both the control signal and its inverse for each mux. As a result, the critical path of the circuit begins with a 4 FO4 delay for the buffering of the control signal. We then add a delay of 2 FO4 per node, and the result is 14 FO4 delays through the shifter, in addition to the delays through the worst-case signal propagation path. As part of the optimization attempt, we also inverted the ordering of the shift stages, since the longest wire path shown is between stages 3 and 4. With the stage order inverted, the worst-case delay through the longest wire may be partially hidden behind the delay of the 4 FO4 deep control buffering circuitry.
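The arithmetic behind the two estimates can be checked with a short sketch. It assumes five mux nodes on the critical path, a count inferred from the quoted totals (consistent with a 32-bit logarithmic shifter) rather than stated in the text, and it counts logic depth only, ignoring wire delay. The function and parameter names are hypothetical.

```python
def shifter_fo4_depth(stages, buffer_fo4, mux_fo4):
    """Logic-only FO4 depth: control buffering plus one mux per stage."""
    return buffer_fo4 + stages * mux_fo4


# Assumption: 5 mux stages (log2 of 32), inferred from the quoted totals.
STAGES = 5

# Part 1: 3 FO4 of control buffering, 3 FO4 per mux node.
conservative = shifter_fo4_depth(STAGES, buffer_fo4=3, mux_fo4=3)

# Part 2: 4 FO4 of control buffering, 2 FO4 per mux node.
aggressive = shifter_fo4_depth(STAGES, buffer_fo4=4, mux_fo4=2)

print(conservative, aggressive)  # prints "18 14"
```

Under these assumptions, the extra FO4 of control buffering in the second estimate is paid once, while the saving of 1 FO4 per mux is collected at every stage.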
This is just the left shift, what about the right shift?
Astute readers may note that we have obtained rough first-order approximations for the logic depth of a barrel shifter, but that the circuit only performs a left shift. The right shift operation needs to be accounted for as well. In figure A8, we implement it simply by duplicating the shifting array and reversing the direction of the copy. A final level of mux is added at the end to select either the result of the left shift or the result of the right shift.
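A behavioral sketch of the duplicated-array arrangement may make the structure easier to follow. It is a functional model, not a description of the circuit in figure A8: it assumes a 32-bit datapath, logical shifts with zero fill, and a logarithmic array in which stage k shifts by 2**k. All names are hypothetical.

```python
WIDTH = 32                 # assumed datapath width
MASK = (1 << WIDTH) - 1


def shift_array(value, amount, left):
    """Logarithmic shifter: stage k shifts by 2**k when bit k of amount is set."""
    stages = WIDTH.bit_length() - 1   # 5 stages for a 32-bit datapath
    for k in range(stages):
        if (amount >> k) & 1:         # control bit k drives the muxes of stage k
            step = 1 << k
            value = (value << step) & MASK if left else value >> step
    return value


def dual_array_shift(value, amount, shift_right):
    """Both arrays compute in parallel; a final mux picks one result."""
    left_result = shift_array(value, amount, left=True)
    right_result = shift_array(value, amount, left=False)
    return right_result if shift_right else left_result


print(hex(dual_array_shift(0x80000001, 4, shift_right=False)))  # 0x10
print(hex(dual_array_shift(0x80000001, 4, shift_right=True)))   # 0x8000000
```

In this model the final selection corresponds to the one additional mux level on the critical path, at the cost of a second full shifting array.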
Is there a better way to do it?
Depending on the definition of “better”, we may be able to re-use a single barrel shifting circuit for both the left shift and the right shift operation. All that is required is to take advantage of the fact that a left shift of the bit vector a_m is the same operation as a right shift of the bit vector a_n when n and m index the same range in opposite order. In figure A9, we show an abstract implementation of a circuit that can reverse the bit ordering of the input vector as well as the output vector. In this manner, the same circuit may be used for both the right shift and the left shift. This circuit adds two mux delays to the logic depth of the simple barrel shifter. More problematic, however, may be the delay of the interconnects used to reverse the ordering of the bit vectors. Furthermore, the routing for the bit-reordering interconnects may occupy significant die area, and thus partly vitiate the benefit of sharing the shifter array itself.
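The reversal identity can be illustrated with a small functional sketch: reversing the bit order, shifting left, and reversing again gives the same result as a right shift. The sketch assumes a 32-bit vector and logical shifts with zero fill, and the helper names are hypothetical. In hardware the reversal is pure wiring plus a mux at the input and at the output, so the loop below only models the behavior.

```python
WIDTH = 32                 # assumed datapath width
MASK = (1 << WIDTH) - 1


def reverse_bits(value):
    """Move bit i to position WIDTH - 1 - i (pure wiring in hardware)."""
    result = 0
    for i in range(WIDTH):
        if (value >> i) & 1:
            result |= 1 << (WIDTH - 1 - i)
    return result


def left_shift(value, amount):
    """Stand-in for the single shared left-shift array."""
    return (value << amount) & MASK


def shared_array_shift(value, amount, shift_right):
    if shift_right:
        value = reverse_bits(value)    # input mux selects the reversed wiring
    value = left_shift(value, amount)
    if shift_right:
        value = reverse_bits(value)    # output mux restores the bit order
    return value


# The shared array reproduces an ordinary logical right shift.
sample = 0xDEADBEEF
for n in range(WIDTH):
    assert shared_array_shift(sample, n, shift_right=True) == sample >> n
```

The two conditional reversals correspond to the two extra mux levels mentioned above; the model says nothing about the wire delay or area of the reversing interconnect, which is where the real cost may lie.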