Bundles and Groups
The IA64 instruction set architecture can best be described as a variable-length VLIW design. Its instruction stream is organized logically into instruction groups and physically into instruction bundles. An instruction group is a collection of IA64 instructions that meets a set of requirements ensuring the instructions are free of inter-instruction dependencies that would prevent an IA64 processor from executing them in parallel. It is up to the compiler to ensure that every instruction group meets these requirements. Failure to do so results in illegal code that produces undefined results when executed by hardware. The main rule is that no instruction in a group can write to a general-purpose register that is specified as an input operand to a later instruction in the group (i.e., no read-after-write dependencies are permitted).
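The main rule above can be sketched as a small checker. This is only an illustration: the `Instr` record and the `find_raw_violations` function are hypothetical names, each instruction is simplified to at most one destination register, and the real architecture has further requirements and exceptions that are not modeled here.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical, simplified instruction: one optional destination GPR."""
    op: str
    dest: Optional[int]      # GPR number written, or None
    srcs: Tuple[int, ...]    # GPR numbers read


def find_raw_violations(group: List[Instr]) -> List[Tuple[int, int, int]]:
    """Return (writer_index, reader_index, register) for each RAW hazard."""
    violations = []
    written = {}  # register -> index of the first instruction that wrote it
    for i, ins in enumerate(group):
        # A source written by an EARLIER instruction in the group is a hazard.
        for reg in ins.srcs:
            if reg in written:
                violations.append((written[reg], i, reg))
        # Record the write only after checking the reads, so an instruction
        # that reads and writes the same register does not flag itself.
        if ins.dest is not None:
            written.setdefault(ins.dest, i)
    return violations


illegal = [Instr("add", 3, (1, 2)), Instr("sub", 5, (3, 4))]  # sub reads r3
legal = [Instr("add", 3, (1, 2)), Instr("sub", 5, (1, 4))]
print(find_raw_violations(illegal))  # [(0, 1, 3)]
print(find_raw_violations(legal))    # []
```

A compiler would place a stop between the two instructions of the first example so that they fall into different groups.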
Each IA64 instruction is 41 bits long. Three instructions are combined with a 5-bit tag, or template field, to form a 128-bit instruction bundle. The most important concept in the IA64 ISA is that instruction groups do not have a fixed mapping to instruction bundles. A single bundle can include instructions from two groups, while a single group can be spread over multiple bundles. The IA64 instruction set is divided into six basic types, A, I, M, F, B, and L+X, which are described in Table 1.
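Table 1 IA64 Instruction Types
Instruction Type    Description                    Execution Unit Type
A                   Integer ALU                    I-unit or M-unit
I                   Non-ALU integer                I-unit
M                   Memory                         M-unit
F                   Floating point                 F-unit
B                   Branch                         B-unit
L+X                 Extended (64 bit immediate)    I-unit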
The IA64 architecture also specifies four different classes of execution units: integer or I-unit, memory or M-unit, floating point or F-unit, and branch or B-unit. M-unit, F-unit, and B-unit execution units execute M, F, and B type instructions respectively. Things get a little more complicated with integer instructions, which are effectively divided into three types: A, I, and L+X. A type instructions include simple ALU-oriented instructions such as add, subtract, and bit-wise logical operations, and can be executed by either an I-unit or an M-unit. I type instructions are more complicated integer instructions that require complex execution resources, such as a multiplier or barrel shifter, and can only be executed by an I-unit. The L+X type instruction is an architectural kludge that allows two adjacent 41-bit instruction fields in a bundle to be combined to specify an instruction containing a 64-bit immediate value. The Itanium supports only one L+X instruction, the move long immediate, which writes a 64-bit immediate value into a general purpose register (GPR). Future IA64 processors will support long branch and call instructions using the L+X format. This will allow subroutine calling overhead to be reduced by eliminating the need to use linkage tables.
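The relationship between instruction types and execution units described above can be written down as a simple lookup. The names `UNITS_FOR_TYPE` and `can_execute` are hypothetical, and the L+X type is deliberately left out because it occupies two slots and the paragraph does not say which unit executes it.

```python
# Execution unit classes able to run each basic instruction type, per the text.
# L+X is omitted: it spans two slots and its unit is not stated above.
UNITS_FOR_TYPE = {
    "A": {"I", "M"},  # simple ALU operations: either an I-unit or an M-unit
    "I": {"I"},       # complex integer operations: I-unit only
    "M": {"M"},
    "F": {"F"},
    "B": {"B"},
}


def can_execute(instr_type: str, unit: str) -> bool:
    """True if an execution unit of class `unit` can run `instr_type`."""
    return unit in UNITS_FOR_TYPE[instr_type]


print(can_execute("A", "M"))  # True
print(can_execute("I", "M"))  # False
```

The flexibility of A type instructions is the practical point here: a simple add can be issued to whichever of the two unit classes happens to be free.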
As previously mentioned, an IA64 bundle consists of three 41-bit instructions and a 5-bit template field. The three instruction fields in a bundle are called slots and are named slot 0, slot 1, and slot 2. The template field has two separate purposes. The first is to encode the basic type of instruction present in each slot of the bundle. This feature allows re-use of opcode values across different instruction types. That is, two or more instructions of different basic type can map to the same 41-bit pattern and are distinguished only by their slot position and template value. With six instruction types and three slot positions there are far more possible combinations than can be encoded in a five-bit value. This imbalance is made worse because the template's second use is to indicate stops within the bundle. A stop is simply an indication of where one instruction group ends and the next begins. A five-bit template can distinguish between only 32 possible combinations of instruction types within slots and presence or absence of stops. The initial definition of IA64 defines 24 different template values and reserves the remaining eight for future use. The 24 defined IA64 bundle formats are shown in Figure 1.
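To make the bundle arithmetic concrete (3 x 41 + 5 = 128 bits), here is a minimal sketch that packs and unpacks a bundle held as a Python integer. The text does not give bit positions; the sketch assumes the layout documented in Intel's architecture manuals, with the template in the low five bits followed by slot 0, slot 1, and slot 2. The function names are hypothetical.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template: int, slot0: int, slot1: int, slot2: int) -> int:
    """Combine a 5-bit template and three 41-bit slots into a 128-bit value."""
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    bundle = template
    for i, slot in enumerate((slot0, slot1, slot2)):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError(f"slot {i} must fit in 41 bits")
        # Assumed layout: slot i starts right after the template and
        # the i earlier slots.
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle: int):
    """Split a 128-bit value into (template, (slot0, slot1, slot2))."""
    template = bundle & TEMPLATE_MASK
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
        for i in range(3)
    )
    return template, slots


b = pack_bundle(0x10, 1, 2, 3)
print(unpack_bundle(b))  # (16, (1, 2, 3))
```

Because the same 41-bit pattern can mean different instructions, a real decoder would have to consult the template before interpreting any slot; this sketch only separates the fields.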
Figure 1 Defined IA64 Instruction Bundle Formats
Notice that the 24 defined formats are really 12 unique formats, each available with and without a stop at the end of the bundle (stops are shown as a thick black vertical bar between or after instruction slots). The defined template values are encoded such that the least significant bit of the template code indicates the presence or absence of a stop at the end of the bundle. Two pairs of bundle formats also include a stop inside the bundle: formats 2 and 3 have a stop between slot 1 and slot 2, and formats 10 and 11 have a stop between slot 0 and slot 1. If an instruction group does not encompass a bundle with format 2, 3, 10, or 11, then it consists of one or more complete bundles.
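The stop rules just described are enough to split a sequence of bundles into instruction groups. The sketch below uses only those rules; the names `MID_BUNDLE_STOP`, `stops_after`, and `split_into_groups` are hypothetical, reserved template values are not rejected, and the fact that an L+X instruction occupies two slots is ignored.

```python
# Template value -> slot index after which a mid-bundle stop falls.
# Formats 2 and 3: stop between slot 1 and slot 2.
# Formats 10 and 11: stop between slot 0 and slot 1.
MID_BUNDLE_STOP = {2: 1, 3: 1, 10: 0, 11: 0}


def stops_after(template: int) -> set:
    """Return the slot indices (0..2) that are followed by a stop."""
    stops = set()
    if template in MID_BUNDLE_STOP:
        stops.add(MID_BUNDLE_STOP[template])
    if template & 1:  # least significant bit: stop at the end of the bundle
        stops.add(2)
    return stops


def split_into_groups(templates):
    """Group (bundle_index, slot) pairs according to the stops."""
    groups, current = [], []
    for bundle_index, template in enumerate(templates):
        stops = stops_after(template)
        for slot in range(3):
            current.append((bundle_index, slot))
            if slot in stops:
                groups.append(current)
                current = []
    if current:  # trailing instructions with no closing stop yet
        groups.append(current)
    return groups


for group in split_into_groups([0, 3]):
    print(group)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1)]
# [(1, 2)]
```

The example shows both properties mentioned earlier: the first group spans two bundles, and the second bundle holds instructions from two groups.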