Instruction-level parallelism in EPIC
Static Multiple Issue: The VLIW Approach
Very long instruction word (VLIW) refers to processor architectures
designed to take advantage of instruction-level parallelism (ILP).
Whereas conventional processors mostly allow programs to specify only
instructions that execute in sequence, a VLIW processor allows programs
to explicitly specify instructions that execute at the same time (that is,
in parallel).
Modern superscalar processors are
complex, power-hungry devices that present an antiquated view of processor
architecture to the programmer in the interests of backwards compatibility —
and do a lot of work to achieve high performance while maintaining this
illusion.
The alternative to superscalar is a VLIW architecture, but these have
traditionally been actively backwards-incompatible, with performance highly
dependent on the (frequently mediocre) abilities of the compiler. Neither
VLIW nor superscalar is a perfect architecture: each has its own set of
trade-offs. This report discusses the relative strengths and weaknesses of the
two, focusing on the benefits of VLIW and the closely related EPIC architecture
as used in Intel’s Itanium processor family. An introduction to the motivation
behind VLIW is given, VLIW and EPIC are discussed in detail, and then two case
studies are presented: the Analog Devices SHARC family of DSPs, demonstrating
the VLIW influences present in a modern DSP; and Intel’s Itanium processor
family, which is to date the only implementation of EPIC.
VLIWs use multiple, independent functional units. Rather than attempting to
issue multiple, independent instructions to the units, a VLIW packages the
multiple operations into one very long instruction, or requires that the
instructions in the issue packet satisfy the same constraints. We will assume
that multiple operations are placed in one instruction, as in the original
VLIW approach. Since the burden of choosing the instructions to be issued
simultaneously falls on the compiler, the hardware that a superscalar needs
to make these issue decisions is unnecessary. This advantage of a VLIW grows
with the maximum issue rate; for simple two-issue processors, the overhead of
a superscalar is probably minimal. Because VLIW approaches make the most sense
for wider processors, we focus our example on such an architecture.
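To make the compiler's burden concrete, here is a minimal sketch of how a compiler might choose which operations to issue together. It is only an illustration, not any real compiler's algorithm: the names `Op`, `SLOTS`, and `schedule` are hypothetical, the slot mix (one integer or branch, two floating-point, two memory) is an assumed format, and every operation is assumed to complete in one cycle.

```python
from dataclasses import dataclass

# Assumed slot mix per long instruction (hypothetical, for illustration only).
SLOTS = {"int": 1, "fp": 2, "mem": 2}


@dataclass(frozen=True)
class Op:
    name: str
    unit: str          # "int", "fp", or "mem"
    deps: tuple = ()   # names of operations whose results this one needs


def schedule(ops):
    """Greedily pack operations into long instructions (bundles).

    An operation may be placed in a bundle only if every operation it
    depends on was issued in an earlier bundle (single-cycle latency is
    assumed) and a slot for its functional unit is still free.
    """
    issued = {}                # operation name -> index of its bundle
    bundles = []
    pending = list(ops)
    while pending:
        free = dict(SLOTS)     # slots available in the bundle being built
        bundle = []
        for op in pending:
            ready = all(d in issued for d in op.deps)
            if ready and free[op.unit] > 0:
                free[op.unit] -= 1
                bundle.append(op)
        if not bundle:
            raise ValueError("dependence on an unknown or cyclic operation")
        for op in bundle:
            issued[op.name] = len(bundles)
        pending = [op for op in pending if op.name not in issued]
        bundles.append([op.name for op in bundle])
    return bundles


if __name__ == "__main__":
    program = [
        Op("ld_a", "mem"),
        Op("ld_b", "mem"),
        Op("ld_c", "mem"),
        Op("mul", "fp", ("ld_a", "ld_b")),
        Op("add", "fp", ("mul", "ld_c")),
        Op("st", "mem", ("add",)),
        Op("inc", "int"),
    ]
    for cycle, bundle in enumerate(schedule(program)):
        print(cycle, bundle)
```

In this sketch the seven operations fit into four long instructions, and the later ones leave most slots empty. That illustrates why performance depends so heavily on how much independent work the compiler can find.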
For example, a VLIW processor might have instructions that contain five
operations: one integer operation (which could also be a branch), two
floating-point operations, and two memory references.
The instruction would have a set of fields for each functional unit, perhaps
16 to 24 bits per unit, yielding an instruction length of between 80 and 120
bits.
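The instruction length is simply the number of operation slots times the bits per slot. The sketch below shows that arithmetic and one way the per-unit fields could be concatenated into a single long word. The slot names, their ordering, and the `pack`/`unpack` helpers are assumptions made for illustration; they do not describe any real VLIW encoding.

```python
# Hypothetical slot order: one integer/branch, two floating-point, two memory.
SLOT_NAMES = ("int", "fp0", "fp1", "mem0", "mem1")


def instruction_width(bits_per_slot, slots=len(SLOT_NAMES)):
    """Total bits occupied by the per-unit fields of one long instruction."""
    return bits_per_slot * slots


def pack(fields, bits_per_slot):
    """Concatenate one field per slot; the first slot lands in the top bits."""
    if len(fields) != len(SLOT_NAMES):
        raise ValueError("expected one field per slot")
    word = 0
    for value in fields:
        if not 0 <= value < (1 << bits_per_slot):
            raise ValueError("field does not fit in its slot")
        word = (word << bits_per_slot) | value
    return word


def unpack(word, bits_per_slot):
    """Split a long instruction word back into its per-slot fields."""
    mask = (1 << bits_per_slot) - 1
    n = len(SLOT_NAMES)
    return [(word >> (bits_per_slot * (n - 1 - i))) & mask for i in range(n)]


if __name__ == "__main__":
    print(instruction_width(16), instruction_width(24))
    word = pack([1, 2, 3, 4, 5], bits_per_slot=16)
    print(dict(zip(SLOT_NAMES, unpack(word, 16))))
```

With five slots, 16-bit fields give 80 bits and 24-bit fields give 120 bits. A seven-slot format at the same field widths would give 112 to 168 bits, as `instruction_width(16, slots=7)` and `instruction_width(24, slots=7)` confirm.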