Countdown to IA-64

Wanted: Brilliant Compiler, Ideally Able to Predict Future

The answer to that question doesn’t lie within the realm of chip design (although the opportunity for failure on that front is always present), but in software. In particular, it lies in the arcane realm of compiler algorithm design and implementation. Unlike a compiler for a CISC or even a RISC architecture, an IA-64 compiler bears the burden of replacing the dynamic scheduling that superscalar, out-of-order RISC and CISC processors perform in hardware with brilliant code generation at compile time. Imagine trying to plan your entire workday the night before, right down to the smallest detail: which route to take to work, when to take lunch, when to use the washroom, and so on. The next day you must execute this plan to the letter, regardless of circumstances. Your plan tells you to take the elevator to your office, but the elevator turns out to be broken. Sorry, but you didn’t think of that last night. The EPIC philosophy means you can’t decide to take the stairs instead. You must stand there and wait for the elevator to be repaired.

Well, it isn’t quite that bad, because the EPIC design philosophy has two ways of dealing with its inherent rigidity. One is explicit, compiler-driven speculation with recovery. When the compiler can’t be sure what will happen in a specific case, it makes a guess and generates code that runs optimally if that guess is correct. It also adds one or more instructions that check at run time whether the guess was in fact correct. If it was not, the code branches to a cleanup routine to recover from the incorrect assumption. Returning to my analogy, this is like adding a special note to your workday plan: assume the elevator is working, but check, and if it is not, backtrack and take the stairs instead.
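
To make the guess, check, and recover pattern concrete, here is a minimal sketch in Python. It is a toy model rather than IA-64 code: the function name, the dictionary standing in for memory, and the "multiply by two" work are all made up for illustration. The guess is that a store will not touch the location a load reads from, so the load (and the work that depends on it) can be moved ahead of the store.

```python
# Toy model of compiler-driven speculation with recovery.
# Everything here is hypothetical and for illustration only; it is not IA-64 code.

def run_speculative(memory, load_addr, store_addr, store_value):
    """Return (memory[load_addr] * 2 as seen after the store, recovery_taken)."""
    # Guess: the store below does not write to load_addr,
    # so the load and its dependent work are done early.
    guessed = memory[load_addr]
    result = guessed * 2

    # The store that the load was moved ahead of.
    memory[store_addr] = store_value

    # Run-time check: did the store invalidate the guess?
    recovery_taken = False
    if store_addr == load_addr:
        # Cleanup routine: redo the load and the dependent work.
        result = memory[load_addr] * 2
        recovery_taken = True
    return result, recovery_taken


# Guess correct: the fast path is used and no recovery is needed.
print(run_speculative({0: 5, 1: 7}, load_addr=0, store_addr=1, store_value=9))
# Guess wrong: the check fails and the cleanup path recomputes the result.
print(run_speculative({0: 5, 1: 7}, load_addr=0, store_addr=0, store_value=9))
```

The point of the sketch is the cost structure: the check is cheap and always runs, while the expensive cleanup path runs only when the guess fails.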

The second method of dealing with the difficulty of predicting the future rests on the observation that in most cases the future tends to repeat the past. Consider again the analogy of planning your workday in advance. You might take a trial run of your initial plan on a Sunday and make careful notes of what you encounter. Then you use those notes to help plan your real workday. You might note that the elevator was broken and that the front stairs get you to the office quicker than the back stairs. So you revise the plan on the assumption that the elevator will still be broken tomorrow and that you should take the front stairs instead.

This approach is the analog of run-time, profile-driven compiler optimization. It entails first compiling the program with best-guess defaults and with profile data collection enabled, then running the program on a representative set of data. The resulting run-time profile (essentially how many times each statement, subroutine, loop, etc. was executed) is fed to the compiler for a second compilation of the program, where it guides how to set branch hint flags, when to eliminate branches through predication, and when to introduce speculative code segments [7]. The key to a successful EPIC compiler is to use the architecture’s explicit speculation instructions and tools sparingly, only where it really counts. Otherwise the object code balloons in size, the instruction cache and TLB miss rates go up, and the processor spends a lot of instruction slots doing things that don’t advance the computational state. Another pitfall occurs when the run-time profile is generated with data that is not very representative of most uses of the application. Going back to my analogy, if you take a trial run of your workday on a Sunday, you may underestimate traffic and miss the fact that the main parking lot will be full before you arrive on a real workday. That will lead to a poor workday plan.
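
As a rough illustration of how a profile might steer those decisions, the following Python sketch picks a strategy for a single branch from its training-run counts. The class, the function, and especially the thresholds (90% bias, arms of at most 4 instructions, at least 100 samples) are assumptions invented for this example; they are not taken from any real IA-64 compiler.

```python
from dataclasses import dataclass

# Hypothetical sketch of profile-guided decisions for one branch.
# Names and thresholds are illustrative assumptions, not a real compiler's heuristics.

@dataclass
class BranchProfile:
    taken: int       # times the branch was taken in the training run
    not_taken: int   # times it fell through
    arm_length: int  # instructions in the longer arm of the branch


def choose_strategy(profile, bias_threshold=0.9, max_predicated_arm=4, min_samples=100):
    total = profile.taken + profile.not_taken
    if total < min_samples:
        # Rarely executed code: leave it alone rather than bloat the object code.
        return "default"
    taken_ratio = profile.taken / total
    if taken_ratio >= bias_threshold:
        return "hint-taken"
    if taken_ratio <= 1 - bias_threshold:
        return "hint-not-taken"
    # Hard-to-predict branch: predicate only if both arms are short enough
    # that executing both costs less than a misprediction would.
    if profile.arm_length <= max_predicated_arm:
        return "predicate"
    return "default"


print(choose_strategy(BranchProfile(taken=950, not_taken=50, arm_length=10)))
print(choose_strategy(BranchProfile(taken=500, not_taken=500, arm_length=3)))
print(choose_strategy(BranchProfile(taken=5, not_taken=5, arm_length=3)))
```

The `min_samples` cutoff reflects the "only where it really counts" rule: code that barely ran in the training run gets no special treatment. Note that the whole scheme is only as good as the counts fed into it, which is exactly the unrepresentative-profile pitfall.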
