For desktops, Barcelona will probably lead in multithreaded performance and in applications that depend strongly on high bandwidth, but single threaded workloads may slightly favor Intel’s designs. Note that dual core desktop processors are likely to remain a large portion of the product mix until the marginal cost of the additional cores is low or most applications can use 4 threads. A mobile variant of Barcelona will not be introduced until 2009, after Griffin. This is because many of the improvements in Barcelona are focused on server performance and may not have the right power/performance balance for notebooks.
Dual processor servers will be a mixed bag for AMD. Expect extremely strong performance for almost all HPC-style workloads – that is and will continue to be a strong suit for AMD’s architecture because of the highly integrated system design and copious bandwidth. However, for commercial workloads like file or web serving or transaction processing, which don’t require as much bandwidth, any performance gaps will be much smaller. For these workloads Barcelona will certainly be competitive and will exceed Intel’s performance on some benchmarks, but Intel will likely retain a lead for other benchmarks. Performance for single processor servers will generally be similar – but the advantages from AMD’s system architecture will be smaller, hence the multithreaded performance will probably be even closer than for dual processor servers.
One area where AMD should have a slight edge is dual processor platform power consumption, due to differences in the memory systems. AMD uses DDR2 DIMMs, which consume 3-5W each, while the FB-DIMMs that Intel’s systems use consume 5W more than a normal DDR2 DIMM. The size of AMD’s power advantage will depend on the configuration of each individual server. As more memory is added, AMD’s advantage will grow. However, as other components are added, AMD’s relative advantage will diminish; for example, in systems with 8 or more disks, the differences in memory systems may be lost in the noise. This difference in memory architectures applies to dual processor FB-DIMM based servers only, since Intel’s single processor systems use DDR2 DIMMs and some dual processor servers may use regular DDR2 DIMMs. For those servers where Intel uses regular DDR2, AMD will not have any significant power advantage.
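The arithmetic behind this argument can be sketched in a few lines of Python. The only figures taken from the text are the 3-5W per DDR2 DIMM and the 5W FB-DIMM premium; the 4W midpoint, the function names, and the `other_watts` figure for the rest of the platform are assumptions made purely for illustration, not measurements.

```python
# Figures from the text: a DDR2 DIMM draws 3-5 W, an FB-DIMM draws 5 W more.
DDR2_WATTS_RANGE = (3.0, 5.0)
FBDIMM_EXTRA_WATTS = 5.0

# Assumption: use the midpoint of the quoted DDR2 range as a typical value.
DDR2_TYPICAL_WATTS = sum(DDR2_WATTS_RANGE) / 2


def memory_power_gap(num_dimms: int) -> float:
    """Extra memory power (W) of an FB-DIMM system over a DDR2 system."""
    if num_dimms < 0:
        raise ValueError("num_dimms must be non-negative")
    return FBDIMM_EXTRA_WATTS * num_dimms


def relative_gap(num_dimms: int, other_watts: float,
                 ddr2_watts: float = DDR2_TYPICAL_WATTS) -> float:
    """Gap as a fraction of total FB-DIMM platform power.

    other_watts is a hypothetical figure for everything except memory
    (processors, disks, fans, power supply losses).
    """
    fbdimm_total = other_watts + num_dimms * (ddr2_watts + FBDIMM_EXTRA_WATTS)
    return memory_power_gap(num_dimms) / fbdimm_total
```

Under these assumptions, `memory_power_gap` grows linearly with the DIMM count, while `relative_gap` shrinks as `other_watts` rises, which is the sense in which the memory difference can be lost in the noise on a disk-heavy server.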
At the high end, MP servers should be a bright spot for AMD. The higher level of integration and the additional HyperTransport link in Barcelona will improve an already formidable system architecture that has Intel on the defensive. One open question is whether Barcelona will truly commoditize the market for large MP (8 socket) servers. The capability is there, but it is unclear whether OEMs will aggressively push a solution that does not address the inherent limits of a snooping cache coherency policy and lacks some of the RAS features that typical mid-range servers offer.
Barcelona is the first revision to AMD’s microarchitecture since 2003. Rather than starting from scratch, Barcelona builds on the previous generation and subtly improves almost every aspect of the design. In many ways, this mirrors the evolution of computer architecture – there are very few techniques that can give a large boost on a wide spectrum of applications. Instead, architects are turning to many smaller improvements, just as AMD has done with Barcelona. The only obvious trick left in the bag for AMD is multithreading, which could provide a big boost in a future microarchitecture, such as the K10. However, this style of conservative, consistent design has worked very well for AMD in the past.
Barcelona is a solid improvement across the board and should give AMD momentum across several key markets. The performance advantages will be decisive for HPC applications and MP servers, while other areas will be close in performance. Hence, there is quite a bit to look forward to in the near future with Barcelona’s debut and performance numbers. No matter what, AMD’s engineering teams deserve kudos for a solidly executed product.
Moore, Chuck. Redefining Performance Through System Balance. Spring Processor Forum, 2006.
Sander, Ben. Optimizing the Microprocessor for System-Level Performance. Fall Processor Forum, 2006.
Dorsey, J. et al. An Integrated Quad-Core Opteron Processor. International Solid-State Circuits Conference Technical Digest, February 2007.
Searles, S. An Integrated Quad-Core Opteron Processor. International Solid-State Circuits Conference, February 2007.
Conway, P. et al. The Opteron CMP NorthBridge Architecture, Now and in the Future. Hot Chips XVIII, August 2006.
Software Optimization Guide for AMD Family 10h Processors. May 2007.
de Vries, Hans. Understanding the Detailed Architecture of AMD’s 64 bit Core. Chip Architect, September 2003.