Silvermont, Intel’s Low Power Architecture

Silvermont is Intel’s first CPU core tailored for power-efficient applications such as smartphones, tablets, and microservers. The 22nm microarchitecture features updated instruction set extensions, full out-of-order execution with a tightly coupled L2 cache, aggressive power management, and a new high-performance SoC fabric. These enhancements deliver tremendous performance and frequency gains over the aging Atom core, putting Intel’s mobile strategy in a more competitive position.

Microservers must Specialize to Survive

The server market is at a potential inflection point, with a new breed of ARM-based microserver vendors challenging the status quo, particularly for cloud computing. We survey 20 modern processors to understand the options for alternative architectures. To achieve disruptive performance, microserver vendors must deeply specialize in particular workloads. However, there is a trade-off between differentiation and market breadth. As the handful of microserver startups is culled to one or two viable vendors, only the companies that deliver compelling advantages to significant markets will survive.

Intel’s Haswell CPU Microarchitecture

Intel’s Haswell CPU is the first core optimized for 22nm and includes a huge number of innovations for developers and users. New instructions for transactional memory, bit manipulation, full 256-bit integer SIMD, and floating-point multiply-accumulate are combined in a microarchitecture that essentially doubles computational throughput and cache bandwidth. Most importantly, the microarchitecture was designed for efficiency and extends Intel’s offerings down to 10W tablets, while maintaining leadership for notebooks, desktops, servers, and workstations.

Intel’s Near-Threshold Voltage Computing and Applications

Near-threshold voltage computing extends the voltage scaling associated with Moore’s Law and dramatically improves power and energy efficiency. The technology is superb for throughput, at the cost of latency, and best suited to Intel’s products for HPC and mobile graphics.

Haswell Transactional Memory Alternatives

We previously theorized that Intel’s TSX extensions in Haswell use the caches to provide transactional memory semantics. This article describes an alternative approach based on minimal changes to the CPU core, contrasts the advantages of the two techniques and discusses the expected implementation in Haswell.

ARM Goes 64-bit

The new ARMv8 architecture is classically British: a clean and elegant 64-bit instruction set, with compatibility for 32-bit software. The 64-bit mode eliminates many complicated and awkward features and will foster a larger and more diverse ARM ecosystem with new licensees and applications.

Computational Efficiency for CPUs and GPUs in 2012

New compute efficiency data shows GPUs with a clear edge over CPUs, but the gap is narrowing as CPUs adopt wide vectors (e.g. AVX). Surprisingly, a throughput CPU is the most energy efficient processor, offering hope for future architectures. Our data also shows some advantages of AMD’s Bulldozer, and the overhead associated with highly scalable server CPUs.

Sandy Bridge-EP Review

In our Sandy Bridge-EP and Romley platform review, we look at the performance and power efficiency gains for Intel’s latest server microprocessor on industry standard benchmarks including SPECcpu2006 and SPECpower_ssj2008. The results are impressive: Sandy Bridge-EP is clearly the best x86 server processor on the market, and Romley will be the platform of choice for the next two years.

Sandy Bridge-EP Launches

Sandy Bridge-EP is the first major overhaul for Intel servers since 2009, and nearly every aspect has been enhanced. The processor pairs 8 cores with a large last level cache, DDR3 memory controller, QPI 1.1, integrated PCI-E and power management. This article provides an overview of the major features, including new I/O optimization and power capping techniques, and discusses the expected impact.

Analysis of Haswell’s Transactional Memory

Intel’s upcoming Haswell microprocessors include transactional memory and hardware lock elision, exposed through the Transactional Synchronization Extensions (TSX). In this article, I discuss TSX and, based on my previous experience, predict the implementation details of Haswell’s transactional memory and its expected adoption across the industry.
