By: Robert Davide Graham (bigrobg.delete@this.gmail.com), August 12, 2011 12:02 pm
Room: Moderated Discussions
I'm trying to figure out how much raw compute there is on the chip (assuming I could find a way to program it). Your guide implies that it has roughly the same compute power as one Sandy Bridge core.
I'm concerned with theoretical maximum compute power. The Radeon has twice the theoretical compute power of the GeForce (for otherwise equivalent chips), but the GeForce has the better overall architecture, for example the flexibility of multithreading versus the limitations of VLIW. For a compute-centric application such as password cracking, though, those other benefits are wasted, so password cracking on a Radeon is twice as fast as on an equivalent GeForce.
The same is true for an Atom. Its SSE integer ops take a single clock cycle, and it has two SSE units. Therefore, a 1.6-GHz Atom core cracks passwords as fast as a 1.6-GHz Core2 core.
From your article, it seems that I can execute 6 instructions per clock (because there are 6 units), each of which operates on 4 data elements, at 0.85 GHz. In contrast, a Sandy Bridge CPU core can execute 2 instructions per clock, also on 4 data elements each, but at three times the clock speed, thus achieving the same overall compute throughput.
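To spell out the arithmetic I'm doing, here is a quick sketch in Python. The function name is just something I made up, and the unit counts and clocks are my reading of the article, so they may well be wrong. It only multiplies instructions per clock by data elements per instruction by clock speed; it says nothing about whether real code could sustain that rate.

```python
def peak_lane_ops(instr_per_clock, lanes_per_instr, clock_ghz):
    """Theoretical peak in billions of data-element operations per second.

    This is a paper number only: issue width x SIMD width x clock.
    """
    return instr_per_clock * lanes_per_instr * clock_ghz


# Figures as I understand them from the article (assumptions, not measurements).
gpu_clock_ghz = 0.85
cpu_clock_ghz = 3 * gpu_clock_ghz  # "three times the clock speed"

gpu_peak = peak_lane_ops(6, 4, gpu_clock_ghz)  # 6 units, 4 data elements each
cpu_peak = peak_lane_ops(2, 4, cpu_clock_ghz)  # 2 instructions/clock, 4 data elements each

print(f"GPU:      {gpu_peak:.1f} G element-ops/s")
print(f"CPU core: {cpu_peak:.1f} G element-ops/s")
```

Both products come out to about 20.4 billion element operations per second, which is why I say the two look equivalent on paper.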
Are these numbers correct?