By: David Kanter (dkanter.delete@this.realworldtech.com), August 12, 2011 12:25 pm
Room: Moderated Discussions
Robert David Graham (bigrobg@gmail.com) on 8/12/11 wrote:
---------------------------
>I'm trying to figure out the raw compute resources there are on the chip (assuming
>I could find a way to program it). Your guide implies that it has roughly the same
>compute power as one of the Sandy Bridge cores.
I'm not sure that's true; I'd need to do the math.
>I'm concerned with theoretical max compute power. The Radeon has twice the theoretical
>compute power of the GeForce (for otherwise equivalent chips), but the GeForce has
>the better overall architecture, such as the flexibility of multithreading vs. the
>limitations of VLIW. But for a compute-centric application like password-cracking
>(for example), those other benefits are wasted, so password cracking on a Radeon
>is twice as fast as on an (equivalent) GeForce.
Sure. It depends on your workload.
>The same is true for an Atom. Its SSE integer ops take a single clock cycle, and
>it has two SSE units. Therefore, a 1.6-GHz Atom core cracks passwords as fast as a 1.6-GHz Core2 core.
I'm not super familiar with Atom. Can it do a MULPx and an ADDPx every cycle? I would have expected only one of the two, plus a load.
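For what it's worth, here is the back-of-the-envelope math under the poster's claim. The dual-issue assumption is exactly what's in question above; this just shows what the number would be if it held:

```python
# Rough 32-bit SSE integer throughput sketch for a 1.6 GHz core,
# ASSUMING it sustains two 128-bit SSE integer ALU ops per cycle
# (the poster's claim; Atom may only manage one ALU op plus a load).
lanes_per_op = 128 // 32        # four 32-bit lanes per SSE op
ops_per_cycle = 2               # assumed dual SSE issue
clock_hz = 1.6e9                # 1.6 GHz

lanes_per_second = lanes_per_op * ops_per_cycle * clock_hz
print(lanes_per_second / 1e9, "G 32-bit ops/s")  # 12.8
```

If Atom can only issue one SSE ALU op per cycle, halve that figure, and the "as fast as a Core2" claim falls apart.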
>From your article, it seems that I can execute 6 instructions per clock (because
>there are 6 units), each of which operates on 4 data, times 0.85 GHz. In contrast,
>the Sandy Bridge CPU core can execute 2 instructions per clock, also 4 data, but
>at three times the clock speed, thus achieving the same overall compute throughput.
>
>Are these numbers correct?
Those numbers are not correct. I also can't tell whether you are interested in 32-bit or 64-bit data, which makes a pretty big difference.
There are 12 shader cores in the SNB GPU and 4 regular cores. However, the CPU runs about 3-4X faster.
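As a sketch of why the CPU side comes out ahead: take the 12 shader cores at 4-wide and ~0.85 GHz, and assume each can co-issue a multiply and an add per clock (an assumption for illustration, not a documented figure), against a 4-core SNB CPU with 256-bit AVX and separate add and multiply ports at ~3.4 GHz:

```python
# Peak single-precision FLOPs sketch (assumed figures, not measurements).
# GPU: 12 shader cores x 4 lanes x 2 flops/clock (assumed mul+add) x 0.85 GHz
gpu_gflops = 12 * 4 * 2 * 0.85   # = 81.6
# CPU: 4 cores x 8-wide AVX x 2 flops/clock (add port + mul port) x 3.4 GHz
cpu_gflops = 4 * 8 * 2 * 3.4     # = 217.6
print(gpu_gflops, cpu_gflops)
```

Even granting the GPU a multiply-add every clock, the CPU's wider vectors and ~4X clock advantage put it well ahead on paper.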
Anyway, to answer your questions, you'll need to be a bit more specific.
David