New CELL Article Online

Article: CELL Microprocessor III
By: Deadmeat, August 7, 2005 4:06 pm
Room: Moderated Discussions
>You may very well be correct here, that a thread is required to kick off each SPU.
>But I think you're making the assumption (mistake) that the thread doing the kicking
>is also a "work" thread, one that requires calculation.

The APU kick call is synchronous: when a CPU thread kicks an APU, the CPU thread blocks immediately and does nothing until the APU finishes the job and stops executing. A single CPU thread cannot kick two APUs at the same time. Why is this concept so hard to understand?
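To make the point concrete, here is a rough simulation in Python, not actual CELL code. `kick_apu`, `NUM_APUS`, and the thread names are made up for illustration. The barrier stands in for "all APUs busy at once": it only releases when `NUM_APUS` blocking kicks are in flight at the same time, which can only happen if each kick has its own CPU thread.

```python
import threading

NUM_APUS = 4  # hypothetical count, chosen only for this illustration

# Releases only when NUM_APUS kicks are in flight simultaneously.
all_running = threading.Barrier(NUM_APUS)


def kick_apu(apu_id, results):
    """Stand-in for a synchronous kick: returns only when the 'APU' is done."""
    all_running.wait(timeout=5)  # the calling CPU thread is blocked here
    results[apu_id] = threading.current_thread().name


def run_all():
    """Keep every simulated APU busy at once, one kicker thread per APU."""
    results = {}
    threads = [
        threading.Thread(target=kick_apu, args=(i, results), name=f"kicker-{i}")
        for i in range(NUM_APUS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

If a single thread tried to call `kick_apu` four times in a row, the first call would never see the other three arrive and would time out at the barrier.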

>Take the "job queue" model for instance:

You don't expect games to be programmed with the "job queue" model, do you?

> The kicking thread kicks and blocks forever
> while the SPU(s) sit(s) there handling orders delivered by one or more threads whose
>number is unrelated to the number of SPUs.

You have to understand that the so-called "job queue" layer is just another abstraction with its own threads. When the user thread hands a job to the "queue managing" thread, both are CPU threads and neither blocks the other. It is then the queue-managing thread's job to spawn more threads that kick the next available APU.

The CPU-thread spawning business is not gone; it is now done by somebody else instead of you, at the cost of efficiency.
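A minimal sketch of that layering, again a Python simulation rather than real CELL code. `run_on_apu`, `worker`, `process`, and `NUM_APUS` are hypothetical names, and the "job" is just a number to square. The user thread only puts jobs on a queue and is never blocked by a kick, but the layer underneath still owns one kicker thread per APU.

```python
import queue
import threading

NUM_APUS = 4  # hypothetical count, chosen only for this illustration


def run_on_apu(job):
    """Stand-in for a blocking kick; the 'work' here is squaring a number."""
    return job * job


def worker(jobs, results):
    """Kicker thread: pull a job, run it to completion, repeat."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut this kicker thread down
            break
        results.put((job, run_on_apu(job)))


def process(job_list):
    """Queue layer: hides the kicker threads from the caller."""
    jobs, results = queue.Queue(), queue.Queue()
    kickers = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(NUM_APUS)
    ]
    for t in kickers:
        t.start()
    for job in job_list:
        jobs.put(job)  # the user thread does not wait for the job to run
    for _ in kickers:
        jobs.put(None)
    for t in kickers:
        t.join()
    out = {}
    while not results.empty():
        job, value = results.get()
        out[job] = value
    return out, len(kickers)
```

The caller sees a simple `process(...)` call, yet the number of threads created inside is still tied to the number of APUs being driven.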

> For some applications this will be the most efficient model. For other apps, other models will be better.

Exactly. Games tend to demand the highest performance and are programmed at the lowest level, which means game developers are stuck with this "spawn thread and kick APU" business themselves.

> No, it assumes that appropriately written software will scale more directly in
> performance with both the performance and number of the coprocessor engines, and
> the environment they sit in.

The graphics pipeline from T&L onward is already well parallelized.

It is the CPU's job to handle the algorithms that do not easily parallelize, like physics, AI, and animation.

>>> Not because they expect it's so powerful that it will still be relevant in 2015,
>>but because it will scale like mad, far faster than conventional, homogenous "jack
>>of all, master of none" architectures.

It is already proven that system performance doesn't scale well past 2 CPUs.

IBM's own experience with CELL on the FFT algorithm already shows that the rate of scaling drops off significantly past a certain point; hence 8 APUs will give you only 4X the performance of a single APU.
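For a rough feel of what "8 APUs, 4X" implies, here is the arithmetic in Python. The 4X figure is the claim above; fitting it with Amdahl's law is purely an assumption for illustration, not something IBM's FFT results are stated to follow, and the function names are made up.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Amdahl's law: speedup on n_units when only part of the work parallelizes."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_units)


def fitted_parallel_fraction(speedup, n_units):
    """Invert Amdahl's law: the parallel fraction that yields this speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_units)


def parallel_efficiency(speedup, n_units):
    """Fraction of the ideal linear speedup actually achieved."""
    return speedup / n_units


p = fitted_parallel_fraction(4.0, 8)   # 6/7, about 0.857
efficiency = parallel_efficiency(4.0, 8)  # 0.5, half of each APU is wasted
ceiling = 1.0 / (1.0 - p)              # limit as n_units grows: about 7X
```

Under that assumed model, 4X on 8 APUs means each APU is only 50% utilized, and no number of additional APUs would push the speedup past roughly 7X.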

> Not everybody suffers from your lack of
>imagination. (I don't mean to be harsh, but your argument is ignoring lots of evidence that contradicts it.)

6 years is not a lot of life span to realize your imagination. PSX3 doesn't have a 10 year life span; it has 6 years just like PSX1 and PSX2.

>Your whole argument is analogous to saying, "multi-core x86 isn't scalable because
>single-threaded apps can't use the other core(s), and programming multi-threaded apps is hard."

Look at it from the perspective of people who do the coding for a living. Do you want an easy way to make a living or a hard way to make a living? If you are a soldier, do you want to be a warehouse clerk or a marine dodging bullets and bombs while eating sand in Iraq?
