By: Nicolas Capens (nicolas.capens.delete@this.gmail.com), March 7, 2011 3:28 pm
Room: Moderated Discussions
Hi Seni,
Seni (seniike@hotmail.com) on 2/24/11 wrote:
---------------------------
>> There's just no way a contemporary game using high quality setting is going
>>to run at decent framerates on this IGP.
>
>Well, it's an IGP. Current IGPs aren't about decent framerates. They are about
>getting crap framerates but still playable.
Exactly. And software rendering isn't too far behind. Make use of AVX, add FMA and gather/scatter, and replace the IGP with more CPU cores and you get adequate graphics while also having a CPU which can do way more than legacy rasterization.
>Hopefully they will catch up a little in the future, but I'm not expecting much from Intel just yet.
They can't catch up. AnandTech has tested aggressively overclocking the IGP and the performance didn't go up by much. It's limited by the socket's bandwidth. This also means that spending more die area on the IGP would be a waste. And increasing the bandwidth while keeping a low latency is expensive. So the IGP isn't going anywhere any time soon, while the CPU is catching up in effective throughput.
>>>They increased the GPU clock but not the memory bandwidth. This resulting in poor scaling.
>>>Increasing GPU clock but not bandwidth has always resulted in poor scaling. Nothing new here.
>>>How this shows anything about trends in future hardware is beyond me.
>>
>>It shows that the die area percentage for the IGP can't simply be increased in
>>the next generations to save it from its fate.
>
>I don't see why not, especially if memory bandwidth rises for unrelated reasons, as it has been doing for decades.
Bandwidth is only increasing slowly in comparison to throughput. A Core 2 Duo E8400 had 10.7 GB/s of bandwidth, and 48 GFLOPS. An i7-2600K has double the bandwidth but over four times the GFLOPS. And FMA, gather/scatter and replacing the IGP with CPU cores would dramatically increase the throughput again.
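Just to spell out the arithmetic (peak rates, assuming full SSE/AVX utilization): the E8400 does 3.0 GHz x 2 cores x 8 single-precision FLOPs per cycle (4-wide SSE mul + add) = 48 GFLOPS on 10.7 GB/s. The i7-2600K does 3.4 GHz x 4 cores x 16 FLOPs per cycle (8-wide AVX mul + add) = 217.6 GFLOPS on 21.3 GB/s. So the FLOPS available per byte of bandwidth roughly doubled in one generation, and FMA would double it again.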
>>Certain people here insist on comparing
>>theoretical compute density but fail to see the bigger picture
>
>But I'm not one of those certain people, so... strawman?
No, I merely added it as an additional observation, not in response to any of your arguments. When we agree, we agree.
>>The CPU is catching
>>up and unification gives it an area advantage, which can also be traded for reduced power consumption.
>
>Except that the best way to trade area for reduced power consumption is likely to de-unify.
>Specialized hardware is more power efficient and can be powered down completely
>when not in use. The only cost is area.
If power consumption were that important, we'd have specific GPUs for playing different games, not a continuing expansion of capabilities and unification of functionality. I also seriously doubt we're going to see physics ASICs ever again. People care a lot about the price and flexibility of their hardware.
So as the architectures and workloads converge closer, there's simply a point where unification gives you more than the sum of the parts. Power consumption determines how small the gap has to be, but it doesn't stop the convergence itself.
>>In particular it means that as the convergence reduces the gap, the IGP is the first to go.
>>
>>>>The CPU is catching up and clearly there's little point in investing more die space
>>>>into the IGP. So CPU manufacturers should look into using the die space for something
>>>>more interesting, or cut it out entirely.
>>>
>>>They cannot use the die area for something more interesting, since IGPs are maximally interesting.
>>
>>With all due respect that's as short-sighted as saying the architecture of an R300
>>is the maximally interesting use of 195 mm2 of silicon. Beyond low-end graphics,
>>the IGP's capabilities are pathetic so you can forget about GPGPU. There's a lot to be improved.
>
>See, the thing is, what's interesting and what's not is all opinion. You can't disprove an opinion.
>If I claim that IGPs are maximally interesting to me, well then you have no ability
>to dispute that, because only I know what is interesting to me.
True, it's all a matter of opinion. But let me just repeat that it would clearly have been foolish to consider an R300 maximally interesting several years ago, or any other architecture really. History tells us that innovation has never been at a standstill. NVIDIA and AMD can't stop innovating now either, or it would seriously hurt their business. That's hardly an opinion.
Heck, continuously scaling the same architecture is simply physically impossible. As the transistor budget increases exponentially, you need to compensate for the growing gap between bandwidth and logic density. The relative excess of transistors allows you to increase programmability, implement a memory hierarchy to take advantage of spatial and temporal locality, increase ILP, etc. Moore's Law has held true for over four decades, so any other proposition is a much bigger gamble.
If you want to stick to your opinion that things won't evolve any further, fine, but then the discussion ends here and we'll meet again in several years...
>>More interesting GPUs (programmability) and more interesting CPUs (throughput)
>>are released every major generation. So they're on a collision course. Replacing
>>the IGP with two more CPU cores makes the use of that die area a lot more interesting
>>because they can then be used in combination with the already existing CPU cores.
>
>On the contrary, putting more cores on the chip is boring since they are exactly
>like the existing cores. I mean, it's just a numerical increase. How much more boring can it get?
If I can run *any* software with reasonable efficiency on the die space occupied by the IGP, I call that more interesting than just the ability to run mediocre graphics. Also note it's not just a numerical increase in the number of cores, the cores themselves would also get FMA and gather/scatter to increase the effective throughput.
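To illustrate why gather matters so much, here's a minimal C++ sketch (my own illustration, not actual SwiftShader code) of what wide SIMD code has to do today for the indexed loads that texture sampling and table lookups need:

    #include <immintrin.h>

    // Gather 8 floats from 'base' at 8 arbitrary indices. Today this takes
    // 8 scalar loads plus a pack; a hardware gather instruction would feed
    // the 256-bit pipeline directly and remove the bottleneck.
    static inline __m256 gather8(const float* base, const int idx[8])
    {
        float tmp[8];
        for (int i = 0; i < 8; ++i)   // scalar fallback, one load per lane
            tmp[i] = base[idx[i]];
        return _mm256_loadu_ps(tmp);  // unaligned 256-bit load (AVX)
    }

Turn that loop into one instruction and every lookup-heavy inner loop gets a lot closer to its theoretical throughput.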
>If you buy loaves of bread at the store, and see that the manufacturer decided
>to put raisins in the bread, that might be interesting.
>If they just increased the size of the loaf by 1 slice, that's not interesting. That's a boring numerical increase.
>You might prefer it, if raisins aren't your thing. But that doesn't make it interesting.
With all due respect that's a terrible comparison. A heterogeneous architecture is more like a small loaf of bread and a pile of raisins next to it. A homogeneous architecture is actual raisin bread. The fact that this makes it a bigger loaf doesn't mean it's boring.
You can have the best of both worlds with an architecture which is latency and throughput optimized. This is a major benefit for mixed workloads, which would otherwise have to communicate between the heterogeneous cores or perform work on a suboptimal core for the task. This benefit is only getting bigger, because DLP is limited by Amdahl's Law and at the same time software is getting more complex.
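To put a rough number on that (purely hypothetical figures): if 90% of a workload is data-parallel, then by Amdahl's Law even infinitely wide DLP hardware caps the overall speedup at 1 / (1 - 0.9) = 10x. The remaining serial 10% dominates, and it runs best on a latency-optimized core, preferably one sharing a cache with the throughput work.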
>>>>I think I've clarified by now that eventually there's just 'computing'. A homogenous
>>>>CPU with FMA and gather/scatter can handle both latency-sensitive and high-throughput
>>>>workloads (and most applications are a complex mix of these). So the main thing separating the markets is price.
>>>
>>>There are many market segments separated by more than price, and your refusal to see them is baffling.
>>
>>Other than CPUs and GPUs, which are converging, I don't see what else could be
>>relevant to the discussion. So instead of implying I refuse to see them, why don't you show them?
>
>So you want a list of market segments, eh?
>Down from the top:
>
>-supercomputers
>-mainframes
>-workstations
>-database / transaction processing servers
>-web servers
>-cloud computing / consolidated servers
>-Gamer Desktops
>-DTR laptops
>-game consoles
>-Cheapskate/Corporate Desktops
>-mainstream & thin&light Laptops
>-Cheapskate/Corporate Laptops
>-netbooks
>-tablets
>-portable game consoles
>-smartphones
>-embedded / industrial / military
>-embedded automotive
>-embedded consumer appliances
>
>Let's look at some of the differences here.
>Supercomputer and workstation for example are mostly floating point. More specifically,
>double-precision floating point.
>Mainframes and transaction processing have very little floating point, and a heavy
>focus on reliability, branchy integer code, i/o, and throughput
>Gamer desktops, game consoles and so on are also heavily floating point, but it's all single-precision.
>For Corporate / Cheapskate Desktops, performance isn't a big issue, but what little
>performance is needed is all latency-sensitive integer code.
>For laptops and all other portable stuff, power efficiency is important.
>For military, industrial and automotive, reliable operation under harsh conditions is important.
>
>So you see, this stuff is all different, and it's not just about cheap vs. fast.
>There are other issues. Efficiency, reliability, suitedness to different workloads.
And despite all that, the very same CPU core designs used for desktops are also used for supercomputers, workstations, servers, laptops, military systems, etc. Likewise, the same GPU architectures are used for gaming and HPC.
You're underestimating the cost of differentiation, you're underestimating the value of flexibility, and you're overestimating the cost of unification. An organization buying a supercomputer doesn't want to be caught with its pants down when suddenly someone needs to run integer intensive code. The cost of adding an integer ALU next to a floating-point ALU (or having a unified design) is almost negligible compared to the potential value for customers.
Creating a new design is insanely expensive. Intel can bring a new generation of chips to market every two years, but it still takes multiple teams working in parallel and costs billions in R&D. Likewise, AMD had to buy ATI for $5.4 billion to be able to produce discrete GPUs and fuse an IGP into the CPU. So you'd better make sure your architectures are well balanced for multiple workloads.
And for a CPU+IGP the reliability of both components has to be the same so this doesn't change for a homogeneous architecture. Also, while some CPUs feature ECC support and such, that's not relevant to the unification of cores we're talking about (plus GPUs for the HPC market feature it too).
So seriously, the CPU and GPU core designs for these market segments are practically identical, leaving price as the biggest differentiator. And with the CPU and GPU converging, it's only a matter of time before we get one architecture to do all forms of computing. The IGP is the obvious first victim.
>>>>Again, most applications are a complex mix of workloads. But even if some tend
>>>
>>>No they aren't. It's not a complex mix.
>>
>>Have you ever taken a look at the various shaders used by a various games? They
>>differ significantly. And they would differ even more if the GPU wasn't so restrictive.
>>Like you said yourself, sometimes a texture access is replaced by procedural code
>>and sometimes arithmetic code is replaced by a texture lookup, to better match the
>>architecture. A more unified architecture would allow to use the operations they
>>intended to use, and get a net speedup.
>>
>>As a matter of fact that's what I've already observed for the past decade. In the
>>early days there were no dependent texture lookups so some effects were simply not
>>possible or there was usually just one workaround with acceptable performance.
>
>There was always multipass rendering, which is Turing-complete. And also you could
>always render a bitmap on the CPU and upload it as a texture. So no effects were ever not possible.
Don't be ridiculous. I know the GPU is Turing-complete, but when you have to use multipass to implement dependent texture lookups on architectures that don't natively support it, that definitely counts as a complex workload, for which a generic homogeneous architecture would be far better suited.
So stating that "no effects were ever not possible" may be true in itself, but it doesn't move the discussion forward. On the contrary, it shows just how limited GPUs were, and even today certain workloads may be possible in theory but are a no-go in practice.
>>This
>>and other limitations made every game look like a clone of other contemporary games
>
>And yet they weren't. Early 3d games do not, in fact, all look the same.
The artwork was different, but the graphics techniques were largely the same. With Direct3D 8 most surfaces simply had a base texture and a lightmap, and if they had a bump map it was the same effect based on the texbem functionality.
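For reference, texbem boils down to one fetch perturbing the coordinates of the next. A simplified C++ sketch (the texture names and pitches are hypothetical, and I'm ignoring texbem's 2x2 transform matrix):

    #include <cstdint>

    // The first fetch reads a signed (dU, dV) pair from the bump map; the
    // dependent fetch samples the environment map at the perturbed
    // coordinates. On a general-purpose core it's just two indexed loads.
    uint32_t texbemLike(const uint16_t* bump, const uint32_t* envmap,
                        int u, int v, int bumpPitch, int envPitch)
    {
        uint16_t dudv = bump[v * bumpPitch + u];
        int u1 = u + (int8_t)(dudv & 0xFF);    // perturb U
        int v1 = v + (int8_t)(dudv >> 8);      // perturb V
        return envmap[v1 * envPitch + u1];     // dependent lookup
    }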
>>(not in gameplay but in graphics), which started to change as the capabilities expanded.
>>The unification of vertex and pixel processing also allowed developers a lot more
>>freedom in the workloads,
>
>Actually it made very little difference. It was a big letdown.
And yet they never looked back. Sure, most existing games of that time didn't benefit all that much from unification, since they still attempted to balance the vertex and pixel workloads. But now that pretty much everyone has a unified graphics chip, games have a greater variety in vertex and pixel workloads. So you have to take more than just the instant benefit into account. Furthermore, geometry shaders, compute shaders, tessellation, etc. all greatly benefit from unification.
>You can't use it to increase the vertex count, because there is a bottleneck in vertex fetch & triangle setup.
Exposing another bottleneck is not an argument against unification. They fixed that with later generations.
Also note that a homogeneous CPU with gather/scatter would never have such a bottleneck in the first place.
>You can't use it to increase pixel detail by much, since the hardware vertex shaders
>were only like 10% of the total shader hardware, and thus exposing them to pixel
>workloads only give you about 10% speedup.
>The tradeoff is a little better on Fermi as there are multiple triangle setup units.
>But at the time when unified shading was introduced it was surprisingly ineffective.
Again, that wasn't the fault of unification itself. It's irrelevant how well it did or didn't work back then. The reality is that it's currently unthinkable to go back. Besides, it has only been four years since the first unified chips hit the market. That's a very fast return-on-investment.
>>between applications but also between scenes in the same
>>application. Floating-point textures and render buffers enabled deferred shading,
>
>Deferred shading is a mistake. All it gives you is performance, but it comes at
>the expense of breaking your AA algorithm, thereby costing more performance than it saves.
That was fixed with Direct3D 10.1 by allowing access to anti-aliased buffers. Besides, floating-point textures and buffers have more applications than just deferred shading.
>>which differs quite radically from previous approaches. And that's just the things
>>that immediately popped into mind, and I wasn't even just talking about graphics.
>>Other applications you may run on your system also have diverse and complex mixes of workloads.
>
>They have diverse behavior as a group, but each individual program is mostly homogenous, not a complex mix.
So? You still want to keep the data passed between them on-chip as much as possible. For instance, geometry shaders are a specific workload, but it's faster to run them concurrently with the other shaders so you don't consume bandwidth. So the chip is actually running a complex mix of shaders. And to dynamically balance vertex, geometry and pixel shaders, a unified architecture makes the most sense.
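Here's a toy C++ illustration of that balancing act (my own sketch, not any real driver; runShader is a hypothetical dispatch function):

    #include <queue>

    enum Stage { VERTEX, GEOMETRY, PIXEL };
    struct Task { Stage stage; /* payload omitted */ };

    void runShader(const Task& t);  // hypothetical: same ALUs, any stage

    // A unified core simply drains whichever stage has work, so the
    // vertex/geometry/pixel ratio adapts to the scene instead of being
    // fixed in silicon like on pre-unified hardware.
    void unifiedCore(std::queue<Task>& work)
    {
        while (!work.empty())
        {
            Task t = work.front();
            work.pop();
            runShader(t);
        }
    }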
>>There's really no denying that GPUs have become more programmable and more capable
>>at handling this melting pot. And with GPGPU and future graphics research hinting
>>at ray-tracing and micro-polygons and whatnot there doesn't appear to be an end
>>in sight. At the same time CPUs have increased their throughput using multi-core
>>and widening the vectors, and FMA and gather/scatter will make them even better
>>at running more than just latency sensitive workloads.
>
>This is circular reasoning. You say gather/scatter will catch on because convergence
>demands it. Then you say convergence will happen because of gather/scatter.
>It could just as well go the other way: lack of gather/scatter support makes convergence
>infeasible. Then non-convergence makes gather/scatter unnecessary.
It's not circular reasoning. Convergence happens regardless of gather/scatter. MORE convergence happens with gather/scatter support. Convergence isn't a goal in and by itself, it's just that an architecture which balances ILP, TLP and DLP is the most valuable. Unification is an added bonus which allows to bridge the final gap.
Also, this is what innovation is all about. You need sparks (AVX, FMA, gather/scatter) to ignite something bigger (a superior homogeneous architecture). And the industry doesn't leave innovation opportunities unused. The competition is fierce so you can't afford not to look at all the options. The idea for gather/scatter is out there, and the potential of it is being researched as we speak. All the ingredients are there; it has a relatively low hardware cost and it greatly benefits throughput workloads while also allowing vectorization of less trivially data parallel workloads. So it's just a matter of time.
>>So with both aiming to support
>>both types of workloads, a showdown is inevitable.
>
>I think it's quite possible the the showdown is evitable and will in fact never occur.
You're talking as if it would be a bad thing to have a homogeneous CPU which can also adequately handle graphics. There's no reason to fight it. Like I said, convergence isn't a goal in and by itself, but neither is avoiding convergence. CPUs get a big increase in transistor budget with every process node, and one of the ways to spend these transistors is by increasing throughput. Likewise, the GPU is running into DLP limits and has to spend a larger portion of its transistors on storage and ILP. This is merely an observation.
>>Note that there isn't any strict dividing line between them. You can't offload
>>every high-throughput task to the GPU and execute every low-latency task on the
>>CPU. Unless you have a massive amount of data parallelism to exploit it's faster
>>not to send things over to the GPU and just process things on the CPU. The communication
>>latency is too high and the bandwidth too low.
>
>The communication latency & bandwidth is an issue if you are communicating over PCI-E.
>It's a complete non-issue for CPU-IGPs though, where all the communication stays
>on-chip. Not that IGPs are up to task... yet.
It's not a complete non-issue at all. Communication latency and bandwidth between CPU cores and IGP cores is pretty bad.
The round-trip latency involves a lot more than just making an electrical signal cross a few centimeters and back. You have to format your data and issue a command to the API; the data gets copied and the command ends up in a queue; then when the IGP is ready it pulls in the data and starts processing; and when it's completely done it signals the CPU and the results have to travel back.
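In pseudo-C++ (all of the functions below are placeholders, not any real API), the full offload looks something like this, and every line adds latency:

    #include <cstddef>

    struct Buffer;                                 // placeholder type
    Buffer* allocBuffer(std::size_t bytes);
    void upload(Buffer* dst, const void* src, std::size_t bytes);
    void enqueueCompute(Buffer* in, Buffer* out);  // command -> queue
    void flush();                                  // driver submits the batch
    void waitForFence();                           // stall until the IGP is done
    void readback(void* dst, Buffer* src, std::size_t bytes);

    void offloadToIGP(const float* input, float* output, std::size_t n)
    {
        Buffer* in  = allocBuffer(n * sizeof(float));
        Buffer* out = allocBuffer(n * sizeof(float));
        upload(in, input, n * sizeof(float));   // format + copy the data
        enqueueCompute(in, out);                // sits in the command queue
        flush();
        waitForFence();                         // round-trip stall
        readback(output, out, n * sizeof(float));
    }

Compare that to two homogeneous cores sharing a cache line: nanoseconds instead of microseconds or worse.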
So letting the IGP take care of all throughput-heavy workloads is nowhere near free of issues. In fact GPGPU on the IGP is totally hopeless. Only one-way communication, like with graphics, is realistic. If Intel wants that die space to have any value beyond graphics, its only option is to equip the CPU cores with FMA and gather/scatter and replace the IGP with such cores. The data can then stay local for much longer, and even communication between the homogeneous cores can be much more fine-grained than communicating with a heterogeneous IGP.
>>It's better to unify things and keep
>>the data local. Likewise, the GPU shouldn't always leave latency sensitive tasks
>>to the CPU. Often it's faster to use a dumb brute-force approach to get the job
>>done without a round-trip to the CPU. But imagine a latency-optimized GPU which
>>doesn't have to waste computing resources so it becomes even faster.
>
>A latency-optimized GPU would suck.
That's an unfounded claim. There are significant differences in latency between various GPU architectures, yet their effective performance is competitive.
Non-aggressive latency optimizations don't hurt performance. They may lower the theoretical throughput, but they compensate with higher efficiency.
And for an IGP you get the area benefit of unification on top of that. So a homogeneous chip can have more latency optimizations, allowing it to function as a CPU as well.
>>Both are heading for the same thing.
>>
>>>>to be hugely latency-sensitive or high-throughput, there's no telling what application(s) you'll run tomorrow.
>>>
>>>Unless, of course, for some reason, you know what applications you'll run tomorrow.
>>>Like if it's your job or something.
>>>Like, for example, you are evaluating the choices of what to run right now, so as to buy the necessary software.
>>>Or maybe you bought the software and have no funds left for more software, so you
>>>will have to continue using the same software.
>>>Or maybe they're the same apps you're running right now, and that you've been running for the past ten years.
>>>You overestimate the uncertainty.
>>
>>I don't overestimate it. You're absolutely right that systems are always bought
>>with certain uses in mind, and usually it closely matches the actual uses. But that
>>doesn't mean we should underestimate it either. Someone with a 500 € budget may
>>not expect to be able to simulate the ocean currents, but he does expect to be able
>>to run the vast majority of desktop applications, including some that have yet to appear.
>
>Ordinary desktop applications (excluding games) have barely changed in the past
>fifteen years. I think it's reasonable to guess that they will barely change for the next fifteen as well.
>I know there have been some new applications since then, but I can't think of a
>single one for which performance matters.
>So if you buy a low end laptop now, and expect to run future applications on it, it will probably work fine.
It won't. Try running any modern application on a system from '96. It will be a horrible experience. Even an Atom CPU is many times faster than the CPUs we had back then, so whether you like it or not, the software has evolved a lot in fifteen years.
>>In particular, someone who currently opts for a CPU+IGP, would be better off with
>>a homogeneous CPU with gather/scatter in the not too distant future.
>>
>>>>>Ray-traced Super Mario sounds pretty awesome. I'd get it.
>>>>
>>>>Well, actually, due to the wildly varying latencies occuring in ray-tracing, you
>>>>need an architecture capable of coping with that optimally. A homogenous CPU with
>>>>gather/scatter can handle fine-grained dynamic balancing of the complex mix of workloads.
>>>
>>>There's not really a problem here with varying latencies. Ray tracing on GPUs works fine.
>>
>>This should sober you up: http://www.youtube.com/watch?v=4bITAdWvMXE
>
>He's saying that GPU raytracing is not quite as good as CPU raytracing. But then,
>he's doing the CPU raytracing on a twelve-core multi-socket workstation.
That's only two 6-core CPUs, and he's comparing it against a pair of expensive GPUs. Furthermore, once again the GPUs are worthless without a CPU. The driver needs a fairly powerful CPU. So when comparing the cost you can't look at the prices of the cards alone.
>Also it's not clear how old this comparison is and if anything has changed since
>then. It may be that the GPU is catching up, or already has caught up.
>All signs suggest that the GPU's ability to raytrace is improving at a high rate,
>so it may very well surpass the CPU soon if it hasn't already.
Keep in mind that the CPU is evolving as well. 6, 8 and 10-core CPUs with AVX will be launched later this year. So by the time the GPU can do all the complex ray-tracing, and do it faster than the 12-core system, it's up against a new generation of CPUs with much higher throughput.
The only way the GPU can catch up with that moving target is by investing more transistors into latency optimizations. But because this lowers the peak throughput, it can't possibly surpass the CPU for this workload. The CPU still has FMA and gather/scatter to greatly improve its effective performance. I don't know of any new technology that would give the GPU an advantage over the CPU.
>Also, you see he said nothing about variable latencies being a problem. Because they aren't.
>The actual problem seems to be that GPU can't cast enough rays per second to smooth out the sampling noise.
>It casts too many where they aren't needed and not enough where they are, since
>it lacks the flexibility to make a finer-grained decision about it.
It's probably a combination of things really. But the reality stays the same: The GPU would require CPU technology to catch up, and this also means it will never surpass it.
>>The GPU has lots of limitations and doesn't achieve the same quality as a CPU does
>>in a fixed amount of time. You may note that his 'solution' around 7:30 is a heterogenous
>>architecture, but that's just the current status quo. It fails to take into account
>>that the CPU's vector units got wider, and we still have FMA and gather/scatter to come.
>>
>
>It's true the CPU's vector units are getting wider, but the GPU is widening at an even faster rate.
No they're not. It's not just the vectors getting wider. Simultaneously the number of CPU cores increases, and FMA will also double the theoretical throughput.
GPUs on the other hand are losing compute density:
* Cypress XT: 2154 Mtransistors, 2720 GFLOPS, 320 VLIW5 cores, 850 MHz
* Cayman XT: 2640 Mtransistors, 2703 GFLOPS, 384 VLIW4 cores, 880 MHz
That's a drop from roughly 1.26 to 1.02 GFLOPS per million transistors in a single generation. The focus on getting wider is making way for other optimizations. So in terms of compute density, the CPU and GPU are converging.
>>In fact there are reasonably successful 'hybrid' ray-tracers but unsurprisingly
>>the developer's biggest complaint is the communication overhead between the CPU
>>and GPU. A CPU with FMA and gather/scatter offers the best of both world without
>>such overhead.
>
>So does a beefed-up on-chip IGP.
No, that still has a high communication overhead.
>>You could also make the GPU more latency optimized but that's a lot
>>more work and you still don't get the unification benefit. So the CPU is going to eventually win this debate.
>>
>>>>The casual gaming market is booming, and the potential of WebGL and Molehill is
>>>hard to overestimate. The web allows to reach a whole new market of people, who
>>>are hesitant to set foot in a game store or install Steam.
>>>>Mark my words; the next World of Warcraft hype may very well be an HTML5 or Flash game.
>>>>The emerging Smart TV market also consists of devices with weak graphics chips (shameless plug: http://gametree.tv/platform/).
>>>
>>>The casual gaming market is booming, but I suspect the casual gaming market has little to do with hardware at all.
>>>Casual gamers couldn't care less whether they have a 4-core or a 6-core.
>>
>>Then why would Intel even manufacture a mainstream quad-core CPU with an IGP?
>
>Because the hypothetical software-based graphics that you propose as an alternative does not yet exist.
>Most people buy an IGP because it is currently the cheapest way to get a bootable system with a screen.
That's not an obstacle at all. There would still be basic 2D graphics support. The 3D driver only gets loaded when the operating system starts up, just like what happens today with the IGP.
>Others buy an IGP because its low power consumption allows it to fit in a small laptop where a discrete GPU would not.
>People buy 4-core processors because they don't need 6-core processors and 4 is a bit cheaper.
>Also, this is a clearly superior one-for-one replacement of their previous popular
>product - dual-core & quad-core cpus with northbridge IGPs.
You can have a 6-core CPU without an IGP for the same price as a 4-core CPU with an IGP. For graphics workloads the cores can be downclocked to stay power efficient. Alternatively, you can save the price of the IGP entirely with a homogeneous 4-core CPU. Both are more interesting than today's offerings, once the homogeneous CPU delivers adequate 3D graphics. This merely requires FMA and gather/scatter.
>>The thing is, casual gamers aren't just casual gamers. They buy a 'multimedia'
>>system and expect a wide variety of applications to run smoothly.
>
>They buy a "cheap" system and don't know whether to expect anything will run smoothly or not.
>Then they find out that most stuff does run smoothly, except a few games. Then they blame the games.
>So they play the games that work instead of the ones that don't.
Indeed they buy a cheap system, but still the one with the best specs. So if you offer a homogeneous quad-core for the same price as a dual-core with IGP, that's a better deal. It also enables software developers to create applications which use all this extra CPU power.
You may question whether we need extra computing power, but that question has been asked many times before in history and every single time the answer was yes.
>>And with lots
>>of software development turning toward multi-core, it's no luxury to put more than
>>two CPU cores into a system, and beyond, to make it sufficiently future-proof.
>
>Future-proofing doesn't make a lot of sense these days.
>It's cheaper to buy a minimally adequate system now, and then buy a new one years
>later later when prices are lower if/when your requirements change.
It would be more future-proof, for the same price! A quad-core homogeneous CPU capable of adequate graphics is more future-proof than a dual-core CPU+IGP. It means that even when you're on a budget you can buy a system that will last longer than one with a heterogeneous computing solution.
>>>Also, Farmville may be comparable in size to WOW, but I doubt many WOW players
>>>are quitting to play Farmville instead. (Unfortunately, I have no statistics on this.)
>>>They will continue to play WOW for quite some time. And when they eventually stop
>>>it will be to play another WOW-like game, probably with better graphics.
>>
>>I'm not talking about Farmville. I'm talking about actual MMORPGs that are as challenging as World of Warcraft.
>
>Why aren't you talking about Farmville? You said an MMO comparable in scale to WOW. Farmville is it.
I didn't say an MMO comparable in scale to WOW. I said "the next World of Warcraft hype may very well be an HTML5 or Flash game".
But if you want to talk about Farmville, fine: It doesn't require 3D at all, and if a future version does then software rendering will still suffice.
>Also, "as challenging as WoW"? What? There's zero challenge to WoW. You kill
>monsters until you hit the level cap, and then you wait for the expansion to raise the level cap. That's it.
>All the instance raids and gear-seeking is optional. It is a challenge people
>take on themselves because the game has failed to provide one.
That's a matter of taste. Millions of players disagree with you. I'm not a big fan of it myself, I just mentioned it to illustrate the kind of graphics which is considered adequate by a significant portion of the market.
On a quad-core CPU, this game already runs well with SwiftShader. With AVX, FMA and gather/scatter, the efficiency would be much improved so the power consumption can be reduced. So there's no point in paying extra for an IGP, or sacrificing two cores to fit in an IGP.
>>As I've indicated before you can't have a 4 times faster IGP without also scaling
>>the memory bandwidth accordingly.
>
>But you CAN scale memory bandwidth accordingly. Both pincounts and memory clock
>frequencies have been rising at a good rate for quite some time now.
>The 4004 had 16 pins. A 386 has like 100. A modern CPU has around 1000. The number
>is growing and there is no reason to think it's about to suddenly stop.
I'm not saying it's about to stop. I'm saying it increases too slowly for the IGP to keep up with the CPU's increasing performance.
A mainstream Core i5 has twice the bandwidth of a mainstream Core 2 Duo back in 2006. So it took about five years to double the bandwidth. This means that by 2016 you can expect to see a mainstream CPU with an IGP that is twice as fast. But by that time we're at an 11 nm process meaning 8 cores are definitely mainstream, and FMA and gather/scatter are no challenge either. So this two times IGP would be up against a CPU which is well over twice as fast. And this assumes a heterogeneous architecture. A homogeneous architecture can have more than 8 cores at 11 nm.
The other option for the IGP is to add value by becoming more programmable. But this costs area and power. It converges the IGP closer to the CPU.
So the IGP can't outrun its fate.
>>That won't happen unless they put a lot more pins
>>under the chip,
>
>Or if they raise the memory clock. Or both. Or if they improve compression or something.
They'll increase both the pin count and the clock frequency, but again, only at a slow pace. Increasing the clock frequency isn't without issues. DDR4 won't go mainstream before 2015. And don't forget the CPU and IGP clock frequencies are going up slowly as well.
>>making it far more expensive.
>
>Except that the cost per pin decreases over time as packaging technology improves.
>So what's expensive now, won't be for long.
Again, the pin count will indeed increase, but too slowly for the IGP to outrun the advances in CPU performance.
>>Low-latency DDR bandwidth is a lot
>
>I don't think there's any such thing as low-latency DDR. All DRAM is high latency.
>Well, there's probably a product called "low-latency DDR" but the latency is still not low, just lower than the usual.
I was comparing system DRAM to graphics DRAM.
>>more expensive than GDDR bandwidth. So you shouldn't expect CPU bandwidth to go
>>up very fast, meaning IGPs remain in the low-end segment.
>
>I think bandwidth will go up fast, and some IGPs will move up from low end-toward the middle.
Based on what? For the bandwidth to increase any faster the consumer will have to pay extra. But there's no incentive for doing that. People who really want better graphics, buy a discrete graphics card. And the IGP is worthless for anything other than graphics.
So it's just going to remain the lowest common denominator. If its performance increases, it's only at the mercy of CPU bandwidth improvements and proportionate increases in transistor budget. The CPU benefits from these things equally, but on top of that it also benefits from FMA and gather/scatter.
>>Even if they pulled out all the stops to make it happen, it would be a big shame
>>if such an expensive chip (we're talking current server market segment here) is
>>mediocre at running Crysis 2 but gets beaten by a cheap multi-core CPU at every
>>other application. There's just no demand for such an abomination.
>
>What? What are you talking about, running Crysis on a server? That's not what servers are for.
If you want to increase the bandwidth beyond the slow current evolution, you need extra memory lanes. This brings us into server chip domain. The extra pins and wires make both the chip and the motherboard more expensive. But like I said, it would be ridiculous if such an expensive solution had poor CPU performance. So it's just not going to happen that the bandwidth is aggressively increased for the sake of making the IGP faster, without also scaling the CPU part. So the IGP isn't going to outrun it.
Cheers,
Nicolas
Seni (seniike@hotmail.com) on 2/24/11 wrote:
---------------------------
>> There's just no way a contemporary game using high quality setting is going
>>to run at decent framerates on this IGP.
>
>Well, it's an IGP. Current IGPs aren't about decent framerates. They are about
>getting crap framerates but still playable.
Exactly. And software rendering isn't too far behind. Make use of AVX, add FMA and gather/scatter, and replace the IGP with more CPU cores and you get adequate graphics while also having a CPU which can do way more than legacy rasterization.
>Hopefully they will catch up a little in the future, but I'm not expecting much from Intel just yet.
They can't catch up. AnandTech has tested aggressively overclocking the IGP and the performance didn't go up by much. It's limited by the socket's bandwidth. This also means that spending more die area on the IGP would be a waste. And increasing the bandwidth while keeping a low latency is expensive. So the IGP isn't going anywhere any time soon, while the CPU is catching up in effective throughput.
>>>They increased the GPU clock but not the memory bandwidth. This resulting in poor scaling.
>>>Increasing GPU clock but not bandwidth has always resulted in poor scaling. Nothing new here.
>>>How this shows anything about trends in future hardware is beyond me.
>>
>>It shows that the die area percentage for the IGP can't simply be increased in
>>the next generations to save it from its fate.
>
>I don't see why not, especially if memory bandwidth rises for unrelated reasons, as it has been doing for decades.
Bandwidth is only increasing slowly in comparison to throughput. A Core 2 Duo E8400 had 10.7 GB/s of bandwidth, and 48 GFLOPS. An i7-2600K has double the bandwidth but over four times the GFLOPS. And FMA, gather/scatter and replacing the IGP with CPU cores would dramatically increase the throughput again.
>>Certain people here insist on comparing
>>theoretical compute density but fail to see the bigger picture
>
>But I'm not one of those certain people, so... strawman?
No, I merely added it as an additional observation, not in response to any of your arguments. When we agree, we agree.
>>The CPU is catching
>>up and unification gives it an area advantage, which can also be traded for reduced power consumption.
>
>Except that the best way to trade area for reduced power consumption is likely to de-unify.
>Specialized hardware is more power efficient and can be powered down completely
>when not in use. The only cost is area.
If power consumption was that important we'd have specific GPUs for playing different games, not a continuing expansion of capabilities and unification of functionality. I also seriously doubt we're ever going to see physics ASICs ever again. People care a lot about the price and flexibility of their hardware.
So as the architectures and workloads converge closer, there's simply a point where unification gives you more than the sum of the parts. Power consumption determines how small the gap has to be, but it doesn't stop the convergence itself.
>>In particular it means that as the convergence reduces the gap, the IGP is the first to go.
>>
>>>>The CPU is catching up and clearly there's little point in investing more die space
>>>>into the IGP. So CPU manufacturers should look into using the die space for something
>>>>more interesting, or cut it out entirely.
>>>
>>>They cannot use the die area for something more interesting, since IGPs are maximally interesting.
>>
>>With all due respect that's as short-sighted as saying the architecture of an R300
>>is the maximally interesting use of 195 mm2 of silicon. Beyond low-end graphics,
>>the IGP's capabilities are pathetic so you can forget about GPGPU. There's a lot to be improved.
>
>See, the thing is, what's interesting and what's not is all opinion. You can't disprove an opinion.
>If I claim that IGPs are maximally interesting to me, well then you have no ability
>to dispute that, because only I know what is interesting to me.
True, it's all a matter of opinion. But let me just repeat that it would clearly have been foolish to consider an R300 maximally interesting several years ago, or any other architecture really. History tells us that innovation has never been at a standstill. NVIDIA and AMD can't stop innovating now either, or it would seriously hurt their business. That's hardly an opinion.
Heck, continuously scaling the same architecture is simply physically impossible. As the transistor budget increases exponentially, you need to compensate for the growing gap between bandwidth and logic density. The relative excess of transistors allow to increase programmability, implement a memory hierarchy to take advantage of spacial and temporal locality, increase ILP, etc. Moore's Law has held true for over four decades so any other proposition is a much bigger gamble.
If you want to stick to your opinion that things won't evolve any further, fine, but then the discussion ends here and we'll meet again in several years...
>>More interesting GPUs (programmability) and more interesting CPUs (throughput)
>>are released every major generation. So they're on a collision course. Replacing
>>the IGP with two more CPU cores makes the use of that die area a lot more interesting
>>because they can then be used in combination with the already existing CPU cores.
>
>On the contrary, putting more cores on the chip is boring since they are exactly
>like the existing cores. I mean, it's just a numerical increase. How much more boring can it get?
If I can run *any* software with reasonable efficiency on the die space occupied by the IGP, I call that more interesting than just the ability to run mediocre graphics. Also note it's not just a numerical increase in the number of cores, the cores themselves would also get FMA and gather/scatter to increase the effective throughput.
>If you buy loaves of bread at the store, and see that the manufacturer decided
>to put raisins in the bread, that might be interesting.
>If they just increased the size of the loaf by 1 slice, that's not interesting. That's a boring numerical increase.
>You might prefer it, if raisins aren't your thing. But that doesn't make it interesting.
With all due respect that's a terrible comparison. A heterogeneous architecture is more like a small loaf of bread and a pile of raisins next to it. A homogeneous architecture is actual raisin bread. The fact that this makes it a bigger loaf doesn't mean it's boring.
You can have the best of both worlds with an architecture which is latency and throughput optimized. This is a major benefit for mixed workloads, which would otherwise have to communicate between the heterogeneous cores or perform work on a suboptimal core for the task. This benefit is only getting bigger, because DLP is limited by Amdahl's Law and at the same time software is getting more complex.
>>>>I think I've clarified by now that eventually there's just 'computing'. A homogenous
>>>>CPU with FMA and gather/scatter can handle both latency-sensitive and high-throughput
>>>>workloads (and most applications are a complex mix of these). So the main thing separating the markets is price.
>>>
>>>There are many market segments separated by more than price, and your refusal to see them is baffling.
>>
>>Other than CPUs and GPUs, which are converging, I don't see what else could be
>>relevant to the discussion. So instead of implying I refuse to see them, why don't you show them?
>
>So you want a list of market segments, eh?
>Down from the top:
>
>-supercomputers
>-mainframes
>-workstations
>-database / transaction processing servers
>-web servers
>-cloud computing / consolidated servers
>-Gamer Desktops
>-DTR laptops
>-game consoles
>-Cheapskate/Corporate Desktops
>-mainstream & thin&light Laptops
>-Cheapskate/Corporate Laptops
>-netbooks
>-tablets
>-portable game consoles
>-smartphones
>-embedded / industrial / military
>-embedded automotive
>-embedded consumer appliances
>
>Let's look at some of the differences here.
>Supercomputer and workstation for example are mostly floating point. More specifically,
>double-precision floating point.
>Mainframes and transaction processing have very little floating point, and a heavy
>focus on reliability, branchy integer code, i/o, and throughput
>Gamer desktops, game consoles and so on are also heavily floating point, but it's all single-precision.
>For Corporate / Cheapskate Desktops, performance isn't a big issue, but what little
>performance is needed is all latency-sensitive integer code.
>For laptops and all other portable stuff, power efficiency is important.
>For military, industrial and automotive, reliable operation under harsh conditions is important.
>
>So you see, this stuff is all different, and it's not just about cheap vs. fast.
>There are other issues. Efficiency, reliability, suitedness to different workloads.
And despite all that the very same CPU core design used for desktops are also used for supercomputers, workstations, servers, laptops, military systems, etc. Likewise, the same GPU architectures are used for gaming and HPC.
You're underestimating the cost of differentiation, you're underestimating the value of flexibility, and you're overestimating the cost of unification. An organization buying a supercomputer doesn't want to be caught with its pants down when suddenly someone needs to run integer intensive code. The cost of adding an integer ALU next to a floating-point ALU (or having a unified design) is almost negligible compared to the potential value for customers.
Creating a new design is insanely expensive. Intel can bring a new generation of chips to the market ever two years, but it still takes multiple teams to work in parallel and costs billions in R&D. Likewise AMD had to buy ATI for 5.4 billion to be able to produce discrete GPUs and fuse an IGP into the CPU. So you'd better make sure your architectures are well balanced for multiple workloads.
And for a CPU+IGP the reliability of both components has to be the same so this doesn't change for a homogeneous architecture. Also, while some CPUs feature ECC support and such, that's not relevant to the unification of cores we're talking about (plus GPUs for the HPC market feature it too).
So seriously, the CPU and GPU core designs for these market segments are practically identical, leaving price as the biggest differentiator. And with the CPU and GPU converging, it's only a matter of time before we get one architecture to do all forms of computing. The IGP is the obvious first victim.
>>>>Again, most applications are a complex mix of workloads. But even if some tend
>>>
>>>No they aren't. It's not a complex mix.
>>
>>Have you ever taken a look at the various shaders used by a various games? They
>>differ significantly. And they would differ even more if the GPU wasn't so restrictive.
>>Like you said yourself, sometimes a texture access is replaced by procedural code
>>and sometimes arithmetic code is replaced by a texture lookup, to better match the
>>architecture. A more unified architecture would allow to use the operations they
>intended to use, and get a net speedup.
>>
>>As a matter of fact that's what I've already observed for the past decade. In the
>>early days there were no dependent texture lookups so some effects were simply not
>>possible or there was usually just one workaround with acceptable performance.
>
>There was always multipass rendering, which is Turing-complete. And also you could
>always render a bitmap on the CPU and upload it as a texture. So no effects were ever not possible.
Don't be ridiculous. I know the GPU is Turing-complete, but when you have to use multipass to implement dependent texture lookups on architectures that don't natively support it, that definitely counts as a complex workload, for which a generic homogeneous architecture would be far better suited.
So stating that "no effects were ever not possible" is true in itself but it's not helping you forward with anything relevant to the discussion. On the contrary, it shows just how limited GPUs were, and even today certain workloads may be possible in theory but are a no-go in practice.
>>This
>>and other limitations made every game look like a clone of other contemporary games
>
>And yet they weren't. Early 3d games do not, in fact, all look the same.
The artwork was different, but the graphics techniques were largely the same. With Direct3D 8 most surfaces simply had a base texture and a lightmap, and if they had a bump map it was the same effect based on the texbem functionality.
>>(not in gameplay but in graphics), which started to change as the capabilities expanded.
>>The unification of vertex and pixel processing also allowed developers a lot more
>>freedom in the workloads,
>
>Actually it made very little difference. It was a big letdown.
And yet they never looked back. Sure, most existing games of that time didn't benefit all that much from unification, since they still attempted to balance the vertex and pixel workloads. But now that pretty much everyone has a unified graphics chip, games have a greater variety in vertex and pixel workloads. So you have to take more than just the instant benefit into account. Furthermore, geometry shaders, computer shaders, tessellation, etc. it all greatly benefits from unification.
>You can't use it to increase the vertex count, because there is a bottleneck in vertex fetch & triangle setup.
Exposing another bottleneck is not an argument against unification. They fixed that with later generations.
Also note that a homogeneous CPU with gather/scatter would never have such a bottleneck in the first place.
>You can't use it to increase pixel detail by much, since the hardware vertex shaders
>were only like 10% of the total shader hardware, and thus exposing them to pixel
>workloads only give you about 10% speedup.
>The tradeoff is a little better on Fermi as there are multiple triangle setup units.
>But at the time when unified shading was introduced it was surprisingly ineffective.
Again, that wasn't the fault of unification itself. It's irrelevant how well it did or didn't work back then. The reality is that it's currently unthinkable to go back. Besides, it has only been four years since the first unified chips hit the market. That's a very fast return-on-investment.
>>between applications but also between scenes in the same
>>application. Floating-point textures and render buffers enabled deferred shading,
>
>Deferred shading is a mistake. All it gives you is performance, but it comes at
>the expense of breaking your AA algorithm, thereby costing more performance than it saves.
That was fixed with Direct3D 10.1 by allowing access to anti-aliased buffers. Besides, floating-point textures and buffers have more applications than just deferred shading.
>>which differs quite radically from previous approaches. And that's just the things
>>that immediately popped into mind, and I wasn't even just talking about graphics.
>>Other applications you may run on your system also have diverse and complex mixes of workloads.
>
>They have diverse behavior as a group, but each individual program is mostly homogenous, not a complex mix.
So? You still want to keep the data passed between them on-chip as much as possible. For instance geometry shaders are a specific workload but it's faster to run them concurrently with the other shaders so you don't consume bandwidth. So the chip is actually running a complex mix or shaders. And to dynamically balance vertex, geometry and pixel shaders a unified architecture makes most sense.
>>There's really no denying that GPUs have become more programmable and more capable
>>at handling this melting pot. And with GPGPU and future graphics research hinting
>>at ray-tracing and micro-polygons and whatnot there doesn't appear to be an end
>>in sight. At the same time CPUs have increased their throughput using multi-core
>>and widening the vectors, and FMA and gather/scatter will make them even better
>>at running more than just latency sensitive workloads.
>
>This is circular reasoning. You say gather/scatter will catch on because convergence
>demands it. Then you say convergence will happen because of gather/scatter.
>It could just as well go the other way: lack of gather/scatter support makes convergence
>infeasible. Then non-convergence makes gather/scatter unnecessary.
It's not circular reasoning. Convergence happens regardless of gather/scatter. MORE convergence happens with gather/scatter support. Convergence isn't a goal in and by itself, it's just that an architecture which balances ILP, TLP and DLP is the most valuable. Unification is an added bonus which allows to bridge the final gap.
Also, this is what innovation is all about. You need sparks (AVX, FMA, gather/scatter) to ignite something bigger (a superior homogeneous architecture). And the industry doesn't leave innovation opportunities unused. The competition is fierce so you can't afford not to look at all the options. The idea for gather/scatter is out there, and the potential of it is being researched as we speak. All the ingredients are there; it has a relatively low hardware cost and it greatly benefits throughput workloads while also allowing vectorization of less trivially data parallel workloads. So it's just a matter of time.
>>So with both aiming to support
>>both types of workloads, a showdown is inevitable.
>
>I think it's quite possible the the showdown is evitable and will in fact never occur.
You're talking as if it would be a bad thing to have a homogeneous CPU which can also adequately handle graphics. There's no reason to fight it. Like I said, convergence isn't a goal in and by itself, but neither is avoiding convergence. CPUs get a big increase in transistor budget with every process node, and one of the ways to spend these transistors is by increasing throughput. Likewise, the GPU is running into DLP limits and has to spend a larger portion of its transistors on storage and ILP. This is merely an observation.
>>Note that there isn't any strict dividing line between them. You can't offload
>>every high-throughput task to the GPU and execute every low-latency task on the
>>CPU. Unless you have a massive amount of data parallelism to exploit it's faster
>>not to send things over to the GPU and just process things on the CPU. The communication
>>latency is too high and the bandwidth too low.
>
>The communication latency & bandwidth is an issue if you are communicating over PCI-E.
>It's a complete non-issue for CPU-IGPs though, where all the communication stays
>on-chip. Not that IGPs are up to task... yet.
It's not a complete non-issue at all. Communication latency and bandwidth between CPU cores and IGP cores is pretty bad.
The round-trip latency involves a lot more than just making an electrical signal cross a few centimeter and back. You have to format your data, issue a command to the API, the data gets copied and the command ends up in a queue, then when the IGP is ready it pulls in the data and starts processing, when completely done it signals the CPU and the results have to travel back.
So letting the IGP take care of all throughput heavy workloads is nowhere near free of issues. In fact GPGPU on the IGP is totally hopeless. Only one-way communication like with graphics is realistic. If Intel wants that die space to have any value beyond graphics it's only option is to equip the CPU cores with FMA and gather/scatter and replacing the IGP with such cores. The data can then stay local for much longer, and even communication between the homogeneous cores can be much more fine-grained than communicating with the heterogeneous IGP.
>>It's better to unify things and keep
>>the data local. Likewise, the GPU shouldn't always leave latency sensitive tasks
>>to the CPU. Often it's faster to use a dumb brute-force approach to get the job
>>done without a round-trip to the CPU. But imagine a latency-optimized GPU which
>>doesn't have to waste computing resources so it becomes even faster.
>
>A latency-optimized GPU would suck.
That's an unfounded claim. There are significant differences in latency between various GPU architectures, yet their effective performance is competitive.
Non-aggressive latency optimizations don't hurt the performance. They may lower the theoretical throughput but compensate it with a higher efficiency.
And for an IGP you get the area benefit of unification on top of that. So a homogeneous chip can have more latency optimizations, allowing it to function as a CPU as well.
>>Both are heading for the same thing.
>>
>>>>to be hugely latency-sensitive or high-throughput, there's no telling what application(s) you'll run tomorrow.
>>>
>>>Unless, of course, for some reason, you know what applications you'll run tomorrow.
>>>Like if it's your job or something.
>>>Like, for example, you are evaluating the choices of what to run right now, so as to buy the necessary software.
>>>Or maybe you bought the software and have no funds left for more software, so you
>>>will have to continue using the same software.
>>>Or maybe they're the same apps you're running right now, and that you've been running for the past ten years.
>>>You overestimate the uncertainty.
>>
>>I don't overestimate it. You're absolutely right that systems are always bought
>>with certain uses in mind, and usually it closely matches the actual uses. But that
>>doesn't mean we should underestimate it either. Someone with a 500 € budget may
>>not expect to be able to simulate the ocean currents, but he does expect to be able
>>to run the vast majority of desktop applications, including some that have yet to appear.
>
>Ordinary desktop applications (excluding games) have barely changed in the past
>fifteen years. I think it's reasonable to guess that they will barely change for the next fifteen as well.
>I know there have been some new applications since then, but I can't think of a
>single one for which performance matters.
>So if you buy a low end laptop now, and expect to run future applications on it, it will probably work fine.
It won't. Try running any modern application on a system from '96. It will be a horrible experience. Even an Atom CPU is many times faster than the CPUs we had back then, so whether you like it or not, the software has evolved a lot in fifteen years.
>>In particular, someone who currently opts for a CPU+IGP, would be better off with
>>a homogeneous CPU with gather/scatter in the not too distant future.
>>
>>>>>Ray-traced Super Mario sounds pretty awesome. I'd get it.
>>>>
>>>>Well, actually, due to the wildly varying latencies occuring in ray-tracing, you
>>>>need an architecture capable of coping with that optimally. A homogenous CPU with
>>>>gather/scatter can handle fine-grained dynamic balancing of the complex mix of workloads.
>>>
>>>There's not really a problem here with varying latencies. Ray tracing on GPUs works fine.
>>
>>This should sober you up: http://www.youtube.com/watch?v=4bITAdWvMXE
>
>He's saying that GPU raytracing is not quite as good as CPU raytracing. But then,
>he's doing the CPU raytracing on a twelve-core multi-socket workstation.
That's only two 6-core CPUs, and he's comparing it against a pair of expensive GPUs. Furthermore, once again the GPUs are worthless without a CPU. The driver needs a fairly powerful CPU. So when comparing the cost you can't look at the prices of the cards alone.
>Also it's not clear how old this comparison is and if anything has changed since
>then. It may be that the GPU is catching up, or already has caught up.
>All signs suggest that the GPU's ability to raytrace is improving at a high rate,
>so it may very well surpass the CPU soon if it hasn't already.
Keep in mind that the CPU is evolving as well. 6, 8 and 10-core CPUs with AVX will be launched later this year. So by the time the GPU can do all the complex ray-tracing, and do it faster than the 12-core system, it's up against a new generation of CPUs with much higher throughput.
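As a back-of-the-envelope of where that moving target sits, using the standard AVX figure of 16 single-precision FLOPs per core per cycle (an 8-wide add plus an 8-wide multiply) and illustrative clock speeds:

#include <cstdio>

int main() {
    struct Cpu { const char* name; int cores; double ghz; double flops_per_cycle; };
    const Cpu cpus[] = {
        { "4-core AVX",       4, 3.4, 16.0 },  // 8-wide add + 8-wide mul per cycle
        { "8-core AVX",       8, 3.0, 16.0 },
        { "8-core AVX + FMA", 8, 3.0, 32.0 },  // FMA doubles the per-cycle FLOPs
    };
    for (int i = 0; i < 3; ++i)
        printf("%-17s %6.1f GFLOPS peak\n",
               cpus[i].name, cpus[i].cores * cpus[i].ghz * cpus[i].flops_per_cycle);
}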
The only way the GPU can catch up with that moving target is by investing more transistors into latency optimizations. But because this lowers the peak throughput, it can't possibly surpass the CPU for this workload. The CPU still has FMA and gather/scatter to greatly improve its effective performance. I don't know of any new technology that would give the GPU an advantage over the CPU.
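Since gather/scatter keeps coming up, here's a minimal sketch of what it replaces. Today a software renderer does one scalar load or store per SIMD lane; a hardware gather/scatter instruction would collapse each of these loops into a single vector operation. This sketches the semantics only, not any particular ISA:

#include <cstdio>
#include <cstddef>

// One scalar load per SIMD lane: the per-lane loop a hardware gather
// would collapse into a single vector instruction.
void gather_ps(float* dst, const float* base, const int* index, size_t lanes) {
    for (size_t i = 0; i < lanes; ++i)
        dst[i] = base[index[i]];
}

// The store-side counterpart.
void scatter_ps(float* base, const int* index, const float* src, size_t lanes) {
    for (size_t i = 0; i < lanes; ++i)
        base[index[i]] = src[i];
}

int main() {
    float table[8] = {0, 10, 20, 30, 40, 50, 60, 70};
    int   idx[4]   = {7, 2, 2, 5};  // e.g. texel addresses in a software renderer
    float out[4];
    gather_ps(out, table, idx, 4);
    printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);  // 70 20 20 50
}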
>Also, you see he said nothing about variable latencies being a problem. Because they aren't.
>The actual problem seems to be that GPU can't cast enough rays per second to smooth out the sampling noise.
>It casts too many where they aren't needed and not enough where they are, since
>it lacks the flexibility to make a finer-grained decision about it.
It's probably a combination of things really. But the reality stays the same: The GPU would require CPU technology to catch up, and this also means it will never surpass it.
>>The GPU has lots of limitations and doesn't achieve the same quality as a CPU does
>>in a fixed amount of time. You may note that his 'solution' around 7:30 is a heterogenous
>>architecture, but that's just the current status quo. It fails to take into account
>>that the CPU's vector units got wider, and we still have FMA and gather/scatter to come.
>>
>
>It's true the CPU's vector units are getting wider, but the GPU is widening at an even faster rate.
No, they're not. It's not just the vectors getting wider. Simultaneously the number of CPU cores increases, and FMA will also double the theoretical throughput.
GPUs on the other hand are losing compute density:
* Cypress XT: 2154 Mtransistors, 2720 GFLOPS, 320 VLIW5 cores, 850 MHz
* Cayman XT: 2640 Mtransistors, 2703 GFLOPS, 384 VLIW4 cores, 880 MHz
The focus on getting wider is making way for other optimizations. So in terms of compute density, the CPU and GPU are converging.
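Working out the density from those two data points:

#include <cstdio>

int main() {
    printf("Cypress XT: %.2f GFLOPS per Mtransistor\n", 2720.0 / 2154.0);  // ~1.26
    printf("Cayman XT:  %.2f GFLOPS per Mtransistor\n", 2703.0 / 2640.0);  // ~1.02
}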
>>In fact there are reasonably successful 'hybrid' ray-tracers but unsurprisingly
>>the developer's biggest complaint is the communication overhead between the CPU
>>and GPU. A CPU with FMA and gather/scatter offers the best of both world without
>>such overhead.
>
>So does a beefed-up on-chip IGP.
No, that still has a high communication overhead.
>>You could also make the GPU more latency optimized but that's a lot
>>more work and you still don't get the unification benefit. So the CPU is going to eventually win this debate.
>>
>>>>The casual gaming market is booming, and the potential of WebGL and Molehill is
>>>hard to overestimate. The web allows to reach a whole new market of people, who
>>>are hesitant to set foot in a game store or install Steam.
>>>>Mark my words; the next World of Warcraft hype may very well be an HTML5 or Flash game.
>>>>The emerging Smart TV market also consists of devices with weak graphics chips (shameless plug: http://gametree.tv/platform/).
>>>
>>>The casual gaming market is booming, but I suspect the casual gaming market has little to do with hardware at all.
>>>Casual gamers couldn't care less whether they have a 4-core or a 6-core.
>>
>>Then why would Intel even manufacture a mainstream quad-core CPU with an IGP?
>
>Because the hypothetical software-based graphics that you propose as an alternative does not yet exist.
>Most people buy an IGP because it is currently the cheapest way to get a bootable system with a screen.
That's not an obstacle at all. There would still be basic 2D graphics support. The 3D driver only gets loaded when the operating system starts up, just like what happens today with the IGP.
>Others buy an IGP because its low power consumption allows it to fit in a small laptop where a discrete GPU would not.
>People buy 4-core processors because they don't need 6-core processors and 4 is a bit cheaper.
>Also, this is a clearly superior one-for-one replacement of their previous popular
>product - dual-core & quad-core cpus with northbridge IGPs.
You can have a 6-core CPU without an IGP for the same price as a 4-core CPU with one. For graphics, the cores can be downclocked to stay power efficient. Alternatively, you can save the price of the IGP entirely with a homogeneous 4-core CPU. Both are more interesting than today's offerings, once the homogeneous CPU delivers adequate 3D graphics. This merely requires FMA and gather/scatter.
>>The thing is, casual gamers aren't just casual gamers. They buy a 'multimedia'
>>system and expect a wide variety of applications to run smoothly.
>
>They buy a "cheap" system and don't know whether to expect anything will run smoothly or not.
>Then they find out that most stuff does run smoothly, except a few games. Then they blame the games.
>So they play the games that work instead of the ones that don't.
Indeed they buy a cheap system, but still the one with the best specs. So if you offer a homogeneous quad-core for the same price as a dual-core with IGP, that's a better deal. It also enables software developers to create applications which use all this extra CPU power.
You may question whether we need extra computing power, but that question has been asked many times before in history and every single time the answer was yes.
>>And with lots
>>of software development turning toward multi-core, it's no luxury to put more than
>>two CPU cores into a system, and beyond, to make it sufficiently future-proof.
>
>Future-proofing doesn't make a lot of sense these days.
>It's cheaper to buy a minimally adequate system now, and then buy a new one years
>later when prices are lower if/when your requirements change.
It would be more future-proof, for the same price! A quad-core homogeneous CPU capable of adequate graphics is more future-proof than a dual-core CPU+IGP. It means that even when you're on a budget you can buy a system that will last longer than one with a heterogeneous computing solution.
>>>Also, Farmville may be comparable in size to WOW, but I doubt many WOW players
>>>are quitting to play Farmville instead. (Unfortunately, I have no statistics on this.)
>>>They will continue to play WOW for quite some time. And when they eventually stop
>>>it will be to play another WOW-like game, probably with better graphics.
>>
>>I'm not talking about Farmville. I'm talking about actual MMORPGs that are as challenging as World of Warcraft.
>
>Why aren't you talking about Farmville? You said an MMO comparable in scale to WOW. Farmville is it.
I didn't say an MMO comparable in scale to WOW. I said "the next World of Warcraft hype may very well be an HTML5 or Flash game".
But if you want to talk about Farmville, fine: It doesn't require 3D at all, and if a future version does then software rendering will still suffice.
>Also, "as challenging as WoW"? What? There's zero challenge to WoW. You kill
>monsters until you hit the level cap, and then you wait for the expansion to raise the level cap. That's it.
>All the instance raids and gear-seeking is optional. It is a challenge people
>take on themselves because the game has failed to provide one.
That's a matter of taste. Millions of players disagree with you. I'm not a big fan of it myself, I just mentioned it to illustrate the kind of graphics which is considered adequate by a significant portion of the market.
On a quad-core CPU, this game already runs well with SwiftShader. With AVX, FMA and gather/scatter, the efficiency would improve considerably, allowing the power consumption to be reduced. So there's no point in paying extra for an IGP, or sacrificing two cores to fit in an IGP.
>>As I've indicated before you can't have a 4 times faster IGP without also scaling
>>the memory bandwidth accordingly.
>
>But you CAN scale memory bandwidth accordingly. Both pincounts and memory clock
>frequencies have been rising at a good rate for quite some time now.
>The 4004 had 16 pins. A 386 has like 100. A modern CPU has around 1000. The number
>is growing and there is no reason to think it's about to suddenly stop.
I'm not saying it's about to stop. I'm saying it increases too slowly for the IGP to keep up with the CPU's increasing performance.
A mainstream Core i5 has twice the bandwidth of a mainstream Core 2 Duo from 2006, so it took about five years to double the bandwidth. That means by 2016 you can expect a mainstream CPU with an IGP that is twice as fast. But by that time we'll be at an 11 nm process, meaning 8 cores are definitely mainstream, and FMA and gather/scatter are no challenge either. So this two-times IGP would be up against a CPU which is well over twice as fast. And that assumes a heterogeneous architecture; a homogeneous architecture can have more than 8 cores at 11 nm.
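Extrapolating that trend, with roughly 21 GB/s of dual-channel DDR3 taken as an assumed 2011 mainstream baseline:

#include <cstdio>

int main() {
    double gb_s = 21.0;  // assumed mainstream dual-channel DDR3 baseline, 2011
    for (int year = 2011; year <= 2021; year += 5, gb_s *= 2.0)  // doubling per ~5 years
        printf("%d: ~%.0f GB/s mainstream bandwidth\n", year, gb_s);
}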
The other option for the IGP is to add value by becoming more programmable. But this costs area and power. It converges the IGP closer to the CPU.
So the IGP can't outrun its fate.
>>That won't happen unless they put a lot more pins
>>under the chip,
>
>Or if they raise the memory clock. Or both. Or if they improve compression or something.
They'll increase both the pin count and the clock frequency, but again, only at a slow pace. Increasing the clock frequency isn't without issues. DDR4 won't go mainstream before 2015. And don't forget the CPU and IGP clock frequencies are going up slowly as well.
>>making it far more expensive.
>
>Except that the cost per pin decreases over time as packaging technology improves.
>So what's expensive now, won't be for long.
Again, the pin count will indeed increase, but too slowly for the IGP to outrun the advances in CPU performance.
>>Low-latency DDR bandwidth is a lot
>
>I don't think there's any such thing as low-latency DDR. All DRAM is high latency.
>Well, there's probably a product called "low-latency DDR" but the latency is still not low, just lower than the usual.
I was comparing system DRAM to graphics DRAM.
>>more expensive than GDDR bandwidth. So you shouldn't expect CPU bandwidth to go
>>up very fast, meaning IGPs remain in the low-end segment.
>
>I think bandwidth will go up fast, and some IGPs will move up from low end-toward the middle.
Based on what? For the bandwidth to increase any faster the consumer will have to pay extra. But there's no incentive for doing that. People who really want better graphics, buy a discrete graphics card. And the IGP is worthless for anything other than graphics.
So it's just going to remain the lowest common denominator. If its performance increases, it's only at the mercy of CPU bandwidth improvements and proportionate increases in transistor budget. The CPU benefits from these things equally, but on top of that it also benefits from FMA and gather/scatter.
>>Even if they pulled out all the stops to make it happen, it would be a big shame
>>if such an expensive chip (we're talking current server market segment here) is
>>mediocre at running Crysis 2 but gets beaten by a cheap multi-core CPU at every
>>other application. There's just no demand for such an abomination.
>
>What? What are you talking about, running Crysis on a server? That's not what servers are for.
If you want to increase the bandwidth beyond the slow current evolution, you need extra memory lanes. This brings us into server chip domain. The extra pins and wires make both the chip and the motherboard more expensive. But like I said, it would be ridiculous if such an expensive solution had poor CPU performance. So it's just not going to happen that the bandwidth is aggressively increased for the sake of making the IGP faster, without also scaling the CPU part. So the IGP isn't going to outrun it.
Cheers,
Nicolas
Market reasons to ditch the IGP | slacker | 2011/02/22 02:32 PM |
Market reasons to ditch the IGP | Seni | 2011/02/18 09:51 PM |
Correction - 28 comparators, not 36. (NT) | Seni | 2011/02/18 10:03 PM |
Market reasons to ditch the IGP | Gabriele Svelto | 2011/02/19 01:49 AM |
Market reasons to ditch the IGP | Seni | 2011/02/19 11:59 AM |
Market reasons to ditch the IGP | Exophase | 2011/02/20 10:43 AM |
Market reasons to ditch the IGP | EduardoS | 2011/02/19 10:13 AM |
Market reasons to ditch the IGP | Seni | 2011/02/19 11:46 AM |
The next revolution | Nicolas Capens | 2011/02/22 03:33 AM |
The next revolution | Gabriele Svelto | 2011/02/22 09:15 AM |
The next revolution | Eric Bron | 2011/02/22 09:48 AM |
The next revolution | Nicolas Capens | 2011/02/23 07:39 PM |
The next revolution | Gabriele Svelto | 2011/02/24 12:43 AM |
GPGPU content creation (or lack of it) | Nicolas Capens | 2011/02/28 07:39 AM |
GPGPU content creation (or lack of it) | The market begs to differ | 2011/03/01 06:32 AM |
GPGPU content creation (or lack of it) | Nicolas Capens | 2011/03/09 09:14 PM |
GPGPU content creation (or lack of it) | Gabriele Svelto | 2011/03/10 01:01 AM |
The market begs to differ | Gabriele Svelto | 2011/03/01 06:33 AM |
The next revolution | Anon | 2011/02/24 02:15 AM |
The next revolution | Nicolas Capens | 2011/02/28 02:34 PM |
The next revolution | Seni | 2011/02/22 02:02 PM |
The next revolution | Gabriele Svelto | 2011/02/23 06:27 AM |
The next revolution | Seni | 2011/02/23 09:03 AM |
The next revolution | Nicolas Capens | 2011/02/24 06:11 AM |
The next revolution | Seni | 2011/02/24 08:45 PM |
IGP sampler count | Nicolas Capens | 2011/03/03 05:19 AM |
Latency and throughput optimized cores | Nicolas Capens | 2011/03/07 03:28 PM |
The real reason no IGP /CPU converge. | Jouni Osmala | 2011/03/07 11:34 PM |
Still converging | Nicolas Capens | 2011/03/13 03:08 PM |
Homogeneous CPU advantages | Nicolas Capens | 2011/03/08 12:12 AM |
Homogeneous CPU advantages | Seni | 2011/03/08 09:23 AM |
Homogeneous CPU advantages | David Kanter | 2011/03/08 11:16 AM |
Homogeneous CPU advantages | Brett | 2011/03/09 03:37 AM |
Homogeneous CPU advantages | Jouni Osmala | 2011/03/09 12:27 AM |
SW Rasterization | firsttimeposter | 2011/02/03 11:18 PM |
SW Rasterization | Nicolas Capens | 2011/02/04 04:48 AM |
SW Rasterization | Eric Bron | 2011/02/04 05:14 AM |
SW Rasterization | Nicolas Capens | 2011/02/04 08:36 AM |
SW Rasterization | Eric Bron | 2011/02/04 08:42 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/26 03:23 AM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/02/04 04:31 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/05 08:46 PM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/02/06 06:20 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/06 06:07 PM |
Sandy Bridge CPU article online | arch.comp | 2011/01/06 10:58 PM |
Sandy Bridge CPU article online | Seni | 2011/01/07 10:25 AM |
Sandy Bridge CPU article online | Michael S | 2011/01/05 04:28 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/05 06:06 AM |
permuting vector elements (yet again) | hobold | 2011/01/05 05:15 PM |
permuting vector elements (yet again) | Nicolas Capens | 2011/01/06 06:11 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/05 12:46 PM |
wow ...! | hobold | 2011/01/05 05:19 PM |
wow ...! | Nicolas Capens | 2011/01/05 06:11 PM |
wow ...! | Eric Bron | 2011/01/05 10:46 PM |
compress LUT | Eric Bron | 2011/01/05 11:05 PM |
wow ...! | Michael S | 2011/01/06 02:25 AM |
wow ...! | Nicolas Capens | 2011/01/06 06:26 AM |
wow ...! | Eric Bron | 2011/01/06 09:08 AM |
wow ...! | Nicolas Capens | 2011/01/07 07:19 AM |
wow ...! | Steve Underwood | 2011/01/07 10:53 PM |
saturation | hobold | 2011/01/08 10:25 AM |
saturation | Steve Underwood | 2011/01/08 12:38 PM |
saturation | Michael S | 2011/01/08 01:05 PM |
128 bit floats | Brett | 2011/01/08 01:39 PM |
128 bit floats | Michael S | 2011/01/08 02:10 PM |
128 bit floats | Anil Maliyekkel | 2011/01/08 03:46 PM |
128 bit floats | Kevin G | 2011/02/27 11:15 AM |
128 bit floats | hobold | 2011/02/27 04:42 PM |
128 bit floats | Ian Ollmann | 2011/02/28 04:56 PM |
OpenCL FP accuracy | hobold | 2011/03/01 06:45 AM |
OpenCL FP accuracy | anon | 2011/03/01 08:03 PM |
OpenCL FP accuracy | hobold | 2011/03/02 03:53 AM |
OpenCL FP accuracy | Eric Bron | 2011/03/02 07:10 AM |
pet project | hobold | 2011/03/02 09:22 AM |
pet project | Anon | 2011/03/02 09:10 PM |
pet project | hobold | 2011/03/03 04:57 AM |
pet project | Eric Bron | 2011/03/03 02:29 AM |
pet project | hobold | 2011/03/03 05:14 AM |
pet project | Eric Bron | 2011/03/03 03:10 PM |
pet project | hobold | 2011/03/03 04:04 PM |
OpenCL and AMD | Vincent Diepeveen | 2011/03/07 01:44 PM |
OpenCL and AMD | Eric Bron | 2011/03/08 02:05 AM |
OpenCL and AMD | Vincent Diepeveen | 2011/03/08 08:27 AM |
128 bit floats | Michael S | 2011/02/27 04:46 PM |
128 bit floats | Anil Maliyekkel | 2011/02/27 06:14 PM |
saturation | Steve Underwood | 2011/01/17 04:42 AM |
wow ...! | hobold | 2011/01/06 05:05 PM |
Ring | Moritz | 2011/01/20 10:51 PM |
Ring | Antti-Ville Tuunainen | 2011/01/21 12:25 PM |
Ring | Moritz | 2011/01/23 01:38 AM |
Ring | Michael S | 2011/01/23 04:04 AM |
So fast | Moritz | 2011/01/23 07:57 AM |
So fast | David Kanter | 2011/01/23 10:05 AM |
Sandy Bridge CPU (L1D cache) | Gordon Ward | 2011/09/09 02:47 AM |
Sandy Bridge CPU (L1D cache) | David Kanter | 2011/09/09 04:19 PM |
Sandy Bridge CPU (L1D cache) | EduardoS | 2011/09/09 08:53 PM |
Sandy Bridge CPU (L1D cache) | Paul A. Clayton | 2011/09/10 05:12 AM |
Sandy Bridge CPU (L1D cache) | Michael S | 2011/09/10 09:41 AM |
Sandy Bridge CPU (L1D cache) | EduardoS | 2011/09/10 11:17 AM |
Address Ports on Sandy Bridge Scheduler | Victor | 2011/10/16 06:40 AM |
Address Ports on Sandy Bridge Scheduler | EduardoS | 2011/10/16 07:45 PM |
Address Ports on Sandy Bridge Scheduler | Megol | 2011/10/17 09:20 AM |
Address Ports on Sandy Bridge Scheduler | Victor | 2011/10/18 05:34 PM |
Benefits of early scheduling | Paul A. Clayton | 2011/10/18 06:53 PM |
Benefits of early scheduling | Victor | 2011/10/19 05:58 PM |
Consistency and invalidation ordering | Paul A. Clayton | 2011/10/20 04:43 AM |
Address Ports on Sandy Bridge Scheduler | John Upcroft | 2011/10/21 04:16 PM |
Address Ports on Sandy Bridge Scheduler | David Kanter | 2011/10/22 10:49 AM |
Address Ports on Sandy Bridge Scheduler | John Upcroft | 2011/10/26 01:24 PM |
Store TLB look-up at commit? | Paul A. Clayton | 2011/10/26 08:30 PM |
Store TLB look-up at commit? | Richard Scott | 2011/10/26 09:40 PM |
Just a guess | Paul A. Clayton | 2011/10/27 01:54 PM |