By: Nicolas Capens (nicolas.capens.delete@this.gmail.com), January 15, 2011 9:47 pm
Room: Moderated Discussions
Gabriele Svelto (gabriele.svelto@gmail.com) on 1/13/11 wrote:
>Honestly, I still have to see a system - outside of embedded applications - that
>doesn't have a GPU.
There would be no innovation whatsoever if an argument like this had any validity. Just because you haven't seen something yet doesn't mean it won't be viable in the future. Back in 2005 people had yet to see a consumer GPU with a unified vertex and pixel shader unit. That didn't stop it from becoming the dominant architecture really fast.
All it takes for CPU software rendering to become viable is an affordable mainstream CPU capable of delivering adequate graphics performance for the low-end market at acceptable power consumption. It may not be here yet but it's well within reach. Performance/Watt is doubling every two years, at any price point, and gather/scatter would get things ahead of that curve.
So no matter what the market's expectations are, at some point in the future the CPU will be capable of meeting them. A few years ago I had people telling me software renderers would never run Oblivion or Crysis, except at several seconds per frame on a high-end machine. Some even believed that it was somehow physically impossible to achieve the same image quality as a GPU. They were all proven wrong the very next year.
Sometimes I get the idea that the only impression people like you get when I speak the words "software rendering" is a memory of playing Quake at 320x200 at 8 FPS. There has been tremendous progress since those days! There is more tremendous progress coming! For the record, CPUs are hardware too, so software rendering is a bit of a misnomer. And shaders are software too. There are fewer architectural distinctions between CPUs and GPUs every generation.
>And no, if you go low in the CPU scale you lose a lot of performance.
>You have a high-clocked quad-core, how many of those end up in laptops? Cheap OEM
>PCs? CULV laptops? How fast would SwiftShader run 3DMark06 on a dual-core 1.4GHz
>CULV processor? Here's my guess: much slower than the pitiful IGPs you get with those machines.
Once again you're stuck in today. Instead try to look at the past and project that progress into the future. Back when it was launched, a Core 2 Duo at 2.93 GHz cost 1000 USD. Nowadays you can get a superior Core i3 at 3.06 GHz for 115 USD, consuming only a fraction of the power. And if it didn't include an IGP it would have cost even less. In the AMD camp 100 USD will even buy you a quad-core already.
And with all due respect it's kind of lame that you have to use CULV systems as an example. It's pretty obvious they'll be the last to switch to software rendering. However, their low power consumption hasn't stopped them from using HD Audio codecs (despite high CPU usage), while as you pointed out earlier handheld devices have dedicated DSPs to lower the power consumption further. It indicates that they're just several years behind on the curve, but will surely follow.
>Well, Sandy Bridge IGP, which is a really nice surprise coming from Intel, scores
>over 4k in 3DMark06, and that's with early drivers:
Look at where it's heading. It's a 5x gap today but four years from now we could have gather/scatter and 8-core mainstream CPUs that don't break a sweat at 5 GHz. So that's roughly 3x the performance thanks to the full-featured AVX extensions and another 3x for the cores and clock frequency. Together with some fine-tuning of the software renderer we're looking at about 8000 3DMark06 points, on par with an 8800 GTS!
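As a rough sketch of the arithmetic in the paragraph above: the inputs below are the round figures quoted in this thread (an IGP score of about 4000, a 5x gap, two 3x gains), not measurements, and the function name is made up for illustration.

```python
# Back-of-the-envelope projection of a software renderer's 3DMark06 score.
# Every input is a rough figure from the discussion, not a benchmark result.
igp_score = 4000          # Sandy Bridge IGP, "over 4k"
gap_today = 5             # the IGP is roughly 5x faster than software rendering
simd_gain = 3             # assumed gain from full-featured AVX with gather/scatter
cores_clock_gain = 3      # assumed gain from more cores and higher clock frequency


def projected_score(igp, gap, *gains):
    """Start from today's software rendering score and apply each gain in turn."""
    score = igp / gap
    for gain in gains:
        score *= gain
    return score


baseline = projected_score(igp_score, gap_today)
future = projected_score(igp_score, gap_today, simd_gain, cores_clock_gain)
tuning_needed = 8000 / future   # what fine-tuning would still have to deliver

print(f"today: {baseline:.0f}, projected: {future:.0f}, "
      f"tuning factor to reach 8000: {tuning_needed:.2f}x")
```

Under these assumptions the two 3x factors alone give about 7200 points, so the "about 8000" figure leans on roughly another 10% from renderer tuning.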
By then 3DMark06 will be showcased in museums next to dot-matrix printers though. The games of 2015 will require fully generically programmable GPUs that don't buckle when running long complex shaders. This requires technology found in CPU architectures... So you might as well use the actual CPU cores.
Let me put it this way: At a certain point the framerate of an old application no longer matters. Nobody really cares if something like Unreal Tournament 2003 is only slightly faster on a GF104 versus a G92, despite having nearly three times the transistors, since it's running in the range of 400+ FPS anyway. What actually matters is which of the latest games your hardware can handle adequately. As the programmability of graphics APIs went up, the CPU became increasingly efficient at implementing the graphics pipeline. As the ALU:TEX ratio of modern games continues to increase, the inefficiency at texture sampling matters less and less. Gather/scatter is all that is needed to make the CPU excel at modern graphics.
>http://nl.hardware.info/reviews/1947/45/intel-core-i7-2600k-i5-2500k-i5-2300-sandy-bridge-review-gpu-benchmarks
>
>That's 10 times what you'd achieve with SwiftShader on those two cores.
You probably meant four cores? Or six if you replace the IGP's die space with CPU cores.
>And 3DMark06 is a basically CPU-less load, with many
>games you wouldn't have that luxury, StarCraft 2 for example which is pretty light
>for todays standards burns you 2-3 cores just for IA and physics.
StarCraft II is an interesting case because while it requires 3 cores or more to run optimally, the average CPU usage isn't that high: http://www.pcper.com/article.php?aid=958. And given that 30 FPS is plenty for a strategy game, this leaves even more CPU power unused, and the game is GPU limited. So to overcome Amdahl's Law you could give the CPU some other tasks to work on as well. If you move some of the graphics workload from the GPU to the CPU you could have a cheaper GPU and still achieve the same performance. If this trend continues, sooner or later you're best off running all graphics on the CPU.
It also shows that games are not just about graphics. They make use of the generic capabilities of CPUs to deliver a more exciting experience than just pretty images. If you ditch the GPU and instead have a CPU with more cores and gather/scatter support it opens up a whole new world of possibilities. After all we don't really want any more pixels. We want pixels which show something interesting. Computing physics and AI is not something the GPU is successful at. We're a long way from running the entire game on the GPU. For that to happen it would have to be fully generic like the CPU itself. But obviously that means it's more likely for the graphics to be moved to the CPU than for everything else to be moved to the GPU.
>So no, putting
>two additional cores on Sandy Bridge instead of an IGP would have been a terrible
>idea for 99% of the users. Having a decent IGP on the other hands enables them to
>play games, Oh, and removing the IGP would also means doing video decoding on the
>CPU which would have shortened battery life when you're on the move, etc...
That 99% is nonsense. Like I said before systems with X3100's are still sold by the millions today. Out of all the people who would buy a Sandy Bridge system without dedicated graphics, there has to be a large percentage who would be happy with lesser graphics if it made things cheaper or the CPU itself faster.
>Honestly, a 6-core SB using software rendering wouldn't be able to play games with
>entry-level settings
The Crysis benchmark runs at 17 FPS on my Core i7 920 @ 3.2 GHz. On a 6-core Sandy Bridge it would be running at over 30 FPS. Still without making use of AVX (let alone gather/scatter).
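To make the scaling behind that estimate explicit, here is a small sketch. It assumes the software renderer scales perfectly with core count, which is an idealization; the 17 FPS figure and the 6-core target come from the post, and the i7 920 is a quad-core part.

```python
# How much per-core improvement is needed to go from 17 FPS to 30 FPS?
measured_fps = 17.0     # from the post: Core i7 920 @ 3.2 GHz
cores_now = 4           # the Core i7 920 is a quad-core
cores_future = 6        # hypothetical 6-core Sandy Bridge
target_fps = 30.0

# Idealized assumption: frame rate scales linearly with the number of cores.
core_scaled_fps = measured_fps * cores_future / cores_now

# Remaining factor that clock frequency and per-clock improvements must supply.
per_core_gain_needed = target_fps / core_scaled_fps

print(f"cores alone: {core_scaled_fps:.1f} FPS, "
      f"per-core gain still needed: {per_core_gain_needed:.2f}x")
```

So the extra cores alone account for about 25.5 FPS, and the rest of the claim depends on each core being a little under 20% faster than the overclocked i7 920.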
>You seem to have completely missed the point: AMD GPUs use VLIW processors. VLIW
>has been since its inception a design methodology for processors that trades off
>complexity for higher maximum throughput. VLIW processors will have lower utilization
>*by design* compared to a RISC approach, but they will make up for it by offering
>higher maximum throughput and compute density because they are simpler and thus
>smaller to implement. The low utilization of AMD GPU's is a *design choice* which
>actually allowed them to hit the same performance targets with a smaller die.
I don't see how VLIW is less complex than SIMD. It needs to route multiple control words to each unit, it needs support for horizontal operations, and it requires a shuffle network.
But by keeping the clock frequency lower, AMD can push TSMC's 40 nm process to the integration density limit. They can have fewer pipeline stages, route the wires closer together with less fear of interference, and use smaller transistor gate widths. They also don't need buffers between the clock domains. All this was necessary to compensate for the VLIW complexity and lower utilization. But on top of that they also decided on smaller caches, fewer ROPs and less bandwidth. This causes AMD's architectures to frequently become bottlenecked, further lowering utilization. So let's not confuse these two.
At first it might seem they succeeded at making a smaller chip perform the same, but it's not without compromises. Fermi is still more feature-rich and won't stall as easily. So if you take that into account they've created an equivalent at best. There's no such thing as a free lunch.
So to get back to the actual point: NVIDIA proves that you can make do with fewer GFLOPS, if you use them efficiently.
>>And since there are lots of GPGPU applications which
>>don't even achieve 10% of the theoretical performance of the GPU, while they use
>>the CPU to it's fullest, it's clear that CPUs are really efficient at juggling tasks
>>around and keeping the data flowing.
>
>We already knew that, but that doesn't change a thing for graphics.
It does change things for graphics, because graphics becomes ever more irregular. NVIDIA's architectures have ever lower computing density (per transistor) but keep up with AMD's architectures. Amdahl's Law, a.k.a. the law of diminishing returns, makes AMD's approach less scalable.
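For reference, Amdahl's Law in its standard form can be sketched as below. The parallel fraction and unit counts are purely illustrative values, not measurements of any AMD or NVIDIA part.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Standard Amdahl's Law: speedup when only part of the work scales with n_units."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


# Illustrative only: a workload that is 95% perfectly parallel.
for units in (16, 160, 1600):
    print(units, round(amdahl_speedup(0.95, units), 2))
```

With 5% of the work not scaling, the speedup can never exceed 20x no matter how many units are added, which is the diminishing-returns effect referred to above: the more irregular the workload, the less raw computing density pays off.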
>>So how AMD and NVIDIA compare today is pretty irrelevant to the discussion.
>
>You were the one to bring this argument to the topic, not me.
Please read the context. I literally meant "today". If all you care about is today's games, AMD's current architecture is as good a choice as NVIDIA's. The relevant part is how things will evolve in the future.
>>What
>>matters in the long term is that applications, including graphics applications,
>>are getting more complex and a CPU architecture is more suited for this.
>
>Then why GPUs are becoming more and more prevalent? First it was only desktop computers,
>now it's desktop and laptops, now you get them even in DVRs, cellphones, tablets,
>you name it. If they are on-die or not doesn't change a basic fact: if you want
>to do graphics at decent performance level and in a decent power envelope you need
>dedicated hardware. Complex, programmable dedicated hardware but still dedicated.
Dedicated sound processing came and went. Dedicated physics processing came and went. Dedicated RAID controllers are also no longer a must, and hybrid RAID and software RAID are taking their place...
Dedicated graphics has a much longer cycle because it's more complex and requires more performance for adequate support, but the end is getting nearer. Every stage of the graphics pipeline is becoming programmable. Programmability is the enemy of dedicated.
There's actually a physical reason why everything that is programmable eventually moves to the CPU. Every time the semiconductor feature size halves, you can have twice the number of wires, but up to four times the transistor density. What this means is that eventually everything becomes bandwidth limited. On-die memory (registers and caches) delays the issue, but it's starting to matter less and less what's doing the actual processing. This makes a single processor which is capable of all tasks a lot more interesting than a heterogeneous design which can't balance the workload.
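The scaling argument can be put into numbers with a tiny sketch. The 2x and 4x factors are the ones stated above and are taken at face value; the function name is made up for illustration.

```python
def transistors_per_wire(halvings, wire_factor=2, transistor_factor=4):
    """Relative growth of transistor count over wire count after a number of
    feature-size halvings, using the 2x wires / 4x transistors rule of thumb."""
    return (transistor_factor / wire_factor) ** halvings


for n in range(4):
    print(f"after {n} halvings: {transistors_per_wire(n):.0f}x more logic per wire")
```

Under this rule of thumb the amount of logic competing for each wire doubles with every halving of the feature size, which is why the argument concludes that bandwidth, not processing, becomes the limit.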
>>Note that software rendering frees game engine developers from the graphics API restrictions, which in turn results in higher performance in practice: http://graphics.cs.williams.edu/archive/SweeneyHPG2009/TimHPG2009.pdf
>
>I had already read that presentation and while Tim Sweeny has already predicted
>the demise of GPUs many times GPUs have been entering new markets where they were
>inexistent in the past. So no, this is a purely theoretical argument that doesn't
>only happen in practice: it's also being proven wrong by the fact that we're getting
>*more* dedicated hardware, not less.
Entering new markets doesn't mean they have been successful at it yet. A 2500 USD Tesla card (which has been tested extensively to avoid the 2-10% failure rate of the consumer market) might sound like a fair deal to the HPC market but it's not what Sweeney had in mind. The average GPU in a typical desktop system isn't capable of running anything in addition to the graphics workload while achieving better performance than when making use of the CPU's idle cores instead.
And I've already given you examples of dedicated hardware which became irrelevant because the CPU became powerful enough to take over their role. New dedicated hardware may come, but if it benefits from programmability it will eventually go.
>>Retirement buffers are pretty small. But instead of speculative execution, GPUs
>>opt for massive simultaneous multi-threading, which means they need register space
>>for all these extra threads. Shaders which use more registers than what the hardware
>>was designed for, can really decimate performance. Developers also want a true call
>>stack, so you need massive caches to store all this context.
>>This situation is not sustainable. Some really drastic measures need to be taken
>>to reduce the thread count.
>> But you don't need to look any further than CPU architectures.
>>Speculative execution, branch prediction, forwarding, register renaming, etc. can
>>come to the rescue.
>
>GPU caches are smaller than CPU caches, how's that comes? GPU register files are
>pretty large, but very dense, do you think that a ROB is a small/cheap structure?
>Or a branch predictor history table? Or OoO logic? They're not and for throughput-oriented
>workloads they do not make much sense. How comes thread count on CPUs is going up?
Register files are less dense than caches. But anyway, the real problem for GPUs is the number of threads (strands).
Why does Fermi still not support recursion? Because it requires many kilobytes of stack space, per thread. On the CPU, 1 MB of stack space per thread is typical. Even in the worst case, that's still likely to fit in L3 cache. Even without recursion the call stack can get pretty deep for complex code. With thousands of threads in flight, a GPU just can't have the on-die storage (regardless of whether it's registers or slightly denser caches).
So to ever support recursion or deep call stacks, GPUs will need to drastically reduce their thread count. You need the tricks used in CPUs to achieve that.
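A quick sketch of the storage arithmetic behind this point. The GPU thread count and per-thread stack size are hypothetical stand-ins for "thousands of threads" and "many kilobytes", and the CPU thread count is likewise an assumption; only the 1 MB per CPU thread comes from the post.

```python
KIB = 1024
MIB = 1024 * KIB


def stack_storage_bytes(threads, stack_bytes_per_thread):
    """Total storage needed if every thread in flight gets its own stack."""
    return threads * stack_bytes_per_thread


gpu_threads = 20_000    # hypothetical: "thousands of threads in flight"
gpu_stack = 4 * KIB     # hypothetical: "many kilobytes" per thread
cpu_threads = 8         # hypothetical: 4 cores with 2 hardware threads each
cpu_stack = 1 * MIB     # from the post: 1 MB per thread is typical

gpu_total = stack_storage_bytes(gpu_threads, gpu_stack)
cpu_total = stack_storage_bytes(cpu_threads, cpu_stack)

print(f"GPU: {gpu_total / MIB:.1f} MiB, CPU: {cpu_total / MIB:.1f} MiB")
```

Even with a stack per thread that is 256 times smaller, the hypothetical GPU needs roughly ten times more on-die storage than the CPU, purely because of the thread count.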
>Besides you are the first person I hear saying that current GPU trends are not sustainable,
>if it came from a GPU HW designer it might make sense but from you it sounds like wishful thinking.
NVIDIA's GPU designers have added superscalar execution. This complicates the scheduler and actually creates the risk that they can't find two independent operations in the shader. So clearly some benefit outweighed these disadvantages...
This is GPU designers telling me that threads need to be processed faster in order to avoid running out of register space.
That said, why would it matter who's saying something, for it to make sense? I don't want GPUs to not support recursion. It's just a technical fact. If they did support it, I wouldn't be having this discussion and I would write software for the GPU. And for your information, I have a master's degree in computer science and engineering, with a minor in embedded systems.
If you have a better technical explanation why NVIDIA went superscalar, please let me hear it. No qualifications necessary.
>> But when GPUs sacrifice computing density for higher efficiency
>>at complex generic tasks, it's obviously also a really interesting option to just
>>start with a multi-core CPU architecture and give it powerful graphics capabilities by adding gather/scatter.
>
>Yes, they tried that, it didn't work out.
No, they tried that in the high-end market and it didn't work out. They created an architecture that was ahead of its time. Larrabee does support recursion, for one thing, but no software uses it because no GPU supports it. It's a chicken-and-egg problem. You need Larrabee on the market for developers to make good use of its capabilities, but you can't sell Larrabee when it's up against high-end GPUs running legacy benchmarks.
Approaching things from the low-end market first does work, because it's not necessary for every dollar of hardware to be worth its maximum in graphics performance. It also buys you a CPU. So while Larrabee is only judged by its ability to render graphics, a CPU is valuable for lots of other things. So as long as it's reasonably efficient at graphics, there's bound to be a market for it. And gather/scatter will make that final difference in efficiency which makes a GPU redundant.
>>While texture sampling is a highly specialized operation (it's the only thing Larrabee
>>really has dedicated hardware for), the general trend is still to generalize them
>>into load/store (gather/scatter) operations.
>
>No, the general trend is to have more, more powerful texture units in GPUs. Look
>for yourself at the latest architecture from *any* graphics vendor if you don't believe me.
Sure, in absolute numbers there are more texture units each generation, but the TEX:ALU ratio has been going down. A smaller percentage of die space is going to texture units. But shaders and compute kernels also access unfiltered local and global data. For this you need load/store units.
>>There are several reasons for this:
>>
>>Shaders need ever fewer texture samples (TEX:ALU ratio). So the amount of die area
>>use on texture samplers has steadily decreased. But this means that TEX heavy shaders
>>become bottlenecked, and for ALU heavy shaders they are underutilized...
>>
>>But while the TEX:ALU ratio decreases, the shaders do make more unfiltered memory
>>accesses. So modern GPUs also have specialized access to local memory, shared memory,
>>global memory, etc. Each of these can again be a bottleneck!
>
>Actually they're no bottleneck at all, they are part of the specialized hardware
>that makes GPUs so fast at graphics. They have tremendous bandwidth, they are optimized
>for texture access patterns and they are usually fully associative. No matter how
>you slice it, TEX:ALU ratio has decreased and still TUs have increased in number
>and functionality. They are not going away, they are getting better and they're
>both faster and more efficient at doing their job as a CPU will ever be. On top of that they use less space on the die.
The early graphics chips were pretty much only texture units and raster operation units. You needed multiple passes to do anything interesting. Nowadays the texture units occupy just a fraction of the die space. And you can easily get bottlenecked by texture accesses.
The lower TEX:ALU ratio has helped software rendering to catch up. Arithmetic operations map almost directly to CPU instructions. Texture sampling still takes many instructions but since they're not so frequent any more the CPU has relatively speaking become better at graphics. Just compare running 3DMark2001 and 3DMark06 using SwiftShader. The drop in 3DMark points is smaller than when using a GPU.
Anyway, that doesn't take away that gather/scatter support would still make a critical difference in texture sampling efficiency for the CPU, and it would also help speed up other parts of the graphics pipeline.
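To illustrate why texture sampling costs many instructions and where gather fits in, here is a minimal scalar sketch of bilinear filtering. It is not SwiftShader's implementation; the function names, the single-channel row-major texture layout and the clamp-to-edge addressing are all assumptions made for the example.

```python
import math


def gather(texels, indices):
    """Scalar stand-in for a hardware gather: one independent load per index."""
    return [texels[i] for i in indices]


def bilinear_sample(texels, width, height, u, v):
    """Bilinearly filter a single-channel, row-major texture.
    u and v are in texel units; addressing is clamp-to-edge (an assumption)."""
    # Address computation: shift to texel centers and clamp to the texture.
    x = min(max(u - 0.5, 0.0), width - 1.0)
    y = min(max(v - 0.5, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0

    # Four texel fetches from computed addresses: this is the gather.
    t00, t10, t01, t11 = gather(
        texels,
        [y0 * width + x0, y0 * width + x1, y1 * width + x0, y1 * width + x1],
    )

    # Filtering: two horizontal interpolations, then one vertical.
    top = t00 + (t10 - t00) * fx
    bottom = t01 + (t11 - t01) * fx
    return top + (bottom - top) * fy


texture = [0.0, 1.0,
           2.0, 3.0]   # a 2x2 texture
print(bilinear_sample(texture, 2, 2, 1.0, 1.0))
```

The address math and the interpolation vectorize naturally across pixels, but each SIMD lane ends up with its own four texel addresses. Without a gather instruction those loads have to be issued lane by lane, which is the part of texture sampling that gather support would collapse.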
>>Also, filtering is useless for GPGPU, while graphics on the other hand wants full
>>FP32 filtering (possibly even FP64). These diverging needs are hard to combine into
>>texture units or other highly specialized load/store units.
>
>All recent GPUs can do full-speed FP32 filtering, and filtering is very important
>for GPGPU too because many GPGPU applications are actually doing graphics and so
>they can also benefit from texture hardware.
No they can't: http://www.anandtech.com/show/4008/nvidias-geforce-gtx-580/2
>>So it becomes very tempting to just slap all these different forms of memory access
>>together, have generic load/store units, and let the cache hierarchy take care of local and temporal coherence.
>
>Tempting for who? All hardware available now or in the design pipe is going in
>the other direction, how do you fail to see this basic, hard fact?
What other direction? The latest GPUs now have the ability to gather any of the four components of a 2x2 footprint and query the texture LOD, from within the shader! They're also performing part of the addressing in the shader units. So it's already really close to being fully programmable.
The fact of the matter is that there's a massive number of shader units and they're frequently underutilized. So why have dedicated hardware for addressing and filtering? By slimming down the texture units they can have more of them, and by unifying them with load/store units the bottlenecks are reduced.
I'll respond to the rest of your visionless arguments some other day. Got a SIGGRAPH submission deadline coming up...
>Honestly, I still have to see a system - outside of embedded applications - that
>doesn't have a GPU.
There would be no innovation whatsoever if an argument like this had any validity. Just because you haven't seen something yet doesn't mean it won't be viable in the future. Back in 2005 people had yet to see a consumer GPU with a unified vertex and pixel shader unit. That didn't stop it from becoming the dominant architecture really fast.
All it takes for CPU software rendering to become viable is an affordable mainstream CPU capable of delivering adequate graphics performance for the low-end market at acceptable power consumption. It may not be here yet but it's well within reach. Performance/Watt is doubling every two years, at any price point, and gather/scatter would get things ahead of that curve.
So no matter what the market's expectations are, at some point in the future the CPU will be capable of supporting it. A few years ago I've had people telling me software renderers would never run Oblivion or Crysis, unless at several seconds per frame on a high-end machine. Some even believed that it was somehow physically impossible to achieve the same image quality as a GPU. They were all proven wrong right the next year.
Sometimes I get the idea that the only impression people like you get when I speak the words "software rendering" is a memory of playing Quake at 320x200 at 8 FPS. There has been tremendous progress since those days! There is more tremendous progress coming! For the record, CPUs are hardware too, so software rendering is a bit of a misnomer. And shaders are software too. There are fewer architectural distinctions between CPUs and GPUs every generation.
>And no, if you go low in the CPU scale you lose a lot of performance.
>You have a high-clocked quad-core, how many of those end up in laptops? Cheap OEM
>PCs? CULV laptops? How fast would SwiftShader run 3DMark06 on a dual-core 1.4GHz
>CULV processor? Here's my guess: much slower than the pitiful IGPs you get with those machines.
Once again you're stuck in today. Instead try to look at the past and project that progress into the future. Back when they were launched, a Core 2 Duo at 2.93GHz costed 1000 USD. Nowadays you can get a superior Core i3 at 3.06 GHz for 115 USD, consuming only a fraction of the power. And if it didn't include an IGP it would have costed even less. In the AMD camp 100 USD will even buy you a quad-core already.
And with all due respect it's kind of lame that you have to use CULV systems as an example. It's pretty obvious they'll be the last to switch to software rendering. However, their low power consumption hasn't stopped them from using HD Audio codecs (despite high CPU usage), while as you pointed out earlier handheld devices have dedicated DSPs to lower the power consumption further. It indicates that they're just several years behind on the curve, but will surely follow.
>Well, Sandy Bridge IGP, which is a really nice surprise coming from Intel, scores
>over 4k in 3DMark06, and that's with early drivers:
Look at where it's heading. It's a 5x gap today but four years from now we could have gather/scatter and 8-core mainstream CPUs that don't break a sweat at 5 GHz. So that's roughly 3x the performance thanks to the full-featured AVX extensions and another 3x for the cores and clock frequency. Together with some fine-tuning of the software renderer we're looking at about 8000 3DMark06 points, on par with an 8800 GTS!
By then 3DMark06 will be showcased in musea next to dot-matrix printers though. The games of 2015 will require fully generically programmable GPUs that don't buckle when running long complex shaders. This requires technology found in CPU architectures... So you might as well use the actual CPU cores.
Let me put it this way: At a certain point the framerate of an old application no longer matters. Nobody really cares if something like Unreal Tournament 2003 is only slightly faster on a GF104 versus a G92, despite having nearly three times the transistors, since it's running in the range of 400+ FPS anyway. What actually matters is which of the latest games your hardware can handle adequately. As the programmability of graphics APIs went up the CPU became increasingly more efficient at implementing the graphics pipeline. As the ALU:TEX ratio of modern games continues to increase, the inefficiency at texture sampling starts to matter less and less. Gather/scatter is all that is needed to make the CPU excel at modern graphics.
>http://nl.hardware.info/reviews/1947/45/intel-core-i7-2600k-i5-2500k-i5-2300-sandy-bridge-review-gpu-benchmarks
>
>That's 10 times what you'd achieve with SwiftShader on those two cores.
You probably meant four cores? Or six if you replace the IGP's die space with CPU cores.
>And 3DMark06 is a basically CPU-less load, with many
>games you wouldn't have that luxury, StarCraft 2 for example which is pretty light
>for todays standards burns you 2-3 cores just for IA and physics.
StarCraft II is an interesting case because while it requires 3 cores or more to run optimally, the average CPU usage isnt't that high: http://www.pcper.com/article.php?aid=958. And given that 30 FPS is plenty for a strategy game this leaves even more CPU power unused and the game is GPU limited. So to overcome Amdahl's Law you could give the CPU some other tasks to work on as well. If you move some of the graphics workload from the GPU to the CPU you could have a cheaper GPU and still achieve the same performance. If this trend continues, sooner or later you're best off running all graphics on the CPU.
It also shows that games are not just about graphics. They make use of the generic capabilities of CPUs to deliver a more exciting experience than just pretty images. If you ditch the GPU and instead have a CPU with more cores and gather/scatter support it opens up a whole new world of possibilities. After all we don't really want any more pixels. We want pixels which show something interesting. Computing physics and AI is not something the GPU is succesful at. We're a long way from running the entire game on the GPU. For that to happen it would have to be fully generic like the CPU itself. But obviously that means it's more likely for the graphics to be moved to the CPU than for everything else to be moved to the GPU.
>So no, putting
>two additional cores on Sandy Bridge instead of an IGP would have been a terrible
>idea for 99% of the users. Having a decent IGP on the other hands enables them to
>play games, Oh, and removing the IGP would also means doing video decoding on the
>CPU which would have shortened battery life when you're on the move, etc...
That 99% is nonsense. Like I said before systems with X3100's are still sold by the millions today. Out of all the people who would buy a Sandy Bridge system without dedicated graphics, there has to be a large percentage who would be happy with lesser graphics if it made things cheaper or the CPU itself faster.
>Honestly, a 6-core SB using software rendering wouldn't be able to play games with
>entry-level settings
The Crysis benchmark runs at 17 FPS on my Core i7 920 @ 3.2 GHz. On a 6-core Sandy Bridge it would be running at over 30 FPS. Still without making use of AVX (let alone gather/scatter).
>You seem to have completely missed the point: AMD GPUs use VLIW processors. VLIW
>has been since its inception a design methodology for processors that trades off
>complexity for higher maximum throughput. VLIW processors will have lower utilization
>*by design* compared to a RISC approach, but they will make up for it by offering
>higher maximum throughput and compute density because they are simpler and thus
>smaller to implement. The low utilization of AMD GPU's is a *design choice* which
>actually allowed them to hit the same performance targets with a smaller die.
I don't see how VLIW is less complex than SIMD. It needs to route multiple control words to each unit, it needs support for horizontal operations, and it requires a shuffle network.
But by keeping the clock frequency lower, AMD can push TSMC's 40 nm process to the integration density limit. They can have fewer pipeline stages, route the wires closer together with less fear of interference, and use smaller transistor gate widths. They also don't need buffers between the clock domains. All this was necessary to compensate for the VLIW complexity and lower utilization. But on top of that they also decided on smaller caches, fewer ROPs and less bandwidth. This causes AMD's architectures to frequently become bottlenecked, further lowering utilization. So let's not confuse these two.
At first it might seem they succeeded at making a smaller chip perform the same, but it's not without compromises. Fermi is still more feature-rich and won't stall as easily. So if you take that into account they've created an equivalent at best. There's no such thing as a free lunch.
So to get back to the actual point: NVIDIA proves that you can do with less GFLOPS, if you use them efficiently.
>>And since there are lots of GPGPU applications which
>>don't even achieve 10% of the theoretical performance of the GPU, while they use
>>the CPU to it's fullest, it's clear that CPUs are really efficient at juggling tasks
>>around and keeping the data flowing.
>
>We already knew that, but that doesn't change a thing for graphics.
It does change things for graphics, because graphics becomes ever more irregular. NVIDIA's architectures have ever lower computing density (per transistor) but keep up with AMD's architectures. Amdahl's Law, a.k.a. the law of diminishing returns, makes AMD's approach less scalable.
>>So how AMD and NVIDIA compare today is pretty irrelevant to the discussion.
>
>You were the one to bring this argument to the topic, not me.
Please read the context. I literally meant "today". If all you care about is today's games, AMD's current architecture is as good a choice as NVIDIA's. The relevant part is how things will evolve in the future.
>>What
>>matters in the long term is that applications, including graphics applications,
>>are getting more complex and a CPU architecture is more suited for this.
>
>Then why GPUs are becoming more and more prevalent? First it was only desktop computers,
>now it's desktop and laptops, now you get them even in DVRs, cellphones, tablets,
>you name it. If they are on-die or not doesn't change a basic fact: if you want
>to do graphics at decent performance level and in a decent power envelope you need
>dedicated hardware. Complex, programmable dedicated hardware but still dedicated.
Dedicated sound processing came and went. Dedicated physics processing came and went. Dedicated RAID controllers are also no longer a must and hybrid RAID and sofware RAID is taking its place...
Dedicated graphics has a much longer cycle because it's more complex and requires more performance for adequate support, but the end is getting nearer. Every stage of the graphics pipeline is becoming programmable. Programmability is the enemy of dedicated.
There's actually a physical reason why everything that is programmable eventually moves to the CPU. Every time the semiconductor feature size halves, you can have twice the number of wires, but up to four times the transistor density. What this means is that eventually everything becomes bandwidth limited. On-die memory (registers and caches) delays the issue, but it's starting to matter less and less what's doing the actual processing. This makes a single processor which is capable of all tasks a lot more interesting than a heterogeneous design which can't balance the workload.
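The scaling argument is simple arithmetic. A rough sketch, taking the 2x wires and (up to) 4x transistors per halving at face value, so it shows the worst case rather than any real process node:

```python
def wires_per_transistor(halvings, wire_growth=2.0, transistor_growth=4.0):
    """Relative wiring available per transistor after `halvings` shrinks (1.0 at the start)."""
    if halvings < 0:
        raise ValueError("halvings must be >= 0")
    return (wire_growth / transistor_growth) ** halvings


# Each halving of the feature size cuts the wiring per transistor in half again.
for n in range(5):
    print(n, wires_per_transistor(n))
```

The function name and the default growth factors are only for illustration. The point is that the ratio shrinks geometrically, so feeding the transistors becomes the problem rather than having enough of them.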
>>Note that software rendering frees game engine developers from the graphics API restrictions, which in turn results in higher performance in practice: http://graphics.cs.williams.edu/archive/SweeneyHPG2009/TimHPG2009.pdf
>
>I had already read that presentation and while Tim Sweeny has already predicted
>the demise of GPUs many times GPUs have been entering new markets where they were
>inexistent in the past. So no, this is a purely theoretical argument that doesn't
>only happen in practice: it's also being proven wrong by the fact that we're getting
>*more* dedicated hardware, not less.
Entering new markets doesn't mean they have been successful at it yet. A 2500 USD Tesla card (which has been tested extensively to avoid the 2-10% failure rate of the consumer market) might sound like a fair deal to the HPC market but it's not what Sweeney had in mind. The average GPU in a typical desktop system can't take on anything in addition to the graphics workload and still outperform simply using the CPU's idle cores instead.
And I've already given you examples of dedicated hardware which became irrelevant because the CPU became powerful enough to take over their role. New dedicated hardware may come, but if it benefits from programmability it will eventually go.
>>Retirement buffers are pretty small. But instead of speculative execution, GPUs
>>opt for massive simultaneous multi-threading, which means they need register space
>>for all these extra threads. Shaders which use more registers than what the hardware
>>was designed for, can really decimate performance. Developers also want a true call
>>stack, so you need massive caches to store all this context.
>>This situation is not sustainable. Some really drastic measures need to be taken
>>to reduce the thread count.
>> But you don't need to look any further than CPU architectures.
>>Speculative execution, branch prediction, forwarding, register renaming, etc. can
>>come to the rescue.
>
>GPU caches are smaller than CPU caches, how's that comes? GPU register files are
>pretty large, but very dense, do you think that a ROB is a small/cheap structure?
>Or a branch predictor history table? Or OoO logic? They're not and for throughput-oriented
>workloads they do not make much sense. How comes thread count on CPUs is going up?
Register files are less dense than caches. But anyway, the real problem for GPUs is the number of threads (strands).
Why does Fermi still not support recursion? Because it requires many kilobytes of stack space, per thread. On the CPU, 1 MB of stack space per thread is typical. Even in the worst case, that's still likely to fit in L3 cache. Even without recursion the call stack can get pretty deep for complex code. With thousands of threads in flight, a GPU just can't have the on-die storage (regardless of whether it's registers or slightly denser caches).
So to ever support recursion or deep call stacks, GPUs will need to drastically reduce their thread count. You need the tricks used in CPUs to achieve that.
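A back-of-the-envelope sketch of the storage problem. The thread counts, the 4 KiB GPU stack and the 1 MiB on-die budget below are hypothetical round numbers picked for illustration; only the "1 MB per CPU thread" figure comes from the text above.

```python
KIB = 1024
MIB = 1024 * KIB


def stack_footprint_bytes(threads_in_flight, stack_bytes_per_thread):
    """Total storage needed to give every in-flight thread its own call stack."""
    return threads_in_flight * stack_bytes_per_thread


def max_threads_for_budget(budget_bytes, stack_bytes_per_thread):
    """How many threads fit if all stacks must live in `budget_bytes` of on-die storage."""
    return budget_bytes // stack_bytes_per_thread


# Hypothetical GPU: 16384 threads in flight, a modest 4 KiB of stack each.
gpu_bytes = stack_footprint_bytes(16384, 4 * KIB)
# Hypothetical CPU: 8 hardware threads, 1 MiB of stack each (worst case).
cpu_bytes = stack_footprint_bytes(8, 1 * MIB)

print("GPU stacks:", gpu_bytes // MIB, "MiB")
print("CPU stacks:", cpu_bytes // MIB, "MiB")
print("Threads that fit in 1 MiB on-die:", max_threads_for_budget(1 * MIB, 4 * KIB))
```

Even with a stack 256 times smaller per thread, the GPU in this sketch needs eight times the total storage of the CPU. Turn it around and a fixed on-die budget only leaves room for a few hundred threads, not thousands.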
>Besides you are the first person I hear saying that current GPU trends are not sustainable,
>if it came from a GPU HW designer it might make sense but from you it sounds like wishful thinking.
NVIDIA's GPU designers have added superscalar execution. This complicates the scheduler and actually creates the risk that they can't find two independent operations in the shader. So clearly some benefit outweighed these disadvantages...
This is GPU designers telling me that threads need to be processed faster in order to avoid running out of register space.
That said, why would it matter who's saying something, for it to make sense? I don't want GPUs to not support recursion. It's just a technical fact. If they did support it, I wouldn't be having this discussion and I would write software for the GPU. And for your information, I have a master's degree in computer science and engineering, with a minor in embedded systems.
If you have a better technical explanation why NVIDIA went superscalar, please let me hear it. No qualifications necessary.
>> But when GPUs sacrifice computing density for higher efficiency
>>at complex generic tasks, it's obviously also a really interesting option to just
>>start with a multi-core CPU architecture and give it powerful graphics capabilities by adding gather/scatter.
>
>Yes, they tried that, it didn't work out.
No, they tried that in the high-end market and it didn't work out. They created an architecture that was ahead of its time. Larrabee does support recursion, for one thing, but no software uses it because no GPU supports it. It's a chicken-and-egg problem. You need Larrabee on the market for developers to make good use of its capabilities, but you can't sell Larrabee when it's up against high-end GPUs running legacy benchmarks.
Approaching things from the low-end market first does work, because it's not necessary for every dollar of hardware to be worth its maximum in graphics performance. It also buys you a CPU. So while Larrabee is only judged by its ability to render graphics, a CPU is valuable for lots of other things. So as long as it's reasonably efficient at graphics, there's bound to be a market for it. And gather/scatter will make that final difference in efficiency which makes a GPU redundant.
>>While texture sampling is a highly specialized operation (it's the only thing Larrabee
>>really has dedicated hardware for), the general trend is still to generalize them
>>into load/store (gather/scatter) operations.
>
>No, the general trend is to have more, more powerful texture units in GPUs. Look
>for yourself at the latest architecture from *any* graphics vendor if you don't believe me.
Sure, in absolute numbers there are more texture units each generation, but the TEX:ALU ratio has been going down. A smaller percentage of die space is going to texture units. But shaders and compute kernels also access unfiltered local and global data. For this you need load/store units.
>>There are several reasons for this:
>>
>>Shaders need ever fewer texture samples (TEX:ALU ratio). So the amount of die area
>>use on texture samplers has steadily decreased. But this means that TEX heavy shaders
>>become bottlenecked, and for ALU heavy shaders they are underutilized...
>>
>>But while the TEX:ALU ratio decreases, the shaders do make more unfiltered memory
>>accesses. So modern GPUs also have specialized access to local memory, shared memory,
>>global memory, etc. Each of these can again be a bottleneck!
>
>Actually they're no bottleneck at all, they are part of the specialized hardware
>that makes GPUs so fast at graphics. They have tremendous bandwidth, they are optimized
>for texture access patterns and they are usually fully associative. No matter how
>you slice it, TEX:ALU ratio has decreased and still TUs have increased in number
>and functionality. They are not going away, they are getting better and they're
>both faster and more efficient at doing their job as a CPU will ever be. On top of that they use less space on the die.
The early graphics chips were pretty much only texture units and raster operation units. You needed multiple passes to do anything interesting. Nowadays the texture units occupy just a fraction of the die space. And you can easily get bottlenecked by texture accesses.
The lower TEX:ALU ratio has helped software rendering to catch up. Arithmetic operations map almost directly to CPU instructions. Texture sampling still takes many instructions but since they're not so frequent any more the CPU has relatively speaking become better at graphics. Just compare running 3DMark2001 and 3DMark06 using SwiftShader. The drop in 3DMark points is smaller than when using a GPU.
Anyway, that doesn't take away that gather/scatter support would still make a critical difference in texture sampling efficiency for the CPU, and it would also help speed up other parts of the graphics pipeline.
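To illustrate why gather matters for texture sampling, here's a minimal sketch of a bilinear filter. It assumes a single-channel texture stored row-major in a flat list, clamp addressing and no mipmapping, and the `gather` helper merely emulates in software what a hardware gather instruction would do with a whole vector of indices at once. It is not how SwiftShader or any GPU actually implements it.

```python
def gather(memory, indices):
    """Emulated gather: one scalar load per index."""
    return [memory[i] for i in indices]


def bilinear_sample(texture, width, height, u, v):
    """Bilinearly filter `texture` at texel coordinates (u, v) with clamp addressing."""
    x = min(max(u, 0.0), width - 1.0)
    y = min(max(v, 0.0), height - 1.0)
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, width - 1)
    y1 = min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0

    # The 2x2 footprint: four independent addresses per sample.
    t00, t10, t01, t11 = gather(
        texture,
        [y0 * width + x0, y0 * width + x1, y1 * width + x0, y1 * width + x1],
    )

    # The arithmetic part maps directly to a few multiply/add instructions.
    top = t00 + (t10 - t00) * fx
    bottom = t01 + (t11 - t01) * fx
    return top + (bottom - top) * fy


texture = [0.0, 1.0,
           2.0, 3.0]
print(bilinear_sample(texture, 2, 2, 0.5, 0.5))
```

The interpolation is just a handful of arithmetic operations, but every sample needs four loads from unrelated addresses. Process a SIMD vector of N pixels and that's 4N scalar loads without gather, which is where most of the instructions go.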
>>Also, filtering is useless for GPGPU, while graphics on the other hand wants full
>>FP32 filtering (possibly even FP64). These diverging needs are hard to combine into
>>texture units or other highly specialized load/store units.
>
>All recent GPUs can do full-speed FP32 filtering, and filtering is very important
>for GPGPU too because many GPGPU applications are actually doing graphics and so
>they can also benefit from texture hardware.
No they can't: http://www.anandtech.com/show/4008/nvidias-geforce-gtx-580/2
>>So it becomes very tempting to just slap all these different forms of memory access
>>together, have generic load/store units, and let the cache hierarchy take care of local and temporal coherence.
>
>Tempting for who? All hardware available now or in the design pipe is going in
>the other direction, how do you fail to see this basic, hard fact?
What other direction? The latest GPUs now have the ability to gather any of the four components of a 2x2 footprint and query the texture LOD, from within the shader! They're also performing part of the addressing in the shader units. So it's already really close to being fully programmable.
The fact of the matter is that there's a massive number of shader units and they're frequently underutilized. So why have dedicated hardware for addressing and filtering? By slimming down the texture units they can have more of them, and by unifying them with load/store units the bottlenecks are reduced.
I'll respond to the rest of your visionless arguments some other day. Got a SIGGRAPH submission deadline coming up...
Sandy Bridge CPU article online | anon | 2011/01/05 04:14 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/05 04:50 AM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/05 05:00 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/05 07:26 AM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/05 07:50 AM |
Sandy Bridge CPU article online | Michael S | 2011/01/05 08:39 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/05 03:50 PM |
permuting vector elements | hobold | 2011/01/05 05:03 PM |
permuting vector elements | Nicolas Capens | 2011/01/05 06:01 PM |
permuting vector elements | Nicolas Capens | 2011/01/06 08:27 AM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/11 11:33 AM |
Sandy Bridge CPU article online | EduardoS | 2011/01/11 01:51 PM |
Sandy Bridge CPU article online | hobold | 2011/01/11 02:11 PM |
Sandy Bridge CPU article online | David Kanter | 2011/01/11 06:07 PM |
Sandy Bridge CPU article online | Michael S | 2011/01/12 03:25 AM |
Sandy Bridge CPU article online | hobold | 2011/01/12 05:03 PM |
Sandy Bridge CPU article online | David Kanter | 2011/01/12 11:27 PM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/13 02:38 AM |
Sandy Bridge CPU article online | Michael S | 2011/01/13 03:32 AM |
Sandy Bridge CPU article online | hobold | 2011/01/13 01:53 PM |
What happened to VPERMIL2PS? | Michael S | 2011/01/13 03:46 AM |
What happened to VPERMIL2PS? | Eric Bron | 2011/01/13 06:46 AM |
Lower cost permute | Paul A. Clayton | 2011/01/13 12:11 PM |
Sandy Bridge CPU article online | anon | 2011/01/25 06:31 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/12 06:34 PM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/13 07:38 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/15 09:47 PM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/16 03:13 AM |
And just to make a further example | Gabriele Svelto | 2011/01/16 04:24 AM |
Sandy Bridge CPU article online | mpx | 2011/01/16 01:27 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/25 02:56 PM |
Sandy Bridge CPU article online | David Kanter | 2011/01/25 04:11 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/26 08:49 AM |
Sandy Bridge CPU article online | EduardoS | 2011/01/26 04:35 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/27 02:51 AM |
Sandy Bridge CPU article online | EduardoS | 2011/01/27 02:40 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/28 03:24 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/28 03:49 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/30 02:11 PM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/31 03:43 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/01 04:02 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/02/01 04:28 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/02/01 04:43 AM |
Sandy Bridge CPU article online | EduardoS | 2011/01/28 07:14 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/01 02:58 AM |
Sandy Bridge CPU article online | EduardoS | 2011/02/01 02:36 PM |
Sandy Bridge CPU article online | anon | 2011/02/01 04:56 PM |
Sandy Bridge CPU article online | EduardoS | 2011/02/01 09:17 PM |
Sandy Bridge CPU article online | anon | 2011/02/01 10:13 PM |
Sandy Bridge CPU article online | Eric Bron | 2011/02/02 04:08 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/02/02 04:26 AM |
Sandy Bridge CPU article online | kalmaegi | 2011/02/01 09:29 AM |
SW Rasterization | David Kanter | 2011/01/27 05:18 PM |
Lower pin count memory | iz | 2011/01/27 09:19 PM |
Lower pin count memory | David Kanter | 2011/01/27 09:25 PM |
Lower pin count memory | iz | 2011/01/27 11:31 PM |
Lower pin count memory | David Kanter | 2011/01/27 11:52 PM |
Lower pin count memory | iz | 2011/01/28 12:28 AM |
Lower pin count memory | David Kanter | 2011/01/28 01:05 AM |
Lower pin count memory | iz | 2011/01/28 03:55 AM |
Lower pin count memory | David Hess | 2011/01/28 01:15 PM |
Lower pin count memory | David Kanter | 2011/01/28 01:57 PM |
Lower pin count memory | iz | 2011/01/28 05:20 PM |
Two years later | ForgotPants | 2013/10/26 11:33 AM |
Two years later | anon | 2013/10/26 11:36 AM |
Two years later | Exophase | 2013/10/26 12:56 PM |
Two years later | David Hess | 2013/10/26 05:05 PM |
Herz is totally the thing you DON*T care. | Jouni Osmala | 2013/10/27 01:48 AM |
Herz is totally the thing you DON*T care. | EduardoS | 2013/10/27 07:00 AM |
Herz is totally the thing you DON*T care. | Michael S | 2013/10/27 07:45 AM |
Two years later | someone | 2013/10/28 07:21 AM |
Lower pin count memory | Martin Høyer Kristiansen | 2011/01/28 01:41 AM |
Lower pin count memory | iz | 2011/01/28 03:07 AM |
Lower pin count memory | Darrell Coker | 2011/01/27 10:39 PM |
Lower pin count memory | iz | 2011/01/28 12:20 AM |
Lower pin count memory | Darrell Coker | 2011/01/28 06:07 PM |
Lower pin count memory | iz | 2011/01/28 11:57 PM |
Lower pin count memory | Darrell Coker | 2011/01/29 02:21 AM |
Lower pin count memory | iz | 2011/01/31 10:28 PM |
SW Rasterization | Nicolas Capens | 2011/02/02 08:48 AM |
SW Rasterization | Eric Bron | 2011/02/02 09:37 AM |
SW Rasterization | Nicolas Capens | 2011/02/02 04:35 PM |
SW Rasterization | Eric Bron | 2011/02/02 05:11 PM |
SW Rasterization | Eric Bron | 2011/02/03 02:13 AM |
SW Rasterization | Nicolas Capens | 2011/02/04 07:57 AM |
SW Rasterization | Eric Bron | 2011/02/04 08:50 AM |
erratum | Eric Bron | 2011/02/04 08:58 AM |
SW Rasterization | Nicolas Capens | 2011/02/04 05:25 PM |
SW Rasterization | David Kanter | 2011/02/04 05:33 PM |
SW Rasterization | anon | 2011/02/04 06:04 PM |
SW Rasterization | Nicolas Capens | 2011/02/05 03:39 PM |
SW Rasterization | David Kanter | 2011/02/05 05:07 PM |
SW Rasterization | Nicolas Capens | 2011/02/05 11:39 PM |
SW Rasterization | Eric Bron | 2011/02/04 10:55 AM |
Comments pt 1 | David Kanter | 2011/02/02 01:08 PM |
Comments pt 1 | Eric Bron | 2011/02/02 03:16 PM |
Comments pt 1 | Gabriele Svelto | 2011/02/03 01:37 AM |
Comments pt 1 | Eric Bron | 2011/02/03 02:36 AM |
Comments pt 1 | Nicolas Capens | 2011/02/03 11:08 PM |
Comments pt 1 | Nicolas Capens | 2011/02/03 10:26 PM |
Comments pt 1 | Eric Bron | 2011/02/04 03:33 AM |
Comments pt 1 | Nicolas Capens | 2011/02/04 05:24 AM |
example code | Eric Bron | 2011/02/04 04:51 AM |
example code | Nicolas Capens | 2011/02/04 08:24 AM |
example code | Eric Bron | 2011/02/04 08:36 AM |
example code | Nicolas Capens | 2011/02/05 11:43 PM |
Comments pt 1 | Rohit | 2011/02/04 12:43 PM |
Comments pt 1 | Nicolas Capens | 2011/02/04 05:05 PM |
Comments pt 1 | David Kanter | 2011/02/04 05:36 PM |
Comments pt 1 | Nicolas Capens | 2011/02/05 02:45 PM |
Comments pt 1 | Eric Bron | 2011/02/05 04:13 PM |
Comments pt 1 | Nicolas Capens | 2011/02/05 11:52 PM |
Comments pt 1 | Eric Bron | 2011/02/06 01:31 AM |
Comments pt 1 | Nicolas Capens | 2011/02/06 04:06 PM |
Comments pt 1 | Eric Bron | 2011/02/07 03:12 AM |
The need for gather/scatter support | Nicolas Capens | 2011/02/10 10:07 AM |
The need for gather/scatter support | Eric Bron | 2011/02/11 03:11 AM |
Gather/scatter performance data | Nicolas Capens | 2011/02/13 03:39 AM |
Gather/scatter performance data | Eric Bron | 2011/02/13 07:46 AM |
Gather/scatter performance data | Nicolas Capens | 2011/02/14 07:48 AM |
Gather/scatter performance data | Eric Bron | 2011/02/14 09:32 AM |
Gather/scatter performance data | Eric Bron | 2011/02/14 10:07 AM |
Gather/scatter performance data | Eric Bron | 2011/02/13 09:00 AM |
Gather/scatter performance data | Nicolas Capens | 2011/02/14 07:49 AM |
Gather/scatter performance data | Eric Bron | 2011/02/15 02:23 AM |
Gather/scatter performance data | Eric Bron | 2011/02/13 05:06 PM |
Gather/scatter performance data | Nicolas Capens | 2011/02/14 07:52 AM |
Gather/scatter performance data | Eric Bron | 2011/02/14 09:43 AM |
SW Rasterization - a long way off | Rohit | 2011/02/02 01:17 PM |
SW Rasterization - a long way off | Nicolas Capens | 2011/02/04 03:59 AM |
CPU only rendering - a long way off | Rohit | 2011/02/04 11:52 AM |
CPU only rendering - a long way off | Nicolas Capens | 2011/02/04 07:15 PM |
CPU only rendering - a long way off | Rohit | 2011/02/05 02:00 AM |
CPU only rendering - a long way off | Nicolas Capens | 2011/02/05 09:45 PM |
CPU only rendering - a long way off | David Kanter | 2011/02/06 09:51 PM |
CPU only rendering - a long way off | Gian-Carlo Pascutto | 2011/02/07 12:22 AM |
Encryption | David Kanter | 2011/02/07 01:18 AM |
Encryption | Nicolas Capens | 2011/02/07 07:51 AM |
Encryption | David Kanter | 2011/02/07 11:50 AM |
Encryption | Nicolas Capens | 2011/02/08 10:26 AM |
CPUs are latency optimized | David Kanter | 2011/02/08 11:38 AM |
efficient compiler on an efficient GPU real today. | sJ | 2011/02/08 11:29 PM |
CPUs are latency optimized | Nicolas Capens | 2011/02/09 09:49 PM |
CPUs are latency optimized | Eric Bron | 2011/02/10 12:49 AM |
CPUs are latency optimized | Antti-Ville Tuunainen | 2011/02/10 06:16 AM |
CPUs are latency optimized | Nicolas Capens | 2011/02/10 07:04 AM |
CPUs are latency optimized | Eric Bron | 2011/02/10 07:48 AM |
CPUs are latency optimized | Nicolas Capens | 2011/02/10 01:31 PM |
CPUs are latency optimized | Eric Bron | 2011/02/11 02:43 AM |
CPUs are latency optimized | Nicolas Capens | 2011/02/11 07:31 AM |
CPUs are latency optimized | EduardoS | 2011/02/10 05:29 PM |
CPUs are latency optimized | Anon | 2011/02/10 06:40 PM |
CPUs are latency optimized | David Kanter | 2011/02/10 08:33 PM |
CPUs are latency optimized | EduardoS | 2011/02/11 02:18 PM |
CPUs are latency optimized | Nicolas Capens | 2011/02/11 05:56 AM |
CPUs are latency optimized | Rohit | 2011/02/11 07:33 AM |
CPUs are latency optimized | Nicolas Capens | 2011/02/14 02:19 AM |
CPUs are latency optimized | Eric Bron | 2011/02/14 03:23 AM |
CPUs are latency optimized | EduardoS | 2011/02/14 01:11 PM |
CPUs are latency optimized | David Kanter | 2011/02/11 02:45 PM |
CPUs are latency optimized | Nicolas Capens | 2011/02/15 05:22 AM |
CPUs are latency optimized | David Kanter | 2011/02/15 12:47 PM |
CPUs are latency optimized | Nicolas Capens | 2011/02/15 07:10 PM |
Have fun | David Kanter | 2011/02/15 10:04 PM |
Have fun | Nicolas Capens | 2011/02/17 03:59 AM |
Have fun | Brett | 2011/02/17 12:56 PM |
Have fun | Nicolas Capens | 2011/02/19 04:53 PM |
Have fun | Brett | 2011/02/20 06:08 PM |
Have fun | Brett | 2011/02/20 07:13 PM |
On-die storage to fight Amdahl | Nicolas Capens | 2011/02/23 05:37 PM |
On-die storage to fight Amdahl | Brett | 2011/02/23 09:59 PM |
On-die storage to fight Amdahl | Brett | 2011/02/23 10:08 PM |
On-die storage to fight Amdahl | Nicolas Capens | 2011/02/24 07:42 PM |
On-die storage to fight Amdahl | Rohit | 2011/02/25 11:02 PM |
On-die storage to fight Amdahl | Nicolas Capens | 2011/03/09 06:53 PM |
On-die storage to fight Amdahl | Rohit | 2011/03/10 08:02 AM |
NVIDIA using tile based rendering? | Nathan Monson | 2011/03/11 07:58 PM |
NVIDIA using tile based rendering? | Rohit | 2011/03/12 04:29 AM |
NVIDIA using tile based rendering? | Nathan Monson | 2011/03/12 11:05 AM |
NVIDIA using tile based rendering? | Rohit | 2011/03/12 11:16 AM |
On-die storage to fight Amdahl | Brett | 2011/02/26 02:10 AM |
On-die storage to fight Amdahl | Nathan Monson | 2011/02/26 01:51 PM |
On-die storage to fight Amdahl | Brett | 2011/02/26 04:40 PM |
Convergence is inevitable | Nicolas Capens | 2011/03/09 08:22 PM |
Convergence is inevitable | Brett | 2011/03/09 10:59 PM |
Convergence is inevitable | Antti-Ville Tuunainen | 2011/03/10 03:34 PM |
Convergence is inevitable | Brett | 2011/03/10 09:39 PM |
Procedural texturing? | David Kanter | 2011/03/11 01:32 AM |
Procedural texturing? | hobold | 2011/03/11 03:59 AM |
Procedural texturing? | Dan Downs | 2011/03/11 09:28 AM |
Procedural texturing? | Mark Roulo | 2011/03/11 02:58 PM |
Procedural texturing? | Anon | 2011/03/11 06:11 PM |
Procedural texturing? | Nathan Monson | 2011/03/11 07:30 PM |
Procedural texturing? | Brett | 2011/03/15 07:45 AM |
Procedural texturing? | Seni | 2011/03/15 10:13 AM |
Procedural texturing? | Brett | 2011/03/15 11:45 AM |
Procedural texturing? | Seni | 2011/03/15 02:09 PM |
Procedural texturing? | Brett | 2011/03/11 10:02 PM |
Procedural texturing? | Brett | 2011/03/11 09:34 PM |
Procedural texturing? | Eric Bron | 2011/03/12 03:37 AM |
Convergence is inevitable | Jouni Osmala | 2011/03/09 11:28 PM |
Convergence is inevitable | Brett | 2011/04/05 05:08 PM |
Convergence is inevitable | Nicolas Capens | 2011/04/07 05:23 AM |
Convergence is inevitable | none | 2011/04/07 07:03 AM |
Convergence is inevitable | Nicolas Capens | 2011/04/07 10:34 AM |
Convergence is inevitable | anon | 2011/04/07 02:15 PM |
Convergence is inevitable | none | 2011/04/08 01:57 AM |
Convergence is inevitable | Brett | 2011/04/07 08:04 PM |
Convergence is inevitable | none | 2011/04/08 02:14 AM |
Gather implementation | David Kanter | 2011/04/08 12:01 PM |
RAM Latency | David Hess | 2011/04/07 08:22 AM |
RAM Latency | Brett | 2011/04/07 07:20 PM |
RAM Latency | Nicolas Capens | 2011/04/07 10:18 PM |
RAM Latency | Brett | 2011/04/08 05:33 AM |
RAM Latency | Nicolas Capens | 2011/04/10 02:23 PM |
RAM Latency | Rohit | 2011/04/08 06:57 AM |
RAM Latency | Nicolas Capens | 2011/04/10 01:23 PM |
RAM Latency | David Kanter | 2011/04/10 02:27 PM |
RAM Latency | Rohit | 2011/04/11 06:17 AM |
Convergence is inevitable | Eric Bron | 2011/04/07 09:46 AM |
Convergence is inevitable | Nicolas Capens | 2011/04/07 09:50 PM |
Convergence is inevitable | Eric Bron | 2011/04/08 12:39 AM |
Flaws in PowerVR | Rohit | 2011/02/25 11:21 PM |
Flaws in PowerVR | Brett | 2011/02/26 12:37 AM |
Flaws in PowerVR | Paul | 2011/02/26 05:17 AM |
Have fun | David Kanter | 2011/02/18 12:52 PM |
Have fun | Michael S | 2011/02/19 12:12 PM |
Have fun | David Kanter | 2011/02/19 03:26 PM |
Have fun | Michael S | 2011/02/19 04:43 PM |
Have fun | anon | 2011/02/19 05:02 PM |
Have fun | Michael S | 2011/02/19 05:56 PM |
Have fun | anon | 2011/02/20 03:50 PM |
Have fun | EduardoS | 2011/02/20 02:44 PM |
Linear vs non-linear | EduardoS | 2011/02/20 02:55 PM |
Have fun | Michael S | 2011/02/20 04:19 PM |
Have fun | EduardoS | 2011/02/20 05:51 PM |
Have fun | Nicolas Capens | 2011/02/21 11:12 AM |
Have fun | Michael S | 2011/02/21 12:38 PM |
Have fun | Eric Bron | 2011/02/21 02:10 PM |
Have fun | Eric Bron | 2011/02/21 02:39 PM |
Have fun | Michael S | 2011/02/21 06:13 PM |
Have fun | Eric Bron | 2011/02/22 12:43 AM |
Have fun | Michael S | 2011/02/22 01:47 AM |
Have fun | Eric Bron | 2011/02/22 02:10 AM |
Have fun | Michael S | 2011/02/22 11:37 AM |
Have fun | anon | 2011/02/22 01:38 PM |
Have fun | EduardoS | 2011/02/22 03:49 PM |
Gather/scatter efficiency | Nicolas Capens | 2011/02/23 06:37 PM |
Gather/scatter efficiency | anonymous | 2011/02/23 06:51 PM |
Gather/scatter efficiency | Nicolas Capens | 2011/02/24 06:57 PM |
Gather/scatter efficiency | anonymous | 2011/02/24 07:16 PM |
Gather/scatter efficiency | Michael S | 2011/02/25 07:45 AM |
Gather implementation | David Kanter | 2011/02/25 05:34 PM |
Gather implementation | Michael S | 2011/02/26 10:40 AM |
Gather implementation | anon | 2011/02/26 11:52 AM |
Gather implementation | Michael S | 2011/02/26 12:16 PM |
Gather implementation | anon | 2011/02/26 11:22 PM |
Gather implementation | Michael S | 2011/02/27 07:23 AM |
Gather/scatter efficiency | Nicolas Capens | 2011/02/28 03:14 PM |
Consider yourself ignored | David Kanter | 2011/02/22 01:05 AM |
one more anti-FMA flame. By me. | Michael S | 2011/02/16 07:40 AM |
one more anti-FMA flame. By me. | Eric Bron | 2011/02/16 08:30 AM |
one more anti-FMA flame. By me. | Eric Bron | 2011/02/16 09:15 AM |
one more anti-FMA flame. By me. | Nicolas Capens | 2011/02/17 06:27 AM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/17 07:42 AM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/17 05:46 PM |
Tarantula paper | Paul A. Clayton | 2011/02/18 12:38 AM |
Tarantula paper | Nicolas Capens | 2011/02/19 05:19 PM |
anti-FMA != anti-throughput or anti-SG | Eric Bron | 2011/02/18 01:48 AM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/20 03:46 PM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/20 05:00 PM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/23 04:05 AM |
Software pipelining on x86 | David Kanter | 2011/02/23 05:04 AM |
Software pipelining on x86 | JS | 2011/02/23 05:25 AM |
Software pipelining on x86 | Salvatore De Dominicis | 2011/02/23 08:37 AM |
Software pipelining on x86 | Jouni Osmala | 2011/02/23 09:10 AM |
Software pipelining on x86 | LeeMiller | 2011/02/23 10:07 PM |
Software pipelining on x86 | Nicolas Capens | 2011/02/24 03:17 PM |
Software pipelining on x86 | anonymous | 2011/02/24 07:04 PM |
Software pipelining on x86 | Nicolas Capens | 2011/02/28 09:27 AM |
Software pipelining on x86 | Antti-Ville Tuunainen | 2011/03/02 04:31 AM |
Software pipelining on x86 | Megol | 2011/03/02 12:55 PM |
Software pipelining on x86 | Geert Bosch | 2011/03/03 07:58 AM |
FMA benefits and latency predictions | David Kanter | 2011/02/25 05:14 PM |
FMA benefits and latency predictions | Antti-Ville Tuunainen | 2011/02/26 10:43 AM |
FMA benefits and latency predictions | Matt Waldhauer | 2011/02/27 06:42 AM |
FMA benefits and latency predictions | Nicolas Capens | 2011/03/09 06:11 PM |
FMA benefits and latency predictions | Rohit | 2011/03/10 08:11 AM |
FMA benefits and latency predictions | Eric Bron | 2011/03/10 09:30 AM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/23 05:19 AM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/23 07:50 AM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/23 10:37 AM |
FMA and beyond | Nicolas Capens | 2011/02/24 04:47 PM |
detour on terminology | hobold | 2011/02/24 07:08 PM |
detour on terminology | Nicolas Capens | 2011/02/28 02:24 PM |
detour on terminology | Eric Bron | 2011/03/01 02:38 AM |
detour on terminology | Michael S | 2011/03/01 05:03 AM |
detour on terminology | Eric Bron | 2011/03/01 05:39 AM |
detour on terminology | Michael S | 2011/03/01 08:33 AM |
detour on terminology | Eric Bron | 2011/03/01 09:34 AM |
erratum | Eric Bron | 2011/03/01 09:54 AM |
detour on terminology | Nicolas Capens | 2011/03/10 08:39 AM |
detour on terminology | Eric Bron | 2011/03/10 09:50 AM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/23 06:12 AM |
anti-FMA != anti-throughput or anti-SG | David Kanter | 2011/02/20 11:25 PM |
anti-FMA != anti-throughput or anti-SG | David Kanter | 2011/02/17 06:51 PM |
Tarantula vector unit well-integrated | Paul A. Clayton | 2011/02/18 12:38 AM |
anti-FMA != anti-throughput or anti-SG | Megol | 2011/02/19 02:17 PM |
anti-FMA != anti-throughput or anti-SG | David Kanter | 2011/02/20 02:09 AM |
anti-FMA != anti-throughput or anti-SG | Megol | 2011/02/20 09:55 AM |
anti-FMA != anti-throughput or anti-SG | David Kanter | 2011/02/20 01:39 PM |
anti-FMA != anti-throughput or anti-SG | EduardoS | 2011/02/20 02:35 PM |
anti-FMA != anti-throughput or anti-SG | Megol | 2011/02/21 08:12 AM |
anti-FMA != anti-throughput or anti-SG | anon | 2011/02/17 10:44 PM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/18 06:20 AM |
one more anti-FMA flame. By me. | Eric Bron | 2011/02/17 08:24 AM |
thanks | Michael S | 2011/02/17 04:56 PM |
CPUs are latency optimized | EduardoS | 2011/02/15 01:24 PM |
SwiftShader SNB test | Eric Bron | 2011/02/15 03:46 PM |
SwiftShader NHM test | Eric Bron | 2011/02/15 04:50 PM |
SwiftShader SNB test | Nicolas Capens | 2011/02/17 12:06 AM |
SwiftShader SNB test | Eric Bron | 2011/02/17 01:21 AM |
SwiftShader SNB test | Eric Bron | 2011/02/22 10:32 AM |
SwiftShader SNB test 2nd run | Eric Bron | 2011/02/22 10:51 AM |
SwiftShader SNB test 2nd run | Nicolas Capens | 2011/02/23 02:14 PM |
SwiftShader SNB test 2nd run | Eric Bron | 2011/02/23 02:42 PM |
Win7SP1 out but no AVX hype? | Michael S | 2011/02/24 03:14 AM |
Win7SP1 out but no AVX hype? | Eric Bron | 2011/02/24 03:39 AM |
CPUs are latency optimized | Eric Bron | 2011/02/15 08:02 AM |
CPUs are latency optimized | EduardoS | 2011/02/11 03:40 PM |
CPU only rendering - not a long way off | Nicolas Capens | 2011/02/07 06:45 AM |
CPU only rendering - not a long way off | David Kanter | 2011/02/07 12:09 PM |
CPU only rendering - not a long way off | anonymous | 2011/02/07 10:25 PM |
Sandy Bridge IGP EUs | David Kanter | 2011/02/07 11:22 PM |
Sandy Bridge IGP EUs | Hannes | 2011/02/08 05:59 AM |
SW Rasterization - Why? | Seni | 2011/02/02 02:53 PM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/10 03:12 PM |
Market reasons to ditch the IGP | Seni | 2011/02/11 05:42 AM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/16 04:29 AM |
Market reasons to ditch the IGP | Seni | 2011/02/16 01:39 PM |
An excellent post! | David Kanter | 2011/02/16 03:18 PM |
CPUs clock higher | Moritz | 2011/02/17 08:06 AM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/18 06:22 PM |
Market reasons to ditch the IGP | IntelUser2000 | 2011/02/18 07:20 PM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/21 02:42 PM |
Bad data (repeated) | David Kanter | 2011/02/22 12:21 AM |
Bad data (repeated) | none | 2011/02/22 03:04 AM |
13W or 8W? | Foo_ | 2011/02/22 06:00 AM |
13W or 8W? | Linus Torvalds | 2011/02/22 08:58 AM |
13W or 8W? | David Kanter | 2011/02/22 11:33 AM |
13W or 8W? | Mark Christiansen | 2011/02/22 02:47 PM |
Bigger picture | Nicolas Capens | 2011/02/24 06:33 PM |
Bigger picture | Nicolas Capens | 2011/02/24 08:06 PM |
20+ Watt | Nicolas Capens | 2011/02/24 08:18 PM |
<20W | David Kanter | 2011/02/25 01:13 PM |
>20W | Nicolas Capens | 2011/03/08 07:34 PM |
IGP is 3X more efficient | David Kanter | 2011/03/08 10:53 PM |
IGP is 3X more efficient | Eric Bron | 2011/03/09 02:44 AM |
>20W | Eric Bron | 2011/03/09 03:48 AM |
Specious data and claims are still specious | David Kanter | 2011/02/25 02:38 AM |
IGP power consumption, LRB samplers | Nicolas Capens | 2011/03/08 06:24 PM |
IGP power consumption, LRB samplers | EduardoS | 2011/03/08 06:52 PM |
IGP power consumption, LRB samplers | Rohit | 2011/03/09 07:42 AM |
Market reasons to ditch the IGP | none | 2011/02/22 02:58 AM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/24 06:43 PM |
Market reasons to ditch the IGP | slacker | 2011/02/22 02:32 PM |
Market reasons to ditch the IGP | Seni | 2011/02/18 09:51 PM |
Correction - 28 comparators, not 36. (NT) | Seni | 2011/02/18 10:03 PM |
Market reasons to ditch the IGP | Gabriele Svelto | 2011/02/19 01:49 AM |
Market reasons to ditch the IGP | Seni | 2011/02/19 11:59 AM |
Market reasons to ditch the IGP | Exophase | 2011/02/20 10:43 AM |
Market reasons to ditch the IGP | EduardoS | 2011/02/19 10:13 AM |
Market reasons to ditch the IGP | Seni | 2011/02/19 11:46 AM |
The next revolution | Nicolas Capens | 2011/02/22 03:33 AM |
The next revolution | Gabriele Svelto | 2011/02/22 09:15 AM |
The next revolution | Eric Bron | 2011/02/22 09:48 AM |
The next revolution | Nicolas Capens | 2011/02/23 07:39 PM |
The next revolution | Gabriele Svelto | 2011/02/24 12:43 AM |
GPGPU content creation (or lack of it) | Nicolas Capens | 2011/02/28 07:39 AM |
GPGPU content creation (or lack of it) | The market begs to differ | 2011/03/01 06:32 AM |
GPGPU content creation (or lack of it) | Nicolas Capens | 2011/03/09 09:14 PM |
GPGPU content creation (or lack of it) | Gabriele Svelto | 2011/03/10 01:01 AM |
The market begs to differ | Gabriele Svelto | 2011/03/01 06:33 AM |
The next revolution | Anon | 2011/02/24 02:15 AM |
The next revolution | Nicolas Capens | 2011/02/28 02:34 PM |
The next revolution | Seni | 2011/02/22 02:02 PM |
The next revolution | Gabriele Svelto | 2011/02/23 06:27 AM |
The next revolution | Seni | 2011/02/23 09:03 AM |
The next revolution | Nicolas Capens | 2011/02/24 06:11 AM |
The next revolution | Seni | 2011/02/24 08:45 PM |
IGP sampler count | Nicolas Capens | 2011/03/03 05:19 AM |
Latency and throughput optimized cores | Nicolas Capens | 2011/03/07 03:28 PM |
The real reason no IGP /CPU converge. | Jouni Osmala | 2011/03/07 11:34 PM |
Still converging | Nicolas Capens | 2011/03/13 03:08 PM |
Homogeneous CPU advantages | Nicolas Capens | 2011/03/08 12:12 AM |
Homogeneous CPU advantages | Seni | 2011/03/08 09:23 AM |
Homogeneous CPU advantages | David Kanter | 2011/03/08 11:16 AM |
Homogeneous CPU advantages | Brett | 2011/03/09 03:37 AM |
Homogeneous CPU advantages | Jouni Osmala | 2011/03/09 12:27 AM |
SW Rasterization | firsttimeposter | 2011/02/03 11:18 PM |
SW Rasterization | Nicolas Capens | 2011/02/04 04:48 AM |
SW Rasterization | Eric Bron | 2011/02/04 05:14 AM |
SW Rasterization | Nicolas Capens | 2011/02/04 08:36 AM |
SW Rasterization | Eric Bron | 2011/02/04 08:42 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/26 03:23 AM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/02/04 04:31 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/05 08:46 PM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/02/06 06:20 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/06 06:07 PM |
Sandy Bridge CPU article online | arch.comp | 2011/01/06 10:58 PM |
Sandy Bridge CPU article online | Seni | 2011/01/07 10:25 AM |
Sandy Bridge CPU article online | Michael S | 2011/01/05 04:28 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/05 06:06 AM |
permuting vector elements (yet again) | hobold | 2011/01/05 05:15 PM |
permuting vector elements (yet again) | Nicolas Capens | 2011/01/06 06:11 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/05 12:46 PM |
wow ...! | hobold | 2011/01/05 05:19 PM |
wow ...! | Nicolas Capens | 2011/01/05 06:11 PM |
wow ...! | Eric Bron | 2011/01/05 10:46 PM |
compress LUT | Eric Bron | 2011/01/05 11:05 PM |
wow ...! | Michael S | 2011/01/06 02:25 AM |
wow ...! | Nicolas Capens | 2011/01/06 06:26 AM |
wow ...! | Eric Bron | 2011/01/06 09:08 AM |
wow ...! | Nicolas Capens | 2011/01/07 07:19 AM |
wow ...! | Steve Underwood | 2011/01/07 10:53 PM |
saturation | hobold | 2011/01/08 10:25 AM |
saturation | Steve Underwood | 2011/01/08 12:38 PM |
saturation | Michael S | 2011/01/08 01:05 PM |
128 bit floats | Brett | 2011/01/08 01:39 PM |
128 bit floats | Michael S | 2011/01/08 02:10 PM |
128 bit floats | Anil Maliyekkel | 2011/01/08 03:46 PM |
128 bit floats | Kevin G | 2011/02/27 11:15 AM |
128 bit floats | hobold | 2011/02/27 04:42 PM |
128 bit floats | Ian Ollmann | 2011/02/28 04:56 PM |
OpenCL FP accuracy | hobold | 2011/03/01 06:45 AM |
OpenCL FP accuracy | anon | 2011/03/01 08:03 PM |
OpenCL FP accuracy | hobold | 2011/03/02 03:53 AM |
OpenCL FP accuracy | Eric Bron | 2011/03/02 07:10 AM |
pet project | hobold | 2011/03/02 09:22 AM |
pet project | Anon | 2011/03/02 09:10 PM |
pet project | hobold | 2011/03/03 04:57 AM |
pet project | Eric Bron | 2011/03/03 02:29 AM |
pet project | hobold | 2011/03/03 05:14 AM |
pet project | Eric Bron | 2011/03/03 03:10 PM |
pet project | hobold | 2011/03/03 04:04 PM |
OpenCL and AMD | Vincent Diepeveen | 2011/03/07 01:44 PM |
OpenCL and AMD | Eric Bron | 2011/03/08 02:05 AM |
OpenCL and AMD | Vincent Diepeveen | 2011/03/08 08:27 AM |
128 bit floats | Michael S | 2011/02/27 04:46 PM |
128 bit floats | Anil Maliyekkel | 2011/02/27 06:14 PM |
saturation | Steve Underwood | 2011/01/17 04:42 AM |
wow ...! | hobold | 2011/01/06 05:05 PM |
Ring | Moritz | 2011/01/20 10:51 PM |
Ring | Antti-Ville Tuunainen | 2011/01/21 12:25 PM |
Ring | Moritz | 2011/01/23 01:38 AM |
Ring | Michael S | 2011/01/23 04:04 AM |
So fast | Moritz | 2011/01/23 07:57 AM |
So fast | David Kanter | 2011/01/23 10:05 AM |
Sandy Bridge CPU (L1D cache) | Gordon Ward | 2011/09/09 02:47 AM |
Sandy Bridge CPU (L1D cache) | David Kanter | 2011/09/09 04:19 PM |
Sandy Bridge CPU (L1D cache) | EduardoS | 2011/09/09 08:53 PM |
Sandy Bridge CPU (L1D cache) | Paul A. Clayton | 2011/09/10 05:12 AM |
Sandy Bridge CPU (L1D cache) | Michael S | 2011/09/10 09:41 AM |
Sandy Bridge CPU (L1D cache) | EduardoS | 2011/09/10 11:17 AM |
Address Ports on Sandy Bridge Scheduler | Victor | 2011/10/16 06:40 AM |
Address Ports on Sandy Bridge Scheduler | EduardoS | 2011/10/16 07:45 PM |
Address Ports on Sandy Bridge Scheduler | Megol | 2011/10/17 09:20 AM |
Address Ports on Sandy Bridge Scheduler | Victor | 2011/10/18 05:34 PM |
Benefits of early scheduling | Paul A. Clayton | 2011/10/18 06:53 PM |
Benefits of early scheduling | Victor | 2011/10/19 05:58 PM |
Consistency and invalidation ordering | Paul A. Clayton | 2011/10/20 04:43 AM |
Address Ports on Sandy Bridge Scheduler | John Upcroft | 2011/10/21 04:16 PM |
Address Ports on Sandy Bridge Scheduler | David Kanter | 2011/10/22 10:49 AM |
Address Ports on Sandy Bridge Scheduler | John Upcroft | 2011/10/26 01:24 PM |
Store TLB look-up at commit? | Paul A. Clayton | 2011/10/26 08:30 PM |
Store TLB look-up at commit? | Richard Scott | 2011/10/26 09:40 PM |
Just a guess | Paul A. Clayton | 2011/10/27 01:54 PM |