By: David Kanter (dkanter.delete@this.realworldtech.com), February 8, 2011 11:38 am
Room: Moderated Discussions
Nicolas Capens (nicolas.capens@gmail.com) on 2/8/11 wrote:
---------------------------
>Hi David,
>
>David Kanter (dkanter@realworldtech.com) on 2/7/11 wrote:
>---------------------------
>>Nicolas Capens (nicolas.capens@gmail.com) on 2/7/11 wrote:
>>---------------------------
>>>Hi David,
>>>
>>>David Kanter (dkanter@realworldtech.com) on 2/7/11 wrote:
>>>---------------------------
>>>>The fundamental point is that efficiency *really* does matter a lot. In a world
>>>>where power is the #1 performance limiter, doing stuff in dedicated hardware becomes
>>>>increasingly attractive. If you can cut the energy required to render a frame in
>>>>half by using an IGP (which is quite believable), then you have gained a huge amount
>>>>of energy to expend in the CPU or other areas.
>>>>
>>>>The key is understanding that with Moore's Law, spending more transistors to lower
>>>>power is VERY attractive. That dictates more dedicated hardware, not less. Just
>>>>look at what Intel has done - putting embedded micro-controllers inside CPUs to manage power.
>>>>
>>>>The whole idea of eliminating throughput oriented cores goes against that entirely.
>>>
>>>Who's talking about eliminating throughput oriented cores? I'm talking about adding
>>>gather/scatter to CPU cores so they *become* efficient throughput oriented cores
>>>and the functionality of the IGP can be unified.
>>
>>You are talking about eliminating throughput cores. You claim to have a basic
>>understanding of hardware, so it should be readily apparent how CPU cores and throughput
>>cores (e.g. Niagara, GPU shaders) differ.
>
>Throughput-oriented is a term which originates from server systems, long before
>graphics chips were even called GPUs! It merely means a focus on data rate, potentially
>at the cost of latency. Any multiprocessor system is a throughput-oriented system.
>Clock frequency and ILP are latency-oriented, while DLP and TLP are throughput-oriented.
>
>Today's x86 CPUs exploit DLP and TLP (SIMD, multi-core and Hyper-Threading) and
>have a less aggressive clocking than several years ago.
Actually, the clock speeds are about the same: 3-4 GHz.
>So they are definitely throughput-oriented
>architectures already and would become more efficient with the addition of gather/scatter
>support. Parallel load/store is the main thing setting them apart from GPUs.
That's simply false. There are many more differences in terms of circuit design and the latency of individual instructions.
>GPUs are, in the words of NVIDIA's chief scientists, "aggressively throughput-oriented
>processors". Note though that GF104 features superscalar execution, intended to
>lower the latency. And aside from reducing bandwidth, caches also reduce latency.
>So GPUs are forced to become less aggressive at using throughput-oriented techniques,
>because reducing latency somewhat also reduces the amount of on-chip storage you
>need. It's a balancing act, because obviously reducing latency costs transistors as well.
The two architectures are leagues apart in terms of latency. What is the latency of a dependent chain of adds in a GPU? What about a CPU? What is the branch coherence? What is the memory latency?
The memory latency makes this especially obvious, as there is a huge difference.
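To make the dependent-chain question concrete, here is a minimal sketch of the arithmetic. The latencies and clocks are purely hypothetical placeholders (nothing in this thread gives measured values), and `chain_time_ns` is an illustrative helper, not a real API. It only shows that in a serial chain the per-op latency, not the peak throughput, sets the run time.

```python
def chain_time_ns(n_ops, latency_cycles, clock_ghz):
    """Time for n_ops dependent ops, where each op waits for the previous result."""
    return n_ops * latency_cycles / clock_ghz

# Hypothetical, illustrative numbers only (assumptions, not measurements):
# a latency-optimized core with a 1-cycle add at 3.0 GHz versus
# a throughput-optimized core with a 20-cycle dependent-add latency at 1.5 GHz.
cpu_ns = chain_time_ns(1000, latency_cycles=1, clock_ghz=3.0)
gpu_ns = chain_time_ns(1000, latency_cycles=20, clock_ghz=1.5)

print(f"latency-optimized core:    {cpu_ns:.0f} ns")
print(f"throughput-optimized core: {gpu_ns:.0f} ns")
print(f"slowdown on the serial chain: {gpu_ns / cpu_ns:.0f}x")
```

Under these assumed numbers the serial chain runs about 40x slower on the throughput-oriented core, no matter how many ALUs it has, because none of them can start until the previous result is available.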
>>Even to someone without circuit design
>>expertise, it should be blinding obvious - the clock speeds are about a factor of 2-4X different.
>
>Then why did NVIDIA decide to put its aggressively throughput-oriented cores into
>a higher clock domain? The GeForce GTX 560 Ti has a shader clock of 1645 MHz, while
>the Radeon HD 6950 has a clock of 800 MHz. Does that mean NVIDIA's architecture is not throughput-oriented?
1.5GHz is still half the speed or less of a modern CPU. AMD has GPUs that run at 900MHz. And those speeds have stayed constant for the last several generations (since about 65nm).
>Clock speed alone isn't an indication of being throughput-oriented or not. It's
>a design decision which doesn't have to compromise *effective* throughput, as proven
>by NVIDIA vs. AMD. Another example is Cell BE, which clocks at 3.2 GHz but even
>at 90 nm was considered strongly throughput-oriented. x86 CPUs have come a long way
>since the single-core Pentium 4 days, and they're not about to stop increasing throughput
>efficiency (AVX, Bulldozer, FMA, etc.).
Listen, no matter what you say, CPUs are still optimized for latency. It's blatantly obvious if you have ever written code for a CPU. It's also blatantly obvious that GPUs are not optimized for latency. This means different circuits, different architectures, pipeline arrangements, etc. etc. and most importantly a different style of programming.
Consider the performance degradation of running a latency sensitive workload (e.g. SPECint) on a GPU vs. a CPU. It's going to be huge.
>As a matter of fact the CPU's clock frequency has remained nearly constant. The
>i7-2600K has a 3.8 GHz Turbo Mode (but under multi-threaded workloads it's not
>that high), the same as the 2004 Pentium 4 Prescott. In the same time period, NVIDIA's
>GPUs have increased their clock frequency by over a factor 3. There's no indication
>this is going to change. For NVIDIA to conquer the HPC market, it needs to continue
>investing into latency reduction. To prevent an excessive growth in die size, it
>needs to increase the clock frequency. GF100 had some thermal issues, but they got
>that under control with GF110, which has more cores enabled and even higher clocks.
Again, frequency is just one aspect. There are so many other things to consider. Pipeline depth, result forwarding, dependent instruction latency, etc. etc.
>So while it's "blinding obvious" that there's a clock frequency difference today,
>it's also "blinding obvious" they're on a collision course. Gather/scatter support
>is still several years out, so by that time they'll have converged even closer and gather/scatter is the keystone.
Scatter/gather is helpful, but it will not make CPUs as efficient as GPUs. It doesn't help make the ALUs lower power. It doesn't help make the front-end lower power.
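For readers less familiar with the terms, the following is a minimal sketch of what gather and scatter compute, written as plain scalar loops. The function names and the texel example are illustrative assumptions, not any real instruction set or library API. A hardware gather would issue these per-lane loads as one vector operation; without it, each lane is a separate scalar load.

```python
def gather(table, indices):
    """Read one element per lane from arbitrary, possibly non-contiguous positions."""
    return [table[i] for i in indices]

def scatter(dest, indices, values):
    """Write one element per lane to arbitrary positions in dest."""
    for i, v in zip(indices, values):
        dest[i] = v
    return dest

# Illustrative use: fetching texels at computed addresses (hypothetical data).
texels = [10, 20, 30, 40]
fetched = gather(texels, [3, 0, 2])
written = scatter([0, 0, 0, 0], [1, 3], [7, 9])
print(fetched, written)
```

Note that this only covers the memory access pattern. It says nothing about the energy cost of the ALUs or the front end that execute the surrounding code.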
Look at what CPUs have that GPUs do not:
Bypass networks
Branch predictors
Prefetchers
Out-of-order execution
Low latency instructions
etc. etc.
The differences at the circuit level are just as big.
>>You cannot simply tack on scatter/gather to a latency optimized CPU core and expect
>>it to look like a throughput core in terms of power efficiency. At least, there
>>is definitely a lack of evidence for any such claims. Moreover, you need to preserve
>>the power efficiency for workloads that cannot be vectorized.
>
>An architecture which balances latency and theoretical throughput can still achieve
>high effective throughput. It's how NVIDIA managed to outperform AMD with only half the FLOP density.
That's so wrong it isn't even funny. That's because of AMD's use of static scheduling for their VLIW, and because Nvidia is much more optimized for scalar memory accesses. Has nothing to do with latency vs. throughput.
>The way things converge, tacking on gather/scatter support does put the GPU within
>striking distance, starting with the IGP. For someone not
>time, a balanced homogenous architecture is the most cost effective solution for all his processing needs.
I don't believe that a homogeneous architecture is optimal at all, and you have yet to show that in any meaningful way. In fact, you have admitted that it is sub-optimal for power consumption...which means that as long as graphics consumes a non-trivial amount of power, an IGP will be a superior solution. If there is a day when graphics is merely 1-2% of all cycles, then perhaps it might happen...but I don't see that ever happening.
>Note that widening the vectors amortizes the cost of things like out-of-order execution.
>At the same time, AMD has reduced its VLIW width from 5 to 4, in order to achieve
>higher efficiency.
So what?
>GPUs also introduced concurrent kernel execution and scalar execution,
>and have growing register files and caches. So they're investing more transistors
>into latency reduction and programmability than raw FLOPS. GF110 has a 0.52 FLOP/transistor
>ratio. With G92b that was still a 0.94 ratio.
Register file size is increased for better throughput...the registers per vector lane have been decreasing.
What you are saying is obvious - GPUs are becoming more programmable. But the reality is that they are not even remotely optimized for latency. Where do you compile your code? On a CPU or a GPU?
The bottom line is that to achieve optimal throughput you must sacrifice latency (look at the memory subsystem), and vice versa. You can refuse to believe this, but it's simply true. While GPUs may become more programmable, this is all relative to an architecture that started with 0 programmability. The gap between GPUs and CPUs may shrink, but it will never disappear and the efficiency differences will always be sizable.
>It's easy to see where your preconceptions come from though. NV40 had a 0.24 ratio,
>which G92b increased fourfold in a few years' time. But you got fooled into
>thinking that this is a trend which can be sustained. Widening a component only
>increases the overall throughput density of that component till it reaches 50% of
>the die area. And the components themselves get fatter to increase programmability
>as well, and the rest of the architecture needs to support the same throughput.
Of course it's easy to see where my preconceptions come from. It's reality.
>So for your own sake stop staring yourself blind at theoretical throughput. There's
>a lot more to effective performance than that.
>
>>>If dedicated hardware was the universal answer, then GPUs would still have separate
>>>vertex and pixel pipelines. So clearly you're not taking all of the factors into account!
>>
>>Newsflash: GPU shaders are dedicated hardware. The fact that the workloads for
>>vertex and pixel shading are similar enough to use the same hardware is not relevant.
>
>That's old news. Shader cores are no longer dedicated. NVIDIA now calls them CUDA
>cores instead, for good reason. The floating-point operations are IEEE-754 compliant,
>just like on a CPU.
And what is the latency of those operations? The latency of forwarding between dependent ops? What is the branch prediction latency? What is the mispredict penalty? What is the latency of a load?
>A Tesla card is as dedicated to graphics as it is to biochemical
>research.
You notice how Tesla is dedicated for throughput oriented tasks, rather than latency ones?
>A CPU with gather/scatter would be just as dedicated to those. The only
>things making the GPU dedicated to graphics are the texture samplers, ROPs, and
>rasterizers (each of which are also optimized in a software renderer when there's gather/scatter support).
Then there's the shader cores, which are optimized for throughput rather than latency (and ditto for the memory hierarchy).
>And please explain to me why the workload is not relevant to unification, instead
>of handwaving. If pixel shading still only consisted of integer operations, there
>wouldn't be any significant reason to unify them, and shader cores would be truly
>dedicated to graphics. Unification is strong proof that dedicated hardware is not
>the universal answer. If you still think it's irrelevant to the discussion, please elaborate.
I already explained, so perhaps repeating myself will help. They are DEDICATED TO THROUGHPUT ORIENTED WORKLOADS.
>>Let's take another example. Icera makes a very cool SDR. However, to meet the
>>performance and power efficiency requirements, they use a custom designed chip to
>>run the SDR. So, the 'dedicated hardware' is used by many different radio protocols,
>>in exactly the same way that GPU shaders are used by many different shader types.
>It's still dedicated hardware though.
>
>Does Icera's SDR support IEEE-754? I guess not, so *this* is irrelevant.
What does IEEE-754 have to do with latency?
>It's nothing personal, but face it, you're running out of arguments and start handwaving
>and reaching for absurd examples which I'm easily able to debunk.
Only because you totally fail to understand and refuse to acknowledge reality.
>>In case you haven't noticed, modern CPUs are filled with idle silicon. Floating
>>point units, AES crypto blocks, virtualization support, real mode support, etc. Many of these were added recently.
>
>Floating-point is useful to graphics, so this isn't an argument against software rendering.
>
>As for AES, virtualization, real mode, etc. they certainly don't "fill" the CPU
>with idle silicon.
Microcode?
>Unless you can prove me otherwise, AES doesn't take die space
>proportional to the GPU's ROPs, texture samplers or rasterizers.
The ROPs are used for general purpose workloads, as are the texture sampling units. Where do you think loads and stores are executed? And atomic operations? The rasterizer is not useful for general software, but how much power does it consume? How much area?
>And like I said
>before, fast AES support is important for generic encrypted disk and network access,
>and gather/scatter speeds up software AES so the dedicated hardware can be removed.
You said that, but you're wrong. You cannot remove it for compatibility reasons, and also for security reasons.
>VT-x and real mode are even supported by Atom cores, so it's doubtful this takes
>any noticeable die space on a desktop chip, and it's obviously indispensable for the software that makes use of it.
Why is virtualization support in hardware? VMware was doing fine with their binary translation. Maybe it was added to improve performance and efficiency!!!! Just like rasterizers!
>Besides, like I said before GPUs also have lots of programmability features which
>may or may not be used. For instance it's doubtful I'll ever use my GeForce GTX
>460's double-precision computing capabilities. But that's fine, it's relatively
>small and it's not worth designing a separate chip for the people who do use it.
>
>So I have nothing against dedicated hardware in general, but like I said it has
>to offer a high enough efficiency advantage, weighed against its utilization. The
>problem with some of the GPU's dedicated hardware is that even during its key application,
>graphics, it's often either a significant bottleneck or mostly idle.
You have said that, but frankly, you've said a lot of things that are simply wrong.
How about you provide some hard data on the utilization of the rasterizer on modern high performance GPUs (e.g. the most recent generation from NV or AMD)? They have performance profilers, so it shouldn't be too hard. Then you can find out how much power the rasterizers use, and we can compare it to the power consumption of SW rendering. Then you will actually have at least a marginal understanding of the relative efficiency.
And I'm fairly certain that you will find that comparison to be very unattractive for SW rendering.
>Unifying vertex
>and pixel processing removed the bottleneck between them and increased utilization.
>Texture sampling is useless to generic computing and having too few texture units
>is a bottleneck to graphics, while the importance of FP32 texture filtering increases,
>so it makes lots of sense to start doing the filtering in shader units and have
>more generic gather/scatter units. And support for micropolygons would require substantial
>hardware to sustain the peak throughput, but it's again idle during other workloads
>and even for graphics its full capacity isn't used all the time. Make it smaller,
>and it's a bottleneck when drawing micropolygons. Again unification seems like the better option here to me.
You haven't even quantified the gains from utilization at all for rendering, or the cost in terms of power consumption.
>>>What you're also forgetting is that the software evolves as well. In 2001 people
>>>were really excited about pixel shader 1.1. Today, a desktop GPU with only pixel
>>>shader 1.1 support would be totally ridiculous, regardless of how power efficient
>>>it is. I've said it before; we don't need more pixels, we need more exciting ones.
>>>Which means increasing generic programmability.
>>
>>So let the shaders evolve, and stay separate.
>
>I sincerely hope you're not being serious. There's no way GPU manufacturers will un-unify their architectures.
Please read what I wrote, carefully and think about it. "Stay separate" implies they are already separate. What are they separate from? You seem to assume I'm talking about the vertex/pixel/geo shaders being separate from one another, but that's hardly clear.
What was meant is that the shaders should stay separate from the CPU (which is the state today, even in IGPs).
>>Every single fact that I've seen tends to suggest that software rendering is a demonstrably bad idea.
>
>You haven't demonstrated anything.
Sure I have. CPUs are not optimized for throughput and have roughly 4X lower performance efficiency. In fact, in some cases that's a vast understatement.
A Tesla has roughly 2.2 GFLOP/s per W (DP). A high performance Westmere has roughly 0.75 GFLOP/s per W. Cayman is roughly 2.7 GFLOP/s per W, although a real workstation card would be lower, probably around 2.5 GFLOP/s per W.
So the reality is that the performance per watt is much worse on CPUs than on GPUs, by a factor of 3-4. So to achieve the same throughput, the power consumption would be 3-4X higher. So...um...CPUs aren't throughput optimized.
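The 3-4X figure follows directly from the rough numbers quoted above. The sketch below just does that division; the efficiency values are the approximate ones from this post, and the dictionary keys and variable names are illustrative.

```python
# Approximate double-precision efficiency figures quoted in the post (GFLOP/s per W).
gflops_per_watt = {
    "Tesla": 2.2,
    "Cayman": 2.7,
    "Cayman (workstation estimate)": 2.5,
}
westmere = 0.75  # high performance Westmere CPU, GFLOP/s per W

# Efficiency ratio versus the CPU. At equal throughput, this is also the factor
# by which the CPU's power consumption would exceed the GPU's.
ratio_vs_cpu = {name: eff / westmere for name, eff in gflops_per_watt.items()}

for name, ratio in ratio_vs_cpu.items():
    print(f"{name}: {ratio:.1f}x the Westmere's performance per watt")
```

With these inputs the ratios come out at roughly 2.9x, 3.6x and 3.3x, which is the "factor of 3-4" range.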
>And "tends to suggest" coming from someone who's clearly basing things on prejudice
>is just more handwaving. I've proven you WRONG about the necessity for dedicated
>texture decompression, using real data.
You have no real data. You had bad data from an old simulator that the author of the simulator thought was BS. Garbage in, garbage out.
>And a 6-core CPU with FMA has 6 times more
>FLOPS than SwiftShader currently uses. Furthermore, I've shown that texel fetching
>is currently a huge bottleneck and gather/scatter would put the CPU on par with
>the GPU's capabilities,
No you haven't. "On par with" means +/-20% to me, and I don't think a high-end CPU would come close.
>plus speed up many other graphics operations and other throughput-oriented
>applications. I've also shown that the battery life of a laptop while gaming wouldn't
>be much lower than when using an IGP.
Actually your claims about power are probably the worst.
>And finally I've shown that an IGP does cost
>quite a bit and is worthless for non-graphics applications.
That you definitely haven't shown. And IGPs are useful for the same general purpose applications that a GPU is. Fusion parts will have OpenCL and compute shader. So will Ivy Bridge.
>Bandwidth, throughput, power efficiency, cost, everything is within reach to produce
>a more powerful CPU with adequate graphics capabilities to make the IGP redundant.
>And this whole discussion has made me even more confident it won't stop there.
Congratulations, you have entered the twilight zone.
>>>As I've shown in my previous response, the IGP is bandwidth
>>>limited and software rendering is catching up with it. Things are converging, and
>>>it will lead to much more exciting computer graphics (and other applications). So
>>>there's no need to fight it. Relatively things become less efficient, but note that
>>>GF110 is far less efficient per transistor than a 40 nm NV20 as well. In absolute
>>>terms the efficiency still improves thanks to semiconductor advances. This is not about to stop.
>>
>>Yes and that is why GPU shaders are evolving. What you are suggesting is that
>>we throw away 10-15 years of GPU evolution and simply graft on some major features
>>to CPUs, and hope it works. Even Intel's approach with LRB was much more reasonable.
>
>Yes, GPUs are evolving too, toward a more CPU-like architecture! I've proven that many times now.
Yes and the relative gap in performance is still HUGE.
>And why would you even care if this means throwing away 10-15 years of GPU evolution?
>Did you shed a tear when sound cards became redundant? There's plenty of other examples
>of technology that didn't survive evolution.
No, because A) Creative Labs was a bunch of incompetent idiots and B) sound only took up 1-2% of my CPU cycles. Notice how that's not true of graphics.
>But GPU technology doesn't completely go to waste. AMD probably learned a thing
>or two from ATI on how to improve throughput efficiency (Bulldozer), and NVIDIA
>can produce ARM processors with well balanced IPC, DLP and TLP.
>
>Heck, I wouldn't mind if CPU technology went to waste, if it meant that GPUs are
>efficient enough at complex tasks to run Windows, and was sitting in the center
>of my motherboard. I'll rename it to "CPU" then though.
Yeah, let me know when the GPU can actually compile stuff.
The bottom line is that while it's true that GPUs and CPUs are evolving towards one another, that says nothing about how vast the distance between the two is. The reality is that there is roughly a 4X gap in performance efficiency between GPUs and CPUs on many throughput workloads, and the gap is even larger on latency sensitive workloads.
Throughput means more than just scatter/gather, although it is one key aspect. But to simplify throughput down to scatter/gather is pure ignorance and naivete, and shows an acute lack of understanding of the substantial differences in circuit design, microarchitecture and software.
Anyway, unless you can really produce some real data, I'm done with this conversation.
DK
---------------------------
>Hi David,
>
>David Kanter (dkanter@realworldtech.com) on 2/7/11 wrote:
>---------------------------
>>Nicolas Capens (nicolas.capens@gmail.com) on 2/7/11 wrote:
>>---------------------------
>>>Hi David,
>>>
>>>David Kanter (dkanter@realworldtech.com) on 2/7/11 wrote:
>>>---------------------------
>>>>The fundamental point is that efficiency *really* does matter a lot. In a world
>>>>where power is the #1 performance limiter, doing stuff in dedicated hardware becomes
>>>>increasingly attractive. If you can cut the energy required to render a frame in
>>>>half by using an IGP (which is quite believable), then you have gained a huge amount
>>>>of energy to expend in the CPU or other areas.
>>>>
>>>>The key is understanding that with Moore's Law, spending more transistors to lower
>>>>power is VERY attractive. That dictates more dedicated hardware, not less. Just
>>>>look at what Intel has done - putting embedded micro-controllers inside CPUs to manage power.
>>>>
>>>>The whole idea of eliminating throughput oriented cores goes against that entirely.
>>>
>>>Who's talking about eliminating throughput oriented cores? I'm talking about adding
>>>gather/scatter to CPU cores so they *become* efficient throughput oriented cores
>>>and the functionality of the IGP can be unified.
>>
>>You are talking about eliminating throughput cores. You claim to have a basic
>>understanding of hardware, so it should be readily apparent how CPU cores and throughput
>>cores (e.g. Niagara, GPU shaders) differ.
>
>Throughput-oriented is a term which originates from server systems, long before
>graphics chips were even called GPUs! It merely means a focus on data rate, potentially
>at the cost of latency. Any multiprocessor system, is a throughput oriented system.
>Clock frequency and ILP are latency-oriented, while DLP and TLP are throughput-oriented.
>
>Today's x86 CPUs exploit DLP and TLP (SIMD, multi-core and Hyper-Threading) and
>have a less aggressive clocking than several years ago.
Actually the clock speeds are about the same 3-4GHz.
>So they are definitely throughput-oriented
>architectures already and would become more efficient at with the addition of gather/scatter
>support. Parallel load/store is the main thing setting them apart from GPUs.
That's simply false. There are many more differences in terms of circuit design and the latency of individual instructions.
>GPUs are, in the words of NVIDIA's chief scientists, "aggressively throughput-oriented
>processors". Note though that GF104 features superscalar >execution, intended to
>lower the latency. And aside from reducing bandwidth, caches also reduce latency.
>So GPUs are forces to become less aggressive at using throughput-oriented techniques,
>because reducing latency somewhat also reduces the amount of on-chip storage you
>need. It's a balancing act, because obviously reducing >latency costs transistors as well.
The two architectures are leagues apart in terms of latency? What is the latency of a dependent chain of adds in a GPU? What about a CPU? What is the branch coherence? What is the memory latency?
The memory latency makes this especially obvious, as there is a huge difference.
>>Even to someone without circuit design
>>expertise, it should be blinding obvious - the clock speeds are about a factor of 2-4X different.
>
>Then why did NVIDIA decide to put its aggressively throughput-oriented cores into
>a higher clock domain? The GeForce GTX 560 Ti has a shader clock of 1645 MHz, while
>the Radeon HD 6950 has a clock of 800 MHz. Does that mean >NVIDIA's architecture is not throughput-oriented?
1.5GHz is still half the speed or less of a modern CPU. AMD has GPUs that run at 900MHz. And those speeds have stayed constant for the last several generations (since about 65nm).
>Clock speed alone isn't an indication of being throughput-oriented or not. It's
>a design decision which doesn't have to compromise *effective* throughput, as proven
>by NVIDIA vs. AMD. Another example is Cell BE, which clocks at 3.2 GHz but even
>at 90 nm was considered strongly thoughput-oriented. x86 CPUs have come a long way
>since the single-core Pentium 4 days, and they're not about to stop increasing throughput
>efficiency (AVX, Bulldozer, FMA, etc.).
Listen, no matter what you say, CPUs are still optimized for latency. It's blatantly obvious if you have ever written code for a CPU. It's also blatantly obvious that GPUs are not optimized for latency. This means different circuits, different architectures, pipeline arrangements, etc. etc. and most importantly a different style of programming.
Consider the performance degradation of running a latency sensitive workload (e.g. SPECint) on a GPU vs. a CPU. It's going to be huge.
>As a matter of fact the CPU's clock frequency has remained nearly constant. The
>i7-2600K has a 3.8 GHz Turbo Mode ( but under multi-threaded workloads it's not
>that high), the same as the 2004 Pentium 4 Prescott. In the same time period, NVIDIA's
>GPUs have increased their clock frequency by over a factor 3. There's no indication
>this is going to change. For NVIDIA to conquer the HPC market, it needs to continue
>investing into latency reduction. To prevent an excessive growth in die size, it
>needs to increase the clock frequency. GF100 had some thermal issues, but they got
>that under control with GF110, which has more cores >enabled and even higher clocks.
Again, frequency is just one aspect. There are so many other things to consider. Pipeline depth, result forwarding, dependent instruction latency, etc. etc.
>So while it's "blinding obvious" that there's a clock frequency difference today,
>it's also "blinding obvious" they're on a collision >course. Gather/scatter support
>is still several years out, so by that time they'll have >converged even closer and gather/scatter is the keystone.
Scatter/gather is helpful, but it will not make CPUs as efficient as GPUs. It doesn't help make the ALUs lower power. It doesn't help make the front-end lower power.
Look at what CPUs have that GPUs do not:
Bypass networks
Branch predictors
Prefetchers
Out-of-order execution
Low latency instructions
etc. etc.
The differences at the circuit level are just as big.
>>You cannot simply tack on scatter/gather to a latency optimized CPU core and expect
>>it to look like a throughput core in terms of power efficiency. At least, there
>>is definitely a lack of evidence for any such claims. Moreover, you need to preserve
>>the power efficiency for workloads that cannot be vectorized.
>
>An architecture which balances latency and theoretical throughput, can still achieve
>high effective thoughput. It's how NVIDIA achieved to >outperform AMD with only half the FLOP density.
That's so wrong it isn't even funny. That's because of AMD's use of static scheduling for their VLIW, and because Nvidia is much more optimized for scalar memory accesses. Has nothing to do with latency vs. throughput.
>The way things converge, tacking on gather/scatter support >does put the GPU within
>striking distance, starting with the IGP. For someone not
I don't believe that a homogeneous architecture is optimal at all, and you have yet to show that in any meaningful way. In fact, you have admitted that it is sub-optimal for power consumption...which means that as long as graphics consumes a non-trivial amount of power, that an IGP will be a superior solution. If there is a day when graphics is merely 1-2% of all cycles, then perhaps it might happen...but I don't see that ever happening.
>Note that widening the vectors amortizes the cost of >things like out-of-order execution.
>At the same time, AMD has reduced its VLIW width from 5 to >4, in order to achieve
>higher efficiency.
So what?
>GPUs also introduced concurrent kernel execution and >scalar execution,
>and have growing register files and caches. So they're >investing more transistors
>into latency reduction and programmability than raw FLOPS. >GF110 has a 0.52 FLOP/transitor
>ratio. With G92b that was still a 0.94 ratio.
Register file size is increased for better throughput...the registers per vector lane have been decreasing.
What you are saying is obvious - GPUs are becoming more programmable. But the reality is that they are not even remotely optimized for latency. Where do you compile your code? On a CPU or a GPU?
The bottom line is that to achieve optimal throughput you must sacrifice latency (look at the memory subsystem), and vica versa. You can refuse to believe this, but it's simply true. While GPUs may become more programmable, this is all relative to an architecture that started with 0 programmability. The gap between GPUs and CPUs may shrink, but it will never disappear and the efficiency differences will always be sizable.
>It's easy to see where your preconceptions come from though. NV40 had a 0.24 ratio,
>which G92b increased by a fourfold in a few years time. But you got fooled into
>thinking that this is a trend which can be sustained. Widening a component only
>increases the overall throughput desity of that component till it reaches 50% of
>the die area. And the components themselves get fatter to increase programmability
>as well, and the rest of the architecture needs to support >the same throughput.
Of course it's easy to see where my preconceptions come from. It's reality.
>So for your own sake stop staring yourself blind at >theoretical thoughput. There's
>a lot more to effective performance than that.
>
>>>If dedicated hardware was the universal answer, then GPUs >would still have separate
>>>vertex and pixel pipelines. So clearly you're not taking >all the of the factors into account!
>>
>>Newsflash: GPU shaders are dedicated hardware. The fact that the workloads for
>>vertex and pixel shading are similar enough to use the same hardware is not relevant.
>
>That's old news. Shader cores are no longer dedicated. NVIDIA now calls them CUDA
>cores instead, for good reason. The floating-point operations are IEEE-754 compliant,
>just like on a CPU.
And what is the latency of those operations? The latency of forwarding between dependent ops? What is the branch prediction latency? What is the mispredict penalty? What is the latency of a load?
>A Tesla card is as dedicated to graphics as it is to >biochemical
>research.
You notice how Tesla is dedicated for throughput oriented tasks, rather than latency ones?
>A CPU with gather/scatter would be just as dedicated to those. The only
>things making the GPU dedicated to graphics are the texture samplers, ROPs, and
>rasterizers (each of which are also optimized in a software renderer when there's gather/scatter support).
Then there's the shader cores, which are optimized for throughput rather than latency (and ditto for the memory hierarchy).
>And please explain to me why the workload is not relevant to unification, instead
>of handwaving. If pixel shading still only consisted of integer operations, there
>wouldn't be any significant reason to unify them, and shader cores would be truly
>dedicated to graphics. Unification is strong proof that dedicated hardware is not
>the universal answer. If you still think it's irrelevant to the discussion, please elaborate.
I already explained, so perhaps repeating myself will help. They are DEDICATED TO THROUGHPUT ORIENTED WORKLOADS.
>>Let's take another example. Icera makes a very cool SDR. However, to meet the
>>performance and power efficiency requirements, they use a custom designed chip to
>>run the SDR. So, the 'dedicated hardware' is used by many different radio protocols,
>>in exactly the same way that GPU shaders are used by many different shader types.
>>It's still dedicated hardware though.
>
>Does Icera's SDR support IEEE-754? I guess not, so *this* is irrelevant.
What does IEEE-754 have to do with latency?
>It's nothing personal, but face it, you're running out of arguments and start handwaving
>and reaching for absurd examples which I'm easily able to debunk.
Only because you totally fail to understand and refuse to acknowledge reality.
>>In case you haven't noticed, modern CPUs are filled with idle silicon. Floating
>>point units, AES crypto blocks, virtualization support, real mode support, etc. Many of these were added recently.
>
>Floating-point is useful to graphics, so this isn't an argument against software rendering.
>
>As for AES, virtualization, real mode, etc. they certainly don't "fill" the CPU
>with idle silicon.
Microcode?
>Unless you can prove me otherwise, AES doesn't take die space
>proportional to the GPU's ROPs, texture samplers or rasterizers.
The ROPs are used for general purpose workloads, as are the texture sampling units. Where do you think loads and stores are executed? And atomic operations? The rasterizer is not useful for general software, but how much power does it consume? How much area?
>And like I said
>before, fast AES support is important for generic encrypted disk and network access,
>and gather/scatter speeds up software AES so the dedicated hardware can be removed.
You said that, but you're wrong. You cannot remove it for compatibility reasons, and also for security reasons.
>VT-x and real mode are even supported by Atom cores, so it's doubtful this takes
>any noticeable die space on a desktop chip, and it's obviously indispensable for the software that makes use of it.
Why is virtualization support in hardware? VMware was doing fine with their binary translation. Maybe it was added to improve performance and efficiency!!!! Just like rasterizers!
>Besides, like I said before GPUs also have lots of programmability features which
>may or may not be used. For instance it's doubtful I'll ever use my GeForce GTX
>460's double-precision computing capabilities. But that's fine, it's relatively
>small and it's not worth designing a separate chip for the people who do use it.
>
>So I have nothing against dedicated hardware in general, but like I said it has
>to offer a high enough efficiency advantage, weighed against its utilization. The
>problem with some of the GPU's dedicated hardware is that even during its key application,
>graphics, it's often either a significant bottleneck or mostly idle.
You have said that, but frankly, you've said a lot of things that are simply wrong.
How about you provide some hard data on modern high performance GPUs (e.g. the most recent generation from NV or AMD) on the utilization of the rasterizer. They have performance profilers, so it shouldn't be too hard. Then you can find out how much power the rasterizers use, and we can compare it to the power consumption of SW rendering. Then you will actually have at least a marginal understanding of the relative efficiency.
And I'm fairly certain that you will find that comparison to be very unattractive for SW rendering.
>Unifying vertex
>and pixel processing removed the bottleneck between them and increased utilization.
>Texture sampling is useless to generic computing and having too few texture units
>is a bottleneck to graphics, while the importance of FP32 texture filtering increases,
>so it makes lots of sense to start doing the filtering in shader units and have
>more generic gather/scatter units. And support for micropolygons would require substantial
>hardware to sustain the peak throughput, but it's again idle during other workloads
>and even for graphics its full capacity isn't used all the time. Make it smaller,
>and it's a bottleneck when drawing micropolygons. Again unification seems like the better option here to me.
You haven't even quantified the gains from utilization at all for rendering, or the cost in terms of power consumption.
>>>What you're also forgetting is that the software evolves as well. In 2001 people
>>>were really excited about pixel shader 1.1. Today, a desktop GPU with only pixel
>>>shader 1.1 support would be totally ridiculous, regardless of how power efficient
>>>it is. I've said it before; we don't need more pixels, we need more exciting ones.
>>>Which means increasing generic programmability.
>>
>>So let the shaders evolve, and stay separate.
>
>I sincerely hope you're not being serious. There's no way GPU manufacturers will un-unify their architectures.
Please read what I wrote, carefully and think about it. "Stay separate" implies they are already separate. What are they separate from? You seem to assume I'm talking about the vertex/pixel/geo shaders being separate from one another, but that's hardly clear.
What was meant is that the shaders should stay separate from the CPU (which is the state today, even in IGPs).
>>Every single fact that I've seen tends to suggest that software rendering is a demonstrably bad idea.
>
>You haven't demonstrated anything.
Sure I have. CPUs are not optimized for throughput and have roughly 4X lower performance efficiency. In fact, in some cases that's a vast understatement.
A Tesla has roughly 2.2 GFLOP/s per W (DP). A high performance Westmere has roughly 0.75 GFLOP/s per W. Cayman is roughly 2.7 GFLOP/s per W, although a real workstation card would be lower, probably around 2.5 GFLOP/s per W.
So the reality is that the performance per watt is much worse on CPUs than on GPUs, by a factor of 3-4. So to achieve the same throughput, the power consumption would be 3-4X higher. So...um...CPUs aren't throughput optimized.
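To make the arithmetic behind that "factor of 3-4" explicit, here is a minimal sketch using only the rough figures quoted above. The function names and the 500 GFLOP/s target are hypothetical, picked purely for illustration, and the efficiency numbers are the estimates from this post, not measurements.

```python
# Rough DP efficiency figures quoted in the post, in GFLOP/s per W.
# These are the post's estimates, not measurements.
CPU_WESTMERE = 0.75
GPUS = {
    "Tesla": 2.2,
    "Cayman": 2.7,
    "Cayman workstation card (estimate)": 2.5,
}

def efficiency_ratio(gpu_gflops_per_w, cpu_gflops_per_w=CPU_WESTMERE):
    """How many times more GFLOP/s per W the GPU delivers than the CPU."""
    return gpu_gflops_per_w / cpu_gflops_per_w

def watts_for(target_gflops, gflops_per_w):
    """Power needed to sustain target_gflops at a given efficiency."""
    return target_gflops / gflops_per_w

ratios = {name: efficiency_ratio(eff) for name, eff in GPUS.items()}

# Hypothetical throughput target, chosen only for illustration.
TARGET_GFLOPS = 500.0
cpu_watts = watts_for(TARGET_GFLOPS, CPU_WESTMERE)
tesla_watts = watts_for(TARGET_GFLOPS, GPUS["Tesla"])

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x the Westmere figure")
    print(f"{TARGET_GFLOPS:.0f} GFLOP/s: CPU {cpu_watts:.0f} W vs Tesla {tesla_watts:.0f} W")
```

With these inputs the ratios land between roughly 2.9x and 3.6x, and the power needed for a fixed throughput scales by the same ratio, since power is just throughput divided by efficiency.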
>And "tends to suggest" coming from someone who's clearly basing things on prejudice
>is just more handwaving. I've proven you WRONG about the necessity for dedicated
>texture decompression, using real data.
You have no real data. You had bad data from an old simulator that the author of the simulator thought was BS. Garbage in, garbage out.
>And a 6-core CPU with FMA has 6 times more
>FLOPS than SwiftShader currently uses. Furthermore, I've shown that texel fetching
>is currently a huge bottleneck and gather/scatter would put the CPU on par with
>the GPU's capabilities,
No you haven't. "On par with" means +/-20% to me, and I don't think a high-end CPU would come close.
>plus speed up many other graphics operations and other throughput-oriented
>applications. I've also shown that the battery life of a laptop while gaming wouldn't
>be much lower than when using an IGP.
Actually your claims about power are probably the worst.
>And finally I've shown that an IGP does cost
>quite a bit and is worthless for non-graphics applications.
That you definitely haven't shown. And IGPs are useful for the same general purpose applications that a GPU is. Fusion parts will have OpenCL and compute shader. So will Ivy Bridge.
>Bandwidth, throughput, power efficiency, cost, everything is within reach to produce
>a more powerful CPU with adequate graphics capabilities to make the IGP redundant.
>And this whole discussion has made me even more confident it won't stop there.
Congratulations, you have entered the twilight zone.
>>>As I've shown in my previous response, the IGP is bandwidth
>>>limited and software rendering is catching up with it. Things are converging, and
>>>it will lead to much more exciting computer graphics (and other applications). So
>>>there's no need to fight it. Relatively things become less efficient, but note that
>>>GT110 is far less efficient per transistor than a 40 nm NV20 as well. In absolute
>>>terms the efficiency still improves thanks to semiconductor advances. This is not about to stop.
>>
>>Yes and that is why GPU shaders are evolving. What you are suggesting is that
>>we throw away 10-15 years of GPU evolution and simply graft on some major features
>>to CPUs, and hope it works. Even Intel's approach with LRB was much more reasonable.
>
>Yes, GPUs are evolving too, toward a more CPU-like architecture! I've proven that many times now.
Yes and the relative gap in performance is still HUGE.
>And why would you even care if this means throwing away 10-15 years of GPU evolution?
>Did you shed a tear when sound cards became redundant? There's plenty of other examples
>of technology that didn't survive evolution.
No, because A) Creative Labs was a bunch of incompetent idiots and B) sound only took up 1-2% of my CPU cycles. Notice how that's not true of graphics.
>But GPU technology doesn't completely go to waste. AMD probably learned a thing
>or two from ATI on how to improve throughput efficiency (Bulldozer), and NVIDIA
>can produce ARM processors with well balanced IPC, DLP and TLP.
>
>Heck, I wouldn't mind if CPU technology went to waste, if it meant that GPUs are
>efficient enough at complex tasks to run Windows, and was sitting in the center
>of my motherboard. I'll rename it to "CPU" then though.
Yeah, let me know when the GPU can actually compile stuff.
The bottom line is that while it's true that GPUs and CPUs are evolving towards one another, that says nothing about how vast the distance between the two is. The reality is that there is roughly a 4X gap in performance efficiency between GPUs and CPUs on many throughput workloads, and the gap is even larger on latency sensitive workloads.
Throughput means more than just scatter/gather, although it is one key aspect. But to reduce throughput to just scatter/gather is pure ignorance and naivete, and shows an acute lack of understanding of the substantial differences in circuit design, microarchitecture and software.
Anyway, unless you can really produce some real data, I'm done with this conversation.
DK
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/04 07:29 AM |
Sandy Bridge CPU article online | Seni | 2011/01/04 09:07 PM |
Sandy Bridge CPU article online | hobold | 2011/01/04 11:26 PM |
Sandy Bridge CPU article online | Michael S | 2011/01/05 02:01 AM |
software assist exceptions | hobold | 2011/01/05 04:36 PM |
Sandy Bridge CPU article online | Michael S | 2011/01/05 01:58 AM |
Sandy Bridge CPU article online | anon | 2011/01/05 04:51 AM |
Sandy Bridge CPU article online | Seni | 2011/01/05 08:53 AM |
Sandy Bridge CPU article online | Michael S | 2011/01/05 09:03 AM |
Sandy Bridge CPU article online | anon | 2011/01/05 04:14 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/05 04:50 AM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/05 05:00 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/05 07:26 AM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/05 07:50 AM |
Sandy Bridge CPU article online | Michael S | 2011/01/05 08:39 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/05 03:50 PM |
permuting vector elements | hobold | 2011/01/05 05:03 PM |
permuting vector elements | Nicolas Capens | 2011/01/05 06:01 PM |
permuting vector elements | Nicolas Capens | 2011/01/06 08:27 AM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/11 11:33 AM |
Sandy Bridge CPU article online | EduardoS | 2011/01/11 01:51 PM |
Sandy Bridge CPU article online | hobold | 2011/01/11 02:11 PM |
Sandy Bridge CPU article online | David Kanter | 2011/01/11 06:07 PM |
Sandy Bridge CPU article online | Michael S | 2011/01/12 03:25 AM |
Sandy Bridge CPU article online | hobold | 2011/01/12 05:03 PM |
Sandy Bridge CPU article online | David Kanter | 2011/01/12 11:27 PM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/13 02:38 AM |
Sandy Bridge CPU article online | Michael S | 2011/01/13 03:32 AM |
Sandy Bridge CPU article online | hobold | 2011/01/13 01:53 PM |
What happened to VPERMIL2PS? | Michael S | 2011/01/13 03:46 AM |
What happened to VPERMIL2PS? | Eric Bron | 2011/01/13 06:46 AM |
Lower cost permute | Paul A. Clayton | 2011/01/13 12:11 PM |
Sandy Bridge CPU article online | anon | 2011/01/25 06:31 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/12 06:34 PM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/13 07:38 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/15 09:47 PM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/01/16 03:13 AM |
And just to make a further example | Gabriele Svelto | 2011/01/16 04:24 AM |
Sandy Bridge CPU article online | mpx | 2011/01/16 01:27 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/25 02:56 PM |
Sandy Bridge CPU article online | David Kanter | 2011/01/25 04:11 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/26 08:49 AM |
Sandy Bridge CPU article online | EduardoS | 2011/01/26 04:35 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/27 02:51 AM |
Sandy Bridge CPU article online | EduardoS | 2011/01/27 02:40 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/28 03:24 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/28 03:49 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/30 02:11 PM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/31 03:43 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/01 04:02 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/02/01 04:28 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/02/01 04:43 AM |
Sandy Bridge CPU article online | EduardoS | 2011/01/28 07:14 PM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/01 02:58 AM |
Sandy Bridge CPU article online | EduardoS | 2011/02/01 02:36 PM |
Sandy Bridge CPU article online | anon | 2011/02/01 04:56 PM |
Sandy Bridge CPU article online | EduardoS | 2011/02/01 09:17 PM |
Sandy Bridge CPU article online | anon | 2011/02/01 10:13 PM |
Sandy Bridge CPU article online | Eric Bron | 2011/02/02 04:08 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/02/02 04:26 AM |
Sandy Bridge CPU article online | kalmaegi | 2011/02/01 09:29 AM |
SW Rasterization | David Kanter | 2011/01/27 05:18 PM |
Lower pin count memory | iz | 2011/01/27 09:19 PM |
Lower pin count memory | David Kanter | 2011/01/27 09:25 PM |
Lower pin count memory | iz | 2011/01/27 11:31 PM |
Lower pin count memory | David Kanter | 2011/01/27 11:52 PM |
Lower pin count memory | iz | 2011/01/28 12:28 AM |
Lower pin count memory | David Kanter | 2011/01/28 01:05 AM |
Lower pin count memory | iz | 2011/01/28 03:55 AM |
Lower pin count memory | David Hess | 2011/01/28 01:15 PM |
Lower pin count memory | David Kanter | 2011/01/28 01:57 PM |
Lower pin count memory | iz | 2011/01/28 05:20 PM |
Two years later | ForgotPants | 2013/10/26 11:33 AM |
Two years later | anon | 2013/10/26 11:36 AM |
Two years later | Exophase | 2013/10/26 12:56 PM |
Two years later | David Hess | 2013/10/26 05:05 PM |
Herz is totally the thing you DON*T care. | Jouni Osmala | 2013/10/27 01:48 AM |
Herz is totally the thing you DON*T care. | EduardoS | 2013/10/27 07:00 AM |
Herz is totally the thing you DON*T care. | Michael S | 2013/10/27 07:45 AM |
Two years later | someone | 2013/10/28 07:21 AM |
Lower pin count memory | Martin Høyer Kristiansen | 2011/01/28 01:41 AM |
Lower pin count memory | iz | 2011/01/28 03:07 AM |
Lower pin count memory | Darrell Coker | 2011/01/27 10:39 PM |
Lower pin count memory | iz | 2011/01/28 12:20 AM |
Lower pin count memory | Darrell Coker | 2011/01/28 06:07 PM |
Lower pin count memory | iz | 2011/01/28 11:57 PM |
Lower pin count memory | Darrell Coker | 2011/01/29 02:21 AM |
Lower pin count memory | iz | 2011/01/31 10:28 PM |
SW Rasterization | Nicolas Capens | 2011/02/02 08:48 AM |
SW Rasterization | Eric Bron | 2011/02/02 09:37 AM |
SW Rasterization | Nicolas Capens | 2011/02/02 04:35 PM |
SW Rasterization | Eric Bron | 2011/02/02 05:11 PM |
SW Rasterization | Eric Bron | 2011/02/03 02:13 AM |
SW Rasterization | Nicolas Capens | 2011/02/04 07:57 AM |
SW Rasterization | Eric Bron | 2011/02/04 08:50 AM |
erratum | Eric Bron | 2011/02/04 08:58 AM |
SW Rasterization | Nicolas Capens | 2011/02/04 05:25 PM |
SW Rasterization | David Kanter | 2011/02/04 05:33 PM |
SW Rasterization | anon | 2011/02/04 06:04 PM |
SW Rasterization | Nicolas Capens | 2011/02/05 03:39 PM |
SW Rasterization | David Kanter | 2011/02/05 05:07 PM |
SW Rasterization | Nicolas Capens | 2011/02/05 11:39 PM |
SW Rasterization | Eric Bron | 2011/02/04 10:55 AM |
Comments pt 1 | David Kanter | 2011/02/02 01:08 PM |
Comments pt 1 | Eric Bron | 2011/02/02 03:16 PM |
Comments pt 1 | Gabriele Svelto | 2011/02/03 01:37 AM |
Comments pt 1 | Eric Bron | 2011/02/03 02:36 AM |
Comments pt 1 | Nicolas Capens | 2011/02/03 11:08 PM |
Comments pt 1 | Nicolas Capens | 2011/02/03 10:26 PM |
Comments pt 1 | Eric Bron | 2011/02/04 03:33 AM |
Comments pt 1 | Nicolas Capens | 2011/02/04 05:24 AM |
example code | Eric Bron | 2011/02/04 04:51 AM |
example code | Nicolas Capens | 2011/02/04 08:24 AM |
example code | Eric Bron | 2011/02/04 08:36 AM |
example code | Nicolas Capens | 2011/02/05 11:43 PM |
Comments pt 1 | Rohit | 2011/02/04 12:43 PM |
Comments pt 1 | Nicolas Capens | 2011/02/04 05:05 PM |
Comments pt 1 | David Kanter | 2011/02/04 05:36 PM |
Comments pt 1 | Nicolas Capens | 2011/02/05 02:45 PM |
Comments pt 1 | Eric Bron | 2011/02/05 04:13 PM |
Comments pt 1 | Nicolas Capens | 2011/02/05 11:52 PM |
Comments pt 1 | Eric Bron | 2011/02/06 01:31 AM |
Comments pt 1 | Nicolas Capens | 2011/02/06 04:06 PM |
Comments pt 1 | Eric Bron | 2011/02/07 03:12 AM |
The need for gather/scatter support | Nicolas Capens | 2011/02/10 10:07 AM |
The need for gather/scatter support | Eric Bron | 2011/02/11 03:11 AM |
Gather/scatter performance data | Nicolas Capens | 2011/02/13 03:39 AM |
Gather/scatter performance data | Eric Bron | 2011/02/13 07:46 AM |
Gather/scatter performance data | Nicolas Capens | 2011/02/14 07:48 AM |
Gather/scatter performance data | Eric Bron | 2011/02/14 09:32 AM |
Gather/scatter performance data | Eric Bron | 2011/02/14 10:07 AM |
Gather/scatter performance data | Eric Bron | 2011/02/13 09:00 AM |
Gather/scatter performance data | Nicolas Capens | 2011/02/14 07:49 AM |
Gather/scatter performance data | Eric Bron | 2011/02/15 02:23 AM |
Gather/scatter performance data | Eric Bron | 2011/02/13 05:06 PM |
Gather/scatter performance data | Nicolas Capens | 2011/02/14 07:52 AM |
Gather/scatter performance data | Eric Bron | 2011/02/14 09:43 AM |
SW Rasterization - a long way off | Rohit | 2011/02/02 01:17 PM |
SW Rasterization - a long way off | Nicolas Capens | 2011/02/04 03:59 AM |
CPU only rendering - a long way off | Rohit | 2011/02/04 11:52 AM |
CPU only rendering - a long way off | Nicolas Capens | 2011/02/04 07:15 PM |
CPU only rendering - a long way off | Rohit | 2011/02/05 02:00 AM |
CPU only rendering - a long way off | Nicolas Capens | 2011/02/05 09:45 PM |
CPU only rendering - a long way off | David Kanter | 2011/02/06 09:51 PM |
CPU only rendering - a long way off | Gian-Carlo Pascutto | 2011/02/07 12:22 AM |
Encryption | David Kanter | 2011/02/07 01:18 AM |
Encryption | Nicolas Capens | 2011/02/07 07:51 AM |
Encryption | David Kanter | 2011/02/07 11:50 AM |
Encryption | Nicolas Capens | 2011/02/08 10:26 AM |
CPUs are latency optimized | David Kanter | 2011/02/08 11:38 AM |
efficient compiler on an efficient GPU real today. | sJ | 2011/02/08 11:29 PM |
CPUs are latency optimized | Nicolas Capens | 2011/02/09 09:49 PM |
CPUs are latency optimized | Eric Bron | 2011/02/10 12:49 AM |
CPUs are latency optimized | Antti-Ville Tuunainen | 2011/02/10 06:16 AM |
CPUs are latency optimized | Nicolas Capens | 2011/02/10 07:04 AM |
CPUs are latency optimized | Eric Bron | 2011/02/10 07:48 AM |
CPUs are latency optimized | Nicolas Capens | 2011/02/10 01:31 PM |
CPUs are latency optimized | Eric Bron | 2011/02/11 02:43 AM |
CPUs are latency optimized | Nicolas Capens | 2011/02/11 07:31 AM |
CPUs are latency optimized | EduardoS | 2011/02/10 05:29 PM |
CPUs are latency optimized | Anon | 2011/02/10 06:40 PM |
CPUs are latency optimized | David Kanter | 2011/02/10 08:33 PM |
CPUs are latency optimized | EduardoS | 2011/02/11 02:18 PM |
CPUs are latency optimized | Nicolas Capens | 2011/02/11 05:56 AM |
CPUs are latency optimized | Rohit | 2011/02/11 07:33 AM |
CPUs are latency optimized | Nicolas Capens | 2011/02/14 02:19 AM |
CPUs are latency optimized | Eric Bron | 2011/02/14 03:23 AM |
CPUs are latency optimized | EduardoS | 2011/02/14 01:11 PM |
CPUs are latency optimized | David Kanter | 2011/02/11 02:45 PM |
CPUs are latency optimized | Nicolas Capens | 2011/02/15 05:22 AM |
CPUs are latency optimized | David Kanter | 2011/02/15 12:47 PM |
CPUs are latency optimized | Nicolas Capens | 2011/02/15 07:10 PM |
Have fun | David Kanter | 2011/02/15 10:04 PM |
Have fun | Nicolas Capens | 2011/02/17 03:59 AM |
Have fun | Brett | 2011/02/17 12:56 PM |
Have fun | Nicolas Capens | 2011/02/19 04:53 PM |
Have fun | Brett | 2011/02/20 06:08 PM |
Have fun | Brett | 2011/02/20 07:13 PM |
On-die storage to fight Amdahl | Nicolas Capens | 2011/02/23 05:37 PM |
On-die storage to fight Amdahl | Brett | 2011/02/23 09:59 PM |
On-die storage to fight Amdahl | Brett | 2011/02/23 10:08 PM |
On-die storage to fight Amdahl | Nicolas Capens | 2011/02/24 07:42 PM |
On-die storage to fight Amdahl | Rohit | 2011/02/25 11:02 PM |
On-die storage to fight Amdahl | Nicolas Capens | 2011/03/09 06:53 PM |
On-die storage to fight Amdahl | Rohit | 2011/03/10 08:02 AM |
NVIDIA using tile based rendering? | Nathan Monson | 2011/03/11 07:58 PM |
NVIDIA using tile based rendering? | Rohit | 2011/03/12 04:29 AM |
NVIDIA using tile based rendering? | Nathan Monson | 2011/03/12 11:05 AM |
NVIDIA using tile based rendering? | Rohit | 2011/03/12 11:16 AM |
On-die storage to fight Amdahl | Brett | 2011/02/26 02:10 AM |
On-die storage to fight Amdahl | Nathan Monson | 2011/02/26 01:51 PM |
On-die storage to fight Amdahl | Brett | 2011/02/26 04:40 PM |
Convergence is inevitable | Nicolas Capens | 2011/03/09 08:22 PM |
Convergence is inevitable | Brett | 2011/03/09 10:59 PM |
Convergence is inevitable | Antti-Ville Tuunainen | 2011/03/10 03:34 PM |
Convergence is inevitable | Brett | 2011/03/10 09:39 PM |
Procedural texturing? | David Kanter | 2011/03/11 01:32 AM |
Procedural texturing? | hobold | 2011/03/11 03:59 AM |
Procedural texturing? | Dan Downs | 2011/03/11 09:28 AM |
Procedural texturing? | Mark Roulo | 2011/03/11 02:58 PM |
Procedural texturing? | Anon | 2011/03/11 06:11 PM |
Procedural texturing? | Nathan Monson | 2011/03/11 07:30 PM |
Procedural texturing? | Brett | 2011/03/15 07:45 AM |
Procedural texturing? | Seni | 2011/03/15 10:13 AM |
Procedural texturing? | Brett | 2011/03/15 11:45 AM |
Procedural texturing? | Seni | 2011/03/15 02:09 PM |
Procedural texturing? | Brett | 2011/03/11 10:02 PM |
Procedural texturing? | Brett | 2011/03/11 09:34 PM |
Procedural texturing? | Eric Bron | 2011/03/12 03:37 AM |
Convergence is inevitable | Jouni Osmala | 2011/03/09 11:28 PM |
Convergence is inevitable | Brett | 2011/04/05 05:08 PM |
Convergence is inevitable | Nicolas Capens | 2011/04/07 05:23 AM |
Convergence is inevitable | none | 2011/04/07 07:03 AM |
Convergence is inevitable | Nicolas Capens | 2011/04/07 10:34 AM |
Convergence is inevitable | anon | 2011/04/07 02:15 PM |
Convergence is inevitable | none | 2011/04/08 01:57 AM |
Convergence is inevitable | Brett | 2011/04/07 08:04 PM |
Convergence is inevitable | none | 2011/04/08 02:14 AM |
Gather implementation | David Kanter | 2011/04/08 12:01 PM |
RAM Latency | David Hess | 2011/04/07 08:22 AM |
RAM Latency | Brett | 2011/04/07 07:20 PM |
RAM Latency | Nicolas Capens | 2011/04/07 10:18 PM |
RAM Latency | Brett | 2011/04/08 05:33 AM |
RAM Latency | Nicolas Capens | 2011/04/10 02:23 PM |
RAM Latency | Rohit | 2011/04/08 06:57 AM |
RAM Latency | Nicolas Capens | 2011/04/10 01:23 PM |
RAM Latency | David Kanter | 2011/04/10 02:27 PM |
RAM Latency | Rohit | 2011/04/11 06:17 AM |
Convergence is inevitable | Eric Bron | 2011/04/07 09:46 AM |
Convergence is inevitable | Nicolas Capens | 2011/04/07 09:50 PM |
Convergence is inevitable | Eric Bron | 2011/04/08 12:39 AM |
Flaws in PowerVR | Rohit | 2011/02/25 11:21 PM |
Flaws in PowerVR | Brett | 2011/02/26 12:37 AM |
Flaws in PowerVR | Paul | 2011/02/26 05:17 AM |
Have fun | David Kanter | 2011/02/18 12:52 PM |
Have fun | Michael S | 2011/02/19 12:12 PM |
Have fun | David Kanter | 2011/02/19 03:26 PM |
Have fun | Michael S | 2011/02/19 04:43 PM |
Have fun | anon | 2011/02/19 05:02 PM |
Have fun | Michael S | 2011/02/19 05:56 PM |
Have fun | anon | 2011/02/20 03:50 PM |
Have fun | EduardoS | 2011/02/20 02:44 PM |
Linear vs non-linear | EduardoS | 2011/02/20 02:55 PM |
Have fun | Michael S | 2011/02/20 04:19 PM |
Have fun | EduardoS | 2011/02/20 05:51 PM |
Have fun | Nicolas Capens | 2011/02/21 11:12 AM |
Have fun | Michael S | 2011/02/21 12:38 PM |
Have fun | Eric Bron | 2011/02/21 02:10 PM |
Have fun | Eric Bron | 2011/02/21 02:39 PM |
Have fun | Michael S | 2011/02/21 06:13 PM |
Have fun | Eric Bron | 2011/02/22 12:43 AM |
Have fun | Michael S | 2011/02/22 01:47 AM |
Have fun | Eric Bron | 2011/02/22 02:10 AM |
Have fun | Michael S | 2011/02/22 11:37 AM |
Have fun | anon | 2011/02/22 01:38 PM |
Have fun | EduardoS | 2011/02/22 03:49 PM |
Gather/scatter efficiency | Nicolas Capens | 2011/02/23 06:37 PM |
Gather/scatter efficiency | anonymous | 2011/02/23 06:51 PM |
Gather/scatter efficiency | Nicolas Capens | 2011/02/24 06:57 PM |
Gather/scatter efficiency | anonymous | 2011/02/24 07:16 PM |
Gather/scatter efficiency | Michael S | 2011/02/25 07:45 AM |
Gather implementation | David Kanter | 2011/02/25 05:34 PM |
Gather implementation | Michael S | 2011/02/26 10:40 AM |
Gather implementation | anon | 2011/02/26 11:52 AM |
Gather implementation | Michael S | 2011/02/26 12:16 PM |
Gather implementation | anon | 2011/02/26 11:22 PM |
Gather implementation | Michael S | 2011/02/27 07:23 AM |
Gather/scatter efficiency | Nicolas Capens | 2011/02/28 03:14 PM |
Consider yourself ignored | David Kanter | 2011/02/22 01:05 AM |
one more anti-FMA flame. By me. | Michael S | 2011/02/16 07:40 AM |
one more anti-FMA flame. By me. | Eric Bron | 2011/02/16 08:30 AM |
one more anti-FMA flame. By me. | Eric Bron | 2011/02/16 09:15 AM |
one more anti-FMA flame. By me. | Nicolas Capens | 2011/02/17 06:27 AM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/17 07:42 AM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/17 05:46 PM |
Tarantula paper | Paul A. Clayton | 2011/02/18 12:38 AM |
Tarantula paper | Nicolas Capens | 2011/02/19 05:19 PM |
anti-FMA != anti-throughput or anti-SG | Eric Bron | 2011/02/18 01:48 AM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/20 03:46 PM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/20 05:00 PM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/23 04:05 AM |
Software pipelining on x86 | David Kanter | 2011/02/23 05:04 AM |
Software pipelining on x86 | JS | 2011/02/23 05:25 AM |
Software pipelining on x86 | Salvatore De Dominicis | 2011/02/23 08:37 AM |
Software pipelining on x86 | Jouni Osmala | 2011/02/23 09:10 AM |
Software pipelining on x86 | LeeMiller | 2011/02/23 10:07 PM |
Software pipelining on x86 | Nicolas Capens | 2011/02/24 03:17 PM |
Software pipelining on x86 | anonymous | 2011/02/24 07:04 PM |
Software pipelining on x86 | Nicolas Capens | 2011/02/28 09:27 AM |
Software pipelining on x86 | Antti-Ville Tuunainen | 2011/03/02 04:31 AM |
Software pipelining on x86 | Megol | 2011/03/02 12:55 PM |
Software pipelining on x86 | Geert Bosch | 2011/03/03 07:58 AM |
FMA benefits and latency predictions | David Kanter | 2011/02/25 05:14 PM |
FMA benefits and latency predictions | Antti-Ville Tuunainen | 2011/02/26 10:43 AM |
FMA benefits and latency predictions | Matt Waldhauer | 2011/02/27 06:42 AM |
FMA benefits and latency predictions | Nicolas Capens | 2011/03/09 06:11 PM |
FMA benefits and latency predictions | Rohit | 2011/03/10 08:11 AM |
FMA benefits and latency predictions | Eric Bron | 2011/03/10 09:30 AM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/23 05:19 AM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/23 07:50 AM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/23 10:37 AM |
FMA and beyond | Nicolas Capens | 2011/02/24 04:47 PM |
detour on terminology | hobold | 2011/02/24 07:08 PM |
detour on terminology | Nicolas Capens | 2011/02/28 02:24 PM |
detour on terminology | Eric Bron | 2011/03/01 02:38 AM |
detour on terminology | Michael S | 2011/03/01 05:03 AM |
detour on terminology | Eric Bron | 2011/03/01 05:39 AM |
detour on terminology | Michael S | 2011/03/01 08:33 AM |
detour on terminology | Eric Bron | 2011/03/01 09:34 AM |
erratum | Eric Bron | 2011/03/01 09:54 AM |
detour on terminology | Nicolas Capens | 2011/03/10 08:39 AM |
detour on terminology | Eric Bron | 2011/03/10 09:50 AM |
anti-FMA != anti-throughput or anti-SG | Nicolas Capens | 2011/02/23 06:12 AM |
anti-FMA != anti-throughput or anti-SG | David Kanter | 2011/02/20 11:25 PM |
anti-FMA != anti-throughput or anti-SG | David Kanter | 2011/02/17 06:51 PM |
Tarantula vector unit well-integrated | Paul A. Clayton | 2011/02/18 12:38 AM |
anti-FMA != anti-throughput or anti-SG | Megol | 2011/02/19 02:17 PM |
anti-FMA != anti-throughput or anti-SG | David Kanter | 2011/02/20 02:09 AM |
anti-FMA != anti-throughput or anti-SG | Megol | 2011/02/20 09:55 AM |
anti-FMA != anti-throughput or anti-SG | David Kanter | 2011/02/20 01:39 PM |
anti-FMA != anti-throughput or anti-SG | EduardoS | 2011/02/20 02:35 PM |
anti-FMA != anti-throughput or anti-SG | Megol | 2011/02/21 08:12 AM |
anti-FMA != anti-throughput or anti-SG | anon | 2011/02/17 10:44 PM |
anti-FMA != anti-throughput or anti-SG | Michael S | 2011/02/18 06:20 AM |
one more anti-FMA flame. By me. | Eric Bron | 2011/02/17 08:24 AM |
thanks | Michael S | 2011/02/17 04:56 PM |
CPUs are latency optimized | EduardoS | 2011/02/15 01:24 PM |
SwiftShader SNB test | Eric Bron | 2011/02/15 03:46 PM |
SwiftShader NHM test | Eric Bron | 2011/02/15 04:50 PM |
SwiftShader SNB test | Nicolas Capens | 2011/02/17 12:06 AM |
SwiftShader SNB test | Eric Bron | 2011/02/17 01:21 AM |
SwiftShader SNB test | Eric Bron | 2011/02/22 10:32 AM |
SwiftShader SNB test 2nd run | Eric Bron | 2011/02/22 10:51 AM |
SwiftShader SNB test 2nd run | Nicolas Capens | 2011/02/23 02:14 PM |
SwiftShader SNB test 2nd run | Eric Bron | 2011/02/23 02:42 PM |
Win7SP1 out but no AVX hype? | Michael S | 2011/02/24 03:14 AM |
Win7SP1 out but no AVX hype? | Eric Bron | 2011/02/24 03:39 AM |
CPUs are latency optimized | Eric Bron | 2011/02/15 08:02 AM |
CPUs are latency optimized | EduardoS | 2011/02/11 03:40 PM |
CPU only rendering - not a long way off | Nicolas Capens | 2011/02/07 06:45 AM |
CPU only rendering - not a long way off | David Kanter | 2011/02/07 12:09 PM |
CPU only rendering - not a long way off | anonymous | 2011/02/07 10:25 PM |
Sandy Bridge IGP EUs | David Kanter | 2011/02/07 11:22 PM |
Sandy Bridge IGP EUs | Hannes | 2011/02/08 05:59 AM |
SW Rasterization - Why? | Seni | 2011/02/02 02:53 PM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/10 03:12 PM |
Market reasons to ditch the IGP | Seni | 2011/02/11 05:42 AM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/16 04:29 AM |
Market reasons to ditch the IGP | Seni | 2011/02/16 01:39 PM |
An excellent post! | David Kanter | 2011/02/16 03:18 PM |
CPUs clock higher | Moritz | 2011/02/17 08:06 AM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/18 06:22 PM |
Market reasons to ditch the IGP | IntelUser2000 | 2011/02/18 07:20 PM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/21 02:42 PM |
Bad data (repeated) | David Kanter | 2011/02/22 12:21 AM |
Bad data (repeated) | none | 2011/02/22 03:04 AM |
13W or 8W? | Foo_ | 2011/02/22 06:00 AM |
13W or 8W? | Linus Torvalds | 2011/02/22 08:58 AM |
13W or 8W? | David Kanter | 2011/02/22 11:33 AM |
13W or 8W? | Mark Christiansen | 2011/02/22 02:47 PM |
Bigger picture | Nicolas Capens | 2011/02/24 06:33 PM |
Bigger picture | Nicolas Capens | 2011/02/24 08:06 PM |
20+ Watt | Nicolas Capens | 2011/02/24 08:18 PM |
<20W | David Kanter | 2011/02/25 01:13 PM |
>20W | Nicolas Capens | 2011/03/08 07:34 PM |
IGP is 3X more efficient | David Kanter | 2011/03/08 10:53 PM |
IGP is 3X more efficient | Eric Bron | 2011/03/09 02:44 AM |
>20W | Eric Bron | 2011/03/09 03:48 AM |
Specious data and claims are still specious | David Kanter | 2011/02/25 02:38 AM |
IGP power consumption, LRB samplers | Nicolas Capens | 2011/03/08 06:24 PM |
IGP power consumption, LRB samplers | EduardoS | 2011/03/08 06:52 PM |
IGP power consumption, LRB samplers | Rohit | 2011/03/09 07:42 AM |
Market reasons to ditch the IGP | none | 2011/02/22 02:58 AM |
Market reasons to ditch the IGP | Nicolas Capens | 2011/02/24 06:43 PM |
Market reasons to ditch the IGP | slacker | 2011/02/22 02:32 PM |
Market reasons to ditch the IGP | Seni | 2011/02/18 09:51 PM |
Correction - 28 comparators, not 36. (NT) | Seni | 2011/02/18 10:03 PM |
Market reasons to ditch the IGP | Gabriele Svelto | 2011/02/19 01:49 AM |
Market reasons to ditch the IGP | Seni | 2011/02/19 11:59 AM |
Market reasons to ditch the IGP | Exophase | 2011/02/20 10:43 AM |
Market reasons to ditch the IGP | EduardoS | 2011/02/19 10:13 AM |
Market reasons to ditch the IGP | Seni | 2011/02/19 11:46 AM |
The next revolution | Nicolas Capens | 2011/02/22 03:33 AM |
The next revolution | Gabriele Svelto | 2011/02/22 09:15 AM |
The next revolution | Eric Bron | 2011/02/22 09:48 AM |
The next revolution | Nicolas Capens | 2011/02/23 07:39 PM |
The next revolution | Gabriele Svelto | 2011/02/24 12:43 AM |
GPGPU content creation (or lack of it) | Nicolas Capens | 2011/02/28 07:39 AM |
GPGPU content creation (or lack of it) | The market begs to differ | 2011/03/01 06:32 AM |
GPGPU content creation (or lack of it) | Nicolas Capens | 2011/03/09 09:14 PM |
GPGPU content creation (or lack of it) | Gabriele Svelto | 2011/03/10 01:01 AM |
The market begs to differ | Gabriele Svelto | 2011/03/01 06:33 AM |
The next revolution | Anon | 2011/02/24 02:15 AM |
The next revolution | Nicolas Capens | 2011/02/28 02:34 PM |
The next revolution | Seni | 2011/02/22 02:02 PM |
The next revolution | Gabriele Svelto | 2011/02/23 06:27 AM |
The next revolution | Seni | 2011/02/23 09:03 AM |
The next revolution | Nicolas Capens | 2011/02/24 06:11 AM |
The next revolution | Seni | 2011/02/24 08:45 PM |
IGP sampler count | Nicolas Capens | 2011/03/03 05:19 AM |
Latency and throughput optimized cores | Nicolas Capens | 2011/03/07 03:28 PM |
The real reason no IGP /CPU converge. | Jouni Osmala | 2011/03/07 11:34 PM |
Still converging | Nicolas Capens | 2011/03/13 03:08 PM |
Homogeneous CPU advantages | Nicolas Capens | 2011/03/08 12:12 AM |
Homogeneous CPU advantages | Seni | 2011/03/08 09:23 AM |
Homogeneous CPU advantages | David Kanter | 2011/03/08 11:16 AM |
Homogeneous CPU advantages | Brett | 2011/03/09 03:37 AM |
Homogeneous CPU advantages | Jouni Osmala | 2011/03/09 12:27 AM |
SW Rasterization | firsttimeposter | 2011/02/03 11:18 PM |
SW Rasterization | Nicolas Capens | 2011/02/04 04:48 AM |
SW Rasterization | Eric Bron | 2011/02/04 05:14 AM |
SW Rasterization | Nicolas Capens | 2011/02/04 08:36 AM |
SW Rasterization | Eric Bron | 2011/02/04 08:42 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/26 03:23 AM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/02/04 04:31 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/05 08:46 PM |
Sandy Bridge CPU article online | Gabriele Svelto | 2011/02/06 06:20 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/02/06 06:07 PM |
Sandy Bridge CPU article online | arch.comp | 2011/01/06 10:58 PM |
Sandy Bridge CPU article online | Seni | 2011/01/07 10:25 AM |
Sandy Bridge CPU article online | Michael S | 2011/01/05 04:28 AM |
Sandy Bridge CPU article online | Nicolas Capens | 2011/01/05 06:06 AM |
permuting vector elements (yet again) | hobold | 2011/01/05 05:15 PM |
permuting vector elements (yet again) | Nicolas Capens | 2011/01/06 06:11 AM |
Sandy Bridge CPU article online | Eric Bron | 2011/01/05 12:46 PM |
wow ...! | hobold | 2011/01/05 05:19 PM |
wow ...! | Nicolas Capens | 2011/01/05 06:11 PM |
wow ...! | Eric Bron | 2011/01/05 10:46 PM |
compress LUT | Eric Bron | 2011/01/05 11:05 PM |
wow ...! | Michael S | 2011/01/06 02:25 AM |
wow ...! | Nicolas Capens | 2011/01/06 06:26 AM |
wow ...! | Eric Bron | 2011/01/06 09:08 AM |
wow ...! | Nicolas Capens | 2011/01/07 07:19 AM |
wow ...! | Steve Underwood | 2011/01/07 10:53 PM |
saturation | hobold | 2011/01/08 10:25 AM |
saturation | Steve Underwood | 2011/01/08 12:38 PM |
saturation | Michael S | 2011/01/08 01:05 PM |
128 bit floats | Brett | 2011/01/08 01:39 PM |
128 bit floats | Michael S | 2011/01/08 02:10 PM |
128 bit floats | Anil Maliyekkel | 2011/01/08 03:46 PM |
128 bit floats | Kevin G | 2011/02/27 11:15 AM |
128 bit floats | hobold | 2011/02/27 04:42 PM |
128 bit floats | Ian Ollmann | 2011/02/28 04:56 PM |
OpenCL FP accuracy | hobold | 2011/03/01 06:45 AM |
OpenCL FP accuracy | anon | 2011/03/01 08:03 PM |
OpenCL FP accuracy | hobold | 2011/03/02 03:53 AM |
OpenCL FP accuracy | Eric Bron | 2011/03/02 07:10 AM |
pet project | hobold | 2011/03/02 09:22 AM |
pet project | Anon | 2011/03/02 09:10 PM |
pet project | hobold | 2011/03/03 04:57 AM |
pet project | Eric Bron | 2011/03/03 02:29 AM |
pet project | hobold | 2011/03/03 05:14 AM |
pet project | Eric Bron | 2011/03/03 03:10 PM |
pet project | hobold | 2011/03/03 04:04 PM |
OpenCL and AMD | Vincent Diepeveen | 2011/03/07 01:44 PM |
OpenCL and AMD | Eric Bron | 2011/03/08 02:05 AM |
OpenCL and AMD | Vincent Diepeveen | 2011/03/08 08:27 AM |
128 bit floats | Michael S | 2011/02/27 04:46 PM |
128 bit floats | Anil Maliyekkel | 2011/02/27 06:14 PM |
saturation | Steve Underwood | 2011/01/17 04:42 AM |
wow ...! | hobold | 2011/01/06 05:05 PM |
Ring | Moritz | 2011/01/20 10:51 PM |
Ring | Antti-Ville Tuunainen | 2011/01/21 12:25 PM |
Ring | Moritz | 2011/01/23 01:38 AM |
Ring | Michael S | 2011/01/23 04:04 AM |
So fast | Moritz | 2011/01/23 07:57 AM |
So fast | David Kanter | 2011/01/23 10:05 AM |
Sandy Bridge CPU (L1D cache) | Gordon Ward | 2011/09/09 02:47 AM |
Sandy Bridge CPU (L1D cache) | David Kanter | 2011/09/09 04:19 PM |
Sandy Bridge CPU (L1D cache) | EduardoS | 2011/09/09 08:53 PM |
Sandy Bridge CPU (L1D cache) | Paul A. Clayton | 2011/09/10 05:12 AM |
Sandy Bridge CPU (L1D cache) | Michael S | 2011/09/10 09:41 AM |
Sandy Bridge CPU (L1D cache) | EduardoS | 2011/09/10 11:17 AM |
Address Ports on Sandy Bridge Scheduler | Victor | 2011/10/16 06:40 AM |
Address Ports on Sandy Bridge Scheduler | EduardoS | 2011/10/16 07:45 PM |
Address Ports on Sandy Bridge Scheduler | Megol | 2011/10/17 09:20 AM |
Address Ports on Sandy Bridge Scheduler | Victor | 2011/10/18 05:34 PM |
Benefits of early scheduling | Paul A. Clayton | 2011/10/18 06:53 PM |
Benefits of early scheduling | Victor | 2011/10/19 05:58 PM |
Consistency and invalidation ordering | Paul A. Clayton | 2011/10/20 04:43 AM |
Address Ports on Sandy Bridge Scheduler | John Upcroft | 2011/10/21 04:16 PM |
Address Ports on Sandy Bridge Scheduler | David Kanter | 2011/10/22 10:49 AM |
Address Ports on Sandy Bridge Scheduler | John Upcroft | 2011/10/26 01:24 PM |
Store TLB look-up at commit? | Paul A. Clayton | 2011/10/26 08:30 PM |
Store TLB look-up at commit? | Richard Scott | 2011/10/26 09:40 PM |
Just a guess | Paul A. Clayton | 2011/10/27 01:54 PM |