By: Maynard Handley (name99.delete@this.name99.org), May 13, 2013 6:52 pm
Room: Moderated Discussions
Ricardo B (ricardo.b.delete@this.xxxxx.xx) on May 12, 2013 5:54 pm wrote:
> * browsers are moving to a process per tab model
This is irrelevant to performance. It only affects the issues being discussed here IF
(a) one has MULTIPLE windows open all of which SIMULTANEOUSLY
(b) require 100% of a CPU.
This does not match any current realistic usage scenario.
I don't want to be rude, but this is as idiotic as the claim that tablets will NEED quad-core because soon they will be displaying multiple windows from multiple simultaneously active apps.
People are confusing a number of different issues here.
(a) Is it "common" to make use of multiple threads? The short answer is, this has been tested across a range of common software and the answer is no. Ten years ago, for some definition of "common" software, a desktop could not use more than two threads (mostly a single-threaded app plus some GUI and interrupt offload).
Recently the same tests have bumped that number up to three threads because many apps now use a lot of threads (helped by new APIs that encourage this), but most of the threads are very lightweight and spend their time blocked, not active.
Big ticket apps that are NOT threaded in any useful fashion (i.e. threaded in the part that matters, not in some BS part that is only mentioned for scoring points) include web browsers (still) and PDF rendering. Games are also not nearly as threaded as the multicore advocates would have you believe.
(b) The issue, however, is NOT how to keep your CPU pegged at 100% 24/7. The issue is how to deliver the best computing experience. Leaving aside issues of cost, this boils down to making machines as snappy as possible. I don't care if my machine is idle 59 minutes out of every 60; what I care about is that when I require it to do something, it happens as fast as possible. Extra cores (or SMT) help with this only to a small extent. They do help when I want certain tasks to run faster (the usual video encode, the slightly less usual large Mathematica jobs, the occasional situation where I fire up three compute-intensive tasks at once) and, per my earlier point, that's still pretty good --- if they speed things up once a week, what does it matter if they're mostly idle?
(c) On current Intel HW, SMT is worth about .25 of a core, so a quad-core with SMT is (pretty reliably) worth 5 cores. If you think it's an outrage that this number is not 8 cores, whatever. 5 is bigger than 4, and it's well established that the area cost for SMT is quite a bit less than the area of an additional core.
To claim that there MUST be some slowdown from SMT "just because" is not helpful; neither is claiming that "obviously" for most users some small speedup (say 5% on single-threaded performance) would be more helpful than SMT. To second-guess Intel's engineers in the absence of real data is just foolish. More specifically:
What limited the performance of SMT in earlier Intel (and other) chips was the memory subsystem. To make SMT work well, Intel has given its modern cores truly astonishingly good memory systems. To put it differently, because of SMT you probably have a better memory system than you would otherwise --- and that probably speeds up your single-threaded performance far more than you would gain by recovering the 5% cost of whatever extra buffers and muxes SMT requires.
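The arithmetic in (c) can be written out as a back-of-envelope sketch. The 0.25 figure is just my rough estimate from above, the linear model is an assumption (real gains depend on the workload), and the function name `effective_cores` is made up for illustration:

```python
def effective_cores(physical_cores: int, smt: bool, smt_gain: float = 0.25) -> float:
    """Rough throughput of a chip, in units of one non-SMT core.

    smt_gain is the assumed extra throughput SMT adds per core
    (0.25 is the post's estimate, not a measured constant).
    """
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    if smt:
        return physical_cores * (1.0 + smt_gain)
    return float(physical_cores)


if __name__ == "__main__":
    # Compare the configurations mentioned in this thread.
    for cores, smt in [(2, False), (2, True), (4, False), (4, True)]:
        label = f"{cores} cores, {'SMT' if smt else 'no SMT'}"
        print(f"{label}: ~{effective_cores(cores, smt):.2f} core-equivalents")
```

Under these assumptions a quad-core with SMT comes out at 5 core-equivalents and a dual-core with SMT at 2.5, which is the whole point: not 8, but still more than 4.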
(d) It's also ridiculous to claim that the number of HW threads in PCs has leveled out. It may well level out SOON, but it hasn't yet. Ten years ago the common config was something like a P4 with SMT --- 2 HW threads but barely because the P4's SMT sucked so badly (like I said, memory system). Five years ago the common config was 2 cores, no SMT. The common config today is some mix of two cores with SMT and four cores with SMT.
I found two cores (say a Penryn-class machine) frequently maxed out. There is no way that was optimal. I find 4 IB cores with SMT pretty much NEVER maxed out. So, for current software, I'd say 4 IB cores (with or without SMT) is a great config. Anything less (say 2 IB cores with SMT) is of course acceptable --- but there are still enough real-world use cases where it will max out that it's worth going to 4 cores.
This is desktop of course. I see no reason to believe that better than dual-core makes sense on any phone or tablet in the next few years. Basically: it makes sense to have the extra computation boost available IF you occasionally hit situations that utilize it --- they don't have to be common, but they do have to exist. On the desktop these situations do exist; on mobile not so much.