By: Linus Torvalds (torvalds.delete@this.osdl.org), October 31, 2006 2:02 pm
Room: Moderated Discussions
Rob Thorpe (rthorpe@realworldtech.com) on 10/31/06 wrote:
>
>This is a demonstration of what I'm saying: If you have
>a feature X that can be done in software, unless speed is
>of the essence, or it's fantastically weird and low-level
>it should be done in software.
Hey, I'll happily agree with that.
However, I claim that anything that actually depends on
whether a particular set of cache-lines is in the L1 cache
or not definitely falls under "fantastically weird and
low-level". Similarly for a "does this individual 4-byte
load cross a page boundary or not".
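To make concrete how low-level that per-access question is, here is a hypothetical sketch (my illustration, not code from the post) of the check software would have to bolt onto every single 4-byte load just to ask it, assuming 4 KiB pages:

```c
#include <stdint.h>

/* Hypothetical helper: does a 4-byte load starting at addr cross a
 * 4 KiB page boundary?  The load touches bytes addr..addr+3, so it
 * crosses iff its offset within the page exceeds 4092 (0xFFC). */
static int crosses_page(uintptr_t addr)
{
    return (addr & 0xFFFu) > 0xFFCu;
}
```

Even this one-liner means an extra mask, compare, and branch in front of the load itself, which is exactly the run-time querying overhead being objected to.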
There are certain things hardware is fundamentally
better at, and they have to do with instruction-
level scheduling and decision making. Those are things
that software simply cannot do well, because software
doesn't know the dynamic situation - and software will
make the problem worse, because querying the situation
at run-time just adds new issues.
So from a software perspective:
- querying alignment on a single access basis is totally
insane.
- querying "is this in the cache" on a single access
basis is similarly insane.
- statically just saying one or the other is unworkable,
because when you're wrong, you're taking a huge hit
(whether it's a trap or just the fact that you used
four times as many instructions to do an unaligned load
even though it was actually aligned 99.9% of the time).
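The cost asymmetry in that last point can be sketched in C (my illustration, assuming a little-endian machine for the byte-assembly version): on hardware that handles misalignment, a potentially-unaligned load is essentially one instruction, while the conservative software fallback pays byte-by-byte reassembly on every access, aligned or not.

```c
#include <stdint.h>
#include <string.h>

/* What hardware-handled misalignment looks like from C: a fixed-size
 * memcpy typically compiles down to a single (possibly unaligned)
 * load on machines like x86 that tolerate misaligned accesses. */
static uint32_t load_hw(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* The static software fallback: four byte loads plus shifts and ors
 * (little-endian byte order), paid even when p was aligned all along. */
static uint32_t load_sw(const uint8_t *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

If the pointer is in fact aligned 99.9% of the time, the `load_sw` path wastes its extra instructions on virtually every call — that is the "huge hit" of choosing the conservative path statically.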
So hardware should handle things that are counted in
the "tens of instructions" on its own. We know that
it can do that. It's not a "complexity issue", and it
hasn't been a complexity issue for decades - these days it
is more about just how well and aggressively you want
to do it (e.g. how much effort you want to spend on
doing it well).
Software will happily do the rest, and take care of the
truly "unbounded" complexity issues. But there is absolutely
no point in pushing into software things that the hardware
does know how to do, and knows better than software
how to do.
That's like saying "you have to handle tons of problems
anyway, so take the problems I could handle better,
just because I'd rather not do it". If you don't see what's
wrong with that kind of argument, I don't know what I can
say.
Software has quite enough on its plate already, and people
are trying to worry about the high-level problems.
The last thing sw people need is a whiny hw person who
says "can't you handle the problems I already know how to
solve better than you, too? You're so good at problems..."
In other words, hardware can do the "micro-optimizations"
that software simply doesn't have the time for. That's
what Core 2 largely does, and daamn, Core 2 is doing
fine. It tells the sw to not sweat the small stuff,
because it can handle it.
That's what it boils down to. Software should concentrate
on what software is good at, and hardware should do what
hardware is good at. It's not an "either or" situation.
It's a symbiosis - a combination of strengths. The things
that hardware does well are often things that software is
not so good at, and vice versa.
Linus