By: Brendan (btrotter.delete@this.gmail.com), July 1, 2022 1:24 pm
Room: Moderated Discussions
Hi,
dmcq (dmcq.delete@this.fano.co.uk) on July 1, 2022 6:06 am wrote:
> Brendan (btrotter.delete@this.gmail.com) on June 30, 2022 3:52 pm wrote:
> > dmcq (dmcq.delete@this.fano.co.uk) on June 30, 2022 11:20 am wrote:
> > > Brendan (btrotter.delete@this.gmail.com) on June 30, 2022 11:09 am wrote:
> > > > Kester L (nobody.delete@this.nothing.com) on June 29, 2022 1:49 pm wrote:
> > > > > https://queue.acm.org/detail.cfm?id=3534854
> > > > >
> > > > > The linear address space as a concept is unsafe at any speed, and it badly needs mandatory CHERI
> > > > > seat belts. But even better would be to get rid of linear address spaces entirely and go back to
> > > > > the future, as successfully implemented in the Rational R1000 computer 30-plus years ago.
> > > > >
> > > > > Your thoughts on this article? I was under the impression that a lot of the 80s attempts
> > > > > at capability machines (or really, anything that wasn't trying to be a glorified PDP-11)
> > > > > floundered because of performance and cost issues (i.e. the Intel i432).
> > > >
> > > > My thoughts on the article itself are that it has a strong "marketing" smell. Let's
> > > > ignore the article's words and focus on the "safety vs. performance" question.
> > > >
> > > > You can split it into 2 very different problems:
> > > >
> > > > a) Reducing the amount of developer effort required to create correct and efficient software
> > > >
> > > > b) Maximizing the efficiency (performance per watt per $) for normal
> > > > users (not software developers) executing correct software
> > > >
> > > > For the former, the latter is mostly irrelevant. E.g. if you're running a specially instrumented
> > > > "debug version" (with lots of extra checks enabled) in a special environment (e.g. under valgrind)
> > > > and it's 200 times slower, most developers simply won't have a reason to care.
> > > >
> > > > For the latter, the former is mostly irrelevant. The only thing you actually care about is protecting
> > > > everything (kernel, other processes) from untrusted (potentially deliberately malicious) code;
> > > > and for this you only care about the boundaries between "inside the process" and "outside the
> > > > process" (and you do not care about anything inside the process trashing anything else in the
> > > > process). For modern systems this is invariably done with paravirtualization - a process runs
> > > > inside its own virtual machine (with virtual memory, virtual CPUs/threads, highly abstracted virtual
> > > > IO like "open()" and "write()" instead of anything resembling physical devices).
> > > >
> > > > You can't/shouldn't combine these problems, because as soon as you do you're forcing a compromise (worse
> > > > safety for the former and/or worse efficiency for the latter). Anyone who doesn't seem to understand this
> > > > (e.g. CHERI designers) is trying to solve a problem without understanding the nature of the problem.
> > > >
> > > > However, there is some overlap. Good software developers try to create correct and efficient software
> > > > (and not just correct software). Anything intended to reduce the effort needed to create correct and
> > > > efficient software should/must be able to handle all the tricks that could be used to improve the
> > > > efficiency of correct code. This includes using raw assembly language for highly specialized pieces;
> > > > self-modifying code and run-time code generation; improving locality by using a single memory
> > > > allocation for several objects of different types; improving SIMD performance by using "structure
> > > > of arrays" instead of "array of structures/objects"; and recognizing that sometimes the same object
> > > > has different types in different places (a pointer/reference to a structure/object that is used as
> > > > a raw integer when calculating a hash, a "big integer" in one place that is an array of unsigned
> > > > integers in another place, multiple inheritance, etc).
> > >
> > > I think your idea of 'normal users' needs a bit of refining.
> >
> > For "for normal users (not software developers)" I mean the 6+ billion people on the planet that
> > use computers without ever writing a single line of code themselves, without knowing how a computer
> > works, and without even knowing they're using a computer in some cases (e.g. driving a car).
> >
> > > I'm not sure there's many people left
> > > who a company would pay after they're allowed to do the sort of tricks you're talking about. Do you
> > > mean nerds who are just developing a program for themselves and wasting their lives to no profit?
> >
> > All of those tricks happen in every computer I ever saw. The Linux kernel alone does most
> > of them (not sure if eBPF has advanced to the "run-time generated code" stage yet).
>
> 'Good developers try'. Well, I can't argue with that. But do they succeed? And the Linux kernel has
> done a good job removing the masses of asm macros that were originally put in. The things you talk
> about are things that are being done by improving the language, so making the facilities available
> in a less buggy way. Someone going around changing an array of structs to a number of different
> arrays by hand is someone who is desperate.
Can you think of a single compiler or language that supports automatically converting "array of structures" into "structure of arrays" (and auto-detecting when it would be beneficial to do so)?
The fact is that how much a compiler can optimize is very limited (extremely good at trivial micro-optimizations, but extremely bad at fixing poor design decisions).
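
For concreteness, here's a minimal sketch of the transformation in question (illustrative only; the struct and function names are invented for this example):

#include <stddef.h>

/* "Array of structures": the natural way to write it, but a loop that
   only touches x still drags the y and z bytes through the cache. */
struct particle { float x, y, z; };

void scale_x_aos(struct particle *p, size_t n, float k)
{
    for (size_t i = 0; i < n; i++)
        p[i].x *= k;
}

/* "Structure of arrays": the same data re-laid-out by hand so the
   x values are contiguous, cache-friendly, and trivially vectorizable. */
struct particles_soa { float *x, *y, *z; size_t n; };

void scale_x_soa(struct particles_soa *p, float k)
{
    for (size_t i = 0; i < p->n; i++)
        p->x[i] *= k;
}

A compiler can't perform this rewrite on its own because the AoS layout is observable through pointers, the ABI, and code in other translation units; changing it is a whole-program design decision, not a local optimization.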
> They'll spend a lot of time on it and generate lots of
> bugs and yes it would be much better if there was software support in the language they were using
> instead. The only people who'd normally do that by hand are people with time to waste and are not
> concerned about bugs. Not good developers. You're mixing up desperation and desire.
No, you've fabricated a "premature optimization is always evil with no exceptions at all ever" straw-man and ignored what I'm saying.
Whether it's done by hand or done by a compiler (or done by a shared library developer, or done by a language's run-time, or ...) is irrelevant; and whether it's only done for a tiny niche (a performance-critical 0.001% of software) or it's ubiquitous is irrelevant. In all cases you still want some assurance that it's correct before release (and don't care much about efficiency during pre-release testing), and still want efficiency after it's released (and don't care much about assurances it's correct after you've already been assured). Any compromise between pre-release testing and post-release efficiency (e.g. CHERI) is undeniably inferior (for pre-release testing, or for post-release efficiency, or for both).
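
To make that split concrete: plain old C assertions already work exactly this way (a trivial sketch, assuming nothing beyond the standard library):

#include <assert.h>
#include <stddef.h>

/* The bounds check exists only in the pre-release "assurance" build;
   compiling the release build with -DNDEBUG removes it entirely, so
   shipped code pays nothing for it. */
float get_sample(const float *buf, size_t len, size_t i)
{
    assert(i < len);    /* checked pre-release, absent post-release */
    return buf[i];
}

Build with full checks (or run under valgrind) while testing, then ship with -DNDEBUG and optimizations enabled; neither configuration compromises the other, which is exactly the separation a "checks everywhere, always" approach gives up.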
- Brendan