By: Brendan (btrotter.delete@this.gmail.com), June 30, 2022 11:09 am
Room: Moderated Discussions
Hi,
Kester L (nobody.delete@this.nothing.com) on June 29, 2022 1:49 pm wrote:
> https://queue.acm.org/detail.cfm?id=3534854
>
> The linear address space as a concept is unsafe at any speed, and it badly needs mandatory CHERI
> seat belts. But even better would be to get rid of linear address spaces entirely and go back to
> the future, as successfully implemented in the Rational R1000 computer 30-plus years ago.
>
> Your thoughts on this article? I was under the impression that a lot of the 80s attempts
> at capability machines (or really, anything that wasn't trying to be a glorified PDP-11)
> floundered because of performance and cost issues (i.e. the Intel i432).
My thoughts on the article itself: it has a strong "marketing" smell. Let's ignore the article's words and focus on the "safety vs. performance" question.
You can split it into 2 very different problems:
a) Reducing the amount of developer effort required to create correct and efficient software
b) Maximizing the efficiency (performance per watt per $) for normal users (not software developers) executing correct software
For the former, the latter is mostly irrelevant. E.g. if you're running a specially instrumented "debug version" (with lots of extra checks enabled) in a special environment (e.g. under valgrind) and it's 200 times slower, most developers simply won't have a reason to care.
For the latter, the former is mostly irrelevant. The only thing you actually care about is protecting everything (kernel, other processes) from untrusted (potentially deliberately malicious) code; and for this you only care about the boundaries between "inside the process" and "outside the process" (and you do not care about anything inside the process trashing anything else in the process). For modern systems this is invariably done with paravirtualization - a process runs inside its own virtual machine (with virtual memory, virtual CPUs/threads, highly abstracted virtual IO like "open()" and "write()" instead of anything resembling physical devices).
You can't/shouldn't combine these problems, because as soon as you do you're forcing a compromise (worse safety for the former and/or worse efficiency for the latter). Anyone who doesn't understand this (e.g. CHERI's designers) is trying to solve a problem without understanding its nature.
However, there is some overlap. Good software developers try to create software that is both correct and efficient (not just correct). Anything intended to reduce the effort needed to create correct and efficient software must therefore be able to handle all the tricks that can be used to improve the efficiency of correct code, including:
a) using raw assembly language for highly specialized pieces
b) self-modifying code and run-time code generation
c) improving locality by using a single memory allocation for several objects of different types
d) improving SIMD performance by using "structure of arrays" instead of "array of structures/objects"
e) recognizing that sometimes the same object has different types in different places (a pointer/reference to a structure/object that is used as a raw integer when calculating a hash, a "big integer" in one place that is an array of unsigned integers in another place, multiple inheritance, etc.)
- Brendan