By: RichardC (tich.delete@this.pobox.com), May 16, 2013 6:57 am
Room: Moderated Discussions
Brendan (btrotter.delete@this.gmail.com) on May 16, 2013 12:29 am wrote:
> Is it reasonable to expect competent developers to be able to handle that extra complexity when
> it's beneficial? I guess this depends on how you define "competent". I'd say "it's definitely
> reasonable" (it's not the 20th century anymore) but other people may have lower standards.
See this paper http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf
Key quote from the conclusion: "non-trivial multi-threaded programs are incomprehensible
to humans".
And this experience with an expert team using best practices:
"A part of the Ptolemy Project experiment was to see whether effective software engineering
practices could be developed for an academic research setting. We developed a process that included
a code maturity rating system (with four levels, red, yellow, green, and blue), design reviews, code
reviews, nightly builds, regression tests, and automated code coverage metrics [43]. The portion
of the kernel that ensured a consistent view of the program structure was written in early 2000,
design reviewed to yellow, and code reviewed to green. The reviewers included concurrency experts,
not just inexperienced graduate students (Christopher Hylands (now Brooks), Bart Kienhuis, John
Reekie, and myself were all reviewers). We wrote regression tests that achieved 100 percent code
coverage. The nightly build and regression tests ran on a two processor SMP machine, which
exhibited different thread behavior than the development machines, which all had a single processor.
The Ptolemy II system itself began to be widely used, and every use of the system exercised this
code. No problems were observed until the code deadlocked on April 26, 2004, four years later.
It is certainly true that our relatively rigorous software engineering practice identified and fixed
many concurrency bugs. But the fact that a problem as serious as a deadlock that locked up the
system could go undetected for four years despite this practice is alarming. How many more such
problems remain? How long do we need test before we can be sure to have discovered all such
problems? Regrettably, I have to conclude that testing may never reveal all the problems in nontrivial
multithreaded code."