JasonB on 8/21/09 wrote:

>As I keep repeating, my observation was not related to where you ended up but rather
>the path you took to get there. I have no problem with you eventually deciding to
>use a different algorithm after exhausting the options I outlined; my observations
>related to a reluctance to even try standard library routines before reinventing
>the wheel,

It depends what you mean by "standard library routine".
If you're already using STL, then sure, using an STL
routine is natural and easy and cheap. If you're *not*
using STL, then it's different. And STL is really a
different language than non-STL C++. And furthermore,
it's a language that really didn't work well (poor
standards compliance, poor portability, slow compile,
poor debugging tools) until the last 4 years or so.
Which is why a lot of projects don't use it and a lot of
people don't know it, me included.

>>Investigate how ? If you read 500 lines an hour, then
>>merely reading a 1M-line codebase would take 2000 hours,
>>or 50 weeks of 40 hours. It can't be done.
>Hmm... Our codebase is about half the size of yours, and

Actually, ours is several millions now :-( I'm just using
the 1M figure as an example ...

>me long. Heck, I did exactly that as soon as I got a dual-core and I certainly didn't
>do it by reading through the source code at 500 lines an hour.

Presumably you did it on the basis of profiling or
other timing data. Which would be the way I'd do it as
well. Not by *randomly* "revisiting assumptions".

>Of course, I'm one of only three developers so I may have far more intimate knowledge
>of the code than someone in a much larger team, especially since performance tuning
>has largely been up to me for the past decade.

Right. Working with many developers spread across sites
on 3 continents, with each site having its own distinct
culture and programming style, adds to the fun :-(

>>Well, there's something lacking in your profiling tools
>>if they don't *show* those routines taking significant
>>time, and yet removing them makes a significant difference.
>I said "measurable", not "significant". It's faster with
>the simpler code so I'm taking it out.

Yeah, the question is how you decided that it was worth
taking it out: was that by "revisiting assumptions", or
was it by seeing it in the profiling data and deciding it
was worth changing ?

>Which gets back to my original observation -- it's odd that you are so against
>using a standard routine rather than rolling-your-own when it comes to sorting,
>but you would rather rely on the unspecified behaviour of the standard memory allocation routine than roll-your-own.

Because I want the fastest performance possible, and the
best memory usage possible. Progress in either of those
dimensions helps us to sell our product against our
competition. And the "standard" stuff isn't good enough
in either dimension.

Actually we do indeed roll our own memory allocator, using
an algorithm which works great for small blocks, but then
it hands off big blocks to the normal malloc(), because
big-block allocations aren't performance-critical in our
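
A minimal sketch of that general shape (not the actual allocator described above, whose algorithm isn't given here): small requests come from a fixed pool with a free list, and anything bigger is handed to the normal malloc(). The names my_alloc, my_free, SMALL_MAX and POOL_BLOCKS are made up for illustration, and it is not thread-safe.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SMALL_MAX   64    /* hypothetical cutoff: larger requests go to malloc() */
#define POOL_BLOCKS 1024  /* hypothetical pool capacity */

/* A free block stores the link to the next free block in its own storage. */
typedef union block {
    union block  *next;
    max_align_t   align;             /* keep blocks suitably aligned (C11) */
    unsigned char bytes[SMALL_MAX];
} block_t;

static block_t  pool[POOL_BLOCKS];
static block_t *free_list   = NULL;  /* blocks that were freed and can be reused */
static size_t   next_unused = 0;     /* index of the first never-used block */

/* True if p points into the small-block pool. */
static int from_pool(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)pool && a < (uintptr_t)(pool + POOL_BLOCKS);
}

void *my_alloc(size_t n)
{
    if (n <= SMALL_MAX) {
        if (free_list != NULL) {           /* reuse a freed block, O(1) */
            block_t *b = free_list;
            free_list = b->next;
            return b;
        }
        if (next_unused < POOL_BLOCKS)     /* carve a fresh block, O(1) */
            return &pool[next_unused++];
        /* pool exhausted: fall through to malloc() */
    }
    return malloc(n);                      /* big blocks: not performance-critical */
}

void my_free(void *p)
{
    if (p == NULL)
        return;
    if (from_pool(p)) {                    /* push back onto the free list */
        block_t *b = p;
        b->next = free_list;
        free_list = b;
    } else {
        free(p);                           /* came from malloc() */
    }
}
```

The point of the split is that the small-block path never touches malloc() at all, so its cost and memory overhead are under your own control.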

>>Sure, so you can write a program which checks the return
>>value and thus write a program which is conformant ...
>>but useless! Because it craps out with an error message
>>as soon as it tries to allocate anything. Did that help ?
>What, precisely, do you think the alternative is?

I'm just pointing out that "conformant" isn't a very
helpful concept.

>What are you suggesting? That you can write a program that can survive having the
>plug knocked out of the wall, or that because no program could survive that there's
>no point writing code that can handle all of the myriad other problems that could
>occur that could be handled properly?

I'm saying anyone who writes a program using malloc()
is relying on behavior that isn't very well defined.
And everyone is vulnerable to having their program run
into trouble if malloc() changes under their feet.
I was just sailing a little closer to the edge than most.

>I certainly care about what's useful. It isn't a choice between writing "Code that
>handles errors gracefully" and "Code that is useful".

>4. If the return value is not NULL then it is disjoint from all other objects.
>Then you will not be relying on undefined behaviour at all

Indeed, and my simple
  void *malloc(size_t n) { return NULL; }
shows how malloc can obey the defined behavior and yet be
completely useless. So if you're using it, you're relying
on some behavior which is *not* defined - the fact that it
actually *does* return some non-NULL blocks under some
circumstances (circumstances which are never precisely
defined by any implementation ...)

>So what -- you don't bother checking to see whether you were able to allocate tens
>of GB of data because if the user's screwed anyway, you may as well liven it up
>by not even attempting to deal with the problem?

Of course we check the return code. But there's no
practical way to recover if it craps out, you just print
a friendly message and stop.
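
To make both halves of that concrete, here is a rough sketch: an allocator that satisfies the quoted rules while never succeeding, plus the "check, print a message, stop" pattern. The names null_alloc, alloc_or_report and xalloc are invented for this example, and the allocator is passed in as a function pointer only so the useless one can be swapped in.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);

/* Never returns an overlapping block, so it breaks none of the rules,
 * yet no program can do anything useful with it. */
void *null_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/* Calls the allocator and reports a failure to 'log'.
 * Returns whatever the allocator returned. */
void *alloc_or_report(alloc_fn alloc, size_t n, FILE *log)
{
    void *p = alloc(n);
    if (p == NULL && n != 0)
        fprintf(log, "Out of memory: could not allocate %zu bytes.\n", n);
    return p;
}

/* Checks the return value; on failure prints the message and stops,
 * since there is no practical way to carry on. */
void *xalloc(alloc_fn alloc, size_t n)
{
    void *p = alloc_or_report(alloc, n, stderr);
    if (p == NULL && n != 0)
        exit(EXIT_FAILURE);
    return p;
}
```

With the real malloc, xalloc(malloc, n) behaves like malloc plus the check. With null_alloc, the same perfectly well-behaved caller stops at its first allocation.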

>"The problem with that is the overhead of copying data into different processes
>rules out a whole class of worthwhile operations that could be profitably threaded."
>You responded "Maybe it does, maybe it doesn't.". Well, if you agree that using
>separate processes is slower than using threads, and if you agree with the proposition
>that there's no point parallelising the algorithm if it doesn't deliver a performance
>improvement, we can only conclude that not using threads therefore rules out those
>operations that would be faster if parallelised using threads but not faster if parallelised using processes.

What's not clear is whether that class is big enough to
be interesting. Can't tell until I try it.

>>Doesn't sound particularly hard: there just needs to be
>>a local task queue manager which then passes tasks
>>(and their associated data) off to the worker processes.
>Now who's trying to second-guess decisions without being aware of the context? :-)

Well, multi-process task farming is well established as
a parallel-processing paradigm. Heck, it's how we run
our compiles and links in our build process. If you're
saying it doesn't work on a fine enough granularity for
your app, that's one thing: if you're saying it doesn't
work because everyone has to share the task queue data
structure, that's just plain wrong.
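
A bare-bones sketch of what such a farm could look like on a POSIX system, assuming fork() and pipes: only the manager owns the queue, and workers see nothing but the tasks sent down their own pipe. The names farm, do_work, task_t and result_t are hypothetical, do_work is a stand-in for real work, and error handling is simplified (no cleanup of already-started workers on failure, no EINTR or SIGPIPE handling).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16

typedef struct { int index; long value; } task_t;                /* manager -> worker */
typedef struct { int worker; int index; long value; } result_t;  /* worker -> manager */

/* Hypothetical unit of work. */
static long do_work(long x) { return x * x; }

static void worker_loop(int id, int task_fd, int result_fd)
{
    task_t t;
    while (read(task_fd, &t, sizeof t) == (ssize_t)sizeof t) {
        result_t r = { id, t.index, do_work(t.value) };
        /* Records are far smaller than PIPE_BUF, so each write is atomic
         * and results from different workers never interleave. */
        if (write(result_fd, &r, sizeof r) != (ssize_t)sizeof r)
            _exit(1);
    }
    _exit(0);                        /* EOF on the task pipe: no more work */
}

/* Sends one task to a worker; returns 0 on success, -1 on failure. */
static int send_task(int fd, int index, long value)
{
    task_t t = { index, value };
    return write(fd, &t, sizeof t) == (ssize_t)sizeof t ? 0 : -1;
}

/* Computes outputs[i] = do_work(inputs[i]) using nworkers processes.
 * Returns 0 on success, -1 on failure. */
int farm(const long *inputs, long *outputs, int ntasks, int nworkers)
{
    int   task_fd[MAX_WORKERS], result_pipe[2];
    pid_t pids[MAX_WORKERS];
    int   next = 0, done = 0, status = 0;

    if (nworkers < 1 || nworkers > MAX_WORKERS || pipe(result_pipe) != 0)
        return -1;

    for (int w = 0; w < nworkers; w++) {
        int p[2];
        if (pipe(p) != 0 || (pids[w] = fork()) < 0)
            return -1;
        if (pids[w] == 0) {          /* child: keep only its own two pipe ends */
            close(p[1]);
            close(result_pipe[0]);
            for (int k = 0; k < w; k++)
                close(task_fd[k]);
            worker_loop(w, p[0], result_pipe[1]);
        }
        close(p[0]);
        task_fd[w] = p[1];
    }
    close(result_pipe[1]);

    /* Prime every worker with one task from the queue. */
    for (int w = 0; w < nworkers && next < ntasks; w++, next++)
        if (send_task(task_fd[w], next, inputs[next]) != 0)
            status = -1;

    /* Whenever a result comes back, give that worker the next queued task. */
    while (status == 0 && done < ntasks) {
        result_t r;
        if (read(result_pipe[0], &r, sizeof r) != (ssize_t)sizeof r) {
            status = -1;
            break;
        }
        outputs[r.index] = r.value;
        done++;
        if (next < ntasks) {
            status = send_task(task_fd[r.worker], next, inputs[next]);
            next++;
        }
    }

    for (int w = 0; w < nworkers; w++)
        close(task_fd[w]);           /* EOF tells each worker to exit */
    close(result_pipe[0]);
    for (int w = 0; w < nworkers; w++)
        waitpid(pids[w], NULL, 0);
    return status;
}
```

Whether this pays off depends entirely on how the cost of shipping each task's data through the pipe compares with the work done on it, which is the granularity question.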

>Yes -- and the time required is fine when you're loading or saving a project, but
>again, it rules out a whole class of worthwhile operations that could be profitably threaded if this is the only option.

Maybe I know how to do it faster than you do it :-)
