Notes
2015-03-05
I recently hacked together a small yacc for C. Initially, I was programming
as if I lived in the future and performance was not a problem.
However, with this LALR state generation you can't be too
silly, as I soon discovered. Processing this C89 grammar
took around 8 minutes, that was not really acceptable as bison could deal with
it in a fraction of second! After eliminating a few massive
performance hogs (sorted array to represent a set, loop invariant
in a loop), I was processing the above C grammar in 5 seconds.
Much better! Then comes my point, at that time the profiler was saying
that my code was fairly fast and accounted for less than a second
of runtime. The issue turned out to be in my use of malloc/free
in the libc. I used the following pattern:
- Allocate initialize some memory to store a temporary,
- if condition C is true, store the pointer
- otherwise, free the memory.
The case 3 happened much more often than the case 2 so I thought
that glibc's allocation code would simply give me the same
block all the time. Switching to the scheme below got
me a 5x speedup, so I guess I was wrong!
- Initialize a statically allocated object A,
- if condition C is true, allocate some memory and copy A to it,
then store the pointer
- otherwise, do nothing.
The conclusion I can draw is
that objects likely to be short-lived should really not be
allocated dynamically. This makes me wonder how, for instance,
C++'s RAII and Rust's automatic liveness mechanisms handle this
kind of performance problems. The speed of my yacc is now similar
to bison's after only 10% of the code was changed. This proves
the point that, instead of designing for speed, we should design
for simplicity and then, improve.