Re: Are frozen files really that hot?
From: Daniel Goldman
Subject: Re: Are frozen files really that hot?
Date: Mon, 07 Jul 2014 17:13:59 -0700
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0
Thanks for the thoughtful response.
On 7/3/2014 4:24 PM, Eric Blake wrote:
Frozen file support was added before I ever started hacking on m4; I'm
not sure what benchmarks were used at the time (other than the autoconf
case, since autoconf relies on them), and I have not personally tried to
benchmark it. It doesn't necessarily mean no benchmarks exist, just
that I haven't found any; conversely, I haven't had any reason to worry
about it.
The first changelog entry related to frozen files is from 1994, and
frozen files were apparently fully implemented at that point. That's a
long history! It is still possible frozen files were a failed
experiment, in the sense that the cost is greater than the benefit.
While I agree with your notion that there is little evidence of benefit,
I disagree with your claim of a significant burden. The code is there,
it is covered by the testsuite, and we haven't had to patch it in
several years (thus no one is reporting bugs against it). I do not
consider that to be a maintenance burden, but evidence of something that
does its job well, even if the job is not useful to many.
As to the complexity claim, the code is fairly well segregated (see
src/freeze.c); the additional code base needed for frozen files does not
intrude into the speed or memory taken by the normal code that works
without frozen files (pretty much one if() statement at shutdown on
whether to dump to a file, to call into freeze.c; then on load time,
freeze.c does its thing and sets up the internal hash tables of known
macros, then returns control to the normal input engine). I don't see
the code base being slowed down, because we don't have to maintain any
extra state just because something is frozen.
I thought there would be a significant maintenance burden, but you would
know best. Yes, freeze.c is small, but I imagined there must be more,
perhaps a bunch of stuff in some header. I don't understand the big
picture of how the m4 source code is organized (I am going to try to
figure it out; I am not asking for an explanation and don't want to
waste your time).
Bottom line: if you don't consider it a significant maintenance burden,
then no reason to change. I'm pleading ignorance of the source code, so
I can't have it both ways...
It may have been smaller than you liked, but it was definitely non-zero,
and not in the noise. 30% may not be much, but it's better than a LOT
of premature optimizations I've seen in my days that have a difference
of no more than 1%. I wasn't trying to be flippant, but actually glad
that you now have a benchmark for your use case, which shows an actual
gain (proof that the code is not complete dead weight, even if it didn't
do as much as you wanted).
Yes, 30% is non-zero. To me, that is not enough to justify using frozen
files. But I would grant that it's a matter of personal preference.
There is no hard rule for how much speed-up justifies some change.
1/2 second vs 1 second is a lot different from 50 seconds vs 100
seconds. Perhaps m4 was a bottleneck decades ago, and with the huge
speedup of hardware (and the improvements you made to those macros), it
is much less so. In my usage, m4 is not a bottleneck so far.
If there were going to be another benchmark, I think a synthetic one
would be better, because it would be replicable. It would not be that
hard to build, given agreement about which macros to benchmark. A shell
script would automatically create the m4 definition and input files, and
then run m4 either with or without a frozen file. I thought of carrying
this out, but (1) I did not feel comfortable specifying which macros to
benchmark, and (2) my real-world benchmark was of more practical use to
me.
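A synthetic benchmark of the kind described above could be sketched
roughly as follows. This is only an illustration, not something that was
run for this thread: the macro names (bench_N), the default macro count,
and the file names are all made up, and it assumes GNU m4 is on the PATH
with its -F (freeze state) and -R (reload state) options. It is written
in Python rather than as a shell script.

```python
import shutil
import subprocess
import tempfile
import time
from pathlib import Path


def make_definitions(count):
    """Return m4 source defining `count` trivial macros (hypothetical names)."""
    lines = ["define(`bench_%d', `value_%d')dnl" % (i, i) for i in range(count)]
    return "\n".join(lines) + "\n"


def make_input(count):
    """Return m4 input that expands each generated macro once."""
    return "\n".join("bench_%d" % i for i in range(count)) + "\n"


def best_time(cmd, repeats=5):
    """Run cmd several times and return the fastest wall-clock time in seconds."""
    best = None
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        elapsed = time.perf_counter() - start
        best = elapsed if best is None else min(best, elapsed)
    return best


def run_benchmark(count=50000, m4="m4"):
    """Return (plain_seconds, frozen_seconds), or None if m4 is not installed."""
    if shutil.which(m4) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        defs = Path(tmp) / "defs.m4"
        text = Path(tmp) / "input.m4"
        frozen = Path(tmp) / "defs.m4f"
        defs.write_text(make_definitions(count))
        text.write_text(make_input(count))
        # Freeze the definitions once (assumes GNU m4's -F option).
        subprocess.run([m4, "-F", str(frozen), str(defs)],
                       check=True, stdout=subprocess.DEVNULL)
        # Case 1: parse the definitions from source on every run.
        plain = best_time([m4, str(defs), str(text)])
        # Case 2: reload the frozen state instead (assumes GNU m4's -R option).
        reloaded = best_time([m4, "-R", str(frozen), str(text)])
    return plain, reloaded


if __name__ == "__main__":
    print(run_benchmark())
```

Which macros to generate is exactly the open question raised above;
trivial one-line definitions like these probably understate any benefit
for macro sets that are expensive to define.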
Maybe the historical reasons were bad reasons. Maybe they don't apply
today. My guess is there were better ways to deal with those O(n^2) (and
even O(n^3)) macro definitions. You say they are gone now, so apparently
someone found a better way.
I was one of the programmers that spent a lot of time on autoconf trying
to eradicate stupid O(n^2) algorithms and replace them with faster
iterations, and with some definite success (the time it took to run
autoconf on a complex program such as coreutils was cut in half.
Admittedly, the time to run autoconf on one developer's machine is in
the noise compared to the time spent on running configure on all the
users' machines in the collective scheme of things, but faster developer
turnaround time can get patches to the users faster, so every little bit
helps). However, while I know autoconf runs faster now than it did in
2.59 days, I don't know how much of that speed is due to improvements in
m4 (such as using unlocked io), in frozen file handling, or in
improvements to management of configure.ac constructs (the part of the
processing done after frozen files were loaded) - only that I was
working on speedups on all three fronts at the same time several years ago.
Sounds complex and tedious, a lot of hard work.
I'm sure someone was trying to do their best
way back when. But it's possible they messed up, that frozen files were
a failed experiment. Programmers make bad design decisions all the time,
and they can persist for many years. It's possible that happened here.
They are not a failed experiment, because autoconf still uses them. You
don't have to use it, but that doesn't mean it failed. And back-compat
demands that we can't rip it out. For that matter, I worry that ripping
it out might have more negative consequences than positive.
By "failed experiment", I meant that maybe the speedup was not enough to
really justify the work, or that maybe there was a better way to get the
same result. But you are right: it works and is used, so in that sense
it has not failed. Yes, "ripping it out" would have negative
consequences. I just imagined it had a significant maintenance downside;
I did not know.
What are the autoconf "quadratic algorithms" you are referring to? Are
they still around? If so, maybe there is a better approach. I would
suggest that if there is a composite macro that is more or less general,
widely used, and computation intensive, that would be a good candidate
to consider using a builtin, which could potentially be much faster,
much better, and much easier to use. But that would require an openness
to adding builtin macros.
https://www.gnu.org/software/m4/manual/m4.html#Foreach documents a
foreach macro, which (modulo `' vs. [] quoting) was originally lifted
from autoconf 2.59. I'm not sure if autoconf actually used foreach when
defining other macros in the files it eventually froze, or if it was
more a matter of using foreach in the definition of macros that then
caused quadratic expansion time while processing the user's
configure.ac. And even if the algorithm was quadratic, if your list is
small enough you'll never notice the poor scaling.
Meanwhile,
https://www.gnu.org/software/m4/manual/m4.html#Improved-foreach
documents the improved foreach definition that is no longer quadratic,
in part because of tricks I employed in getting rid of the quadratic
recursion in newer autoconf in the 2.63 days.
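To make the scaling difference concrete, here is a toy cost model (in
Python, not m4, and not a measurement of m4 or autoconf). It assumes,
as a simplification, that a recursive foreach rescans the whole
remaining list at every step, while a non-quadratic version touches each
element once. The function names are hypothetical.

```python
def recursive_scan_cost(n):
    """Elements scanned if every step rescans the entire remaining list."""
    cost = 0
    remaining = n
    while remaining > 0:
        cost += remaining  # the whole remaining list is read again
        remaining -= 1     # one element is consumed per step
    return cost


def single_pass_cost(n):
    """Elements scanned if each element is visited exactly once."""
    return n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, recursive_scan_cost(n), single_pass_cost(n))
```

Under this model a 10-element list costs 55 scans against 10, which
nobody would notice, while a 1000-element list costs 500500 against
1000. That matches the remark above that a small enough list hides the
poor scaling.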
Another thing that I know was computation intensive was use of regex;
autoconf 2.59 definitely had some places where it used a regular
expression in order to define a new macro based on substitution of
patterns of an existing macro, and did so in an inefficient manner. In
newer autoconf, I made it a point to use fewer regex and to defer
expressions until they were needed; and I also improved m4 to cache
frequently used expressions (as compiling a regex was a noticeable
hotspot in performance testing). This is another case where frozen
files matter (loading a frozen file does not have to compile the regular
expression used to define a macro) but where the gap may be smaller (the
code uses fewer regex to begin with).
*** Thanks for the details. I have a question:
Was there ever a suggestion that foreach might be implemented as a
builtin? When I mentioned "a composite macro that is more or less
general, widely used, and computation intensive", I had something like
foreach in mind. Is it possible that a foreach builtin macro could
potentially be faster, better, and easier to use?
The part about using regex to generate new macros has my head spinning.
I never imagined doing anything like that. I'm not saying it's bad. It's
just very surprising. Not expecting a comment back.
You totally make my point when you say "I'm not sure if you will see
better or worse numbers from autoconf". If it's not faster, there is NO
point to use frozen files. Perhaps without intending to, you make my
point that there is a possibility frozen files are not so hot.
But until someone actually runs a benchmark to prove one way or the
other, the status quo seems to be just fine.
Yes, the status quo is OK. My main point was to report what I observed,
and to find out if there are any other existing benchmarks.
BTW, I'm sure you would not see "worse numbers". I am NOT suggesting
that frozen files slow things down. :)
But they might. It is a very real possibility that with modern
hardware, and with improvements made in both m4 and autoconf,
autoconf could be changed to avoid frozen files with no loss or even a
potential gain in performance. But until someone posts hard numbers, we
can speculate all day, and it won't matter.
Probably, nobody is going to post hard numbers. As you say, the status
quo works OK, so why change it? And everyone is busy. It takes some
effort to do a benchmark correctly. The right time to do the benchmark
was decades ago. Maybe they did. I'm sure they were doing their best.
They're not mandatory to use. But at this point, we can't rip it out of
m4 - there are users that depend on it. The code is designed to not
penalize people that aren't using it.
I never suggested frozen files were mandatory to use, so I don't get
your point. What I am suggesting is that they are mandatory to maintain. And my
guess is they add significant complexity to the software (you would be
best placed to comment on that). And as m4 development seems more or
less stuck based on what I read, maybe it might be a good idea to
strategize before adding some other "feature", and to figure out how to
get m4 development unstuck. And again, sometimes "less is more".
Okay, then it sounds like we are on the same page about leaving it alone.
While we have prepared the code to deprecate some command line options
that aren't very consistent, we haven't had to deprecate any features.
I don't see that marking frozen files as deprecated would make any
difference.
I basically agree, given there is apparently not the maintenance burden
that I imagined there would be.
At this point, m4 is stable enough that patches speak louder than words.
While it can be quite powerful at what it does, there don't seem to
be many people flocking to use it. Whether that is because people don't
know about it, or because m4 only fits a niche market, it's hard to
justify adding features when there is already such a low volume of
contribution and a lack of free time on my part to write new patches.
I'd love to review patches from others - but such patches are rarely
submitted.
I _do_ like suggestions for improvement, but with the limited time I
spend on m4, I like it more when those suggestions are accompanied by an
implementation that demonstrates the improvement rather than just
describing it in prose.
You have a good point about submitting patches for review. I had the
(probably naive) idea it was better for the user (me) to make a
suggestion, and the expert (you) to do the coding. I know talk is cheap,
a lot easier than changing source code. On the other hand, systems
analysis (which I hope I'm good at) is valuable, too. And a user, even
if a good C programmer, is unlikely to have an adequate understanding of
the intricacies of the m4 source code.
I get that there is not exactly a tidal wave of m4 users. When I said
"echo chamber", I was trying to convey that there are maybe just a few
listeners on this group, and certainly very few posts, so I may be more
or less talking to myself. :( It's understandable that low-activity
software is a lower priority to improve.
m4 seems very forbidding and obscure to new users. I speak from
experience. I use other initially very confusing but ultimately very
useful software tools (vi, sed, awk, etc.). But m4 always seemed even
more obscure and remote, not something for a "regular user".
Finally, after many years, I made the effort to learn m4. I've read the
manual several times, done some experiments. I have found it quite
useful. I've written over 50,000 (mostly simple) macros already, a kind
of data dictionary, used to process a variety of files. And it's doing
most of what I want. It's just that I ran into a few (IMO) rough edges,
and so I felt compelled to post, in case it is helpful at some point. Of
course, frozen files are not a "rough edge"; they just didn't speed
things up much, at least in this one real-world case.
Thanks,
Daniel