m4-patches

needless fopens


From: Eric Blake
Subject: needless fopens
Date: Sat, 06 Dec 2008 17:17:59 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.18) Gecko/20081105 Thunderbird/2.0.0.18 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

This one-liner change reduces the time spent using autoconf.git on
coreutils.git by more than 1% on my system (0m12.471s down to 0m12.332s),
when my $TMPDIR is disk-based.  It also reduces the number of fopen calls
performed in output.c from 723 down to 301 on the same testcase.  And with
autoconf 2.63 (where coreutils' configure is closer to 2 megabytes, rather
than less than 1), the speedup should be even more pronounced because
there is more diversion swapping going on.

Basically, m4 keeps track of the total size of all in-memory diversions,
to decide when to spill to temporary files; but without this line, the
total was not adjusted back down after a diversion had been discarded,
resulting in lots of needless I/O for the creation of small temporary
files when m4 didn't realize it had freed up memory.  This performance bug
has been present at least since m4 1.3 (the start of git history);
fortunately, it does not affect correct output.  Also, I suspect that a
ramdisk $TMPDIR, where the repeated I/O doesn't cost as much, is not
affected quite as much as a disk-based $TMPDIR.
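
The accounting problem described above can be reproduced in a small
standalone model.  This is only a sketch, not the m4 source: the
`diversion` struct, `SPILL_THRESHOLD`, `grow`, `discard`, and `simulate`
are hypothetical names, and the threshold value is an arbitrary assumption.
Only `total_buffer_size` and the `size`/`used` fields mirror names that
appear in the patch.

```c
#include <stdlib.h>

/* Assumed spill limit for this sketch; not m4's actual value. */
#define SPILL_THRESHOLD 1024

/* Hypothetical stand-in for an in-memory diversion. */
typedef struct
{
  char *buffer;
  size_t size;                  /* bytes allocated */
  size_t used;                  /* bytes in use */
} diversion;

static size_t total_buffer_size = 0;    /* sum over all diversions */
static int spill_count = 0;             /* times we would hit the disk */

/* Grow a diversion in memory, or record a spill if the global total
   would exceed the threshold.  */
static void
grow (diversion *d, size_t n)
{
  char *p;
  if (total_buffer_size + n > SPILL_THRESHOLD)
    {
      spill_count++;            /* real m4 would use a temporary file */
      return;
    }
  p = realloc (d->buffer, d->size + n);
  if (!p)
    abort ();
  d->buffer = p;
  d->size += n;
  d->used += n;
  total_buffer_size += n;
}

/* Discard a diversion.  With FIX set, the global total is adjusted
   back down, as in the one-line patch; without it, the total leaks.  */
static void
discard (diversion *d, int fix)
{
  if (d->size)
    {
      free (d->buffer);
      d->buffer = NULL;
      if (fix)
        total_buffer_size -= d->size;
      d->size = 0;
      d->used = 0;
    }
}

/* Fill and discard one 512-byte diversion ten times; return the
   number of spills that the accounting would have triggered.  */
static int
simulate (int fix)
{
  diversion d = { NULL, 0, 0 };
  int i;
  total_buffer_size = 0;
  spill_count = 0;
  for (i = 0; i < 10; i++)
    {
      grow (&d, 512);
      discard (&d, fix);
    }
  return spill_count;
}
```

In this model, `simulate (0)` reports 8 spills even though no more than 512
bytes are ever live at once, while `simulate (1)` reports none.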

But before I apply this patch, I'm working on one other.  Prior to m4
1.4.8, all temporary files were kept open indefinitely, but this had the
potential to run into EMFILE limitations.  So m4 1.4.8 changed things to
always close a FILE* when changing away from a large diversion, then
reopen it when coming back.  I traced the sequence of fopens in use for
coreutils' configure.ac, and noticed that with the one-liner applied, only
two temporary files were ever needed (diversion 1000, aka BODY, and
diversion 10000, aka GROW), and that most of the 301 remaining fopens were
merely revisiting the same file, or swapping over to the other hot
diversion.  So it looks like adding a cache of 1 or 2 hot diversion FILE*
should cut the number of fopen calls from 301 down to 2, without sacrificing the
1.4.8 fix to avoid the arbitrary EMFILE limitation, for even more
potential speedup.  It looks more and more like I have a reason to release
m4 1.4.13.
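
To make the effect of such a cache concrete, the following sketch counts
how many fopen calls a given access pattern would need when a small number
of spilled diversions may keep their FILE* open.  It is a model under
stated assumptions, not the planned m4 change: `count_opens`, `MAX_SLOTS`,
and the least-recently-used eviction policy are all hypothetical, and no
real files are touched.

```c
#include <stddef.h>

/* Hypothetical upper bound on cache slots for this sketch. */
#define MAX_SLOTS 4

/* Count the fopen calls needed to visit the spilled diversions listed
   in ACCESSES (N entries), when up to SLOTS of them may stay open.
   SLOTS == 0 models closing the file after every visit, so that each
   visit needs a fresh open.  Eviction is least-recently-used.  */
static int
count_opens (const int *accesses, size_t n, int slots)
{
  int cached[MAX_SLOTS];        /* diversion numbers, most recent first */
  int ncached = 0;
  int opens = 0;
  size_t i;
  int j;

  if (slots < 0)
    slots = 0;
  if (slots > MAX_SLOTS)
    slots = MAX_SLOTS;

  for (i = 0; i < n; i++)
    {
      int divnum = accesses[i];
      int pos = -1;

      for (j = 0; j < ncached; j++)
        if (cached[j] == divnum)
          {
            pos = j;
            break;
          }

      if (pos < 0)
        {
          opens++;              /* miss: the file must be (re)opened */
          if (slots == 0)
            continue;           /* nothing is kept open */
          if (ncached < slots)
            ncached++;          /* use a free slot */
          pos = ncached - 1;    /* otherwise overwrite the oldest entry */
        }

      /* Move the visited diversion to the front of the list.  */
      for (j = pos; j > 0; j--)
        cached[j] = cached[j - 1];
      cached[0] = divnum;
    }
  return opens;
}
```

For the pattern 1000, 1000, 10000, 1000, 10000, 10000, this model needs 6
opens with no cache, 4 with one slot, and 2 with two slots, which matches
the intuition that two hot diversions want two cached handles.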

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkk7FjcACgkQ84KuGfSFAYBlygCeLquduCSxEkLyUpk6oSKlzOnI
DBEAoJwTBtWN5Ma5OrSOq3RXxRo7JVrx
=VX1o
-----END PGP SIGNATURE-----
diff --git a/ChangeLog b/ChangeLog
index 474cc86..3d4a808 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2008-12-06  Eric Blake  <address@hidden>
+
+       * src/output.c (insert_diversion_helper): Keep proper track of
+       in-memory diversions, to avoid undersized temporary files.
+
 2008-10-10  Eric Blake  <address@hidden>
 
        Release Version 1.4.12.
diff --git a/src/output.c b/src/output.c
index d621c4b..1754b83 100644
--- a/src/output.c
+++ b/src/output.c
@@ -1,7 +1,7 @@
 /* GNU m4 -- A simple macro processor
 
    Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 2004, 2005, 2006,
-   2007 Free Software Foundation, Inc.
+   2007, 2008 Free Software Foundation, Inc.
 
    This file is part of GNU M4.
 
@@ -723,6 +723,7 @@ insert_diversion_helper (m4_diversion *diversion)
   if (diversion->size)
     {
       free (diversion->u.buffer);
+      total_buffer_size -= diversion->size;
       diversion->size = 0;
       diversion->used = 0;
     }
