m4-patches

argv_ref patch 25: allow NUL in quote and comment delimiters


From: Eric Blake
Subject: argv_ref patch 25: allow NUL in quote and comment delimiters
Date: Wed, 18 Jun 2008 07:28:59 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080421 Thunderbird/2.0.0.14 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Next in the series, this time to handle embedded NUL in quote and comment
syntax.  Quotes and comments were already tracking length, but a number of
places either assumed they did not contain NUL or required a NUL
terminator instead of using the length.  The patch doesn't
use any more memory, and in some cases can use slightly less (where a NUL
terminator is now omitted); and I didn't notice any obvious speed differences.
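
To make the distinction concrete, here is a minimal sketch (not code from
the patch) of why a delimiter has to carry its own length once NUL is
allowed.  The names `struct delim` and `delim_matches` are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration type: a delimiter stored as pointer plus
   length, so that it may contain NUL bytes.  */
struct delim
{
  const char *str;
  size_t len;
};

/* Return true if INPUT (of INPUT_LEN bytes) begins with delimiter D.
   memcmp compares exactly D->len bytes and does not stop at NUL, which
   a strlen- or strcmp-based check would do.  */
bool
delim_matches (const struct delim *d, const char *input, size_t input_len)
{
  assert (d->len);              /* an empty delimiter never matches */
  return d->len <= input_len && memcmp (d->str, input, d->len) == 0;
}
```

With a three-byte delimiter such as `<`, NUL, `>`, strlen reports a length
of 1, so a NUL-terminated comparison would treat any input starting with
`<` as a match; the length-based check does not.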

In the process of fixing this, I also simplified quite a bit of obstack
printing code by using obstack_printf.  Thus, I split things for the
master branch into output cleanup and embedded NUL handling.  The port to
the master branch was not trivial, because I had to consider the interaction
with the new changesyntax.
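
As a rough illustration of the obstack_printf simplification, the sketch
below formats a trace-style header straight into an obstack.  It assumes
glibc, which provides <obstack.h> and declares obstack_printf in <stdio.h>
(m4 itself gets it from the gnulib obstack-printf-posix module).  The
functions `append_header` and `format_header` are hypothetical helpers, and
the header only mimics the shape of m4's trace output.

```c
#define _GNU_SOURCE             /* for obstack_printf in glibc's <stdio.h> */
#include <assert.h>
#include <obstack.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Append a trace-style header to the object growing in OBS and return
   the number of bytes that were added.  */
size_t
append_header (struct obstack *obs, const char *file, int line, size_t level)
{
  size_t before = obstack_object_size (obs);

  /* A fixed string needs no formatting, so grow by its known length.  */
  obstack_grow (obs, "m4trace:", 8);
  /* One call replaces a hand-rolled loop over the format string.  */
  obstack_printf (obs, "%s:%d: -%zu- ", file, line, level);
  return obstack_object_size (obs) - before;
}

/* Build one header in a scratch obstack and copy it, NUL-terminated,
   into BUF of BUFSIZE bytes.  Return the number of bytes stored, not
   counting the terminator.  */
size_t
format_header (char *buf, size_t bufsize, const char *file, int line,
               size_t level)
{
  struct obstack obs;
  size_t len;
  char *text;

  assert (buf && bufsize);
  obstack_init (&obs);
  len = append_header (&obs, file, line, level);
  text = obstack_finish (&obs);
  if (bufsize <= len)
    len = bufsize - 1;          /* truncate rather than overflow BUF */
  memcpy (buf, text, len);
  buf[len] = '\0';
  obstack_free (&obs, NULL);    /* release everything in the obstack */
  return len;
}
```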

2008-06-18  Eric Blake  <address@hidden>

        Stage 25: Handle embedded NUL in changequote and changecom.
        Track quote and comment delimiters by length, to allow embedded
        NUL.  Convert macro tracing and other locations to use
        obstack_printf rather than hand-rolled equivalents.  Ensure that
        embedded NUL in trace output does not truncate the trace string.
        Memory impact: none.
        Speed impact: none noticed.
        * m4/gnulib-cache.m4: Import obstack-printf-posix module.
        * src/m4.h (ntoa): Remove declaration.
        (DEBUG_PRINT1, DEBUG_PRINT3, MESSAGE, DEBUG_MESSAGE1)
        (DEBUG_MESSAGE2): Delete, now that these macros are unused.
        (debug_message_prefix): Rename...
        (debug_message): ...and add parameters.
        (set_quotes, set_comment): Add parameters.
        * src/debug.c (debug_message_prefix): Rename...
        (debug_message): ...and use obstack_printf.
        (trace_format): Delete.
        (trace_header): Adjust caller.
        * src/input.c (init_argv_token, input_init): Handle embedded NUL
        in comments and quotes.
        (match_input, MATCH, set_quotes, set_comment): Add parameter.
        (set_quote_age): Adjust heuristic for safe quote.
        (push_file, pop_input, next_token, peek_token): Adjust callers.
        * src/freeze.c (produce_frozen_state, reload_frozen_state): Handle
        embedded NUL in quotes and comments.
        * src/builtin.c (ntoa): Make static.
        (shipout_int, m4_eval, m4_maketemp): Use obstack_printf.
        (m4_dumpdef): Avoid truncating output on embedded NUL.
        (m4_changequote, m4_changecom): Handle embedded NUL.
        * src/format.c (expand_format): Use obstack_printf.
        * src/output.c (m4_tmpname, divert_text): Likewise.
        * src/path.c (m4_path_search): Adjust caller.
        * doc/m4.texinfo (Using frozen files): Enhance test.
        * examples/null.m4: Likewise.
        * examples/null.out: Update expected output.
        * examples/null.err: Likewise.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkhZDZsACgkQ84KuGfSFAYAxYACfaVz5M0VOLDcxiecG82tpJ6RX
N64AoJKXGjVqiFNuA22L1hS7BNW+C+6y
=/TOY
-----END PGP SIGNATURE-----
From c05ce945d2a377eb37365eada8f0dc402479a94e Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Sat, 14 Jun 2008 11:02:12 -0600
Subject: [PATCH] Stage 25a: Use obstack_printf for easier output.

* ltdl/m4/gnulib-cache.m4: Import obstack-printf-posix module.
* m4/macro.c (trace_format): Delete; use obstack_printf instead.
(trace_header, trace_pre, trace_post): All callers updated.
* m4/output.c (m4_shipout_int, m4_tmpname): Use obstack_printf.
(m4_divert_text): Speed up syncline output.
* modules/m4.c (dumpdef): Handle embedded NUL.
(numb_obstack): Speed up eval output.
(maketemp): Use obstack_printf.
* modules/format.c (format): Likewise.

Signed-off-by: Eric Blake <address@hidden>
---
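
The numb_obstack and maketemp changes below rely on a printf idiom for zero
padding.  As a standalone sketch of that idiom (using snprintf rather than
obstack_printf, with a hypothetical helper name `pad_digits`): the "%.*d"
conversion of the value 0 emits exactly as many zeros as the precision, and
nothing at all when the precision is 0.  The argument consumed by `*` must
be an int, hence the int arithmetic here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write DIGITS into BUF (of BUFSIZE bytes), left-padded with '0' to at
   least MIN characters.  Return the result of snprintf, i.e. the
   number of characters the full output requires.  */
int
pad_digits (char *buf, size_t bufsize, const char *digits, int min)
{
  int len;

  assert (buf && digits);
  len = (int) strlen (digits);
  if (min < len)
    min = len;                  /* never truncate the digits */
  /* Converting 0 with precision P yields P zeros; with precision 0 it
     yields no characters, so no special case is needed.  */
  return snprintf (buf, bufsize, "%.*d%s", min - len, 0, digits);
}
```
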
 ChangeLog               |   16 ++++++++
 ltdl/m4/gnulib-cache.m4 |    4 +-
 m4/macro.c              |   91 +++++++---------------------------------------
 m4/output.c             |   24 +++---------
 modules/format.c        |   49 ++++++++-----------------
 modules/m4.c            |   53 ++++++++++++++-------------
 6 files changed, 81 insertions(+), 156 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 19631a6..46230ae 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,21 @@
 2008-06-16  Eric Blake  <address@hidden>
 
+       Stage 25a: Use obstack_printf for easier output.
+       Convert macro tracing and other locations to use obstack_printf
+       rather than hand-rolled equivalents.  Ensure that embedded NUL in
+       trace output does not truncate the trace string.
+       Memory impact: none.
+       Speed impact: none noticed.
+       * ltdl/m4/gnulib-cache.m4: Import obstack-printf-posix module.
+       * m4/macro.c (trace_format): Delete; use obstack_printf instead.
+       (trace_header, trace_pre, trace_post): All callers updated.
+       * m4/output.c (m4_shipout_int, m4_tmpname): Use obstack_printf.
+       (m4_divert_text): Speed up syncline output.
+       * modules/m4.c (dumpdef): Handle embedded NUL.
+       (numb_obstack): Speed up eval output.
+       (maketemp): Use obstack_printf.
+       * modules/format.c (format): Likewise.
+
        Add missing const qualifications.
        * m4/resyntax.c (m4_resyntax_map): Declare array elements as
        const.
diff --git a/ltdl/m4/gnulib-cache.m4 b/ltdl/m4/gnulib-cache.m4
index f5f442c..fb91ba5 100644
--- a/ltdl/m4/gnulib-cache.m4
+++ b/ltdl/m4/gnulib-cache.m4
@@ -15,11 +15,11 @@
 
 
 # Specification in the form of a command-line invocation:
-#   gnulib-tool --import --dir=. --local-dir=local --lib=libgnu --source-base=gnu --m4-base=ltdl/m4 --doc-base=doc --aux-dir=build-aux --with-tests --libtool --macro-prefix=M4 assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h configmake dirname error exit fdl fflush filenamecat flexmember fopen-safer fseeko gendocs gettext git-version-gen gnumakefile gnupload gpl-3.0 intprops memmem mkstemp obstack progname propername quote regex regexprops-generic sprintf-posix stdbool stdlib-safer strnlen strtod strtol tempname unlocked-io vasnprintf-posix verror xalloc xalloc-die xmemdup0 xprintf-posix xstrndup xvasprintf-posix
+#   gnulib-tool --import --dir=. --local-dir=local --lib=libgnu --source-base=gnu --m4-base=ltdl/m4 --doc-base=doc --aux-dir=build-aux --with-tests --libtool --macro-prefix=M4 assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h configmake dirname error exit fdl fflush filenamecat flexmember fopen-safer fseeko gendocs gettext git-version-gen gnumakefile gnupload gpl-3.0 intprops memmem mkstemp obstack obstack-printf-posix progname propername quote regex regexprops-generic sprintf-posix stdbool stdlib-safer strnlen strtod strtol tempname unlocked-io vasnprintf-posix verror xalloc xalloc-die xmemdup0 xprintf-posix xstrndup xvasprintf-posix
 
 # Specification in the form of a few gnulib-tool.m4 macro invocations:
 gl_LOCAL_DIR([local])
-gl_MODULES([assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h configmake dirname error exit fdl fflush filenamecat flexmember fopen-safer fseeko gendocs gettext git-version-gen gnumakefile gnupload gpl-3.0 intprops memmem mkstemp obstack progname propername quote regex regexprops-generic sprintf-posix stdbool stdlib-safer strnlen strtod strtol tempname unlocked-io vasnprintf-posix verror xalloc xalloc-die xmemdup0 xprintf-posix xstrndup xvasprintf-posix])
+gl_MODULES([assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h configmake dirname error exit fdl fflush filenamecat flexmember fopen-safer fseeko gendocs gettext git-version-gen gnumakefile gnupload gpl-3.0 intprops memmem mkstemp obstack obstack-printf-posix progname propername quote regex regexprops-generic sprintf-posix stdbool stdlib-safer strnlen strtod strtol tempname unlocked-io vasnprintf-posix verror xalloc xalloc-die xmemdup0 xprintf-posix xstrndup xvasprintf-posix])
 gl_AVOID([])
 gl_SOURCE_BASE([gnu])
 gl_M4_BASE([ltdl/m4])
diff --git a/m4/macro.c b/m4/macro.c
index b1f9f44..c638023 100644
--- a/m4/macro.c
+++ b/m4/macro.c
@@ -23,8 +23,6 @@
 
 #include <config.h>
 
-#include <stdarg.h>
-
 #include "m4private.h"
 
 #include "intprops.h"
@@ -134,9 +132,6 @@ static void    process_macro         (m4 *, m4_symbol_value *, m4_obstack *, int,
 
 static unsigned int trace_pre   (m4 *, m4_macro_args *);
 static void    trace_post       (m4 *, unsigned int, const m4_call_info *);
-
-static void    trace_format     (m4 *, const char *, ...)
-  M4_GNUC_PRINTF (2, 3);
 static unsigned int trace_header (m4 *, const m4_call_info *);
 static void    trace_flush      (m4 *, unsigned int);
 
@@ -796,80 +791,22 @@ process_macro (m4 *context, m4_symbol_value *value, m4_obstack *obs,
    This prevents tracing output from interfering with other debug
    messages generated by the various builtins.  */
 
-/* Tracing output is formatted here, by a simplified printf-to-obstack
-   function trace_format ().  Understands only %s, %d, %zu (size_t
-   value).  */
-static void
-trace_format (m4 *context, const char *fmt, ...)
-{
-  va_list args;
-  char ch;
-  const char *s;
-  char nbuf[INT_BUFSIZE_BOUND (sizeof (int) > sizeof (size_t)
-                              ? sizeof (int) : sizeof (size_t))];
-
-  va_start (args, fmt);
-
-  while (true)
-    {
-      while ((ch = *fmt++) != '\0' && ch != '%')
-       obstack_1grow (&context->trace_messages, ch);
-
-      if (ch == '\0')
-       break;
-
-      switch (*fmt++)
-       {
-       case 's':
-         s = va_arg (args, const char *);
-         break;
-
-       case 'd':
-         {
-           int d = va_arg (args, int);
-
-           sprintf (nbuf, "%d", d);
-           s = nbuf;
-         }
-         break;
-
-       case 'z':
-         ch = *fmt++;
-         assert (ch == 'u');
-         {
-           size_t z = va_arg (args, size_t);
-
-           sprintf (nbuf, "%zu", z);
-           s = nbuf;
-         }
-         break;
-
-       default:
-         abort ();
-         break;
-       }
-
-      obstack_grow (&context->trace_messages, s, strlen (s));
-    }
-
-  va_end (args);
-}
-
 /* Format the standard header attached to all tracing output lines,
    using the context in INFO as appropriate.  Return the offset into
    the trace obstack where this particular trace begins.  */
 static unsigned int
 trace_header (m4 *context, const m4_call_info *info)
 {
-  unsigned int result = obstack_object_size (&context->trace_messages);
-  trace_format (context, "m4trace:");
+  m4_obstack *trace = &context->trace_messages;
+  unsigned int result = obstack_object_size (trace);
+  obstack_grow (trace, "m4trace:", 8);
   if (info->debug_level & M4_DEBUG_TRACE_FILE)
-    trace_format (context, "%s:", info->file);
+    obstack_printf (trace, "%s:", info->file);
   if (info->debug_level & M4_DEBUG_TRACE_LINE)
-    trace_format (context, "%d:", info->line);
-  trace_format (context, " -%zu- ", context->expansion_level);
+    obstack_printf (trace, "%d:", info->line);
+  obstack_printf (trace, " -%zu- ", context->expansion_level);
   if (info->debug_level & M4_DEBUG_TRACE_CALLID)
-    trace_format (context, "id %zu: ", info->call_id);
+    obstack_printf (trace, "id %zu: ", info->call_id);
   return result;
 }
 
@@ -924,10 +861,10 @@ trace_pre (m4 *context, m4_macro_args *argv)
 {
   int trace_level = argv->info->debug_level;
   unsigned int start = trace_header (context, argv->info);
+  m4_obstack *trace = &context->trace_messages;
 
   assert (argv->info->trace);
-  obstack_grow (&context->trace_messages, argv->info->name,
-               argv->info->name_len);
+  obstack_grow (trace, argv->info->name, argv->info->name_len);
 
   if (1 < m4_arg_argc (argv) && (trace_level & M4_DEBUG_TRACE_ARGS))
     {
@@ -937,10 +874,10 @@ trace_pre (m4 *context, m4_macro_args *argv)
 
       if (trace_level & M4_DEBUG_TRACE_QUOTE)
        quotes = m4_get_syntax_quotes (M4SYNTAX);
-      trace_format (context, "(");
-      m4__arg_print (context, &context->trace_messages, argv, 1, quotes, false,
-                    NULL, ", ", &arg_length, true, module);
-      trace_format (context, ")");
+      obstack_1grow (trace, '(');
+      m4__arg_print (context, trace, argv, 1, quotes, false, NULL, ", ",
+                    &arg_length, true, module);
+      obstack_1grow (trace, ')');
     }
   return start;
 }
@@ -954,7 +891,7 @@ trace_post (m4 *context, unsigned int start, const m4_call_info *info)
   assert (info->trace);
   if (info->debug_level & M4_DEBUG_TRACE_EXPANSION)
     {
-      trace_format (context, " -> ");
+      obstack_grow (&context->trace_messages, " -> ", 4);
       m4_input_print (context, &context->trace_messages, info->debug_level);
     }
   trace_flush (context, start);
diff --git a/m4/output.c b/m4/output.c
index f94cbd2..f86f913 100644
--- a/m4/output.c
+++ b/m4/output.c
@@ -194,12 +194,7 @@ m4_tmpname (int divnum)
   static size_t offset;
   if (buffer == NULL)
     {
-      obstack_grow (&diversion_storage, output_temp_dir->dir_name,
-                   strlen (output_temp_dir->dir_name));
-      obstack_1grow (&diversion_storage, '/');
-      obstack_1grow (&diversion_storage, 'm');
-      obstack_1grow (&diversion_storage, '4');
-      obstack_1grow (&diversion_storage, '-');
+      obstack_printf (&diversion_storage, "%s/m4-", output_temp_dir->dir_name);
       offset = obstack_object_size (&diversion_storage);
       buffer = (char *) obstack_alloc (&diversion_storage,
                                       INT_BUFSIZE_BOUND (divnum));
@@ -474,8 +469,6 @@ m4_divert_text (m4 *context, m4_obstack *obs, const char *text, size_t length,
                int line)
 {
   static bool start_of_output_line = true;
-  char linebuf[6 + INT_BUFSIZE_BOUND (unsigned long int)]; /* "#line nnnn" */
-  const char *cursor;
 
   /* If output goes to an obstack, merely add TEXT to it.  */
 
@@ -537,20 +530,17 @@ m4_divert_text (m4 *context, m4_obstack *obs, const char *text, size_t length,
 
          if (m4_get_output_line (context) != line)
            {
+             char linebuf[sizeof "#line " + INT_BUFSIZE_BOUND (line)];
              sprintf (linebuf, "#line %lu",
                       (unsigned long int) m4_get_current_line (context));
-             for (cursor = linebuf; *cursor; cursor++)
-               OUTPUT_CHARACTER (*cursor);
+             m4_output_text (context, linebuf, strlen (linebuf));
              if (m4_get_output_line (context) < 1
                  && m4_get_current_file (context)[0] != '\0')
                {
+                 const char *file = m4_get_current_file (context);
                  OUTPUT_CHARACTER (' ');
                  OUTPUT_CHARACTER ('"');
-                 for (cursor = m4_get_current_file (context);
-                      *cursor; cursor++)
-                   {
-                     OUTPUT_CHARACTER (*cursor);
-                   }
+                 m4_output_text (context, file, strlen (file));
                  OUTPUT_CHARACTER ('"');
                }
              OUTPUT_CHARACTER ('\n');
@@ -585,9 +575,7 @@ m4_divert_text (m4 *context, m4_obstack *obs, const char *text, size_t length,
 void
 m4_shipout_int (m4_obstack *obs, int val)
 {
-  char buf[INT_BUFSIZE_BOUND (int)];
-  int len = sprintf(buf, "%d", val);
-  obstack_grow (obs, buf, len);
+  obstack_printf (obs, "%d", val);
 }
 
 /* Output the text S, of length LEN, to OBS.  If QUOTED, also output
diff --git a/modules/format.c b/modules/format.c
index f5695e4..e2a1a42 100644
--- a/modules/format.c
+++ b/modules/format.c
@@ -152,10 +152,8 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
      behavior in printf.  */
   char ok[128];
 
-  /* Buffer and stuff.  */
-  char *base;                  /* Current position in obs.  */
-  size_t len;                  /* Length of formatted text.  */
-  char *str;                   /* Malloc'd buffer of formatted text.  */
+  /* Check that formatted text succeeded with correct type.  */
+  int result = 0;
   enum {CHAR, INT, LONG, DOUBLE, STR} datatype;
 
   f = fmt = ARG_STR (i, argc, argv);
@@ -349,56 +347,39 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
        }
       *p++ = c;
       *p = '\0';
-      base = obstack_next_free (obs);
-      len = obstack_room (obs);
 
       switch (datatype)
        {
        case CHAR:
-         str = asnprintf (base, &len, fstart, width,
-                          ARG_INT (i, argc, argv));
+         result = obstack_printf (obs, fstart, width,
+                                  ARG_INT (i, argc, argv));
          break;
 
        case INT:
-         str = asnprintf (base, &len, fstart, width, prec,
-                          ARG_INT (i, argc, argv));
+         result = obstack_printf (obs, fstart, width, prec,
+                                  ARG_INT (i, argc, argv));
          break;
 
        case LONG:
-         str = asnprintf (base, &len, fstart, width, prec,
-                          ARG_LONG (i, argc, argv));
+         result = obstack_printf (obs, fstart, width, prec,
+                                  ARG_LONG (i, argc, argv));
          break;
 
        case DOUBLE:
-         str = asnprintf (base, &len, fstart, width, prec,
-                          ARG_DOUBLE (i, argc, argv));
+         result = obstack_printf (obs, fstart, width, prec,
+                                  ARG_DOUBLE (i, argc, argv));
          break;
 
        case STR:
-         str = asnprintf (base, &len, fstart, width, prec,
-                          ARG_STR (i, argc, argv));
+         result = obstack_printf (obs, fstart, width, prec,
+                                  ARG_STR (i, argc, argv));
          break;
 
        default:
          abort ();
        }
-
-      if (str == NULL)
-       /* NULL is unexpected (EILSEQ and EINVAL are not possible
-          based on our construction of fstart, leaving only ENOMEM,
-          which should always be fatal).  */
-       m4_error (context, EXIT_FAILURE, errno, me,
-                 _("unable to format output for `%s'"), f);
-      else if (str == base)
-       /* The output was already computed in place, but we need to
-          account for its size.  */
-       obstack_blank_fast (obs, len);
-      else
-       {
-         /* The output exceeded available obstack space, copy the
-            allocated string.  */
-         obstack_grow (obs, str, len);
-         free (str);
-       }
+      /* Since obstack_printf can only fail with EILSEQ or EINVAL, but
+        we constructed fstart, the result should not be negative.  */
+      assert (0 <= result);
     }
 }
diff --git a/modules/m4.c b/modules/m4.c
index b993ec4..0ee6a68 100644
--- a/modules/m4.c
+++ b/modules/m4.c
@@ -352,18 +352,19 @@ M4BUILTIN_HANDLER (dumpdef)
       m4_symbol *symbol = m4_symbol_lookup (M4SYMTAB, data.base->str,
                                            data.base->len);
       char *value;
+      size_t len;
       assert (symbol);
 
       /* TODO - add debugmode(b) option to control quoting style.  */
-      fwrite (data.base->str, 1, data.base->len, stderr);
-      fputc (':', stderr);
-      fputc ('\t', stderr);
+      obstack_grow (obs, data.base->str, data.base->len);
+      obstack_1grow (obs, ':');
+      obstack_1grow (obs, '\t');
       m4_symbol_print (context, symbol, obs, quotes, stack, arg_length,
                       module);
       obstack_1grow (obs, '\n');
-      obstack_1grow (obs, '\0');
+      len = obstack_object_size (obs);
       value = (char *) obstack_finish (obs);
-      fputs (value, stderr);
+      fwrite (value, 1, len, stderr);
       obstack_free (obs, value);
     }
 }
@@ -761,22 +762,19 @@ M4BUILTIN_HANDLER (maketemp)
       const char *str = M4ARG (1);
       size_t len = M4ARGLEN (1);
       size_t i;
-      size_t len2;
+      m4_obstack *scratch = m4_arg_scratch (context);
+      size_t pid_len = obstack_printf (scratch, "%lu",
+                                      (unsigned long) getpid ());
+      char *pid = (char *) obstack_copy0 (scratch, "", 0);
 
       for (i = len; i > 1; i--)
        if (str[i - 1] != 'X')
          break;
       obstack_grow (obs, str, i);
-      str = ntoa ((number) getpid (), 10);
-      len2 = strlen (str);
-      if (len2 > len - i)
-       obstack_grow (obs, str + len2 - (len - i), len - i);
+      if (len - i < pid_len)
+       obstack_grow (obs, pid + pid_len - (len - i), len - i);
       else
-       {
-         while (i++ < len - len2)
-           obstack_1grow (obs, '0');
-         obstack_grow (obs, str, len2);
-       }
+       obstack_printf (obs, "%.*d%s", len - i - pid_len, 0, pid);
     }
   else
     m4_make_temp (context, obs, me, M4ARG (1), M4ARGLEN (1), false);
@@ -1169,19 +1167,25 @@ numb_obstack (m4_obstack *obs, number value, int radix, int min)
 {
   const char *s;
   size_t len;
+  unumber uvalue;
 
   if (radix == 1)
     {
-      /* FIXME - this code currently depends on undefined behavior.  */
       if (value < 0)
        {
          obstack_1grow (obs, '-');
-         value = -value;
+         uvalue = -value;
        }
-      while (min-- - value > 0)
-       obstack_1grow (obs, '0');
-      while (value-- != 0)
-       obstack_1grow (obs, '1');
+      else
+       uvalue = value;
+      if (uvalue < min)
+       {
+         obstack_blank (obs, min - uvalue);
+         memset ((char *) obstack_next_free (obs) - (min - uvalue), '0',
+                 min - uvalue);
+       }
+      obstack_blank (obs, uvalue);
+      memset ((char *) obstack_next_free (obs) - uvalue, '1', uvalue);
       return;
     }
 
@@ -1193,10 +1197,9 @@ numb_obstack (m4_obstack *obs, number value, int radix, int min)
       s++;
     }
   len = strlen (s);
-  for (min -= len; --min >= 0;)
-    obstack_1grow (obs, '0');
-
-  obstack_grow (obs, s, len);
+  if (min < len)
+    min = len;
+  obstack_printf (obs, "%.*d%s", min - len, 0, s);
 }
 
 
-- 
1.5.5.1


From 7ae3299164295bb67a1c8ba46d64e01e7e82d4df Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Wed, 18 Jun 2008 06:31:44 -0600
Subject: [PATCH] Stage 25b: Handle embedded NUL in changesyntax and friends.

* m4/m4module.h (m4_set_quotes, m4_set_comment, m4_set_syntax):
Add parameter.
(m4_reset_syntax): New prototype.
* m4/syntax.c (add_syntax_set, subtract_syntax_set)
(set_syntax_set, m4_set_quotes, m4_set_comment): Add parameter, to
handle embedded NUL.
(m4_set_syntax): Likewise.  Also, split code to reset the table...
(m4_reset_syntax): ...into a new function.
(m4_syntax_create): Adjust callers.
* m4/input.c (match_input, MATCH): Add parameter.
(m4__next_token, m4__next_token_is_open): Adjust callers.
* modules/m4.h (m4_expand_ranges_func): Add parameter.
* modules/m4.c (dumpdef): Handle NUL in dumped quotes.
(changequote, changecom, translit, m4_expand_ranges): Track
delimiter length.
* modules/gnu.c (changesyntax): Handle embedded NUL.
* src/freeze.c (reload_frozen_state): Adjust callers.
* tests/freeze.at (reloading nul): Enhance test.
* tests/null.m4: Likewise.
* tests/null.out: Update expected output.
* tests/null.err: Likewise.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog       |   29 +++++++++
 m4/input.c      |   58 ++++++++++--------
 m4/m4module.h   |   10 ++-
 m4/syntax.c     |  185 ++++++++++++++++++++++++++++++++++--------------------
 modules/gnu.c   |   31 +++++++---
 modules/m4.c    |   52 +++++++++------
 modules/m4.h    |    3 +-
 src/freeze.c    |   10 ++-
 tests/freeze.at |    7 +-
 tests/null.err  |   19 ++++--
 tests/null.m4   |   33 +++++++---
 tests/null.out  |    7 ++-
 12 files changed, 292 insertions(+), 152 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 46230ae..c3eeed3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,32 @@
+2008-06-18  Eric Blake  <address@hidden>
+
+       Stage 25b: Handle embedded NUL in changesyntax and friends.
+       Track quote and comment delimiters by length, to allow embedded
+       NUL.  Improve changesyntax to support assigning syntax to NUL.
+       Memory impact: none.
+       Speed impact: none noticed.
+       * m4/m4module.h (m4_set_quotes, m4_set_comment, m4_set_syntax):
+       Add parameter.
+       (m4_reset_syntax): New prototype.
+       * m4/syntax.c (add_syntax_set, subtract_syntax_set)
+       (set_syntax_set, m4_set_quotes, m4_set_comment): Add parameter, to
+       handle embedded NUL.
+       (m4_set_syntax): Likewise.  Also, split code to reset the table...
+       (m4_reset_syntax): ...into a new function.
+       (m4_syntax_create): Adjust callers.
+       * m4/input.c (match_input, MATCH): Add parameter.
+       (m4__next_token, m4__next_token_is_open): Adjust callers.
+       * modules/m4.h (m4_expand_ranges_func): Add parameter.
+       * modules/m4.c (dumpdef): Handle NUL in dumped quotes.
+       (changequote, changecom, translit, m4_expand_ranges): Track
+       delimiter length.
+       * modules/gnu.c (changesyntax): Handle embedded NUL.
+       * src/freeze.c (reload_frozen_state): Adjust callers.
+       * tests/freeze.at (reloading nul): Enhance test.
+       * tests/null.m4: Likewise.
+       * tests/null.out: Update expected output.
+       * tests/null.err: Likewise.
+
 2008-06-16  Eric Blake  <address@hidden>
 
        Stage 25a: Use obstack_printf for easier output.
diff --git a/m4/input.c b/m4/input.c
index 212c1c6..ea59b44 100644
--- a/m4/input.c
+++ b/m4/input.c
@@ -123,7 +123,7 @@ static      void    init_builtin_token      (m4 *, m4_obstack *,
                                         m4_symbol_value *);
 static void    append_quote_token      (m4 *, m4_obstack *,
                                         m4_symbol_value *);
-static bool    match_input             (m4 *, const char *, bool);
+static bool    match_input             (m4 *, const char *, size_t, bool);
 static int     next_char               (m4 *, bool, bool, bool);
 static int     peek_char               (m4 *, bool);
 static bool    pop_input               (m4 *, bool);
@@ -1352,9 +1352,9 @@ m4_skip_line (m4 *context, const m4_call_info *caller)
 }
 
 
-/* This function is for matching a string against a prefix of the
-   input stream.  If the string S matches the input and CONSUME is
-   true, the input is discarded; otherwise any characters read are
+/* If the string S of length LEN matches the next characters of the
+   input stream, return true.  If CONSUME is true and a match is
+   found, the input is discarded; otherwise any characters read are
    pushed back again.  The function is used only when multicharacter
    quotes or comment delimiters are used.
 
@@ -1365,7 +1365,7 @@ m4_skip_line (m4 *context, const m4_call_info *caller)
    not properly restore the current input file and line when we
    restore unconsumed characters.  */
 static bool
-match_input (m4 *context, const char *s, bool consume)
+match_input (m4 *context, const char *s, size_t len, bool consume)
 {
   int n;                       /* number of characters matched */
   int ch;                      /* input character */
@@ -1373,11 +1373,12 @@ match_input (m4 *context, const char *s, bool consume)
   m4_obstack *st;
   bool result = false;
 
+  assert (len);
   ch = peek_char (context, false);
   if (ch != to_uchar (*s))
     return false;                      /* fail */
 
-  if (s[1] == '\0')
+  if (len == 1)
     {
       if (consume)
        next_char (context, false, false, false);
@@ -1389,7 +1390,7 @@ match_input (m4 *context, const char *s, bool consume)
     {
       next_char (context, false, false, false);
       n++;
-      if (*s == '\0')          /* long match */
+      if (--len == 1)          /* long match */
        {
          if (consume)
            return true;
@@ -1406,17 +1407,17 @@ match_input (m4 *context, const char *s, bool consume)
   return result;
 }
 
-/* The macro MATCH() is used to match an unsigned char string S
-  against the input.  The first character is handled inline, for
-  speed.  Hopefully, this will not hurt efficiency too much when
-  single character quotes and comment delimiters are used.  If
-  CONSUME, then CH is the result of next_char, and a successful match
-  will discard the matched string.  Otherwise, CH is the result of
-  peek_char, and the input stream is effectively unchanged.  */
-#define MATCH(C, ch, s, consume)                                       \
-  (to_uchar ((s)[0]) == (ch)                                           \
-   && (ch) != '\0'                                                     \
-   && ((s)[1] == '\0' || (match_input (C, (s) + (consume), consume))))
+/* The macro MATCH() is used to match an unsigned char string S of
+  length LEN against the input.  The first character is handled
+  inline, for speed.  Hopefully, this will not hurt efficiency too
+  much when single character quotes and comment delimiters are used.
+  If CONSUME, then CH is the result of next_char, and a successful
+  match will discard the matched string.  Otherwise, CH is the result
+  of peek_char, and the input stream is effectively unchanged.  */
+#define MATCH(C, ch, s, len, consume)                                  \
+  ((len) && to_uchar ((s)[0]) == (ch)                                  \
+   && ((len) == 1                                                      \
+       || match_input (C, (s) + (consume), (len) - (consume), consume)))
 
 /* While the current input character has the given SYNTAX, append it
    to OBS.  Take care not to pop input source unless the next source
@@ -1628,7 +1629,8 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
          }
       }
     else if (!m4_is_syntax_single_quotes (M4SYNTAX)
-            && MATCH (context, ch, context->syntax->quote.str1, true))
+            && MATCH (context, ch, context->syntax->quote.str1,
+                      context->syntax->quote.len1, true))
      {                                        /* QUOTED STRING, LONGER QUOTES */
        if (obs)
          obs_safe = obs;
@@ -1651,14 +1653,16 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
              }
            if (ch == CHAR_BUILTIN)
              init_builtin_token (context, obs, obs ? token : NULL);
-           else if (MATCH (context, ch, context->syntax->quote.str2, true))
+           else if (MATCH (context, ch, context->syntax->quote.str2,
+                           context->syntax->quote.len2, true))
              {
                if (--quote_level == 0)
                  break;
                obstack_grow (obs_safe, context->syntax->quote.str2,
                              context->syntax->quote.len2);
              }
-           else if (MATCH (context, ch, context->syntax->quote.str1, true))
+           else if (MATCH (context, ch, context->syntax->quote.str1,
+                           context->syntax->quote.len1, true))
              {
                quote_level++;
                obstack_grow (obs_safe, context->syntax->quote.str1,
@@ -1704,7 +1708,8 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
                ? M4_TOKEN_NONE : M4_TOKEN_STRING);
       }
     else if (!m4_is_syntax_single_comments (M4SYNTAX)
-            && MATCH (context, ch, context->syntax->comm.str1, true))
+            && MATCH (context, ch, context->syntax->comm.str1,
+                      context->syntax->comm.len1, true))
       {                                        /* COMMENT, LONGER DELIM */
        if (obs && !m4_get_discard_comments_opt (context))
          obs_safe = obs;
@@ -1729,7 +1734,8 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
                init_builtin_token (context, NULL, NULL);
                continue;
              }
-           if (MATCH (context, ch, context->syntax->comm.str2, true))
+           if (MATCH (context, ch, context->syntax->comm.str2,
+                      context->syntax->comm.len2, true))
              {
                obstack_grow (obs_safe, context->syntax->comm.str2,
                              context->syntax->comm.len2);
@@ -1864,9 +1870,11 @@ m4__next_token_is_open (m4 *context)
                                       | M4_SYNTAX_ALPHA | M4_SYNTAX_LQUOTE
                                       | M4_SYNTAX_ACTIVE))
       || (!m4_is_syntax_single_comments (M4SYNTAX)
-         && MATCH (context, ch, context->syntax->comm.str1, false))
+         && MATCH (context, ch, context->syntax->comm.str1,
+                   context->syntax->comm.len1, false))
       || (!m4_is_syntax_single_quotes (M4SYNTAX)
-         && MATCH (context, ch, context->syntax->quote.str1, false)))
+         && MATCH (context, ch, context->syntax->quote.str1,
+                   context->syntax->quote.len1, false)))
     return false;
   return m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_OPEN);
 }
diff --git a/m4/m4module.h b/m4/m4module.h
index 29495f3..c7c06c0 100644
--- a/m4/m4module.h
+++ b/m4/m4module.h
@@ -487,9 +487,13 @@ enum {
 #define m4_has_syntax(S, C, T)                                         \
   ((m4_syntab ((S), sizeof (C) == 1 ? to_uchar (C) : (C)) & (T)) > 0)
 
-extern void    m4_set_quotes   (m4_syntax_table*, const char*, const char*);
-extern void    m4_set_comment  (m4_syntax_table*, const char*, const char*);
-extern int     m4_set_syntax   (m4_syntax_table*, char, char, const char*);
+extern void    m4_set_quotes   (m4_syntax_table *, const char *, size_t,
+                                const char *, size_t);
+extern void    m4_set_comment  (m4_syntax_table *, const char *, size_t,
+                                const char *, size_t);
+extern int     m4_set_syntax   (m4_syntax_table *, char, char, const char *,
+                                size_t);
+extern void    m4_reset_syntax (m4_syntax_table *);
 
 
 
diff --git a/m4/syntax.c b/m4/syntax.c
index 5892f2a..3bba0a8 100644
--- a/m4/syntax.c
+++ b/m4/syntax.c
@@ -159,7 +159,7 @@ m4_syntax_create (void)
       }
 
   /* Set up current table to match default.  */
-  m4_set_syntax (syntax, '\0', '\0', NULL);
+  m4_reset_syntax (syntax);
   syntax->cached_simple.str1 = syntax->cached_lquote;
   syntax->cached_simple.len1 = 1;
   syntax->cached_simple.str2 = syntax->cached_rquote;
@@ -248,12 +248,15 @@ remove_syntax_attribute (m4_syntax_table *syntax, int ch, int code)
   return syntax->table[ch];
 }
 
+/* Add the set CHARS of length LEN to syntax category CODE, removing
+   them from whatever category they used to be in.  */
 static void
-add_syntax_set (m4_syntax_table *syntax, const char *chars, int code)
+add_syntax_set (m4_syntax_table *syntax, const char *chars, size_t len,
+               int code)
 {
   int ch;
 
-  if (*chars == '\0')
+  if (!len)
     return;
 
   if (code == M4_SYNTAX_ESCAPE)
@@ -261,20 +264,27 @@ add_syntax_set (m4_syntax_table *syntax, const char *chars, int code)
 
   /* Adding doesn't affect single-quote or single-comment.  */
 
-  while ((ch = to_uchar (*chars++)))
-    add_syntax_attribute (syntax, ch, code);
+  while (len--)
+    {
+      ch = to_uchar (*chars++);
+      add_syntax_attribute (syntax, ch, code);
+    }
 }
 
+/* Remove the set CHARS of length LEN from syntax category CODE,
+   adding them to category M4_SYNTAX_OTHER instead.  */
 static void
-subtract_syntax_set (m4_syntax_table *syntax, const char *chars, int code)
+subtract_syntax_set (m4_syntax_table *syntax, const char *chars, size_t len,
+                    int code)
 {
   int ch;
 
-  if (*chars == '\0')
+  if (!len)
     return;
 
-  while ((ch = to_uchar (*chars++)))
+  while (len--)
     {
+      ch = to_uchar (*chars++);
       if ((code & M4_SYNTAX_MASKS) != 0)
        remove_syntax_attribute (syntax, ch, code);
       else if (m4_has_syntax (syntax, ch, code))
@@ -306,8 +316,13 @@ subtract_syntax_set (m4_syntax_table *syntax, const char *chars, int code)
     }
 }
 
+/* Make the set CHARS of length LEN become syntax category CODE,
+   removing CHARS from any other categories, and sending all bytes in
+   the category but not in CHARS to category M4_SYNTAX_OTHER
+   instead.  */
 static void
-set_syntax_set (m4_syntax_table *syntax, const char *chars, int code)
+set_syntax_set (m4_syntax_table *syntax, const char *chars, size_t len,
+               int code)
 {
   int ch;
   /* Explicit set of characters to install with this category; all
@@ -320,8 +335,11 @@ set_syntax_set (m4_syntax_table *syntax, const char *chars, int code)
       else if (m4_has_syntax (syntax, ch, code))
        add_syntax_attribute (syntax, ch, M4_SYNTAX_OTHER);
     }
-  while ((ch = to_uchar (*chars++)))
-    add_syntax_attribute (syntax, ch, code);
+  while (len--)
+    {
+      ch = to_uchar (*chars++);
+      add_syntax_attribute (syntax, ch, code);
+    }
 
   /* Check for any cleanup needed.  */
   check_is_macro_escaped (syntax);
@@ -329,6 +347,8 @@ set_syntax_set (m4_syntax_table *syntax, const char *chars, int code)
   check_is_single_comments (syntax);
 }
 
+/* Reset syntax category CODE to its default state, sending all other
+   characters in the category back to their default state.  */
 static void
 reset_syntax_set (m4_syntax_table *syntax, int code)
 {
@@ -360,47 +380,51 @@ reset_syntax_set (m4_syntax_table *syntax, int code)
   check_is_single_comments (syntax);
 }
 
-int
-m4_set_syntax (m4_syntax_table *syntax, char key, char action,
-              const char *chars)
+/* Reset the syntax table to its default state.  */
+void
+m4_reset_syntax (m4_syntax_table *syntax)
 {
-  int code;
+  /* Restore the default syntax, which has known quote and comment
+     properties.  */
+  memcpy (syntax->table, syntax->orig, sizeof syntax->orig);
 
-  assert (syntax);
-  assert (chars || key == '\0');
-
-  if (key == '\0')
-    {
-      /* Restore the default syntax, which has known quote and comment
-        properties.  */
-      memcpy (syntax->table, syntax->orig, sizeof syntax->orig);
-
-      free (syntax->quote.str1);
-      free (syntax->quote.str2);
-      free (syntax->comm.str1);
-      free (syntax->comm.str2);
-
-      syntax->quote.str1       = xstrdup (DEF_LQUOTE);
-      syntax->quote.len1       = 1;
-      syntax->quote.str2       = xstrdup (DEF_RQUOTE);
-      syntax->quote.len2       = 1;
-      syntax->comm.str1                = xstrdup (DEF_BCOMM);
-      syntax->comm.len1                = 1;
-      syntax->comm.str2                = xstrdup (DEF_ECOMM);
-      syntax->comm.len2                = 1;
+  free (syntax->quote.str1);
+  free (syntax->quote.str2);
+  free (syntax->comm.str1);
+  free (syntax->comm.str2);
 
-      add_syntax_attribute (syntax, to_uchar (syntax->quote.str2[0]),
-                           M4_SYNTAX_RQUOTE);
-      add_syntax_attribute (syntax, to_uchar (syntax->comm.str2[0]),
-                           M4_SYNTAX_ECOMM);
+  syntax->quote.str1 = xmemdup (DEF_LQUOTE, 1);
+  syntax->quote.len1 = 1;
+  syntax->quote.str2 = xmemdup (DEF_RQUOTE, 1);
+  syntax->quote.len2 = 1;
+  syntax->comm.str1 = xmemdup (DEF_BCOMM, 1);
+  syntax->comm.len1 = 1;
+  syntax->comm.str2 = xmemdup (DEF_ECOMM, 1);
+  syntax->comm.len2 = 1;
+
+  add_syntax_attribute (syntax, to_uchar (syntax->quote.str2[0]),
+                       M4_SYNTAX_RQUOTE);
+  add_syntax_attribute (syntax, to_uchar (syntax->comm.str2[0]),
+                       M4_SYNTAX_ECOMM);
+
+  syntax->is_single_quotes = true;
+  syntax->is_single_comments = true;
+  syntax->is_macro_escaped = false;
+  set_quote_age (syntax, true, false);
+}
 
-      syntax->is_single_quotes         = true;
-      syntax->is_single_comments       = true;
-      syntax->is_macro_escaped         = false;
-      set_quote_age (syntax, true, false);
-      return 0;
-    }
+/* Alter the syntax for category KEY, according to ACTION: '+' to add,
+   '-' to subtract, '=' to set, or '\0' to reset.  The array CHARS of
+   length LEN describes the characters to modify; it is ignored if
+   ACTION is '\0'.  Return -1 if KEY is invalid, otherwise return the
+   syntax category matching KEY.  */
+int
+m4_set_syntax (m4_syntax_table *syntax, char key, char action,
+              const char *chars, size_t len)
+{
+  int code;
 
+  assert (syntax && chars);
   code = m4_syntax_code (key);
   if (code < 0)
     {
@@ -409,15 +433,16 @@ m4_set_syntax (m4_syntax_table *syntax, char key, char action,
   switch (action)
     {
     case '+':
-      add_syntax_set (syntax, chars, code);
+      add_syntax_set (syntax, chars, len, code);
       break;
     case '-':
-      subtract_syntax_set (syntax, chars, code);
+      subtract_syntax_set (syntax, chars, len, code);
       break;
     case '=':
-      set_syntax_set (syntax, chars, code);
+      set_syntax_set (syntax, chars, len, code);
       break;
     case '\0':
+      assert (!len);
       reset_syntax_set (syntax, code);
       break;
     default:
@@ -555,8 +580,13 @@ check_is_macro_escaped (m4_syntax_table *syntax)
 /* Functions for setting quotes and comment delimiters.  Used by
    m4_changecom () and m4_changequote ().  Both functions override the
    syntax table to maintain compatibility.  */
+
+/* Set the quote delimiters to LQ and RQ, with respective lengths
+   LQ_LEN and RQ_LEN.  Pass NULL if the argument was not present, to
+   distinguish from an explicit empty string.  */
 void
-m4_set_quotes (m4_syntax_table *syntax, const char *lq, const char *rq)
+m4_set_quotes (m4_syntax_table *syntax, const char *lq, size_t lq_len,
+              const char *rq, size_t rq_len)
 {
   int ch;
 
@@ -572,21 +602,27 @@ m4_set_quotes (m4_syntax_table *syntax, const char *lq, const char *rq)
   if (!lq)
     {
       lq = DEF_LQUOTE;
+      lq_len = 1;
+      rq = DEF_RQUOTE;
+      rq_len = 1;
+    }
+  else if (!rq || (lq_len && !rq_len))
+    {
       rq = DEF_RQUOTE;
+      rq_len = 1;
     }
-  else if (!rq || (*lq && !*rq))
-    rq = DEF_RQUOTE;
 
-  if (strcmp (syntax->quote.str1, lq) == 0
-      && strcmp (syntax->quote.str2, rq) == 0)
+  if (syntax->quote.len1 == lq_len && syntax->quote.len2 == rq_len
+      && memcmp (syntax->quote.str1, lq, lq_len) == 0
+      && memcmp (syntax->quote.str2, rq, rq_len) == 0)
     return;
 
   free (syntax->quote.str1);
   free (syntax->quote.str2);
-  syntax->quote.str1 = xstrdup (lq);
-  syntax->quote.len1 = strlen (lq);
-  syntax->quote.str2 = xstrdup (rq);
-  syntax->quote.len2 = strlen (rq);
+  syntax->quote.str1 = xmemdup (lq, lq_len);
+  syntax->quote.len1 = lq_len;
+  syntax->quote.str2 = xmemdup (rq, rq_len);
+  syntax->quote.len2 = rq_len;
 
   /* changequote overrides syntax_table, but be careful when it is
      used to select a start-quote sequence that is effectively
@@ -620,8 +656,12 @@ m4_set_quotes (m4_syntax_table *syntax, const char *lq, const char *rq)
   set_quote_age (syntax, false, false);
 }
 
+/* Set the comment delimiters to BC and EC, with respective lengths
+   BC_LEN and EC_LEN.  Pass NULL if the argument was not present, to
+   distinguish from an explicit empty string.  */
 void
-m4_set_comment (m4_syntax_table *syntax, const char *bc, const char *ec)
+m4_set_comment (m4_syntax_table *syntax, const char *bc, size_t bc_len,
+               const char *ec, size_t ec_len)
 {
   int ch;
 
@@ -635,20 +675,27 @@ m4_set_comment (m4_syntax_table *syntax, const char *bc, const char *ec)
      comment.  See the texinfo for what some other implementations
      do.  */
   if (!bc)
-    bc = ec = "";
-  else if (!ec || (*bc && !*ec))
-    ec = DEF_ECOMM;
+    {
+      bc = ec = "";
+      bc_len = ec_len = 0;
+    }
+  else if (!ec || (bc_len && !ec_len))
+    {
+      ec = DEF_ECOMM;
+      ec_len = 1;
+    }
 
-  if (strcmp (syntax->comm.str1, bc) == 0
-      && strcmp (syntax->comm.str2, ec) == 0)
+  if (syntax->comm.len1 == bc_len && syntax->comm.len2 == ec_len
+      && memcmp (syntax->comm.str1, bc, bc_len) == 0
+      && memcmp (syntax->comm.str2, ec, ec_len) == 0)
     return;
 
   free (syntax->comm.str1);
   free (syntax->comm.str2);
-  syntax->comm.str1 = xstrdup (bc);
-  syntax->comm.len1 = strlen (bc);
-  syntax->comm.str2 = xstrdup (ec);
-  syntax->comm.len2 = strlen (ec);
+  syntax->comm.str1 = xmemdup (bc, bc_len);
+  syntax->comm.len1 = bc_len;
+  syntax->comm.str2 = xmemdup (ec, ec_len);
+  syntax->comm.len2 = ec_len;
 
   /* changecom overrides syntax_table, but be careful when it is used
      to select a start-comment sequence that is effectively
diff --git a/modules/gnu.c b/modules/gnu.c
index 99df3ef..75e5363 100644
--- a/modules/gnu.c
+++ b/modules/gnu.c
@@ -510,26 +510,41 @@ M4BUILTIN_HANDLER (changesyntax)
       size_t i;
       for (i = 1; i < argc; i++)
        {
-         const char *spec = M4ARG (i);
-         char key = *spec++;
-         char action = key ? *spec : '\0';
+         size_t len = M4ARGLEN (i);
+         const char *spec;
+         char key;
+         char action;
+
+         if (!len)
+           {
+             m4_reset_syntax (M4SYNTAX);
+             continue;
+           }
+         spec = M4ARG (i);
+         key = *spec++;
+         len--;
+         action = len ? *spec : '\0';
          switch (action)
            {
            case '-':
            case '+':
            case '=':
              spec++;
+             len--;
              break;
            case '\0':
-             break;
+             if (!len)
+               break;
+             /* fall through */
            default:
              action = '=';
              break;
            }
-         if (m4_set_syntax (M4SYNTAX, key, action,
-                            key ? m4_expand_ranges (spec, obs) : "") < 0)
-           m4_warn (context, 0, me, _("undefined syntax code: `%c'"),
-                    key);
+         if (len)
+           spec = m4_expand_ranges (spec, &len, m4_arg_scratch (context));
+         if (m4_set_syntax (M4SYNTAX, key, action, spec, len) < 0)
+           m4_warn (context, 0, me, _("undefined syntax code: %s"),
+                    quotearg_style_mem (locale_quoting_style, &key, 1));
        }
     }
   else
diff --git a/modules/m4.c b/modules/m4.c
index 0ee6a68..1857d6f 100644
--- a/modules/m4.c
+++ b/modules/m4.c
@@ -51,7 +51,7 @@ extern void m4_set_sysval    (int);
 extern void m4_sysval_flush  (m4 *, bool);
 extern void m4_dump_symbols  (m4 *, m4_dump_symbol_data *, size_t,
                              m4_macro_args *, bool);
-extern const char *m4_expand_ranges (const char *, m4_obstack *);
+extern const char *m4_expand_ranges (const char *, size_t *, m4_obstack *);
 extern void m4_make_temp     (m4 *, m4_obstack *, const m4_call_info *,
                              const char *, size_t, bool);
 
@@ -633,8 +633,8 @@ M4BUILTIN_HANDLER (shift)
 M4BUILTIN_HANDLER (changequote)
 {
   m4_set_quotes (M4SYNTAX,
-                (argc >= 2) ? M4ARG (1) : NULL,
-                (argc >= 3) ? M4ARG (2) : NULL);
+                (argc >= 2) ? M4ARG (1) : NULL, M4ARGLEN (1),
+                (argc >= 3) ? M4ARG (2) : NULL, M4ARGLEN (2));
 }
 
 /* Change the current comment delimiters.  The function set_comment ()
@@ -642,8 +642,8 @@ M4BUILTIN_HANDLER (changequote)
 M4BUILTIN_HANDLER (changecom)
 {
   m4_set_comment (M4SYNTAX,
-                 (argc >= 2) ? M4ARG (1) : NULL,
-                 (argc >= 3) ? M4ARG (2) : NULL);
+                 (argc >= 2) ? M4ARG (1) : NULL, M4ARGLEN (1),
+                 (argc >= 3) ? M4ARG (2) : NULL, M4ARGLEN (2));
 }
 
 
@@ -951,31 +951,36 @@ M4BUILTIN_HANDLER (substr)
 }
 
 
-/* Ranges are expanded by the following function, and the expanded strings,
-   without any ranges left, are used to translate the characters of the
-   first argument.  A single - (dash) can be included in the strings by
-   being the first or the last character in the string.  If the first
-   character in a range is after the first in the character set, the range
-   is made backwards, thus 9-0 is the string 9876543210.  */
+/* Any ranges in string S of length *LEN are expanded, using OBS for
+   scratch space, and the expansion returned.  *LEN is set to the
+   expanded length.  A single - (dash) can be included in the strings
+   by being the first or the last character in the string.  If the
+   first character in a range is after the first in the character set,
+   the range is made backwards, thus 9-0 is the string 9876543210.  */
 const char *
-m4_expand_ranges (const char *s, m4_obstack *obs)
+m4_expand_ranges (const char *s, size_t *len, m4_obstack *obs)
 {
   unsigned char from;
   unsigned char to;
+  const char *end = s + *len;
 
   assert (obstack_object_size (obs) == 0);
-  for (from = '\0'; *s != '\0'; from = *s++)
+  assert (s != end);
+  from = *s++;
+  obstack_1grow (obs, from);
+
+  for ( ; s != end; from = *s++)
     {
-      if (*s == '-' && from != '\0')
+      if (*s == '-')
        {
-         to = *++s;
-         if (to == '\0')
+         if (++s == end)
            {
              /* trailing dash */
              obstack_1grow (obs, '-');
              break;
            }
-         else if (from <= to)
+         to = *s;
+         if (from <= to)
            {
              while (from++ < to)
                obstack_1grow (obs, from);
@@ -989,8 +994,9 @@ m4_expand_ranges (const char *s, m4_obstack *obs)
       else
        obstack_1grow (obs, *s);
     }
-  obstack_1grow (obs, '\0');
-  return obstack_finish (obs);
+  *len = obstack_object_size (obs);
+  /* FIXME - use obstack_finish once translit is updated.  */
+  return (char *) obstack_copy0 (obs, "", 0);
 }
 
 /* The macro "translit" translates all characters in the first
@@ -1003,6 +1009,8 @@ M4BUILTIN_HANDLER (translit)
   const char *data;
   const char *from;
   const char *to;
+  size_t from_len;
+  size_t to_len;
   char map[UCHAR_MAX + 1] = {0};
   char found[UCHAR_MAX + 1] = {0};
   unsigned char ch;
@@ -1014,16 +1022,18 @@ M4BUILTIN_HANDLER (translit)
     }
 
   from = M4ARG (2);
+  from_len = M4ARGLEN (2);
   if (strchr (from, '-') != NULL)
     {
-      from = m4_expand_ranges (from, m4_arg_scratch (context));
+      from = m4_expand_ranges (from, &from_len, m4_arg_scratch (context));
       assert (from);
     }
 
   to = M4ARG (3);
+  to_len = M4ARGLEN (3);
   if (strchr (to, '-') != NULL)
     {
-      to = m4_expand_ranges (to, m4_arg_scratch (context));
+      to = m4_expand_ranges (to, &to_len, m4_arg_scratch (context));
       assert (to);
     }
 
diff --git a/modules/m4.h b/modules/m4.h
index a82584e..63701ee 100644
--- a/modules/m4.h
+++ b/modules/m4.h
@@ -41,7 +41,8 @@ typedef void m4_set_sysval_func (int value);
 typedef void m4_dump_symbols_func (m4 *context, m4_dump_symbol_data *data,
                                   size_t argc, m4_macro_args *argv,
                                   bool complain);
-typedef const char *m4_expand_ranges_func (const char *s, m4_obstack *obs);
+typedef const char *m4_expand_ranges_func (const char *s, size_t *len,
+                                          m4_obstack *obs);
 typedef void m4_make_temp_func (m4 *context, m4_obstack *obs,
                                const m4_call_info *macro, const char *name,
                                size_t len, bool dir);
diff --git a/src/freeze.c b/src/freeze.c
index 0373459..7261b09 100644
--- a/src/freeze.c
+++ b/src/freeze.c
@@ -588,7 +588,7 @@ reload_frozen_state (m4 *context, const char *name)
        m4__module_open (context, "gnu", NULL);
       /* Disable { and } categories, since ${11} was not supported in
         1.4.x.  */
-      m4_set_syntax (M4SYNTAX, 'O', '+', "{}");
+      m4_set_syntax (M4SYNTAX, 'O', '+', "{}", 2);
       break;
     default:
       if (version > 2)
@@ -771,7 +771,7 @@ ill-formed frozen file, version 2 directive `%c' encountered"), 'S');
             other characters are additive.  */
          if ((m4_set_syntax (M4SYNTAX, syntax,
                              (m4_syntax_code (syntax) & M4_SYNTAX_MASKS
-                              ? '=' : '+'), string[0]) < 0)
+                              ? '=' : '+'), string[0], number[0]) < 0)
              && (syntax != '\0'))
            {
              m4_error (context, 0, 0, NULL,
@@ -843,7 +843,8 @@ ill-formed frozen file, version 2 directive `%c' encountered"), 't');
 
              /* Change comment strings.  */
 
-             m4_set_comment (M4SYNTAX, string[0], string[1]);
+             m4_set_comment (M4SYNTAX, string[0], number[0], string[1],
+                             number[1]);
              break;
 
            case 'D':
@@ -859,7 +860,8 @@ ill-formed frozen file, version 2 directive `%c' encountered"), 't');
 
              /* Change quote strings.  */
 
-             m4_set_quotes (M4SYNTAX, string[0], string[1]);
+             m4_set_quotes (M4SYNTAX, string[0], number[0], string[1],
+                            number[1]);
              break;
 
            default:
diff --git a/tests/freeze.at b/tests/freeze.at
index 6d76d32..a3b4b35 100644
--- a/tests/freeze.at
+++ b/tests/freeze.at
@@ -384,12 +384,13 @@ AT_CLEANUP
 AT_SETUP([reloading nul])
 AT_KEYWORDS([frozen])
 
-dnl AT_DATA can't generate NUL bytes (at least, not in all shells)
-AT_CHECK([printf 'define(-\0-,hi)dnl
+dnl AT_DATA can't generate NUL bytes (at least, not in all shells).
+# Skip the test if printf(1) is insufficient.
+AT_CHECK([printf 'define(-\0-,hi)changequote([,\0])changecom(--\0)dnl
 divert(1)undivert(null.out)' || exit 77],
  [0], [stdout], [ignore])
 mv stdout frozen.m4
-printf 'divert(0)indir(-\0-)\n' > unfrozen.m4
+printf 'divert(0)[divnum\0] @%:@-- indir(-\0-)\n' > unfrozen.m4
 
 # First generate the `expout' output by running over the sources before
 # freezing.
diff --git a/tests/null.err b/tests/null.err
index 8bf1f4f..9a3f322 100644
--- a/tests/null.err
+++ b/tests/null.err
@@ -1,17 +1,22 @@
 builtin:
-m4:null.m4:19: Warning: builtin: undefined builtin `-\0-'
+m4:null.m4:21: Warning: builtin: undefined builtin `-\0-'
+changequote:
+echo:  address@hidden/
+m4trace: -1- dumpdef(echo/) -> /
+changesyntax:
+m4:null.m4:46: Warning: changesyntax: undefined syntax code: `\0'
 defn:
-m4:null.m4:39: Warning: defn: undefined macro `\0-\0'
+m4:null.m4:54: Warning: defn: undefined macro `\0-\0'
 dumpdef:
-m4:null.m4:51: Warning: dumpdef: undefined macro `\0-\0'
+m4:null.m4:66: Warning: dumpdef: undefined macro `\0-\0'
 :      `empty'
 -:     `dash'
---:   `odd name: $1'
---:   `odd name: $1'
+--:   ``$0': $1'
+--:   ``$0': $1'
 --:    `dashes'
 errprint: -- --
 indir:
-m4:null.m4:81: Warning: indir: undefined macro `\0-\0'
-m4:null.m4:83: Warning: \0\0%%: extra arguments ignored: 1 > 0
+m4:null.m4:96: Warning: indir: undefined macro `\0-\0'
+m4:null.m4:98: Warning: \0\0%%: extra arguments ignored: 1 > 0
 traceon:
 m4trace: -1- --(`--') -> `strange: --'
diff --git a/tests/null.m4 b/tests/null.m4
index 55bd3bd..18a5e1d 100644
--- a/tests/null.m4
+++ b/tests/null.m4
@@ -13,26 +13,41 @@ dnl Passed through $1, $*, $@:
 define(`echo', address@hidden')define(`', `empty')dnl
 define(`-', `dash')define(`--', `dashes')dnl
 user: echo(--,`11')
+dnl Macro name of define:
+define(`--', ``$0': $1')dnl
 dnl All macros matching __*__ take no arguments, and never produce NUL.
 dnl First argument of builtin:
 errprint(`builtin:
 ')builtin(`--')dnl
 dnl Remaining arguments of builtin:
 `builtin:' builtin(`len', --)
-dnl Single-byte delimiter in changecom: not tested yet
-dnl Multi-byte delimiter in changecom: not tested yet
-dnl Single-byte delimiter in changequote: not tested yet
-dnl Multi-byte delimiter in changequote: not tested yet
-dnl Quotes in trace and dump output: not tested yet
+dnl Single-byte delimiter in changecom:
+`changecom:' changecom(,/)echo/changecom(`/',`')/echodnl
+dnl Multi-byte delimiter in changecom:
+ changecom(`--', `-')--echo-changecom(`#')
+dnl Single-byte delimiter in changequote:
+`changequote:' changequote(,/)echo/changequote`'dnl
+changequote(`/',`')/echochangequote`'dnl
+dnl Multi-byte delimiter in changequote:
+ changequote(`--', `-')--echo-changequote`'
+dnl Quotes in trace and dump output:
+errprint(`changequote:
+')traceon(`dumpdef')dumpdef(`echo'changequote(,/))changequote`'dnl
+traceoff(`dumpdef')dnl
 dnl Warning from changeresyntax: not tested yet. No resyntax includes NUL, needs to warn
-dnl Macro name in changesyntax: not tested yet
-dnl Escape in changesyntax: not tested yet
+dnl Macro name in changesyntax:
+`changesyntax:' changesyntax(`W+-')-- --(-)`'changesyntax()dnl
+dnl Escape in changesyntax:
+ changesyntax(address@hidden')echo echo --changesyntax(`O=')dnl
+dnl Active in changesyntax:
+ changesyntax(`A')define(`', `nul')`'changesyntax(`A-')undefine(`')
+dnl Warning from changesyntax:
+errprint(`changesyntax:
+')changesyntax()dnl
 dnl Ignored by changesyntax: TODO - support ignored category?
 dnl Warning from debugfile: not tested yet. No file name includes NUL, needs to warn
 dnl Warning from debugmode: not tested yet. NUL not a valid mode, needs to warn
 dnl Warning from decr: not tested yet. NUL not a number, needs to warn
-dnl Macro name of define:
-define(`--', `odd name: $1')dnl
 dnl Definition of define: not tested yet
 dnl Undefined argument of defn:
 errprint(`defn:
diff --git a/tests/null.out b/tests/null.out
index 3a96faa..9e48a6a 100644
--- a/tests/null.out
+++ b/tests/null.out
@@ -4,13 +4,16 @@ quoted: --
 commented: #--
 user: .--.--,11.--,11.
 builtin: 3
-defn: odd name: $1
+changecom: echo//echo --echo-
+changequote: echoecho echo
+changesyntax: -- --: dash echo .... dash- nul
+defn: `$0': $1
 divert: --
 esyscmd: [] 0
 ifdef: yes: -- no: --
 ifelse: yes: --
 index: 2 -1 -1 8
-indir: odd name: 11 0 3
+indir: --: 11 0 3
 len: 1 3
 m4symbols: --
 patsubst: .. -- abc -!- ---
-- 
1.5.5.1
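
Reading aid for the m4_expand_ranges hunk in the patch above: a minimal
standalone sketch of range expansion driven by an explicit length, so
that NUL is treated as an ordinary byte.  This is not the code from the
patch.  The names expand_ranges_sketch and put_byte, and the use of a
caller-supplied fixed buffer in place of an obstack, are assumptions
made for illustration only.  The descending loop is written so that it
also terminates when a range ends at byte 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: append C to OUT if there is room.
   Return 1 on success, 0 if OUT is already full.  */
int
put_byte (char *out, size_t cap, size_t *n, unsigned char c)
{
  if (*n == cap)
    return 0;
  out[(*n)++] = (char) c;
  return 1;
}

/* Expand ranges such as a-c in the LEN bytes at S into OUT, which
   holds CAP bytes.  Return the expanded length, or (size_t) -1 if
   OUT is too small.  No byte, including NUL, ends the input early.  */
size_t
expand_ranges_sketch (const char *s, size_t len, char *out, size_t cap)
{
  const char *end = s + len;
  unsigned char from;
  unsigned char to;
  size_t n = 0;

  assert (len > 0);
  /* The first byte is always literal, even if it is a dash.  */
  from = (unsigned char) *s++;
  if (!put_byte (out, cap, &n, from))
    return (size_t) -1;

  for (; s != end; from = (unsigned char) *s++)
    {
      if (*s != '-')
        {
          if (!put_byte (out, cap, &n, (unsigned char) *s))
            return (size_t) -1;
          continue;
        }
      if (++s == end)
        {
          /* A trailing dash stands for itself.  */
          if (!put_byte (out, cap, &n, '-'))
            return (size_t) -1;
          break;
        }
      to = (unsigned char) *s;
      if (from <= to)
        {
          /* Ascending range: FROM was already emitted.  */
          while (from++ < to)
            if (!put_byte (out, cap, &n, from))
              return (size_t) -1;
        }
      else
        {
          /* Descending range, e.g. 9-0; stops cleanly at byte 0.  */
          while (from-- > to)
            if (!put_byte (out, cap, &n, from))
              return (size_t) -1;
        }
    }
  return n;
}
```

With this sketch, the three bytes "9-0" expand to the ten bytes
9876543210, and the three bytes NUL, dash, \2 expand to the bytes
0, 1, 2.  A strlen-based loop would have stopped before reading the
second of those inputs at all.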

From 10b0347d04e2d00fddfce01d0321d3e75aaf6520 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Wed, 19 Dec 2007 17:16:28 -0700
Subject: [PATCH] Stage 25: Handle embedded NUL in changequote and changecom.

* m4/gnulib-cache.m4: Import obstack-printf-posix module.
* src/m4.h (ntoa): Remove declaration.
(DEBUG_PRINT1, DEBUG_PRINT3, MESSAGE, DEBUG_MESSAGE1)
(DEBUG_MESSAGE2): Delete, now that these macros are unused.
(debug_message_prefix): Rename...
(debug_message): ...and add parameters.
(set_quotes, set_comment): Add parameters.
* src/debug.c (debug_message_prefix): Rename...
(debug_message): ...and use obstack_printf.
(trace_format): Delete.
(trace_header): Adjust caller.
* src/input.c (init_argv_token, input_init): Handle embedded NUL
in comments and quotes.
(match_input, MATCH, set_quotes, set_comment): Add parameter.
(set_quote_age): Adjust heuristic for safe quote.
(push_file, pop_input, next_token, peek_token): Adjust callers.
* src/freeze.c (produce_frozen_state, reload_frozen_state): Handle
embedded NUL in quotes and comments.
* src/builtin.h (ntoa): Make static.
(shipout_int, m4_eval, m4_maketemp): Use obstack_printf.
(m4_dumpdef): Avoid truncating output on embedded NUL.
(m4_changequote, m4_changecom): Handle embedded NUL.
* src/format.c (expand_format): Use obstack_printf.
* src/output.c (m4_tmpname, divert_text): Likewise.
* src/path.c (m4_path_search): Adjust caller.
* doc/m4.texinfo (Using frozen files): Enhance test.
* examples/null.m4: Likewise.
* examples/null.out: Update expected output.
* examples/null.err: Likewise.

(cherry picked from commit 40c640f486bf7a99c6e16d91332f25872f501488)

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog          |   39 ++++++++++++
 doc/m4.texinfo     |    7 +-
 examples/null.err  |   11 ++-
 examples/null.m4   |   18 ++++--
 examples/null.out  |    2 +
 m4/gnulib-cache.m4 |    4 +-
 src/builtin.c      |   64 +++++++++----------
 src/debug.c        |   88 +++++++-------------------
 src/format.c       |   47 ++++----------
 src/freeze.c       |   29 ++++++---
 src/input.c        |  175 ++++++++++++++++++++++++++++++----------------------
 src/m4.h           |   59 +-----------------
 src/output.c       |   23 ++-----
 src/path.c         |    6 +-
 14 files changed, 268 insertions(+), 304 deletions(-)
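
Note on "track quote and comment delimiters by length": the sketch
below shows, under stated assumptions, why the "did the delimiters
actually change?" check has to compare lengths and then use memcmp
once a delimiter may contain NUL.  It is an illustration, not the code
in this patch.  struct delim_pair and delim_pair_unchanged are
hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pair of delimiters, each stored with its length so
   that no NUL terminator is needed.  */
struct delim_pair
{
  const char *str1;
  size_t len1;
  const char *str2;
  size_t len2;
};

/* Return true if the requested delimiters S1/S2 are byte-for-byte
   identical to the current ones.  The lengths are compared first, so
   memcmp never reads past either buffer, and an embedded NUL is
   compared like any other byte.  */
bool
delim_pair_unchanged (const struct delim_pair *cur,
                      const char *s1, size_t len1,
                      const char *s2, size_t len2)
{
  return (cur->len1 == len1 && cur->len2 == len2
          && memcmp (cur->str1, s1, len1) == 0
          && memcmp (cur->str2, s2, len2) == 0);
}
```

A strcmp-based check would treat the three-byte delimiters a NUL b and
a NUL c as equal, because it stops at the first NUL.  The length-aware
version tells them apart.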

diff --git a/ChangeLog b/ChangeLog
index ee7e246..60a5a9e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,42 @@
+2008-06-18  Eric Blake  <address@hidden>
+
+       Stage 25: Handle embedded NUL in changequote and changecom.
+       Track quote and comment delimiters by length, to allow embedded
+       NUL.  Convert macro tracing and other locations to use
+       obstack_printf rather than hand-rolled equivalents.  Ensure that
+       embedded NUL in trace output does not truncate the trace string.
+       Memory impact: none.
+       Speed impact: none noticed.
+       * m4/gnulib-cache.m4: Import obstack-printf-posix module.
+       * src/m4.h (ntoa): Remove declaration.
+       (DEBUG_PRINT1, DEBUG_PRINT3, MESSAGE, DEBUG_MESSAGE1)
+       (DEBUG_MESSAGE2): Delete, now that these macros are unused.
+       (debug_message_prefix): Rename...
+       (debug_message): ...and add parameters.
+       (set_quotes, set_comment): Add parameters.
+       * src/debug.c (debug_message_prefix): Rename...
+       (debug_message): ...and use obstack_printf.
+       (trace_format): Delete.
+       (trace_header): Adjust caller.
+       * src/input.c (init_argv_token, input_init): Handle embedded NUL
+       in comments and quotes.
+       (match_input, MATCH, set_quotes, set_comment): Add parameter.
+       (set_quote_age): Adjust heuristic for safe quote.
+       (push_file, pop_input, next_token, peek_token): Adjust callers.
+       * src/freeze.c (produce_frozen_state, reload_frozen_state): Handle
+       embedded NUL in quotes and comments.
+       * src/builtin.h (ntoa): Make static.
+       (shipout_int, m4_eval, m4_maketemp): Use obstack_printf.
+       (m4_dumpdef): Avoid truncating output on embedded NUL.
+       (m4_changequote, m4_changecom): Handle embedded NUL.
+       * src/format.c (expand_format): Use obstack_printf.
+       * src/output.c (m4_tmpname, divert_text): Likewise.
+       * src/path.c (m4_path_search): Adjust caller.
+       * doc/m4.texinfo (Using frozen files): Enhance test.
+       * examples/null.m4: Likewise.
+       * examples/null.out: Update expected output.
+       * examples/null.err: Likewise.
+
 2008-06-16  Eric Blake  <address@hidden>
 
        Add missing const qualifications.
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index fe429b4..5a645b8 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -7012,12 +7012,13 @@ ifdef(`__unix__', ,
       `errprint(` skipping: syscmd does not have unix semantics
 ')m4exit(`77')')dnl
 changequote(`[', `]')dnl
-syscmd([printf 'define(-\0-,hi)dnl
+syscmd([printf 'define(-\0-,hi)changequote([,\0])changecom(--\0)dnl
 divert(1)undivert(null.out)' | ]__program__[ -F in.m4f \
-     && printf 'errprint(indir(-\0-))' | ]__program__[ -R in.m4f \
+     && printf 'errprint([divnum\0] #-- indir(-\0-))' \
+       | ]__program__[ -R in.m4f \
      && rm in.m4f])errprint([ ]sysval[
 ])dnl
address@hidden 0
address@hidden #-- hi 0
 @end example
 @end ignore
 
diff --git a/examples/null.err b/examples/null.err
index 05a1ba3..5f989ee 100644
--- a/examples/null.err
+++ b/examples/null.err
@@ -1,9 +1,12 @@
 builtin:
 m4:examples/null.m4:19: Warning: builtin: undefined builtin `-\0-'
+changequote:
+echo:  address@hidden/
+m4trace: -1- dumpdef(echo/) -> /
 defn:
-m4:examples/null.m4:37: Warning: defn: undefined macro `\0-\0'
+m4:examples/null.m4:45: Warning: defn: undefined macro `\0-\0'
 dumpdef:
-m4:examples/null.m4:49: Warning: dumpdef: undefined macro `\0-\0'
+m4:examples/null.m4:57: Warning: dumpdef: undefined macro `\0-\0'
 :      `empty'
 -:     `dash'
 --:   `odd name: $1'
@@ -11,7 +14,7 @@ m4:examples/null.m4:49: Warning: dumpdef: undefined macro `\0-\0'
 --:    `dashes'
 errprint: -- --
 indir:
-m4:examples/null.m4:79: Warning: indir: undefined macro `\0-\0'
-m4:examples/null.m4:81: Warning: \0\0%%: extra arguments ignored: 1 > 0
+m4:examples/null.m4:87: Warning: indir: undefined macro `\0-\0'
+m4:examples/null.m4:89: Warning: \0\0%%: extra arguments ignored: 1 > 0
 traceon:
 m4trace: -1- --(`--') -> `strange: --'
diff --git a/examples/null.m4 b/examples/null.m4
index c928360..de76742 100644
--- a/examples/null.m4
+++ b/examples/null.m4
@@ -19,11 +19,19 @@ errprint(`builtin:
 ')builtin(`--')dnl
 dnl Remaining arguments of builtin:
 `builtin:' builtin(`len', --)
-dnl Single-byte delimiter in changecom: not tested yet
-dnl Multi-byte delimiter in changecom: not tested yet
-dnl Single-byte delimiter in changequote: not tested yet
-dnl Multi-byte delimiter in changequote: not tested yet
-dnl Quotes in trace and dump output: not tested yet
+dnl Single-byte delimiter in changecom:
+`changecom:' changecom(,/)echo/changecom(`/',`')/echodnl
+dnl Multi-byte delimiter in changecom:
+ changecom(`--', `-')--echo-changecom(`#')
+dnl Single-byte delimiter in changequote:
+`changequote:' changequote(,/)echo/changequote`'dnl
+changequote(`/',`')/echochangequote`'dnl
+dnl Multi-byte delimiter in changequote:
+ changequote(`--', `-')--echo-changequote`'
+dnl Quotes in trace and dump output:
+errprint(`changequote:
+')traceon(`dumpdef')dumpdef(`echo'changequote(,/))changequote`'dnl
+traceoff(`dumpdef')dnl
 dnl Used in changeword (if changeword available): not tested yet
 dnl Bad regex in changeword: not tested yet
 dnl Warning from debugfile: not tested yet. No file name includes NUL, needs to warn
diff --git a/examples/null.out b/examples/null.out
index 66f41b5..5e90221 100644
--- a/examples/null.out
+++ b/examples/null.out
@@ -4,6 +4,8 @@ quoted: --
 commented: #--
 user: .--.--,11.--,11.
 builtin: 3
+changecom: echo//echo --echo-
+changequote: echoecho echo
 defn: odd name: $1
 divert: --
 esyscmd: [] 0
diff --git a/m4/gnulib-cache.m4 b/m4/gnulib-cache.m4
index 5aa4820..e20ee63 100644
--- a/m4/gnulib-cache.m4
+++ b/m4/gnulib-cache.m4
@@ -15,11 +15,11 @@
 
 
 # Specification in the form of a command-line invocation:
-#   gnulib-tool --import --dir=. --local-dir=local --lib=libm4 --source-base=lib --m4-base=m4 --doc-base=doc --aux-dir=build-aux --with-tests --no-libtool --macro-prefix=M4 announce-gen assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h error fdl fflush flexmember fopen-safer fseeko gendocs getopt git-version-gen gnumakefile gnupload gpl-3.0 intprops memchr2 memmem mkstemp obstack progname quote regex stdbool stdint stdlib-safer strtod strtol unlocked-io vasnprintf-posix verror version-etc version-etc-fsf xalloc xmemdup0 xprintf xvasprintf-posix
+#   gnulib-tool --import --dir=. --local-dir=local --lib=libm4 --source-base=lib --m4-base=m4 --doc-base=doc --aux-dir=build-aux --with-tests --no-libtool --macro-prefix=M4 announce-gen assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h error fdl fflush flexmember fopen-safer fseeko gendocs getopt git-version-gen gnumakefile gnupload gpl-3.0 intprops memchr2 memmem mkstemp obstack obstack-printf-posix progname quote regex stdbool stdint stdlib-safer strtod strtol unlocked-io vasnprintf-posix verror version-etc version-etc-fsf xalloc xmemdup0 xprintf xvasprintf-posix
 
 # Specification in the form of a few gnulib-tool.m4 macro invocations:
 gl_LOCAL_DIR([local])
-gl_MODULES([announce-gen assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h error fdl fflush flexmember fopen-safer fseeko gendocs getopt git-version-gen gnumakefile gnupload gpl-3.0 intprops memchr2 memmem mkstemp obstack progname quote regex stdbool stdint stdlib-safer strtod strtol unlocked-io vasnprintf-posix verror version-etc version-etc-fsf xalloc xmemdup0 xprintf xvasprintf-posix])
+gl_MODULES([announce-gen assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h error fdl fflush flexmember fopen-safer fseeko gendocs getopt git-version-gen gnumakefile gnupload gpl-3.0 intprops memchr2 memmem mkstemp obstack obstack-printf-posix progname quote regex stdbool stdint stdlib-safer strtod strtol unlocked-io vasnprintf-posix verror version-etc version-etc-fsf xalloc xmemdup0 xprintf xvasprintf-posix])
 gl_AVOID([])
 gl_SOURCE_BASE([lib])
 gl_M4_BASE([m4])
diff --git a/src/builtin.c b/src/builtin.c
index e68ea8d..6b107ae 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -587,7 +587,7 @@ numeric_arg (const call_info *name, const char *arg, int *valuep)
 /* Digits for number to ascii conversions.  */
 static char const digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
 
-const char *
+static const char *
 ntoa (int32_t value, int radix)
 {
   bool negative;
@@ -629,10 +629,7 @@ ntoa (int32_t value, int radix)
 static void
 shipout_int (struct obstack *obs, int val)
 {
-  const char *s;
-
-  s = ntoa ((int32_t) val, 10);
-  obstack_grow (obs, s, strlen (s));
+  obstack_printf (obs, "%d", val);
 }
 
 
@@ -908,11 +905,10 @@ m4_dumpdef (struct obstack *obs, int argc, macro_arguments *argv)
        {
        case TOKEN_TEXT:
          if (debug_level & DEBUG_TRACE_QUOTE)
-           DEBUG_PRINT3 ("%s%s%s\n",
-                         curr_quote.str1, SYMBOL_TEXT (data.base[0]),
-                         curr_quote.str2);
-         else
-           DEBUG_PRINT1 ("%s\n", SYMBOL_TEXT (data.base[0]));
+           fwrite (curr_quote.str1, 1, curr_quote.len1, debug);
+         fputs (SYMBOL_TEXT (data.base[0]), debug);
+         if (debug_level & DEBUG_TRACE_QUOTE)
+           fwrite (curr_quote.str2, 1, curr_quote.len2, debug);
          break;
 
        case TOKEN_FUNC:
@@ -922,7 +918,7 @@ m4_dumpdef (struct obstack *obs, int argc, macro_arguments *argv)
              assert (!"m4_dumpdef");
              abort ();
            }
-         DEBUG_PRINT1 ("<%s>\n", bp->name);
+         xfprintf (debug, "<%s>", bp->name);
          break;
 
        default:
@@ -930,6 +926,7 @@ m4_dumpdef (struct obstack *obs, int argc, macro_arguments *argv)
          abort ();
          break;
        }
+      fputc ('\n', debug);
     }
 }
 
@@ -1211,11 +1208,14 @@ m4_eval (struct obstack *obs, int argc, macro_arguments *argv)
          obstack_1grow (obs, '-');
          value = -value;
        }
-      /* This assumes 2's-complement for correctly handling INT_MIN.  */
-      while (min-- - value > 0)
-       obstack_1grow (obs, '0');
-      while (value-- != 0)
-       obstack_1grow (obs, '1');
+      if ((uint32_t) value < min)
+       {
+         obstack_blank (obs, min - value);
+         memset ((char *) obstack_next_free (obs) - (min - value), '0',
+                 min - value);
+       }
+      obstack_blank (obs, value);
+      memset ((char *) obstack_next_free (obs) - value, '1', value);
       return;
     }
 
@@ -1227,10 +1227,9 @@ m4_eval (struct obstack *obs, int argc, macro_arguments *argv)
       s++;
     }
   len = strlen (s);
-  for (min -= len; --min >= 0;)
-    obstack_1grow (obs, '0');
-
-  obstack_grow (obs, s, len);
+  if (min < len)
+    min = len;
+  obstack_printf (obs, "%.*d%s", min - len, 0, s);
 }
 
 static void
@@ -1378,8 +1377,8 @@ m4_changequote (struct obstack *obs, int argc, macro_arguments *argv)
   bad_argc (arg_info (argv), argc, 0, 2);
 
   /* Explicit NULL distinguishes between empty and missing argument.  */
-  set_quotes ((argc >= 2) ? ARG (1) : NULL,
-             (argc >= 3) ? ARG (2) : NULL);
+  set_quotes ((argc >= 2) ? ARG (1) : NULL, ARG_LEN (1),
+             (argc >= 3) ? ARG (2) : NULL, ARG_LEN (2));
 }
 
 /*--------------------------------------------------------------------.
@@ -1393,8 +1392,8 @@ m4_changecom (struct obstack *obs, int argc, macro_arguments *argv)
   bad_argc (arg_info (argv), argc, 0, 2);
 
   /* Explicit NULL distinguishes between empty and missing argument.  */
-  set_comment ((argc >= 2) ? ARG (1) : NULL,
-              (argc >= 3) ? ARG (2) : NULL);
+  set_comment ((argc >= 2) ? ARG (1) : NULL, ARG_LEN (1),
+              (argc >= 3) ? ARG (2) : NULL, ARG_LEN (2));
 }
 
 #ifdef ENABLE_CHANGEWORD
@@ -1535,23 +1534,20 @@ m4_maketemp (struct obstack *obs, int argc, macro_arguments *argv)
       const char *str = ARG (1);
       size_t len = ARG_LEN (1);
       size_t i;
-      size_t len2;
+      struct obstack *scratch = arg_scratch ();
+      size_t pid_len = obstack_printf (scratch, "%lu",
+                                      (unsigned long) getpid ());
+      char *pid = (char *) obstack_copy0 (scratch, "", 0);
 
       m4_warn (0, me, _("recommend using mkstemp instead"));
       for (i = len; i > 1; i--)
        if (str[i - 1] != 'X')
          break;
       obstack_grow (obs, str, i);
-      str = ntoa ((int32_t) getpid (), 10);
-      len2 = strlen (str);
-      if (len2 > len - i)
-       obstack_grow (obs, str + len2 - (len - i), len - i);
+      if (len - i < pid_len)
+       obstack_grow (obs, pid + pid_len - (len - i), len - i);
       else
-       {
-         while (i++ < len - len2)
-           obstack_1grow (obs, '0');
-         obstack_grow (obs, str, len2);
-       }
+       obstack_printf (obs, "%.*d%s", (int) (len - i - pid_len), 0, pid);
     }
   else
     mkstemp_helper (obs, me, ARG (1), ARG_LEN (1));
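
The "%.*d%s" padding idiom used in m4_eval and m4_maketemp above relies on a
standard printf rule: converting the value 0 with precision P yields exactly P
zero digits, and precision 0 yields no characters at all.  A minimal sketch of
that rule, using plain snprintf instead of obstack_printf so it stands alone;
the helper name zero_pad is hypothetical and not part of m4:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of m4: format DIGITS into BUF (capacity
   CAP), left-padded with '0' to at least MIN_WIDTH characters.  Returns
   the value returned by snprintf.  */
static int
zero_pad (char *buf, size_t cap, const char *digits, int min_width)
{
  int len = (int) strlen (digits);

  assert (cap > 0);
  if (min_width < len)
    min_width = len;
  /* The '*' argument must be an int.  With value 0, "%.*d" emits exactly
     (min_width - len) zeros, or nothing when that precision is 0.  */
  return snprintf (buf, cap, "%.*d%s", min_width - len, 0, digits);
}
```

Called as zero_pad (buf, sizeof buf, "42", 5), this should leave "00042" in
buf; when the digits are already wider than the requested minimum, they are
copied unchanged.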
diff --git a/src/debug.c b/src/debug.c
index 2b2388f..c3f85bd 100644
--- a/src/debug.c
+++ b/src/debug.c
@@ -215,17 +215,25 @@ debug_set_output (const call_info *caller, const char *name)
 `-----------------------------------------------------------------------*/
 
 void
-debug_message_prefix (void)
+debug_message (const char *format, ...)
 {
-  xfprintf (debug, "m4debug:");
-  if (current_line)
-  {
-    if (debug_level & DEBUG_TRACE_FILE)
-      xfprintf (debug, "%s:", current_file);
-    if (debug_level & DEBUG_TRACE_LINE)
-      xfprintf (debug, "%d:", current_line);
-  }
-  putc (' ', debug);
+  va_list args;
+  if (debug)
+    {
+      xfprintf (debug, "m4debug:");
+      if (current_line)
+       {
+         if (debug_level & DEBUG_TRACE_FILE)
+           xfprintf (debug, "%s:", current_file);
+         if (debug_level & DEBUG_TRACE_LINE)
+           xfprintf (debug, "%d:", current_line);
+       }
+      putc (' ', debug);
+      va_start (args, format);
+      xvfprintf (debug, format, args);
+      va_end (args);
+      putc ('\n', debug);
+    }
 }
 
 /* The rest of this file contains the functions for macro tracing output.
@@ -234,55 +242,6 @@ debug_message_prefix (void)
    output from interfering with other debug messages generated by the
    various builtins.  */
 
-/*-------------------------------------------------------------------.
-| Tracing output to the obstack is formatted here, by a simplified   |
-| printf-like function trace_format ().  Understands only %s (1 arg: |
-| text), %d (1 arg: integer).                                        |
-`-------------------------------------------------------------------*/
-
-static void
-trace_format (const char *fmt, ...)
-{
-  va_list args;
-  char ch;
-  int d;
-  const char *s;
-  size_t maxlen;
-
-  va_start (args, fmt);
-
-  while (true)
-    {
-      while ((ch = *fmt++) != '\0' && ch != '%')
-       obstack_1grow (&trace, ch);
-
-      if (ch == '\0')
-       break;
-
-      maxlen = SIZE_MAX;
-      switch (*fmt++)
-       {
-       case 's':
-         s = va_arg (args, const char *);
-         break;
-
-       case 'd':
-         d = va_arg (args, int);
-         s = ntoa (d, 10);
-         break;
-
-       default:
-         s = "";
-         break;
-       }
-
-      if (shipout_string_trunc (&trace, s, SIZE_MAX, &maxlen))
-       break;
-    }
-
-  va_end (args);
-}
-
 /*------------------------------------------------------------------.
 | Format the standard header attached to all tracing output lines,  |
 | using the context in INFO as appropriate.  Return the offset into |
@@ -294,14 +253,15 @@ trace_header (const call_info *info)
 {
   int trace_level = info->debug_level;
   unsigned int result = obstack_object_size (&trace);
-  trace_format ("m4trace:");
+
+  obstack_grow (&trace, "m4trace:", 8);
   if (trace_level & DEBUG_TRACE_FILE)
-    trace_format ("%s:", info->file);
+    obstack_printf (&trace, "%s:", info->file);
   if (trace_level & DEBUG_TRACE_LINE)
-    trace_format ("%d:", info->line);
-  trace_format (" -%d- ", expansion_level);
+    obstack_printf (&trace, "%d:", info->line);
+  obstack_printf (&trace, " -%d- ", expansion_level);
   if (trace_level & DEBUG_TRACE_CALLID)
-    trace_format ("id %d: ", info->call_id);
+    obstack_printf (&trace, "id %d: ", info->call_id);
   return result;
 }
 
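
The debug.c change above folds the DEBUG_MESSAGE, DEBUG_MESSAGE1 and
DEBUG_MESSAGE2 macros into one variadic function.  A rough standalone sketch
of that shape follows, assuming a fixed "m4debug: " prefix and an explicit
stream parameter; the name prefixed_message is made up for illustration, and
the real function also prints file and line information and uses the xprintf
wrappers:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for debug_message: print a fixed prefix, the
   formatted body, and a newline to OUT.  Does nothing if OUT is NULL,
   mirroring the "if (debug)" guard in the patch.  */
static void
prefixed_message (FILE *out, const char *format, ...)
{
  va_list args;

  if (!out)
    return;
  fputs ("m4debug: ", out);
  va_start (args, format);
  vfprintf (out, format, args);
  va_end (args);
  putc ('\n', out);
}
```

One function taking a format string replaces one macro per argument count,
which is why the callers in input.c no longer need numbered variants.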
diff --git a/src/format.c b/src/format.c
index c783d11..3325853 100644
--- a/src/format.c
+++ b/src/format.c
@@ -156,9 +156,7 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
   char ok[128];
 
   /* Buffer and stuff.  */
-  char *base;                  /* Current position in obs.  */
-  size_t len;                  /* Length of formatted text.  */
-  char *str;                   /* Malloc'd buffer of formatted text.  */
+  int result = 0;
   enum {CHAR, INT, LONG, DOUBLE, STR} datatype;
 
   f = fmt = ARG_STR (i, argc, argv);
@@ -352,56 +350,39 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
        }
       *p++ = c;
       *p = '\0';
-      base = obstack_next_free (obs);
-      len = obstack_room (obs);
 
       switch (datatype)
        {
        case CHAR:
-         str = asnprintf (base, &len, fstart, width,
-                          ARG_INT (i, argc, argv));
+         result = obstack_printf (obs, fstart, width,
+                                  ARG_INT (i, argc, argv));
          break;
 
        case INT:
-         str = asnprintf (base, &len, fstart, width, prec,
-                          ARG_INT (i, argc, argv));
+         result = obstack_printf (obs, fstart, width, prec,
+                                  ARG_INT (i, argc, argv));
          break;
 
        case LONG:
-         str = asnprintf (base, &len, fstart, width, prec,
-                          ARG_LONG (i, argc, argv));
+         result = obstack_printf (obs, fstart, width, prec,
+                                  ARG_LONG (i, argc, argv));
          break;
 
        case DOUBLE:
-         str = asnprintf (base, &len, fstart, width, prec,
-                          ARG_DOUBLE (i, argc, argv));
+         result = obstack_printf (obs, fstart, width, prec,
+                                  ARG_DOUBLE (i, argc, argv));
          break;
 
        case STR:
-         str = asnprintf (base, &len, fstart, width, prec,
-                          ARG_STR (i, argc, argv));
+         result = obstack_printf (obs, fstart, width, prec,
+                                  ARG_STR (i, argc, argv));
          break;
 
        default:
          abort ();
        }
-
-      if (str == NULL)
-       /* NULL is unexpected (EILSEQ and EINVAL are not possible
-          based on our construction of fstart, leaving only ENOMEM,
-          which should always be fatal).  */
-       m4_error (EXIT_FAILURE, errno, me,
-                 _("unable to format output for `%s'"), f);
-      else if (str == base)
-       /* The output was already computed in place, but we need to
-          account for its size.  */
-       obstack_blank_fast (obs, len);
-      else
-       {
-         /* The output exceeded available obstack space, copy the
-            allocated string.  */
-         obstack_grow (obs, str, len);
-         free (str);
-       }
+      /* Since obstack_printf can only fail with EILSEQ or EINVAL, but
+        we constructed fstart, the result should not be negative.  */
+      assert (0 <= result);
     }
 }
diff --git a/src/freeze.c b/src/freeze.c
index e67bcc8..5e35c81 100644
--- a/src/freeze.c
+++ b/src/freeze.c
@@ -71,16 +71,27 @@ produce_frozen_state (const char *name)
 
   /* Dump quote delimiters.  */
 
-  if (strcmp (curr_quote.str1, DEF_LQUOTE)
-      || strcmp (curr_quote.str2, DEF_RQUOTE))
-    xfprintf (file, "Q%d,%d\n%s%s\n", (int) curr_quote.len1,
-             (int) curr_quote.len2, curr_quote.str1, curr_quote.str2);
+  if (curr_quote.len1 != 1 || curr_quote.len2 != 1
+      || *curr_quote.str1 != *DEF_LQUOTE || *curr_quote.str2 != *DEF_RQUOTE)
+    {
+      xfprintf (file, "Q%d,%d\n", (int) curr_quote.len1,
+               (int) curr_quote.len2);
+      fwrite (curr_quote.str1, 1, curr_quote.len1, file);
+      fwrite (curr_quote.str2, 1, curr_quote.len2, file);
+      fputc ('\n', file);
+    }
 
   /* Dump comment delimiters.  */
 
-  if (strcmp (curr_comm.str1, DEF_BCOMM) || strcmp (curr_comm.str2, DEF_ECOMM))
-    xfprintf (file, "C%d,%d\n%s%s\n", (int) curr_comm.len1,
-             (int) curr_comm.len2, curr_comm.str1, curr_comm.str2);
+  if (curr_comm.len1 != 1 || curr_comm.len2 != 1
+      || *curr_comm.str1 != *DEF_BCOMM || *curr_comm.str2 != *DEF_ECOMM)
+    {
+      xfprintf (file, "C%d,%d\n", (int) curr_comm.len1,
+               (int) curr_comm.len2);
+      fwrite (curr_comm.str1, 1, curr_comm.len1, file);
+      fwrite (curr_comm.str2, 1, curr_comm.len2, file);
+      fputc ('\n', file);
+    }
 
   /* Dump all symbols.  */
 
@@ -329,7 +340,7 @@ reload_frozen_state (const char *name)
 
              /* Change comment strings.  */
 
-             set_comment (string[0], string[1]);
+             set_comment (string[0], number[0], string[1], number[1]);
              break;
 
            case 'D':
@@ -361,7 +372,7 @@ reload_frozen_state (const char *name)
 
              /* Change quote strings.  */
 
-             set_quotes (string[0], string[1]);
+             set_quotes (string[0], number[0], string[1], number[1]);
              break;
 
            default:
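
The freeze.c hunks above stop printing delimiters with "%s", which would cut
the output at the first NUL byte, and write them with fwrite and an explicit
length instead.  A small sketch of the same record layout, under the
assumption that only the "Q" line matters; write_quote_record is a
hypothetical name, not a function in m4:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper modeled on the "Q" record written by
   produce_frozen_state: emits "Q<len1>,<len2>\n<lq><rq>\n", copying the
   delimiter bytes by length so an embedded NUL survives.  Returns 0 on
   success, -1 on a write error.  */
static int
write_quote_record (FILE *file, const char *lq, size_t lq_len,
                    const char *rq, size_t rq_len)
{
  if (fprintf (file, "Q%d,%d\n", (int) lq_len, (int) rq_len) < 0)
    return -1;
  if (fwrite (lq, 1, lq_len, file) != lq_len)
    return -1;
  if (fwrite (rq, 1, rq_len, file) != rq_len)
    return -1;
  return fputc ('\n', file) == EOF ? -1 : 0;
}
```

Because the lengths are stored in the record header, a reader can recover
both delimiters without scanning for a terminator, which is what the
set_quotes (string[0], number[0], string[1], number[1]) call relies on.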
diff --git a/src/input.c b/src/input.c
index acbc370..589fbb1 100644
--- a/src/input.c
+++ b/src/input.c
@@ -252,7 +252,7 @@ push_file (FILE *fp, const char *title, bool close_when_done)
     }
 
   if (debug_level & DEBUG_TRACE_INPUT)
-    DEBUG_MESSAGE1 ("input read from %s", title);
+    debug_message ("input read from %s", title);
 
   i = (input_block *) obstack_alloc (current_input, sizeof *i);
   i->type = INPUT_FILE;
@@ -653,10 +653,10 @@ pop_input (bool cleanup)
       if (debug_level & DEBUG_TRACE_INPUT)
        {
          if (tmp != &input_eof)
-           DEBUG_MESSAGE2 ("input reverted to %s, line %d",
-                           tmp->file, tmp->line);
+           debug_message ("input reverted to %s, line %d",
+                          tmp->file, tmp->line);
          else
-           DEBUG_MESSAGE ("input exhausted");
+           debug_message ("input exhausted");
        }
 
       if (ferror (isp->u.u_f.fp))
@@ -1182,8 +1182,8 @@ init_argv_token (struct obstack *obs, token_data *td)
      last element of the $@ ref is reparsed, we must increase the argv
      refcount here, to compensate for the fact that it will be
      decreased once the final element is parsed.  */
-  assert (*curr_comm.str1 != ',' && *curr_comm.str1 != ')'
-         && *curr_comm.str1 != *curr_quote.str1);
+  assert (!curr_comm.len1 || (*curr_comm.str1 != ',' && *curr_comm.str1 != ')'
+                             && *curr_comm.str1 != *curr_quote.str1));
   ch = peek_input (true);
   if (ch != ',' && ch != ')')
     {
@@ -1198,25 +1198,26 @@ init_argv_token (struct obstack *obs, token_data *td)
 
 /*------------------------------------------------------------------.
 | This function is for matching a string against a prefix of the    |
-| input stream.  If the string S matches the input and CONSUME is   |
-| true, the input is discarded; otherwise any characters read are   |
-| pushed back again.  The function is used only when multicharacter |
-| quotes or comment delimiters are used.                            |
+| input stream.  If the string S of length SLEN matches the input   |
+| and CONSUME is true, the input is discarded; otherwise any        |
+| characters read are pushed back again.  The function is used only |
+| when multicharacter quotes or comment delimiters are used.        |
 `------------------------------------------------------------------*/
 
 static bool
-match_input (const char *s, bool consume)
+match_input (const char *s, size_t slen, bool consume)
 {
   int n;                       /* number of characters matched */
   int ch;                      /* input character */
   const char *t;
   bool result = false;
 
+  assert (slen);
   ch = peek_input (false);
   if (ch != to_uchar (*s))
     return false;                      /* fail */
 
-  if (s[1] == '\0')
+  if (slen == 1)
     {
       if (consume)
        next_char (false, false);
@@ -1228,7 +1229,7 @@ match_input (const char *s, bool consume)
     {
       next_char (false, false);
       n++;
-      if (*s == '\0')          /* long match */
+      if (--slen == 1)         /* long match */
        {
          if (consume)
            return true;
@@ -1244,20 +1245,21 @@ match_input (const char *s, bool consume)
   return result;
 }
 
-/*--------------------------------------------------------------------.
-| The macro MATCH() is used to match a string S against the input.    |
-| The first character is handled inline, for speed.  Hopefully, this  |
-| will not hurt efficiency too much when single character quotes and  |
-| comment delimiters are used.  If CONSUME, then CH is the result of  |
-| next_char, and a successful match will discard the matched string.  |
-| Otherwise, CH is the result of peek_input, and the input stream is  |
-| effectively unchanged.                                              |
-`--------------------------------------------------------------------*/
+/*---------------------------------------------------------------.
+| The macro MATCH() is used to match a string S of length SLEN   |
+| against the input.  The first character is handled inline, for |
+| speed.  Hopefully, this will not hurt efficiency too much when |
+| single character quotes and comment delimiters are used.  If   |
+| CONSUME, then CH is the result of next_char, and a successful  |
+| match will discard the matched string.  Otherwise, CH is the   |
+| result of peek_input, and the input stream is effectively      |
+| unchanged.                                                     |
+`---------------------------------------------------------------*/
 
-#define MATCH(ch, s, consume)                                          \
-  (to_uchar ((s)[0]) == (ch)                                           \
-   && (ch) != '\0'                                                     \
-   && ((s)[1] == '\0' || (match_input ((s) + (consume), consume))))
+#define MATCH(ch, s, slen, consume)                                    \
+  ((slen) && to_uchar ((s)[0]) == (ch)                                 \
+   && ((slen) == 1                                                     \
+       || (match_input ((s) + (consume), (slen) - (consume), consume))))
 
 
 /*----------------------------------------------------------.
@@ -1289,14 +1291,14 @@ input_init (void)
 
   start_of_input_line = false;
 
-  curr_quote.str1 = xstrdup (DEF_LQUOTE);
-  curr_quote.len1 = strlen (curr_quote.str1);
-  curr_quote.str2 = xstrdup (DEF_RQUOTE);
-  curr_quote.len2 = strlen (curr_quote.str2);
-  curr_comm.str1 = xstrdup (DEF_BCOMM);
-  curr_comm.len1 = strlen (curr_comm.str1);
-  curr_comm.str2 = xstrdup (DEF_ECOMM);
-  curr_comm.len2 = strlen (curr_comm.str2);
+  curr_quote.str1 = xmemdup (DEF_LQUOTE, 1);
+  curr_quote.len1 = 1;
+  curr_quote.str2 = xmemdup (DEF_RQUOTE, 1);
+  curr_quote.len2 = 1;
+  curr_comm.str1 = xmemdup (DEF_BCOMM, 1);
+  curr_comm.len1 = 1;
+  curr_comm.str2 = xmemdup (DEF_ECOMM, 1);
+  curr_comm.len2 = 1;
 
 #ifdef ENABLE_CHANGEWORD
   set_word_regexp (NULL, user_word_regexp);
@@ -1306,14 +1308,15 @@ input_init (void)
 }
 
 
-/*--------------------------------------------------------------------.
-| Set the quote delimiters to LQ and RQ.  Used by m4_changequote ().  |
-| Pass NULL if the argument was not present, to distinguish from an   |
-| explicit empty string.                                              |
-`--------------------------------------------------------------------*/
+/*-----------------------------------------------------------------.
+| Set the quote delimiters to LQ and RQ, with respective lengths   |
+| LQ_LEN and RQ_LEN.  Used by m4_changequote ().  Pass NULL if the |
+| argument was not present, to distinguish from an explicit empty  |
+| string.                                                          |
+`-----------------------------------------------------------------*/
 
 void
-set_quotes (const char *lq, const char *rq)
+set_quotes (const char *lq, size_t lq_len, const char *rq, size_t rq_len)
 {
   /* POSIX states that with 0 arguments, the default quotes are used.
      POSIX XCU ERN 112 states that behavior is implementation-defined
@@ -1325,31 +1328,39 @@ set_quotes (const char *lq, const char *rq)
   if (!lq)
     {
       lq = DEF_LQUOTE;
+      lq_len = 1;
       rq = DEF_RQUOTE;
+      rq_len = 1;
+    }
+  else if (!rq || (lq_len && !rq_len))
+    {
+      rq = DEF_RQUOTE;
+      rq_len = 1;
     }
-  else if (!rq || (*lq && !*rq))
-    rq = DEF_RQUOTE;
 
-  if (strcmp (curr_quote.str1, lq) == 0 && strcmp (curr_quote.str2, rq) == 0)
+  if (curr_quote.len1 == lq_len && curr_quote.len2 == rq_len
+      && memcmp (curr_quote.str1, lq, lq_len) == 0
+      && memcmp (curr_quote.str2, rq, rq_len) == 0)
     return;
 
   free (curr_quote.str1);
   free (curr_quote.str2);
-  curr_quote.str1 = xstrdup (lq);
-  curr_quote.len1 = strlen (curr_quote.str1);
-  curr_quote.str2 = xstrdup (rq);
-  curr_quote.len2 = strlen (curr_quote.str2);
+  curr_quote.str1 = xmemdup (lq, lq_len);
+  curr_quote.len1 = lq_len;
+  curr_quote.str2 = xmemdup (rq, rq_len);
+  curr_quote.len2 = rq_len;
   set_quote_age ();
 }
 
-/*--------------------------------------------------------------------.
-| Set the comment delimiters to BC and EC.  Used by m4_changecom ().  |
-| Pass NULL if the argument was not present, to distinguish from an   |
-| explicit empty string.                                              |
-`--------------------------------------------------------------------*/
+/*-----------------------------------------------------------------.
+| Set the comment delimiters to BC and EC, with respective lengths |
+| BC_LEN and EC_LEN.  Used by m4_changecom ().  Pass NULL if the   |
+| argument was not present, to distinguish from an explicit empty  |
+| string.                                                          |
+`-----------------------------------------------------------------*/
 
 void
-set_comment (const char *bc, const char *ec)
+set_comment (const char *bc, size_t bc_len, const char *ec, size_t ec_len)
 {
   /* POSIX requires no arguments to disable comments.  It requires
      empty arguments to be used as-is, but this is counter to
@@ -1359,19 +1370,27 @@ set_comment (const char *bc, const char *ec)
      This implementation assumes the aardvark will be approved.  See
      the texinfo for what some other implementations do.  */
   if (!bc)
-    bc = ec = "";
-  else if (!ec || (*bc && !*ec))
-    ec = DEF_ECOMM;
+    {
+      bc = ec = "";
+      bc_len = ec_len = 0;
+    }
+  else if (!ec || (bc_len && !ec_len))
+    {
+      ec = DEF_ECOMM;
+      ec_len = 1;
+    }
 
-  if (strcmp (curr_comm.str1, bc) == 0 && strcmp (curr_comm.str2, ec) == 0)
+  if (curr_comm.len1 == bc_len && curr_comm.len2 == ec_len
+      && memcmp (curr_comm.str1, bc, bc_len) == 0
+      && memcmp (curr_comm.str2, ec, ec_len) == 0)
     return;
 
   free (curr_comm.str1);
   free (curr_comm.str2);
-  curr_comm.str1 = xstrdup (bc);
-  curr_comm.len1 = strlen (curr_comm.str1);
-  curr_comm.str2 = xstrdup (ec);
-  curr_comm.len2 = strlen (curr_comm.str2);
+  curr_comm.str1 = xmemdup (bc, bc_len);
+  curr_comm.len1 = bc_len;
+  curr_comm.str2 = xmemdup (ec, ec_len);
+  curr_comm.len2 = ec_len;
   set_quote_age ();
 }
 
@@ -1459,18 +1478,26 @@ set_quote_age (void)
    quote_age to zero, but at least a quote_age of zero always produces
    correct results (although it may take more time in doing so).  */
 
-  /* Hueristic of characters that might impact rescan if they appear in
-     a quote delimiter.  */
+  /* Heuristic of characters that might impact rescan if they appear
+     in a quote delimiter.  Using a single NUL as one of the two quote
+     delimiters is safe, but strchr matches it, so we must special
+     case the strchr below.  If we were willing to guarantee a
+     trailing NUL, we could use strpbrk(quote, unsafe) rather than
+     strchr(unsafe, *quote) and avoid the special case; on the other
+     hand, many strpbrk implementations are not as efficient as
+     strchr, and we save memory by avoiding the trailing NUL.  */
 #define Letters "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
   static const char unsafe[] = Letters "_0123456789(,) \t\n\r\f\v";
 #undef Letters
 
   if (curr_quote.len1 == 1 && curr_quote.len2 == 1
-      && strpbrk (curr_quote.str1, unsafe) == NULL
-      && strpbrk (curr_quote.str2, unsafe) == NULL
+      && (!*curr_quote.str1 || strchr (unsafe, *curr_quote.str1) == NULL)
+      && (!*curr_quote.str2 || strchr (unsafe, *curr_quote.str2) == NULL)
       && default_word_regexp && *curr_quote.str1 != *curr_quote.str2
-      && *curr_comm.str1 != '(' && *curr_comm.str1 != ','
-      && *curr_comm.str1 != ')' && *curr_comm.str1 != *curr_quote.str1)
+      && (!curr_comm.len1
+         || (*curr_comm.str1 != '(' && *curr_comm.str1 != ','
+             && *curr_comm.str1 != ')'
+             && *curr_comm.str1 != *curr_quote.str1)))
     current_quote_age = (((*curr_quote.str1 & 0xff) << 8)
                         | (*curr_quote.str2 & 0xff));
   else
@@ -1625,7 +1652,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
       return TOKEN_ARGV;
     }
 
-  if (MATCH (ch, curr_comm.str1, true))
+  if (MATCH (ch, curr_comm.str1, curr_comm.len1, true))
     {
       if (obs)
        obs_td = obs;
@@ -1650,7 +1677,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
              init_macro_token (obs, obs ? td : NULL);
              continue;
            }
-         if (MATCH (ch, curr_comm.str2, true))
+         if (MATCH (ch, curr_comm.str2, curr_comm.len2, true))
            {
              obstack_grow (obs_td, curr_comm.str2, curr_comm.len2);
              break;
@@ -1709,7 +1736,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
 
 #endif /* ENABLE_CHANGEWORD */
 
-  else if (!MATCH (ch, curr_quote.str1, true))
+  else if (!MATCH (ch, curr_quote.str1, curr_quote.len1, true))
     {
       assert (ch < CHAR_EOF);
       switch (ch)
@@ -1754,13 +1781,13 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
            init_macro_token (obs, obs ? td : NULL);
          else if (ch == CHAR_QUOTE)
            append_quote_token (obs, td);
-         else if (MATCH (ch, curr_quote.str2, true))
+         else if (MATCH (ch, curr_quote.str2, curr_quote.len2, true))
            {
              if (--quote_level == 0)
                break;
              obstack_grow (obs_td, curr_quote.str2, curr_quote.len2);
            }
-         else if (MATCH (ch, curr_quote.str1, true))
+         else if (MATCH (ch, curr_quote.str1, curr_quote.len1, true))
            {
              quote_level++;
              obstack_grow (obs_td, curr_quote.str1, curr_quote.len1);
@@ -1856,7 +1883,7 @@ peek_token (void)
     {
       result = TOKEN_MACDEF;
     }
-  else if (MATCH (ch, curr_comm.str1, false))
+  else if (MATCH (ch, curr_comm.str1, curr_comm.len1, false))
     {
       result = TOKEN_STRING;
     }
@@ -1868,7 +1895,7 @@ peek_token (void)
     {
       result = TOKEN_WORD;
     }
-  else if (MATCH (ch, curr_quote.str1, false))
+  else if (MATCH (ch, curr_quote.str1, curr_quote.len1, false))
     {
       result = TOKEN_STRING;
     }
diff --git a/src/m4.h b/src/m4.h
index d16d87a..3afe476 100644
--- a/src/m4.h
+++ b/src/m4.h
@@ -199,63 +199,11 @@ extern FILE *debug;
 /* default flags -- equiv: aeq */
 #define DEBUG_TRACE_DEFAULT 0x007
 
-#define DEBUG_PRINT1(Fmt, Arg1)                                        \
-  do                                                           \
-    {                                                          \
-      if (debug != NULL)                                       \
-       xfprintf (debug, Fmt, Arg1);                            \
-    }                                                          \
-  while (0)
-
-#define DEBUG_PRINT3(Fmt, Arg1, Arg2, Arg3)                    \
-  do                                                           \
-    {                                                          \
-      if (debug != NULL)                                       \
-       xfprintf (debug, Fmt, Arg1, Arg2, Arg3);                \
-    }                                                          \
-  while (0)
-
-#define DEBUG_MESSAGE(Fmt)                                     \
-  do                                                           \
-    {                                                          \
-      if (debug != NULL)                                       \
-       {                                                       \
-         debug_message_prefix ();                              \
-         xfprintf (debug, Fmt);                                \
-         putc ('\n', debug);                                   \
-       }                                                       \
-    }                                                          \
-  while (0)
-
-#define DEBUG_MESSAGE1(Fmt, Arg1)                              \
-  do                                                           \
-    {                                                          \
-      if (debug != NULL)                                       \
-       {                                                       \
-         debug_message_prefix ();                              \
-         xfprintf (debug, Fmt, Arg1);                          \
-         putc ('\n', debug);                                   \
-       }                                                       \
-    }                                                          \
-  while (0)
-
-#define DEBUG_MESSAGE2(Fmt, Arg1, Arg2)                                \
-  do                                                           \
-    {                                                          \
-      if (debug != NULL)                                       \
-       {                                                       \
-         debug_message_prefix ();                              \
-         xfprintf (debug, Fmt, Arg1, Arg2);                    \
-         putc ('\n', debug);                                   \
-       }                                                       \
-    }                                                          \
-  while (0)
-
 void debug_init (void);
 int debug_decode (const char *);
 void debug_flush_files (void);
 bool debug_set_output (const call_info *, const char *);
-void debug_message_prefix (void);
+void debug_message (const char *, ...) M4_GNUC_PRINTF (1, 2);
 
 void trace_prepre (const call_info *);
 unsigned int trace_pre (macro_arguments *);
@@ -430,8 +378,8 @@ extern string_pair curr_quote;
 #define DEF_BCOMM "#"
 #define DEF_ECOMM "\n"
 
-void set_quotes (const char *, const char *);
-void set_comment (const char *, const char *);
+void set_quotes (const char *, size_t, const char *, size_t);
+void set_comment (const char *, size_t, const char *, size_t);
 #ifdef ENABLE_CHANGEWORD
 void set_word_regexp (const call_info *, const char *);
 #endif
@@ -584,7 +532,6 @@ void undivert_all (void);
 void expand_user_macro (struct obstack *, symbol *, int, macro_arguments *);
 void m4_placeholder (struct obstack *, int, macro_arguments *);
 void init_pattern_buffer (struct re_pattern_buffer *, struct re_registers *);
-const char *ntoa (int32_t, int);
 
 const builtin *find_builtin_by_addr (builtin_func *);
 const builtin *find_builtin_by_name (const char *);
diff --git a/src/output.c b/src/output.c
index ee1907b..6d74ecd 100644
--- a/src/output.c
+++ b/src/output.c
@@ -191,12 +191,7 @@ m4_tmpname (int divnum)
   static size_t offset;
   if (buffer == NULL)
     {
-      obstack_grow (&diversion_storage, output_temp_dir->dir_name,
-                   strlen (output_temp_dir->dir_name));
-      obstack_1grow (&diversion_storage, '/');
-      obstack_1grow (&diversion_storage, 'm');
-      obstack_1grow (&diversion_storage, '4');
-      obstack_1grow (&diversion_storage, '-');
+      obstack_printf (&diversion_storage, "%s/m4-", output_temp_dir->dir_name);
       offset = obstack_object_size (&diversion_storage);
       buffer = (char *) obstack_alloc (&diversion_storage,
                                       INT_BUFSIZE_BOUND (divnum));
@@ -473,7 +468,6 @@ void
 divert_text (struct obstack *obs, const char *text, int length, int line)
 {
   static bool start_of_output_line = true;
-  const char *cursor;
 
   /* If output goes to an obstack, merely add TEXT to it.  */
 
@@ -533,20 +527,15 @@ divert_text (struct obstack *obs, const char *text, int length, int line)
 
          if (output_current_line != line)
            {
-             OUTPUT_CHARACTER ('#');
-             OUTPUT_CHARACTER ('l');
-             OUTPUT_CHARACTER ('i');
-             OUTPUT_CHARACTER ('n');
-             OUTPUT_CHARACTER ('e');
-             OUTPUT_CHARACTER (' ');
-             for (cursor = ntoa (line, 10); *cursor; cursor++)
-               OUTPUT_CHARACTER (*cursor);
+             static char line_buf[sizeof "#line " + INT_BUFSIZE_BOUND (line)];
+             sprintf (line_buf, "#line %d", line);
+             output_text (line_buf, strlen (line_buf));
+             assert (strlen (line_buf) < sizeof line_buf);
              if (output_current_line < 1 && current_file[0] != '\0')
                {
                  OUTPUT_CHARACTER (' ');
                  OUTPUT_CHARACTER ('"');
-                 for (cursor = current_file; *cursor; cursor++)
-                   OUTPUT_CHARACTER (*cursor);
+                 output_text (current_file, strlen (current_file));
                  OUTPUT_CHARACTER ('"');
                }
              OUTPUT_CHARACTER ('\n');
diff --git a/src/path.c b/src/path.c
index 98d4567..998c0ed 100644
--- a/src/path.c
+++ b/src/path.c
@@ -1,7 +1,7 @@
 /* GNU m4 -- A simple macro processor
 
-   Copyright (C) 1989, 1990, 1991, 1992, 1993, 2004, 2006, 2007 Free
-   Software Foundation, Inc.
+   Copyright (C) 1989, 1990, 1991, 1992, 1993, 2004, 2006, 2007, 2008
+   Free Software Foundation, Inc.
 
    This file is part of GNU M4.
 
@@ -160,7 +160,7 @@ m4_path_search (const char *file, char **result)
       if (fp != NULL)
        {
          if (debug_level & DEBUG_TRACE_PATH)
-           DEBUG_MESSAGE2 ("path search for `%s' found `%s'", file, name);
+           debug_message ("path search for `%s' found `%s'", file, name);
          if (set_cloexec_flag (fileno (fp), true) != 0)
            m4_warn (errno, NULL, _("cannot protect input file across forks"));
          if (result)
-- 
1.5.5.1
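
For readers following the MATCH changes, here is a minimal standalone
sketch of why the delimiter length has to be passed explicitly.  It is not
the m4 implementation: starts_with_delim and starts_with_delim_cstr are
hypothetical helpers written only for illustration, and the real MATCH
macro also deals with input lookahead, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of m4: report whether the LEN bytes at
   DELIM appear at the start of INPUT, which holds INPUT_LEN bytes.
   memcmp compares raw bytes, so a NUL inside DELIM is treated like any
   other byte.  An empty delimiter never matches.  */
static bool
starts_with_delim (const char *input, size_t input_len,
                   const char *delim, size_t len)
{
  return len > 0 && len <= input_len && memcmp (input, delim, len) == 0;
}

/* Hypothetical counterpart that relies on a NUL terminator.  strlen
   stops at the first NUL, so a delimiter such as "a\0b" is silently
   shortened to "a" and matches input it should not match.  */
static bool
starts_with_delim_cstr (const char *input, const char *delim)
{
  return strncmp (input, delim, strlen (delim)) == 0;
}
```

With the three-byte delimiter "a\0b", the length-aware helper accepts
input that begins with those exact three bytes and rejects "axb", while
the strlen-based version accepts "axb" because it only ever sees "a".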

