m4-patches

argv_ref patch 27: allow NUL through more builtins


From: Eric Blake
Subject: argv_ref patch 27: allow NUL through more builtins
Date: Wed, 03 Dec 2008 21:29:21 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It's been a while since I've worked on porting the argv_ref branch to
master, but I finally got one done.  This fixes regular expressions,
format, and translit to transparently handle NUL; mostly a matter of
passing lengths around rather than NUL-termination (thank heavens that
glibc, and thus gnulib, already provide a regex interface that handles
NUL).  No big change in speed or memory usage.  And as long as I was
editing format, I made it better able to detect excess or missing
arguments.  Meanwhile, I didn't see a way to use NUL in eval, so that now
warns.
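
For readers following along, the "pass lengths around" idea can be
sketched as below.  This is an illustration only, not code from the
patch: remaining_after_space and has_embedded_nul are hypothetical
helper names, loosely modeled on the end_text check that the patch adds
to eval_lex.  The point is that end of input is decided by a pointer
comparison against text + len, never by the first NUL byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of m4: skip leading whitespace in the
   LEN-byte buffer TEXT and return how many bytes remain.  The scan
   stops at TEXT + LEN, so an embedded NUL is just another byte.  */
static size_t
remaining_after_space (const char *text, size_t len)
{
  const char *end = text + len;

  assert (text != NULL);
  while (text != end && isspace ((unsigned char) *text))
    text++;
  return (size_t) (end - text);
}

/* Hypothetical helper, not part of m4: return nonzero if the LEN-byte
   buffer TEXT contains a NUL byte that a NUL-terminated scan would
   have mistaken for end of input.  */
static int
has_embedded_nul (const char *text, size_t len)
{
  return memchr (text, '\0', len) != NULL;
}
```

A caller that sees a nonzero remainder beginning with NUL can then
warn about it, rather than silently treating the argument as empty.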

        Stage 27: Allow embedded NUL in text processing macros.
        Pass NUL through regular expressions, format, and translit, and
        diagnose it in eval.  Improve warning capabilities of format.
        Memory impact: none.
        Speed impact: none noticed.
        * src/m4.h (evaluate): Add parameter.
        * src/builtin.c (compile_pattern) [DEBUG_REGEX]: Support NUL in
        output messages.
        (set_macro_sequence): Likewise.
        (m4_eval): Normalize messages, and adjust caller.
        (expand_ranges, substitute): Support NUL in macro expansion.
        (m4_translit, m4_regexp, m4_patsubst): Adjust callers, to manage
        NUL bytes.
        * src/format.c (expand_format): Manage NUL bytes.
        * src/eval.c (eval_error): Add EMPTY_ARGUMENT.
        (end_text): New variable.
        (eval_init_lex): Add parameter.
        (eval_lex, evaluate): Detect NUL in macro expansion.
        * doc/m4.texinfo (Format): Update to cover new behavior.
        (Eval): Mention that result is unquoted.
        * examples/null.m4: Enhance test.
        * examples/null.err: Update expected output.
        * examples/null.out: Likewise.
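
The translit change listed above can be pictured with a stand-alone
sketch.  It is modeled on the ASIS/REPLACE/DELETE table in the patch,
but translit_mem is a hypothetical function, it writes to a plain
buffer instead of an obstack, and it skips range expansion.  A NUL in
any of the three arguments is handled like any other byte because
every loop is bounded by an explicit length.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

enum action { ASIS, REPLACE, DELETE };

/* Hypothetical stand-alone sketch, not the m4 builtin: transliterate
   DATA_LEN bytes of DATA into OUT, which must hold at least DATA_LEN
   bytes.  Each byte of FROM maps to the byte at the same position in
   TO; bytes of FROM beyond TO_LEN are deleted.  Only the first
   occurrence of a byte in FROM counts.  Returns the bytes written.  */
static size_t
translit_mem (const char *data, size_t data_len,
              const char *from, size_t from_len,
              const char *to, size_t to_len, char *out)
{
  unsigned char map[UCHAR_MAX + 1] = {0};
  unsigned char found[UCHAR_MAX + 1] = {0};     /* All ASIS.  */
  size_t n = 0;

  /* One pass over FROM builds the table.  */
  while (from_len--)
    {
      unsigned char ch = (unsigned char) *from++;
      if (found[ch] == ASIS)
        {
          if (to_len)
            {
              found[ch] = REPLACE;
              map[ch] = (unsigned char) *to;
            }
          else
            found[ch] = DELETE;
        }
      if (to_len)
        {
          to++;
          to_len--;
        }
    }

  /* One pass over DATA applies it, for linear behavior overall.  */
  while (data_len--)
    {
      unsigned char ch = (unsigned char) *data++;
      if (found[ch] == ASIS)
        out[n++] = (char) ch;
      else if (found[ch] == REPLACE)
        out[n++] = (char) map[ch];
      /* DELETE: emit nothing.  */
    }
  return n;
}
```

Keeping a separate three-state table, rather than testing map[ch] for
zero, is what lets a byte be replaced by NUL without being confused
with deletion.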

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkk3XKEACgkQ84KuGfSFAYCAvACeMDWbgh+2qAadcmFkWGzTUh83
RfkAoKzg1pjjyqRPgXeD6fvTEqXbvi1V
=8FT2
-----END PGP SIGNATURE-----
From 41d0c77062c8046730101d82b56f682edc70957e Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Tue, 2 Dec 2008 22:51:14 -0700
Subject: [PATCH] Stage 27: Allow embedded NUL in text processing macros.

* modules/m4.c (m4_expand_ranges): Don't append extra bytes.
(translit): Manage NUL bytes.
* modules/format.c (format): Likewise.
* modules/gnu.c (substitute, regexp_substitute): Likewise.
(m4_resyntax_encode_safe): Add parameter.
(regexp, patsubst, renamesyms): Update callers.
(regexp_compile): Adjust error message.
* modules/evalparse.c (m4_evaluate): Use consistent message.
(end_text): New variable.
(eval_init_lex): Add parameter.
(eval_lex): Detect embedded NUL.
* src/freeze.c (reload_frozen_state): Likewise.
* doc/m4.texinfo (Format): Update to cover new behavior.
(Eval): Mention that result is unquoted.
* tests/freeze.at (reloading nul): Enhance test.
* tests/null.m4: Likewise.
* tests/null.err: Update expected output.
* tests/null.out: Likewise.
* tests/options.at (--regexp-syntax): Likewise.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog           |   28 +++++++++++
 doc/m4.texinfo      |   13 ++++-
 modules/evalparse.c |   15 ++++--
 modules/format.c    |   43 +++++++++++------
 modules/gnu.c       |  127 ++++++++++++++++++++++++++++++---------------------
 modules/m4.c        |   57 ++++++++++++++++-------
 src/freeze.c        |    7 ++-
 tests/freeze.at     |    6 ++
 tests/null.err      |   24 ++++++++--
 tests/null.m4       |   52 +++++++++++++++------
 tests/null.out      |    7 ++-
 tests/options.at    |    8 ++--
 12 files changed, 265 insertions(+), 122 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b350524..69849a9 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,31 @@
+2008-12-02  Eric Blake  <address@hidden>
+
+       Stage 27: Allow embedded NUL in text processing macros.
+       Pass NUL through regular expressions, format, and translit, and
+       diagnose it in eval and changeresyntax.  Improve warning
+       capabilities of format.
+       Memory impact: none.
+       Speed impact: none noticed.
+       * modules/m4.c (m4_expand_ranges): Don't append extra bytes.
+       (translit): Manage NUL bytes.
+       * modules/format.c (format): Likewise.
+       * modules/gnu.c (substitute, regexp_substitute): Likewise.
+       (m4_resyntax_encode_safe): Add parameter.
+       (regexp, patsubst, renamesyms): Update callers.
+       (regexp_compile): Adjust error message.
+       * modules/evalparse.c (m4_evaluate): Use consistent message.
+       (end_text): New variable.
+       (eval_init_lex): Add parameter.
+       (eval_lex): Detect embedded NUL.
+       * src/freeze.c (reload_frozen_state): Likewise.
+       * doc/m4.texinfo (Format): Update to cover new behavior.
+       (Eval): Mention that result is unquoted.
+       * tests/freeze.at (reloading nul): Enhance test.
+       * tests/null.m4: Likewise.
+       * tests/null.err: Update expected output.
+       * tests/null.out: Likewise.
+       * tests/options.at (--regexp-syntax): Likewise.
+
 2008-11-28  Eric Blake  <address@hidden>
 
        Resync NEWS with branches.
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index 0287a60..bee9aec 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -7288,7 +7288,7 @@ Format
 @example
 format(`%p', `0')
 @error{}m4:stdin:1: Warning: format: unrecognized specifier in `%p'
address@hidden
address@hidden
 format(`%*d', `')
 @error{}m4:stdin:2: Warning: format: empty string treated as 0
 @error{}m4:stdin:2: Warning: format: too few arguments: 2 < 3
@@ -7605,7 +7605,9 @@ Eval
 @var{radix} is the empty string.  A warning results if the radix is
 outside the range of 1 through 36, inclusive.  The result of @code{eval}
 is always taken to be signed.  No radix prefix is output, and for
-radices greater than 10, the digits are lower case.  The @var{width}
+radices greater than 10, the digits are lower case (although some
+other implementations use upper case).  The output is unquoted, and
+subject to further macro expansion.  The @var{width}
 argument specifies the minimum output width, excluding any negative
 sign.  The result is zero-padded to extend the expansion to the
 requested width.  A warning results if the width is negative.  If
@@ -7636,8 +7638,13 @@ Eval
 @error{}m4:stdin:10: Warning: eval: negative width: -1
 @result{}
 eval()
address@hidden:stdin:11: Warning: eval: empty string treated as zero
address@hidden:stdin:11: Warning: eval: empty string treated as 0
address@hidden
+eval(` ')
address@hidden:stdin:12: Warning: eval: empty string treated as 0
 @result{}0
+define(`a', `hi')eval(` 10 ', `16')
address@hidden
 @end example
 
 @node Mpeval
diff --git a/modules/evalparse.c b/modules/evalparse.c
index 8ad7182..9927e13 100644
--- a/modules/evalparse.c
+++ b/modules/evalparse.c
@@ -99,10 +99,15 @@ static const char *eval_text;
    can back up, if we have read too much.  */
 static const char *last_text;
 
+/* Detect when to end parsing.  */
+static const char *end_text;
+
+/* Prime the lexer at the start of TEXT, with length LEN.  */
 static void
-eval_init_lex (const char *text)
+eval_init_lex (const char *text, size_t len)
 {
   eval_text = text;
+  end_text = text + len;
   last_text = NULL;
 }
 
@@ -119,12 +124,12 @@ eval_undo (void)
 static eval_token
 eval_lex (number *val)
 {
-  while (isspace (to_uchar (*eval_text)))
+  while (eval_text != end_text && isspace (to_uchar (*eval_text)))
     eval_text++;
 
   last_text = eval_text;
 
-  if (*eval_text == '\0')
+  if (eval_text == end_text)
     return EOTEXT;
 
   if (isdigit (to_uchar (*eval_text)))
@@ -915,13 +920,13 @@ m4_evaluate (m4 *context, m4_obstack *obs, size_t argc, m4_macro_args *argv)
     }
 
   numb_initialise ();
-  eval_init_lex (str);
+  eval_init_lex (str, M4ARGLEN (1));
 
   numb_init (val);
   et = eval_lex (&val);
   if (et == EOTEXT)
     {
-      m4_warn (context, 0, me, _("empty string treated as zero"));
+      m4_warn (context, 0, me, _("empty string treated as 0"));
       numb_set (val, numb_ZERO);
     }
   else
diff --git a/modules/format.c b/modules/format.c
index e2a1a42..af983cd 100644
--- a/modules/format.c
+++ b/modules/format.c
@@ -123,11 +123,12 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
 {
   const m4_call_info *me = m4_arg_info (argv);
   const char *f;                       /* Format control string.  */
+  size_t f_len;                                /* Length of f.  */
   const char *fmt;                     /* Position within f.  */
   char fstart[] = "%'+- 0#*.*hhd";     /* Current format spec.  */
   char *p;                             /* Position within fstart.  */
   unsigned char c;                     /* A simple character.  */
-  int i = 0;                           /* Index within argc used so far.  */
+  int i = 1;                           /* Index within argc used so far.  */
   bool valid_format = true;            /* True if entire format string ok.  */
 
   /* Flags.  */
@@ -156,25 +157,24 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
   int result = 0;
   enum {CHAR, INT, LONG, DOUBLE, STR} datatype;
 
-  f = fmt = ARG_STR (i, argc, argv);
+  f = fmt = M4ARG (1);
+  f_len = M4ARGLEN (1);
+  assert (!f[f_len]); /* Requiring a terminating NUL makes parsing simpler.  */
   memset (ok, 0, sizeof ok);
-  while (true)
+  while (f_len--)
     {
-      while ((c = *fmt++) != '%')
+      c = *fmt++;
+      if (c != '%')
        {
-         if (c == '\0')
-           {
-             if (valid_format)
-               m4_bad_argc (context, argc, me, i, i, true);
-             return;
-           }
          obstack_1grow (obs, c);
+         continue;
        }
 
       if (*fmt == '%')
        {
          obstack_1grow (obs, '%');
          fmt++;
+         f_len--;
          continue;
        }
 
@@ -225,7 +225,7 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
              break;
            }
        }
-      while (!(flags & DONE) && fmt++);
+      while (!(flags & DONE) && (f_len--, fmt++));
       if (flags & THOUSANDS)
        *p++ = '\'';
       if (flags & PLUS)
@@ -247,12 +247,14 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
        {
          width = ARG_INT (i, argc, argv);
          fmt++;
+         f_len--;
        }
       else
        while (isdigit ((unsigned char) *fmt))
          {
            width = 10 * width + *fmt - '0';
            fmt++;
+           f_len--;
          }
 
       /* Maximum precision; an explicit negative precision is the same
@@ -263,10 +265,12 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
       if (*fmt == '.')
        {
          ok['c'] = 0;
+         f_len--;
          if (*(++fmt) == '*')
            {
              prec = ARG_INT (i, argc, argv);
              ++fmt;
+             f_len--;
            }
          else
            {
@@ -275,6 +279,7 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
                {
                  prec = 10 * prec + *fmt - '0';
                  fmt++;
+                 f_len--;
                }
            }
        }
@@ -285,30 +290,34 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
          *p++ = 'l';
          lflag = 1;
          fmt++;
+         f_len--;
          ok['c'] = ok['s'] = 0;
        }
       else if (*fmt == 'h')
        {
          *p++ = 'h';
          fmt++;
+         f_len--;
          if (*fmt == 'h')
            {
              *p++ = 'h';
              fmt++;
+             f_len--;
            }
          ok['a'] = ok['A'] = ok['c'] = ok['e'] = ok['E'] = ok['f'] = ok['F']
            = ok['g'] = ok['G'] = ok['s'] = 0;
        }
 
-      c = *fmt++;
-      if (c > sizeof ok || !ok[c])
+      c = *fmt;
+      if (c > sizeof ok || !ok[c] || !f_len)
        {
-         m4_warn (context, 0, me, _("unrecognized specifier in `%s'"), f);
+         m4_warn (context, 0, me, _("unrecognized specifier in %s"),
+                  quotearg_style_mem (locale_quoting_style, f, M4ARGLEN (1)));
          valid_format = false;
-         if (c == '\0')
-           fmt--;
          continue;
        }
+      fmt++;
+      f_len--;
 
       /* Specifiers.  We don't yet recognize C, S, n, or p.  */
       switch (c)
@@ -382,4 +391,6 @@ format (m4 *context, m4_obstack *obs, int argc, m4_macro_args *argv)
         we constructed fstart, the result should not be negative.  */
       assert (0 <= result);
     }
+  if (valid_format)
+    m4_bad_argc (context, argc, me, i, i, true);
 }
diff --git a/modules/gnu.c b/modules/gnu.c
index fd557eb..8ad1722 100644
--- a/modules/gnu.c
+++ b/modules/gnu.c
@@ -167,8 +167,8 @@ regexp_compile (m4 *context, const m4_call_info *caller, const char *regexp,
 
   if (msg != NULL)
     {
-      m4_error (context, 0, 0, caller, _("bad regular expression `%s': %s"),
-               regexp, msg);
+      m4_warn (context, 0, caller, _("bad regular expression %s: %s"),
+              quotearg_style_mem (locale_quoting_style, regexp, len), msg);
       regfree (pat);
       free (pat);
       return NULL;
@@ -225,28 +225,38 @@ regexp_search (m4_pattern_buffer *buf, const char *string, const int size,
 
 /* Function to perform substitution by regular expressions.  Used by
    the builtins regexp, patsubst and renamesyms.  The changed text is
-   placed on the obstack OBS.  The substitution is REPL, with \&
-   substituted by this part of VICTIM matched by the last whole
-   regular expression, and \N substituted by the text matched by the
-   Nth parenthesized sub-expression in BUF.  Any warnings are issued
-   on behalf of CALLER.  BUF may be NULL for the empty regex.  */
+   placed on the obstack OBS.  The substitution is REPL of length
+   REPL_LEN, with \& substituted by this part of VICTIM matched by the
+   last whole regular expression, and \N substituted by the text
+   matched by the Nth parenthesized sub-expression in BUF.  Any
+   warnings are issued on behalf of CALLER.  BUF may be NULL for the
+   empty regex.  */
 
 static void
 substitute (m4 *context, m4_obstack *obs, const m4_call_info *caller,
-           const char *victim, const char *repl, m4_pattern_buffer *buf)
+           const char *victim, const char *repl, size_t repl_len,
+           m4_pattern_buffer *buf)
 {
   int ch;
 
-  for (;;)
+  while (repl_len--)
     {
-      while ((ch = *repl++) != '\\')
+      ch = *repl++;
+      if (ch != '\\')
        {
-         if (ch == '\0')
-           return;
          obstack_1grow (obs, ch);
+         continue;
+       }
+      if (!repl_len)
+       {
+         m4_warn (context, 0, caller,
+                  _("trailing \\ ignored in replacement"));
+         return;
        }
 
-      switch ((ch = *repl++))
+      ch = *repl++;
+      repl_len--;
+      switch (ch)
        {
        case '&':
          if (buf)
@@ -265,11 +275,6 @@ substitute (m4 *context, m4_obstack *obs, const m4_call_info *caller,
                          buf->regs.end[ch] - buf->regs.start[ch]);
          break;
 
-       case '\0':
-         m4_warn (context, 0, caller,
-                  _("trailing \\ ignored in replacement"));
-         return;
-
        default:
          obstack_1grow (obs, ch);
          break;
@@ -278,18 +283,19 @@ substitute (m4 *context, m4_obstack *obs, const m4_call_info *caller,
 }
 
 
-/* For each match against compiled REGEXP (held in BUF -- as returned
-   by regexp_compile) in VICTIM, substitute REPLACE.  Non-matching
-   characters are copied verbatim, and the result copied to the
-   obstack.  Errors are reported on behalf of CALLER.  Return true if
-   a substitution was made.  If OPTIMIZE is set, don't worry about
-   copying the input if no changes are made.  */
+/* For each match against REGEXP of length REGEXP_LEN (precompiled in
+   BUF as returned by regexp_compile) in VICTIM of length LEN,
+   substitute REPLACE of length REPL_LEN.  Non-matching characters are
+   copied verbatim, and the result copied to the obstack.  Errors are
+   reported on behalf of CALLER.  Return true if a substitution was
+   made.  If OPTIMIZE is set, don't worry about copying the input if
+   no changes are made.  */
 
 static bool
 regexp_substitute (m4 *context, m4_obstack *obs, const m4_call_info *caller,
                   const char *victim, size_t len, const char *regexp,
-                  m4_pattern_buffer *buf, const char *replace,
-                  bool optimize)
+                  size_t regexp_len, m4_pattern_buffer *buf,
+                  const char *replace, size_t repl_len, bool optimize)
 {
   regoff_t matchpos = 0;       /* start position of match */
   size_t offset = 0;           /* current match offset */
@@ -309,7 +315,9 @@ regexp_substitute (m4 *context, m4_obstack *obs, const m4_call_info *caller,
 
          if (matchpos == -2)
            m4_error (context, 0, 0, caller,
-                     _("error matching regular expression `%s'"), regexp);
+                     _("problem matching regular expression %s"),
+                     quotearg_style_mem (locale_quoting_style, regexp,
+                                         regexp_len));
          else if (offset < len && subst)
            obstack_grow (obs, victim + offset, len - offset);
          break;
@@ -322,7 +330,7 @@ regexp_substitute (m4 *context, m4_obstack *obs, const m4_call_info *caller,
 
       /* Handle the part of the string that was covered by the match.  */
 
-      substitute (context, obs, caller, victim, replace, buf);
+      substitute (context, obs, caller, victim, replace, repl_len, buf);
       subst = true;
 
       /* Update the offset to the end of the match.  If the regexp
@@ -465,18 +473,24 @@ M4BUILTIN_HANDLER (builtin)
 }
 
 
-/* Change the current regexp syntax to SPEC, or report failure on
-   behalf of CALLER.  Currently this affects the builtins: `patsubst',
-   `regexp' and `renamesyms'.  */
+/* Change the current regexp syntax to SPEC of length LEN, or report
+   failure on behalf of CALLER.  Currently this affects the builtins:
+   `patsubst', `regexp' and `renamesyms'.  */
 
 static int
 m4_resyntax_encode_safe (m4 *context, const m4_call_info *caller,
-                        const char *spec)
+                        const char *spec, size_t len)
 {
-  int resyntax = m4_regexp_syntax_encode (spec);
+  int resyntax;
+
+  if (strlen (spec) < len)
+    resyntax = -1;
+  else
+    resyntax = m4_regexp_syntax_encode (spec);
 
   if (resyntax < 0)
-    m4_warn (context, 0, caller, _("bad syntax-spec: `%s'"), spec);
+    m4_warn (context, 0, caller, _("bad syntax-spec: %s"),
+            quotearg_style_mem (locale_quoting_style, spec, len));
 
   return resyntax;
 }
@@ -488,7 +502,7 @@ m4_resyntax_encode_safe (m4 *context, const m4_call_info *caller,
 M4BUILTIN_HANDLER (changeresyntax)
 {
   int resyntax = m4_resyntax_encode_safe (context, m4_arg_info (argv),
-                                         M4ARG (1));
+                                         M4ARG (1), M4ARGLEN (1));
 
   if (resyntax >= 0)
     m4_set_regexp_syntax_opt (context, resyntax);
@@ -749,31 +763,32 @@ M4BUILTIN_HANDLER (patsubst)
   m4_pattern_buffer *buf;      /* compiled regular expression */
   int resyntax;
 
-  pattern = M4ARG (2);
-  replace = M4ARG (3);
-
   resyntax = m4_get_regexp_syntax_opt (context);
   if (argc >= 5)               /* additional args ignored */
     {
-      resyntax = m4_resyntax_encode_safe (context, me, M4ARG (4));
+      resyntax = m4_resyntax_encode_safe (context, me, M4ARG (4),
+                                         M4ARGLEN (4));
       if (resyntax < 0)
        return;
     }
 
   /* The empty regex matches everywhere, but if there is no
      replacement, we need not waste time with it.  */
-  if (!*pattern && !*replace)
+  if (m4_arg_empty (argv, 2) && m4_arg_empty (argv, 3))
     {
       m4_push_arg (context, obs, argv, 1);
       return;
     }
 
+  pattern = M4ARG (2);
+  replace = M4ARG (3);
+
   buf = regexp_compile (context, me, pattern, M4ARGLEN (2), resyntax);
   if (!buf)
     return;
 
-  regexp_substitute (context, obs, me, M4ARG (1), M4ARGLEN (1),
-                    pattern, buf, replace, false);
+  regexp_substitute (context, obs, me, M4ARG (1), M4ARGLEN (1), pattern,
+                    M4ARGLEN (2), buf, replace, M4ARGLEN (3), false);
 }
 
 
@@ -810,7 +825,7 @@ M4BUILTIN_HANDLER (regexp)
         is a valid RESYNTAX, yet we want `regexp(aab, a*, )' to return
         an empty string as per M4 1.4.x.  */
 
-      if ((*replace == '\0') || (resyntax < 0))
+      if (m4_arg_empty (argv, 3) || (resyntax < 0))
        /* regexp(VICTIM, REGEXP, REPLACEMENT) */
        resyntax = m4_get_regexp_syntax_opt (context);
       else
@@ -820,7 +835,8 @@ M4BUILTIN_HANDLER (regexp)
   else if (argc >= 5)
     {
       /* regexp(VICTIM, REGEXP, REPLACEMENT, RESYNTAX) */
-      resyntax = m4_resyntax_encode_safe (context, me, M4ARG (4));
+      resyntax = m4_resyntax_encode_safe (context, me, M4ARG (4),
+                                         M4ARGLEN (4));
       if (resyntax < 0)
        return;
     }
@@ -828,11 +844,11 @@ M4BUILTIN_HANDLER (regexp)
     /* regexp(VICTIM, REGEXP)  */
     replace = NULL;
 
-  if (!*pattern)
+  if (m4_arg_empty (argv, 2))
     {
       /* The empty regex matches everything.  */
       if (replace)
-       substitute (context, obs, me, M4ARG (1), replace, NULL);
+       substitute (context, obs, me, M4ARG (1), replace, M4ARGLEN (3), NULL);
       else
        m4_shipout_int (obs, 0);
       return;
@@ -848,15 +864,16 @@ M4BUILTIN_HANDLER (regexp)
 
   if (startpos == -2)
     {
-      m4_error (context, 0, 0, me, _("error matching regular expression `%s'"),
-               pattern);
+      m4_error (context, 0, 0, me, _("problem matching regular expression %s"),
+               quotearg_style_mem (locale_quoting_style, pattern,
+                                   M4ARGLEN (2)));
       return;
     }
 
   if (replace == NULL)
     m4_shipout_int (obs, startpos);
   else if (startpos >= 0)
-    substitute (context, obs, me, victim, replace, buf);
+    substitute (context, obs, me, victim, replace, M4ARGLEN (3), buf);
 }
 
 
@@ -874,7 +891,9 @@ M4BUILTIN_HANDLER (renamesyms)
     {
       const m4_call_info *me = m4_arg_info (argv);
       const char *regexp;      /* regular expression string */
+      size_t regexp_len;
       const char *replace;     /* replacement expression string */
+      size_t replace_len;
 
       m4_pattern_buffer *buf;  /* compiled regular expression */
 
@@ -883,17 +902,20 @@ M4BUILTIN_HANDLER (renamesyms)
       int resyntax;
 
       regexp  = M4ARG (1);
+      regexp_len = M4ARGLEN (1);
       replace = M4ARG (2);
+      replace_len = M4ARGLEN (2);
 
       resyntax = m4_get_regexp_syntax_opt (context);
       if (argc >= 4)
        {
-         resyntax = m4_resyntax_encode_safe (context, me, M4ARG (3));
+         resyntax = m4_resyntax_encode_safe (context, me, M4ARG (3),
+                                             M4ARGLEN (3));
          if (resyntax < 0)
            return;
        }
 
-      buf = regexp_compile (context, me, regexp, M4ARGLEN (1), resyntax);
+      buf = regexp_compile (context, me, regexp, regexp_len, resyntax);
       if (!buf)
        return;
 
@@ -905,7 +927,8 @@ M4BUILTIN_HANDLER (renamesyms)
          const m4_string *key = &data.base[0];
 
          if (regexp_substitute (context, data.obs, me, key->str, key->len,
-                                regexp, buf, replace, true))
+                                regexp, regexp_len, buf, replace, replace_len,
+                                true))
            {
              size_t newlen = obstack_object_size (data.obs);
              m4_symbol_rename (M4SYMTAB, key->str, key->len,
diff --git a/modules/m4.c b/modules/m4.c
index e9695a3..f78a177 100644
--- a/modules/m4.c
+++ b/modules/m4.c
@@ -998,8 +998,7 @@ m4_expand_ranges (const char *s, size_t *len, m4_obstack *obs)
        obstack_1grow (obs, *s);
     }
   *len = obstack_object_size (obs);
-  /* FIXME - use obstack_finish once translit is updated.  */
-  return (char *) obstack_copy0 (obs, "", 0);
+  return (char *) obstack_finish (obs);
 }
 
 /* The macro "translit" translates all characters in the first
@@ -1018,7 +1017,9 @@ M4BUILTIN_HANDLER (translit)
   char found[UCHAR_MAX + 1] = {0};
   unsigned char ch;
 
-  if (argc <= 2)
+  enum { ASIS, REPLACE, DELETE };
+
+  if (m4_arg_empty (argv, 1) || m4_arg_empty (argv, 2))
     {
       m4_push_arg (context, obs, argv, 1);
       return;
@@ -1026,7 +1027,7 @@ M4BUILTIN_HANDLER (translit)
 
   from = M4ARG (2);
   from_len = M4ARGLEN (2);
-  if (strchr (from, '-') != NULL)
+  if (memchr (from, '-', from_len) != NULL)
     {
       from = m4_expand_ranges (from, &from_len, m4_arg_scratch (context));
       assert (from);
@@ -1034,35 +1035,57 @@ M4BUILTIN_HANDLER (translit)
 
   to = M4ARG (3);
   to_len = M4ARGLEN (3);
-  if (strchr (to, '-') != NULL)
+  if (memchr (to, '-', to_len) != NULL)
     {
       to = m4_expand_ranges (to, &to_len, m4_arg_scratch (context));
       assert (to);
     }
 
-  /* Calling strchr(from) for each character in data is quadratic,
+  /* Calling memchr(from) for each character in data is quadratic,
      since both strings can be arbitrarily long.  Instead, create a
      from-to mapping in one pass of from, then use that map in one
      pass of data, for linear behavior.  Traditional behavior is that
      only the first instance of a character in from is consulted,
      hence the found map.  */
-  for ( ; (ch = *from) != '\0'; from++)
+  while (from_len--)
     {
-      if (!found[ch])
+      ch = *from++;
+      if (found[ch] == ASIS)
+       {
+         if (to_len)
+           {
+             found[ch] = REPLACE;
+             map[ch] = *to;
+           }
+         else
+           found[ch] = DELETE;
+       }
+      if (to_len)
        {
-         found[ch] = 1;
-         map[ch] = *to;
+         to++;
+         to_len--;
        }
-      if (*to != '\0')
-       to++;
     }
 
-  for (data = M4ARG (1); (ch = *data) != '\0'; data++)
+  data = M4ARG (1);
+  from_len = M4ARGLEN (1);
+  while (from_len--)
     {
-      if (!found[ch])
-       obstack_1grow (obs, ch);
-      else if (map[ch])
-       obstack_1grow (obs, map[ch]);
+      ch = *data++;
+      switch (found[ch])
+       {
+       case ASIS:
+         obstack_1grow (obs, ch);
+         break;
+       case REPLACE:
+         obstack_1grow (obs, map[ch]);
+         break;
+       case DELETE:
+         break;
+       default:
+         assert (!"translit");
+         abort ();
+       }
     }
 }
 
diff --git a/src/freeze.c b/src/freeze.c
index 5d5b4ee..3008f27 100644
--- a/src/freeze.c
+++ b/src/freeze.c
@@ -634,7 +634,7 @@ ill-formed frozen file, version 2 directive `%c' encountered"), 'd');
 
          if (m4_debug_decode (context, string[0]) < 0)
            m4_error (context, EXIT_FAILURE, 0, NULL,
-                     _("unknown debug mode `%s'"),
+                     _("unknown debug mode %s"),
                      quotearg_style_mem (locale_quoting_style, string[0],
                                          number[0]));
          break;
@@ -751,10 +751,11 @@ ill-formed frozen file, version 2 directive `%c' encountered"), 'R');
 
          m4_set_regexp_syntax_opt (context,
                                    m4_regexp_syntax_encode (string[0]));
-         if (m4_get_regexp_syntax_opt (context) < 0)
+         if (m4_get_regexp_syntax_opt (context) < 0
+             || strlen (string[0]) < number[0])
            {
              m4_error (context, EXIT_FAILURE, 0, NULL,
-                       _("unknown regexp syntax code `%s'"),
+                       _("bad syntax-spec %s"),
                        quotearg_style_mem (locale_quoting_style, string[0],
                                            number[0]));
            }
diff --git a/tests/freeze.at b/tests/freeze.at
index 9b8c946..693ae54 100644
--- a/tests/freeze.at
+++ b/tests/freeze.at
@@ -409,6 +409,12 @@ AT_CHECK_M4([-R frozen.m4f unfrozen.m4], [0], [stdout], [experr], [], [ ])
 
 AT_CHECK([cat out1 stdout], [0], [expout])
 
+dnl Check that unexpected embedded NULs are recognized.
+printf '# bogus frozen file\nV2\nR4\ngnu\0\n' > bogus.m4f
+AT_CHECK_M4([-R bogus.m4f], [1], [],
+[[m4:bogus.m4f:4: bad syntax-spec `gnu\0'
+]])
+
 AT_CLEANUP
 ])
 
diff --git a/tests/null.err b/tests/null.err
index 74ec09d..7b9f798 100644
--- a/tests/null.err
+++ b/tests/null.err
@@ -3,12 +3,14 @@ m4:null.m4:21: Warning: builtin: undefined builtin `-\0-'
 changequote:
 echo:  address@hidden/
 m4trace: -1- dumpdef(echo/) -> /
+changeresyntax:
+m4:null.m4:39: Warning: changeresyntax: bad syntax-spec: `\0'
 changesyntax:
-m4:null.m4:46: Warning: changesyntax: undefined syntax code: `\0'
+m4:null.m4:48: Warning: changesyntax: undefined syntax code: `\0'
 defn:
-m4:null.m4:55: Warning: defn: undefined macro `\0-\0'
+m4:null.m4:57: Warning: defn: undefined macro `\0-\0'
 dumpdef:
-m4:null.m4:68: Warning: dumpdef: undefined macro `\0-\0'
+m4:null.m4:70: Warning: dumpdef: undefined macro `\0-\0'
 :      `empty'
 -:     `dash'
 --:   ``$0': $1'
@@ -16,9 +18,21 @@ m4:null.m4:68: Warning: dumpdef: undefined macro `\0-\0'
 --:    `dashes'
 body:  `--'
 errprint: -- --
+format:
+m4:null.m4:87: Warning: format: unrecognized specifier in `%\0%'
+m4:null.m4:87: Warning: format: unrecognized specifier in `%\0%'
 indir:
-m4:null.m4:99: Warning: indir: undefined macro `\0-\0'
-m4:null.m4:101: Warning: \0\0%%: extra arguments ignored: 1 > 0
+m4:null.m4:104: Warning: indir: undefined macro `\0-\0'
+m4:null.m4:106: Warning: \0\0%%: extra arguments ignored: 1 > 0
+patsubst:
+m4:null.m4:124: Warning: patsubst: bad regular expression `\\\0\\': Trailing backslash
+m4:null.m4:134: Warning: patsubst: bad syntax-spec: `\0'
+regexp:
+m4:null.m4:146: Warning: regexp: bad regular expression `\\\0\\': Trailing backslash
+m4:null.m4:156: Warning: regexp: bad syntax-spec: `\0'
+renamesyms:
+m4:null.m4:161: Warning: renamesyms: bad regular expression `\\\0\\': Trailing backslash
+m4:null.m4:167: Warning: renamesyms: bad syntax-spec: `\0'
 traceon:
 m4trace: -1- --(`--') -> `strange: --'
 m4trace: -1- body -> `-'
diff --git a/tests/null.m4 b/tests/null.m4
index 77b6e67..f7a1587 100644
--- a/tests/null.m4
+++ b/tests/null.m4
@@ -34,7 +34,9 @@ dnl Quotes in trace and dump output:
 errprint(`changequote:
 ')traceon(`dumpdef')dumpdef(`echo'changequote(,/))changequote`'dnl
 traceoff(`dumpdef')dnl
-dnl Warning from changeresyntax: not tested yet. No resyntax includes NUL, needs to warn
+dnl Warning from changeresyntax:
+errprint(`changeresyntax:
+')changeresyntax(`')dnl
 dnl Macro name in changesyntax:
 `changesyntax:' changesyntax(`W+-')-- --(-)`'changesyntax()dnl
 dnl Escape in changesyntax:
@@ -78,8 +80,11 @@ dnl Generated from esyscmd:
 `esyscmd:' esyscmd(__program__` -DNUL '__file__) sysval
 dnl First argument of eval: not tested yet. NUL not a number, needs to warn
 dnl Other arguments of eval: not tested yet. NUL not a number, needs to warn
-dnl First argument to format: not tested yet
-dnl Invalid specifier in format: not tested yet, needs to warn
+dnl First argument to format:
+`format:' format(`%s%s', `-', `-')dnl
+dnl Invalid specifier in format:
+errprint(`format:
+') format(`%%')
 dnl Numeric and string arguments to format: not tested yet, needs to warn
 dnl Character argument to format: not tested yet, %c semantics needed
 dnl Macro name in ifdef, passed through ifdef:
@@ -114,15 +119,19 @@ m4wrap(``m4wrap:' --
 dnl Warning from maketemp: not tested yet. No file name includes NUL, needs to warn
 dnl Warning from mkdtemp: not tested yet. No file name includes NUL, needs to warn
 dnl Warning from mkstemp: not tested yet. No file name includes NUL, needs to warn
-dnl Bad regex in patsubst: not tested yet
+dnl Bad regex in patsubst:
+errprint(`patsubst:
+')patsubst(`a', `\\')dnl
 dnl First argument of patsubst:
 `patsubst:' patsubst(`--', `-', `.')dnl
 dnl Matching via meta-character in patsubst:
  patsubst(`--', `[^-]')dnl
 dnl Second argument of patsubst:
  patsubst(`abc', `b', `-') patsubst(`--', `', `!')dnl
-dnl Third argument of patsubst: not tested yet
-dnl Syntax argument of patsubst: not tested yet, needs to warn
+dnl Third argument of patsubst:
+ patsubst(`-!-', `!', `')dnl
+dnl Syntax argument of patsubst:
+patsubst(`a', `a', `b', `')dnl
 dnl Replacement via reference in patsubst:
  patsubst(`----', `-\(.\)-', `\1-\1')
 dnl Defined argument of popdef:
@@ -132,20 +141,30 @@ dnl Macro name of pushdef:
 `pushdef:' pushdef(`--', `strange: $1')ifdef(`--', `ok', `oops')`'dnl
 dnl Definition of pushdef:
  pushdef(`body', `-')body
-dnl Bad regex in regexp: not tested yet
+dnl Bad regex in regexp:
+errprint(`regexp:
+')regexp(`a', `\\')dnl
 dnl First argument of regexp:
 `regexp:' regexp(`ab', `b')dnl
 dnl Matching via meta-character in regexp:
  regexp(`--', `[^-]', `!')dnl
 dnl Second argument of regexp:
  regexp(`--', `')dnl
-dnl Third argument of regexp: not tested yet
-dnl Syntax argument of patsubst: not tested yet, needs to warn
+dnl Third argument of regexp:
+ regexp(`!', `!', `--')dnl
+dnl Syntax argument of patsubst:
+regexp(`a', `a', `b', `')dnl
 dnl Replacement via reference in regexp:
  regexp(`----', `-\(.\)-', `\1-\1')
-dnl Bad regex in renamesyms: not tested yet
-dnl Direct rename via renamesyms: not tested yet
-dnl Meta-character rename via renamesyms: not tested yet
+dnl Bad regex in renamesyms:
+errprint(`renamesyms:
+')renamesyms(`\\', `-')dnl
+dnl Direct rename via renamesyms:
+`renamesyms:' renamesyms(`%%', `--%%')indir(`--%%')dnl
+dnl Meta-character rename via renamesyms:
+ renamesyms(`..\(%%\)', `\1')indir(`%%')
+dnl Syntax argument of renamesyms:
+renamesyms(`a', `b', `')dnl
 dnl Passed through shift:
 `shift:' shift(`hi', `--', --)
 dnl Warning from sinclude: not tested yet. No file name includes NUL, needs to warn
@@ -162,9 +181,12 @@ dnl Macro name and arguments of traceon:
 ')traceon(`--')indir(`--', `--')dnl
 dnl Defined text of traceon:
  traceon(`body')body
-dnl First argument of translit: not tested yet
-dnl Single character in other arguments of translit: not tested yet
-dnl Character ranges of translit: not tested yet
+dnl First argument of translit:
+`translit:' translit(`..', `.', `-')dnl
+dnl Single character in other arguments of translit:
+ translit(`.', `.', `.')dnl
+dnl Character ranges of translit:
+ translit(`abcd', `-b')
 dnl Defined argument of undefine:
 `undefine:' undefine(`--')ifdef(`--', `oops', `ok')
 dnl Undefined argument of undefine: not tested yet. Should it warn?
diff --git a/tests/null.out b/tests/null.out
index 5f6df39..97f80dd 100644
--- a/tests/null.out
+++ b/tests/null.out
@@ -11,18 +11,21 @@ define: --
 defn: `$0': $1 --
 divert: --
 esyscmd: [] 0
+format: -- 
 ifdef: yes: -- no: --
 ifelse: yes: --
 index: 2 -1 -1 8
 indir: --: 11 0 3
 len: 1 3
 m4symbols: --
-patsubst: .. -- abc -!- ---
+patsubst: .. -- abc -!- -- ---
 popdef: ok
 pushdef: ok -
-regexp: 2 ! 0 -
+regexp: 2 ! 1 -- -
+renamesyms: 0 0
 shift: --,--
 substr: --
 traceon: strange: -- -
+translit: -- .. cd
 undefine: ok
 m4wrap: --
diff --git a/tests/options.at b/tests/options.at
index 9331a21..dce43f8 100644
--- a/tests/options.at
+++ b/tests/options.at
@@ -714,8 +714,8 @@ AT_CHECK_M4([--regexp-syntax=unknown in], [1], [],
 AT_CHECK_M4([--regexp-syntax= in], [0], [[0
 ]])
 
-AT_CHECK_M4([-rEXTENDED in], [1], [[
-]], [[m4:in:1: regexp: bad regular expression `(': Unmatched ( or \(
+AT_CHECK_M4([-rEXTENDED in], [0], [[
+]], [[m4:in:1: Warning: regexp: bad regular expression `(': Unmatched ( or \(
 ]])
 
 AT_CHECK_M4([-rgnu-m4 in], [0], [[0
@@ -725,9 +725,9 @@ AT_CHECK_M4([-r"gnu M4" in], [0], [[0
 ]])
 
 dnl Test behavior of -r intermixed with files
-AT_CHECK_M4([-rEXTENDED in --regexp-syntax in], [1], [[
+AT_CHECK_M4([-rEXTENDED in --regexp-syntax in], [0], [[
 0
-]], [[m4:in:1: regexp: bad regular expression `(': Unmatched ( or \(
+]], [[m4:in:1: Warning: regexp: bad regular expression `(': Unmatched ( or \(
 ]])
 
 AT_CLEANUP
-- 
1.6.0.4

From 715c42128d8d357e3e751ec605069137d693c757 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Thu, 17 Jan 2008 14:34:36 -0700
Subject: [PATCH] Stage 27: Allow embedded NUL in text processing macros.

* src/m4.h (evaluate): Add parameter.
* src/builtin.c (compile_pattern) [DEBUG_REGEX]: Support NUL in
output messages.
(set_macro_sequence): Likewise.
(m4_eval): Normalize messages, and adjust caller.
(expand_ranges, substitute): Support NUL in macro expansion.
(m4_translit, m4_regexp, m4_patsubst): Adjust callers, to manage
NUL bytes.
* src/format.c (expand_format): Manage NUL bytes.
* src/eval.c (eval_error): Add EMPTY_ARGUMENT.
(end_text): New variable.
(eval_init_lex): Add parameter.
(eval_lex, evaluate): Detect NUL in macro expansion.
* doc/m4.texinfo (Format): Update to cover new behavior.
(Eval): Mention that result is unquoted.
* examples/null.m4: Enhance test.
* examples/null.err: Update expected output.
* examples/null.out: Likewise.

Signed-off-by: Eric Blake <address@hidden>
(cherry picked from commit 948d1ed0ca4089c2db579fe3d8b3ce172b3e616f)
---
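Illustration only, not part of the commit: most of this patch replaces
NUL-terminated scans with length-counted ones, for example the switch
from strchr to memchr in m4_translit.  A minimal sketch of the
difference follows; the helper name has_byte is hypothetical and does
not exist in m4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in m4): report whether the LEN bytes at S
   contain byte C.  Because the length is passed explicitly, embedded
   NUL bytes in S do not end the search early.  */
static int
has_byte (const char *s, size_t len, char c)
{
  assert (s != NULL);
  return memchr (s, c, len) != NULL;
}
```

For the four bytes a, NUL, -, z, strchr stops at the NUL and never sees
the dash, while has_byte finds it because the caller supplies the real
length (ARG_LEN in the patch).
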
 ChangeLog         |   26 ++++++
 doc/m4.texinfo    |   15 +++-
 examples/null.err |   11 ++-
 examples/null.m4  |   30 +++++--
 examples/null.out |    6 +-
 src/builtin.c     |  232 ++++++++++++++++++++++++++++++++++------------------
 src/eval.c        |   33 ++++++--
 src/format.c      |   43 ++++++----
 src/m4.h          |    2 +-
 9 files changed, 275 insertions(+), 123 deletions(-)
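
Illustration only, not part of the commit: the length-counted range
expansion that expand_ranges now performs for translit can be sketched
in isolation roughly as below.  This is a simplified stand-in under
stated assumptions: struct bytes, bytes_push and expand_ranges_sketch
are hypothetical names, a realloc-grown buffer replaces the obstack,
and the loop counters are int rather than unsigned char so that a range
ending at byte 0 or 255 cannot wrap around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer, standing in for m4's obstack.  */
struct bytes
{
  unsigned char *data;
  size_t len;
  size_t cap;
};

/* Append one byte, growing the buffer as needed.  */
static void
bytes_push (struct bytes *b, int c)
{
  if (b->len == b->cap)
    {
      size_t cap = b->cap ? 2 * b->cap : 16;
      unsigned char *p = realloc (b->data, cap);
      if (p == NULL)
        abort ();
      b->data = p;
      b->cap = cap;
    }
  b->data[b->len++] = (unsigned char) c;
}

/* Expand translit-style ranges in the LEN bytes at S into OUT.  S may
   contain NUL bytes; the end is found by pointer comparison, never by
   looking for a terminator.  LEN must be nonzero.  */
static void
expand_ranges_sketch (const char *s, size_t len, struct bytes *out)
{
  const char *end = s + len;
  int from, to, c;

  assert (len != 0);
  from = (unsigned char) *s++;
  bytes_push (out, from);      /* A leading dash is literal.  */

  for (; s != end; from = (unsigned char) *s++)
    {
      if (*s != '-')
        {
          bytes_push (out, (unsigned char) *s);
          continue;
        }
      if (++s == end)
        {
          bytes_push (out, '-');        /* A trailing dash is literal.  */
          break;
        }
      to = (unsigned char) *s;
      if (from <= to)
        for (c = from + 1; c <= to; c++)        /* Ascending range.  */
          bytes_push (out, c);
      else
        for (c = from - 1; c >= to; c--)        /* Descending range.  */
          bytes_push (out, c);
    }
}
```

With a zero-initialized struct bytes, the 3-byte input 9-0 expands to
the 10 bytes 9876543210, and a range whose first endpoint is NUL is
expanded like any other byte value.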

diff --git a/ChangeLog b/ChangeLog
index 2085dea..e991a8c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,29 @@
+2008-12-03  Eric Blake  <address@hidden>
+
+       Stage 27: Allow embedded NUL in text processing macros.
+       Pass NUL through regular expressions, format, and translit, and
+       diagnose it in eval.  Improve warning capabilities of format.
+       Memory impact: none.
+       Speed impact: none noticed.
+       * src/m4.h (evaluate): Add parameter.
+       * src/builtin.c (compile_pattern) [DEBUG_REGEX]: Support NUL in
+       output messages.
+       (set_macro_sequence): Likewise.
+       (m4_eval): Normalize messages, and adjust caller.
+       (expand_ranges, substitute): Support NUL in macro expansion.
+       (m4_translit, m4_regexp, m4_patsubst): Adjust callers, to manage
+       NUL bytes.
+       * src/format.c (expand_format): Manage NUL bytes.
+       * src/eval.c (eval_error): Add EMPTY_ARGUMENT.
+       (end_text): New variable.
+       (eval_init_lex): Add parameter.
+       (eval_lex, evaluate): Detect NUL in macro expansion.
+       * doc/m4.texinfo (Format): Update to cover new behavior.
+       (Eval): Mention that result is unquoted.
+       * examples/null.m4: Enhance test.
+       * examples/null.err: Update expected output.
+       * examples/null.out: Likewise.
+
 2008-11-28  Eric Blake  <address@hidden>
 
        Add extension to divert builtin.
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index 8301bb7..2fb676d 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -6448,7 +6448,7 @@ Format
 @example
 format(`%p', `0')
 @error{}m4:stdin:1: Warning: format: unrecognized specifier in `%p'
address@hidden
address@hidden
 format(`%*d', `')
 @error{}m4:stdin:2: Warning: format: empty string treated as 0
 @error{}m4:stdin:2: Warning: format: too few arguments: 2 < 3
@@ -6734,7 +6734,9 @@ Eval
 @var{radix} is the empty string.  A warning results if the radix is
 outside the range of 1 through 36, inclusive.  The result of @code{eval}
 is always taken to be signed.  No radix prefix is output, and for
-radices greater than 10, the digits are lower case.  The @var{width}
+radices greater than 10, the digits are lower case (although some
+other implementations use upper case).  The output is unquoted, and
+subject to further macro expansion.  The @var{width}
 argument specifies the minimum output width, excluding any negative
 sign.  The result is zero-padded to extend the expansion to the
 requested width.  A warning results if the width is negative.  If
@@ -6759,14 +6761,19 @@ Eval
 eval(`10', `16')
 @result{}a
 eval(`1', `37')
-@error{}m4:stdin:9: Warning: eval: radix 37 out of range
+@error{}m4:stdin:9: Warning: eval: radix out of range: 37
 @result{}
 eval(`1', , `-1')
-@error{}m4:stdin:10: Warning: eval: negative width
+@error{}m4:stdin:10: Warning: eval: negative width: -1
 @result{}
 eval()
 @error{}m4:stdin:11: Warning: eval: empty string treated as 0
 @result{}0
+eval(` ')
+@error{}m4:stdin:12: Warning: eval: empty string treated as 0
+@result{}0
+define(`a', `hi')eval(` 10 ', `16')
+@result{}hi
 @end example
 
 @node Shell commands
diff --git a/examples/null.err b/examples/null.err
index 897ce34..977b3b7 100644
--- a/examples/null.err
+++ b/examples/null.err
@@ -16,9 +16,16 @@ m4:examples/null.m4:67: Warning: dumpdef: undefined macro `\0-\0'
 --:    `dashes'
 body:  `--'
 errprint: -- --
+format:
+m4:examples/null.m4:84: Warning: format: unrecognized specifier in `%\0%'
+m4:examples/null.m4:84: Warning: format: unrecognized specifier in `%\0%'
 indir:
-m4:examples/null.m4:98: Warning: indir: undefined macro `\0-\0'
-m4:examples/null.m4:100: Warning: \0\0%%: extra arguments ignored: 1 > 0
+m4:examples/null.m4:101: Warning: indir: undefined macro `\0-\0'
+m4:examples/null.m4:103: Warning: \0\0%%: extra arguments ignored: 1 > 0
+patsubst:
+m4:examples/null.m4:116: Warning: patsubst: bad regular expression `\\\0\\': Trailing backslash
+regexp:
+m4:examples/null.m4:136: Warning: regexp: bad regular expression `\\\0\\': Trailing backslash
 traceon:
 m4trace: -1- --(`--') -> `strange: --'
 m4trace: -1- body -> `-'
diff --git a/examples/null.m4 b/examples/null.m4
index 1823073..e60aec5 100644
--- a/examples/null.m4
+++ b/examples/null.m4
@@ -77,8 +77,11 @@ dnl Generated from esyscmd:
 `esyscmd:' esyscmd(__program__` -DNUL '__file__) sysval
 dnl First argument of eval: not tested yet. NUL not a number, needs to warn
 dnl Other arguments of eval: not tested yet, needs to warn
-dnl First argument to format: not tested yet
-dnl Invalid specifier in format: not tested yet, needs to warn
+dnl First argument to format:
+`format:' format(`%s%s', `-', `-')dnl
+dnl Invalid specifier in format:
+errprint(`format:
+') format(`%%')
 dnl Numeric and string arguments to format: not tested yet, needs to warn
 dnl Character argument to format: not tested yet, %c semantics needed
 dnl Macro name in ifdef, passed through ifdef:
@@ -108,14 +111,17 @@ m4wrap(``m4wrap:' --
 ')dnl
 dnl Warning from maketemp: not tested yet. No file name includes NUL, needs to warn
 dnl Warning from mkstemp: not tested yet. No file name includes NUL, needs to warn
-dnl Bad regex in patsubst: not tested yet
+dnl Bad regex in patsubst:
+errprint(`patsubst:
+')patsubst(`a', `\\')dnl
 dnl First argument of patsubst:
 `patsubst:' patsubst(`--', `-', `.')dnl
 dnl Matching via meta-character in patsubst:
  patsubst(`--', `[^-]')dnl
 dnl Second argument of patsubst:
  patsubst(`abc', `b', `-') patsubst(`--', `', `!')dnl
-dnl Third argument of patsubst: not tested yet
+dnl Third argument of patsubst:
+ patsubst(`-!-', `!', `')dnl
 dnl Replacement via reference in patsubst:
  patsubst(`----', `-\(.\)-', `\1-\1')
 dnl Defined argument of popdef:
@@ -125,14 +131,17 @@ dnl Macro name of pushdef:
 `pushdef:' pushdef(`--', `strange: $1')ifdef(`--', `ok', `oops')`'dnl
 dnl Definition of pushdef:
  pushdef(`body', `-')body
-dnl Bad regex in regexp: not tested yet
+dnl Bad regex in regexp:
+errprint(`regexp:
+')regexp(`a', `\\')dnl
 dnl First argument of regexp:
 `regexp:' regexp(`ab', `b')dnl
 dnl Matching via meta-character in regexp:
  regexp(`--', `[^-]', `!')dnl
 dnl Second argument of regexp:
  regexp(`--', `')dnl
-dnl Third argument of regexp: not tested yet
+dnl Third argument of regexp:
+ regexp(`!', `!', `--')dnl
 dnl Replacement via reference in regexp:
  regexp(`----', `-\(.\)-', `\1-\1')
 dnl Passed through shift:
@@ -150,9 +159,12 @@ dnl Macro name and arguments of traceon:
 ')traceon(`--')indir(`--', `--')dnl
 dnl Defined text of traceon:
  traceon(`body')body
-dnl First argument of translit: not tested yet
-dnl Single character in other arguments of translit: not tested yet
-dnl Character ranges of translit: not tested yet
+dnl First argument of translit:
+`translit:' translit(`..', `.', `-')dnl
+dnl Single character in other arguments of translit:
+ translit(`.', `.', `.')dnl
+dnl Character ranges of translit:
+ translit(`abcd', `-b')
 dnl Defined argument of undefine:
 `undefine:' undefine(`--')ifdef(`--', `oops', `ok')
 dnl Undefined argument of undefine: not tested yet. Should it warn?
diff --git a/examples/null.out b/examples/null.out
index dd83416..c2c1cb9 100644
--- a/examples/null.out
+++ b/examples/null.out
@@ -11,17 +11,19 @@ define: --
 defn: `$0': $1 --
 divert: --
 esyscmd: [] 0
+format: -- 
 ifdef: yes: -- no: --
 ifelse: yes: --
 index: 2 -1 -1 8
 indir: --: 11 0 3
 len: 1 3
-patsubst: .. -- abc -!- ---
+patsubst: .. -- abc -!- -- ---
 popdef: ok
 pushdef: ok -
-regexp: 2 ! 0 -
+regexp: 2 ! 1 -- -
 shift: --,--
 substr: --
 traceon: strange: -- -
+translit: -- .. cd
 undefine: ok
 m4wrap: --
diff --git a/src/builtin.c b/src/builtin.c
index 24f2df6..613e1d2 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -311,7 +311,11 @@ compile_pattern (const char *str, size_t len, struct re_pattern_buffer **buf,
        regex_cache[i].count++;
 #ifdef DEBUG_REGEX
        if (trace_file)
-         xfprintf (trace_file, "cached:{%s}\n", str);
+         {
+           fputs ("cached:{", trace_file);
+           fwrite (str, 1, len, trace_file);
+           fputs ("}\n", trace_file);
+         }
 #endif /* DEBUG_REGEX */
        return NULL;
       }
@@ -321,7 +325,11 @@ compile_pattern (const char *str, size_t len, struct re_pattern_buffer **buf,
   msg = re_compile_pattern (str, len, new_buf);
 #ifdef DEBUG_REGEX
   if (trace_file)
-    xfprintf (trace_file, "compile:{%s}\n", str);
+    {
+      fputs ("compile:{", trace_file);
+      fwrite (str, 1, len, trace_file);
+      fputs ("}\n", trace_file);
+    }
 #endif /* DEBUG_REGEX */
   if (msg)
     {
@@ -356,7 +364,11 @@ compile_pattern (const char *str, size_t len, struct re_pattern_buffer **buf,
     {
 #ifdef DEBUG_REGEX
       if (trace_file)
-       xfprintf (trace_file, "flush:{%s}\n", victim->str);
+       {
+         fputs ("flush:{", trace_file);
+         fwrite (victim->str, 1, victim->len, trace_file);
+         fputs ("}\n", trace_file);
+       }
 #endif /* DEBUG_REGEX */
       free (victim->str);
       regfree (victim->buf);
@@ -404,8 +416,8 @@ set_macro_sequence (const char *regexp)
   msg = re_compile_pattern (regexp, strlen (regexp), &macro_sequence_buf);
   if (msg != NULL)
     m4_error (EXIT_FAILURE, 0, NULL,
-             _("--warn-macro-sequence: bad regular expression `%s': %s"),
-             regexp, msg);
+             _("--warn-macro-sequence: bad regular expression %s: %s"),
+             quotearg_style (locale_quoting_style, regexp), msg);
   re_set_registers (&macro_sequence_buf, &macro_sequence_regs,
                    macro_sequence_regs.num_regs,
                    macro_sequence_regs.start, macro_sequence_regs.end);
@@ -1208,7 +1220,7 @@ m4_eval (struct obstack *obs, int argc, macro_arguments *argv)
 
   if (radix < 1 || radix > 36)
     {
-      m4_warn (0, me, _("radix %d out of range"), radix);
+      m4_warn (0, me, _("radix out of range: %d"), radix);
       return;
     }
 
@@ -1216,13 +1228,11 @@ m4_eval (struct obstack *obs, int argc, macro_arguments *argv)
     return;
   if (min < 0)
     {
-      m4_warn (0, me, _("negative width"));
+      m4_warn (0, me, _("negative width: %d"), min);
       return;
     }
 
-  if (arg_empty (argv, 1))
-    m4_warn (0, me, _("empty string treated as 0"));
-  else if (evaluate (me, ARG (1), &value))
+  if (evaluate (me, ARG (1), ARG_LEN (1), &value))
     return;
 
   if (radix == 1)
@@ -1887,34 +1897,42 @@ m4_substr (struct obstack *obs, int argc, macro_arguments *argv)
   obstack_grow (obs, ARG (1) + start, length);
 }
 
-/*------------------------------------------------------------------------.
-| For "translit", ranges are allowed in the second and third argument.   |
-| They are expanded in the following function, and the expanded strings,  |
-| without any ranges left, are used to translate the characters of the   |
-| first argument.  A single - (dash) can be included in the strings by   |
-| being the first or the last character in the string.  If the first     |
-| character in a range is after the first in the character set, the range |
-| is made backwards, thus 9-0 is the string 9876543210.                   |
-`------------------------------------------------------------------------*/
+/*------------------------------------------------------------------.
+| For "translit", ranges are allowed in the second and third        |
+| argument.  They are expanded in the following function, and the   |
+| expanded strings, without any ranges left, are used to translate  |
+| the characters of the first argument.  A single - (dash) can be   |
+| included in the strings by being the first or the last character  |
+| in the string.  If the first character in a range is after the    |
+| first in the character set, the range is made backwards, thus 9-0 |
+| is the string 9876543210.  This function expands S of length *LEN |
+| using OBS for the expansion, sets *LEN to the new length, and     |
+| returns the expansion.                                            |
+`------------------------------------------------------------------*/
 
 static const char *
-expand_ranges (const char *s, struct obstack *obs)
+expand_ranges (const char *s, size_t *len, struct obstack *obs)
 {
   unsigned char from;
   unsigned char to;
+  const char *end = s + *len;
+
+  assert (s != end);
+  from = *s++;
+  obstack_1grow (obs, from);
 
-  for (from = '\0'; *s != '\0'; from = to_uchar (*s++))
+  for ( ; s != end; from = *s++)
     {
-      if (*s == '-' && from != '\0')
+      if (*s == '-')
        {
-         to = to_uchar (*++s);
-         if (to == '\0')
+         if (++s == end)
            {
              /* trailing dash */
              obstack_1grow (obs, '-');
              break;
            }
-         else if (from <= to)
+         to = *s;
+         if (from <= to)
            {
              while (from++ < to)
                obstack_1grow (obs, from);
@@ -1928,7 +1946,7 @@ expand_ranges (const char *s, struct obstack *obs)
       else
        obstack_1grow (obs, *s);
     }
-  obstack_1grow (obs, '\0');
+  *len = obstack_object_size (obs);
   return (char *) obstack_finish (obs);
 }
 
@@ -1946,25 +1964,32 @@ m4_translit (struct obstack *obs, int argc, macro_arguments *argv)
   const char *data;
   const char *from;
   const char *to;
+  size_t from_len;
+  size_t to_len;
   char map[UCHAR_MAX + 1] = {0};
   char found[UCHAR_MAX + 1] = {0};
   unsigned char ch;
 
-  if (bad_argc (arg_info (argv), argc, 2, 3))
+  enum { ASIS, REPLACE, DELETE };
+
+  if (bad_argc (arg_info (argv), argc, 2, 3) || arg_empty (argv, 1)
+      || arg_empty (argv, 2))
     {
       /* builtin(`translit') is blank, but translit(`abc') is abc.  */
-      if (argc == 2)
+      if (argc >= 2)
        push_arg (obs, argv, 1);
       return;
     }
 
   from = ARG (2);
-  if (strchr (from, '-') != NULL)
-    from = expand_ranges (from, arg_scratch ());
+  from_len = ARG_LEN (2);
+  if (memchr (from, '-', from_len) != NULL)
+    from = expand_ranges (from, &from_len, arg_scratch ());
 
   to = ARG (3);
-  if (strchr (to, '-') != NULL)
-    to = expand_ranges (to, arg_scratch ());
+  to_len = ARG_LEN (3);
+  if (memchr (to, '-', to_len) != NULL)
+    to = expand_ranges (to, &to_len, arg_scratch ());
 
   assert (from && to);
 
@@ -1974,23 +1999,45 @@ m4_translit (struct obstack *obs, int argc, macro_arguments *argv)
      pass of data, for linear behavior.  Traditional behavior is that
      only the first instance of a character in from is consulted,
      hence the found map.  */
-  for ( ; (ch = *from) != '\0'; from++)
+  while (from_len--)
     {
-      if (!found[ch])
+      ch = *from++;
+      if (found[ch] == ASIS)
+       {
+         if (to_len)
+           {
+             found[ch] = REPLACE;
+             map[ch] = *to;
+           }
+         else
+           found[ch] = DELETE;
+       }
+      if (to_len)
        {
-         found[ch] = 1;
-         map[ch] = *to;
+         to++;
+         to_len--;
        }
-      if (*to != '\0')
-       to++;
     }
 
-  for (data = ARG (1); (ch = *data) != '\0'; data++)
+  data = ARG (1);
+  from_len = ARG_LEN (1);
+  while (from_len--)
     {
-      if (!found[ch])
-       obstack_1grow (obs, ch);
-      else if (map[ch])
-       obstack_1grow (obs, map[ch]);
+      ch = *data++;
+      switch (found[ch])
+       {
+       case ASIS:
+         obstack_1grow (obs, ch);
+         break;
+       case REPLACE:
+         obstack_1grow (obs, map[ch]);
+         break;
+       case DELETE:
+         break;
+       default:
+         assert (!"m4_translit");
+         abort ();
+       }
     }
 }
 
@@ -2020,20 +2067,27 @@ static int substitute_warned = 0;
 
 static void
 substitute (struct obstack *obs, const call_info *me, const char *victim,
-           const char *repl, struct re_registers *regs)
+           const char *repl, size_t repl_len, struct re_registers *regs)
 {
   int ch;
 
-  for (;;)
+  while (repl_len--)
     {
-      while ((ch = *repl++) != '\\')
+      ch = *repl++;
+      if (ch != '\\')
        {
-         if (ch == '\0')
-           return;
          obstack_1grow (obs, ch);
+         continue;
+       }
+      if (!repl_len)
+       {
+         m4_warn (0, me, _("trailing \\ ignored in replacement"));
+         return;
        }
 
-      switch ((ch = *repl++))
+      ch = *repl++;
+      repl_len--;
+      switch (ch)
        {
        case '0':
          if (!substitute_warned)
@@ -2060,10 +2114,6 @@ substitute (struct obstack *obs, const call_info *me, const char *victim,
                          regs->end[ch] - regs->start[ch]);
          break;
 
-       case '\0':
-         m4_warn (0, me, _("trailing \\ ignored in replacement"));
-         return;
-
        default:
          obstack_1grow (obs, ch);
          break;
@@ -2122,26 +2172,36 @@ m4_regexp (struct obstack *obs, int argc, macro_arguments *argv)
   regexp = ARG (2);
   repl = ARG (3);
 
-  if (!*regexp)
+  if (arg_empty (argv, 2))
     {
       /* The empty regex matches everything!  */
       if (argc == 3)
        shipout_int (obs, 0);
       else
-       substitute (obs, me, victim, repl, NULL);
+       substitute (obs, me, victim, repl, ARG_LEN (3), NULL);
       return;
     }
 
 #ifdef DEBUG_REGEX
   if (trace_file)
-    xfprintf (trace_file, "r:{%s}:%s%s%s\n", regexp,
-             argc == 3 ? "" : "{", repl, argc == 3 ? "" : "}");
+    {
+      fputs ("r:{", trace_file);
+      fwrite (regexp, 1, ARG_LEN (2), trace_file);
+      if (argc > 3)
+       {
+         fputs ("}:{", trace_file);
+         fwrite (repl, 1, ARG_LEN (3), trace_file);
+       }
+      fputs ("}\n", trace_file);
+    }
 #endif /* DEBUG_REGEX */
 
   msg = compile_pattern (regexp, ARG_LEN (2), &buf, &regs);
   if (msg != NULL)
     {
-      m4_warn (0, me, _("bad regular expression: `%s': %s"), regexp, msg);
+      m4_warn (0, me, _("bad regular expression %s: %s"),
+              quotearg_style_mem (locale_quoting_style, regexp, ARG_LEN (2)),
+              msg);
       return;
     }
 
@@ -2151,11 +2211,12 @@ m4_regexp (struct obstack *obs, int argc, macro_arguments *argv)
                        argc == 3 ? NULL : regs);
 
   if (startpos == -2)
-    m4_warn (0, me, _("problem matching regular expression `%s'"), regexp);
+    m4_warn (0, me, _("problem matching regular expression %s"),
+            quotearg_style_mem (locale_quoting_style, regexp, ARG_LEN (2)));
   else if (argc == 3)
     shipout_int (obs, startpos);
   else if (startpos >= 0)
-    substitute (obs, me, victim, repl, regs);
+    substitute (obs, me, victim, repl, ARG_LEN (3), regs);
 }
 
 /*------------------------------------------------------------------.
@@ -2170,16 +2231,17 @@ static void
 m4_patsubst (struct obstack *obs, int argc, macro_arguments *argv)
 {
   const call_info *me = arg_info (argv);
-  const char *victim;          /* first argument */
-  const char *regexp;          /* regular expression */
-  const char *repl;
-
-  struct re_pattern_buffer *buf;/* compiled regular expression */
-  struct re_registers *regs;   /* for subexpression matches */
-  const char *msg;             /* error message from re_compile_pattern */
-  int matchpos;                        /* start position of match */
-  int offset;                  /* current match offset */
-  int length;                  /* length of first argument */
+  const char *victim;          /* First argument.  */
+  const char *regexp;          /* Regular expression.  */
+  const char *repl;            /* Replacement text.  */
+
+  struct re_pattern_buffer *buf;/* Compiled regular expression.  */
+  struct re_registers *regs;   /* For subexpression matches.  */
+  const char *msg;             /* Error message from re_compile_pattern.  */
+  int matchpos;                        /* Start position of match.  */
+  int offset;                  /* Current match offset.  */
+  int length;                  /* Length of first argument.  */
+  size_t repl_len;             /* Length of replacement.  */
 
   if (bad_argc (me, argc, 2, 3))
     {
@@ -2189,27 +2251,36 @@ m4_patsubst (struct obstack *obs, int argc, macro_arguments *argv)
       return;
     }
 
-  victim = ARG (1);
-  regexp = ARG (2);
-  repl = ARG (3);
-
   /* The empty regex matches everywhere, but if there is no
      replacement, we need not waste time with it.  */
-  if (!*regexp && !*repl)
+  if (arg_empty (argv, 2) && arg_empty (argv, 3))
     {
       push_arg (obs, argv, 1);
       return;
     }
 
+  victim = ARG (1);
+  regexp = ARG (2);
+  repl = ARG (3);
+  repl_len = ARG_LEN (3);
+
 #ifdef DEBUG_REGEX
   if (trace_file)
-    xfprintf (trace_file, "p:{%s}:{%s}\n", regexp, repl);
+    {
+      fputs ("p:{", trace_file);
+      fwrite (regexp, 1, ARG_LEN (2), trace_file);
+      fputs ("}:{", trace_file);
+      fwrite (repl, 1, repl_len, trace_file);
+      fputs ("}\n", trace_file);
+    }
 #endif /* DEBUG_REGEX */
 
   msg = compile_pattern (regexp, ARG_LEN (2), &buf, &regs);
   if (msg != NULL)
     {
-      m4_warn (0, me, _("bad regular expression `%s': %s"), regexp, msg);
+      m4_warn (0, me, _("bad regular expression %s: %s"),
+              quotearg_style_mem (locale_quoting_style, regexp, ARG_LEN (2)),
+              msg);
       return;
     }
 
@@ -2229,8 +2300,9 @@ m4_patsubst (struct obstack *obs, int argc, macro_arguments *argv)
             copied verbatim.  */
 
          if (matchpos == -2)
-           m4_warn (0, me, _("problem matching regular expression `%s'"),
-                    regexp);
+           m4_warn (0, me, _("problem matching regular expression %s"),
+                    quotearg_style_mem (locale_quoting_style, regexp,
+                                        ARG_LEN (2)));
          else if (offset < length)
            obstack_grow (obs, victim + offset, length - offset);
          break;
@@ -2243,7 +2315,7 @@ m4_patsubst (struct obstack *obs, int argc, macro_arguments *argv)
 
       /* Handle the part of the string that was covered by the match.  */
 
-      substitute (obs, me, victim, repl, regs);
+      substitute (obs, me, victim, repl, repl_len, regs);
 
       /* Update the offset to the end of the match.  If the regexp
         matched a null string, advance offset one more, to avoid
diff --git a/src/eval.c b/src/eval.c
index e2e600b..1b617ed 100644
--- a/src/eval.c
+++ b/src/eval.c
@@ -58,7 +58,8 @@ typedef enum eval_error
     MISSING_RIGHT,
     UNKNOWN_INPUT,
     EXCESS_INPUT,
-    INVALID_OPERATOR
+    INVALID_OPERATOR,
+    EMPTY_ARGUMENT
   }
 eval_error;
 
@@ -87,10 +88,15 @@ static const char *eval_text;
    can back up, if we have read too much.  */
 static const char *last_text;
 
+/* Detect when to end parsing.  */
+static const char *end_text;
+
+/* Prime the lexer at the start of TEXT, with length LEN.  */
 static void
-eval_init_lex (const char *text)
+eval_init_lex (const char *text, size_t len)
 {
   eval_text = text;
+  end_text = text + len;
   last_text = NULL;
 }
 
@@ -105,12 +111,12 @@ eval_undo (void)
 static eval_token
 eval_lex (int32_t *val)
 {
-  while (isspace (to_uchar (*eval_text)))
+  while (eval_text != end_text && isspace (to_uchar (*eval_text)))
     eval_text++;
 
   last_text = eval_text;
 
-  if (*eval_text == '\0')
+  if (eval_text == end_text)
     return EOTEXT;
 
   if (isdigit (to_uchar (*eval_text)))
@@ -287,14 +293,17 @@ eval_lex (int32_t *val)
 `---------------------------------------*/
 
 bool
-evaluate (const call_info *me, const char *expr, int32_t *val)
+evaluate (const call_info *me, const char *expr, size_t len, int32_t *val)
 {
   eval_token et;
   eval_error err;
 
-  eval_init_lex (expr);
+  eval_init_lex (expr, len);
   et = eval_lex (val);
-  err = logical_or_term (me, et, val);
+  if (et == EOTEXT)
+    err = EMPTY_ARGUMENT;
+  else
+    err = logical_or_term (me, et, val);
 
   if (err == NO_ERROR && *eval_text != '\0')
     {
@@ -306,9 +315,15 @@ evaluate (const call_info *me, const char *expr, int32_t *val)
 
   switch (err)
     {
+      /* Cases where result is printed.  */
     case NO_ERROR:
-      break;
+      return false;
+
+    case EMPTY_ARGUMENT:
+      m4_warn (0, me, _("empty string treated as 0"));
+      return false;
 
+      /* Cases where error makes result meaningless.  */
     case MISSING_RIGHT:
       m4_warn (0, me, _("bad expression (missing right parenthesis): %s"),
               expr);
@@ -347,7 +362,7 @@ evaluate (const call_info *me, const char *expr, int32_t *val)
       abort ();
     }
 
-  return err != NO_ERROR;
+  return true;
 }
 
 /*---------------------------.
diff --git a/src/format.c b/src/format.c
index 3325853..8b2b11a 100644
--- a/src/format.c
+++ b/src/format.c
@@ -126,11 +126,12 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
 {
   const call_info *me = arg_info (argv);/* Macro name.  */
   const char *f;                       /* Format control string.  */
+  size_t f_len;                                /* Length of f.  */
   const char *fmt;                     /* Position within f.  */
   char fstart[] = "%'+- 0#*.*hhd";     /* Current format spec.  */
   char *p;                             /* Position within fstart.  */
   unsigned char c;                     /* A simple character.  */
-  int i = 0;                           /* Index within argc used so far.  */
+  int i = 1;                           /* Index within argc used so far.  */
   bool valid_format = true;            /* True if entire format string ok.  */
 
   /* Flags.  */
@@ -159,25 +160,24 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
   int result = 0;
   enum {CHAR, INT, LONG, DOUBLE, STR} datatype;
 
-  f = fmt = ARG_STR (i, argc, argv);
+  f = fmt = ARG (1);
+  f_len = ARG_LEN (1);
+  assert (!f[f_len]); /* Requiring a terminating NUL makes parsing simpler.  */
   memset (ok, 0, sizeof ok);
-  while (true)
+  while (f_len--)
     {
-      while ((c = *fmt++) != '%')
+      c = *fmt++;
+      if (c != '%')
        {
-         if (c == '\0')
-           {
-             if (valid_format)
-               bad_argc (me, argc, i, i);
-             return;
-           }
          obstack_1grow (obs, c);
+         continue;
        }
 
       if (*fmt == '%')
        {
          obstack_1grow (obs, '%');
          fmt++;
+         f_len--;
          continue;
        }
 
@@ -228,7 +228,7 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
              break;
            }
        }
-      while (!(flags & DONE) && fmt++);
+      while (!(flags & DONE) && (f_len--, fmt++));
       if (flags & THOUSANDS)
        *p++ = '\'';
       if (flags & PLUS)
@@ -250,12 +250,14 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
        {
          width = ARG_INT (i, argc, argv);
          fmt++;
+         f_len--;
        }
       else
        while (isdigit (to_uchar (*fmt)))
          {
            width = 10 * width + *fmt - '0';
            fmt++;
+           f_len--;
          }
 
       /* Maximum precision; an explicit negative precision is the same
@@ -266,10 +268,12 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
       if (*fmt == '.')
        {
          ok['c'] = 0;
+         f_len--;
          if (*(++fmt) == '*')
            {
              prec = ARG_INT (i, argc, argv);
              ++fmt;
+             f_len--;
            }
          else
            {
@@ -278,6 +282,7 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
                {
                  prec = 10 * prec + *fmt - '0';
                  fmt++;
+                 f_len--;
                }
            }
        }
@@ -288,30 +293,34 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
          *p++ = 'l';
          lflag = 1;
          fmt++;
+         f_len--;
          ok['c'] = ok['s'] = 0;
        }
       else if (*fmt == 'h')
        {
          *p++ = 'h';
          fmt++;
+         f_len--;
          if (*fmt == 'h')
            {
              *p++ = 'h';
              fmt++;
+             f_len--;
            }
          ok['a'] = ok['A'] = ok['c'] = ok['e'] = ok['E'] = ok['f'] = ok['F']
            = ok['g'] = ok['G'] = ok['s'] = 0;
        }
 
-      c = *fmt++;
-      if (c > sizeof ok || !ok[c])
+      c = *fmt;
+      if (c > sizeof ok || !ok[c] || !f_len)
        {
-         m4_warn (0, me, _("unrecognized specifier in `%s'"), f);
+         m4_warn (0, me, _("unrecognized specifier in %s"),
+                  quotearg_style_mem (locale_quoting_style, f, ARG_LEN (1)));
          valid_format = false;
-         if (c == '\0')
-           fmt--;
          continue;
        }
+      fmt++;
+      f_len--;
 
       /* Specifiers.  We don't yet recognize C, S, n, or p.  */
       switch (c)
@@ -385,4 +394,6 @@ expand_format (struct obstack *obs, int argc, macro_arguments *argv)
         we constructed fstart, the result should not be negative.  */
       assert (0 <= result);
     }
+  if (valid_format)
+    bad_argc (me, argc, i, i);
 }
diff --git a/src/m4.h b/src/m4.h
index f643e49..76c697b 100644
--- a/src/m4.h
+++ b/src/m4.h
@@ -549,7 +549,7 @@ FILE *m4_path_search (const char *, char **);
 
 /* File: eval.c  --- expression evaluation.  */
 
-bool evaluate (const call_info *, const char *, int32_t *);
+bool evaluate (const call_info *, const char *, size_t, int32_t *);
 
 /* File: format.c  --- printf like formatting.  */
 
-- 
1.6.0.4
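
For readers skimming the format.c hunks: the core change is that the
scanning loop is driven by a remaining-length counter (f_len) instead of
stopping at the first '\0', and every fmt++ is paired with an f_len--.
Below is a minimal standalone sketch of that idea, not code from the
patch.  The helper name scan_counted and its simplified behavior (it
merely counts directives rather than formatting arguments) are
assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of m4.  Copy FMT (LEN bytes, possibly
   containing NUL) into OUT, collapsing "%%" to "%" and dropping every
   other two-byte "%<char>" directive.  Return the number of bytes
   written, and store the number of dropped directives in *NDIRECTIVES.
   OUT must have room for at least LEN bytes.  */
static size_t
scan_counted (const char *fmt, size_t len, char *out, size_t *ndirectives)
{
  size_t n = 0;

  assert (fmt && out && ndirectives);
  *ndirectives = 0;
  while (len--)
    {
      char c = *fmt++;
      if (c != '%')
        {
          /* Literal byte; '\0' is copied like any other byte.  */
          out[n++] = c;
          continue;
        }
      if (!len)
        break;          /* Lone trailing '%': nothing left to consume.  */
      if (*fmt == '%')
        out[n++] = '%';
      else
        ++*ndirectives;
      /* Consume the second byte, keeping pointer and count in step.  */
      fmt++;
      len--;
    }
  return n;
}
```

Calling it on the 8-byte input "a\0b%%c%d" with len 8 copies the five
bytes a, NUL, b, %, c and reports one directive; a loop that tested for
'\0' would have stopped after the first byte.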

