m4-patches
Re: dumpdef a pushdef'd stack


From: Eric Blake
Subject: Re: dumpdef a pushdef'd stack
Date: Fri, 29 Sep 2006 06:25:51 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Thunderbird/1.5.0.7 Mnenhy/0.7.4.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Eric Blake on 9/27/2006 7:20 AM:
> I would like to do a followup to this patch, which does three things.
> First, --arglength currently does not validate its numeric argument, so
> 'm4 -loops' is equivalent to 'm4 -l0'; I'd like to see the first case at
> least warn on the invalid argument.

With this, -l and -L now validate their arguments, and accept scaled
arguments.  So, for example, 'm4 -L2k' sets the nesting limit to 2048.  I
still need to document the format of scaled arguments; it is probably
worth a new node in the manual.  I also think that eval should accept a
scaled number as its first argument, and point to the same to-be-written
node on how scaled numbers work.
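
To make the idea of a validated, scaled argument concrete, here is a
minimal standalone sketch.  It is not the patch's size_opt and does not
use gnulib's xstrtoul; parse_scaled_size is a hypothetical helper, and
it assumes only the k/K, M and G suffixes with each step scaling by
1024.  Unlike the real option parsing, it also requires at least one
leading digit, so a bare suffix is rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, NOT gnulib's xstrtoul: parse TEXT as a decimal
   count with an optional k/K, M or G suffix, each step scaling by 1024.
   Store the value in *RESULT and return 0, or return -1 if TEXT is
   malformed or the value does not fit in size_t.  */
static int
parse_scaled_size (const char *text, size_t *result)
{
  char *end;
  unsigned long int value;
  int shifts = 0;

  assert (text != NULL && result != NULL);

  /* strtoul accepts leading whitespace and a sign; insist on a digit
     so that inputs such as "-1" and "oops" are rejected up front.  */
  if (*text < '0' || *text > '9')
    return -1;

  errno = 0;
  value = strtoul (text, &end, 10);	/* base 10: "010" is ten */
  if (errno == ERANGE)
    return -1;

  /* At most one suffix character may follow the digits.  */
  switch (*end)
    {
    case '\0':
      break;
    case 'k':
    case 'K':
      shifts = 1;
      end++;
      break;
    case 'M':
      shifts = 2;
      end++;
      break;
    case 'G':
      shifts = 3;
      end++;
      break;
    default:
      return -1;
    }
  if (*end != '\0')
    return -1;

  /* Scale by 1024 per step, checking for overflow before multiplying.  */
  while (shifts-- > 0)
    {
      if (value > SIZE_MAX / 1024)
	return -1;
      value *= 1024;
    }
  if (value > SIZE_MAX)
    return -1;

  *result = value;
  return 0;
}
```

With this sketch, "2k" yields 2048 and "010" yields 10, while "oops",
"10oops", "-1" and a 30-digit number are all rejected, which is the
behavior that plain atoi cannot provide.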

Yesterday, I sorted the tests in options.at, so that today's patch makes it
easier to see where I added three tests to the testsuite.

2006-09-29  Eric Blake  <address@hidden>

        * ltdl/m4/gnulib-cache.m4: Augment with gnulib-tool --import
        xstrtol.
        * m4/system_.h (N_): Define.
        * src/main.c (main): Validate numeric arguments.
        (size_opt): New function, idea borrowed from coreutils.
        * m4/macro.c (expand_macro): -L0 implies no limit.
        * doc/m4.texinfo (Invoking m4): Document this change.
        * NEWS: Likewise.
        * tests/options.at (--arglength, --nesting-limit)
        (--regexp-syntax): New tests of argument validation.

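The "-L0 implies no limit" entry amounts to guarding the existing
comparison with a test for a nonzero limit.  A minimal sketch of that
rule in isolation follows; nesting_limit_exceeded and
first_failing_level are hypothetical names used only for illustration,
not functions in m4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the check in expand_macro: a LIMIT of
   zero disables the check, otherwise LEVEL may not exceed LIMIT.  */
static bool
nesting_limit_exceeded (size_t level, size_t limit)
{
  return limit > 0 && level > limit;
}

/* Simulate DEPTH nested expansions (DEPTH must be at least 1).
   Return the first level that trips the limit, or 0 if none does.  */
static size_t
first_failing_level (size_t depth, size_t limit)
{
  assert (depth > 0);
  for (size_t level = 1; level <= depth; level++)
    if (nesting_limit_exceeded (level, limit))
      return level;
  return 0;
}
```

Under these assumptions, nine levels of nesting with a limit of 5 fail
at level 6, while the same nesting with a limit of 0 never fails.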
- --
Life is short - so eat dessert first!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFHRDP84KuGfSFAYARAtRBAJ0c3dMibHX+N/M27cMLWd4Ey9AKUwCfSAAg
Kd4pfnhd5gv0kIvUGaaeVY8=
=fPtq
-----END PGP SIGNATURE-----
? ltdl/m4/double-slash-root.m4
? ltdl/m4/xstrtol.m4
Index: NEWS
===================================================================
RCS file: /sources/m4/m4/NEWS,v
retrieving revision 1.22
diff -u -p -r1.22 NEWS
--- NEWS        28 Sep 2006 04:22:33 -0000      1.22
+++ NEWS        29 Sep 2006 12:18:59 -0000
@@ -96,7 +96,14 @@ promoted to 2.0.
   the prefix `m4' or `__'.
 
 * The `-l'/`--arglength' command line argument now affects dumpdef
-  output as well as trace output.
+  output as well as trace output.  Also, it now performs argument
+  validation and accepts an optional multiplier suffix.
+
+* The `-L'/`--nesting-limit' command line option can now be set to 0
+  to remove the default limit.  However, it is still possible that heavily
+  nested input can cause abrupt program termination due to stack
+  overflow.  Also, the option now performs argument validation and accepts
+  an optional multiplier suffix.
 
 * FIXME: `m4wrap' semantics need an update to FIFO.
 
Index: doc/m4.texinfo
===================================================================
RCS file: /sources/m4/m4/doc/m4.texinfo,v
retrieving revision 1.54
diff -u -p -r1.54 m4.texinfo
--- doc/m4.texinfo      28 Sep 2006 04:22:33 -0000      1.54
+++ doc/m4.texinfo      29 Sep 2006 12:19:00 -0000
@@ -609,7 +609,13 @@ loads the @samp{traditional} module in p
 @itemx address@hidden
 Artificially limit the nesting of macro calls to @var{NUM} levels,
 stopping program execution if this limit is ever exceeded.  When not
-specified, nesting is limited to 1024 levels.
+specified, nesting is limited to 1024 levels.  A value of zero means
+unlimited; but then heavily nested code could potentially cause a stack
+overflow.  @var{NUM} can have an optional scaling suffix.
+@comment FIXME - need a node on what scaling suffixes are supported (see
+@comment [info coreutils 'block size'] for ideas), and need to consider
+@comment whether builtins should also understand scaling suffixes:
+@comment eval, mpeval, perhaps format
 
 The precise effect of this option might be more correctly associated
 with textual nesting than dynamic recursion.  It has been useful
@@ -695,10 +701,12 @@ future release.
 Restrict the size of the output generated by macro tracing or by
 @code{dumpdef} to @var{NUM} characters per string.  If unspecified or
 zero, output is unlimited.  @xref{Debug Levels}, for more details.
+@var{NUM} can have an optional scaling suffix.
 @comment FIXME - should we add a debuglen macro that can alter this
 @comment setting on the fly?  Also, should we add an option that
 @comment controls whether output strings are sanitized with escape
 @comment sequences, so that dumpdef is truly one line per macro?
+@comment FIXME - see comment on --nesting-limit about NUM.
 
 @item -t @var{NAME}
 @itemx address@hidden
Index: ltdl/m4/gnulib-cache.m4
===================================================================
RCS file: /sources/m4/m4/ltdl/m4/gnulib-cache.m4,v
retrieving revision 1.11
diff -u -p -r1.11 gnulib-cache.m4
--- ltdl/m4/gnulib-cache.m4     27 Sep 2006 13:21:05 -0000      1.11
+++ ltdl/m4/gnulib-cache.m4     29 Sep 2006 12:19:00 -0000
@@ -15,11 +15,11 @@
 
 
 # Specification in the form of a command-line invocation:
-#   gnulib-tool --import --dir=. --lib=libgnu --source-base=gnu --m4-base=ltdl/m4 --doc-base=doc --aux-dir=ltdl/config --libtool --macro-prefix=M4 assert binary-io cloexec close-stream dirname error exit fdl filenamecat fopen-safer free gendocs gettext gnupload mkstemp obstack progname regex regexprops-generic stdbool stdlib-safer strnlen strtol unlocked-io verror xalloc xalloc-die xstrndup xvasprintf
+#   gnulib-tool --import --dir=. --lib=libgnu --source-base=gnu --m4-base=ltdl/m4 --doc-base=doc --aux-dir=ltdl/config --libtool --macro-prefix=M4 assert binary-io cloexec close-stream dirname error exit fdl filenamecat fopen-safer free gendocs gettext gnupload mkstemp obstack progname regex regexprops-generic stdbool stdlib-safer strnlen strtol unlocked-io verror xalloc xalloc-die xstrndup xstrtol xvasprintf
 
 # Specification in the form of a few gnulib-tool.m4 macro invocations:
 gl_LOCAL_DIR([])
-gl_MODULES([assert binary-io cloexec close-stream dirname error exit fdl filenamecat fopen-safer free gendocs gettext gnupload mkstemp obstack progname regex regexprops-generic stdbool stdlib-safer strnlen strtol unlocked-io verror xalloc xalloc-die xstrndup xvasprintf])
+gl_MODULES([assert binary-io cloexec close-stream dirname error exit fdl filenamecat fopen-safer free gendocs gettext gnupload mkstemp obstack progname regex regexprops-generic stdbool stdlib-safer strnlen strtol unlocked-io verror xalloc xalloc-die xstrndup xstrtol xvasprintf])
 gl_AVOID([])
 gl_SOURCE_BASE([gnu])
 gl_M4_BASE([ltdl/m4])
Index: m4/macro.c
===================================================================
RCS file: /sources/m4/m4/m4/macro.c,v
retrieving revision 1.56
diff -u -p -r1.56 macro.c
--- m4/macro.c  28 Sep 2006 04:22:33 -0000      1.56
+++ m4/macro.c  29 Sep 2006 12:19:00 -0000
@@ -238,7 +238,8 @@ expand_macro (m4 *context, const char *n
   value = m4_get_symbol_value (symbol);
   VALUE_PENDING (value)++;
   expansion_level++;
-  if (expansion_level > m4_get_nesting_limit_opt (context))
+  if (m4_get_nesting_limit_opt (context) > 0
+      && expansion_level > m4_get_nesting_limit_opt (context))
     m4_error (context, EXIT_FAILURE, 0, _("\
 recursion limit of %d exceeded, use -L<N> to change it"),
              m4_get_nesting_limit_opt (context));
Index: m4/system_.h
===================================================================
RCS file: /sources/m4/m4/m4/system_.h,v
retrieving revision 1.15
diff -u -p -r1.15 system_.h
--- m4/system_.h        26 Sep 2006 13:19:26 -0000      1.15
+++ m4/system_.h        29 Sep 2006 12:19:00 -0000
@@ -51,10 +51,12 @@
 #ifndef _
 # ifdef ENABLE_NLS
 #  include <libintl.h>
-#  define _(Text) gettext ((Text))
+#  define _(Text) gettext (Text)
 # else
 #  define _(Text) (Text)
 # endif
+# define gettext_noop(Text) Text
+# define N_(Text) gettext_noop (Text)
 #endif
 
 
Index: src/main.c
===================================================================
RCS file: /sources/m4/m4/src/main.c,v
retrieving revision 1.88
diff -u -p -r1.88 main.c
--- src/main.c  26 Sep 2006 21:21:50 -0000      1.88
+++ src/main.c  29 Sep 2006 12:19:00 -0000
@@ -25,6 +25,7 @@
 #include "version-etc.h"
 #include "gnu/progname.h"
 #include "pathconf.h"
+#include "xstrtol.h"
 
 #include <limits.h>
 
@@ -236,6 +237,22 @@ enum interactive_choice
   INTERACTIVE_NO       /* -b specified last */
 };
 
+/* Convert OPT to size_t, reporting an error using MSGID if it does
+   not fit.  */
+static size_t
+size_opt (char const *opt, char const *msgid)
+{
+  unsigned long int size;
+  strtol_error status = xstrtoul (opt, NULL, 10, &size, "kKmMgGtTPEZY0");
+  if (SIZE_MAX < size && status == LONGINT_OK)
+    status = LONGINT_OVERFLOW;
+  if (status != LONGINT_OK)
+    STRTOL_FATAL_ERROR (opt, _(msgid), status);
+  return size;
+}
+
+
+/* Main entry point.  Parse arguments, load modules, then parse input.  */
 int
 main (int argc, char *const *argv, char *const *envp)
 {
@@ -243,6 +260,7 @@ main (int argc, char *const *argv, char 
   macro_definition *tail;
   macro_definition *defn;
   int optchar;                 /* option character */
+  size_t size;                 /* for parsing numeric option arguments */
 
   macro_definition *defines;
   FILE *fp;
@@ -291,8 +309,8 @@ main (int argc, char *const *argv, char 
 
       case 'H':
       case HASHSIZE_OPTION:
-        /* -H was supported in 1.4.x, but is a no-op now.  FIXME -
-            remove support for -H after 2.0.  */
+       /* -H was supported in 1.4.x, but is a no-op now.  FIXME -
+           remove support for -H after 2.0.  */
        error (0, 0, _("Warning: `%s' is deprecated"),
               optchar == 'H' ? "-H" : "--hashsize");
        break;
@@ -370,7 +388,8 @@ main (int argc, char *const *argv, char 
        break;
 
       case 'L':
-       m4_set_nesting_limit_opt (context, atoi (optarg));
+       size = size_opt (optarg, N_("nesting limit"));
+       m4_set_nesting_limit_opt (context, size);
        break;
 
       case 'M':
@@ -424,15 +443,15 @@ main (int argc, char *const *argv, char 
       case 'e':
        error (0, 0, _("Warning: `%s' is deprecated, use `%s' instead"),
               "-e", "-i");
-        /* fall through */
+       /* fall through */
       case 'i':
        interactive = INTERACTIVE_YES;
        break;
 
       case 'l':
-       m4_set_max_debug_arg_length_opt (context, atoi (optarg));
-       if (m4_get_max_debug_arg_length_opt (context) <= 0)
-         m4_set_max_debug_arg_length_opt (context, 0);
+       size = size_opt (optarg,
+                        N_("debug argument length"));
+       m4_set_max_debug_arg_length_opt (context, size);
        break;
 
       case 'o':
Index: tests/options.at
===================================================================
RCS file: /sources/m4/m4/tests/options.at,v
retrieving revision 1.13
diff -u -p -r1.13 options.at
--- tests/options.at    28 Sep 2006 16:14:09 -0000      1.13
+++ tests/options.at    29 Sep 2006 12:19:00 -0000
@@ -133,6 +133,53 @@ m@&address@hidden()
 AT_CLEANUP
 
 
+## --------- ##
+## arglength ##
+## --------- ##
+
+AT_SETUP([--arglength])
+
+dnl Check for argument validation.
+
+AT_DATA([in],
+[[define(`echo', `$@')dnl
+traceon(`echo')dnl
+echo(`long string')
+]])
+
+AT_CHECK_M4([--arglength=-1 in], [1], [],
+[[m4: invalid debug argument length `-1'
+]])
+
+AT_CHECK_M4([--arglength oops in], [1], [],
+[[m4: invalid debug argument length `oops'
+]])
+
+AT_CHECK_M4([-l 10oops in], [1], [],
+[[m4: invalid character following debug argument length in `10oops'
+]])
+
+dnl MiB is the suffix to implicit 1, resulting in 1048576
+AT_CHECK_M4([-lMiB in], [0], [[long string
+]], [[m4trace: -1- echo(`long string') -> ``long string''
+]])
+
+dnl this assumes size_t is no bigger than 64 bits
+AT_CHECK_M4([-l 123456789012345678901234567890 in], [1], [],
+[[m4: debug argument length `123456789012345678901234567890' too large
+]])
+
+AT_CHECK_M4([-l 3 in], [0], [[long string
+]], [[m4trace: -1- echo(`lon...') -> ``lo...'
+]])
+
+AT_CHECK_M4([--arglength=3 -l0 in], [0], [[long string
+]], [[m4trace: -1- echo(`long string') -> ``long string''
+]])
+
+AT_CLEANUP
+
+
 ## ----------- ##
 ## debug flags ##
 ## ----------- ##
@@ -294,6 +341,59 @@ OVERRIDE=It is changed.
 AT_CLEANUP
 
 
+## ------------- ##
+## nesting-limit ##
+## ------------- ##
+
+AT_SETUP([--nesting-limit])
+
+dnl Check for argument validation.
+
+AT_DATA([in],
+[[define(`echo', `$@')dnl
+echo(echo(echo(echo(`nested string'))))
+echo(echo(echo(echo(echo(echo(echo(echo(echo(`nested string')))))))))
+]])
+
+AT_CHECK_M4([--nesting-limit=-1 in], [1], [],
+[[m4: invalid nesting limit `-1'
+]])
+
+AT_CHECK_M4([--nesting-limit oops in], [1], [],
+[[m4: invalid nesting limit `oops'
+]])
+
+AT_CHECK_M4([-L 10oops in], [1], [],
+[[m4: invalid character following nesting limit in `10oops'
+]])
+
+dnl MiB is the suffix to implicit 1, resulting in 1048576
+AT_CHECK_M4([-LMiB in], [0], [[nested string
+nested string
+]])
+
+dnl this assumes size_t is no bigger than 64 bits
+AT_CHECK_M4([-L 123456789012345678901234567890 in], [1], [],
+[[m4: nesting limit `123456789012345678901234567890' too large
+]])
+
+AT_CHECK_M4([-L 5 in], [1], [[nested string
+]],
+[[m4:in:3: recursion limit of 5 exceeded, use -L<N> to change it
+]])
+
+dnl per POSIX guidelines, this is a decimal number 10, not octal 8
+AT_CHECK_M4([-L 010 in], [0], [[nested string
+nested string
+]])
+
+AT_CHECK_M4([--nesting-limit=3 -L0 in], [0], [[nested string
+nested string
+]])
+
+AT_CLEANUP
+
+
 ## --------------- ##
 ## prepend-include ##
 ## --------------- ##
@@ -338,6 +438,37 @@ in post/blah
 AT_CLEANUP
 
 
+## ------------- ##
+## regexp-syntax ##
+## ------------- ##
+
+AT_SETUP([--regexp-syntax])
+
+dnl test argument validation
+
+AT_DATA([[in]], [[regexp(`(', `(')
+]])
+
+AT_CHECK_M4([--regexp-syntax=unknown in], [1], [],
+[[m4: bad regexp syntax option: `unknown'
+]])
+
+AT_CHECK_M4([--regexp-syntax '' in], [0], [[0
+]])
+
+AT_CHECK_M4([-r EXTENDED in], [1], [[
+]], [[m4:in:1: regexp: bad regular expression `(': Unmatched ( or \(
+]])
+
+AT_CHECK_M4([-rgnu-m4 in], [0], [[0
+]])
+
+AT_CHECK_M4([-r "gnu M4" in], [0], [[0
+]])
+
+AT_CLEANUP
+
+
 ## ----- ##
 ## safer ##
 ## ----- ##
