[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: improve AS_ESCAPE
From: |
Eric Blake |
Subject: |
Re: improve AS_ESCAPE |
Date: |
Mon, 2 Mar 2009 17:30:59 +0000 (UTC) |
User-agent: |
Loom/3.14 (http://gmane.org/) |
Ralf Wildenhues <Ralf.Wildenhues <at> gmx.de> writes:
Hi Ralf,
> a couple of comments on this patch (thanks as always for all your work
> on this!)
Thanks for your reviews, even if they do come in later than you would like.
> > address@hidden AS_ESCAPE (@var{string}, @dvar{chars, `\"$})
> > address@hidden
> > +Expands to @var{string}, with any characters in @var{chars} escaped with
> > +a backslash (@samp{\}). @var{chars} should be at most four bytes long,
>
> I'm not too happy with this seemingly arbitrary limitation, and just for
> the sake of optimization, but I guess in practice it will be one of the
> less problematic ones. Oh well.
>
> > +and only contain characters from the set @samp{`\"$};
>
> (and this limitation, of course; with it, the former isn't a problem)
We never supported -, [, or ], which meant you have never been able to use this
macro to safely escape all shell meta-characters for use in an unquoted shell
context. Additionally, I could not find any uses in the wild trying to quote
any more than just the four special characters within "" context.
A more general macro can be written based on m4_bpatsubst that supports more
than four characters, but it is also a slower macro (the point of these
optimizations is that it is faster to use m4_translit and m4_index to check if
anything needs escaping, since most user strings need no escaping, than to pay
the price of compiling a regex for every string on the off-chance that it might
need escaping).
>
> > however,
> > +characters may be safely listed more than once in @var{chars} for the
> > +sake of syntax highlighting editors. The current implementation expands
> > address@hidden after adding escapes; if @var{string} contains macro calls
> > +that have text that needs quoting, you can use
>
> Does "quoting" refer to M4 or to shell quoting here? This text is not
> easy to read.
shell quoting. I'll add that.
>
> > +This macro is often used with @code{AS_ECHO}. For example, this snippet
> > +will produce shell code that outputs the four lines @samp{"$foo" =
> > +"bar"}, @samp{macro}, @samp{a, b}, and @samp{a, \b}:
>
> The first @samp will be line-wrapped in the info page here, which is
> unfortunate, given that it's valuable to see the verbatim results. How
> about making them a four-line result @example?
Done.
>
> > address@hidden
> > +foo=bar
> > +AS_ECHO(["AS_ESCAPE(["$foo" = ])AS_ESCAPE(["$foo"], [""])"])
> > +m4_define([macro], [a, [\b]])
> > +AS_ECHO(["AS_ESCAPE([[macro]])"])
> > +AS_ECHO(["AS_ESCAPE([macro])"])
> > +AS_ECHO(["AS_ESCAPE(m4_dquote(m4_expand([macro])))"])
> > address@hidden example
>
> > address@hidden Should we add AS_ESCAPE_SINGLE? If we do, we can optimize in
> > address@hidden the case of @var{string} that does not contain '.
>
> What would it do?
AS_ECHO(['AS_ESCAPE_SINGLE([a'"b])'])
=> $as_echo 'a'\''"b'
that is, AS_ESCAPE_SINGLE provides the shell-escaping necessary in a '' shell
context, comparable to AS_ESCAPE in the "" context. I'm not convinced it is
necessary, though, hence the @comment. What do you think? Nuke the @comment,
or implement the macro?
> > @@ -68,7 +68,7 @@ m4_define([AH_OUTPUT], [])
> > # Quote for Perl '' strings, which are those used by Autoheader.
> > m4_define([AH_VERBATIM],
> > [AS_LITERAL_IF([$1],
> > - [AH_OUTPUT([$1], AS_ESCAPE([[$2]], [\\'']))])])
> > + [AH_OUTPUT([$1], AS_ESCAPE([[$2]], [\']))])])
>
> Why is this change needed? Or is it a cosmetic fix only?
Speed. AS_ESCAPE([[$2]], [\']) is faster than AS_ESCAPE([[$2]], [[\\'']]),
because the former calls m4_translit with a two-byte change, where m4 can use
memchr2 to rapidly scan a word at a time for instances of one of those two
bytes, while the latter calls m4_translit with a four-byte change, where m4
must create a 256-byte mapping table and process the string a byte at a time.
It doesn't take much input length for vector operations to outperform byte-wise.
>
> > --- a/lib/m4sugar/m4sh.m4
> > +++ b/lib/m4sugar/m4sh.m4
> > @@ -680,18 +680,33 @@ m4_define([AS_MESSAGE_LOG_FD])
>
> > +# _AS_ESCAPE(STRING, KEY, SET)
> > +# ----------------------------
> > +# Backslash-escape all instances of the singly byte KEY or up to four
>
> s/singly/single/
Thanks.
>
> > +# bytes in SET occurring in STRING. Although a character can occur
> > +# multiple times, optimum efficiency occurs when KEY and SET are
> > +# distinct, and when SET does not exceed two bytes. These particular
> > +# semantics allow for the fewest number of parses of STRING, as well
> > +# as taking advantage of newer m4's optimizations when m4_translit is
>
> s/newer m4's optimizations/optimizations in M4 versions XX and newer/
> ?
m4 1.4.13. Changed.
From: Eric Blake <address@hidden>
Date: Mon, 2 Mar 2009 10:29:35 -0700
Subject: [PATCH] Improve wording for AS_ESCAPE.
* doc/autoconf.texi (Common Shell Constructs) <AS_ESCAPE>: Touch
up documentation.
* lib/m4sugar/m4sh.m4 (_AS_ESCAPE): Fix comment typos.
Reported by Ralf Wildenhues.
Signed-off-by: Eric Blake <address@hidden>
---
ChangeLog | 6 ++++++
doc/autoconf.texi | 11 +++++++----
lib/m4sugar/m4sh.m4 | 6 +++---
3 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 7546c75..789474c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
2009-03-02 Eric Blake <address@hidden>
+ Improve wording for AS_ESCAPE.
+ * doc/autoconf.texi (Common Shell Constructs) <AS_ESCAPE>: Touch
+ up documentation.
+ * lib/m4sugar/m4sh.m4 (_AS_ESCAPE): Fix comment typos.
+ Reported by Ralf Wildenhues.
+
Silence AC_CACHE_VAL if not part of AC_MSG sequence.
* lib/autoconf/general.m4 (AC_MSG_CHECKING, AC_MSG_RESULT): Track
pairs of calls.
diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 9e3fc46..b511bee 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -12356,7 +12356,7 @@ Common Shell Constructs
characters may be safely listed more than once in @var{chars} for the
sake of syntax highlighting editors. The current implementation expands
@var{string} after adding escapes; if @var{string} contains macro calls
-that have text that needs quoting, you can use
+that in turn expand to text needing shell quoting, you can use
@code{AS_ESCAPE(m4_dquote(m4_expand([string])))}.
The default for @var{chars} (@samp{\"$`}) is the set of characters
@@ -12371,17 +12371,20 @@ Common Shell Constructs
expansions (such as @address@hidden "hi"address@hidden) that would be broken
with improper escapes.
-This macro is often used with @code{AS_ECHO}. For example, this snippet
-will produce shell code that outputs the four lines @samp{"$foo" =
-"bar"}, @samp{macro}, @samp{a, b}, and @samp{a, \b}:
+This macro is often used with @code{AS_ECHO}. For an example, observe
+the output generated by the shell code generated from this snippet:
@example
foo=bar
AS_ECHO(["AS_ESCAPE(["$foo" = ])AS_ESCAPE(["$foo"], [""])"])
address@hidden"$foo" = "bar"
m4_define([macro], [a, [\b]])
AS_ECHO(["AS_ESCAPE([[macro]])"])
address@hidden
AS_ECHO(["AS_ESCAPE([macro])"])
address@hidden, b
AS_ECHO(["AS_ESCAPE(m4_dquote(m4_expand([macro])))"])
address@hidden, \b
@end example
@comment Should we add AS_ESCAPE_SINGLE? If we do, we can optimize in
diff --git a/lib/m4sugar/m4sh.m4 b/lib/m4sugar/m4sh.m4
index e033063..d884ead 100644
--- a/lib/m4sugar/m4sh.m4
+++ b/lib/m4sugar/m4sh.m4
@@ -696,13 +696,13 @@ m4_define([AS_ESCAPE],
# _AS_ESCAPE(STRING, KEY, SET)
# ----------------------------
-# Backslash-escape all instances of the singly byte KEY or up to four
+# Backslash-escape all instances of the single byte KEY or up to four
# bytes in SET occurring in STRING. Although a character can occur
# multiple times, optimum efficiency occurs when KEY and SET are
# distinct, and when SET does not exceed two bytes. These particular
# semantics allow for the fewest number of parses of STRING, as well
-# as taking advantage of newer m4's optimizations when m4_translit is
-# passed SET of size 2 or smaller.
+# as taking advantage of the optimizations in m4 1.4.13+ when
+# m4_translit is passed SET of size 2 or smaller.
m4_define([_AS_ESCAPE],
[m4_if(m4_index(m4_translit([[$1]], [$3], [$2$2$2$2]), [$2]), [-1],
[$0_], [m4_bpatsubst])([$1], [[$2$3]], [\\\&])])
--
1.6.1.2
- Re: improve AS_ESCAPE,
Eric Blake <=