m4-commit

[SCM] GNU M4 source repository branch, master, updated. 54c0ec6e81571e04


From: Eric Blake
Subject: [SCM] GNU M4 source repository branch, master, updated. 54c0ec6e81571e04ac60bb8320f2c71f74cca462
Date: Tue, 02 Oct 2007 22:04:06 +0000

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU M4 source repository".

http://git.sv.gnu.org/gitweb/?p=m4.git;a=commitdiff;h=54c0ec6e81571e04ac60bb8320f2c71f74cca462

The branch, master has been updated
       via  54c0ec6e81571e04ac60bb8320f2c71f74cca462 (commit)
      from  8b55caca720db14c67afcbc78294d56a02a0e6a0 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 54c0ec6e81571e04ac60bb8320f2c71f74cca462
Author: Eric Blake <address@hidden>
Date:   Tue Oct 2 14:01:51 2007 -0600

    Document quoting pitfalls in capitalize.
    
    * doc/m4.texinfo (Patsubst): Use the examples directory.  Also
    document shortfall.
    (Improved capitalize): New node.
    * examples/capitalize.m4: Update to match manual.
    * examples/capitalize2.m4: New file.
    
    Signed-off-by: Eric Blake <address@hidden>

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog               |    9 +++
 doc/m4.texinfo          |  170 +++++++++++++++++++++++++++++++++++++++++++++--
 examples/capitalize.m4  |   16 +++--
 examples/capitalize2.m4 |   19 +++++
 4 files changed, 201 insertions(+), 13 deletions(-)
 create mode 100644 examples/capitalize2.m4

diff --git a/ChangeLog b/ChangeLog
index 307a697..4bff9c4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2007-10-02  Eric Blake  <address@hidden>
+
+       Document quoting pitfalls in capitalize.
+       * doc/m4.texinfo (Patsubst): Use the examples directory.  Also
+       document shortfall.
+       (Improved capitalize): New node.
+       * examples/capitalize.m4: Update to match manual.
+       * examples/capitalize2.m4: New file.
+
 2007-10-01  Eric Blake  <address@hidden>
 
        Another Autoconf usage pattern optimization.
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index b5dc467..d3f3b8f 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -289,6 +289,7 @@ Correct version of some examples
 * Improved forloop::            Solution for @code{forloop}
 * Improved foreach::            Solution for @code{foreach}
 * Improved cleardivert::        Solution for @code{cleardivert}
+* Improved capitalize::         Solution for @code{capitalize}
 * Improved fatal_error::        Solution for @code{fatal_error}
 
 How to make copies of the overall M4 package
@@ -6047,18 +6048,47 @@ to lower case, and @code{capitalize} changes the first character of each
 word to upper case and the remaining characters to lower case.
 @end deffn
 
+First, an example of their usage, using implementations distributed in
+@file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
+
+@comment examples
 @example
-define(`upcase', `translit(`$*', `a-z', `A-Z')')dnl
-define(`downcase', `translit(`$*', `A-Z', `a-z')')dnl
-define(`capitalize1',
-       `regexp(`$1', `^\(\w\)\(\w*\)',
-               `upcase(`\1')`'downcase(`\2')')')dnl
-define(`capitalize',
-       `patsubst(`$1', `\w+', `capitalize1(`\&')')')dnl
+$ @kbd{m4 -I examples}
+include(`capitalize.m4')
+@result{}
+upcase(`GNUs not Unix')
+@result{}GNUS NOT UNIX
+downcase(`GNUs not Unix')
+@result{}gnus not unix
 capitalize(`GNUs not Unix')
 @result{}Gnus Not Unix
 @end example
 
+Now for the implementation.  There is a helper macro @code{_capitalize}
+which puts only its first word in mixed case.  Then @code{capitalize}
+merely parses out the words, and replaces them with an invocation of
+@code{_capitalize}.  (As presented here, the @code{capitalize} macro has
+some subtle flaws.  You should try to see if you can find and correct
+them; or @pxref{Improved capitalize, , Answers}).
+
+@comment examples
+@example
+$ @kbd{m4 -I examples}
+undivert(`capitalize.m4')dnl
+@result{}divert(`-1')
+@result{}# upcase(text)
+@result{}# downcase(text)
+@result{}# capitalize(text)
+@result{}#   change case of text, simple version
+@result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
+@result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
+@result{}define(`_capitalize',
+@result{}       `regexp(`$1', `^\(\w\)\(\w*\)',
+@result{}               `upcase(`\1')`'downcase(`\2')')')
+@result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
+@result{}divert`'dnl
+@end example
+
 If @var{resyntax} is given, @var{regexp} must be given according to
 the syntax chosen, though the default regular expression syntax
 remains unchanged for other invocations:
@@ -7943,6 +7973,7 @@ presented here.
 * Improved forloop::            Solution for @code{forloop}
 * Improved foreach::            Solution for @code{foreach}
 * Improved cleardivert::        Solution for @code{cleardivert}
+* Improved capitalize::         Solution for @code{capitalize}
 * Improved fatal_error::        Solution for @code{fatal_error}
 @end menu
 
@@ -8250,6 +8281,131 @@ undivert
 @result{}
 @end example
 
+@node Improved capitalize
+@section Solution for @code{capitalize}
+
+The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
+not allow clients to follow the quoting rule of thumb.  Consider the
+three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
+difference between calling @code{capitalize} with the expansion of a
+macro, expanding the result of a case change, and changing the case of a
+double-quoted string:
+
+@comment examples
+@example
+$ @kbd{m4 -I examples}
+include(`capitalize.m4')dnl
+define(`active', `act1, ive')dnl
+define(`Active', `Act2, Ive')dnl
+define(`ACTIVE', `ACT3, IVE')dnl
+upcase(active)
+@result{}ACT1,IVE
+upcase(`active')
+@result{}ACT3, IVE
+upcase(``active'')
+@result{}ACTIVE
+downcase(ACTIVE)
+@result{}act3,ive
+downcase(`ACTIVE')
+@result{}act1, ive
+downcase(``ACTIVE'')
+@result{}active
+capitalize(active)
+@result{}Act1
+capitalize(`active')
+@result{}Active
+capitalize(``active'')
+@result{}_capitalize(`active')
+define(`A', `OOPS')
+@result{}
+capitalize(active)
+@result{}OOPSct1
+capitalize(`active')
+@result{}OOPSctive
+@end example
+
+First, when @code{capitalize} is called with more than one argument, it
+was throwing away later arguments, whereas @code{upcase} and
+@code{downcase} used @samp{$*} to collect them all.  The fix is simple:
+use @samp{$*} consistently.
+
+Next, with single-quoting, @code{capitalize} outputs a single character,
+a set of quotes, then the rest of the characters, making it impossible
+to invoke @code{Active} after the fact, and allowing the alternate macro
+@code{A} to interfere.  Here, the solution is to use additional quoting
+in the helper macros, then pass the final over-quoted output string
+through @code{_arg1} to remove the extra quoting and finally invoke the
+concatenated portions as a single string.
+
+Finally, when passed a double-quoted string, the nested macro
+@code{_capitalize} is never invoked because it ended up nested inside
+quotes.  This one is the toughest to fix.  In short, we have no idea how
+many levels of quotes are in effect on the substring being altered by
+@code{patsubst}.  If the replacement string cannot be expressed entirely
+in terms of literal text and backslash substitutions, then we need a
+mechanism to guarantee that the helper macros are invoked outside of
+quotes.  In other words, this sounds like a job for @code{changequote}
+(@pxref{Changequote}).  By changing the active quoting characters, we
+can guarantee that replacement text injected by @code{patsubst} always
+occurs in the middle of a string that has exactly one level of
+over-quoting using alternate quotes; so the replacement text closes the
+quoted string, invokes the helper macros, then reopens the quoted
+string.  In turn, that means the replacement text has unbalanced quotes,
+necessitating another round of @code{changequote}.
+
+In the fixed version below, (also shipped as
+@file{m4-@value{VERSION}/@/examples/@/capitalize.m4}), @code{capitalize}
+uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
+strings are chosen so as to be less likely to appear in the text being
+converted).  The helpers @code{_to_alt} and @code{_from_alt} merely
+reduce the number of characters required to perform a
+@code{changequote}, since the definition changes twice.  The outermost
+pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
+with alternate quoting; the innermost pair is used so that the third
+argument to @code{patsubst} can contain an unbalanced
+@samp{]>>}/@samp{<<[} pair.  Note that @code{upcase} and @code{downcase}
+must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
+they contain nested quotes but are invoked with the alternate quoting
+scheme in effect.
+
+@comment examples
+@example
+$ @kbd{m4 -I examples}
+include(`capitalize2.m4')dnl
+define(`active', `act1, ive')dnl
+define(`Active', `Act2, Ive')dnl
+define(`ACTIVE', `ACT3, IVE')dnl
+define(`A', `OOPS')dnl
+capitalize(active)
+@result{}Act1,Ive
+capitalize(`active')
+@result{}Act2, Ive
+capitalize(``active'')
+@result{}Active
+capitalize(```actIVE''')
+@result{}`Active'
+undivert(`capitalize2.m4')dnl
+@result{}divert(`-1')
+@result{}# upcase(text)
+@result{}# downcase(text)
+@result{}# capitalize(text)
+@result{}#   change case of text, improved version
+@result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
+@result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
+@result{}define(`_arg1', `$1')
+@result{}define(`_to_alt', `changequote(`<<[', `]>>')')
+@result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
+@result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
+@result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
+@result{}define(`_capitalize_alt',
+@result{}  `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
+@result{}    <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
+@result{}define(`capitalize',
+@result{}  `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
+@result{}    _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
+@result{}divert`'dnl
+@end example
+
 @node Improved fatal_error
 @section Solution for @code{fatal_error}
 
diff --git a/examples/capitalize.m4 b/examples/capitalize.m4
index 5c28de2..d4e4a50 100644
--- a/examples/capitalize.m4
+++ b/examples/capitalize.m4
@@ -1,8 +1,12 @@
-dnl
-dnl convert to upper- resp. lowercase
+divert(`-1')
+# upcase(text)
+# downcase(text)
+# capitalize(text)
+#   change case of text, simple version
 define(`upcase', `translit(`$*', `a-z', `A-Z')')
 define(`downcase', `translit(`$*', `A-Z', `a-z')')
-dnl
-dnl capitalize a single word
-define(`capitalize1', `regexp(`$1', `^\(\w\)\(\w*\)', `upcase(`\1')`'downcase(`\2')')')
-define(`capitalize', `patsubst(`$1', `\w+', ``'capitalize1(`\0')')')
+define(`_capitalize',
+       `regexp(`$1', `^\(\w\)\(\w*\)',
+               `upcase(`\1')`'downcase(`\2')')')
+define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
+divert`'dnl
diff --git a/examples/capitalize2.m4 b/examples/capitalize2.m4
new file mode 100644
index 0000000..154dc50
--- /dev/null
+++ b/examples/capitalize2.m4
@@ -0,0 +1,19 @@
+divert(`-1')
+# upcase(text)
+# downcase(text)
+# capitalize(text)
+#   change case of text, improved version
+define(`upcase', `translit(`$*', `a-z', `A-Z')')
+define(`downcase', `translit(`$*', `A-Z', `a-z')')
+define(`_arg1', `$1')
+define(`_to_alt', `changequote(`<<[', `]>>')')
+define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
+define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
+define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
+define(`_capitalize_alt',
+  `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
+    <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
+define(`capitalize',
+  `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
+    _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
+divert`'dnl
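
The three quoting pitfalls described in the new "Improved capitalize" node can be reproduced outside the manual. The script below is a minimal sketch, not part of the commit: it assumes GNU m4 is installed as "m4" (regexp, patsubst and \w are GNU features), and the temporary file names simple.m4, improved.m4 and calls.m4 are made up for illustration. The macro definitions are copied from the two example files in the diff above.

```shell
#!/bin/sh
# Sketch only: compare the simple and improved capitalize macros.
# Assumption: GNU m4 is on PATH as "m4". File names are hypothetical.
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT

# Simple version, copied from examples/capitalize.m4 in the diff.
cat > "$dir/simple.m4" <<'EOF'
divert(`-1')
define(`upcase', `translit(`$*', `a-z', `A-Z')')
define(`downcase', `translit(`$*', `A-Z', `a-z')')
define(`_capitalize',
       `regexp(`$1', `^\(\w\)\(\w*\)',
               `upcase(`\1')`'downcase(`\2')')')
define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
EOF

# Improved version, copied from examples/capitalize2.m4 in the diff.
cat > "$dir/improved.m4" <<'EOF'
divert(`-1')
define(`upcase', `translit(`$*', `a-z', `A-Z')')
define(`downcase', `translit(`$*', `A-Z', `a-z')')
define(`_arg1', `$1')
define(`_to_alt', `changequote(`<<[', `]>>')')
define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
define(`_capitalize_alt',
  `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
    <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
define(`capitalize',
  `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
    _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
EOF

# Shared input: the macros used as test subjects in the manual text,
# followed by unquoted, single-quoted and double-quoted calls.
cat > "$dir/calls.m4" <<'EOF'
define(`active', `act1, ive')
define(`Active', `Act2, Ive')
define(`A', `OOPS')
divert`'dnl
capitalize(active)
capitalize(`active')
capitalize(``active'')
EOF

# Each run sees one set of definitions followed by the shared calls.
simple=$(cat "$dir/simple.m4" "$dir/calls.m4" | m4)
improved=$(cat "$dir/improved.m4" "$dir/calls.m4" | m4)

printf 'simple:\n%s\n\nimproved:\n%s\n' "$simple" "$improved"
```

With GNU m4 1.4.x this should print OOPSct1, OOPSctive and _capitalize(`active') for the simple version, and Act1,Ive, then Act2, Ive, then Active for the improved one. The three lines of simple output correspond to the dropped later argument, the interfering macro A, and the helper left uninvoked inside quotes.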


hooks/post-receive
--
GNU M4 source repository



