bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Miscellaneous code readability improvements.


From: Joel E. Denny
Subject: Re: [PATCH] Miscellaneous code readability improvements.
Date: Wed, 19 Aug 2009 00:58:14 -0400 (EDT)
User-agent: Alpine 1.00 (DEB 882 2007-12-20)

Hi Akim.

On Tue, 18 Aug 2009, Joel E. Denny wrote:

> On Fri, 14 Aug 2009, Akim Demaille wrote:

> > Maybe we need to get rid of the backslash from the culprit part:
> > 
> >     invalid character for \-escape: `z'
> >     invalid character for \-escape: `\n'
> > 
> > ?
> 
> Ah, I like that approach.
> 
> Also, as you mentioned earlier, I think we could drop the quotes.

The patch below, not yet pushed, implements that.  What do you think?

> We'd 
> just have to request that quotearg convert spaces into escape sequences.
> 
> However, when requesting a special quoting style from quotearg, AFAICT, 
> both the quoting style definition and the quoted string require memory 
> allocation that isn't required otherwise.

Nevermind.  I quoted spaces instead.

>From 652cd2ae51a7191b65553da14f517d6535510104 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Wed, 19 Aug 2009 00:41:52 -0400
Subject: [PATCH] Fix complaints about escape sequences.

Discussed starting at
<http://lists.gnu.org/archive/html/bison-patches/2009-08/msg00036.html>.
* src/scan-gram.l (SC_ESCAPED_STRING, SC_ESCAPED_CHARACTER):
For a \0 and similar escape sequences meaning the null
character, report an invalid escape sequence instead of an
invalid null character because the latter does not actually
appear in the user's input.
In all escape sequence complaints, don't escape the initial
backslash, and don't quote when the sequence appears at the end
of the complaint line unless there's a space character.
Consistently say "invalid" not "unrecognized".
Consistently prefer "empty character literal" over "extra
characters in character literal" warning for invalid escape
sequences; that is, consistently discard those sequences.
* tests/input.at (Bad escapes in literals): New.
---
 ChangeLog       |   19 +++++++++++++++++++
 src/scan-gram.l |   29 +++++++++++++++--------------
 tests/input.at  |   42 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 14 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index d201948..5129272 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,22 @@
+2009-08-19  Joel E. Denny  <address@hidden>
+
+       Fix complaints about escape sequences.
+       Discussed starting at
+       <http://lists.gnu.org/archive/html/bison-patches/2009-08/msg00036.html>.
+       * src/scan-gram.l (SC_ESCAPED_STRING, SC_ESCAPED_CHARACTER):
+       For a \0 and similar escape sequences meaning the null
+       character, report an invalid escape sequence instead of an
+       invalid null character because the latter does not actually
+       appear in the user's input.
+       In all escape sequence complaints, don't escape the initial
+       backslash, and don't quote when the sequence appears at the end
+       of the complaint line unless there's a space character.
+       Consistently say "invalid" not "unrecognized".
+       Consistently prefer "empty character literal" over "extra
+       characters in character literal" warning for invalid escape
+       sequences; that is, consistently discard those sequences.
+       * tests/input.at (Bad escapes in literals): New.
+
 2009-08-18  Joel E. Denny  <address@hidden>
 
        maint: update for gnulib's recent update-copyright changes
diff --git a/src/scan-gram.l b/src/scan-gram.l
index 7a6d7bf..ac91e31 100644
--- a/src/scan-gram.l
+++ b/src/scan-gram.l
@@ -578,10 +578,9 @@ splice      (\\[ \f\t\v]*\n)*
 {
   \\[0-7]{1,3} {
     unsigned long int c = strtoul (yytext + 1, NULL, 8);
-    if (UCHAR_MAX < c)
-      complain_at (*loc, _("invalid escape sequence: %s"), quote (yytext));
-    else if (! c)
-      complain_at (*loc, _("invalid null character: %s"), quote (yytext));
+    if (!c || UCHAR_MAX < c)
+      complain_at (*loc, _("invalid number after \\-escape: %s"),
+                   yytext+1);
     else
       obstack_1grow (&obstack_for_string, c);
   }
@@ -589,10 +588,9 @@ splice      (\\[ \f\t\v]*\n)*
   \\x[0-9abcdefABCDEF]+ {
     verify (UCHAR_MAX < ULONG_MAX);
     unsigned long int c = strtoul (yytext + 2, NULL, 16);
-    if (UCHAR_MAX < c)
-      complain_at (*loc, _("invalid escape sequence: %s"), quote (yytext));
-    else if (! c)
-      complain_at (*loc, _("invalid null character: %s"), quote (yytext));
+    if (!c || UCHAR_MAX < c)
+      complain_at (*loc, _("invalid number after \\-escape: %s"),
+                   yytext+1);
     else
       obstack_1grow (&obstack_for_string, c);
   }
@@ -610,16 +608,19 @@ splice     (\\[ \f\t\v]*\n)*
 
   \\(u|U[0-9abcdefABCDEF]{4})[0-9abcdefABCDEF]{4} {
     int c = convert_ucn_to_byte (yytext);
-    if (c < 0)
-      complain_at (*loc, _("invalid escape sequence: %s"), quote (yytext));
-    else if (! c)
-      complain_at (*loc, _("invalid null character: %s"), quote (yytext));
+    if (c <= 0)
+      complain_at (*loc, _("invalid number after \\-escape: %s"),
+                   yytext+1);
     else
       obstack_1grow (&obstack_for_string, c);
   }
   \\(.|\n)     {
-    complain_at (*loc, _("unrecognized escape sequence: %s"), quote (yytext));
-    STRING_GROW;
+    char const *p = yytext + 1;
+    if (*p == ' ')
+      p = "` '";
+    else
+      p = quotearg_style_mem (escape_quoting_style, p, 1);
+    complain_at (*loc, _("invalid character after \\-escape: %s"), p);
   }
 }
 
diff --git a/tests/input.at b/tests/input.at
index 146d581..1629f90 100644
--- a/tests/input.at
+++ b/tests/input.at
@@ -1209,3 +1209,45 @@ three.y:4.8-11: missing `'' at end of file
 ]])
 
 AT_CLEANUP
+
+## ------------------------- ##
+## Bad escapes in literals.  ##
+## ------------------------- ##
+
+AT_SETUP([[Bad escapes in literals]])
+
+AT_DATA([input.y],
+[[%%
+start: '\777' '\0' '\xfff' '\x0'
+       '\uffff' '\u0000' '\Uffffffff' '\U00000000'
+       '\ ' '\A';
+]])
+echo 'start: "\T\F\0" ;' | tr 'TF0' '\011\014\0' >> input.y
+
+AT_BISON_CHECK([input.y], [1], [],
+[[input.y:2.9-12: invalid number after \-escape: 777
+input.y:2.8-13: warning: empty character literal
+input.y:2.16-17: invalid number after \-escape: 0
+input.y:2.15-18: warning: empty character literal
+input.y:2.21-25: invalid number after \-escape: xfff
+input.y:2.20-26: warning: empty character literal
+input.y:2.29-31: invalid number after \-escape: x0
+input.y:2.28-32: warning: empty character literal
+input.y:3.9-14: invalid number after \-escape: uffff
+input.y:3.8-15: warning: empty character literal
+input.y:3.18-23: invalid number after \-escape: u0000
+input.y:3.17-24: warning: empty character literal
+input.y:3.27-36: invalid number after \-escape: Uffffffff
+input.y:3.26-37: warning: empty character literal
+input.y:3.40-49: invalid number after \-escape: U00000000
+input.y:3.39-50: warning: empty character literal
+input.y:4.9-10: invalid character after \-escape: ` '
+input.y:4.8-11: warning: empty character literal
+input.y:4.14-15: invalid character after \-escape: A
+input.y:4.13-16: warning: empty character literal
+input.y:5.9-16: invalid character after \-escape: \t
+input.y:5.17: invalid character after \-escape: \f
+input.y:5.18: invalid character after \-escape: \0
+]])
+
+AT_CLEANUP
-- 
1.5.4.3





reply via email to

[Prev in Thread] Current Thread [Next in Thread]