Re: [bug-gawk] bug in gawk


From: arnold
Subject: Re: [bug-gawk] bug in gawk
Date: Sun, 07 Apr 2019 00:02:18 -0600
User-agent: Heirloom mailx 12.5 7/5/10

Hi.

Aleksey Cheusov <address@hidden> wrote:

> 05.04.2019, 10:51, "address@hidden" <address@hidden>:
> > Hi.
> >
> >> Aleksey Cheusov <address@hidden> wrote:
> >>
> >> > 0 0 dictd>echo a | env LC_ALL=C gawk '/^[\300-\337]/ {print 1}'
> >> > gawk: cmd. line:1: error: Invalid range end: /^[??-??]/
> >>
> >> I have reproduced this. It's strange. I will investigate further.
> >
> > I have found the cause of the problem. I have to think a little
> > bit about how to fix it.
> >
> > I should have a fix within a few days.
> >
> > Thank you for the report!
>
> Great! You are one of the best upstream I've ever seen :-)

Thanks. I try.

Here is the diff. I will get this into the git repo sometime in
the next few days, but this will let you move ahead.

Arnold
---------------------------------------------------------
diff --git a/eval.c b/eval.c
index 4650150..132c850 100644
--- a/eval.c
+++ b/eval.c
@@ -104,6 +104,12 @@ char casetable[] = {
        '\170', '\171', '\172', '\173', '\174', '\175', '\176', '\177',
 
        /* Latin 1: */
+       /*
+        * 4/2019: This is now overridden; in single byte locales
+        * we call load_casetable from main and it fills in the values
+        * based on the current locale. In particular, we want LC_ALL=C
+        * to work correctly for values >= 0200.
+        */
        C('\200'), C('\201'), C('\202'), C('\203'), C('\204'), C('\205'), C('\206'), C('\207'),
        C('\210'), C('\211'), C('\212'), C('\213'), C('\214'), C('\215'), C('\216'), C('\217'),
        C('\220'), C('\221'), C('\222'), C('\223'), C('\224'), C('\225'), C('\226'), C('\227'),
@@ -201,18 +207,12 @@ load_casetable(void)
 {
 #if defined(LC_CTYPE)
        int i;
-       char *cp;
        static bool loaded = false;
 
        if (loaded || do_traditional)
                return;
 
        loaded = true;
-       cp = setlocale(LC_CTYPE, NULL);
-
-       /* this is not per standard, but it's pretty safe */
-       if (cp == NULL || strcmp(cp, "C") == 0 || strcmp(cp, "POSIX") == 0)
-               return;
 
 #ifndef USE_EBCDIC
        /* use of isalpha is ok here (see is_alpha in awkgram.y) */
@@ -710,7 +710,7 @@ set_IGNORECASE()
                warned = true;
                lintwarn(_("`IGNORECASE' is a gawk extension"));
        }
-       load_casetable();
+
        if (do_traditional)
                IGNORECASE = false;
        else
diff --git a/main.c b/main.c
index e2bcd72..d6e3426 100644
--- a/main.c
+++ b/main.c
@@ -320,6 +320,10 @@ main(int argc, char **argv)
        /* init the cache for checking bytes if they're characters */
        init_btowc_cache();
 
+       /* set up the single byte case table */
+       if (gawk_mb_cur_max == 1)
+               load_casetable();
+
        if (do_nostalgia)
                nostalgia();
 
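
The comment added to eval.c says the table is now filled in from the current locale, so that LC_ALL=C behaves correctly for values >= 0200. A minimal sketch of that idea follows. It is not gawk's actual load_casetable(): the function name sketch_fill_casetable and its caller-supplied table are hypothetical, and folding upper case to lower case is an assumption made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>

/*
 * Hypothetical helper (not part of gawk): fill a 256-entry table
 * that maps each byte to its case-folded form according to the
 * current LC_CTYPE locale. Bytes the locale does not classify as
 * upper-case letters map to themselves.
 */
static void
sketch_fill_casetable(unsigned char table[256])
{
	int i;

	assert(table != NULL);

	for (i = 0; i < 256; i++) {
		if (isupper(i))
			table[i] = (unsigned char) tolower(i);
		else
			table[i] = (unsigned char) i;	/* identity mapping */
	}
}
```

In the C locale only the ASCII letters are classified as upper case, so after setlocale(LC_ALL, "C") every entry from 0200 through 0377 maps to itself, including the \300 and \337 range endpoints from the report. In a single-byte Latin 1 locale, the same loop would be expected to pick up that locale's mappings instead.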


