bug-bison


Re: Bison lexer


From: Hans Åberg
Subject: Re: Bison lexer
Date: Fri, 31 Aug 2018 23:39:37 +0200

> On 31 Aug 2018, at 22:26, Frank Heckenbach <address@hidden> wrote:
> 
> Hans Åberg wrote:
> 
>>> For a start, I didn't have very good experience communicating with
>>> Flex maintainer(s?) who seemed rather nonchalant WRT gcc warnings
>>> etc. in the generated code, so over the years I'd been adjusting
>>> various warning-suppression gcc options or doing dirty #define
>>> tricks to avoid warnings, or sometimes even post-processing the
>>> generated lexer with sed.
>> 
>> GCC 8.2 uses C17 as default.
> 
> I haven't used gcc-8 yet, but how is this relevant? If anything, I
> expect newer gcc versions to produce more warnings (usually useful)
> which flex might also suffer from.

Maybe the Flex lexer errors are due to compiling it as C89, or something
like that.

>>> But the final straw was when, after changing to C++ Bison, I wanted
>>> to switch to C++ Flex too and found this beautiful comment:
>>> 
>>>   /* The c++ scanner is a mess. The FlexLexer.h header file relies on the
>>>    * following macro. This is required in order to pass the c++-multiple-scanners
>>>    * test in the regression suite. We get reports that it breaks inheritance.
>>>    * We will address this in a future release of flex, or omit the C++ scanner
>>>    * altogether. */
>> 
>> It has been like that since the 1990s, I believe.
> 
> Even better! :(
> 
> Especially since C++ in the 1990s was totally different from modern
> C++, so I have no idea if anything of this comment is still
> relevant, or maybe even more relevant, today compared to then.

Indeed, very old.

> Lesson (as if anyone was listening): Always put a date on such
> messages.

Probably just a hack, never actually developed.

>>> So I wrote a small library that builds that massive RE out of single
>>> rules and maps subexpressions back to rules (even in the case that
>>> rules contain subexpressions of their own), and that works for me.
>> 
>> I did that, too: I wrote some DFA/NFA code, and incidentally found
>> the most efficient method to make action matches via a reverse NFA
>> lookup, cf. [1-3]. Also, I have made UTF-8/32 to octet character
>> class translations.
>> 
>> 1. https://gcc.gnu.org/ml/libstdc++/2018-04/msg00032.html
>> 2. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
>> 3. https://gcc.gnu.org/ml/libstdc++/2018-05/msg00015.html
> 
> Interesting, thanks. Fortunately, my REs are not so complex, so the
> bug you reported won't affect me and lexing speed is not so
> important for me, so (at least for now) I can just use the library
> as is. But if I ever need something more sophisticated, I'll keep
> this in mind.

If that is what you are using, note that it is recursive, so the function
stack might overflow. But perhaps they will rewrite it someday.
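
For readers following the thread, the "one massive RE built out of single
rules" idea quoted above can be sketched roughly as follows. This is only a
minimal illustration, assuming std::regex with the default ECMAScript
grammar; it is not Frank's library, and the names CombinedLexer,
group_of_rule_ and match are hypothetical. Each rule is wrapped in its own
capture group, and mark_count() is used to skip over any subexpressions the
rule itself contains, so that a matched group can be mapped back to its rule.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: joins single rules into one alternation and maps the
// capture group that matched back to the index of the rule it came from.
class CombinedLexer {
public:
    explicit CombinedLexer(const std::vector<std::string>& rules) {
        std::string combined;
        std::size_t next_group = 1;  // group 0 is the whole match
        for (const std::string& rule : rules) {
            if (!combined.empty()) combined += '|';
            combined += '(' + rule + ')';
            group_of_rule_.push_back(next_group);
            // Advance past the wrapper group and the rule's own groups.
            next_group += 1 + std::regex(rule).mark_count();
        }
        re_.assign(combined);
    }

    // Tries to match at the very start of [first, last). Returns the index
    // of the matching rule and stores the match length, or returns -1.
    int match(std::string::const_iterator first,
              std::string::const_iterator last,
              std::size_t& length) const {
        std::smatch m;
        if (!std::regex_search(first, last, m, re_,
                               std::regex_constants::match_continuous))
            return -1;
        for (std::size_t i = 0; i < group_of_rule_.size(); ++i) {
            if (m[group_of_rule_[i]].matched) {
                length = static_cast<std::size_t>(m.length(0));
                return static_cast<int>(i);
            }
        }
        return -1;
    }

private:
    std::regex re_;
    std::vector<std::size_t> group_of_rule_;  // wrapper group of each rule
};
```

With the rules "[0-9]+(\\.[0-9]+)?", "[A-Za-z_][A-Za-z0-9_]*" and "[ \\t]+",
the wrapper groups end up at positions 1, 3 and 4, because the first rule
carries one subexpression of its own. Note that ECMAScript alternation takes
the first alternative that matches rather than the longest one, so rule
order matters here in a way it does not in Flex.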




