bison-patches

Re: dlang: initial changes to run the calc tests on it


From: H. S. Teoh
Subject: Re: dlang: initial changes to run the calc tests on it
Date: Wed, 27 Feb 2019 22:32:54 -0800
User-agent: Mutt/1.10.1 (2018-07-13)

On Tue, Feb 26, 2019 at 06:33:55PM +0100, Akim Demaille wrote:
[...]
> What I did below is quite ugly.  In particular, I don't know how to
> write a decent scanner in D.  What I did is truly scary, a way to
> force C code (with gets and ungetc) into my zero knowledge of D.  What
> is the right way to do the following?
[...]

ungetc is a truly nasty hack of an API in C; is it really necessary to
support that?  D supports a range API that lets you query the front of a
range (in this case, a stream of chars) without moving the current
position of the stream. So ungetc really shouldn't be necessary unless
it's an inextricable part of the Bison-generated parser.
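
To illustrate the point about peeking, here is a minimal sketch, assuming nothing beyond Phobos. The helper name `takeDigits` is made up for this example and is not part of Bison or of the lexer discussed in this thread. It reads `front` to decide what to do and calls `popFront` only once the character is known to be wanted, so nothing ever has to be pushed back:

```d
import std.range.primitives : empty, front, popFront;

// Hypothetical helper (illustration only): collects a run of ASCII digits
// from the front of `input` and leaves the first non-digit in place.
string takeDigits(R)(ref R input)
{
        import std.ascii : isDigit;

        string digits;
        // `front` only looks at the current element; it does not consume it.
        while (!input.empty && isDigit(input.front))
        {
                digits ~= cast(char) input.front;
                input.popFront();       // consume only after deciding to keep it
        }
        return digits;
}

void main()
{
        string src = "42+1";
        assert(takeDigits(src) == "42");
        assert(src.front == '+');       // examined by the loop, never consumed
}
```

The '+' that ends the number is still at the front of the range for the next call, which is the situation ungetc exists to restore in C.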

What I'd do is to templatize CalcLexer on an arbitrary input range of
chars, and leave the specifics of binding to a File (or whatever else,
like a string in a unittest) to the caller. And I wouldn't bother with
using class inheritance at all, since I can't envision we'd ever need to
swap in multiple lexers to the same parser.  So something like this:

-----snip-----
import std.range.primitives;

// Convenience method to instantiate CalcLexer (so that you don't have
// to name the range type explicitly).
auto calcLexer(R)(R range)
        if (isInputRange!R && is(ElementType!R : dchar))
{
        return CalcLexer!R(range);
}

struct CalcLexer(R)
        if (isInputRange!R && is(ElementType!R : dchar))
{
        private R input;
        private YYSemanticType semanticVal_;

        @property YYSemanticType semanticVal()
        {
                return semanticVal_;
        }

        int yylex()
        {
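                import std.ascii : isDigit;
                import std.uni : isWhite;

                // Skip initial spaces
                while (!input.empty && isWhite(input.front))
                        input.popFront;

                // Handle EOF
                if (input.empty)
                        return YYTokenType.EOF;

                // Numbers.  Only ASCII digits are accepted here: parse!int
                // throws on a leading '.' and on non-ASCII Unicode numerals,
                // both of which the earlier '.' / isNumber test let through.
                assert(!input.empty);
                if (input.front.isDigit)
                {
                        import std.conv : parse;
                        // parse takes the range by ref and consumes the digits
                        semanticVal_.ival = input.parse!int;
                        return YYTokenType.NUM;
                }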

                // Individual characters
                auto ch = input.front;
                input.popFront;
                return ch;
        }
}
-----snip-----
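
The number branch above leans on the fact that std.conv.parse takes its source by reference and advances it past whatever it consumed. A small sketch of just that behaviour, using a plain string as the range (the variable names are only for illustration):

```d
import std.conv : parse;

void main()
{
        string src = "123+4";

        // parse reads as many characters as form a valid int and removes
        // them from the front of `src`, leaving the rest untouched.
        int n = src.parse!int;

        assert(n == 123);
        assert(src == "+4");
}
```

So after a NUM token is returned, the lexer's input is already positioned at the next character, with no manual bookkeeping needed.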


On the caller's side, you'll need to get a range of characters out of a
File somehow.  For the sake of the example, I'd do something like this:

        import std.algorithm : map, joiner;
        import std.stdio;
        import std.utf : byDchar;

        File inputFile = stdin; // for example
        auto lexer = inputFile
                .byChunk(1024)  // avoid making a syscall roundtrip per char
                .map!(chunk => cast(char[]) chunk) // because byChunk returns ubyte[]
                .joiner         // combine chunks into a single virtual range of char
                .byDchar        // UTF-8 decode (optional)
                .calcLexer;     // instantiate CalcLexer object

        ... // pass `lexer` to the Bison parser


T

-- 
Life begins when you can spend your spare time programming instead of watching 
television. -- Cal Keegan


