Re: Syntax error if paragraph contains more than 1 printable character
From: Steve Litt
Subject: Re: Syntax error if paragraph contains more than 1 printable character
Date: Wed, 13 Dec 2023 19:01:22 -0500
James K. Lowden said on Tue, 12 Dec 2023 20:24:35 -0500
>On Tue, 12 Dec 2023 23:06:14 -0500
>Steve Litt <slitt@troubleshooters.com> wrote:
>
>> I've already split paratext into multiple LINE tokens which represent
>> a line without its NL, and now I'm thinking of splitting line into
>> multiple chars ("[^\n]"). Perhaps this will make the rules less
>> complicated, though longer.
>
>Have the scanner return two tokens only:
>
> LINE a line of text, no newline
> SEP a blank line
>
>The lexer might have:
>
>.+/\n { ... return LINE; }
>(\n[[:blank:]]*){2,} { return SEP; } // two or more blank lines
>\n { /* ignore */ }
Thanks James, this looks great!
I won't need to consider end-of-line spaces because I now have a sed
one-liner preprocessor that gets rid of trailing spaces :-).
Right now I've gone back to the Hello World stage and am making a
Flex/Bison scanner that does nothing but copy the file. Once I learn
from that, I'll try your suggestions. They look refreshingly simple and
understandable to me.
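For reference, a copy-only scanner at that Hello World stage could look
roughly like the sketch below. It is Flex only, with no Bison grammar
attached yet, and the file name copy.l and the build commands are
assumptions for illustration, not the actual code being worked on.

```lex
/* copy.l (hypothetical name): copy stdin to stdout unchanged. */
/* noyywrap: stop at end of input without needing -lfl for yywrap(). */
%option noyywrap

%%
    /* Comments in the rules section must be indented. */
    /* Match any single character, including newline, and echo it. */
.|\n    { ECHO; }
%%

int main(void)
{
    /* With no rule returning a token, yylex() runs until end of input. */
    yylex();
    return 0;
}
```

Assuming flex and a C compiler are installed, something like
"flex copy.l && cc lex.yy.c -o copy && ./copy < input.txt" should print
input.txt back unchanged. From there, the ECHO rule can be swapped out for
rules that return LINE and SEP tokens.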
Thanks much,
SteveT
Steve Litt
Autumn 2023 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21