Re: Side-effect of change to scan-code.l
From: Paul Eggert
Subject: Re: Side-effect of change to scan-code.l
Date: Tue, 04 Jan 2011 13:51:03 -0800
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7

I think Paul Hilfinger was on the right track.
It's better to use lex and regular expressions
to specify rules for tokens, rather than to use
ad hoc C code. The proposed ad hoc code would
treat identifiers differently after "$", which
sounds like a recipe for trouble.

Under the previous rules, an identifier could not
begin with a number, because that is confusing.
So, for example, "$35.x" does not contain the
identifier "35.x"; instead, it is "$35" followed
by the identifier ".x".

We should be consistent, and continue to do this.
An identifier should not be allowed to begin with a number,
even a negative number. So, for example, "$-35.x" should
not contain the identifier "-35.x", because "-35.x" should
not be an identifier; instead, it should be "$-35" followed
by the identifier ".x".

We can do this by using something like the following:

letter [.abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_]
id -*(-|{letter}({letter}|[-0-9])*)
ref -?[0-9]+|{id}|"["{id}"]"|"$"

and then "id" would be treated consistently everywhere.
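
As a rough sanity check of how these definitions would split the examples above, here is a small Python sketch. It is not part of scan-code.l: the regular expressions are my assumed translations of "letter", "id" and "ref", the helper names are hypothetical, and the leading "$" is assumed to have been consumed already. Since lex takes the longest match while Python's "|" takes the first alternative that matches, the sketch tries each "ref" alternative separately and keeps the longest.

```python
import re

# Assumed Python translations of the lex definitions above.
LETTER = r"[.A-Za-z_]"
ID = rf"-*(?:-|{LETTER}(?:{LETTER}|[-0-9])*)"
# The four alternatives of "ref", kept separate so the longest can be chosen.
REF_ALTERNATIVES = [
    re.compile(p) for p in (r"-?[0-9]+", ID, rf"\[{ID}\]", r"\$")
]


def longest_ref(text):
    """Return the longest prefix of text matched by any "ref" alternative."""
    best = ""
    for pattern in REF_ALTERNATIVES:
        m = pattern.match(text)  # anchored at the start of text
        if m and len(m.group()) > len(best):
            best = m.group()
    return best


def split_after_dollar(text):
    """Hypothetical helper: split the text after "$" into (ref, remainder)."""
    ref = longest_ref(text)
    return ref, text[len(ref):]


for sample in ("35.x", "-35.x", "foo.bar-1", "[name].x"):
    print(sample, "->", split_after_dollar(sample))
```

Under these assumptions, "-35.x" splits into the reference "-35" and the remainder ".x", and ".x" on its own still matches "id", which is the behavior described above.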