Re: PHP support
From: Tim Landscheidt
Subject: Re: PHP support
Date: Tue, 22 Nov 2011 21:01:58 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)
Akim Demaille <address@hidden> wrote:
>> I hope this list is the right place for this.
>> In the past few weeks, I started working on "%language
>> PHP". You can browse the code at
>> <URI:https://github.com/scfc/bison-php> (bison's "master" is
>> "upstream").
> I am very curious about this. Are you really wanting to _use_ a Bison parser
> in PHP? Or is this some kind of experimental toy project? Unfortunately the
> current maintainers don't spend a lot of time on Bison, and features that
> might never be maintained will finish by hindering the development of the
> whole project.
<excursus>
The starting point of this endeavour is
<URI:https://bugzilla.wikimedia.org/show_bug.cgi?id=17465>.
ATM, MediaWiki uses an external OCaml program (generated by
ocamllex/ocamlyacc) to determine whether a "<math>" segment
contains only "safe" TeX by validating against a (subset of)
TeX grammar.
So I looked for existing PHP scanner/parser generators and
found several, let's say, code dumps. Few were working
code, none were actively maintained, and the grammars, if
documented at all, looked rather funky and/or lacked
flex's/Bison's functionality.
Rather than whipping one of them into usable shape, I
adapted Bison's Java generator: grammar, concepts & Co. are
familiar to *everyone*, the infrastructure of testsuites,
mailing lists, bug trackers, etc. is already in place, and
it would be much easier to port new features like GLR if
the need arises.
</excursus>
>> But the second "is" should be a "$is". I tried some
>> variants of "$$" and patsubst at different places, but
>> unfortunately, m4's levels of quoting have always
>> exceeded my imagination :-).
> Bison is not ready for this, not at all. The easiest would be to
> post-process the result and use some kind of new quadrigraph to denote $, say
> @S|@ :)
> Autoconf uses this:
> s/\@<:\@/[/g;
> s/\@:>\@/]/g;
> s/\@\{:\@/(/g;
> s/\@:\}\@/)/g;
> s/\@S\|\@/\$/g;
> s/\@%:\@/#/g;
> s/\@&t\@//g;
> A cleaner design requires more thinking.
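As an illustration only (this is neither Autoconf's nor Bison's
actual code), the post-processing idea quoted above could be
sketched in Python as follows. The function name is made up for
the example; the quadrigraph table follows the substitutions
quoted from Autoconf, with "@S|@" standing for "$".

```python
# Quadrigraph table, in the order of the substitutions quoted above.
# "@&t@" expands to nothing and is handled last, so it can be used to
# break up a sequence that should survive as literal text.
QUADRIGRAPHS = {
    "@<:@": "[",
    "@:>@": "]",
    "@{:@": "(",
    "@:}@": ")",
    "@S|@": "$",
    "@%:@": "#",
    "@&t@": "",
}


def expand_quadrigraphs(text):
    """Replace each quadrigraph in text with the character it denotes."""
    # Hypothetical helper: one sequential pass per quadrigraph.
    for quad, char in QUADRIGRAPHS.items():
        text = text.replace(quad, char)
    return text
```

A skeleton would then emit "@S|@is" wherever the generated PHP
should contain "$is", and the expansion would run once over the
finished output.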
Actually, the real problem lay elsewhere: Bison expected
"type identifier" as the argument to %lex-param and just
silently accepted, but "mistreated", "$identifier" :-). What
do you think of the patch:
| diff --git a/data/php.m4 b/data/php.m4
| index 7095d8a..6ab2600 100644
| --- a/data/php.m4
| +++ b/data/php.m4
| @@ -262,7 +262,9 @@ m4_define([b4_lex_param_call],
| [$1])])
| m4_define([b4_param_calls],
| [m4_map([b4_param_call], [$@])])
| -m4_define([b4_param_call], [, $2])
| +# FIXME: This should probably better be dealt with in parse-gram.y's
| +# add_param ().
| +m4_define([b4_param_call], [, m4_bpatsubst($1, [^.* ], [])])
that I posted in
<URI:news:address@hidden>?
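To make the effect of the m4_bpatsubst call easier to see, here
is a rough Python equivalent for single-line declarations. The
function name and the sample declarations are hypothetical, and
Python's re module only approximates m4's regex dialect; the
point is merely that the greedy pattern "^.* " removes everything
up to and including the last space.

```python
import re


def param_name(decl):
    """Drop everything up to and including the last space in decl."""
    # Approximates m4_bpatsubst($1, [^.* ], []): ".*" is greedy, so the
    # match extends to the last space; if there is no space at all,
    # nothing matches and the declaration is returned unchanged.
    return re.sub(r"^.* ", "", decl)
```

So a declaration written as "type identifier" is reduced to the
identifier, while a bare "$identifier" passes through as it is.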
I would strongly disagree with your assumption that Bison
isn't ready for this, though. The code at
<URI:https://github.com/scfc/bison-php> is already working
for simple examples. The only real showstopper so far - on the
C side - is the use of "$variables" in actions, for which I
have written a patch that I will post for discussion in the
next few days (which reminds me that I still have to reply
to Bruno on bug-gnulib :-().
Tim