grammatica-users

[Grammatica-users] sample css2 grammar


From: Francis Norton
Subject: [Grammatica-users] sample css2 grammar
Date: Tue, 31 May 2005 22:45:26 +0100
User-agent: Mozilla Thunderbird 1.0 (Windows/20041206)

Here is a reasonably serious CSS2 grammar. It recognises all the tokens and productions described in http://www.w3.org/TR/REC-CSS2/syndata.html, except that:

[a] the productions are more permissive, and accept every stylesheet in a random sample of live stylesheets

[b] the tokens have been simplified - in the absence of regex macros I couldn't quite face the ugliness of allowing escaped characters in the nmstart and nmchar character classes. If anyone wants the full W3C token syntax, ask me and I'll bite that bullet.

I also have what I feel is almost an acceptable OO API to this parser, in C#. By acceptable I mean:

[1] I want sensible default classes for all productions, with helpful behaviour for each class like being able to reserialise itself

[2] I want to be able to override the classes for individual productions, and for individual events for a production class, for example when a specific token or production is added to it

[3] I don't want to re-write everything I've customised just because I've changed the grammar and regenerated the parser

I'm still having some problems with [3] - my current solution is, frankly, ugly. But if anyone is interested in reviewing it, I'd appreciate comments or suggestions. And if there's a demand for it, I might put it all together as an open source project.
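
To make requirements [1] to [3] concrete, here is a minimal sketch of one possible shape for such an API. It is written in Python purely for brevity and is not the C# code mentioned above; every name in it (Node, NodeFactory, LowerCaseProperty, the on_<name> hook convention) is a hypothetical chosen for illustration. The idea is that the generated parser only ever asks a factory for a node by production name, so customisations live in a registry that survives regenerating the parser.

```python
# Hypothetical sketch: none of these names come from the actual C# API.

class Node:
    """Default class for any production: holds children and can reserialise itself."""

    def __init__(self, production):
        self.production = production
        self.children = []

    def add(self, child):
        # Requirement [2]: if a subclass defines on_<name>(), let it see
        # (and possibly replace) the token or production being added.
        name = child.production if isinstance(child, Node) else child[0]
        handler = getattr(self, "on_" + name.lower(), None)
        if handler is not None:
            child = handler(child)
        self.children.append(child)

    def serialise(self):
        # Tokens are (name, text) tuples; child nodes serialise recursively.
        return "".join(
            c.serialise() if isinstance(c, Node) else c[1] for c in self.children
        )


class NodeFactory:
    """Maps production names to classes; lives outside the generated parser."""

    def __init__(self):
        self._overrides = {}

    def register(self, production, cls):
        self._overrides[production] = cls

    def create(self, production):
        # Requirement [1]: fall back to the default Node for unregistered productions.
        return self._overrides.get(production, Node)(production)


class LowerCaseProperty(Node):
    """Example override: normalise the IDENT inside a 'property' production."""

    def on_ident(self, token):
        return (token[0], token[1].lower())


factory = NodeFactory()
factory.register("property", LowerCaseProperty)

decl = factory.create("declaration")
prop = factory.create("property")
prop.add(("IDENT", "COLOR"))
decl.add(prop)
decl.add(("COLON", ":"))
decl.add(("IDENT", "red"))
print(decl.serialise())  # prints "color:red"
```

Because the overrides are keyed by production name rather than by generated class, renaming or adding productions in the grammar only affects the registrations that mention them.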

Francis.
/*
 * css.grammar
 *
 * This work is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published
 * by the Free Software Foundation; either version 2 of the License,
 * or (at your option) any later version.
 *
 * This work is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software 
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
 * USA
 *
 * As a special exception, the copyright holders of this library give
 * you permission to link this library with independent modules to
 * produce an executable, regardless of the license terms of these
 * independent modules, and to copy and distribute the resulting
 * executable under terms of your choice, provided that you also meet,
 * for each linked independent module, the terms and conditions of the
 * license of that module. An independent module is a module which is
 * not derived from or based on this library. If you modify this
 * library, you may extend this exception to your version of the
 * library, but you are not obligated to do so. If you do not wish to
 * do so, delete this exception statement from your version.
 *
 * Copyright (c) 2004 Francis Norton. All rights reserved.
 */



%header%

GRAMMARTYPE = "LL"

DESCRIPTION = "A grammar for CSS2 syntax."

AUTHOR      = "Francis Norton, <francis at redrice dot com>"
VERSION     = "1.1"
DATE        = "2005/03/26"

LICENSE     = "Permission is granted to copy this document verbatim in any
               medium, provided that this copyright notice is left intact."

COPYRIGHT   = "Copyright (c) 2005 Francis Norton. All rights reserved."


%tokens%

/* 
 * this token set is compatible with that specified in
 * http://www.w3.org/TR/REC-CSS2/syndata.html but the regular 
 * expressions are simplified to compensate for the absence of 
 * lexical macros.
 */

IDENT               = <<[a-zA-Z][a-zA-Z0-9_-]*>>
IMPORT_SYM          = <<@[Ii][Mm][Pp][Oo][Rr][Tt]>>
PAGE_SYM            = <<@[Pp][Aa][Gg][Ee]>>
MEDIA_SYM           = <<@[Mm][Ee][Dd][Ii][Aa]>>
FONT_FACE_SYM       = <<@[Ff][Oo][Nn][Tt]-[Ff][Aa][Cc][Ee]>>
CHARSET_SYM         = <<@[Cc][Hh][Aa][Rr][Ss][Ee][Tt]>>
ATKEYWORD           = <<@[a-zA-Z0-9-]*>>
STRING              = <<(".*[^\\]")|('.*[^\\]')>>
HASH                = <<#[a-zA-Z0-9-]*>>
EMS                 = <<([0-9]+|[0-9]*\.[0-9]+)[Ee][Mm]>>
EXS                 = <<([0-9]+|[0-9]*\.[0-9]+)[Ee][Xx]>>
LENGTH              = <<([0-9]+|[0-9]*\.[0-9]+)([Pp][Xx]|[Cc][Mm]|[Mm][Mm]|[Ii][Nn]|[Pp][Tt]|[Pp][Cc])>>
ANGLE               = <<([0-9]+|[0-9]*\.[0-9]+)([Dd][Ee][Gg]|[Rr][Aa][Dd]|[Gg][Rr][Aa][Dd])>>
TIME                = <<([0-9]+|[0-9]*\.[0-9]+)([Mm][Ss]|[Ss])>>
FREQ                = <<([0-9]+|[0-9]*\.[0-9]+)[Kk]?[Hh][Zz]>>
PERCENTAGE          = <<([0-9]+|[0-9]*\.[0-9]+)%>>
NUMBER              = <<[0-9]+|[0-9]*\.[0-9]+>>
URI                 = <<[Uu][Rr][Ll]\([^\)]*\)>>
RGB                 = <<[Rr][Gg][Bb]\([^\)]*\)>>
UNICODERANGE        = <<U\+[0-9A-F?]{1,6}(-[0-9A-F]{1,6})?>>
S                   = <<[ \t\n\r\f]+>>
IMPORTANT_SYM       = <<![Ii][Mm][Pp][Oo][Rr][Tt][Aa][Nn][Tt]>>
LEFT_BRACE          = "{"
RIGHT_BRACE         = "}"
LEFT_PAREN          = "("
RIGHT_PAREN         = ")"
LEFT_BRACKET        = "["
RIGHT_BRACKET       = "]"
COLON               = ":"
SEMI_COLON          = ";"
COMMA               = ","
CDO                 = "<!--"
CDC                 = "-->"
COMMENT             = <</\*([^*]|\*[^/])*\*/>>
FUNCTION_SYM        = <<[a-zA-Z][a-zA-Z0-9_-]*\([^\)]*\)>>
INCLUDES            = "~="
DASHMATCH           = "|="
SLASH               = "/"
PLUS                = "+"
MINUS               = "-"
STAR                = "*"
GT                  = ">"
DOT                 = "."
EQUALS              = "="
DELIM               = <<.>>


%productions%

/* 
 * these productions are closely based on the productions 
 * specified in http://www.w3.org/TR/REC-CSS2/syndata.html
 */

stylesheet          = ( CHARSET_SYM sep? STRING sep? ';' )?
                      (sep|CDO|CDC)* ( import (sep|CDO|CDC)* )*
                      ( ( ruleset | media | page | font_face ) (sep|CDO|CDC)* )+ ;
import              = IMPORT_SYM sep? (STRING|URI) sep?
                      (medium ( ',' sep? medium)* )? ';' sep? ;
media               = MEDIA_SYM sep? medium ( ',' sep? medium )*
                      '{' sep? ruleset* '}' sep? ;
medium              = IDENT sep?;
page                = PAGE_SYM sep? [(IDENT | pseudo_page | IDENT pseudo_page) sep?]
                      '{' sep? [declaration sep? ( ';' sep? declaration )*] '}' sep? ;
pseudo_page         = ':' IDENT ;
font_face           = FONT_FACE_SYM sep? '{' sep?
                      declaration ( ';' sep? declaration )* '}' sep? ;
operator            = '/' sep? | ',' sep? /* | empty */ ;
combinator          = '+' sep? | '>' sep? /* | empty */ ;
unary_operator      = '-' | '+' ;
property            = IDENT sep? ;
ruleset             = selector ( ',' sep? selector? )*
                      '{' sep? [declaration ( ';' sep?  declaration? )* ] '}' sep? ;
selector            = simple_selector ( [combinator] simple_selector )* ;
simple_selector     = ((element_name) | (element_name? ( HASH | class | attrib | pseudo )))+
                      sep? ;
class               = '.' IDENT ;
element_name        = IDENT | '*' ;
attrib              = '[' sep? IDENT sep? ( ( '=' | INCLUDES | DASHMATCH ) sep?
                      ( IDENT | STRING ) sep? )? ']' ;
pseudo              = ':' ( IDENT | FUNCTION_SYM sep? IDENT sep? ')' ) ;
declaration         = property ':' sep? expr prio? /* | empty */ ;
prio                = IMPORTANT_SYM sep? ;
expr                = term ( ([operator]  term) )* ;
term                = unary_operator?
                      ( NUMBER sep? | PERCENTAGE sep? | LENGTH sep? | EMS sep? | EXS sep?
                      | ANGLE sep? | TIME sep? | FREQ sep? | function )
                      | STRING sep? | IDENT sep? | URI sep? | RGB sep?
                      | UNICODERANGE sep? | hexcolor ;
function            = (IDENT (':' | '.'))* FUNCTION_SYM sep? ;
hexcolor            = HASH sep? ;
sep                 = (S | COMMENT)+;
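

The simplified token patterns can be sanity-checked outside Grammatica. The following is a rough Python sketch, not part of the grammar, that copies a handful of the regular expressions above into Python's re module. It assumes the tokenizer picks the longest match and breaks ties in favour of the token defined first; that rule is an assumption here, so check it against the Grammatica documentation. Python tries alternatives left to right, so the sketch lists the decimal alternative of the number pattern first to get the same longest-match effect.

```python
import re

# Number pattern from the grammar, with the decimal alternative first because
# Python's alternation is ordered (first alternative that works wins).
NUM = r"([0-9]*\.[0-9]+|[0-9]+)"

# A subset of the tokens, in the same relative order as in css.grammar.
TOKENS = [
    (name, re.compile(pattern))
    for name, pattern in [
        ("IDENT", r"[a-zA-Z][a-zA-Z0-9_-]*"),
        ("HASH", r"#[a-zA-Z0-9-]*"),
        ("EMS", NUM + r"[Ee][Mm]"),
        ("LENGTH", NUM + r"([Pp][Xx]|[Cc][Mm]|[Mm][Mm]|[Ii][Nn]|[Pp][Tt]|[Pp][Cc])"),
        ("PERCENTAGE", NUM + r"%"),
        ("NUMBER", NUM),
        ("S", r"[ \t\n\r\f]+"),
        ("DELIM", r"."),
    ]
]


def tokenize(text):
    """Split text into (name, image) pairs: longest match, first token wins ties."""
    pos, out = 0, []
    while pos < len(text):
        best_name, best_match = None, None
        for name, regex in TOKENS:
            m = regex.match(text, pos)
            # Strict '>' keeps the earlier token when two matches tie on length.
            if m and (best_match is None or m.end() > best_match.end()):
                best_name, best_match = name, m
        if best_match is None:
            raise ValueError("no token matches at offset %d" % pos)
        out.append((best_name, best_match.group()))
        pos = best_match.end()
    return out


for token in tokenize("1.5em 10px 50% #fff"):
    print(token)
```

Under those assumptions, 1.5em comes out as a single EMS token rather than NUMBER followed by IDENT, which is the behaviour the productions for term rely on.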

