bug-bison

bison 3.4.2 fails to handle large model


From: Tom Kramer
Subject: bison 3.4.2 fails to handle large model
Date: Fri, 27 Sep 2019 17:13:22 -0400

Hello bison bug fixers -

I built C++ software (xmlInstanceParserGenerator) that reads XML schema files and writes C++ classes and a YACC-Lex parser for XML data files conforming to the schema files.

The Quality Information Framework (QIF) ANSI standard is defined using 22 (or 23) XML schema files. The schema files are valid in commercial tools such as XMLSpy and oXygen. Data files conforming to the standard are XML files. I am one of the developers of QIF.

For the QIF model, the QIFDocument.y file that is generated by the xmlInstanceParserGenerator is 6460833 bytes. When I process that file with bison 3.4.2 there are no errors or warnings. The QIFDocumentYACC.cc file that is generated by bison is 20550463 bytes. In that file YYNSTATES is 68691 and YYNRULES is 25278. The QIFDocumentParser built from that file (and a couple dozen other files) compiles without error or warning.

I got nearly identical results when I used bison 3.4.1.

The problem appears to be that the generated parser cannot handle state numbers of 32768 (2^15) or higher. I have an XML test file (16 pages) that is valid against the schema in XMLSpy and exhibits the problem. I have a radically reduced version of that file (1 page) that exhibits the same problem. The problem manifests itself as negative values (the state number minus 2^16) being put on the state stack for state numbers of 32768 and above (I do not know what happens with state numbers of 2^16 and above). This causes a segmentation fault when the states are referenced.
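The arithmetic behind those negative values can be checked with a small sketch. It assumes the state stack stores each state number in a signed 16-bit slot, which is a guess consistent with the trace, not something confirmed from the generated parser. The names `store_state_16` and `recover_state` are hypothetical helpers made up for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper: the value that results from storing a state number
// in a signed 16-bit slot. For out-of-range values the conversion is
// modular (wraps by 2^16) on two's-complement platforms; the language
// guarantees this only since C++20, earlier it is implementation-defined.
constexpr std::int16_t store_state_16(int state) {
    return static_cast<std::int16_t>(state);
}

// Hypothetical helper: undo the wrap for state numbers below 2^16.
constexpr int recover_state(std::int16_t stored) {
    return stored < 0 ? stored + 65536 : stored;
}

// Values taken from the yydebug trace in this message.
static_assert(store_state_16(32204) == 32204, "below 2^15: stored unchanged");
static_assert(store_state_16(34474) == -31062, "34474 - 65536");
static_assert(store_state_16(36931) == -28605, "36931 - 65536");
static_assert(recover_state(-31062) == 34474, "wrap undone");
```

Each negative entry on the stack is the state number minus 65536, so using it as an index into a parser table reads memory in front of the table, which would be consistent with the segmentation fault.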

I ran the parser with yydebug set to 1. A portion of the end of the output is shown below. The segmentation fault occurs at the same place in the data file when yydebug is not set. I have studied the dump data, and all of the steps before negative state numbers arise seem to be correct.

My request is: please enable bison to handle much larger state numbers. I am guessing that all that is required is to change some numerical type from 16 bits to 32 bits, which should be very easy. But that's a guess and I realize "should be very easy" is often a dream.
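To make the request concrete, here is a minimal sketch of picking the state type from the number of states instead of fixing it at 16 bits. It is only an illustration of the idea and not how Bison's skeletons are actually written; the alias `state_type_for` is a hypothetical name.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical alias: the narrowest signed integer type that can hold
// every state number in the range 0 .. NStates - 1.
template <long NStates>
using state_type_for = typename std::conditional<
    (NStates - 1 <= INT8_MAX), std::int8_t,
    typename std::conditional<
        (NStates - 1 <= INT16_MAX), std::int16_t, std::int32_t>::type>::type;

// YYNSTATES reported for the QIF parser in this message is 68691.
static_assert(std::is_same<state_type_for<68691>, std::int32_t>::value,
              "68691 states need more than 16 bits");
// 32768 states are numbered 0..32767 and still fit in 16 signed bits.
static_assert(std::is_same<state_type_for<32768>, std::int16_t>::value,
              "32768 states fit in int16_t");
```

With a selection like this, small grammars would keep compact tables and only very large ones would pay for 32-bit entries.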

I will be happy to send you more information, but I think a quick glance at the dump data shown below is probably all that is required to understand the problem. Since the problem occurs when yydebug is not set, I do not think the problem is limited to printing the dump data.

Thanks.

Tom Kramer
address@hidden

Entering state 32204
Next token is token BodyIdsSTART ()
Reducing stack by rule 19683 (line 122101):
-> $$ = nterm y_SecurityClassification_SecurityClassificationType_0 ()
Stack now 0 3 6 8 13 105 113 120 125 132 143 181 234 280 318 367 471 684 1001 1369 1696 1990 2200 2362 2555 2880 3361 3913 4560 5316 5996 6702 7343 7954 8604 9377 10347 11442 12613 13847 15191 16648 18216 19638 20929 22252 23632 25096 26659 28411 30203 32204
Entering state 34474
Next token is token BodyIdsSTART ()
Reducing stack by rule 6259 (line 55426):
-> $$ = nterm y_ExportControlClassification_XmlString_0 ()
Stack now 0 3 6 8 13 105 113 120 125 132 143 181 234 280 318 367 471 684 1001 1369 1696 1990 2200 2362 2555 2880 3361 3913 4560 5316 5996 6702 7343 7954 8604 9377 10347 11442 12613 13847 15191 16648 18216 19638 20929 22252 23632 25096 26659 28411 30203 32204 -31062
Entering state 36931
Next token is token BodyIdsSTART ()
Reducing stack by rule 6784 (line 58254):
-> $$ = nterm y_FeatureNominalIds_ArrayReferenceType_0 ()
Stack now 0 3 6 8 13 105 113 120 125 132 143 181 234 280 318 367 471 684 1001 1369 1696 1990 2200 2362 2555 2880 3361 3913 4560 5316 5996 6702 7343 7954 8604 9377 10347 11442 12613 13847 15191 16648 18216 19638 20929 22252 23632 25096 26659 28411 30203 32204 -31062 -28605

...

Entering state 47074
Next token is token BodyIdsSTART ()
Reducing stack by rule 16131 (line 103986):
-> $$ = nterm y_PartNoteIds_ArrayReferenceType_0 ()
Stack now 0 3 6 8 13 105 113 120 125 132 143 181 234 280 318 367 471 684 1001 1369 1696 1990 2200 2362 2555 2880 3361 3913 4560 5316 5996 6702 7343 7954 8604 9377 10347 11442 12613 13847 15191 16648 18216 19638 20929 22252 23632 25096 26659 28411 30203 32204 -31062 -28605 -25982 -23359 -20886 -18462
Entering state 18172
Reducing stack by rule 16138 (line 104024):
   $1 = nterm (null) ()
   $2 = nterm (null) ()
   $3 = token closedATTR ()
   $4 = nterm (null) ()
   $5 = nterm y_PartNoteIds_ArrayReferenceType_0 ()

Segmentation fault (core dumped)




