|
From: | Di-an JAN |
Subject: | Some minor cleanups |
Date: | Sat, 27 Sep 2008 15:20:26 -0700 (PDT) |
I've been reading the bison source code and came up with some changes I would like to be considered. The for_all_rules.patch changes the double loop over all rules deriving all variables to a single loop over all rules. Tested with make check on Cygwin with 1 expected failure: 49: Output file name: address@hidden&*()-=_+{}[]|\:;<>, .' FAILED (output.at:188) failed because the file system (NTFS) does not allow \/:?*"<>| see http://lists.gnu.org/archive/html/bug-bison/2008-07/msg00008.html 2008-09-27 Di-an Jan <address@hidden> * src/closure.c (set_firsts): Use single loop over all rules. diff --git a/src/closure.c b/src/closure.c index ff3109c..4c1d40c 100755 --- a/src/closure.c +++ b/src/closure.c @@ -123,17 +123,16 @@ print_fderives (void) static void set_firsts (void) { - symbol_number i, j; + rule_number r; firsts = bitsetv_create (nvars, nvars, BITSET_FIXED); - for (i = ntokens; i < nsyms; i++) - for (j = 0; derives[i - ntokens][j]; ++j) - { - item_number sym = derives[i - ntokens][j]->rhs[0]; - if (ISVAR (sym)) - bitset_set (FIRSTS (i), sym - ntokens); - } + for (r = 0; r < nrules; r++) + { + item_number sym = rules[r].rhs[0]; + if (ISVAR (sym)) + bitset_set (FIRSTS (rules[r].lhs->number), sym - ntokens); + } if (trace_flag & trace_sets) bitsetv_matrix_dump (stderr, "RTC: Firsts Input", firsts); The shift_symbol-bitset.patch changes an array of NSYMS integers that's sorted after adding all the shift symbols of a state to a bitset of NSYMS bits. Obviously there's space saving. For speed, the trade off is between an insertion sort verses possibly slower clear, set, count and iterate. The changes to state.* allows transitions states to be filled in in place rather than copied over. I'm not sure whether BITSET_FOR_EACH or loop and bitset_test is more efficient, or whether the original code is faster. I couldn't get good timing on Cygwin. So I attached patches for both. In general, someone should figure out which is more efficient for each bitset and document it. The bitset_test version is tested with make check on Cygwin with 1 expected failure. The bitset_iterator version is tested in combination with other stuff. 2008-09-27 Di-an Jan <address@hidden> * src/LR0.c (shift_symbol): Change to bitset. (nshifts, shiftset): Remove. (allocate_storage): Initialize shift_symbol, remove shiftset. (free_storage): Free shift_symbol, remove shiftset. (new_itemsets): Clear and update shift_symbol. (append_states): Iterate over shift_symbol already sorted. Allocate and fill in state transitions in place. Don't use shiftset. (set_states): Adjust call to state_transitions_set. (generate_states): Move state_transitions_set to append_states. * src/state.h (state_transitions_set): Remove states parameter. * src/state.c (state_transitions_set): Remove states parameter. * src/state.c (transitions_new): Remove states parameter. Other things I'm thinking of working on: Rework the line wrapping logic in src/print.c (print_grammar). A straight forward version is attached as print_grammar.patch, but it seems to be slower than with string buffers. Maybe I could do a version with a printf-like interface rather than END_TEST. In src/reader.c, instead of using a single symbol_list for all rules, use a list of rules, each containing a symbol_list. At the cost of an extra structure, this would move the per-rule fields out of each node of symbol_list and make the code clearer. Should be at least as efficient since all iterations over the one symbol_list stops after each rule. Use macros/inline functions for accessing ritem. Makes the code easy to understand without having to remember the special ritem structure. Introduce the typedef var_number for 0..nvars-1. More self-documenting and makes some code simpler. Do something like struct transitions { int num; state** states; }; struct state { ... struct transitions transitions; ... }; This uses just one pointer, as before, but probably more efficient and doesn't use "state *states[1];" to allocate 0 or more states. Similarly for reductions and errs. Consistent style. i++ vs ++i vs i+=1 statements, and put them in expressions as in dst[i++] = src[j++]. Remove semicolons after {}. Use XNMALLOC(n, type) or xnmalloc(n, s) and xmemdup from gnulib. Di-an Jan
for_all_rules.patch
Description: Text document
shift_symbol-bitset_test.patch
Description: Text document
shift_symbol-bitset_iterator.patch
Description: Text document
print_grammar.patch
Description: Text document
[Prev in Thread] | Current Thread | [Next in Thread] |