[Findutils-patches] 'find' code structure issues

findutils-patches

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Findutils-patches] 'find' code structure issues

From:	Dale R. Worley
Subject:	[Findutils-patches] 'find' code structure issues
Date:	Sun, 7 Sep 2014 14:00:24 -0400

While investigating the changes needed for my proposed extension of
-type/-xtype, I've run into some places where I think the code could
be improved.

1. A parsing function may add more than one predicate node, but the
last of them must be the "important" node.

Some of them affect build_expression_tree() in tree.c,
in particular, the code which is run after a successful call of 
    (*(parse_entry->parser_func)) (parse_entry, argv, &i)
that is, the parsing function for the predicate which the parser is
now looking for:

          last_pred->p_name = predicate_name;

          /* If the parser consumed an argument, save it. */
          if (i != oldi)
            last_pred->arg_text = argv[oldi];
          else
            last_pred->arg_text = NULL;

Fundamentally, this code is a layering violation; it finishes setting
the members of the last predicate entry on the list.  This loading
should be done completely by the parsing function.

This way to organize the code also has the problem that if the parsing
function adds more than one code, the last of the nodes that it adds
has to be the "important" one, which will get labeled with information
from the argument to 'find' that is currently being parsed.  And not
that any basic predicate can add more than one node -- if the
preceding node was also a basic predicate, the parsing function has
to insert a "-a" node before it can add the node for the current basic
predicate.  If the loading was handled entirely by the parsing
function, the last added node would not need to directly correspond to
the current 'find' argument.

2. Setting ->p_name.

The code in build_expression_tree sets the p_name field of the last
predicate node to be the argument of 'find' that is currently being
processed.  But the p_name field is also set in other places during
parsing:

line 1283 in build_expression_tree in tree.c, when it is constructing the
initial "(" node to surround the entire predicate list.  Similarly at
lines 1361, 1378, and 1382

line 1561 in get_new_pred_chk_op in tree.c, when it is constructing
the implied "-a" node between two basic predicates.

line 109 in insert_primary_withpred in util.c, when it is constructing
a new node for a basic predicate.  But note that the value being
inserted is incorrect:  it uses entry->parser_name, which if the
predicate is a word with preceding "-", doesn't include the "-".  (But
it will eventually be corrected by the code in build_expression_tree.)

It seems to me that the correct way to handle this is to set p_name as
near as possible to the point the node is created, and then to leave
it set.  This assigns the responsibility for handing its value
clearly, and avoids problems if the final node created doesn't
directly correspond to the 'find' argument.

3. Setting ->arg_text.

If the predicate that is currently being parsed has a "value"
argument, then the code in build_expression_tree sets the arg_text
field of the last predicate node to be the value.  E.g., if the
predicate is "-atime +30", then arg_text is set to "+30".

But arg_text is also set in line 1578 in get_new_pred_chk_op in
tree.c, which is given the proper value in the 'arg' argument.

It seems to me that the correct way to handle this is to set as near
as possible to the point the node is created, in functions like
get_new_pred_chk_op.  This assigns the responsibility for handing its
value clearly, and avoids problems if the final node created doesn't
directly correspond to the 'find' argument.

4. Creating implied nodes.

There are a number of places where nodes are created that do not
correspond to specific arguments to 'find':

line 1283 in tree.c:  The initial "(" to enclose the whole expression.

line 1361 in tree.c:  The final "-print" if no action is specified.

line 1378 in tree.c:  The final ")" to enclose the whole expression.

line 1382 in tree.c:  The final "-print" if no action is specified.

line 1561 in tree.c:  The implied "-a" between two basic predicates.

In all of these cases except the last, the node is constructed by
calling the parse_*() function for the particular node, and then
executing one or more statements to perform the work that is now done
by 1341 to 1347.  These could be structured more cleanly if the work
of 1341 to 1347 was moved into the parsing functions (as described
above).

Comments?

Dale

[Prev in Thread]

Current Thread

[Next in Thread]

[Findutils-patches] 'find' code structure issues, Dale R. Worley <=

Prev by Date: [Findutils-patches] Proposed extension to 'find'
Previous by thread: [Findutils-patches] Proposed extension to 'find'
Index(es):
- Date
- Thread