gcl-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gcl-devel] Dotted lists


From: Camm Maguire
Subject: Re: [Gcl-devel] Dotted lists
Date: 25 Aug 2002 16:31:42 -0400

Greetings, and thanks for your feedback here!  I'm trying to get a
sense of the dominant usage/intent of the language, and to that extent
I really appreciate the experience of working lisp users.

"Vadim V. Zhytnikov" <address@hidden> writes:

> Frankly, I don't quite understand you.  Why do you think that GCL
> does not support dotted list?  Actually the only structure any Lisp

Of course gcl can read and display a dotted list.  But *many* supplied
functions, a well as functions output by the compiler, operate on
lists via a syntax like:
        for (;!endp(list);list=list->c.c_cdr) {...}

Where endp is equivalent to

(type_of(list)==t_cons ? FALSE : (list ==Cnil ? TRUE : 
FEwrong_type_of_arg(...)))

Here it is obviously assumed that all arguments should terminate with
a Cnil in the last c_cdr.  endp is used both to detect the end of list
and to report an error should any other item/atom be encountered along
the way.   If you read the section under the spec for endp, in the
notes at the end, this is described as the best way to handle things
for endp, as one can never distinguish the case that an atomic
argument was the final cdr of a dotted list from the case that a user
just passed a bogus non-list argument.  This is even so knowing full
well that one will encounter a problem at the end of a dotted list
should one use endp to detect the end.

Gcl has (broadly) two endp functions, one exposed to the user, and one
used internally to traverse lists.  The former must preserve the above
functionality, as far as I can tell.  Using this in the latter
contexts, though, introduces a type error for every function on a list
when passed a dotted list.  Some of these appeared in Paul's previous
reports.  The question is how best to fix these existing bugs.

endp is used in *a lot* of places!  This fact basically led me to the
solution I have yet to commit.  But it might still not be the best
approach if I'm misunderstanding the typical use of dotted lists in
lisp.  Let me briefly outline the two general approaches as I see
them:

1) Rewrite every lisp traversing loop, in the supplied C or written by
   the compiler, to check for a non-list *before* entering the loop,
   terminate the loop not with endp but with type_of(list)!=t_cons,
   and then write a final element processing piece of code to handle
   the last atom in case it is not Cnil.  Pros -- transparent, easy to
   read/understand/debug ; Cons -- *a lot* or work, with perhaps the
   majority of functions needing some editing.

2) To use something like the following for the *internal* endp:

EXTER struct cons my_dot,my_dot_pit;

#define endp(x) ({\
    object _x=(x);\
    bool _b=FALSE;\
    \
    if (type_of(_x)==t_cons) {\
       if (type_of(_x->c.c_cdr)!=t_cons && _x->c.c_cdr!=Cnil)\
          my_dot.c_car=_x->c.c_cdr;\
       else \
          my_dot.c_car=(object)&my_dot_pit;\
    } else {\
       if (_x==my_dot.c_car)\
          x=(object)&my_dot;\
       else {\
         my_dot.c_car=(object)&my_dot_pit;\
         if (_x==Cnil || _x==(object)&my_dot_pit)\
             _b=TRUE;\
         else\
             FEwrong_type_argument(sLlist, _x);\
       }\
    }\
    _b;\
    })

   This C macro will basically traverse a list as normal until it
   encounters the last cons with a non-Cnil, non-cons cdr.  In this
   case, it sets the car of a special cons called my_dot to the final
   atom.  If the next call to endp is given an argument of this same
   atom, then the macro will redirect the argument variable to the
   address of my_dot, allowing the body of the loop to operate
   transparently on this final atom as if it were the car of the next
   cons.  

   In this way, all atoms are processed by all existing functions the
   same way.  All that remains to be adjusted are cases which examine
   the loop variable at the end of the loop, or assume that it is
   Cnil.  Such functions are likely those which return another list on
   output, such as ldiff.  With no modification, all functions will
   continue to work as written on dotted lists, with the exception
   that they will always return their answer in the form of a proper
   list.  (ldiff '(a b c . "abcde") 'd) -> (a b c "abcde").  To handle
   such cases, the cdr of my_dot is set to a special non-Cnil symbol,
   'my_dot_pit', which enables the end of list processing code to
   determine whether the the list end was proper or dotted, and to act
   accordingly.  For example, the patch to ldiff becomes:

+#define proper_list(a) (type_of(a)==t_cons || (a)==Cnil)

+        if (proper_list(x))
                vs_push(Cnil);
+       else
+               i--;
        while (i-- > 0)
                stack_cons();

   The merit of this approach only really pays off if two assumptions
   hold:
        a) the number of functions like ldiff requiring end-of-loop
   modification is small, and
        b) Almost all functions taking a proper list as an argument
   are meant to behave identically on a dotted-list argument.

   Both of these seem valid to me, hence the solution I've
   implemented.  But I'm not sure, and so am asking the list.  In sum,
   for this approach:
        Pros -- All existing 'loop-body' code of functions traversing
   a list will correctly handle the final atom of a dotted list with
   no further modification, while preserving the existing
   error-reporting functionality of endp when passed a random atomic
   argument.   Less work.
        Cons -- Not as transparent, still leaves (what will hopefully
   be a few) end of loop issues, does not reject dotted lists for
   functions which are only supposed to be supplied proper lists
   (again hopefully few in number).

   Another question is whether there are areas of the code output by
   the compiler which by design are only passed proper list arguments.
   If this is so, using the new endp here as well will cause no harm,
   as it handles proper lists in the same way as in the past, but
   carries slightly more overhead.  Are all functions in lisp directly
   accessible to the user?  Can a user pass a dotted-list to some
   internal compiler function, for example?  Sorting out the
   difference will likely be more trouble than its worth, no?  But
   surely something like the following from cmpmap.lsp will have to be
   changed (non-safe-compile case:)

       (cond (*safe-compile*
              (wt-nl "if(endp(" (car handies) ")")
              (dolist** (loc (cdr handies)) (wt "||endp(" loc ")"))
              (wt "){"))
             (t
              (wt-nl "if(" (car handies) "==Cnil")
                                          ^^^^^^^
              (dolist** (loc (cdr handies)) (wt "||" loc "==Cnil"))
              (wt "){")))


   In the tree I have, all internal uses of endp have been centralized
   to the one listed above, with two exceptions: 1) as any user code
   invoking endp by name will use the old function (as opposed to
   internal uses of endp), the optimizer must be instructed not to
   replace '(endp' with the new macro, but rather with the old:

cmpopt.lsp:

#define endp_prop(a) (type_of(a)==t_cons ? FALSE : ((a)==Cnil ? TRUE : 
(FEwrong_type_argument(sLlist, (a)),FALSE)))
;;ENDP
 (push '((t) boolean #.(flags)"endp_prop(#0)")
   (get 'endp 'inline-safe))

   2) This setup won't work if the argument of endp is the output of a
      function, as it cannot be redirected.  There is as yet one place
      in the compiler where I have not addressed this, though it can
      be cleaned up pretty easily in the future:

cmpmap.lsp:
         (wt-nl "while(!endp_prop(MMcdr(" handy ")))" handy "=MMcdr(" handy 
");")

   The tree I have correctly compiles maxima and acl2 with no
   perceptible performance loss, and fixes to my understanding all of
   Paul's related bug reports.

So what should we do?


> (and CL and GCL in particular) works with are binary trees which
> recursively consist of so called dotted pairs (CAR . CDR) where
> CAR and CDR are either atoms or dotted pairs onece again.
> Notice that such structures are comletetely symmetric with
> respect to CAR and CDR.  However it is true that usually it is 
> convenient and sufficient to work with subclass od binary trees
> - proper lists.  And many CL bult in functions expect proper
> lists as their arguments.  With this exception proper and dotted
> lists must be treated on equal footing.

OK, this would seem to argue in favor of the new endp macro, yes?


> 
> I don't know about compiler but a) and b) seems to be OK.
> Dotted lists are absolutely essential and must not be
> mapped into proper lists.  Any practical distiction must
> be addressed at each concrete functin.

OK.  Thanks for this!  I'd appreciate your comments on the above as
well if you have time.

Take care,

> 
> Best wishes,
> 
> Vadim
> 
> -- 
>        Vadim V. Zhytnikov
> 
>         <address@hidden>
>        <address@hidden>
>        <address@hidden>
>       <address@hidden>
> 
> 
> 
> 
> 

-- 
Camm Maguire                                            address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah




reply via email to

[Prev in Thread] Current Thread [Next in Thread]