Re: How to record source properties for all symbols?

From: Fis Trivial
Subject: Re: How to record source properties for all symbols?
Date: Mon, 4 Jun 2018 15:41:52 +0000

Mark H Weaver writes:

> The problem is that there's no place to store the source information for
> symbols in the standard S-expression representation.
> The principal defining characteristic of symbols -- that "two symbols
> are identical (in the sense of 'eqv?') if and only if their names are
> spelled the same way" (R5RS § 6.3.3) -- combined with the fact that
> 'eq?' is specified to be the same as 'eqv?' for symbols, leaves us no
> way to distinguish two instances of the same symbol, and therefore no
> way to store per-instance annotations such as source information.
> Fixing this would require abandoning the plain S-expression
> representation in favor of one in which symbols are represented by a
> different data structure.  Our reader would need to be extended to
> support the option of returning this new data representation instead of
> plain S-expressions, and our macro expander would need to be modified to
> accept this new representation as input.

I still believe it's crucial to give users correct and detailed error
messages, since the software world is now so large that we have to learn
things by trial and error. New languages strive to embed a full tutorial
in their error messages. For example, Rust (which I am NOT promoting)
gives an explanation for basically every syntax and semantic error.
Users now basically take correct error messages for granted.

After poking around for a few days, I found that I have a hard time
understanding the code. Can you give me some hints for reading it, so
that I can understand how to encode the source information? Currently I
am still trying to take a baby step: encoding source information into
the symbol itself. I know that will break everything, but at least I
will get a basic understanding of the underlying mechanics. Later I
will try to define whatever special structure is needed.

I have some experience with simple compilers, like the one from the
Dragon Book, and with some simple DSLs. But Guile's compiler seems quite
different from the ones I know. I'm still trying to understand what
happens when a symbol is read.

I didn't find any keywords like `parser' or `lexer', so I tried to dig
into `scm_read_expression', which returns a stringbuf. In
`scm_read_sexp', `tmp' is somehow encoded in `tl' by these three lines:

      new_tail = scm_cons (tmp, SCM_EOL);
      SCM_SETCDR (tl, new_tail);
      tl = new_tail;

After `scm_cons', `tmp' lives in GC-managed memory, but is it still a
stringbuf, or has it been turned into a meaningful symbol? Does the GC
also somehow play the role of the "environment" found in other compiler
front ends? What is the effect of applying `maybe_annotate_source' to
the variable `tl'? I tried that last one, and it doesn't do anything.

I guess it will take a long time before I understand Guile's internals;
I'm not a smart person. I will keep working on it in my spare time, but
if the maintainers have any desire to make the wished-for feature happen
before I take another baby step, please let me know. I can offer my
