Re: [Freetype] Summary of ANSI preprocessor trouble..


From: Antoine Leca
Subject: Re: [Freetype] Summary of ANSI preprocessor trouble..
Date: Thu, 14 Dec 2000 12:13:06 +0100

Hi folks,

I have not been following the recent discussions carefully, but I have
the feeling I can help. Please forgive me if I restate
things that have been beaten to death; I did not carefully
re-read all the threads about this.

A bit of background is in order: I am the French member of
the ISO C committee, in charge of the definition of the C language.
So I know quite a bit about this issue, what can be done, what
cannot, and what is debatable (unfortunately, there are things
that are debatable, because the standard does not mandate a
unique behaviour; "#include identifier" is one of them).

David Turner wrote:
> 
> Here's a short summary of the situation regarding the infamous
> pre-processor concatenation problem. I hope it clarify things..
> 
> > Apparently, macros are not expanded for some compilers before
> > concatenation.  For example,
> >
> >   #define FOO xxx
> >   #define BAR FOO ## yyy
> >
> > gives `FOOyyy' instead of `xxxyyy'.

As such, the latter result (`xxxyyy') is not conformant.
But I believe things are not *that* simple...


>   - lexems are formed by _any_ suite of contiguous characters
>     that are not whitespace !!

To be more exact, a token is the longest possible sequence; and as
whitespace is not part of any token (lexeme), it always separates tokens.
OTOH, "a12345+ rest" is *3* tokens
(David's rule, taken strictly, ends up with only 2).

>     So #include <FOO/xxxx> should normally
>     never substitute, even though it does in a number of cases..

Indeed, the C standard says this is not specified exactly, because
FOO/xxxx is a special token (this area is fuzzy, as a matter of
fact, but it is stated that there is NO substitution here).

 
>   - when either '#' or '##' is used, macro substitution of the
>     arguments does _not_ happen. This explains the "FOOyyy"
>     thing

Well, it will happen, but *after* the effects of ## have been applied.
 
>   - when lexem concatenation is used, the whitespace surrounding the "##"
>     is discarded. which means that
> 
>        #define  z x ## y
> 
>     will result in "z xy"

Exactly.

>     A FEW ANSI C COMPILERS DO NOT RESPECT THIS RULE.

They are wrong and ought to be corrected. ASAP. This should be reported
to their authors, since this is very clearly stated in the standard.
(I can help you with Jacob Navia if you need it.)


>   - a common way to perform concatenation with macro substitution is to use
>     a two-level scheme like (according to K&R):

Agreed, this is standard practice that every preprocessor ought to support.

>    - to make things worse, Werner recently sent me this piece of poetry
>      from the GNU cpp.info:
> 
>   """
>    The usual case of concatenation is concatenating two names (or a
>    name and a number) into a longer name.  But this isn't the only
>    valid case.  It is also possible to concatenate two numbers (or a
>    number and a name, such as `1.5' and `e3') into a number.  Also,
>    multi-character operators such as `+=' can be formed by
>    concatenation. 

Agreed so far. The latter case is rather rare.

>    In some cases it is even possible to piece together
>    a string constant.

I do not see how, unless the string is really produced by the
# operator, which is a different beast.

>    However, two pieces of text that don't together
>    form a valid lexical unit cannot be concatenated.  For example,
>    concatenation with `x' on one side and `+' on the other is not
>    meaningful because those two characters can't fit together in any
>    lexical unit of C.  The ANSI standard says that such attempts at
>    concatenation are undefined,

Agreed (it explicitly says so, so "thou shalt not do that").

>    but in the GNU C preprocessor it is
>    well defined: it puts the `x' and `+' side by side with no
>    particular special results.

In such a case, a number of preprocessors (and this may very well
include the gcc preprocessor) output the two tokens *separated by a
space*. This is to prevent later passes from incorrectly parsing an
apparently merged result as a single token (which it is not).

> This clearly contradicts Kernighan and Richie !!

Where? Please explain your reasoning.
BTW, D. Ritchie is the author of one of the ANSI-conformant cpps,
and he intends to keep it up to date, so I really believe you
missed some point (no pun intended).


> It seems we need to
> put our hands on the _real_ ANSI C standard and see what the exact words
> are on it.. Anyone has this document and could share a quick lecture
> with us ??

I can do that easily! Can someone post (or send me a private mail) to
point me at the real problem in the source(s)? As I said, #include forms
are a special business in this respect, and this may well be the root of
the evil.

OTOH, if someone is willing to stick their head into the C standard
(and has a few days to spend on this item), a late draft of C99 is
online <URL:http://anubis.dkuug.dk/JTC1/SC22/WG14/www/docs/n869/>.
The preprocessor (clause 6.10, but you should also read 6.4 in order to
understand it) did not in fact change between this version and the real
standard (and the '89 ANSI standard is also fairly close on this).
Of course, this is not the real standard, so do not use it if you intend
to claim conformance or do anything of that kind.

I intend to follow this discussion on address@hidden, since it looks
rather technical to me.


Antoine


