bug-mailutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: The state I'm in - rfc 822 parsing.


From: Alain Magloire
Subject: Re: The state I'm in - rfc 822 parsing.
Date: Wed, 14 Mar 2001 22:18:17 -0500 (EST)

> 
> > > If I find the awful comment, I put it where I would have
> > > put a phrase.
> > 
> > Seems you actually remove comments totaly.
> > Some clients may want them right.  Especially in the "Received:"
> 
> Uh, GetMailBox() grabs the comment and puts in the personal,
> doesn't it? I don't have the source in front of me, but
> if what I sent you doesn't do that I sent the wrong diff.

# ./addr
alain(the squirrel fighter)@qnx.com
alain(the squirrel fighter)@qnx.com=> pcount 1
1 email <address@hidden>
   local-part <alain> domain <qnx.com>

# ./addr -1
alain(the squirrel fighter)@qnx.com
alain(the squirrel fighter)@qnx.com=> pcount 1
1 email <address@hidden>
   comments <the squirrel fighter>

> The underlying GetComment() collects the contents of the comment,
> so if (when) I parse Received fields, I will get the comment
> contents and return them.

Actually, I'm off topic with the Received field, since I do 
not think anyone will use it in address_create().

> > Ok, suggestion; how about to build on top(not now, but much ... much later)
> > something that would permit a user to build and save aliases:
> > alias_create (&alias, "canadian_hackers");
> > alias_add (alias, "address@hidden");
> > alias_add (alias, "address@hidden");
> 
> Sure, but I'm not a fan of calling it alias. Alias is used as a nickname
> feature in mail clients, and the alias can be to a single route-addr,
> and usually is. Alias is used in sendmail, I think, to expand to
> a list of addresses, #(mailbox), not to a group. In general, I like
> to call something by the RFC name, i.e. "group".

Ok.

> > 8-), If I remember djb, he was not easy to deal with.
> 
> Careful there, some people say that about RMS! ;-) Actually, his
..

8-)

> > > - <>, valid in SMTP "rcpt from:", should I parse and ignore it?
> > 
> > I vote strip. Since I want to reuse the email part to build the "From "
> > string separator for Unix Mbox.
> 
> By strip, you mean parse and ignore? If I parse and ignore, this
> would be a list with two addresses: address@hidden, <>, address@hidden

Yes, which sounds reasonnable, no?
for exemple:
        To: Alain M. <address@hidden>
I would expect address_get_email () --> "address@hidden".

> > Yes, you could use the "poison" mallocs (xmalloc(), xrealloc, xcalloc)
> > but I do know if that is the proper way for a library to behave. 
> 
> Our test suites at work do amazing things, we do a loop around
> our test cases, and before entering the loop set the n'th malloc()
> to fail. Then we call the test, it fails with ENOMEM, we then
> check that all allocated memory was free. Then we increment n.
> It's stunning what exhaustive checking turns up... I'll bet you
> a case of beer the same approach to libmailbox would turn up
> bugs.

I would __love__ to run this type of test on the library and programs.
You might be suprise, for the POP and MBOX(unix) client, mailutils
could pass we good grades 8-).

So far there is no serious test case for the mailutils, maybe later.

> > The realloc-by-1 is hard.  Why not construct a "string class" and realloc
> > by a fixed size, say 16/32, since it's only short strings.
> > 
> > struct _string {
> >   char *buffer;
> >   size_t buflen;
> > };
> > and pass around the struct _string *;
> 
> Ok. How about I use my favorite optimization, double memory on realloc(),
> half when usage drops below a quarter of allocated. This leads to O(log n)
> performance, or something, rather than O(n/16). I'll do this.
> 
> I was also thinking about using
> struct _strspan {
>       char* begin;
>       char* end;
>       };

Sounds good.

> So I could parse through stuff, return what I found, and the caller could
> decide whether they wanted to allocate space for it. Have to look and see
> if  that's useful.

How would you do this in C?  for example,
status = address_create (&address, big_complex_address_string);

if status == ENOMEM, I expect the address object to not be constructed and
any memory acquire in the parsing process release. Actually this goes
for any status != 0  i.e status != 0 means an error occured.


> > Or using GNU obstack object.
> 
> How easy would it be to cut GNU obstack code into mailutils? Not
> all libraries are GNU...

It's a generic container.

> > - (*)Proper prototypes is a good thing.
> 
> parse822_ as a function prefix ok? _ as word seperator, with
> all lower case function names ok? The current names are a
> result of ':%s/MParse822:://g'...

If you intend to extern them, yes to reserve a prefix for your
namespace is a good idea.

> > - (*)A more consitent use of "const char *" versus "char *"
> >   for example StrAppend*() should be :
> >   int StrAppend (char **to, const char *from);
> 
> I didn't do this initially, because the C spec has a bug:
> 
> char* acp[];
> 
> void f(const char* const a[]);
> 
> f(acp); /* this causes warnings in C, though not C++, C is
>   being lazy, its a safe conversion. */
> 
> It's a pain in the ass. Since I use double indirection a lot,
> I'll have to cast all over the place.

Ok, but this is not our case since you have "const char *" and for
double indirection "const char **".  Are you missing C++ issues,
Remember this is C ;-)

> > - (*)"new = (char*) realloc()", cast not needed
> 
> realloc() returns void*, same as malloc, doesn't it? I'll check.

Yes, realloc()/malloc()/calloc() returns a the generic pointer (void *)
so an explicit cast is not needed, You only will get a warning
if you did not include <stdlib.h>.  I do not know for C++.

> > - (**)GNU coding style: this not a high priority, I probably will not do 
> > nothing
> >   now, since lots of othe code have different style, for example, mime.c.
> >   It's up to you, some people can not stand it.
> 
> The GNU coding style was invented as a demonstration of how cool
> emacs is, IMO. There's not enough time in my life to convince my
> editor to support GNU conventions. I could run it through indent,
> but it would probably make changes all over the place. How do
> people who don't use emacs usually produce GNU code? If somebody
> has a vim cfg statement that will do it, I'll use it.

8-), sure, have pointed this to shaleh too, this is done for consistency.
There is a GNU std which is much more then just coding style.  So far having
a thight std as been very good in the harmony of the GNU project, this
goes for things like long_options which are consistent etc ..
If you browse the latest packages on alpha.gnu.org/gnu/{tar,grep,fetish}
(fetish is {file,shell,text}utils) you will notice an overall consistency
even in the function names ;-).

But that said, program in any coding style, you want, we can kosher
after, other things are more important like not having array limits.

> > - (***)Some memory leaks, when the parsing failed, since it's a library
> >   we should release, (see append patch, ok to commit?).  
> 
> Some? I don't recall checking the return value of a single StrAppend()!

8-) that's another issue, for example if the parsing failed, there were
some leaks. For example :
# ./addr 
alain me @me
alain me @me=> pcount 0

Would leak.  But very obvious to spot.

> I'll look at the patch tonight.
..
> In general, I'm not adverse to fixing bugs in code I wrote! 8-)

8-), well sometimes the authors may have a better patch  and
since parse822 is new, you may have already change things in your
repository.  For minor stuff, I usually commit straight.

> I'll integrate your patch with a bunch of the other things.

I just noted, some of diff are just trailing differences.

> Btw, my intention was to make the parsing functions public,
> not static, that's what I meant by "proper" prototypes.
> 
> Should parse822.h be in mailbox/include, mailbox, or
> include/mailbox? The component functions can be quickly used
> by others, for instance if you want'ed to parse the
> route, you would build your own loop, but use the parse822
> functions to grab the bits, skip the comments, etc. Or if
> you were to use it to parse the header fields, which I was,
> you might want to use its functions from somewhere else.

Yes good idea, parse822 stand on its own, I would recommand
the prefix you propose above (parse822_*()) and static anything else
internal.  Also to uncomment out the GetFieldName() GetFieldBody().

-- 
au revoir, alain
----
Aussi haut que l'on soit assis, on est toujours assis que sur son cul !!!




reply via email to

[Prev in Thread] Current Thread [Next in Thread]