mailutils/caching

bug-mailutils

From:	Alain Magloire
Subject:	mailutils/caching
Date:	Tue, 27 Mar 2001 21:42:09 -0500 (EST)
Bonjour

This is an email exchange with Nic Ferrier. It is interesting because it reflects
ideas that I've been bouncing around with some of you privately.

Nic presented the problem quite accurately, so it is worth a read.

GNU mailutils will try to go with mailbox caching instead of
stream caching, since we are not bound to an API (JavaMail).

#######################
Subject: Re: A question
To: address@hidden (Nic Ferrier)
Date: Tue, 27 Mar 2001 20:19:40 -0500 (EST)
From: "Alain Magloire" <address@hidden>

> 
> >>> "Alain Magloire" <address@hidden> 27-Mar-01 5:49:43 PM >>>
> 
> >For referrals, I suspect that all(most) commercial mailers
> >(Eudora, OE, Mulberry etc ...) support this feature.
> 
> (stroking white cat) "Excellent! Excellent! Prepare the launch pad!"
> 
> 
> >And on your side? How are things coming along ?
> 
> I've got the start of the server storage abstraction system coming
> along. I need to work out exactly how I'm going to handle multiple
> copies of messages for same store recipients (I'll use some sort of
> linking system, but how sophisticated?)
> 
> 
> The IMAP Javamail provider is working (pretty much) for reads - I've
> got to add the write stuff to it yet. The Sun Javamail lead is very
> negative about using a single connection - which is nice. I don't have
> much time for the Sun Javamail team: Bill Shannon doesn't seem to be
> aware that IMAP is used on really *big* systems as well as
> departmental ones. He doesn't understand the needs of that
> environment.
> 
> I've had a rather good idea that I'm going to put into the client,
> I've realised that a partial fetch is the most obvious way to map IMAP
> data to an InputStream.
> 
> This is all a bit deep Java so ignore it if you're not interested,
> I'm using this email to work it out apart from anything  /8->
> 
> 
> Basically the problem is that you can get a stream from a Message to
> read the content but (with the Sun impl) you'll be reading a buffer of
> the content. If the Message is a small text message that's okay but if
> there's a PDF or some other attachment of any size you'll be reading
> data for a while.
> 
> The problem is exacerbated by the fact that conventional buffering
> streams are used to read data in and out of buffers. These streams
> cause massive amounts of temporary buffer to be created.
> 
> But I've realised that if the read of a message is deferred until the
> InputStream's read method is called then I can implement message
> handling just like any other buffering system.
> 
> When the user requests a stream from the javamail Message object a
> special IMAPFetchStream will be returned. An IMAPFetchStream will know
> how to do partial fetches of the message and part it represents.
> 
> The partial fetch amount will be governed by how much the user
> requests and a default lowest transaction value.
> 
> So for example if the user calls:
> 
>   MimeMessage m=(MimeMessage)folder.getMessage(1);
>   InputStream in=m.getRawInputStream();
>   byte[] buf=new byte[5000];
>   int bytes=in.read(buf,0,5000);
> 
> then 5Kb will be read using the command:
> 
>    FETCH 1 BODY[]<0.5000>
> 
> the contents can be returned automatically to the user but could also
> be used to provide a partial cache.
> 
> Thus a user could read the content of the message, part by part, and
> still cause all message content to be cached, or could ensure that no
> content is cached.
> 
> Cool huh?

8-) Yes.
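
The partial fetch described in the quoted text could be sketched roughly as follows. This is only an illustration: build_partial_fetch and MIN_FETCH are hypothetical names, the 1024-byte floor is an arbitrary stand-in for the "default lowest transaction value", the command tag is omitted as in the text, and the partial range is written in the RFC 3501 form BODY[]<offset.count>.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lower bound on the bytes requested per round trip. */
#define MIN_FETCH 1024

/* Build the IMAP command that fetches `want` bytes of message `msgno`
   starting at `offset`.  Requests smaller than MIN_FETCH are rounded up
   so that tiny reads do not cause one round trip each.
   Returns the number of bytes requested, or 0 if `cmd` is too small.  */
static size_t
build_partial_fetch (char *cmd, size_t cmdlen, unsigned long msgno,
                     unsigned long offset, size_t want)
{
  size_t count = want < MIN_FETCH ? MIN_FETCH : want;
  int n;

  assert (cmd != NULL);
  n = snprintf (cmd, cmdlen, "FETCH %lu BODY[]<%lu.%lu>",
                msgno, offset, (unsigned long) count);
  if (n < 0 || (size_t) n >= cmdlen)
    return 0;                   /* command did not fit in the buffer */
  return count;
}
```

With these assumptions a 5000-byte read at offset 0 produces the command from the example, while a 100-byte read is widened to MIN_FETCH bytes; the extra bytes are what a partial cache would keep.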

I would recommend providing a threshold, for example caching up to XXX Kbytes,
meaning that when the threshold is reached part of the buffer is flushed
and reset, etc. This will make the code a bit harder but will save
memory when that is desired.

In the case of
  FETCH 1 BODY[]<0.5000>
the buffer will be reset 10 times for a limit of 500 bytes.  With a limit
of 10000 bytes it will still have room to grow.
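
A minimal sketch of such a threshold, assuming the simplest policy (flush everything and start over once the limit is reached). The struct and function names are hypothetical and are not part of the mailutils API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded cache, for illustration only. */
struct bounded_cache
{
  char *data;     /* caller-supplied storage of at least `limit` bytes */
  size_t limit;   /* threshold in bytes */
  size_t used;    /* bytes currently held */
  size_t resets;  /* number of times the cache was flushed */
};

static void
bounded_cache_init (struct bounded_cache *c, char *storage, size_t limit)
{
  assert (limit > 0);
  c->data = storage;
  c->limit = limit;
  c->used = 0;
  c->resets = 0;
}

/* Append `n` bytes.  Each time the threshold is reached the cache is
   flushed and filling restarts from the beginning of the storage.  */
static void
bounded_cache_append (struct bounded_cache *c, const char *buf, size_t n)
{
  while (n > 0)
    {
      size_t room = c->limit - c->used;
      size_t chunk = n < room ? n : room;

      memcpy (c->data + c->used, buf, chunk);
      c->used += chunk;
      buf += chunk;
      n -= chunk;
      if (c->used == c->limit)
        {
          c->used = 0;          /* threshold reached: flush and reset */
          c->resets++;
        }
    }
}
```

Feeding 5000 bytes into a cache with a limit of 500 resets it 10 times; with a limit of 10000 it is never reset and 5000 bytes of room remain.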

Caching also makes random access to the message at different offsets easier.

The way GNU mailutils will tackle this is different. The main ideas
we are considering are:

- A caching mailbox: a mailbox that sits on top of another one so that all
  requests are cached. For example:

  mailbox_t mbox;
  mailbox_create (&mbox, "cache:pop://localhost");

  so the requests are passed to the pop://localhost mailbox but the
  results are cached by the "cache:" mailbox.  The cache mailbox can
  also provide persistence by saving the results to files.
  Since the caching is abstract, it applies to other mailbox types as well:
  
  mailbox_create (&mbox, "cache:imap://localhost/INBOX");
  etc ...

- Doing the caching of the stream:

  mailbox_t mbox;
  message_t msg;
  stream_t stream;
  mailbox_create (&mbox, "pop://localhost");
  mailbox_get_message (mbox, 2, &msg);
  message_get_stream (msg, &stream);

  stream_read (stream, buffer, sizeof (buffer), NULL);
 ...

  By doing the stream_read (), everything will be cached.

- Having a special stream caching object; this follows what you
  were proposing above.
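
To illustrate the first idea in the list, a mailbox layered on top of another one, here is a rough sketch. It does not use the mailutils types: struct backend, struct cache_mailbox and fake_remote_fetch are hypothetical stand-ins, the cache is a small fixed-size in-memory table, and persistence to files is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8
#define CACHE_MSG_MAX 256

/* Hypothetical interface of the underlying (remote) mailbox. */
struct backend
{
  /* Copy message `msgno` into buf (at most buflen bytes), return length. */
  size_t (*fetch) (void *ctx, size_t msgno, char *buf, size_t buflen);
  void *ctx;
};

/* Hypothetical caching layer that sits on top of a backend. */
struct cache_mailbox
{
  struct backend under;
  int valid[CACHE_SLOTS];
  size_t len[CACHE_SLOTS];
  char body[CACHE_SLOTS][CACHE_MSG_MAX];
};

static void
cache_mailbox_init (struct cache_mailbox *m, struct backend under)
{
  memset (m, 0, sizeof *m);
  m->under = under;
}

/* Pass the request down on a miss, answer from the cache on a hit. */
static size_t
cache_mailbox_fetch (struct cache_mailbox *m, size_t msgno,
                     char *buf, size_t buflen)
{
  size_t slot, n;

  assert (msgno >= 1 && msgno <= CACHE_SLOTS);
  slot = msgno - 1;
  if (!m->valid[slot])
    {
      m->len[slot] = m->under.fetch (m->under.ctx, msgno,
                                     m->body[slot], CACHE_MSG_MAX);
      m->valid[slot] = 1;
    }
  n = m->len[slot] < buflen ? m->len[slot] : buflen;
  memcpy (buf, m->body[slot], n);
  return n;
}

/* Stand-in for a slow remote mailbox; ctx counts the round trips. */
static size_t
fake_remote_fetch (void *ctx, size_t msgno, char *buf, size_t buflen)
{
  static const char text[] = "Subject: hello\r\n\r\nbody\r\n";
  size_t n = sizeof text - 1;

  (void) msgno;
  ++*(int *) ctx;
  if (n > buflen)
    n = buflen;
  memcpy (buf, text, n);
  return n;
}
```

Fetching the same message twice through the caching layer reaches the backend only once. The caller only ever sees cache_mailbox_fetch, so the backend could be POP or IMAP without changing the caching code.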

> This is one of the things I've been most trying to achieve - low
> memory performance offset by traffic increase. Since one of the targets
> where I will use this most is webmail that's an acceptable offset.
>