From: address@hidden
Subject: Re: [lwip-devel] Re: [task #7040] Work on tcp_enqueue
Date: Mon, 02 Feb 2009 20:17:18 +0100
User-agent: Thunderbird 2.0.0.19 (Macintosh/20081209)
bill wrote:
>From what I saw, this adds more code and in my tests, degraded performance. Even though slight, there is a loss on both accounts. If it's going to be added for these minimal systems, then there needs to be an option for it because the minimal system might have the memory to offset the added time required to append and split the pbufs. I know it adds time, because I benchmarked full pbufs doing it in tcp_sent versus doing it using Stoklund's patch. And we know from the +'s to -'s that more code was added than removed.
Unfortunately, I still don't see what you mean by 'split the pbufs'. You allocate the pbufs you need and copy data into them. Or at least that's how I see the current code...
We do that for small writes, so why not do it for big writes? I'd favour getting rid of the long pbuf chains altogether, since they waste RAM (the pbuf structs) and are inefficient (when calculating the checksum).
Isn't it done for small writes by the Nagle option?
No, it's not. The Nagle option prevents _sending_ segments that are not full; it doesn't prevent the user from _enqueueing_ many small chunks, and every chunk gets its own pbuf, even for 1 byte. Of course using the stack like that is inefficient because it leads to many task switches (or only locking if you use the pre-sockets2), but still: wasting memory like that shouldn't be necessary.
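To put rough numbers on the small-chunk point, here is a toy model in C. It is only a sketch: the function names and the idea of a fixed pbuf payload size are assumptions made for illustration, and none of this is lwIP's actual tcp_enqueue code. It compares giving every write its own pbuf(s) with filling the free space left in the last queued pbuf first.

```c
#include <stddef.h>

/* Toy model only: hypothetical helpers, NOT lwIP structures or functions. */

/* Policy A: every write gets its own pbuf(s), even a 1-byte write.
 * A write larger than pbuf_size needs ceil(len / pbuf_size) pbufs. */
size_t pbufs_one_per_write(const size_t *write_lens, size_t n_writes,
                           size_t pbuf_size)
{
    size_t count = 0;
    if (pbuf_size == 0) {
        return 0;
    }
    for (size_t i = 0; i < n_writes; i++) {
        count += (write_lens[i] + pbuf_size - 1) / pbuf_size;
    }
    return count;
}

/* Policy B: copy new data into the unused tail of the last queued pbuf
 * first, and allocate a new pbuf only when that tail is full. */
size_t pbufs_coalesced(const size_t *write_lens, size_t n_writes,
                       size_t pbuf_size)
{
    size_t count = 0;
    size_t tail_free = 0; /* unused bytes in the last queued pbuf */
    if (pbuf_size == 0) {
        return 0;
    }
    for (size_t i = 0; i < n_writes; i++) {
        size_t remaining = write_lens[i];
        size_t fill = remaining < tail_free ? remaining : tail_free;
        tail_free -= fill;
        remaining -= fill;
        while (remaining > 0) {
            size_t take = remaining < pbuf_size ? remaining : pbuf_size;
            count++;
            tail_free = pbuf_size - take;
            remaining -= take;
        }
    }
    return count;
}
```

With 100 one-byte writes and an assumed payload size of 1460 bytes, the first policy queues 100 pbufs and the second queues 1. Nagle does not change either count, since it only decides when queued data may be sent.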
Wouldn't it be transparent if tcp_sndbuf returned a multiple of MSS when there is more than MSS in the snd_buf? Then, the standard code in tcp_sent would send a multiple of MSS or the remainder when not. In fact, if TCP_SND_WND is a multiple of TCP_MSS, isn't the problem solved then as well?
I don't know. With high latency, you could end up wasting the last < MSS chunk because ACKs are received late...
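As a sketch of what the rounding proposal would mean in numbers (hypothetical helpers, not part of lwIP, and only one reading of the suggestion): the free send-buffer space is rounded down to a whole number of segments once it exceeds one MSS, and the remainder is the part that stays unreported until further ACKs free more space.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration; lwIP's tcp_sndbuf does not
 * behave like this. */

/* Report free send-buffer space as a multiple of mss once more than one
 * mss is free; otherwise report the free space unchanged. */
size_t sndbuf_whole_segments(size_t free_bytes, size_t mss)
{
    if (mss == 0) {
        return 0;
    }
    if (free_bytes <= mss) {
        return free_bytes;
    }
    return free_bytes - (free_bytes % mss);
}

/* Free bytes that would not be reported to the application. */
size_t sndbuf_held_back(size_t free_bytes, size_t mss)
{
    return free_bytes - sndbuf_whole_segments(free_bytes, mss);
}
```

For example, with 4000 bytes free and an MSS of 1460, the application would be told 2920 bytes, and the remaining 1080 bytes would sit unused until later ACKs arrive, which is the sub-MSS chunk in question.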
I'm trying not to be a pain, but I sense changes are being proposed which will hurt bandwidth on some systems for sending data when there is a simple solution at the application level or maybe even in the lwIP settings to resolve it. Recently there was an lwIP user really fighting a transmit bandwidth problem and I don't know if they abandoned lwIP, or their project, or what.
The problem can only partly be solved by the application layer: you can limit the data passed to tcp_write so that you work around its bad implementation, but then again, why not move exactly that code into tcp_enqueue, where raw API users can also benefit from it? The code size should stay the same.
What you can't solve by existing options or in the API layer is the small chunk problem: you cannot hold back the chunks from tcp_write without caching them in some way, which would mean an extra copy.
I'm not saying we have to tune lwIP's TCP for speed only, but in my opinion the current implementation is, let's say, sub-optimal, and it is worth thinking about a solution.
Simon