Re: [lwip-users] LWIP - TCP receive assert failed


From: Sylvain Rochet
Subject: Re: [lwip-users] LWIP - TCP receive assert failed
Date: Fri, 16 Jan 2015 16:59:48 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

Hello Jackie,

On Fri, Jan 16, 2015 at 11:46:02PM +0800, Jackie wrote:
> Hi Sylvain,
> 
> Thanks for your reply. I've been working hard on this issue lately, and
> I found something interesting. Specifically, I am using FTP as the
> upper-level application protocol, running over a TCP connection in LWIP.
> For convenience in testing, I use PPP to connect to the FTP server on a
> host PC. So basically the setup looks like this:
> 
> FTP client <---> TCP/IP (LWIP) <---> PPP <---> TCP/IP (Linux) <---> FTP server
> 
> After stress testing and debugging, with more than 10 hours of data
> upload, I found that the PCB gets corrupted in tcp_output(). What happens
> is that tcp_output() can block in the lower-level function called from
> tcp_output_segment() when the lower-layer protocol's buffer is full, so
> the upper layer is left pending. Meanwhile the TCP timer keeps running,
> and tcp_slowtmr() also calls tcp_output(). So tcp_output() is entered
> again before the previous call has finished, like this:
> 
> tcp_output()
> {
>     ......
>     /* May block here; meanwhile tcp_slowtmr() calls tcp_output(),
>        which runs and returns. */
>     tcp_output_segment();
>     ......
>     /* do something about pcb->unacked and pcb->unsent */
>     ......
> }
> 
> Obviously pcb->unacked and pcb->unsent can be corrupted, while
> pcb->snd_queuelen is unchanged, resulting in a mismatch between the
> queue length and the segments actually on the unacked and unsent queues.
> Eventually the program hits an assertion.
> 
> Since I am using a very old version of LWIP, I am not sure whether the
> problem exists in the new one. In my opinion, tcp_output() would be
> better designed as a reentrant function: it could block when the lower
> layer's buffer is full and wait for a "write signal" before continuing
> to send data.
> 
> What I changed as a workaround is to re-check the pcb after
> tcp_output_segment(), at the point where the local pointer useg should
> be pointing to the tail of the unacked queue; otherwise, the unacked
> queue's content can be overwritten.
> 
> Do you have any concerns about it? Any suggestions and discussion are
> welcome.
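
To make the failure mode described in the quoted message concrete, here is a minimal, self-contained model in C. It is not lwIP code: struct seg, struct pcb, model_output(), model_output_segment(), the wnd field (a window counted in segments) and the reenter/recheck switches are all hypothetical stand-ins. The fake lower layer re-enters the output routine once, as tcp_slowtmr() is said to do, and the optional re-check is only one possible reading of the workaround described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for lwIP's segment and PCB. */
struct seg { struct seg *next; };

struct pcb {
    struct seg *unsent;    /* segments waiting to be sent */
    struct seg *unacked;   /* segments sent but not yet acknowledged */
    int snd_queuelen;      /* should equal the segments on both queues */
    int wnd;               /* assumed window: max segments in flight */
    int reenter;           /* how many times the fake driver re-enters */
    int recheck;           /* nonzero: re-check the queues after sending */
};

static int seg_count(const struct seg *s) {
    int n = 0;
    for (; s != NULL; s = s->next) n++;
    return n;
}

static struct seg *seg_tail(struct seg *s) {
    while (s != NULL && s->next != NULL) s = s->next;
    return s;
}

static void model_output(struct pcb *p);

/* Fake lower layer: while "blocked", the timer calls model_output() again. */
static void model_output_segment(struct pcb *p) {
    if (p->reenter > 0) {
        p->reenter--;
        model_output(p);   /* nested call on the same pcb */
    }
}

static void model_output(struct pcb *p) {
    struct seg *seg = p->unsent;
    struct seg *useg = seg_tail(p->unacked);  /* cached tail, may go stale */

    while (seg != NULL && seg_count(p->unacked) < p->wnd) {
        model_output_segment(p);
        if (p->recheck && p->unsent != seg) {
            /* A nested call already dequeued seg: reload both pointers. */
            seg = p->unsent;
            useg = seg_tail(p->unacked);
            continue;
        }
        /* Move seg from unsent to the tail of unacked. */
        p->unsent = seg->next;
        seg->next = NULL;
        if (useg == NULL) p->unacked = seg; else useg->next = seg;
        useg = seg;
        seg = p->unsent;
    }
}

/* Queue three segments, run one re-entered output pass, and return how
   many segments are still reachable from the two queues. */
static int run_scenario(int recheck) {
    struct seg s[3];
    struct pcb p = { &s[0], NULL, 3, 1, 1, recheck };
    s[0].next = &s[1];
    s[1].next = &s[2];
    s[2].next = NULL;
    model_output(&p);
    return seg_count(p.unsent) + seg_count(p.unacked);
}
```

In this model run_scenario(0) returns 1 although snd_queuelen is still 3: the outer call resumes with a stale seg pointer and drops two segments from the unsent queue, which is the kind of mismatch that would later trip an assertion. run_scenario(1) returns 3 because the outer call notices that the queue head moved and reloads its pointers.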

Looks like you are hitting a real bug here.  Maybe it is #34435 or 
#36380, but this is only from what I remember.

Since you are able to reproduce this bug, could you check whether it 
happens with the master branch?  The PPP API changed a little bit, and 
your lwIP port may need some rework as well if you are really using a 
very old lwIP version, but I am confident you will be able to sort that 
out :-)

Sylvain

Attachment: signature.asc
Description: Digital signature

