[lwip-users] Poor RX performance, misconfigured lwipopts?


From: josephjah
Subject: [lwip-users] Poor RX performance, misconfigured lwipopts?
Date: Wed, 6 Mar 2019 22:02:23 -0700 (MST)

Hello everyone,

I've been troubleshooting an RX performance issue for a couple of weeks now
and while I've made incremental gains and improved my own driver-side code,
I'm still missing something big here and I suspect that I'm likely just
misusing lwIP or have a bad config. Any help would be greatly appreciated.


Setup:

 - Port: Unix
 - Hardware: Modern desktop-class hardware for both RX and TX
 - API: Socket (I know the raw API is more performant).
 - Protocol of concern: TCP


Test setup:

For each of the following tests I'm sending and receiving blobs of data of
exponentially increasing size (from 4 MB to 256 MB). All sizes exhibit roughly
the same problem over the course of their TCP stream.

 - (1) OS native sockets TX -> OS native sockets RX (for baseline
comparison)
 - (2) lwIP socket API TX -> OS native RX (no issue here)
 - (3) OS native TX -> lwIP socket RX (poor performance)
 - (4) lwIP socket TX -> lwIP socket RX (exceedingly poor performance)

 - In the attached capture file, 11.7.7.15 is the native OS stack, sending
to lwIP at 11.7.7.130.

I have tested against the macOS and Linux stacks; both react similarly to the
odd lwIP behavior.
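
For anyone wanting to reproduce the size schedule, here is a minimal sketch of how the blob sizes could be generated. It assumes "exponentially increasing" means doubling and that the sizes are binary megabytes; blob_sizes and MIB are illustrative names, not part of my test harness or of lwIP.

```c
#include <assert.h>
#include <stddef.h>

#define MIB ((size_t)1024 * 1024)

/* Fills `out` with sizes that double from min_bytes up to max_bytes
 * (inclusive), writing at most `cap` entries. Returns the number written.
 * Doubling is an assumption about the test schedule. */
size_t blob_sizes(size_t min_bytes, size_t max_bytes, size_t *out, size_t cap)
{
    size_t n = 0;
    size_t s = min_bytes;

    assert(min_bytes > 0 && min_bytes <= max_bytes);

    while (n < cap) {
        out[n++] = s;
        if (s > max_bytes / 2) {
            break; /* the next doubling would exceed max_bytes */
        }
        s *= 2;
    }
    return n;
}
```

Called as blob_sizes(4 * MIB, 256 * MIB, sizes, 16), this yields the seven sizes 4, 8, 16, 32, 64, 128 and 256 MiB.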


Observations:

In all test configurations (1/2/3/4) at home on my LAN I get roughly
comparable throughput and everything behaves just fine. However when I test
from a workstation at my office to home things change. For test (2) I see
about 90% of baseline, but for test (3) I usually see 25% of baseline (very
rarely it performs around 90% like test (2)). And test (4) is absolutely
horrendous with about 5% of baseline.

When looking at a packet capture (attached), I see a smattering of duplicate
ACKs, retransmissions, and in some cases ACKs that seem out of order or
extremely old but that are not duplicates!

It appears that eventually the sending side decides to reduce the segment
size.

I've turned on DEBUG types and observed delayed ACKs from lwIP.

I've also turned on STATS and am not noticing any errors whatsoever.

Conclusions:

 - Given that I only see this from my office I am suspecting that either
packet loss or the increased latency is creating a situation that my current
lwIP config isn't handling well.
 - Since I can observe >95% of baseline when on my LAN I doubt there is a
bottleneck in my code or lwIP itself.
 - Since the performance seems to get worse with the addition of (lwIP RX)
on either side I'm suspecting that if I fix my issue in test (3) it should
also fix test (4).
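
To put rough numbers on the latency theory: a TCP receiver can have at most one receive window of unacknowledged data in flight per round trip, so throughput is bounded by window / RTT. The sketch below only computes that bound. The function name is made up for illustration, and any RTT fed into it is a placeholder, since I have not stated a measured RTT here.

```c
#include <assert.h>

/* Upper bound on throughput (bytes per second) when at most window_bytes
 * can be in flight per round trip. This ignores loss, congestion control
 * and delayed ACKs, all of which only lower the real figure. */
double window_limited_throughput(double window_bytes, double rtt_seconds)
{
    assert(window_bytes >= 0.0 && rtt_seconds > 0.0);
    return window_bytes / rtt_seconds;
}
```

For example, with the configured TCP_WND of 0x7fff8 (524280 bytes) and a hypothetical 40 ms RTT, the bound is about 13.1 MB/s, whereas at a hypothetical 1 ms LAN RTT the same window allows roughly 524 MB/s. That would be consistent with the LAN looking fine while the office-to-home path does not, though it does not by itself explain the retransmissions.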


Theories:

 - After much reading, dropped frames come up as a common theme. I've
inspected and simplified my Ethernet driver to the point where I don't
believe this is a possibility, especially given how well it performs on my
LAN.
 - I've read that TCP_TMR_INTERVAL is tick-based and not based on an actual
timer. I've toyed around with this value, lowering it to 1 and raising it to
4000, but I feel this is a shortsighted approach that's asking for trouble.
I've looked at the Unix port and it does look like it's using a clock, so I'm
somewhat confused on this point.
 - Initially I thought this could be a window scaling issue so I've bumped
it up to a pretty high value for testing.
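
As a quick check on the window scaling theory: the TCP header's window field is 16 bits, and the window scale option (RFC 7323) shifts it left by up to 14 bits. The sketch below computes the largest window a given shift can express; max_scaled_window is an illustrative helper, not an lwIP function.

```c
#include <assert.h>
#include <stdint.h>

/* Largest receive window (in bytes) that can be advertised with the given
 * window scale shift: the 16-bit window field shifted left by `shift`.
 * RFC 7323 limits the shift to 14. */
uint32_t max_scaled_window(unsigned shift)
{
    assert(shift <= 14);
    return (uint32_t)0xffffu << shift;
}
```

With a shift of 3 this gives 0x7fff8 (524280 bytes), which is exactly the TCP_WND in the config below, so the window is already the maximum that TCP_RCV_SCALE 3 can advertise.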

Questions:

 - Am I out in the weeds?
 - What else can I do to narrow down the issue?
 - What are some reasonable values to explore in lwipopts.h given my setup?

Thanks in advance for any assistance!
 - Joseph

lwipopts.h:

#define LWIP_MTU          2800
#define LWIP_CHKSUM_ALGORITHM          2
// memory
#define MEMP_NUM_NETCONN          1024
#define MEMP_NUM_NETBUF          2
#define MEMP_NUM_TCPIP_MSG_API          64
#define MEMP_NUM_TCPIP_MSG_INPKT        64
#define PBUF_POOL_SIZE                  128
#define TCP_DEFAULT_LISTEN_BACKLOG      0xff
// arp
#define ARP_TABLE_SIZE                  64
#define ARP_MAXAGE                      300
#define ARP_QUEUEING                    1
#define ARP_QUEUE_LEN                   3
// ip
#define IP_REASS_MAXAGE                 15
#define IP_REASS_MAX_PBUFS              32
// tcp
#define TCP_TMR_INTERVAL                100
#define TCP_WND                         0x7fff8
#define TCP_MAXRTX                      12
#define TCP_SYNMAXRTX                   12
#define LWIP_TCP_SACK_OUT               1
#define LWIP_TCP_MAX_SACK_NUM           4
#define TCP_MSS                         (LWIP_MTU - 40)
#define TCP_SND_BUF                     (64 * TCP_MSS)
#define TCP_SND_QUEUELEN                (64 * (2 * (TCP_SND_BUF/TCP_MSS)))
#define TCP_SNDLOWAT                    (0xffff - (4*TCP_MSS) - 1)
#define TCP_SNDQUEUELOWAT               LWIP_MAX(((TCP_SND_QUEUELEN)/2), 5)
#define TCP_WND_UPDATE_THRESHOLD        LWIP_MIN((TCP_WND / 4), (TCP_MSS * 4))
#define LWIP_WND_SCALE                  1
#define TCP_RCV_SCALE                   3
// tcpip
#define TCPIP_MBOX_SIZE                 0
#define LWIP_TCPIP_CORE_LOCKING         1
#define LWIP_TCPIP_CORE_LOCKING_INPUT   1
// netconn
#define LWIP_NETCONN_FULLDUPLEX         0
// netif
#define LWIP_SINGLE_NETIF               0
#define LWIP_NETIF_HWADDRHINT           1
#define LWIP_NETIF_TX_SINGLE_PBUF       0
#define TCPIP_THREAD_PRIO               1
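
Since several of the TCP options above are defined in terms of each other, here is a small standalone sketch that evaluates them so the effective numbers are visible. LWIP_MAX and LWIP_MIN are redefined locally as stand-ins so it builds without the lwIP headers, and print_derived_tcp_opts is an illustrative name. The assert reflects my own expectation that the low-water mark sits below the send buffer size.

```c
#include <assert.h>
#include <stdio.h>

/* Local stand-ins so this compiles without lwIP headers. */
#define LWIP_MAX(a, b) (((a) > (b)) ? (a) : (b))
#define LWIP_MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Values copied from the config above. */
#define LWIP_MTU                 2800
#define TCP_WND                  0x7fff8
#define TCP_MSS                  (LWIP_MTU - 40)
#define TCP_SND_BUF              (64 * TCP_MSS)
#define TCP_SND_QUEUELEN         (64 * (2 * (TCP_SND_BUF/TCP_MSS)))
#define TCP_SNDLOWAT             (0xffff - (4*TCP_MSS) - 1)
#define TCP_SNDQUEUELOWAT        LWIP_MAX(((TCP_SND_QUEUELEN)/2), 5)
#define TCP_WND_UPDATE_THRESHOLD LWIP_MIN((TCP_WND / 4), (TCP_MSS * 4))

/* Prints the effective value of each derived TCP option. */
void print_derived_tcp_opts(void)
{
    /* Expectation (mine): the low-water mark is below the send buffer. */
    assert(TCP_SNDLOWAT < TCP_SND_BUF);

    printf("TCP_MSS                  = %d\n", TCP_MSS);
    printf("TCP_WND                  = %d\n", TCP_WND);
    printf("TCP_SND_BUF              = %d\n", TCP_SND_BUF);
    printf("TCP_SND_QUEUELEN         = %d\n", TCP_SND_QUEUELEN);
    printf("TCP_SNDLOWAT             = %d\n", TCP_SNDLOWAT);
    printf("TCP_SNDQUEUELOWAT        = %d\n", TCP_SNDQUEUELOWAT);
    printf("TCP_WND_UPDATE_THRESHOLD = %d\n", TCP_WND_UPDATE_THRESHOLD);
}
```

Worked out by hand, these come to TCP_MSS 2760, TCP_SND_BUF 176640, TCP_SND_QUEUELEN 8192, TCP_SNDLOWAT 54494, TCP_SNDQUEUELOWAT 4096 and TCP_WND_UPDATE_THRESHOLD 11040.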
 

lwip_poor_rx_perf_subset.pcapng
<http://lwip.100.n7.nabble.com/file/t1811/lwip_poor_rx_perf_subset.pcapng>  



--
Sent from: http://lwip.100.n7.nabble.com/lwip-users-f3.html