Hi all,
I have an odd issue that's been plaguing me for a while now. I have a CentOS 6 machine directly connected to an ISP with a /26 block of static IPs. No PPPoE, just a static IP configuration in place.
On this server, I'm running a simple HTTP server (thttpd) with no scripting, plugins, or other overhead. The server is reachable here:
http://203.217.168.98:8000/
The problem is that my server seems to be dropping some connections when the response is large. To replicate this, run a distributed HTTP checker against these two URLs:
http://203.217.168.98:8000/1kb.html
http://203.217.168.98:8000/8kb.html
The results for each URL, respectively:
http://s11.postimg.org/hbrgl20wj/screenshot_9.png
http://s11.postimg.org/svwad39xv/screenshot_10.png
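The following httperf output also shows the issue. The 1kb.html run receives replies, while every request in the 8kb.html run times out with no reply: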
Code:
[user@server~]$ httperf --server 203.217.168.98 --port 8000 --uri /1kb.html --num-conn 10 --num-cal 5 --rate 2 --timeout 5
httperf --timeout=5 --client=0/1 --server=203.217.168.98 --port=8000 --uri=/1kb.html --rate=2 --send-buffer=4096 --recv-buffer=16384 --num-conns=10 --num-calls=5
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
Maximum connect burst length: 1
Total: connections 10 requests 20 replies 10 test-duration 4.529 s
Connection rate: 2.2 conn/s (452.9 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 28.7 avg 28.9 max 29.2 median 28.5 stddev 0.2
Connection time [ms]: connect 14.1
Connection length [replies/conn]: 1.000
Request rate: 4.4 req/s (226.5 ms/req)
Request size [B]: 75.0
Reply rate [replies/s]: min 0.0 avg 0.0 max 0.0 stddev 0.0 (0 samples)
Reply time [ms]: response 14.7 transfer 0.0
Reply size [B]: header 226.0 content 1024.0 footer 0.0 (total 1250.0)
Reply status: 1xx=0 2xx=10 3xx=0 4xx=0 5xx=0
CPU time [s]: user 0.70 system 3.79 (user 15.5% system 83.7% total 99.2%)
Net I/O: 3.0 KB/s (0.0*10^6 bps)
Errors: total 10 client-timo 0 socket-timo 0 connrefused 0 connreset 10
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
[user@server~]$ httperf --server 203.217.168.98 --port 8000 --uri /8kb.html --num-conn 10 --num-cal 5 --rate 2 --timeout 5
httperf --timeout=5 --client=0/1 --server=203.217.168.98 --port=8000 --uri=/8kb.html --rate=2 --send-buffer=4096 --recv-buffer=16384 --num-conns=10 --num-calls=5
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
Maximum connect burst length: 1
Total: connections 10 requests 10 replies 0 test-duration 9.515 s
Connection rate: 1.1 conn/s (951.5 ms/conn, <=10 concurrent connections)
Connection time [ms]: min 0.0 avg 0.0 max 0.0 median 0.0 stddev 0.0
Connection time [ms]: connect 15.1
Connection length [replies/conn]: 0.000
Request rate: 1.1 req/s (951.5 ms/req)
Request size [B]: 75.0
Reply rate [replies/s]: min 0.0 avg 0.0 max 0.0 stddev 0.0 (1 samples)
Reply time [ms]: response 0.0 transfer 0.0
Reply size [B]: header 0.0 content 0.0 footer 0.0 (total 0.0)
Reply status: 1xx=0 2xx=0 3xx=0 4xx=0 5xx=0
CPU time [s]: user 1.07 system 8.03 (user 11.2% system 84.4% total 95.6%)
Net I/O: 0.1 KB/s (0.0*10^6 bps)
Errors: total 10 client-timo 10 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
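For anyone who wants to repeat the comparison without httperf, here is a minimal Python 3 sketch of the same check. It is only an illustration: the function names `probe` and `run` are made up for this post, the 5 second timeout mirrors the `--timeout 5` used above, and bypassing any configured proxy is my own assumption so that the request goes directly to the server.

```python
import http.client
import socket
import urllib.error
import urllib.request
from collections import Counter

# Assumption: ignore any proxy settings so the request goes straight to the server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=5.0):
    """Fetch url once; return (outcome, body_bytes), with 0 bytes on any failure."""
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return ("ok", len(resp.read()))
    except socket.timeout:
        # Timed out while reading the reply.
        return ("timeout", 0)
    except urllib.error.URLError as exc:
        # Failures while connecting or sending are wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return ("timeout", 0)
        return ("error: %s" % exc.reason, 0)
    except (OSError, http.client.HTTPException) as exc:
        # For example a connection reset or a truncated body.
        return ("error: %r" % exc, 0)


def run(urls, attempts=10, timeout=5.0):
    """Probe each URL several times and return {url: Counter of outcomes}."""
    results = {}
    for url in urls:
        results[url] = Counter(probe(url, timeout)[0] for _ in range(attempts))
    return results
```

Calling `run(["http://203.217.168.98:8000/1kb.html", "http://203.217.168.98:8000/8kb.html"])` from a remote host and printing the result gives a per-URL count of "ok", "timeout", and error outcomes, which can be compared against the httperf numbers above.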
I've contacted my ISP a couple of times about this. They're currently digging deeper, but insist the problem is not at their end (of course).
How might I diagnose this problem further?
Thanks