LinuxQuestions.org
Old 12-14-2007, 06:54 AM   #1
Sheridan
Member
 
Registered: Aug 2007
Location: Hungary
Distribution: Fedora, CentOS
Posts: 91

Rep: Reputation: 21
Too many FIN_WAIT2 in netstat output


Hi,

I use the machine in question to periodically test connections to (mostly dead) machines to see if they're up and usable yet. I use this to maintain a very long list of working servers.

I use wget to try to download a file from them as a test. Wget is set to a connect timeout of 5 seconds and a read timeout of 15 seconds.

Now, I have discovered that when I run the "netstat" command, there is a tremendous number of connections marked "FIN_WAIT2" in the list (I mean 500+ on average).

Can I do anything to make Linux destroy those connections sooner (preferably without rebooting the machine)?
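
To keep an eye on how many sockets sit in each state, a rough sketch along these lines may help. It assumes the usual net-tools netstat layout, where the state is the last column of every tcp line; count_tcp_states is just a name made up for this example.

```shell
#!/bin/sh
# count_tcp_states: hypothetical helper, not part of netstat.
# Reads `netstat -tan` style output on stdin and prints "COUNT STATE",
# most frequent state first. Assumes the state is the last field.
count_tcp_states() {
    awk '$1 ~ /^tcp/ { n[$NF]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Example (needs net-tools installed):
#   netstat -tan | count_tcp_states
```

Run from cron or by hand, this shows whether the FIN_WAIT2 count keeps growing or levels off.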

Levente

Last edited by Sheridan; 12-14-2007 at 06:57 AM.
 
Old 12-15-2007, 03:00 AM   #2
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Hi -

You can always try reducing the "tcp_fin_timeout" kernel parameter on your server:

http://www.cs.uwaterloo.ca/~brecht/s.../ip-sysctl.txt
Quote:
tcp_fin_timeout - INTEGER
Time to hold socket in state FIN-WAIT-2, if it was closed
by our side. Peer can be broken and never close its side,
or even died unexpectedly. Default value is 60sec.
Usual value used in 2.2 was 180 seconds, you may restore
it, but remember that if your machine is even underloaded WEB server,
you risk to overflow memory with kilotons of dead sockets,
FIN-WAIT-2 sockets are less dangerous than FIN-WAIT-1,
because they eat maximum 1.5K of memory, but they tend
to live longer. Cf. tcp_max_orphans.
http://www.cs.uwaterloo.ca/~brecht/servers/tcp.html
Quote:
% cat /proc/sys/net/ipv4/tcp_fin_timeout
60

[To change this to 3 seconds]
# echo "3" > /proc/sys/net/ipv4/tcp_fin_timeout

[To have these new values take effect you may have to do (as root)]
# /etc/rc.d/init.d/network restart

If you want these new values to survive across reboots you can add them to /etc/sysctl.conf.

# Allowed local port range
net.ipv4.ip_local_port_range = 1025 65535
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 3
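
For the /etc/sysctl.conf step, here is a small sketch of a helper that sets one key without leaving duplicate entries behind. set_sysctl_conf is a made-up name, and the optional third argument is there only so it can be tried on a scratch file first. It is an illustration, not something taken from the page quoted above.

```shell
#!/bin/sh
# set_sysctl_conf KEY VALUE [FILE]   (hypothetical helper)
# Writes "KEY = VALUE" into FILE, dropping any older line that sets KEY.
# FILE defaults to /etc/sysctl.conf; pass another path to experiment safely.
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # Copy every line that does not assign KEY.
        awk -v k="$key" '
            {
                line = $0
                sub(/^[ \t]+/, "", line)        # ignore leading blanks
                split(line, parts, "[ \t]*=")   # parts[1] is the key, if any
                if (parts[1] != k) print $0
            }' "$file" > "$tmp"
    fi
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example (as root): set_sysctl_conf net.ipv4.tcp_fin_timeout 3
```

After editing the real file as root, `sysctl -p` reloads it, and `sysctl net.ipv4.tcp_fin_timeout` shows the value currently in effect.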
'Hope that helps .. PSM

Last edited by paulsm4; 12-15-2007 at 03:02 AM.
 
Old 12-25-2007, 07:24 AM   #3
Sheridan
Member
 
Registered: Aug 2007
Location: Hungary
Distribution: Fedora, CentOS
Posts: 91

Original Poster
Rep: Reputation: 21
Quote:
Originally Posted by paulsm4 View Post
Hi -

You can always try reducing the "tcp_fin_timeout" kernel parameter on your server:

http://www.cs.uwaterloo.ca/~brecht/s.../ip-sysctl.txt


http://www.cs.uwaterloo.ca/~brecht/servers/tcp.html


'Hope that helps .. PSM

This was it. Thank you very much!!!
 
Old 01-23-2008, 05:59 PM   #4
Sheridan
Member
 
Registered: Aug 2007
Location: Hungary
Distribution: Fedora, CentOS
Posts: 91

Original Poster
Rep: Reputation: 21
Quote:
Originally Posted by Sheridan View Post
This was it. Thank you very much!!!
Err... Nope. After some time, I checked that machine again, and lo and behold... Let's first do a very crude grep for the FIN_WAIT stuff, something like...

lsof | grep 'FIN_WAIT' | wc -l

After some time this comes back with some 7000+ FIN_WAIT2 entries, most of which belong to one single process.

I know now that adjusting the fin timeout is not the fix, or at least it doesn't help with this after all. Even if I set it to something like 10, it has no apparent effect on those particular FIN_WAIT2 connections.

And, like I said, I also discovered that all these stale connections belong to the same process...

So... how can I really kill those stale connections (some of them have been around for quite a few weeks or more now)?

One obvious choice would be to just go ahead and kill and restart the process that created them, yes... Unfortunately, that process cannot be killed. It would cause a major disturbance (and outrage...) among a number of guys, since 10+ people are using it at any given time for simulating/running/calculating all kinds of stuff. Some of those projects have been running there for many months and are nearing completion with a deadline and all. And this wouldn't be a long-term solution anyway: I cannot go around restarting this thing every now and then, making everyone angry as hell and unable to do serious long-term work.

So... unfortunately, a solution which doesn't entail shutting down the originating process(es) is needed. Any ideas? How could I make a little cron script or something to "clean up" long-stale FIN_WAIT2 "connections"?
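
One thing that may be worth checking first: going by the kernel documentation, tcp_fin_timeout appears to cover only sockets that no process holds any more (orphans). Anything lsof still lists is attached to an open file descriptor, which would fit with the timeout having no visible effect here. A rough sketch for seeing which process holds them, assuming lsof's default column layout (COMMAND and PID first, the state in parentheses at the end); fin_wait2_by_process is a name invented for this example:

```shell
#!/bin/sh
# fin_wait2_by_process: hypothetical helper, not part of lsof.
# Reads `lsof -nP -iTCP` style output on stdin and prints
# "COUNT COMMAND PID" for every process holding FIN_WAIT2 sockets,
# largest count first.
fin_wait2_by_process() {
    awk '/FIN_WAIT2/ { n[$1 " " $2]++ } END { for (p in n) print n[p], p }' | sort -rn
}

# Example (as root, so sockets of all users are listed):
#   lsof -nP -iTCP | fin_wait2_by_process
```

If one PID dominates the list, the descriptors are most likely being left open inside that program, and a cron job outside it would have little to work with.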
 
Old 10-05-2011, 08:58 AM   #5
perosr
LQ Newbie
 
Registered: Oct 2011
Posts: 1

Rep: Reputation: Disabled
Hi Sheridan,


I have the same problem using embedded Jetty on my high-load website. Did you manage to solve this issue? The fin timeout isn't working for me either.


Best Regards
Pero Gjuzelov

 
Old 10-06-2011, 01:27 AM   #6
Sheridan
Member
 
Registered: Aug 2007
Location: Hungary
Distribution: Fedora, CentOS
Posts: 91

Original Poster
Rep: Reputation: 21
Quote:
Originally Posted by perosr View Post
Hi Sheridan,


I have the same problem using embedded jetty in my high load website. Did you manage to solve this issue, because fin timeout isn't working for me also.


Best Regards
Pero Gjuzelov
Hi there,

Unfortunately, I got fired over this shortly after I posted that last message. I hear that four other guys (my "successors", one of them my friend) were also fired over the same issue within a single year after I was gone, so I do not believe a solution has been found. I did not follow it any further. And I certainly wasn't able to think of anything else I could have tried to solve this.

I have never had this issue since...

I'm sorry but I don't have a clue... I couldn't even help myself.

S.
 
Old 11-20-2013, 01:22 AM   #7
kr_gaurav651
LQ Newbie
 
Registered: Nov 2013
Posts: 1

Rep: Reputation: Disabled
Hello, I am stuck with the same problem. My application uses WebSocket over TCP and transfers MBs of data in one go, so a connection stays open for about 1 minute on average. After that, either the client or the server may initiate the disconnection, as the case may be.
Everything works fine as far as application functionality goes. The problem is that my server sockets stay stuck in the FIN_WAIT2 state indefinitely, until I kill my server process (running on port 8095). So I can see multiple connections in the FIN_WAIT2 state. Following is the output of 'netstat -na | grep 8095':

Quote:
[root@localhost ~]# netstat -na | grep 8095
tcp 0 0 ::ffff:172.31.209.18:8095 :::* LISTEN
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:59.182.155.150:20858 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.27:6071 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.27:6570 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.7:32440 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:59.182.155.150:20721 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.119:47107 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.27:15202 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.27:19023 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.119:48941 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.7:13325 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.7:10760 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.7:24342 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.12:10222 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.12:1778 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.64:39367 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.24:58245 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.12:49259 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.80:1083 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.12:2918 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.124:34061 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.12:53370 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.88:11815 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.88:47397 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.64:12556 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.64:15456 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.12:58406 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.10:53305 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.12:18495 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.80:42849 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.64:27007 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.64:60737 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.10:20746 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.64:30023 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.12:17950 FIN_WAIT2
tcp 0 0 ::ffff:172.31.209.18:8095 ::ffff:49.32.0.40:47676 FIN_WAIT2
The client application has been shut down after completing the application flow. Following is the TCP kernel tuning on my Red Hat Enterprise Linux Server release 5.8 (Tikanga) running on HP Proliant G8 server with 192 GB of RAM. No other process is running on this system.
[sysctl.conf]

Quote:
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename
# Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1


# Controls the maximum size of a message, in bytes
kernel.msgmnb = 65536

# Controls the default maximum size of a message queue
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

# Default max queues system wide = 1024
kernel.msgmni = 2048

#Maximum number of file descriptor allowed
fs.file-max = 1048576



# The Linux autotuning TCP buffer limit
# Defines per-socket memory usage for auto-tuning. The third value is the maximum send buffer space
# (overridden by r/wmem_max).
# The receive buffers for auto-tuning
net.ipv4.tcp_rmem = 4096 87380 16777216

# The write (send) buffers for auto-tuning
net.ipv4.tcp_wmem = 4096 65536 16777216

# Determines how the TCP stack should behave for memory usage; each count is in memory pages (typically 4KB).
# Increase the count for large BDP (but remember, it's memory pages, not bytes).
net.ipv4.tcp_mem = 8388608 8388608 8388608

# For 10G NIC
net.core.netdev_max_backlog = 32768

net.ipv4.tcp_congestion_control = htcp

# Recommended for hosts with jumbo frames enabled
net.ipv4.tcp_mtu_probing = 1

# Enables window scaling as defined by RFC 1323. Must be enabled to support windows larger than 64KB.
net.ipv4.tcp_window_scaling = 1


# Enables selective acknowledgment, which improves performance by selectively acknowledging packets received out of order
# (causing the sender to retransmit only the missing segments), but it can increase CPU utilization.
net.ipv4.tcp_sack = 1

#
net.ipv4.tcp_fack = 1

net.ipv4.tcp_timestamps = 1

net.ipv4.tcp_low_latency = 1

net.ipv4.tcp_westwood = 1

net.ipv4.tcp_bic = 1

net.ipv4.tcp_max_tw_buckets = 1440000

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_tw_reuse = 1

net.ipv4.tcp_fin_timeout = 5

# Limit of socket listen() backlog. Defaults to 128. See also tcp_max_syn_backlog for additional tuning
net.ipv4.somaxconn = 4096

net.core.somaxconn = 4096

# Maximum number of remembered connection requests which have still not received an acknowledgment from the connecting client
net.ipv4.tcp_max_syn_backlog = 4096

#
net.ipv4.tcp_no_metrics_save = 1

net.ipv4.tcp_keepalive_intvl = 30

net.ipv4.tcp_keepalive_time = 400

net.ipv4.tcp_keepalive_probes = 5

vm.swappiness = 20
net.ipv4.tcp_max_orphans = 5
net.ipv4.tcp_orphan_retries = 2
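
A quick sanity check on a file like this is to confirm that every key actually exists under /proc/sys on the running kernel (net.ipv4.somaxconn, for one, looks doubtful, since somaxconn normally lives under net.core). The sketch below is only an illustration: check_sysctl_keys is a hypothetical helper, its second argument exists so it can be pointed at a test directory, and it does not handle keys whose interface names contain dots.

```shell
#!/bin/sh
# check_sysctl_keys FILE [ROOT]   (hypothetical helper)
# Prints every key in a sysctl.conf-style FILE that has no matching
# entry under ROOT (default /proc/sys). Read-only: it changes nothing.
check_sysctl_keys() {
    file=$1
    root=${2:-/proc/sys}
    # Drop comments and blank lines, keep only the key left of "=".
    sed -e 's/#.*//' \
        -e '/^[[:space:]]*$/d' \
        -e 's/[[:space:]]*=.*//' \
        -e 's/^[[:space:]]*//' "$file" |
    while read -r key; do
        # net.core.somaxconn -> ROOT/net/core/somaxconn
        path=$root/$(printf '%s' "$key" | tr . /)
        [ -e "$path" ] || printf 'unknown key: %s\n' "$key"
    done
}

# Example: check_sysctl_keys /etc/sysctl.conf
```

Keys reported here are ones the kernel would not apply, so the settings they were meant to carry are not in effect.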
Please help and let me know if additional information is required regarding this.

Thanx in advance.
 
  

