LinuxQuestions.org
Old 05-06-2004, 10:44 AM   #1
rozeboom
LQ Newbie
 
Registered: Oct 2003
Distribution: CentOS5
Posts: 20

Rep: Reputation: 0
help debugging a socket leak


I'm trying to debug a program using lsof. I've been seeing a file-descriptor leak in /proc/$PID/fd from sockets that eventually keeps me from being able to create new network connections when I hit the resource limit.

lsof shows the leaked socket handles as "TYPE=sock 0,0 ...can't identify protocol". I've run strace, and no call (network or otherwise) ever returns a file descriptor matching the bad ones in lsof.

My question is, where are these "sock" type sockets being created? Does anyone else know where they come from and how to get rid of them?

Thanks!
 
Old 05-06-2004, 01:32 PM   #2
infamous41md
Member
 
Registered: Mar 2003
Posts: 804

Rep: Reputation: 30
Can you post the code?
 
Old 05-06-2004, 01:34 PM   #3
rozeboom
LQ Newbie
 
Registered: Oct 2003
Distribution: CentOS5
Posts: 20

Original Poster
Rep: Reputation: 0
I couldn't post the code of the program in question, but I can see if I can create an example.
 
Old 05-06-2004, 01:44 PM   #4
infamous41md
Member
 
Registered: Mar 2003
Posts: 804

Rep: Reputation: 30
The obvious answer is that you're not closing your sockets when you're done with them. If you need them all to be open, check out setrlimit(); I think you can use it to change the maximum number of open descriptors.
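
For reference, here is a minimal sketch of the setrlimit() idea. It assumes Linux and that raising the soft limit to the existing hard limit is enough; the function name raise_fd_limit_to_hard is made up for illustration. If descriptors are actually leaking, this only postpones the failure.

```cpp
#include <sys/resource.h>

// Hypothetical helper: raise the soft limit on open descriptors
// (RLIMIT_NOFILE) up to the current hard limit. Returns true on success.
bool raise_fd_limit_to_hard() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        return false;
    }
    // An unprivileged process may raise the soft limit only up to the hard limit.
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &lim) == 0;
}
```

Calling it once at startup, before any connections are made, would be the usual place.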
 
Old 05-06-2004, 01:49 PM   #5
rozeboom
LQ Newbie
 
Registered: Oct 2003
Distribution: CentOS5
Posts: 20

Original Poster
Rep: Reputation: 0
I've been down that road. I've managed to account for the opening and closing of every socket my program creates; strace helps there. The socket handles being allocated do not seem to be coming from my own code, but rather from somewhere inside the system. With strace I can see every socket I allocate, and my program never allocates these. Since my program is not allocating them, I don't know what they are or how to close them.
 
Old 05-06-2004, 01:57 PM   #6
rozeboom
LQ Newbie
 
Registered: Oct 2003
Distribution: CentOS5
Posts: 20

Original Poster
Rep: Reputation: 0
And I don't need more than a few dozen sockets open for any reason. These leaked "blank" sockets appear every time I make a new network connection. Over time I run out of resources, so raising the limit would only delay the inevitable.
 
Old 05-06-2004, 02:01 PM   #7
infamous41md
Member
 
Registered: Mar 2003
Posts: 804

Rep: Reputation: 30
Well, that's certainly odd. From stracing, you have no idea what part of the code they are coming from? Are you using lots of libraries? If you are doing some sort of name-resolution call before connecting (i.e. gethostbyname, etc.), then of course a few sockets will be opened for DNS resolution, but I certainly hope that is not the problem, or others would have encountered it already. Without seeing code I don't know what else to tell you. How long does it take for you to max out? What kind of program is this?
 
Old 05-06-2004, 02:28 PM   #8
rozeboom
LQ Newbie
 
Registered: Oct 2003
Distribution: CentOS5
Posts: 20

Original Poster
Rep: Reputation: 0
Here's the lsof output, if it helps:
COMMAND    PID USER   FD TYPE     DEVICE    SIZE    NODE NAME
DoubleTak 2454 root  cwd  DIR        3,3    4096 1188252 /opt/NSI
DoubleTak 2454 root  rtd  DIR        3,3    4096       2 /
DoubleTak 2454 root  txt  REG        3,3 8059307 1188473 /opt/NSI/DoubleTake
DoubleTak 2454 root  mem  REG        3,3  103044 1300532 /lib/ld-2.3.2.so
DoubleTak 2454 root  mem  REG        3,3   79744  682809 /lib/tls/libpthread-0.29.so
DoubleTak 2454 root  mem  REG        3,3   91604 1300547 /lib/libnsl-2.3.2.so
DoubleTak 2454 root  mem  REG        3,3   23668 1300541 /lib/libcrypt-2.3.2.so
DoubleTak 2454 root  mem  REG        3,3  366424 1300587 /lib/libacl.so.1.1.0
DoubleTak 2454 root  mem  REG        3,3   15084 1300543 /lib/libdl-2.3.2.so
DoubleTak 2454 root  mem  REG        3,3  710608  374097 /usr/lib/libstdc++.so.5.0.3
DoubleTak 2454 root  mem  REG        3,3  211948  682807 /lib/tls/libm-2.3.2.so
DoubleTak 2454 root  mem  REG        3,3   30324 1300589 /lib/libgcc_s-3.2.2-20030225.so.1
DoubleTak 2454 root  mem  REG        3,3   49287 1300585 /lib/libattr.so.1.1.0
DoubleTak 2454 root  mem  REG        3,3  728579 1301648 /lib/libRSResource.so
DoubleTak 2454 root  mem  REG        3,3 1531064  682805 /lib/tls/libc-2.3.2.so
DoubleTak 2454 root   0u  CHR        1,3           66759 /dev/null
DoubleTak 2454 root   1u  CHR        5,1           65323 /dev/console
DoubleTak 2454 root   2u  CHR        5,1           65323 /dev/console
DoubleTak 2454 root   3u  CHR        5,1           65323 /dev/console
DoubleTak 2454 root  4uW  REG        3,3       0  229150 /tmp/Double-Take
DoubleTak 2454 root   5u  REG        3,3    5750 1188475 /opt/NSI/dtlog1.dtl
DoubleTak 2454 root   6u IPv4       3707             UDP *:1575
DoubleTak 2454 root   7u sock        0,0            3691 can't identify protocol
DoubleTak 2454 root   8u IPv4       3696             UDP *:1578
DoubleTak 2454 root   9u IPv4       3697             UDP 169.254.1.247:1575
DoubleTak 2454 root  10u IPv4       3698             UDP 169.254.1.247:32769
DoubleTak 2454 root  11u IPv4       3701             UDP *:1578
DoubleTak 2454 root  12u IPv4       3702             UDP 10.0.21.154:1575
DoubleTak 2454 root  13u IPv4       3703             UDP 10.0.21.154:32770
DoubleTak 2454 root  14u IPv4       3711             TCP *:1578 (LISTEN)
DoubleTak 2454 root  15u sock        0,0            3714 can't identify protocol
DoubleTak 2454 root  16u sock        0,0            3717 can't identify protocol
DoubleTak 2454 root  17u unix 0xd7443280            3719 socket
DoubleTak 2454 root  18u sock        0,0            3721 can't identify protocol
 
Old 05-06-2004, 02:32 PM   #9
rozeboom
LQ Newbie
 
Registered: Oct 2003
Distribution: CentOS5
Posts: 20

Original Poster
Rep: Reputation: 0
This is a backup program which transmits changes to a backup server. The problem occurs with each connection I create. In the example I posted, handles 15, 16, and 18 display the problem. How quickly I run out of resources depends on how many times I reconnect; in a test environment that can be many times an hour.
 
Old 05-06-2004, 02:44 PM   #10
infamous41md
Member
 
Registered: Mar 2003
Posts: 804

Rep: Reputation: 30
Well, I'm baffled. All the output tells me, as you probably know, is that the problem lies in the code that runs right after the TCP *:1578 (LISTEN) socket is created, since descriptors are always assigned the lowest available number.
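
The lowest-available-number rule is what makes the descriptor order in the lsof listing meaningful. A small sketch of that behaviour, assuming a single-threaded process on a POSIX system (the function name lowest_fd_is_reused is made up for illustration):

```cpp
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical check: a descriptor number released by close() is handed out
// again by the next call that allocates a descriptor.
bool lowest_fd_is_reused() {
    // AF_UNIX is used so that no network access is needed; the numbering
    // rule is the same for every descriptor type.
    int a = socket(AF_UNIX, SOCK_STREAM, 0);
    int b = socket(AF_UNIX, SOCK_STREAM, 0);
    if (a < 0 || b < 0) {
        if (a >= 0) close(a);
        if (b >= 0) close(b);
        return false;
    }
    close(a);                                  // 'a' is now the lowest free number
    int c = socket(AF_UNIX, SOCK_STREAM, 0);   // should receive the same number
    bool reused = (c == a);
    close(b);
    if (c >= 0) close(c);
    return reused;
}
```

Because of this rule, a leaked descriptor stays at the number it was given, and everything allocated afterwards while it is still open lands above it.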
 
Old 05-06-2004, 02:57 PM   #11
rozeboom
LQ Newbie
 
Registered: Oct 2003
Distribution: CentOS5
Posts: 20

Original Poster
Rep: Reputation: 0
Yeah, and the code of this program is complex enough that it's like finding a needle in a haystack. Even strace would only point me at a specific system call, but I was hoping that I could identify the code based on the parameters being passed, etc...

I suspect, as you mentioned earlier, that there are some internal calls which use sockets that may not be getting cleaned up. Thanks for your response. Perhaps someone else will come along who has my answer. I'm going to keep trying to recreate this with simpler code or try to hunt it down in some library code, or something...
 
Old 08-10-2004, 11:58 PM   #12
Toonces7
LQ Newbie
 
Registered: Aug 2004
Location: austin
Posts: 1

Rep: Reputation: 0
Fixed this

Okay, I don't know if anyone's still reading this, but I had exactly the same problem rozeboom describes here and I was able to fix it.

My problem was an incorrect call to socket(). This code was written by someone else but I had to debug it. The problem was that the code failed to recognize that accept() itself creates and opens the new socket, so there is no need to call socket() for it.

So this code was calling accept() AND calling socket(), when it should not have called socket() for an fd obtained from accept(). FDs obtained from accept() are already created and inherit their socket parameters from the socket that accept() was called on.

In my particular case, the code's custom socket class was calling socket() from its constructor. It was doing this for sockets that had already been created via accept(), so two FDs were created and one of them never got cleaned up. My fix was to add a second constructor to my socket class, one that takes an FD as a parameter. That constructor does NOT call socket(). I use this constructor to create socket objects that originate from the accept() call.

It seems to work. Hope this is of use to someone out there.

-Aaron
 
Old 08-11-2004, 09:19 AM   #13
rozeboom
LQ Newbie
 
Registered: Oct 2003
Distribution: CentOS5
Posts: 20

Original Poster
Rep: Reputation: 0
That sounds like it might help... the code I'm working with is also a C++ class for handling sockets and it might very well do exactly what you describe. Thanks for your response!
 
  


All times are GMT -5.
