Hi
I am running a large application that runs more than 2000 user-level threads.
When I run it on RHEL4u3, I get a segmentation fault in free(). It reports that the memory has already been freed.
Whereas if I run the same program on RHEL5 or FC5, I do not get any segmentation fault, even if I run it 4-5 times consecutively.
I have:
RHEL4u3 -- gcc version 4.1.0 20060304 (Red Hat 4.1.0-3) -- kernel 2.6.9-34
RHEL5 -- gcc version 4.1.1 20070105 (Red Hat 4.1.1-52) -- kernel 2.6.18-8
FC5 -- gcc version 4.1.0 20060304 (Red Hat 4.1.0-3) -- kernel 2.6.16-1.2080
Any help/pointers would be more than welcome.
One more thing: I am using two other .so files, which I also created. I want to make sure that those .so files are not the culprit, so I want to debug them with valgrind. But I don't know how to use valgrind with a .so file...
Could someone please guide?
Can't help with valgrind. But here are some "pointers" (funny you should talk about them, they probably are the culprit).
Compile your program for debugging (-g) and run it with gdb. Then with bt, you might get to the line where it crashes. Usually it's a good thing to test that a pointer is not null before attempting to free it. Also, I usually set a pointer to null after freeing it (don't know if it's a good thing, but it works for me).
Second, if you don't get to the root of the problem with gdb, or the line that gdb points to doesn't look descriptive, post the output here, and some parts of the code. Cheers.
Try electric fence also. It substitutes its own malloc() and free() for the standard ones, so your .so files will use it automatically.
I say "try", because, um, you're using POSIX threads? And you're running some 2000 of them concurrently?
One reason that I use threads as little as possible is that one thread can easily trash memory for the "benefit" of the thread that crashes. Makes debugging very difficult.
If threads don't communicate (or start and stop) very often, and by "often" I mean at least a few times a second, I usually try to stick with fork(). I say this with some hesitation, because I know that it would be quite time-consuming to re-factor your code.
If you do convert to separate processes, try to avoid System V style shared memory and semaphores. Instead, lock (portions of) a file for semaphores. For shared memory, your parent process should use mmap() on /dev/zero for as many bytes as it needs.
Avoiding System V style shared memory and semaphores ensures that when your whole program exits, you won't have those pesky shared memory blocks and semaphores leaking all over the place.
Good luck.
Last edited by wjevans_7d1@yahoo.co; 07-10-2007 at 07:28 AM.
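The fork() layout suggested above could look roughly like this. It is only a sketch, assuming Linux: the parent maps shared memory from /dev/zero, and an fcntl() lock on one byte of a scratch file stands in for a semaphore. The names lock_byte and shared_counter_demo, and the /tmp path, are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lock (F_WRLCK) or unlock (F_UNLCK) byte 0 of fd, blocking until granted. */
static int lock_byte(int fd, short type)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 1;
    return fcntl(fd, F_SETLKW, &fl);
}

/* Forks nchildren processes that each add `iterations` to a shared counter.
 * Returns the final counter value, or -1 if setup fails. (Hypothetical helper.) */
static int shared_counter_demo(int nchildren, int iterations)
{
    assert(nchildren >= 0 && iterations >= 0);

    /* Shared memory: a MAP_SHARED mapping of /dev/zero, zero-filled and
     * inherited by every child created with fork(). */
    int zfd = open("/dev/zero", O_RDWR);
    if (zfd < 0)
        return -1;
    int *counter = mmap(NULL, sizeof *counter, PROT_READ | PROT_WRITE,
                        MAP_SHARED, zfd, 0);
    close(zfd);
    if (counter == MAP_FAILED)
        return -1;

    /* "Semaphore": a scratch file, unlinked at once so nothing is left behind. */
    char path[] = "/tmp/lockdemoXXXXXX";
    int lfd = mkstemp(path);
    if (lfd < 0) {
        munmap(counter, sizeof *counter);
        return -1;
    }
    unlink(path);

    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                      /* stop forking on failure */
        if (pid == 0) {                 /* child */
            for (int j = 0; j < iterations; j++) {
                lock_byte(lfd, F_WRLCK);
                ++*counter;
                lock_byte(lfd, F_UNLCK);
            }
            _exit(0);
        }
    }
    while (wait(NULL) > 0)              /* reap every child */
        ;

    int result = *counter;
    munmap(counter, sizeof *counter);
    close(lfd);
    return result;
}
```

With 4 children and 1000 iterations each, shared_counter_demo(4, 1000) should come back as 4000 when every fork() succeeds. The mapping and the file lock both vanish when the last process exits, which is the point of avoiding the System V calls.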
Quote:
Usually it's a good thing to test that a pointer is not null before attempting to free it. Also, I usually set a pointer to null after freeing it (don't know if it's a good thing, but it works for me).
Freeing a null pointer is a no-op, so the test is not needed. Setting a pointer to null only helps when there is exactly one copy of the pointer. If the pointer is passed by value to a function that frees it, the caller's copy is left unchanged, i.e.
Code:
void some_func(foo* f)
{
    ...
    free(f);
    f = 0;  /* only resets the local copy */
}
foo* bar = malloc(sizeof(foo));
some_func(bar);
if(bar != 0) /* this is true: bar still holds the freed address */
Code:
void some_func(foo** f)
{
    ...
    free(*f);
    *f = 0;  /* resets the caller's pointer */
}
foo* bar = malloc(sizeof(foo));
some_func(&bar);
if(bar != 0) /* this is false: bar is now null */
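For anyone who wants to compile the second variant, here is a small self-contained sketch. The struct layout and the names foo_release and demo_release are invented for the example. It also shows the remaining catch: a second copy of the pointer is not reset by the helper.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct foo { int value; } foo;

/* Frees *pp and nulls the caller's own pointer. Calling it twice is
 * harmless because free(NULL) is a no-op. */
static void foo_release(foo **pp)
{
    assert(pp != NULL);
    free(*pp);
    *pp = NULL;
}

/* Returns 1 if the caller's pointer is null after release, -1 on
 * allocation failure. (Hypothetical helper for illustration.) */
static int demo_release(void)
{
    foo *bar = malloc(sizeof *bar);
    if (bar == NULL)
        return -1;

    foo *alias = bar;      /* second copy of the same address */

    foo_release(&bar);     /* bar becomes NULL */
    foo_release(&bar);     /* second call does nothing */

    /* alias was NOT reset by foo_release: it still holds the freed
     * address, so every extra copy has to be cleared by hand. */
    alias = NULL;

    return bar == NULL && alias == NULL;
}
```

Calling demo_release() returns 1. If alias had been passed to free() instead of being cleared, that would be exactly the double free from the original question.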
Quote:
Where as, if I run the same program on RHEL5 or FC5, I am not getting any segmentation fault.
Arr the joys of making a multithreaded application do what you think it's doing
Quote:
Second, if you don't get to the root of the problem with gdb, or the line that gdb points to doesn't look descriptive, post the output here, and some parts of the code. Cheers.
Thanks for your input. I follow the same practice. But, as noted in the other post, I suspect the same thing is happening here: assigning NULL to the pointer in one place will not protect me elsewhere if I am passing the pointer into a function.
However, using gdb alone, I found that the seg fault occurs in the free() function.
Quote:
Try electric fence also. It substitutes its own malloc() and free() for the standard ones, so your .so files will use it automatically.
Good luck.
Hey dude,
Can you give me a URL where I can get efence? I tried searching on the net, but I couldn't find it specifically for 64-bit RHEL4u3.
So I ended up using DUMA.
Double free happens when the same object is deleted twice. ALWAYS set pointers to NULL right after deleting them, even if it's in a destructor. You also need mutexes when doing something significant like deleting.
I usually place my inter-thread dynamic objects in lists which are centrally-accessible. They are encapsulated so that a mutex must be set before adding, removing, or modifying elements.
When a function or object needs to access an object in the list, I pass a 'const void*' corresponding to that object, which never gets cast. Instead, the function gains access to the master list in turn (setting the mutex) and searches the list for a matching pointer. If the object is there, the function uses it safely, and since the mutex is set it cannot be deleted while it's being used. If the object isn't there, there isn't an undefined behavior problem; the function just logs an error and exits.
The encapsulation I use (written by me, admittedly) provides unlimited read-only access, so the mutex doesn't need to be clear just to read an object. If a read-only operation is lengthy I will access in write mode to keep another thread from changing the list.
You don't necessarily need to use the encapsulation I use (it does have its price,) but I highly recommend tabulating, encapsulating, centralizing, and restricting modification access to inter-thread dynamic objects.
ta0kira
PS This applies to C++, but can be isolated to allow use with C. The major project I'm working on now uses C++ internally to help with this but the API is all in C.
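A stripped-down C sketch of the tabulate-and-look-up idea described above. This is not the encapsulation from the post: it uses one plain mutex with no separate read-only mode, a fixed-size table, and invented names (reg_add, reg_use, reg_destroy, bump).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define REG_MAX 64  /* arbitrary capacity for the sketch */

static pthread_mutex_t reg_lock = PTHREAD_MUTEX_INITIALIZER;
static void *reg_items[REG_MAX];
static size_t reg_count;

/* Caller must hold reg_lock. Returns the slot index, or -1 if absent. */
static int reg_find(const void *handle)
{
    for (size_t i = 0; i < reg_count; i++)
        if (reg_items[i] == handle)
            return (int)i;
    return -1;
}

/* Registers a heap object. Returns 0 on success, -1 if the table is full. */
static int reg_add(void *obj)
{
    int rc = -1;
    assert(obj != NULL);
    pthread_mutex_lock(&reg_lock);
    if (reg_count < REG_MAX) {
        reg_items[reg_count++] = obj;
        rc = 0;
    }
    pthread_mutex_unlock(&reg_lock);
    return rc;
}

/* Runs fn on the object only if the handle is still registered. The lock
 * is held during fn, so the object cannot be destroyed mid-use. */
static int reg_use(const void *handle, void (*fn)(void *obj))
{
    int rc = -1;
    pthread_mutex_lock(&reg_lock);
    int i = reg_find(handle);
    if (i >= 0) {
        fn(reg_items[i]);
        rc = 0;
    }
    pthread_mutex_unlock(&reg_lock);
    return rc;
}

/* Frees the object only if it is still registered, so a second call with
 * the same handle returns -1 instead of double-freeing. */
static int reg_destroy(const void *handle)
{
    int rc = -1;
    pthread_mutex_lock(&reg_lock);
    int i = reg_find(handle);
    if (i >= 0) {
        free(reg_items[i]);
        reg_items[i] = reg_items[--reg_count];  /* fill the gap */
        rc = 0;
    }
    pthread_mutex_unlock(&reg_lock);
    return rc;
}

/* Example callback: treats the object as an int and increments it. */
static void bump(void *obj)
{
    ++*(int *)obj;
}
```

A thread that holds a stale handle gets -1 back from reg_use() or reg_destroy() rather than touching freed memory. One caveat with matching on raw addresses: malloc() may hand the same address out again later, so a stale handle could match a newer object. A generation counter per slot would be one way around that.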
Quote:
The encapsulation I use (written by me, admittedly) provides unlimited read-only access, so the mutex doesn't need to be clear just to read an object. If a read-only operation is lengthy I will access in write mode to keep another thread from changing the list.
If you're using threads, then I hope that by "lengthy" you mean "two bytes or longer". Nothing more frustrating to find than a race condition bug that happens extremely rarely.
"Lengthy" in this case means "must perform multiple operations on the retrieved element," such as read/process/read/etc., as opposed to reading a single int. I'll probably add individual element encapsulation in addition to list encapsulation so the entire list doesn't have to be locked out to delete an element. The list (also written by me) uses stable pointers (as a selectable option), so even if an element is repositioned, its pointer, and hence the threads using it, remain valid. The encapsulation I use does have the ability to indicate current read-only access status, and also the ability to "kick off" modules currently accessing an element, so it wouldn't take a whole lot of work to add that protection as well. That was part of the initial program design, but I hadn't gotten that far pending structural revisions and a working prototype. Element encapsulation is still a consideration subject to revision, but the fundamental concept of tabulating and encapsulating remains a valid strategy that has solved all of my thread-related problems up to this point.
ta0kira
Yes, that is possible, but the program doesn't deal with indefinite sizes or byte arrays in the context under speculation. Nothing retrieved is larger than a register, and I highly doubt any recent kernel will ever write a byte while the word itself is being read, especially since the word will always be dealt with as a word in the program.
You do bring up good points, but as I stated before it's already a consideration, and the only place I read where write access isn't locked out reads a register-sized int which is in no danger of being popped or deleted mid-read.
ta0kira
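One way to make the "register-sized read without a lock" assumption explicit is to declare the shared word atomic. This sketch assumes a compiler with C11 <stdatomic.h>, which is newer than the gcc versions mentioned in this thread. The names g_status, status_writer and run_status_demo are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define STATUS_STEPS 100000

/* A single word-sized value shared between threads without a mutex. */
static atomic_int g_status;

/* Writer thread: publishes increasing values 1..STATUS_STEPS. */
static void *status_writer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= STATUS_STEPS; i++)
        atomic_store_explicit(&g_status, i, memory_order_release);
    return NULL;
}

/* Reads the word concurrently with the writer. Returns 1 if every read
 * saw a whole value that never went backwards and the final value is
 * STATUS_STEPS, 0 if not, and -1 if the thread could not be started. */
static int run_status_demo(void)
{
    pthread_t t;
    int last = 0;
    int ok = 1;

    atomic_store(&g_status, 0);
    if (pthread_create(&t, NULL, status_writer, NULL) != 0)
        return -1;

    for (int i = 0; i < STATUS_STEPS; i++) {
        int now = atomic_load_explicit(&g_status, memory_order_acquire);
        assert(now >= 0 && now <= STATUS_STEPS);
        if (now < last)
            ok = 0;        /* would indicate a torn or reordered read */
        last = now;
    }

    pthread_join(t, NULL);
    return ok && atomic_load(&g_status) == STATUS_STEPS;
}
```

This only covers a single word. As soon as a reader needs two related values to be consistent with each other, it is back to taking the lock.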