I have been facing this problem for some time now and feel it is time to ask around for a solution. It would be great if you could provide some definitive pointers.
I am running a networking application over an SCTP (Stream Control Transmission Protocol) stack implemented along the lines of RFC 2960. The SCTP stack sits on top of the IP layer and makes raw IP system calls for all networking operations.
The setup is as follows:
1. There are two Linux machines connected back to back over a 100 Mbps Ethernet interface. Each has a 1.6 GHz Intel Pentium 4 processor and 1 GB of RAM, and each runs Red Hat 9.0.
2. Each machine has the SCTP stack running over the IP layer. On top of the SCTP stack there is a load application which can pump data messages (100 bytes each, for a period of 5 minutes) at different rates, configurable at run time.
3. The intent is to evaluate the performance of the SCTP stack in the above configuration.
4. To start with, messages are pumped from both ends at a moderate rate (1000 MSG/sec), and the rate is gradually increased until the buffers at the transport layer become insufficient to handle the rate at which the application (basically a load generator) is pumping messages from both sides.
5. When the message rate reaches 35000 MSG/sec, one of the two computers stops responding and stays hung, and I have to force a hard reboot. Ctrl+C and similar keys do not work, the traces on the console stop, and no keys except Caps Lock, Scroll Lock and Num Lock seem to work. When I telnet to the hung machine from the peer, the session gets as far as the escape sequence message but the login prompt never appears. The hung machine does, however, still respond to ping requests.
6. The SCTP stack and the application run as a binary in user mode, with root (superuser) privileges.
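The load pattern described in items 2 to 5 can be sketched roughly as below. This is only an illustration of a rate-paced sender, not the actual load application: `run_load`, its parameters, and the `send` callback (a stand-in for whatever send call the SCTP stack exposes) are all hypothetical names.

```python
import time

MESSAGE_SIZE = 100  # bytes per message, as in the setup above


def run_load(send, rate, duration, clock=time.monotonic, sleep=time.sleep):
    """Call send(payload) `rate` times per second for `duration` seconds.

    `send` is a hypothetical callback standing in for the stack's send call.
    `clock` and `sleep` are injectable so the pacing can be tested offline.
    Returns the number of messages handed to `send`.
    """
    payload = bytes(MESSAGE_SIZE)
    interval = 1.0 / rate
    total = int(rate * duration)
    start = clock()
    sent = 0
    while sent < total:
        # Pace against absolute deadlines so timing errors do not accumulate.
        due = start + sent * interval
        delay = due - clock()
        if delay > 0:
            sleep(delay)
        send(payload)
        sent += 1
    return sent
```

A run matching the starting point of the test would look like `run_load(send, rate=1000, duration=300)`. At 35000 MSG/sec the interval between messages is under 30 microseconds, which is likely shorter than the granularity of an ordinary sleep, so a loop of this kind would probably end up sending back to back and keeping the CPU busy.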
Could you let me know what is happening? How can a user-mode program force the kernel into an infinite loop so that it stops responding? I have checked /var/log/messages but didn't find anything fishy.
This is a pretty obscure question... you might have more luck in the networking forum. But just to be clear, the application is entirely user-space? No corresponding module inserted into the kernel? I can think of a couple of things it could be, perhaps related to corruption of kernel memory. But a couple of questions:
1) What kernel version exactly are you running (type uname -a)?
2) Is it always the same machine that hangs? If so, it may simply be a case of dodgy hardware buckling under load.
Thanks for your input.
Yes, the application runs entirely in user space; it only makes raw IP system calls (to the underlying IP layer) for all networking operations.
No corresponding module is inserted into the kernel.
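For readers unfamiliar with what "raw IP system calls" from user space look like, here is a minimal sketch. It is not taken from the stack in question; `open_raw_sctp_socket` is a hypothetical helper, and the only facts relied on are that SCTP has IP protocol number 132 and that raw sockets on Linux need root (or the CAP_NET_RAW capability), which is consistent with the binary running as root.

```python
import socket

SCTP_PROTOCOL_NUMBER = 132  # IANA-assigned IP protocol number for SCTP


def open_raw_sctp_socket():
    """Try to open a raw IPv4 socket that carries SCTP packets.

    Returns (sock, None) on success or (None, error) on failure,
    for example when the process lacks the required privileges.
    """
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, SCTP_PROTOCOL_NUMBER)
    except OSError as exc:
        return None, exc
    return sock, None


if __name__ == "__main__":
    sock, err = open_raw_sctp_socket()
    if sock is None:
        print("could not open raw socket:", err)
    else:
        print("raw socket opened, fd", sock.fileno())
        sock.close()
```

With a socket like this, the user-space stack builds and parses the SCTP headers itself, and the kernel only handles the IP layer underneath.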
Secondly, I have tried this on various machines (running 2.6 and 2.4 kernels) with different hardware configurations, but the problem persists.
I would like to add that the problem surfaces only when we are pumping messages at a very high rate and CPU utilization approaches 100%. The messages are pumped for around 5 minutes, but if the machine hangs it does not recover even if left idle for the next 10-12 hours.
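To put the rate in perspective, the 35000 MSG/sec figure can be converted into an approximate load on the 100 Mbps link. The calculation below is a rough estimate under stated assumptions: one DATA chunk per packet (no bundling), an IPv4 header without options, standard Ethernet framing, and no accounting for SACK traffic in the reverse direction. The function names are made up for this example.

```python
PAYLOAD = 100             # bytes per message (from the setup description)
RATE = 35000              # messages per second at which the hang appears
LINK_BPS = 100_000_000    # 100 Mbps Ethernet

SCTP_COMMON_HEADER = 12       # bytes, per RFC 2960
SCTP_DATA_CHUNK_HEADER = 16   # bytes, per RFC 2960
IPV4_HEADER = 20              # bytes, assuming no IP options
ETHERNET_OVERHEAD = 14 + 4 + 8 + 12  # header, FCS, preamble+SFD, inter-frame gap


def wire_bytes_per_message(payload=PAYLOAD):
    """Bytes occupied on the wire by one message sent in its own packet."""
    return (payload + SCTP_DATA_CHUNK_HEADER + SCTP_COMMON_HEADER
            + IPV4_HEADER + ETHERNET_OVERHEAD)


def link_utilisation(rate=RATE, payload=PAYLOAD, link_bps=LINK_BPS):
    """Fraction of the link used in one direction at the given message rate."""
    return rate * wire_bytes_per_message(payload) * 8 / link_bps


if __name__ == "__main__":
    print(wire_bytes_per_message())       # 186
    print(f"{link_utilisation():.1%}")    # 52.1%
```

On these assumptions each direction of the link is only about half loaded, which suggests that per-packet processing cost rather than raw bandwidth is the tighter limit, in line with the CPU utilization approaching 100%.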
Waiting for your suggestions.
Thanks and Best Regards