Linux - General
This Linux forum is for general Linux questions and discussion. If it is Linux related and doesn't seem to fit in any other forum, then this is the place.
A few things in there were larger than I expected, but nothing was extreme.
When was that run? You seem to have an idea of when the failure is approaching, and looking at the slabinfo when the failure is about to happen or starting to happen would be more informative than looking at it when the system is healthy.
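To compare a healthy snapshot with one taken near the failure, it helps to rank the slab caches by approximate size. The following is only a minimal sketch, assuming the slabinfo version 2.x column layout (name, active_objs, num_objs, objsize, ...). The function names and the sample numbers are made up for illustration; they are not from the poster's system.

```python
def parse_slabinfo(text):
    """Return {cache_name: approx_bytes} from /proc/slabinfo-style text."""
    usage = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the version banner and the column-header comment.
        if not line or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        name = fields[0]
        num_objs, objsize = int(fields[2]), int(fields[3])
        # Approximation: allocated objects times object size (ignores slab overhead).
        usage[name] = num_objs * objsize
    return usage


def top_caches(text, n=5):
    """Return the n largest caches as (name, approx_bytes), biggest first."""
    usage = parse_slabinfo(text)
    return sorted(usage.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical sample data, invented for illustration only.
SAMPLE = """slabinfo - version: 2.0
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount> <limit> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
size-4096 100 120 4096 1 1 : tunables 24 12 8 : slabdata 120 120 0
dentry_cache 5000 6000 144 27 1 : tunables 120 60 8 : slabdata 223 223 0
buffer_head 20000 20000 52 75 1 : tunables 120 60 8 : slabdata 267 267 0
"""

if __name__ == "__main__":
    # On a live system, pass open("/proc/slabinfo").read() instead
    # (reading that file may require root).
    for name, size in top_caches(SAMPLE, 3):
        print(f"{name:20s} {size / 1024:10.0f} KiB")
```

Saving the ranked list every few minutes and comparing the last snapshot before the failure with an earlier one should show whether any one cache keeps growing.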
Quote:
How can I see the total size of kernel memory available?
I'm not sure. I did a web search for info about the file /proc/kcore and everything I found says that file represents a binary image of physical memory. But when I look at that file on various systems, it seems to represent a binary image of kernel virtual memory, so its size is the limit of the size of kernel virtual memory.
So what is the output of ls -l /proc/kcore?
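The same number that ls -l shows can be read from a script, which is handy if you want to log it next to other stats. This is just a sketch: the helper name is made up, and how to interpret the size of /proc/kcore is the guess described above, not something this code verifies.

```python
import os


def file_size_mib(path):
    """Size that stat() reports for path, in MiB."""
    return os.stat(path).st_size / (1024 * 1024)


if __name__ == "__main__":
    path = "/proc/kcore"
    if os.path.exists(path):
        print(f"{path}: {file_size_mib(path):.0f} MiB")
    else:
        print(f"{path} is not available on this system")
```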
Quote:
How to increase the kernel memory size?
It is a build time option when you recompile the kernel. Do you know how to recompile a kernel?
Quote:
Will migrating to a higher version of RHEL (5.3 etc.) help?
I know nothing about your system and the applications you run. I especially know nothing about the mercd driver that seems to be at the center of your problem.
Do you pay for support for this RHEL system? If you do, you should be asking Red Hat for some support. If you don't, you ought to be using CentOS instead of RHEL.
Maybe switching to RHEL or CentOS version 5 would help. Maybe what you're seeing is an old bug that was fixed long ago in RHEL itself or in the mercd driver. I don't know any of that stuff.
Is your hardware 64 bit capable? Do you have a good reason for running 32 bit RHEL rather than 64 bit? I think switching to 64 bit is more likely to fix the problem than switching just to version 5.
I took the stats when the problem occurred. The output of free suggests that physical memory is OK, but the telephony card driver reports that it is unable to allocate memory. I cannot tell where the issue is. Please help.
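One thing the totals from free do not show is how much low memory is left. On a 32-bit kernel built with highmem support, /proc/meminfo normally carries LowTotal and LowFree fields, and kernel allocations generally come from that low region. Whether that is what the mercd driver is running out of is not established here; the sketch below only shows how to pull those fields out. The function names and the sample figures are hypothetical.

```python
def parse_meminfo(text):
    """Return {field: value_in_kB} from /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        # Keep only "Name:   12345 kB" style lines.
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def low_memory_report(info):
    """Return (low_total_kB, low_free_kB, percent_free), or None if the
    kernel does not expose LowTotal/LowFree (for example a 64-bit kernel)."""
    if "LowTotal" not in info or "LowFree" not in info:
        return None
    total, free = info["LowTotal"], info["LowFree"]
    pct = 100.0 * free / total if total else 0.0
    return total, free, pct


# Hypothetical sample values, invented for illustration only.
SAMPLE = """MemTotal:      4149412 kB
MemFree:         16636 kB
HighTotal:     3276224 kB
HighFree:         1024 kB
LowTotal:       873188 kB
LowFree:         15612 kB
"""

if __name__ == "__main__":
    # On a live system, use open("/proc/meminfo").read() instead of SAMPLE.
    report = low_memory_report(parse_meminfo(SAMPLE))
    if report is None:
        print("No LowTotal/LowFree fields on this kernel")
    else:
        total, free, pct = report
        print(f"low memory: {free} of {total} kB free ({pct:.1f}%)")
```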
When was that run? You seem to have an idea of when the failure is approaching, and looking at the slabinfo when the failure is about to happen or starting to happen would be more informative than looking at it when the system is healthy.
It generally happens 3 days after a system restart, when load is at about its peak (30%, which is normal load for the IVR). If I restart only the card driver, it runs for 8-9 hours and then the problem starts again. I will provide slabinfo the next time the problem occurs. It is a voice IVR application.
It is a build time option when you recompile the kernel. Do you know how to recompile a kernel?
No, but I can do it. I have been working on Linux for the past 4 years. I am more of an application designer/developer with an understanding of telecom networks.
Quote:
Do you pay for support for this RHEL system? If you do, you should be asking Red Hat for some support. If you don't, you ought to be using CentOS instead of RHEL.
No, we don't. Yes, I will consider it before putting such a system into production. But the problem is that the current system is at a remote location, and any hardware or software change is very difficult. So my entire priority is identifying the root cause.
I have faced this issue before. That time the system was running RHEL 4.3, and my vendor suggested upgrading the OS to RHEL 4.5. I did, and we also replaced the chassis (server). The problem was resolved. But now we have RHEL 4.5 and we are facing the issue again. I am looking for the root cause, to kill this issue once and for all.
Quote:
Is your hardware 64 bit capable? Do you have a good reason for running 32 bit RHEL rather than 64 bit? I think switching to 64 bit is more likely to fix the problem than switching just to version 5.
I don't think so. I will consider 64-bit from now on.
I have also taken similar logs of the system, as posted earlier, but this time the system was working fine after the card driver restart. This is just for comparison purposes. See if you can find anything:
I was considering setting vm.overcommit_memory = 2. I read in a Red Hat optimization guide that it increases the RAM availability to the system. I don't know whether it will help or not.
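For reference, as far as the kernel documentation describes it, mode 2 does not add memory. It caps the total that user-space processes may commit at swap plus a percentage of RAM, where the percentage is vm.overcommit_ratio (default 50). The arithmetic can be sketched as below. The RAM and swap figures are hypothetical, and the sketch ignores huge pages, which the kernel subtracts from RAM first.

```python
def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50):
    """Approximate CommitLimit under vm.overcommit_memory = 2.

    CommitLimit = swap + RAM * overcommit_ratio / 100
    (huge pages, which the kernel excludes from RAM, are ignored here).
    """
    return swap_total_kb + mem_total_kb * overcommit_ratio // 100


if __name__ == "__main__":
    # Hypothetical machine: about 4 GB of RAM and 2 GiB of swap.
    mem_kb, swap_kb = 4149412, 2097152
    for ratio in (50, 80):
        limit = commit_limit_kb(mem_kb, swap_kb, ratio)
        print(f"overcommit_ratio={ratio}: CommitLimit is about {limit} kB")
```

Since this limit applies to user-space allocations, it is unclear whether it would change anything for a driver that fails to allocate memory inside the kernel.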
I have faced this issue before. That time the system was running RHEL 4.3, and my vendor suggested upgrading the OS to RHEL 4.5. I did, and we also replaced the chassis (server). The problem was resolved.
Also consider the possibility that you didn't so much resolve the problem as delay its occurrence.
Quote:
I have collected the vital stats when the problem occurred
I don't believe that you recorded this at exactly the time that the problem started; was it just before or just after the problem actually started?
Quote:
Code:
vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  0      0  16636 161352 3758148    0    0     2    28   23    11  1  1 97  1
I'm not sure that the output of vmstat is helping much but, if vmstat were to help, you'd have to do something other than this. The first line of vmstat output reports averages since boot, so it probably misleads about what is currently going on; you need the multi-line output.
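Running something like vmstat 5 6 prints six reports at five-second intervals, and only the reports after the first describe the current interval. A minimal sketch of processing such output, dropping the first sample, might look like this. The function names are made up; the first data line of the sample is the one posted above, and the other two are invented for illustration.

```python
def parse_vmstat(text):
    """Return a list of {column: int} dicts, one per sample line."""
    samples, columns = [], None
    for line in text.splitlines():
        fields = line.split()
        # Skip blanks and the "procs ---memory---" banner (vmstat repeats it).
        if not fields or fields[0].startswith("procs"):
            continue
        if fields[0] == "r":  # the column-name row
            columns = fields
            continue
        if columns and len(fields) == len(columns):
            samples.append(dict(zip(columns, map(int, fields))))
    return samples


def interval_samples(text):
    """Drop the first sample, which holds averages since boot."""
    return parse_vmstat(text)[1:]


# First data line is from the post above; the other two are hypothetical.
SAMPLE = """procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0      0  16636 161352 3758148    0    0     2    28   23    11  1  1 97  1
 1  0      0  16500 161360 3758200    0    0     0    40 1100   900 20  8 70  2
 2  0      0  16320 161372 3758260    0    0     0    64 1250  1010 24  9 65  2
"""

if __name__ == "__main__":
    for s in interval_samples(SAMPLE):
        print(f"free={s['free']} kB  us={s['us']}%  sy={s['sy']}%  wa={s['wa']}%")
```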
Do you know what that zombie process was? Not the driver for your telephony card, by any chance?
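If it is not obvious which process is the zombie, the state field in /proc/<pid>/stat can be scanned for "Z". The sketch below is one way to do that, assuming a Linux /proc layout; the function names are made up for illustration. The parent PID it reports is usually the process to look at, since a zombie stays around until its parent reaps it.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state, ppid) from a /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx])
    comm = stat_line[open_idx + 1:close_idx]
    rest = stat_line[close_idx + 1:].split()
    return pid, comm, rest[0], int(rest[1])


def find_zombies(proc_root="/proc"):
    """Return [(pid, comm, ppid)] for every process in state Z."""
    zombies = []
    if not os.path.isdir(proc_root):
        return zombies  # not a Linux-style /proc
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state, ppid = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while scanning, or unreadable entry
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return zombies


if __name__ == "__main__":
    for pid, comm, ppid in find_zombies():
        print(f"zombie pid={pid} name={comm} parent={ppid}")
```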
At this point, I have a suspicion that there is simply a bug, or maybe an incompatibility in the card driver (did you install it from a repo, did you build it yourself from a tarball or something else?), but I have no idea how to proceed further without more information.