Red Hat: This forum is for the discussion of Red Hat Linux.
I currently have a server that has 26,000 users and a lovely quota system. It appears there is a kernel bug related to quotas that causes the system to hit a race condition every 24-36 hours. Over the past 6 months, Red Hat has provided only two real solutions:
1. Turn off quotas (Not an option)
2. Reinstall to AS3.0 since they cannot replicate the problem on AS3.0.
I really don't like either of them, since I'm paying Red Hat for support on AS2.1.
Is anyone else out there experiencing this same problem? Do I really have to rebuild?
Has anyone found a solution other than rebuilding the entire system and causing a major outage for my users?
Did you contact your sales representative at Red Hat about it?
Are there public bug reports about it which you can post here?
With regard to the age of RHEL 2.1, when did you install the server? Was the bug present since the beginning? Or was it introduced with a kernel erratum?
The suggestions you refer to come from Red Hat Support?
I agree with misc... Red Hat is not going to make you install RHEL 3 or disable quotas as the fix. I highly doubt that's what they recommended. If so, did a senior engineer or technician tell you this?
What kernel version are you using on your RHEL 2.1 system? If you are using the base kernel (2.4.9-e.3), then I highly recommend upgrading to the latest one if you haven't already.
Also, as part of the troubleshooting process, Red Hat is most likely going to check whether you have the latest kernel (2.4.9-e.49) installed, to see if the problem can be replicated.
I was running the enterprise kernel to take advantage of SMP and 8 GB of RAM. We backed down to the single-processor kernel to see if it would relieve the problem.
Current kernel: Linux xx.xx.xx 2.4.9-e.49 #1 Fri Aug 6 11:56:52 EDT 2004 i686 unknown
My trouble ticket with Red Hat was just updated today. It appears they have a NON-production set of RPMs they would like me to test. Yes, the first solution I was given by the technician was to turn off quotas.
Brought to you directly from the trouble ticket log:
---------------------------------------------------------------------------
"The escalation team has also suggested the following:
1) Please turn disk quota off if you are still using it.
2) Please enable nmi_watchdog and obtain during the panic the output of sysrq-w
(a couple times) and a sysrq-m through the serial console. Detailed instructions
on how to do this are provided in the Kernel Profiling document attached to this
ticket, sections Category 1 and Category 2."
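For anyone following along, here is a minimal sketch of how one might check whether the two settings from item 2 are active before waiting for the next lockup. It is not taken from the Kernel Profiling document mentioned in the ticket. The function names are made up for illustration, and it assumes the kernel exposes /proc/cmdline and /proc/sys/kernel/sysrq.

```shell
#!/bin/sh
# Sketch only: report whether nmi_watchdog and magic SysRq look enabled.
# Function names are illustrative, not part of any Red Hat tool.

# Succeeds if the given kernel command line enables the NMI watchdog.
nmi_watchdog_enabled() {
    for arg in $1; do
        case "$arg" in
            nmi_watchdog=0) return 1 ;;
            nmi_watchdog=*) return 0 ;;
        esac
    done
    return 1
}

# Succeeds if the given kernel.sysrq value is non-empty and non-zero.
sysrq_enabled() {
    [ -n "$1" ] && [ "$1" != "0" ]
}

# Read the live values; both stay empty if the files are not readable.
cmdline=$(cat /proc/cmdline 2>/dev/null)
sysrq=$(cat /proc/sys/kernel/sysrq 2>/dev/null)

if nmi_watchdog_enabled "$cmdline"; then
    echo "nmi_watchdog: enabled on the kernel command line"
else
    echo "nmi_watchdog: not set on the kernel command line"
fi

if sysrq_enabled "$sysrq"; then
    echo "sysrq: enabled"
else
    echo "sysrq: not enabled (or not readable)"
fi
```

The watchdog is normally switched on with a boot parameter such as nmi_watchdog=1 on the kernel line (the values 1 and 2 select different NMI sources), and SysRq with kernel.sysrq = 1 in /etc/sysctl.conf. Which values suit this particular hardware is something to confirm with support.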
The upgrade to AS3 was suggested due to the inability to replicate the bug under AS3.
I appreciate the quick replies. If I knew how to crack the kernel open and try to fix it I would.
Last edited by draeician73; 10-23-2004 at 07:53 PM.
But that sounds like they only want to troubleshoot it together with you in order to find out details. It doesn't sound like they want you to turn off quotas as a suggested fix.
There is a non-production kernel and quota package they want me to do some load testing on. This might be a fix.
I know they just wanna help. I'm just getting a bit frustrated after 6 months of having to reboot my mail server daily, with it still locking up within 12 hours at random. I'll keep in touch.