Linux - Software: This forum is for Software issues.
Edit: ** This might not be the right forum/subforum for this post, many apologies. If the mods know where it belongs, please move it to the appropriate place **
I sincerely doubt that anyone has enough time to even read through this post, but I am going to take my chances as I have nowhere else to turn and just hope for the best.
My problem:
I have an LVM volume spanning 6 disks that holds two directory structures (dir1, dir2). When issuing an 'ls' or 'dir' in dir1, I get a file list without problems. Doing the same in dir2, the system halts after reading from the disks, without any file list being shown at all. The cursor freezes, and all I can do is cut the power to the machine and restart it. dir2 is the larger of the two structures.
When copying data from dir1, the system sometimes halts. Repeating the same procedure (copying the exact same files) sometimes works. The volume set is formatted with XFS, and I have tried running xfs_check. This also leads to a halt, but an immediate one; with dir, the disks read for a while (~8 seconds) before the system freezes.
Steps I have taken in order to find the problem:
Checked RAM for errors, and replaced them
Updated Kernel to latest version
Changed the two controller cards used for the LVM
Installed a fresh system, and mounted the LVM
None of these steps has helped.
So, is it a faulty disk?
I would think so. But then, how come I can access data from dir1, and then the next time I try with the same files, the system halts? It makes no sense to me.
The system has worked flawlessly for over two years, and now, without my making any actual alterations, it doesn't.
I am all out of ideas, and I'm turning to you guys here. If anyone has any ideas.. Please..
System Specs:
Intel P4 3GHz, 2x512 MB RAM
SuSE Linux 9.2
2 x Promise PATA 133 TX2 controllers
I will be delighted to fill in with more details of the system if required.
Since the system is halting on you, you probably can't get a good look at syslog.
Before doing the steps to trigger the problem, can you come in remotely via ssh from some other computer and then run a tail -f on /var/log/syslog? Recreate your halting problem on the local computer, and maybe you'll see something of interest on the remote computer's tail command.
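A rough sketch of the idea above, in case it helps. The helper name find_syslog and the candidate paths are my own assumptions, not something SuSE documents: it simply returns the first readable log file from a list, so you can tail whichever one your system actually has.

```shell
#!/bin/sh
# Print the first readable file among the given candidates.
# The candidate paths passed in are guesses; check syslog.conf for the
# real locations on your distribution.
find_syslog() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

# Usage, inside the ssh session, before triggering the freeze locally:
#   log=$(find_syslog /var/log/syslog /var/log/messages) && tail -f "$log"
```

You will probably need to be root in the ssh session to read the log. Whatever tail prints last before the connection dies is the interesting part.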
Thanks for trying to help Haertig.
I can access the machine via SSH, but I am not sure how to tail the syslog file. I'm running SuSE 9.2, and there is no such file as syslog. Should I configure syslog.conf to make syslogd output everything to a text file? Again, thanks for helping.
I'm not familiar with SuSE, but it's GOT to have a syslog file. That's quite standard. I can't imagine not having one configured by default. Maybe it's configured to be in a different place than mine, which is in /var/log/syslog.
Check out your /etc/syslog.conf. Here are the lines from my syslog.conf that show where standard logfiles are saved. Look for where SuSE stores the files on your system.
p.s. - If you make changes to your syslog.conf file, I believe you'll have to restart syslogd for them to take effect. I doubt you'll have to really change anything. Just snoop around syslog.conf to find out where they're currently being stored. I just can't fathom SuSE not having a syslog defined by default.
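For snooping around syslog.conf, something like the following might save some reading. It is only a sketch and assumes the classic "selector action" layout of /etc/syslog.conf; the function name list_syslog_files is made up for this example. It prints the regular files that syslogd is told to write to and skips devices such as consoles.

```shell
#!/bin/sh
# List the log file paths named as actions in a classic syslog.conf.
# Assumes one "selector  action" pair per line. A leading "-" on the
# action (do not sync after every write) is stripped. Entries under
# /dev/ are skipped because they are consoles, not log files.
list_syslog_files() {
    conf=${1:-/etc/syslog.conf}
    awk '!/^[ \t]*#/ && NF >= 2 {
        action = $NF
        sub(/^-/, "", action)
        if (action ~ /^\// && action !~ /^\/dev\//) print action
    }' "$conf" | sort -u
}

# Usage:
#   list_syslog_files                  # reads /etc/syslog.conf
#   list_syslog_files /path/to/copy    # or any other file in that format
```

Any of the files it prints is a candidate for tail -f from the remote session.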
But no, the syslog doesn't reveal anything about what is happening or why the system halts.
Bummer. I was hoping you'd find some info there.
I guess we need to be specific about what you mean by "halt". Are you getting a message in your terminal window something like "Kernel panic, system halted"? Now THAT would be a halt. Or is it just that your local terminal window is freezing up (but you can still use the system via that ssh connection you set up from the other computer)? Or, are you getting NO messages anywhere, and everything - including that remote ssh connection - is just frozen?
A frozen application vs. a frozen Xwindows vs. a halted system are all different things. I can't determine exactly which situation you are dealing with without more info. My hunch is that you're dealing with a hardware problem or some corruption in your filesystem or LVM. You've already taken some good troubleshooting steps.
I'm a bit concerned about your earlier statement:
Quote:
Checked RAM for errors, and replaced them
Does this mean you actually FOUND memory problems and replaced that bad ram? If so, the bad ram might have been the initial source of the problem, but it cascaded into filesystem or LVM corruption. i.e., you've now fixed the initial issue, but you still have to deal with the various corruption it might have caused.
In most of my system hang problems, I was able to ssh in and look around, but yours may be different.
A question: Are those LVM disks mounted as your "root" partition? If so, then you should be able to let it sit around a while after it freezes and then reboot. You should have something getting logged and if you have journaling, it will still show up next boot.
To be more precise, the system FREEZES, all activity stops. If I run ls from a prompt the cursor ceases to blink, I can't trigger ctrl-z, nothing - except for turning the power off. If I go into dir2 (see first post) through Xwindows, the mouse cursor freezes and all activity stops.
Quote:
Or, are you getting NO messages anywhere, and everything - including that remote ssh connection - is just frozen.
^- Here is exactly where I am at.
Regarding the RAM, I changed the modules just to see if new ones would make any difference, but they didn't - same error - so I put the original ones back. Please bear in mind, everything worked fine up to the point when things started to go wrong.
Quote:
Are those LVM disks mounted as your "root" partition? If so, then you should be able to let it sit around a while after it freezes and then reboot. You should have something getting logged and if you have journaling, it will still show up next boot.
I knew I would reach the point where I couldn't answer the question. If by root partition you mean that the LVM includes the system drive, then the answer is no.
Being a Fedora user, I'm not sure if this applies to SuSE. But, for what it's worth, Fedora uses LVM2 and implements the device mapper, so the logical volumes are attached as children of /dev/mapper. If you're going to run fsck on a logical volume, you must do it on the appropriate /dev/mapper entry, not on /dev/hdx. For one thing, the logical volume partition type on /dev/hdx is not a valid Linux file system partition type.
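To illustrate the /dev/mapper point, here is a small sketch. The names datavg and datalv are hypothetical and dm_path is a helper made up for this post; the naming rule it follows is that device-mapper joins the volume group and logical volume names with a hyphen and doubles any hyphen that is part of a name. Whether SuSE 9.2 exposes the volume this way is an assumption.

```shell
#!/bin/sh
# Build the /dev/mapper path for a logical volume from its volume group
# name ($1) and logical volume name ($2). Hyphens inside either name are
# doubled, and a single hyphen separates the two names.
dm_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

# Usage with hypothetical names, as root, with the filesystem unmounted:
#   lvscan                                      # list the logical volumes
#   xfs_repair -n "$(dm_path datavg datalv)"    # -n: check only, change nothing
```

Since the volume here is XFS, xfs_repair takes the place of fsck, and the -n switch keeps it from writing anything while you are still diagnosing.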
I have tried using xfs_check and xfs_repair, which are the tools used for XFS. This freezes the machine immediately. xfs_repair freezes at "Phase 1 - find and verify superblock..."
Today I changed motherboard, processor and memory just to be on the safe side. Same problem.