Root unable to delete file
Hello,
Strangest thing I have ever seen. Any idea why I can't delete a file even though I'm root?
Code:
[root@test directory]# rm -f .result.php.swp
Hello,
Could this be a partition mounted read-only? Or is the directory set to immutable? Check with mount and lsattr.
Kind regards, Eric
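The read-only check suggested above can be scripted. This is only a sketch: the function name is_readonly_mount and its optional second argument are made up for illustration, and it assumes the usual /proc/mounts layout (device, mount point, file system type, options).

```shell
#!/bin/sh
# Sketch: report whether a mount point is mounted read-only.
# is_readonly_mount is a hypothetical helper, not a standard tool.
# Exit status: 0 = read-only, 1 = not read-only, 2 = mount point not found.
is_readonly_mount() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}   # assumed default location
    awk -v mp="$mountpoint" '
        # Remember the options of the last matching line; later mounts
        # shadow earlier ones on the same mount point.
        $2 == mp { opts = $4; found = 1 }
        END {
            if (!found) exit 2
            n = split(opts, o, ",")
            # Match "ro" only as a whole option, not inside "errors=remount-ro".
            for (i = 1; i <= n; i++) if (o[i] == "ro") exit 0
            exit 1
        }' "$mounts_file"
}
```

For example, `is_readonly_mount / && echo "root file system is read-only"`. For the immutable case, `lsattr -d .` lists the directory's attributes; an `i` in the flag field means immutable, and `chattr -i` clears it.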
Yes, the answer is right in the middle of all the text.
Quote:
You could try to remount it, but it would be a good idea to run "fsck -y" on it first. Or just reboot the system. If this keeps happening, it would be worth finding out why it is switching to read-only.
Thanks. Something has definitely gone wrong. I rebooted the system and saw the following during boot-up.
Code:
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
Code:
Setting hostname name.domain.com: [ OK ]
Code:
2 logical volume(s) in volume group "VolGroup00" now active
Quote:
I did power the system down by pressing the power button, because I had typed "power -h now" but never saw the system power down completely. Could the above have been caused by some data not being synced to the hard disk? Or does the hard disk itself have a physical problem?
Quote:
Thanks. That message "ata2: softreset failed (device not ready)" has been appearing for a while, and I have had no clue what it was about. But what exactly does it mean? How can the device be not ready and yet usable? And what is a soft reset?
You raise some good questions that I can't fully answer. The error sounds like the kernel device driver received an unexpected response of some form from the SATA controller. A soft reset suggests a "reset" command, which could be issued in reaction to an error. Unfortunately, your guess is as good as mine.
One interesting thing I noticed is that if you Google the term "ata2: softreset failed" you get a lot of hits regarding known bugs, so apparently there was a kernel change at some point that affected this. To be safe, you could run some disk check utilities periodically, but you should run them with the drive not mounted, such as from a live CD. I think this is one to keep an eye on for a while, watching for more errors. You should also make sure you keep backups of important data, though this is always a good precaution.
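One way to keep an eye on this is to count how often the message shows up per day. A minimal sketch, assuming syslog-style lines like the ones quoted in this thread (month, day, time, host, then the kernel message); the function name count_softreset_by_day is made up for illustration:

```shell
#!/bin/sh
# Sketch: count "ataN: softreset failed" kernel messages per day.
# count_softreset_by_day is a hypothetical helper; pass it one or more
# log files, or pipe log text into it.
count_softreset_by_day() {
    awk '
        /ata[0-9]+: softreset failed/ {
            day = $1 " " $2                 # e.g. "May 15"
            if (!(day in count)) order[++n] = day
            count[day]++
        }
        # Print days in the order they first appeared in the log.
        END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
    ' "$@"
}
```

Run it against /var/log/messages (or syslog, depending on the distribution). A count that keeps climbing from day to day suggests the problem is getting worse rather than being a one-off.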
From past not-fun experiences with RAID5: *sometimes* one drive fails. No problem, you insert the new drive, only to find out that another drive has an error, the array cannot rebuild, and the machine sits at a blinking cursor. (The other drive did NOT report any errors.) Always nice.
I am no fan of RAID5. As the post above said, be sure the data is backed up; it sounds like the drive is getting ready to do something, and it does not sound good. I converted all physical machines to virtual machines (VMware, not easy) in a SAN HA environment and have another SAN taking snapshots of the production SAN. If the prod SAN goes belly up, I can put the other SAN into production. I got burned by RAID5 too many times. I never take anything for granted: it will go belly up and put you in a not-so-nice 'emergency mode' position...
Quote:
However, a softreset failure can be caused by a failure in the hard drive itself. Your symptoms indicate that your hard drive is dead or dying.
If your hard drive supports S.M.A.R.T. (and if it is of the rotational variety, it most likely does), you can use smartctl from smartmontools to run an offline check to update the status information, check the attributes, especially the reallocated sector count, and then run a short or long self-test. A fast-increasing or maxed-out reallocated sector count is the most reliable indicator of total failure in the near future. A self-test failure means the disk is dead: any further data you put on it is likely to be lost, and you have a very limited opportunity to save its contents.
To save the contents of a dying disk, get another disk that is as large or larger, and create an image of (the partitions on) the old one using sudo dd conv=noerror,sync if=/dev/sdX of=image-file bs=512 (unless it was purely a RAID5 member, in which case don't bother; just get a new disk and rebuild the array). The sync option pads each unreadable block with zeros, so the rest of the image keeps its correct offsets. While dd is running, sudo killall -USR1 dd makes every running dd print progress information.
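The SMART steps above might look roughly like this. It is a sketch, not a definitive procedure: /dev/sdX is a placeholder, reallocated_sectors is a made-up helper name, and the parsing assumes the usual ATA attribute table from smartctl -A, with the attribute name Reallocated_Sector_Ct in the second column and the raw value in the tenth.

```shell
#!/bin/sh
# Typical sequence (run as root, /dev/sdX is a placeholder):
#   smartctl -t offline /dev/sdX    # refresh the offline-collected status
#   smartctl -A /dev/sdX            # print the attribute table
#   smartctl -t short /dev/sdX      # or: -t long
#   smartctl -l selftest /dev/sdX   # read the self-test log afterwards

# Sketch: pull the raw reallocated sector count out of "smartctl -A" output.
# reallocated_sectors is a hypothetical helper; it reads the table on stdin
# and exits non-zero if the attribute is not present.
reallocated_sectors() {
    awk '
        $2 == "Reallocated_Sector_Ct" { print $10; found = 1 }
        END { exit !found }
    '
}
```

Usage would be something like `smartctl -A /dev/sdX | reallocated_sectors`. Logging that number regularly makes a fast-increasing count easy to spot.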
Quote:
It's much better to invest in hard drives known to be reliable, and to replace them once they show more than a dozen reallocated sectors. It's a pity Samsung sold their hard drive business to Seagate, as the larger Samsung hard drives were the cream of the crop in my experience; they were a lot cheaper than the only other alternative for me, Western Digital. I wouldn't take Seagate disks even if I got them for free; I have had so many problems with them. (The funniest one was a Maxtor disk more than a decade ago: it was unbalanced, and would not stay put on a table when turned on. Only vibrator I ever owned.) Well, I hear Hitachi enterprise-grade drives are good, but I've got no experience with those. Oh, and if you monitor the drives, some drives nowadays also have a temperature sensor you can use to keep tabs on the server status.
To add to this story: I was working on a file with a command like "view text.log" and suddenly saw a message from syslogd saying I/O can not commit. So I checked /var/log/messages and saw the following being logged over and over.
Most of it is alien language to me, but one thing I did notice is that back on 15 May it logged the following line.
Code:
May 15 04:14:14 test kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Code:
May 21 21:31:15 test kernel: ata2.00: status: { DRDY }
Code:
SMART Self-test log structure revision number 1
Code:
[root@test log]# smartctl -t long /dev/sda1
Strangely enough the extended test was completed without errors.
Code:
SMART Error Log Version: 1
Quote:
Code:
root@CW8:/var/log# grep 'ata2: softreset failed' syslog
Quote:
You know, the log looks suspiciously like a cable problem. I'd reseat (detach and reattach) all SATA cables (including SATA power cables) to see if that helps.