LinuxQuestions.org
Old 08-04-2008, 04:37 PM   #1
Othyisar
LQ Newbie
 
Registered: Aug 2008
Posts: 4

Rep: Reputation: 0
ext3 fs goes ro after a day or three; nfs sharing issues


I have built three SAN partitions on my EMC DMX800 and attached them to a Linux server (Dell PowerEdge 2650, Linux 2.6.9-42.ELsmp #1 SMP) with the intent to share them out via NFS.

One of them is a home directory file system, auto-mounting to other unix systems. The other two are just NFS-shared file systems.

I keep running into problems with these file systems: they go read-only or, in the case of the home directory, end up with corrupted files. I have fsck'ed them and got them back to a usable state, only to have them become corrupted or go read-only again.

I have disabled the home directory system so I can concentrate on one of the others which is a critical file system for our network.

Errors I keep seeing in /var/log/messages:

Jul 28 19:30:01 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_journal_start_sb: Detected aborted journal
Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_free_blocks_sb: bit already cleared for block 15107529
Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_free_blocks_sb: bit already cleared for block 15107530
Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_free_blocks_sb: bit already cleared for block 15107531
Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_free_blocks_sb: bit already cleared for block 15107533
Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_journal_start_sb: Detected aborted journal
Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_reserve_inode_write: Journal has aborted
Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_reserve_inode_write: Journal has aborted
Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_orphan_del: Journal has aborted
Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_truncate: Journal has aborted
Jul 29 12:59:48 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_check_descriptors: Block bitmap for group 16 not in group (block 33554432)!
Jul 29 12:59:48 kcnfsp01 kernel: EXT3-fs: group descriptors corrupted !
Jul 29 13:00:04 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_check_descriptors: Block bitmap for group 16 not in group (block 33554432)!
Jul 29 13:00:04 kcnfsp01 kernel: EXT3-fs: group descriptors corrupted !
Jul 29 13:00:55 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_check_descriptors: Block bitmap for group 16 not in group (block 33554432)!
Jul 29 13:00:55 kcnfsp01 kernel: EXT3-fs: group descriptors corrupted !
Jul 29 13:01:01 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_check_descriptors: Block bitmap for group 16 not in group (block 33554432)!
Jul 29 13:01:01 kcnfsp01 kernel: EXT3-fs: group descriptors corrupted !
Jul 29 13:01:39 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_check_descriptors: Block bitmap for group 16 not in group (block 33554432)!
Jul 29 13:01:39 kcnfsp01 kernel: EXT3-fs: group descriptors corrupted !
Jul 29 13:03:08 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_check_descriptors: Block bitmap for group 16 not in group (block 33554432)!
Jul 29 13:03:08 kcnfsp01 kernel: EXT3-fs: group descriptors corrupted !
Jul 29 13:13:39 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 13:13:51 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 13:21:29 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 13:38:36 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 13:43:53 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 13:44:11 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 13:44:21 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 13:45:32 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 13:48:28 kcnfsp01 kernel: EXT3-fs error (device sdd1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 14:00:05 kcnfsp01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jul 29 14:00:05 kcnfsp01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jul 29 14:00:05 kcnfsp01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jul 29 14:02:44 kcnfsp01 kernel: EXT3-fs: recovery complete.
Jul 29 14:02:44 kcnfsp01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jul 29 14:08:31 kcnfsp01 kernel: EXT3-fs: journal inode is deleted.
Jul 29 14:29:53 kcnfsp01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jul 29 22:09:54 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 29 22:09:54 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_journal_start_sb: Detected aborted journal
Jul 29 22:32:34 kcnfsp01 kernel: EXT3-fs: journal inode is deleted.
Jul 30 00:32:26 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 30 08:58:16 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 30 08:59:39 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 30 09:00:58 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_free_blocks_sb: bit already cleared for block 50985247
Jul 30 09:00:58 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_reserve_inode_write: Journal has aborted
Jul 30 09:00:58 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_truncate: Journal has aborted
Jul 30 09:00:58 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_reserve_inode_write: Journal has aborted
Jul 30 09:00:58 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_orphan_del: Journal has aborted
Jul 30 09:00:58 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_reserve_inode_write: Journal has aborted
Jul 30 09:00:58 kcnfsp01 kernel: EXT3-fs error (device sdc1) in ext3_delete_inode: Journal has aborted
Jul 30 09:00:58 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_journal_start_sb: Detected aborted journal
Jul 30 09:03:14 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 30 09:05:02 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 30 09:21:39 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Jul 30 09:23:45 kcnfsp01 kernel: EXT3-fs: recovery complete.
Jul 30 09:23:45 kcnfsp01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jul 30 11:13:34 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_new_block: Allocating block in system zone - block = 16547840
Jul 30 11:13:34 kcnfsp01 kernel: EXT3-fs error (device sde1) in ext3_reserve_inode_write: Journal has aborted
Jul 30 11:13:34 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_journal_start_sb: Detected aborted journal
Jul 30 11:13:34 kcnfsp01 kernel: EXT3-fs error (device sde1) in ext3_ordered_commit_write: Journal has aborted
Jul 30 12:54:57 kcnfsp01 kernel: EXT3-fs warning (device sde1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure
Jul 30 12:54:57 kcnfsp01 kernel: EXT3-fs warning (device sde1): ext3_clear_journal_err: Marking fs in need of filesystem check.
Jul 30 12:54:57 kcnfsp01 kernel: EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
Jul 30 12:54:57 kcnfsp01 kernel: EXT3-fs: recovery complete.
Jul 30 12:54:57 kcnfsp01 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Aug 1 12:55:01 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_new_block: Allocating block in system zone - block = 16547841
Aug 1 12:55:01 kcnfsp01 kernel: EXT3-fs error (device sde1) in ext3_reserve_inode_write: Journal has aborted
Aug 1 12:55:01 kcnfsp01 kernel: EXT3-fs error (device sde1): ext3_journal_start_sb: Detected aborted journal
Aug 1 12:55:01 kcnfsp01 kernel: EXT3-fs error (device sde1) in ext3_ordered_commit_write: Journal has aborted
Aug 2 21:00:08 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_readdir: bad entry in directory #25116673: rec_len % 4 != 0 - offset=0, inode=93754411, rec_len=21073, name_len=237
Aug 2 21:00:08 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_readdir: bad entry in directory #8142849: rec_len % 4 != 0 - offset=0, inode=3395865643, rec_len=15878, name_len=180
Aug 3 21:00:05 kcnfsp01 kernel: EXT3-fs error (device sdc1): ext3_readdir: bad entry in directory #2: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Aug 4 10:23:05 kcnfsp01 kernel: EXT3-fs warning (device sde1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure
Aug 4 10:23:05 kcnfsp01 kernel: EXT3-fs warning (device sde1): ext3_clear_journal_err: Marking fs in need of filesystem check.
Aug 4 10:23:05 kcnfsp01 kernel: EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
Aug 4 10:23:05 kcnfsp01 kernel: EXT3-fs: recovery complete.
Aug 4 10:23:05 kcnfsp01 kernel: EXT3-fs: mounted filesystem with ordered data mode.

/dev/sdc is the home directory file system (big-time errors); sdd and sde are the others. /dev/sde is the one I am working on now, hoping its resolution will be applicable to the other two as well.
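For anyone triaging a log like the one above, it can help to tally the EXT3-fs error lines by device and by the kernel function that reported them, which shows at a glance which device is affected and in what way. Below is a minimal Python sketch. The function name tally_ext3_errors and the regular expression are illustrative assumptions based only on the two line formats quoted in this post; this is not a standard tool, and other kernels may word these messages differently.

```python
import re
from collections import Counter

# Matches the two error formats seen in the log above, for example:
#   "... EXT3-fs error (device sdc1): ext3_readdir: bad entry ..."
#   "... EXT3-fs error (device sdc1) in ext3_truncate: Journal has aborted"
# Warnings and plain "EXT3-fs:" status lines are deliberately not matched.
EXT3_ERROR = re.compile(
    r"EXT3-fs error \(device (?P<dev>\w+)\)(?::| in) (?P<func>\w+):"
)


def tally_ext3_errors(lines):
    """Count EXT3-fs error lines per (device, reporting function)."""
    counts = Counter()
    for line in lines:
        match = EXT3_ERROR.search(line)
        if match:
            counts[(match.group("dev"), match.group("func"))] += 1
    return counts


# A few lines copied from the log in this post, used as sample input.
sample_lines = [
    "Jul 28 19:30:01 kcnfsp01 kernel: EXT3-fs error (device sdd1): "
    "ext3_journal_start_sb: Detected aborted journal",
    "Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1): "
    "ext3_free_blocks_sb: bit already cleared for block 15107529",
    "Jul 29 10:25:25 kcnfsp01 kernel: EXT3-fs error (device sdc1) in "
    "ext3_truncate: Journal has aborted",
    "Jul 29 14:00:05 kcnfsp01 kernel: EXT3-fs: mounted filesystem with "
    "ordered data mode.",
]

for (dev, func), n in sorted(tally_ext3_errors(sample_lines).items()):
    print(f"{dev:6} {func:28} {n}")
```

To run it against the real log, pass an open file instead of the sample list, e.g. tally_ext3_errors(open("/var/log/messages", errors="replace")).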

When this happens, I unmount the file system from all client hosts, remove it from /etc/exports, run 'exportfs -r', and then unmount it on the server. When I remount it (no fsck), it's fine for another day or so, then goes ro again.
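Since the file system only flips to read-only every day or so, it may be useful to detect the flip when it happens instead of hearing about it from users. The sketch below parses mount-table text in the /proc/mounts layout (device, mount point, type, options, ...) and reports ext3 file systems whose options include ro. The function names and the mount points in the sample are hypothetical, and it assumes that /proc/mounts reflects the kernel's current view after ext3 remounts read-only on error, whereas /etc/mtab may still say rw.

```python
def read_only_mounts(mounts_text, fstype="ext3"):
    """Return (device, mount point) pairs of the given type mounted read-only."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, kind, options = fields[:4]
        # Compare whole options so that e.g. "errors=remount-ro" is not
        # mistaken for the "ro" flag.
        if kind == fstype and "ro" in options.split(","):
            found.append((device, mount_point))
    return found


def check_live_system(path="/proc/mounts"):
    """Convenience wrapper that reads the running system's mount table."""
    with open(path) as mounts:
        return read_only_mounts(mounts.read())


# Hypothetical sample; the mount points are made up for illustration.
sample = (
    "/dev/sda1 / ext3 rw 0 0\n"
    "/dev/sdd1 /export/apps ext3 rw,data=ordered 0 0\n"
    "/dev/sde1 /export/data ext3 ro,data=ordered 0 0\n"
)

for device, mount_point in read_only_mounts(sample):
    print(f"{device} on {mount_point} is read-only")
```

Run from cron every few minutes, check_live_system() would narrow down the time of the failure, which can then be matched against the kernel messages and against activity on the SAN or fiber switches.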

I ran fsck on the home directory file system the first time it reported journaling errors, and while it fixed about a zillion errors, it also removed journaling and made the fs ext2, and when I remounted it, it was empty. I restored from tape but have since left it offline.

I have also checked with EMC, and there are no disk errors on any of the devices in this SAN system. It's used for a few dozen other servers, has been in place for years, and has no other issues.

As all three file systems I have put on this server are showing the same or similar problems, I assume the issue is with the server and not the SAN.

I do also see this issue on boot, which I am not sure is related:

kernel: nfs warning: mount version older than kernel
amd[2614]: mount_nfs_fh: NFS version 3

In addition, NFS services are failing to start on normal reboot, although I have placed the scripts after all other network service scripts in /etc/rc2.d. When I log in after a reboot and start NFS, it starts fine.

I'm about at wits' end. Any suggestions?
 
Old 08-06-2008, 03:13 PM   #2
trickykid
Guru
 
Registered: Jan 2001
Posts: 24,133

Rep: Reputation: 199
From my experience, when a filesystem jumps to read-only, it was hardware related. Either the drive is going bad or the controller is going bad or needs firmware updates.
 
Old 08-08-2008, 04:02 PM   #3
Othyisar
LQ Newbie
 
Registered: Aug 2008
Posts: 4

Original Poster
Rep: Reputation: 0
Yeah, exploring that possibility now. It doesn't work on either fiber path, and the paths are connected to different fiber switches, so I'm looking internally. Nothing else seems to be a problem, though...

Thanks for your reply.
 
Old 08-08-2008, 11:04 PM   #4
jiml8
Senior Member
 
Registered: Sep 2003
Posts: 3,171

Rep: Reputation: 115
You should run smartctl on those drives to see what the error rate is like. You could also have corruption on the system hard drive, such that the file system handlers and libraries are corrupted.

You have a hardware problem someplace.
 
  



Tags
ext3, mount, nfs, readonly, redhat

