The following problem occurs on an FC3 filesystem that is used only for storage (i.e. it gets mounted with -o sync, data is saved, then it gets unmounted). Note: the FC3 OS on the hard drive holding this filesystem is not booted up and running; the drive is simply powered up and ready whenever the computer system is on. The FC3 filesystem is an ext3 journaled filesystem.
Using dumpe2fs, I have noticed that the primary superblock is the only clean superblock - even after this morning's boot of FC3, which came after 32 mounts (2 over the maximum mount count of 30) and therefore caused fsck to run. This means that the 12 alternative superblocks record a "not clean" status for the ext3 filesystem even after fsck has been run. This does not seem to be how it should be, but that is how it is.
This has me concerned, naturally, because none of the alternate superblocks are usable, and when the primary goes south they cannot be used to fix the problem.
The Problem
What happens occasionally is that the ext_attr filesystem (FS) feature goes missing from the primary superblock - I'm not sure why as yet. When this happens, the journal is not applied when the filesystem is mounted from another Linux system (i.e. an OS that is booted up and running) on the same computer. Obviously, this would be the perfect opportunity to take advantage of a "clean" alternative superblock that still contains the ext_attr feature. The current way I deal with this situation is to boot up either the FC3 OS or another Linux system which does not appear to manifest the missing ext_attr FS feature. Note: when this occurrence happens, the filesystem state is still marked clean in the primary superblock; however, the missing ext_attr causes the journal not to be applied on a mount. Also, none of the "not clean" alternative superblocks contain the ext_attr feature.
Potential fix for the problem
I know where the alternate superblocks are located (by running the dumpe2fs command and piping the output into a grep command that looks for the string "superblock"). It therefore seems that the dd command can be used (very carefully) to overwrite the "not clean" alternative superblocks with a copy of the primary superblock, provided the primary superblock is clean, an fsck has just been run and, most importantly, ext_attr has not gone missing from the primary superblock. With good copies available, a primary superblock that later loses ext_attr could be restored, so that the other Linux system applies the journal first and completes a successful mount, instead of skipping the journal as it does when ext_attr is missing.
What command can be, or typically is, used to repair the alternative superblocks (e.g. debugfs), i.e. to make them consistent with a clean primary superblock? Or have I stumbled upon the way to do it (the dd command) in the process of thinking about this problem? One would think that when a primary superblock is fixed by fsck, the alternative superblocks would be updated too - but this does not seem to be the case with FC3.
Here is an example of using dd commands to repair the alternative superblocks (shown for the 1st alternative superblock only):
For the purpose of this example, here is a truncated list of the primary and 1st alternative superblocks from the output of the dumpe2fs command:
Primary superblock at 0, Group descriptors at 1-5
Backup superblock at 32768, Group descriptors at 32769-32773
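As a rough sketch of the "dumpe2fs piped into grep" step, the block numbers can be pulled out of lines like the two above with a small filter. The function name superblock_blocks is made up for this illustration, and the filter only assumes the "... superblock at N, ..." wording shown in the listing:

```shell
# Hypothetical helper (name invented for this example): reads dumpe2fs-style
# text on stdin and prints the block number of every superblock line,
# one per line, in the order they appear.
superblock_blocks() {
    awk '/superblock at/ {
        for (i = 1; i < NF; i++) {
            if ($i == "at") {
                sub(/,$/, "", $(i + 1))   # drop the trailing comma after the number
                print $(i + 1)
                break                     # only the first "at" on the line
            }
        }
    }'
}

# Intended use (placeholder device name, read-only):
#   dumpe2fs /dev/sdXN 2>/dev/null | superblock_blocks
```

Fed the two sample lines, it prints 0 and 32768 on separate lines.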
Given: FS blocksize=4096; primary superblock at=0; 1st alternative superblock at=32768 and size of superblock=1024 <=== Is this correct??? Hard drive is 80GB SATA
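To copy the 1st backup superblock (assuming it is clean) over the primary superblock. With bs=1024, skip and seek count 1024-byte units, so block 32768 of a 4096-byte-block FS is unit 32768 x 4 = 131072; the primary superblock starts 1024 bytes into the partition, hence seek=1:
# dd if=/dev/sdbn of=/dev/sdbn bs=1024 skip=131072 seek=1 count=1
To copy the primary superblock (assuming it is clean) over the 1st backup superblock (same units, so skip=1 reads the primary and seek=131072 writes the backup):
# dd if=/dev/sdbn of=/dev/sdbn bs=1024 skip=1 seek=131072 count=1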
Note: /dev/sdbn is replaced by the actual device name where 'n' is some number: 0, 1, 2, ...
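Since skip= and seek= are counted in units of bs, not in filesystem blocks, the arithmetic is worth spelling out. The sketch below only prints candidate dd commands and never runs them. It rests on two assumptions that are not stated in the dumpe2fs output: the primary superblock starts at byte 1024 of the partition, and each backup superblock starts at byte 0 of its filesystem block. The function names and /dev/sdXN are placeholders made up for this example.

```shell
# Convert a filesystem block number to dd units of 1024 bytes.
# $1 = filesystem block number, $2 = filesystem block size in bytes
sb_dd_units() {
    echo $(( $1 * $2 / 1024 ))
}

# Print (do NOT execute) the two copy commands for one backup superblock.
# $1 = device (placeholder), $2 = backup block number, $3 = FS block size
print_sb_copy_cmds() {
    dev=$1
    units=$(sb_dd_units "$2" "$3")
    # backup -> primary: read at the backup offset, write at byte 1024
    echo "dd if=$dev of=$dev bs=1024 skip=$units seek=1 count=1 conv=notrunc"
    # primary -> backup: read at byte 1024, write at the backup offset
    echo "dd if=$dev of=$dev bs=1024 skip=1 seek=$units count=1 conv=notrunc"
}

print_sb_copy_cmds /dev/sdXN 32768 4096
```

For block 32768 and a 4096-byte block size this gives 131072 units of 1024 bytes (a byte offset of 134217728). conv=notrunc is included so the same commands can be rehearsed on an image file copy without truncating it. A raw copy also carries over any fields that are meant to differ between copies, such as the superblock's own block group number, so this is a sketch of the offsets and not a recommendation.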
A few last questions:
1) Will the dd commands work and not bork up the good primary superblock - have I got the right superblock size=1024?
2) Is there another way to do this, say with the debugfs command?
3) Is it not advisable to use the dd command (as above) to overwrite the alternative superblock(s) from the primary superblock?
4) Is this a bug in FC3 or any subsequent Fedora release?
5) If not, then why does the fsck command not repair or update the alternative superblocks when the primary superblock is fixed?
6) Is that an fsck.ext3 command bug/omission - i.e. that the alternative superblocks are not updated when the primary superblock is fixed by fsck.ext3?
-- Tom
P.S. I have recently been issuing three sync commands after all of the data transfers have been made to the successfully mounted FC3 file system (i.e. journal has been applied) before unmounting it. Hopefully, this may help to ameliorate the problem.
P.P.S. For more information about the symptoms of this problem and the other Linux environment in which I operate, look at this thread.