If you're a system admin responsible for Linux servers in racks, locked away with no keyboard or mouse directly connected, then I'm very interested in your replies to the following questions:
I need to repair a file system on a PC that I can reach across the LAN, but of course only while that PC's file system is mounted and its O/S is running. Herein lies the problem. Both PCs run Linux, but to the best of my knowledge neither has a serial console BOOT PROM (or any serial console capability).
The file system checking utilities warn of the risk of EXTREME damage to mounted file systems. But if the root file system is not mounted, the O/S is not running and no network devices are working, which makes the machine unreachable via Ethernet.
To my knowledge, that leaves 2 choices:
1. Boot from a CD/DVD drive attached locally to the machine whose file system I want to fix, because a live Linux can identify the attached drives and work on unmounted file systems.
2. Connect via the serial console, if the hardware has the BOOT PROM support for this. While I do have serial console experience--from a Windows workstation connected to a Sun SPARC server--I do not know how to run or use fsck or e2fsck. Is this even an option between 2 Linux PCs?
So, how do all you sys admins repair file systems on servers you can't walk up to and use directly?
Thanks in advance for your replies, and thanks for reading my long question!!
I'd still like to know how to repair a file system remotely; via ssh would be great!
As you already know, it depends on the system's partition scheme. If the file system in question is _not_ the root partition, it may be easy to unmount, check, and remount it without switching to single-user mode, which may break networking as well (it _will_ do so on Debian GNU/Linux, for example, but your distribution may handle this differently).
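For a non-root partition, the unmount, check, remount sequence could look roughly like the sketch below. The device /dev/sdb1 and the mount point /srv/data are made-up placeholders, and the remount step assumes the partition has an /etc/fstab entry. Nothing runs until you call check_fs as root.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute your own device and mount point.
DEV=/dev/sdb1
MNT=/srv/data

# Succeed if device $1 is listed in a mounts table (default: /proc/mounts).
# Note: the kernel may list the device under another name (e.g. a
# /dev/mapper path), so treat a "not mounted" answer with some care.
is_mounted() {
    while read -r dev _rest; do
        if [ "$dev" = "$1" ]; then
            return 0
        fi
    done < "${2:-/proc/mounts}"
    return 1
}

# Unmount, check, and remount. Run as root.
check_fs() {
    umount "$MNT" || return 1     # fails while files on it are still open
    if is_mounted "$DEV"; then
        echo "$DEV is still mounted; refusing to run fsck" >&2
        return 1
    fi
    fsck "$DEV"
    status=$?
    # Per fsck(8): 0 = no errors, 1 = errors were corrected.
    if [ "$status" -le 1 ]; then
        mount "$MNT"              # relies on the /etc/fstab entry
    fi
    return "$status"
}
```

If umount reports that the target is busy, something still has files open on the partition; stop those services first rather than forcing the check.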
However, I suspect you want to repair the file system on the root partition. That is the harder part, and it is commonly done by physically accessing the system and booting a rescue (live) system from CD/DVD or USB mass storage.
You'll see that there is no special hardware requirement, and I'd be surprised if your systems lack serial console capability. Of course, you'll need a serial connection between the system in question and another running system. But a serial port is (still) quite common, even if things like floppy drives and PS/2 ports are fading away. Apart from the console-related settings, you'll want to add another kernel option, "init=/bin/sh", to stop the normal boot-up, including mounting "/" read-write. A USB-to-serial converter will _not_ work, because USB may never be initialized unless it is included in a special initial RAM disk (often by making it a kind of mount dependency for the root fs or the like).
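To make the console part concrete, here is a sketch of the settings involved. It assumes GRUB 2 on a distribution that reads /etc/default/grub, the first serial port (ttyS0), and 115200 baud; all three are assumptions, and GRUB Legacy and LILO use a different syntax.

```shell
# /etc/default/grub (fragment; GRUB 2 reads this as shell variable assignments)
# Assumed: first serial port (unit 0 = ttyS0) at 115200 baud, 8N1.

# Show the GRUB menu on both the serial line and the local screen.
GRUB_TERMINAL="serial console"
GRUB_SERIAL_COMMAND="serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1"

# Kernel messages go to both consoles; the last console= listed
# becomes /dev/console.
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8"
```

After editing, the GRUB configuration has to be regenerated (update-grub on Debian-style systems, grub-mkconfig -o /boot/grub/grub.cfg on many others). On the machine at the other end of the null-modem cable, something like "screen /dev/ttyS0 115200" should then show the boot menu and kernel messages.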
Well, while I do have headless systems like these and have used the serial console on running systems, I must confess that I have never tested the console for recovery so far. I hope this still gives you some points to start from.
What a great post! Thanks for taking the time to share all that useful info with me (and everyone else on LQ)!!
I ended up switching into single user mode and running fsck.ext2 on the 2 ext2 partitions. After repairing the ext2 partitions, the swap partition tested good and the machine seems to be working fine now.
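For reference, checking every ext2 partition from single-user mode could be sketched as follows. This is only an illustration, not a record of the exact commands used: the helper names ext2_devices and check_all_ext2 are made up, and the sketch assumes the partitions are listed in /etc/fstab by device name.

```shell
#!/bin/sh
# Print the device field of every ext2 entry in an fstab-style file
# (default: /etc/fstab). Comment and blank lines are skipped.
ext2_devices() {
    while read -r dev _mnt type _rest; do
        case "$dev" in
            ''|'#'*) continue ;;
        esac
        if [ "$type" = "ext2" ]; then
            echo "$dev"
        fi
    done < "${1:-/etc/fstab}"
}

# Force a check of each ext2 file system. Run as root from single-user
# mode, with each file system unmounted (or, for "/", remounted
# read-only first: mount -o remount,ro /).
check_all_ext2() {
    for dev in $(ext2_devices "$@"); do
        fsck.ext2 -f "$dev" || echo "fsck.ext2 exited with status $? on $dev" >&2
    done
}
```

Running ext2_devices on its own first is a harmless way to confirm which partitions would be checked before calling check_all_ext2.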
As for the hardware part of serial consoles, I've always used Sun's BOOT PROM, so I really have no experience with serial console connections on machines without those BOOT PROMS. I guess I was assuming they are required, since they contain the code to operate the server without the OS running. From what you said, sounds like I need to learn a lot more about GRUB, and maybe LILO, too.
Thank you very much for the links and comments and a great reply!!!