LinuxQuestions.org
Debian This forum is for the discussion of Debian Linux.

Old 10-10-2008, 09:27 PM   #1
websissy
Member
 
Registered: Jul 2008
Posts: 49

Rep: Reputation: 15
Why did linux boot with main file system read only after kernel install?


A few days ago, I tried migrating mailman 2.1.11 to my Debian etch 4.0r3 server. I backed up my old install of the mailman app and its data from my old Redhat server (where it had been running under python 2.4), moved the archive to my new Debian server, and extracted it into /usr/local/mailman (where it would also be running under python 2.4!).

That approach was unsuccessful. I'm not sure why. It seemed to me that what I did should have worked fine. But for some reason it did not.

So today I started over. I began by renaming the /usr/local/mailman directory (to protect its archives (data) directory contents from accidental deletion).

Next, I checked to confirm the python version installed on my debian server was still okay using python -v from the shell prompt to verify that python ran and to confirm what version it was. Python came right up saying it was v2.4.
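
For reference, a sketch of a scripted version check, assuming a `python` binary on the PATH. `python -V` (capital V) prints the version and exits, whereas lowercase `-v` starts the interpreter in verbose import-tracing mode, whose startup banner also happens to show the version. The helper name `py_major_minor` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reduce a "Python X.Y.Z" banner to "X.Y".
py_major_minor() {
    # $1 is a version banner such as "Python 2.4.4"
    echo "$1" | awk '{ split($2, v, "."); print v[1] "." v[2] }'
}

# Python 2 writes the -V banner to stderr, so merge stderr into stdout.
if command -v python >/dev/null 2>&1; then
    py_major_minor "$(python -V 2>&1)"
fi
```

A check like this only shows that the interpreter starts; it says nothing about whether the pieces a build script looks for are present.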

So far so good...

My third step was to download a fresh copy of mailman 2.1.11 from the gnu server and unzip and untar it into a fresh mailman directory where it would soon be installed. As soon as that was finished, I started gradually working through the setup process to install Mailman using the admin installation guide on gnu's mailman site. However, when I got to the step that told me to run ./configure, configure immediately bitched that there was something wrong with the python installation and insisted python should be repaired before continuing. The interesting part is python 2.4 had been installed with aptitude but had not been used since then because I hadn't needed it yet. It was installed specifically for the needs of this site and for the mailman application. So, I'm not sure how python got "damaged".

Okay... now I was suddenly on a whole new troubleshooting path. What I did next was fire up aptitude and uninstall python 2.4 completely with the intent of reinstalling it immediately. I uninstalled python and, of course, the way aptitude works, it automatically removed a list of other apps that were no longer needed at the same time. When the uninstall completed, I turned around and reinstalled python and its docs along with a python runtime speedup tool named psyco all at once.
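
When `./configure` rejects the Python installation, the exact complaint is usually the quickest lead. A sketch, assuming a standard autoconf-generated `configure`, which records its checks in `config.log` in the build directory. The helper name `python_checks` is hypothetical, and since the post does not quote the actual error, nothing here identifies the real cause.

```shell
#!/bin/sh
# Hypothetical helper: list the lines of a configure log that mention Python,
# with line numbers, so the failing check is easy to find.
python_checks() {
    # $1 is the path to a log file such as ./config.log
    grep -n -i 'python' "$1"
}

# Typical use from the unpacked source directory:
#   ./configure 2>&1 | tee configure.out
#   python_checks config.log
```

If the log points at missing headers or build modules, one thing worth checking (an assumption, not something this thread confirms) is whether the matching development package is installed, for example with `dpkg -l python2.4-dev`.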

Halfway through that install, aptitude informed me it was "now installing a new version of the linux kernel" (Oops! I hadn't ASKED for or authorized a kernel upgrade. So where the devil did it come from? I dunno!). In that informational notice, aptitude recommended that I should reboot the server immediately after the install was finished so the kernel upgrade and configuration process could be completed.
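
A dry run before committing would have shown the kernel package in the plan. A sketch, assuming `apt-get -s` (simulate), whose output marks each package to be installed with a line beginning `Inst`. The helper name `kernel_changes` and the package name in the usage comment are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: read `apt-get -s ...` output on stdin and print the
# names of any kernel image packages the transaction would install.
kernel_changes() {
    awk '$1 == "Inst" && $2 ~ /^linux-image/ { print $2 }'
}

# Typical use (nothing is actually installed in simulate mode):
#   apt-get -s install python2.4 | kernel_changes
```

Empty output means no kernel image is part of the transaction; any name printed is a cue to schedule the reboot deliberately.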

Naturally, I followed those instructions to the letter, but when I went back and logged on to the system after the reboot I discovered that:

a) The root file system is now "read only"
b) my "newly installed" mailman directory seems to have completely vanished. In its place is the old original mailman directory from two weeks ago that I had renamed to mailman.save earlier today.
c) not ONE of the websites on my server can be accessed now. I'm sure that's because the primary file system is "read only"

Is there anyone here who has a clue what I've done wrong and how to fix it? I'm completely bewildered and confused at this point.
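
To confirm symptom (a) rather than infer it from failed writes, the mount table can be read directly. A sketch, assuming a Linux `/proc` filesystem, where each line of `/proc/mounts` lists device, mount point, type, and options; the helper name `root_mount_mode` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "ro" or "rw" for the root filesystem, reading a
# mounts table in /proc/mounts format (device mountpoint fstype options ...).
root_mount_mode() {
    # $1 is the table to read; defaults to the live /proc/mounts.
    # If "/" appears more than once, the last entry wins.
    awk '$2 == "/" { split($4, o, ","); mode = o[1] } END { print mode }' "${1:-/proc/mounts}"
}

if [ -r /proc/mounts ]; then
    root_mount_mode
fi
```

If this prints `ro`, `dmesg | grep -i remount` may show the kernel message explaining why. Remounting read-write with `mount -o remount,rw /` is best left until the filesystem has been checked.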

Thanks!

Last edited by websissy; 10-10-2008 at 09:28 PM. Reason: debian linux reboot read-only file system mailman listserv python
 
Old 10-11-2008, 02:15 PM   #2
jailbait
Guru
 
Registered: Feb 2003
Location: Blue Ridge Mountain
Distribution: Debian Wheezy, Debian Jessie
Posts: 7,507

Rep: Reputation: 176
Quote:
Originally Posted by websissy

aptitude recommended that I should reboot the server immediately after the install was finished so the kernel upgrade and configuration process could be completed.

Naturally, I followed those instructions to the letter, but when I went back and logged on to the system after the reboot I discovered that:

a) The root file system is now "read only"
b) my "newly installed" mailman directory seems to have completely vanished. In its place is the old original mailman directory from two weeks ago that I had renamed to mailman.save earlier today.
c) not ONE of the websites on my server can be accessed now. I'm sure that's because the primary file system is "read only"
These problems sound like the problems that you could get if you reboot without going through a normal shutdown. Did you reboot immediately without issuing a shutdown command?
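
One way to check this after the fact, sketched under the assumption that `/var/log/wtmp` still covers the period in question: `last -x` lists `reboot` and `shutdown` records, and a boot with no shutdown record before it suggests the machine went down uncleanly. The helper name `boot_vs_shutdown` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `last -x` output on stdin and print
# "<boots> <shutdowns>" so the two counts can be compared.
boot_vs_shutdown() {
    awk '$1 == "reboot" { b++ } $1 == "shutdown" { s++ } END { print b + 0, s + 0 }'
}

# Typical use:
#   last -x | boot_vs_shutdown
```

The first boot in the log has no shutdown before it, so one extra boot is normal. A larger surplus is a hint, not proof, since the log may have been rotated.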

--------------------
Steve Stites
 
Old 10-12-2008, 09:43 AM   #3
websissy
Member
 
Registered: Jul 2008
Posts: 49

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by jailbait
These problems sound like the problems that you could get if you reboot without going through a normal shutdown. Did you reboot immediately without issuing a shutdown command?

--------------------
Steve Stites
Thanks for the reply, Steve. It's much appreciated. I think you're right. I reached the same conclusion yesterday myself. I now believe what you described is almost exactly what happened. However, there is no "ctrl-alt-del" button long enough to reach my server, which is 1,500 miles away, and the shutdown script we use does make it a point to do an orderly shutdown of everything on the server before it restarts the system. So, no, I don't THINK this is our fault.

However, from what I can tell, it appears the primary hd's main file system was glitched in an undocumented server "event" last weekend. My guess is the server center lost power sometime Sunday night and their Ops just restarted their servers without doing any sort of fsck recovery or making any announcement to their client admins (moi) about what had happened. The server's response had been sluggish this week, but I attributed it to heavy net loads. However, when I tried to install a new user app Friday, the proverbial defecation hit the perennial ventilation and I was faced with a server with a glitched primary hd.

That's when I went looking for the manpages for fsck...
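
A mounted root filesystem should not be checked in place, so on a remote box the usual approach is to request a check during the next boot. A sketch, assuming an ext2/ext3 root and the sysvinit `shutdown` shipped with etch; `/dev/sda1` is a placeholder device and `fs_state` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read `tune2fs -l <device>` output on stdin and print
# the value of its "Filesystem state:" line (for example "clean").
fs_state() {
    sed -n 's/^Filesystem state:[[:space:]]*//p'
}

# Typical use, as root (the device name is a placeholder):
#   tune2fs -l /dev/sda1 | fs_state
#   shutdown -rF now      # sysvinit: reboot and force fsck during boot
```

Anything other than `clean` in the state field is a reason to schedule the forced check before trusting the filesystem with new writes.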

I wasn't too concerned about this at first because I knew we had a primary hd full-drive-clone backup on the secondary and several interim backups made this past week stored on the primary hd too.

However, once the journaling file system had "recovered" yesterday, we'd lost all our intermediate backups -- which had been stored as "tarballs" out in "no-man's-land" on the primary 500gb hd.

The "journaled" recovery basically rolled us clear back to last Sunday night, shortly before the system crash occurred and about 24 hours after the drive backup to the secondary was made. (sigh...) Unfortunately, this is a brand new server. So, although I had backups working, I didn't yet have the normal overnight FTPs of intermediate backups to a remote B/U drive here in our data-center operating. All I can say in self defense is no one expects to have this sort of a hit on a server that's barely 60 days old.
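
The takeaway from this paragraph is that tarballs kept on the same drive share its fate. A minimal sketch of a dated archive that is verified before being copied elsewhere; `backup_dir` is a hypothetical helper, and the remote host and paths in the usage comment are placeholders, not details from this thread.

```shell
#!/bin/sh
# Hypothetical helper: archive directory $1 into directory $2 as
# <name>-YYYYMMDD.tar.gz, verify the archive is readable, print its path.
backup_dir() {
    src=$1
    dest=$2
    name=$(basename "$src")
    out="$dest/$name-$(date +%Y%m%d).tar.gz"
    # -C keeps paths in the archive relative to the parent of $src
    tar -czf "$out" -C "$(dirname "$src")" "$name" || return 1
    # Listing the archive forces a full read, catching truncated files
    tar -tzf "$out" >/dev/null || return 1
    echo "$out"
}

# Typical use, followed by a copy to another machine (placeholders):
#   archive=$(backup_dir /usr/local/mailman /var/backups) &&
#       scp "$archive" backup@backuphost.example:/srv/backups/
```

Copying only after the listing succeeds avoids shipping a truncated archive off-site and discovering it during a restore.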

Still, we "recovered" in a manner of speaking, if one doesn't consider a week's work lost (and 1,500 new grey hairs for me) to be a big deal. But we're still getting intermittent segfaults from apache on that server, which is in a datacenter 1,500 miles away!

Thanks a lot for your insights and thoughts, Steve. It's helpful to have someone else independently confirm my own conclusions. This is one of "those situations" where there's no one else around here with the tech savvy to diagnose and repair a problem of this nature or for me to discuss this with except my wife and the cat. I love them both but I must say neither of them is terribly helpful in a situation like this.

Wish me luck. This battle isn't over yet...

Last edited by websissy; 10-12-2008 at 12:17 PM.
 
Old 10-12-2008, 05:19 PM   #4
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 48
You'll want to become friends with the uptime command. Just typing it into a terminal will tell you how long your machine has been on. You can then do a simple bit of subtraction to determine when it was last turned on.

If you haven't rebooted since you did the kernel upgrade, "uptime" should be the amount of time that has passed since you rebooted. If it is less, then it is possible the data center lost power. I don't think that is very likely, however. Any data center that loses power and then doesn't fess up would not keep my business. Many people have software that monitors servers, and when they start calling to discern why their software sent out alert emails during the power failure, they'd need some very crafty answers.
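
The subtraction described above can be scripted. A sketch, assuming a Linux `/proc/uptime`, whose first field is the number of seconds since boot; the helper name `boot_epoch` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given the current epoch time ($1) and the contents of
# /proc/uptime ($2, "<seconds up> <seconds idle>"), print the boot epoch.
boot_epoch() {
    up=${2%% *}        # keep the first field
    up=${up%%.*}       # drop the fractional part
    echo $(( $1 - $up ))
}

if [ -r /proc/uptime ]; then
    boot_epoch "$(date +%s)" "$(cat /proc/uptime)"
fi
```

With GNU date, `date -d @EPOCH` turns the result into a readable timestamp, and `who -b` reports the last boot time directly. A boot time that does not match the reboot you performed points to a restart you did not initiate.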

Peace,
JimBass
 
Old 10-12-2008, 07:02 PM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,222

Rep: Reputation: 1019
Any f/s error could potentially cause a r/o remount - have a look at fstab.
 
Old 10-13-2008, 12:29 AM   #6
websissy
Member
 
Registered: Jul 2008
Posts: 49

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by JimBass
You'll want to become friends with the uptime command. Just typing it into a terminal will tell you how long your machine has been on. You can then do a simple bit of subtraction to determine when it was last turned on.

If you haven't rebooted since you did the kernel upgrade, "uptime" should be the amount of time that has passed since you rebooted. If it is less, then it is possible the data center lost power. I don't think that is very likely, however. Any data center that loses power and then doesn't fess up would not keep my business. Many people have software that monitors servers, and when they start calling to discern why their software sent out alert emails during the power failure, they'd need some very crafty answers.

Peace,
JimBass
You know, Jim, I've been a hands-on tech pro in the IT biz for over 4 decades. I started back in the days of punched cards, ALC and COBOL. Over the years I've installed, configured, and/or served as lead admin on many systems. So, when I decided to lease a server and become its sole tech and admin, I had some clue what I was in for. Still, it had been years since I last worked as a *nix admin, and even then I had consultants and advisors to lean on. Needless to say, much has changed since '94. The web as we know it was barely on the scopes then. Thus even after 40 years and at age 58, I'm still learning, and I've come to appreciate guys like you and the others here even more now than I did way back when both I and the IT world were young and innocent.

Thanks for the tip. I'll remember uptime. I don't know what I'd do without my helpful friends on the net!

Best,
GregPlatt - a.k.a. WebSissy

Last edited by websissy; 10-13-2008 at 12:35 AM.
 
Old 10-13-2008, 12:41 AM   #7
websissy
Member
 
Registered: Jul 2008
Posts: 49

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by syg00
Any f/s error could potentially cause a r/o remount - have a look at fstab.
Yeah, backup... You know I've heard that word somewhere before. Must find time to look it up and figure out what they're talking about!

There are no errors in our fstab. And of course, no one at the server center would ever admit they restarted my system improperly. I'll probably never know exactly what caused this problem. I just know it happened. In fact, I'm still cleaning fresh manure out of my hair and off the walls, ceilings and floors!
 
Old 10-13-2008, 02:49 AM   #8
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,222

Rep: Reputation: 1019
I wasn't trying to suggest you had a problem in fstab. Do you have "errors=remount-ro" as an option? That might give you what you have seen.
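
For anyone checking the same thing: with `errors=remount-ro` on an ext2/ext3 filesystem, the kernel remounts it read-only when it detects a filesystem error instead of continuing to write. A sketch that looks for the option on the root entry, assuming the usual six-column fstab layout; the helper name `root_remounts_ro_on_error` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the root entry of an fstab-format file
# carries the errors=remount-ro mount option.
root_remounts_ro_on_error() {
    # $1 is the file to read; defaults to /etc/fstab
    awk '$1 !~ /^#/ && $2 == "/" { opts = $4 }
         END { if (opts ~ /(^|,)errors=remount-ro(,|$)/) exit 0; exit 1 }' "${1:-/etc/fstab}"
}

# Typical use:
#   root_remounts_ro_on_error && echo "root is remounted read-only on errors"
```

Finding the option explains the read-only symptom but not the underlying error; the kernel log and a forced fsck are where the cause would show up.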
 
  


All times are GMT -5.
