LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 07-14-2008, 05:51 PM   #1
raypen
Member
 
Registered: Jun 2002
Location: Midwest
Distribution: Slackware
Posts: 365

Rep: Reputation: 30
Dual booting, serial console and odd behavior


I have an unusual setup on a SOHO Lan. Several machines boot only
a single operating system, but two are dual boot. One dual boots
Win2k and Slackware Linux 12.0 and the other dual boots Win98SE and
Slackware Linux 12.

I like to have the ability to boot one of the dual boot systems
from the other, so I have connected a serial line between the two
(a very long line since they are in different rooms) and configured
LILO to output a boot prompt to the serial line on both machines.
This allows me to select which system to boot on the remote machine.

I do not login from the serial console, so I have no agetty's running
on either machine. On either machine, I can boot into either Windows
or Linux from a regular console and then use either Hyperterminal
(Windows) or minicom (Linux) to boot the remote, selecting either
Windows or Linux. This seems to work well, and there is no clutter since
agetty's are not present at either end of the serial line.

However, a problem has arisen with the machine that boots Win98SE.
If I boot into Linux and then issue a Wake-On-Lan command to start
the other machine, I enter minicom, make the OS selection and watch
the boot messages (if I have booted Linux). I can then do whatever
networking tasks are necessary, including shutting down the remote
machine. When I then try to reboot or shut down this machine (the local
one that issued the WOL command), I get the usual shutdown message, but
the machine simply stalls. I have to switch the power supply off and on
again to boot.
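
For readers unfamiliar with Wake-On-Lan: the post does not say which tool sent the command, so this is only a minimal Python sketch of what any such tool does. A magic packet is six 0xFF bytes followed by the target MAC address repeated sixteen times, usually sent as a UDP broadcast. The function names, the default broadcast address and the choice of UDP port 9 are assumptions for illustration, not details of this setup.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for one MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("expected a 6-octet MAC address, got %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes, then the MAC address repeated sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is an assumed default)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling send_wol() with the remote NIC's real MAC address would send the packet. The remote BIOS and network card must have Wake-On-Lan enabled for it to have any effect.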

Upon reboot, the logs contain the following:
Code:
MESSAGES
kernel: EXT3-fs: INFO: recovery required on readonly filesystem.
kernel: EXT3-fs: write access will be enabled during recovery.
kernel: kjournald starting.  Commit interval 5 seconds
kernel: EXT3-fs: hda3: orphan cleanup on readonly fs
kernel: EXT3-fs: hda3: 6 orphan inodes deleted
kernel: EXT3-fs: recovery complete.
kernel: EXT3-fs: mounted filesystem with ordered data mode.

DEBUG
kernel: (fs/jbd/recovery.c, 255): journal_recover: JBD: recovery, exit status 0, recovered transactions 30651 to 30805
kernel: (fs/jbd/recovery.c, 257): journal_recover: JBD: Replayed 3883 and revoked 1/10 blocks
kernel: ext3_orphan_cleanup: deleting unreferenced inode 1701598
kernel: ext3_orphan_cleanup: deleting unreferenced inode 1403540
kernel: ext3_orphan_cleanup: deleting unreferenced inode 1403528
kernel: ext3_orphan_cleanup: deleting unreferenced inode 1403527
kernel: ext3_orphan_cleanup: deleting unreferenced inode 1403526
kernel: ext3_orphan_cleanup: deleting unreferenced inode 1403525

DMESG
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
(fs/jbd/recovery.c, 255): journal_recover: JBD: recovery, exit status 0, recovered transactions 30651 to 30805
(fs/jbd/recovery.c, 257): journal_recover: JBD: Replayed 3883 and revoked 1/10 blocks
kjournald starting.  Commit interval 5 seconds
EXT3-fs: hda3: orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 1701598
ext3_orphan_cleanup: deleting unreferenced inode 1403540
ext3_orphan_cleanup: deleting unreferenced inode 1403528
ext3_orphan_cleanup: deleting unreferenced inode 1403527
ext3_orphan_cleanup: deleting unreferenced inode 1403526
ext3_orphan_cleanup: deleting unreferenced inode 1403525
EXT3-fs: hda3: 6 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Sometimes when I have issued the WOL command from this machine, I have
gotten a kernel message on the console:

Code:
kernel: SysRq : HELP : loglevel0-8 reBoot tErm Full kIll saK showMem Nice powerOff 
showPc show-all-timers(Q) unRaw Sync showTasks Unmount shoW-blocked-tasks
I'm not sure how this relates to the problem, but I mention it since it may
be a factor.

Oddly, this is not a problem on the other machine (yet!).

I have tried to be as brief as possible about the problem and
have probably excluded information someone else thinks is necessary.
So with that in mind, does anyone have any idea as to how to
fix this?
 
Old 07-15-2008, 07:16 AM   #2
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,371

Rep: Reputation: 2750
Does 'smartctl -a /dev/hda' offer any clues?
 
Old 07-15-2008, 02:16 PM   #3
raypen
Member
 
Registered: Jun 2002
Location: Midwest
Distribution: Slackware
Posts: 365

Original Poster
Rep: Reputation: 30
Other than S.M.A.R.T. being turned on, no. However, the other machine
has SMART turned off. So just for yucks, I turned it off on the
computer in question and re-ran the remote boot-up into Linux
on the other machine. No change; same problem as before.

The output from 'smartctl -a /dev/hda' is reproduced below:
Code:
smartctl version 5.36 [i486-slackware-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar family
Device Model:     WDC WD400BB-75FRA0
Serial Number:    WD-WMAJE1031254
Firmware Version: 77.07W77
User Capacity:    40,000,000,000 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Jul 15 13:26:06 2008 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		 (1237) seconds.
Offline data collection
capabilities: 			 (0x79) SMART execute Offline immediate.
					No Auto Offline data collection support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					No General Purpose Logging support.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  24) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   089   086   021    Pre-fail  Always       -       2075
  4 Start_Stop_Count        0x0032   098   098   040    Old_age   Always       -       2264
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   253   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       13304
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0013   100   100   051    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2262
194 Temperature_Celsius     0x0022   117   253   000    Old_age   Always       -       26
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0012   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   085   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Maybe you notice something important.
 
Old 07-16-2008, 01:46 AM   #4
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,371

Rep: Reputation: 2750
It looks as if your drive is fine. I was concerned that you may have had a failing hard drive as you seemed to imply that the problem had started to occur on a setup that had been working fine in the past but had started to give errors. Your mention of Win98SE suggested older hardware, and I have had to change drives on machines using Win98 due to deterioration with age and use. Also the journal replays suggested possible drive errors.
I do not use WOL so I cannot suggest anything there. The kernel message looks like it may be a menu arising from a request for help.
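
The reading of the attribute table behind "your drive is fine" can be sketched in code. smartctl treats an attribute as failed when its normalized VALUE is at or below THRESH. The helper below is hypothetical (its name is not part of smartmontools) and simply applies that rule to rows copied from the output posted above.

```python
def failing_attributes(table: str) -> list:
    """Return names of attributes whose normalized VALUE is at or below THRESH."""
    failing = []
    for line in table.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have ten columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, value, thresh = fields[1], int(fields[3]), int(fields[5])
        if value <= thresh:
            failing.append(name)
    return failing


# Three rows taken from the smartctl output in this thread.
rows_from_thread = """
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
"""

print(failing_attributes(rows_from_thread))  # prints []
```

An empty list, together with zero raw counts for reallocated and pending sectors, is consistent with the PASSED overall assessment.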
 
Old 07-17-2008, 06:02 PM   #5
raypen
Member
 
Registered: Jun 2002
Location: Midwest
Distribution: Slackware
Posts: 365

Original Poster
Rep: Reputation: 30
Solving this was a real nightmare. I at first thought that there
was a kernel component that was causing the problem, mostly the
EXT3 filesystem. I satisfied myself that this was not, and indeed
should not be, a problem.

I was also getting dmesg errors relating to the real-time clock (rtc)
and so I played around with that setup and eventually, HPET. Screwed
things up royally trying to decide what the right setup was and must
have run 10 recompiles, all to no avail.

It suddenly came to me that all the bootup messages were somehow
affecting the Slack 12 already running, so I thought I would try
removing these console messages. I edited /etc/lilo.conf (on both
machines) to remove the "ttyS0=9600n8" from the append statements
accompanying each image. This has the effect of removing almost all
of the bootup messages from the serial console while returning all
normal bootup messages to tty0 or tty1.
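
The edit described above can be sketched as a small Python helper that drops one kernel parameter from every append line. This is an illustration only: the function name and the sample lilo.conf text are hypothetical, and the example assumes the parameter had the kernel's usual serial console form, console=ttyS0,9600n8, whereas the post quotes it as "ttyS0=9600n8". The token is passed as an argument, so whichever form is actually in the file can be used.

```python
import re


def strip_append_param(conf_text: str, param: str) -> str:
    """Remove one kernel parameter from every append="..." line."""

    def fix_line(match):
        # Keep every space-separated parameter except the one to drop.
        kept = [p for p in match.group(2).split() if p != param]
        return match.group(1) + '"' + " ".join(kept) + '"'

    pattern = r'^([ \t]*append[ \t]*=[ \t]*)"([^"]*)"'
    return re.sub(pattern, fix_line, conf_text, flags=re.MULTILINE)


# Hypothetical image section, not copied from the poster's file.
sample = '''image = /boot/vmlinuz
  root = /dev/hda3
  label = Linux
  append = "console=tty0 console=ttyS0,9600n8"
  read-only
'''

print(strip_append_param(sample, "console=ttyS0,9600n8"))
```

After changing /etc/lilo.conf, /sbin/lilo has to be run again so the boot loader picks up the new settings.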

While you wouldn't expect these messages from another machine to
spill out of minicom into your running Linux, something does.

Needless to say, the ability to select an image to boot is not affected
by these changes; I can still hit tab in minicom and get a list of
bootable images. I then only see the
Code:
 BIOS check....................
message and then nothing appears. Because of this, my running Linux
is not affected, and shutdown and reboot now work as they should.

I don't think Linux was designed with this particular setup in mind.
It was somewhat fun figuring this out and I am elated that it works.
 
  

