LinuxQuestions.org
Old 04-03-2003, 05:28 AM   #1
joesbox
Member
 
Registered: Feb 2003
Location: hampton va
Distribution: ubuntu
Posts: 502

Rep: Reputation: 30
little information for the q


i am the admin for a linux box at work and was kind of forced into something i was not trained for.
forced isn't really the word for it since i showed interest in it but what training i received was informal and brief.

anyway, on to my question. the box works great until the W2K it relies on goes down. when the win goes down for more than an hour and is brought back up, the lin is flooded and locks. the only way to fix this is to hard boot it. i think that doing this several times over the last few months has killed one of the main functions in the lin. all of the maintenance done on the lin is done remotely via Webmin, but sometimes i just want to log in and work in a loving place. but when i log out the screen goes black and the lin locks up. this also occurs when trying to reboot from console. i have no idea where to look to find the problem nor do i have any idea what the problem could be. re-installing is a definite NO from the boss since we house over 1000 html pages and 100 scripts creating more pages and performing functions that are vital to our business. boss doesn't want to run the risk of losing ANYTHING even though we have backups. boss is very anti-linux (very disappointing for me) and is pushing me to work on porting everything to W2K.

i just caught myself rambling/ranting and i am sorry.
if anyone knows the answer to the reboot/logout locking problem please help. any ideas will help. thanks. (please no smart donkey responses)
thanks in advance.
joe
 
Old 04-03-2003, 08:37 AM   #2
sidey
Member
 
Registered: Mar 2003
Location: Essex UK
Distribution: rh 8.0 bsd 5.0 slack 9.0 rc2 crux
Posts: 147

Rep: Reputation: 15
I seem to have this problem too, especially when using a realtek chipset. The only way i have found so far to remedy this problem with realtek (driver 8139too.o) is to set the card to run at 10 Mbit/s instead of 100, using mii-tool

eg

ifconfig eth0 down
mii-tool -A 10baseT-FD eth0
ifconfig eth0 up

This will solve the problem for that chipset. (note i experienced this problem on rh 8, slackware 9 rc2 and CRUX)

(but this could also be down to my setup as i'm running a very old laptop with a 16 bit pcmcia bus, but this sorted my problems out)

ps if this works can i have your job :P
 
Old 04-03-2003, 02:31 PM   #3
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Joesbox,

what exactly is the relationship between the
win-machine and Linux-box, how does the
Linux one depend on the presence of the
windows thing?

One idea (if this is a specific problem of
one service) is to run a cron job that checks
for the presence of the win-machine every
5 minutes (ping, connect to SQL database,
whatever you are doing on it) and if it's gone
switches the service on Linux off, and back
on in the next cycle in case the machine is
up again.
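A minimal sketch of the cron idea above, assuming the dependent service is controlled through a Red Hat style init script. The host name `winbox.example`, the service name `httpd`, and the script path in the usage note are placeholders, not details from this thread.

```shell
#!/bin/sh
# Sketch only: WIN_HOST and SERVICE are hypothetical placeholders.
WIN_HOST="${WIN_HOST:-winbox.example}"
SERVICE="${SERVICE:-httpd}"

# Map (host state, service state) to an action.
# $1 = up|down   $2 = running|stopped
decide_action() {
    case "$1:$2" in
        down:running) echo stop ;;
        up:stopped)   echo start ;;
        *)            echo none ;;
    esac
}

# One ping with a 5 second deadline (iputils ping options).
host_state() {
    if ping -c 1 -w 5 "$1" >/dev/null 2>&1; then echo up; else echo down; fi
}

# Assumes an init script whose "status" action exits 0 when the service runs.
service_state() {
    if "/etc/init.d/$1" status >/dev/null 2>&1; then echo running; else echo stopped; fi
}

main() {
    action=$(decide_action "$(host_state "$WIN_HOST")" "$(service_state "$SERVICE")")
    [ "$action" = none ] || "/etc/init.d/$SERVICE" "$action"
}

# Only act when called with --run, so the functions can be sourced and tested.
if [ "${1:-}" = "--run" ]; then
    main
fi
```

A crontab entry such as `*/5 * * * * /usr/local/sbin/win-watchdog.sh --run` (hypothetical path) would run the check every 5 minutes. Keeping the decision in `decide_action` makes it easy to try out without touching the network.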

Cheers,
Tink
 
Old 04-03-2003, 08:14 PM   #4
joesbox
Member
 
Registered: Feb 2003
Location: hampton va
Distribution: ubuntu
Posts: 502

Original Poster
Rep: Reputation: 30
tink,
this is the set up. we have the two boxes. the win is the server that houses our public page which kinda mirrors the lin. the lin is used to download weather images from different agencies as well as text forecasts and observations. (if i get too weather technical please forgive) we download this every minute and store each for one month. we interface with the lin via http, filling out forms and making weather related products that we monitor through the lin. now this is where the win comes in. the win is mounted under / for an easy transfer of data. every time a forecast/observation is downloaded then a copy is created and saved to the win so that when someone from the outside (outside of our office) wishes to see a specific forecast for a specific airport or city then they have an ever-ready** source. now as far as the win going down and the lin choking following the reboot, i am not sure if this is a coincidence or not, but i am guessing that when the win comes back up the lin is flooded, by what i am not sure. this is only theory and no evidence is there on the true problem. so this is only a problem that we get when the win goes down.
as far as checking for one service choking the lin i don't think so. the crontab has about 30 or so jobs to do varying from every minute to once an hour to once every couple of hours. not all of them run to the win. some run straight to the web and some via ftp and i have a couple mirroring different sites. i believe all but two are running perl scripts. with the win mounted i am not sure how to check to see if it is still mounted and if i do get that part i don't know how to shut down the specific cron jobs.

** I work on an Air Force base and sometimes the main connection to the internet is disconnected. we download all of the information so that we can still perform our duty even without internet access.

sorry for the length of this post but you asked for the relationship of the two boxes and in order to explain i had to give a slight background first.
 
Old 04-03-2003, 08:40 PM   #5
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Quote:
sorry for the length of this post but you asked for the relationship of the two boxes and in order to explain i had to give a slight background first.
Not to worry, there's no such thing as too much
information :)

Quote:
now this is where the win comes in. the win is mounted under / for an easy transfer of data. every time a forecast/observation is downloaded then a copy is created and saved to the win
Hmm ... so the windows machine is not really actively
talking at the Linux box, is it?

Quote:
this is only theory and no evidence is there on the true problem. so this is only a problem that we get when the win goes down.
Did you look at /var/log/messages, debug,
syslog for any indication about what's going
on during the return of the Win-box? Maybe
even your cron-jobs are spitting out info
into some sort of log?

Quote:
with the win mounted i am not sure how to check to see if it is still mounted and if i do get that part i don't know how to shut down the specific cron jobs.
The easiest thing would be to ping it (if it
completely dies, that is ... ) and then unmount
the mount-point if you don't get a response.
You'll have to find out which cron-jobs do
the copying, though, and build some sort
of check into them, as well, like "only copy
if mount lists the smb-share".
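The "only copy if mount lists the smb-share" check could look roughly like this. It is a sketch under assumptions: the mount point `/mnt/win` in the usage note is hypothetical (the thread only says the share is mounted under /), and the mount table is read from `/proc/mounts` unless another file is passed in.

```shell
#!/bin/sh
# Succeeds if the given mount point appears in the mount table.
# $1 = mount point, $2 = mount table file (defaults to /proc/mounts)
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Copy a file onto the share only when the share is really mounted.
# $1 = source file, $2 = mount point, $3 = optional mount table file
copy_if_mounted() {
    if is_mounted "$2" "${3:-/proc/mounts}"; then
        cp "$1" "$2/"
    else
        echo "skip: $2 is not mounted" >&2
        return 1
    fi
}
```

Each cron job that writes to the Windows share could then call something like `copy_if_mounted /tmp/forecast.txt /mnt/win` instead of a bare `cp`, so a missing share makes the job skip the copy rather than hang or fill the local disk. The file name here is also only an illustration.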

Cheers,
Tink
 
Old 04-03-2003, 11:30 PM   #6
joesbox
Member
 
Registered: Feb 2003
Location: hampton va
Distribution: ubuntu
Posts: 502

Original Poster
Rep: Reputation: 30
thanks i will check things out when it happens again. i just looked and it only goes back about 5 days.
thanks for the info.
later
 
Old 04-04-2003, 03:34 PM   #7
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Quote:
thanks i will check things out when it happens again. i just looked and it only goes back about 5 days.
thanks for the info.
Just keep us informed :}

I'm curious how this goes on!

Cheers,
Tink
 
Old 04-04-2003, 04:30 PM   #8
joesbox
Member
 
Registered: Feb 2003
Location: hampton va
Distribution: ubuntu
Posts: 502

Original Poster
Rep: Reputation: 30
a response on what the log file says may take a while. these crashes aren't as regular as the normal win box. sometimes it takes a whole month before it crashes..... lol
but yes i will try and remember to keep you posted.
 
Old 04-06-2003, 02:35 PM   #9
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Quote:
boss is very anti-linux (very disappointing for me) and is pushing me to work on porting everything to W2K.
Quote:
these crashes aren't as regular as the normal win box. sometimes it takes a whole month before it crashes..... lol
I forgot to respond to this in the first place ;)

Tell your boss to remove the cause of the
problem, not the victim ;)

Anyway, I hope you'll be able to work around
it with a little support from LQ and then point
out to your boss that a solution to the problem
on the other end of that line (the winDOHs box)
would have cost him megabuxx ;)

Cheers,
Tink
 
Old 04-24-2003, 05:30 PM   #10
joesbox
Member
 
Registered: Feb 2003
Location: hampton va
Distribution: ubuntu
Posts: 502

Original Poster
Rep: Reputation: 30
ok had to hard boot the linux box for the first time in a long while. don't know what caused the problem, but according to /var/log/messages it looks like the box ran out of memory.
here is an excerpt that covers the time span that the box was down.

Apr 24 17:46:09 lfi-ws-linuxwx kernel: Out of Memory: Killed process 1033 (mysqld).
Apr 24 17:46:14 lfi-ws-linuxwx kernel: Out of Memory: Killed process 1850 (missionboard.cg).
Apr 24 17:46:19 lfi-ws-linuxwx sshd(pam_unix)[2009]: session closed for user joe
Apr 24 17:46:35 lfi-ws-linuxwx kernel: Out of Memory: Killed process 1950 (missionboard.cg).
Apr 24 17:47:04 lfi-ws-linuxwx ftpd[2098]: connection from lfi-ws-aosf1.langley.af.mil [131.6.72.47]
Apr 24 17:47:10 lfi-ws-linuxwx ftpd[2098]: FTP session closed
Apr 24 17:47:22 lfi-ws-linuxwx kernel: Out of Memory: Killed process 1953 (missionboard.cg).
Apr 24 17:47:30 lfi-ws-linuxwx kernel: Out of Memory: Killed process 1980 (missionboard.cg).
Apr 24 17:47:51 lfi-ws-linuxwx kernel: Out of Memory: Killed process 30631 (httpd).
Apr 24 17:49:48 lfi-ws-linuxwx kernel: Out of Memory: Killed process 30631 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 32059 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 32422 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 32424 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 32538 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 32669 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 32670 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 1708 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 31994 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 32423 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 1103 (httpd).
Apr 24 17:50:01 lfi-ws-linuxwx kernel: Out of Memory: Killed process 902 (xfs).
Apr 24 17:50:03 lfi-ws-linuxwx kernel: Out of Memory: Killed process 1868 (convert).
Apr 24 17:50:14 lfi-ws-linuxwx last message repeated 2 times
Apr 24 17:50:22 lfi-ws-linuxwx kernel: Out of Memory: Killed process 29694 (smbd).
Apr 24 17:50:24 lfi-ws-linuxwx kernel: Out of Memory: Killed process 1907 (convert).
Apr 24 17:50:53 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2120 (httpd).
Apr 24 17:51:00 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2127 (httpd).
Apr 24 17:51:33 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2132 (httpd).
Apr 24 17:51:37 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2140 (httpd).
Apr 24 17:51:40 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2141 (httpd).
Apr 24 17:51:50 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2142 (httpd).
Apr 24 17:52:23 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2216 (missionboard.cg).
Apr 24 17:53:06 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2147 (httpd).
Apr 24 17:53:18 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2229 (httpd).
Apr 24 17:53:44 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2243 (missionboard.cg).
Apr 24 17:53:51 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2143 (httpd).
Apr 24 17:53:59 lfi-ws-linuxwx kernel: Out of Memory: Killed process 2149 (httpd).
Apr 24 18:12:17 lfi-ws-linuxwx syslogd 1.4-0: restart.

apr 24 18:12 was when i got it back up.
do you know why this would happen (Apr 24 17:53:59 lfi-ws-linuxwx kernel: Out of Memory: Killed process *)?? i am not sure unless RH 7.1 had a memory leak. any suggestions????
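To see at a glance which processes the kernel killed most often in an excerpt like the one above, the log can be summarised per process name. This is a sketch that assumes the message format shown in this post ("Out of Memory: Killed process N (name)."). Other kernel versions word the message differently.

```shell
#!/bin/sh
# Count "Out of Memory: Killed process" lines per process name.
# Reads the files given as arguments, or stdin when none are given.
count_oom_kills() {
    sed -n 's/.*Out of Memory: Killed process [0-9]* (\(.*\))\.$/\1/p' "$@" \
        | sort | uniq -c | sort -rn
}
```

Running `count_oom_kills /var/log/messages` prints one line per process name with the number of kills in front, highest first. The kernel tends to kill whatever is using a lot of memory at that moment, so the most frequent name is a hint about where to look and not proof of the culprit.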
 
Old 04-24-2003, 05:37 PM   #11
david_ross
Moderator
 
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047

Rep: Reputation: 79
I think you may want to make the log go back a bit further.

It sounds like something may have happened before that which caused the memory problem.
 
Old 04-24-2003, 07:35 PM   #12
joesbox
Member
 
Registered: Feb 2003
Location: hampton va
Distribution: ubuntu
Posts: 502

Original Poster
Rep: Reputation: 30
Apr 23 10:59:42 lfi-ws-linuxwx last message repeated 2 times
Apr 23 10:59:42 lfi-ws-linuxwx ftpd[26534]: CWD /var/www/cgi-bin
Apr 23 10:59:42 lfi-ws-linuxwx ftpd[26534]: PWD
Apr 23 10:59:42 lfi-ws-linuxwx last message repeated 2 times
Apr 23 10:59:42 lfi-ws-linuxwx ftpd[26534]: TYPE ASCII
Apr 23 10:59:42 lfi-ws-linuxwx ftpd[26534]: PORT
Apr 23 10:59:42 lfi-ws-linuxwx ftpd[26534]: STOR newmetwatchtool.cgi
Apr 23 10:59:42 lfi-ws-linuxwx ftpd[26534]: xferlog (recv): 1 lfi-ws-aosf2.langley.af.mil 84345 /var/www/cgi-bin/newmetwatchtool.cgi a _ i r lisa ftp 0 * c
Apr 23 11:00:27 lfi-ws-linuxwx ftpd[26534]: QUIT
Apr 23 11:00:27 lfi-ws-linuxwx ftpd[26534]: FTP session closed
Apr 23 11:00:30 lfi-ws-linuxwx su(pam_unix)[18853]: session closed for user root
Apr 23 11:00:35 lfi-ws-linuxwx sshd(pam_unix)[18807]: session closed for user lisa
Apr 23 16:31:50 lfi-ws-linuxwx ucd-snmp[1077]: Received SNMP packet(s) from 131.6.8.214
Apr 23 16:31:51 lfi-ws-linuxwx kernel: smb_request: result -104, setting invalid
Apr 23 16:31:51 lfi-ws-linuxwx kernel: smb_retry: successful, new pid=939, generation=12
Apr 23 16:31:52 lfi-ws-linuxwx kernel: hdd: No disk in drive
Apr 23 16:31:53 lfi-ws-linuxwx kernel: end_request: I/O error, dev 02:00 (floppy), sector 0
Apr 24 04:02:17 lfi-ws-linuxwx kernel: smb_trans2_request: result=-104, setting invalid
Apr 24 04:02:18 lfi-ws-linuxwx kernel: smb_retry: successful, new pid=939, generation=13
Apr 24 04:23:30 lfi-ws-linuxwx su(pam_unix)[16683]: session opened for user news by (uid=0)
Apr 24 04:23:31 lfi-ws-linuxwx su(pam_unix)[16683]: session closed for user news
Apr 24 15:53:50 lfi-ws-linuxwx ftpd[31406]: connection from lfi-ws-aosf1.langley.af.mil [***.***.***.***]
Apr 24 15:53:51 lfi-ws-linuxwx ftpd[31406]: USER joe
Apr 24 15:53:51 lfi-ws-linuxwx ftpd[31406]: PASS password
Apr 24 15:53:53 lfi-ws-linuxwx ftpd[31406]: FTP LOGIN FROM lfi-ws-aosf1.langley.af.mil [***.***.***.***], joe
Apr 24 15:53:53 lfi-ws-linuxwx ftpd[31406]: SYST
Apr 24 15:53:53 lfi-ws-linuxwx ftpd[31406]: PWD
Apr 24 15:53:53 lfi-ws-linuxwx last message repeated 2 times
Apr 24 15:53:53 lfi-ws-linuxwx ftpd[31406]: CWD /var/www
Apr 24 15:53:53 lfi-ws-linuxwx ftpd[31406]: PWD
Apr 24 15:53:54 lfi-ws-linuxwx last message repeated 2 times
Apr 24 15:53:54 lfi-ws-linuxwx ftpd[31406]: TYPE ASCII
Apr 24 15:53:54 lfi-ws-linuxwx ftpd[31406]: PORT
Apr 24 15:53:54 lfi-ws-linuxwx ftpd[31406]: LIST
Apr 24 15:53:57 lfi-ws-linuxwx ftpd[31406]: PWD
Apr 24 15:53:57 lfi-ws-linuxwx ftpd[31406]: CWD html
Apr 24 15:53:57 lfi-ws-linuxwx ftpd[31406]: PWD
Apr 24 15:53:57 lfi-ws-linuxwx last message repeated 2 times
Apr 24 15:53:57 lfi-ws-linuxwx ftpd[31406]: TYPE ASCII
Apr 24 15:53:57 lfi-ws-linuxwx ftpd[31406]: PORT
Apr 24 15:53:57 lfi-ws-linuxwx ftpd[31406]: LIST
Apr 24 15:54:03 lfi-ws-linuxwx ftpd[31406]: PWD
Apr 24 15:54:03 lfi-ws-linuxwx ftpd[31406]: CWD htdig
Apr 24 15:54:04 lfi-ws-linuxwx ftpd[31406]: PWD
Apr 24 15:54:04 lfi-ws-linuxwx last message repeated 2 times
Apr 24 15:54:04 lfi-ws-linuxwx ftpd[31406]: CWD /var/www/html
Apr 24 15:54:04 lfi-ws-linuxwx ftpd[31406]: PWD
Apr 24 15:54:16 lfi-ws-linuxwx last message repeated 2 times
Apr 24 15:54:16 lfi-ws-linuxwx ftpd[31406]: CWD /home/joe
Apr 24 15:54:16 lfi-ws-linuxwx ftpd[31406]: PWD
Apr 24 15:54:16 lfi-ws-linuxwx last message repeated 2 times
Apr 24 15:54:16 lfi-ws-linuxwx ftpd[31406]: CWD /var/www/html
Apr 24 15:54:17 lfi-ws-linuxwx ftpd[31406]: PWD
Apr 24 15:54:17 lfi-ws-linuxwx last message repeated 2 times
Apr 24 15:54:17 lfi-ws-linuxwx ftpd[31406]: TYPE ASCII
Apr 24 15:54:17 lfi-ws-linuxwx ftpd[31406]: PORT
Apr 24 15:54:17 lfi-ws-linuxwx ftpd[31406]: RETR index.htm
Apr 24 15:54:17 lfi-ws-linuxwx ftpd[31406]: xferlog (send): 1 *****computer name******5395 /var/www/html/index.htm a _ o r joe ftp 0 * c
Apr 24 15:54:33 lfi-ws-linuxwx ftpd[31406]: QUIT
Apr 24 15:54:33 lfi-ws-linuxwx ftpd[31406]: FTP session closed
Apr 24 17:45:54 lfi-ws-linuxwx sshd(pam_unix)[2009]: session opened for user joe by (uid=0)

this is the messages file before the out of memory section. this goes back to the last day. btw ******computer name***** = deleted due to security reasons and the ip addresses have been deleted for same reasons
 
Old 04-24-2003, 08:01 PM   #13
Aussie
Senior Member
 
Registered: Sep 2001
Location: Brisvegas, Antipodes
Distribution: Slackware
Posts: 4,590

Rep: Reputation: 58
A few questions,
How much ram do you have?
How big is the swap?
What type of CPU and how many?
Is the machine running the distro in your profile (man9)?
Are you running X on this box and if so what version and what type of video hardware?
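The RAM, swap and CPU questions can usually be answered from /proc. A small sketch, assuming the kernel exposes `MemTotal:` and `SwapTotal:` lines in `/proc/meminfo` and one `processor` line per CPU in `/proc/cpuinfo`. The function names are made up for this example.

```shell
#!/bin/sh
# Print one field (in kB) from a meminfo-style file.
# $1 = field name, $2 = file (defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Count "processor" entries in a cpuinfo-style file.
# $1 = file (defaults to /proc/cpuinfo)
cpu_count() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Print a short summary suitable for pasting into a forum reply.
report() {
    echo "RAM:  $(meminfo_kb MemTotal) kB"
    echo "Swap: $(meminfo_kb SwapTotal) kB"
    echo "CPUs: $(cpu_count)"
}
```

Calling `report` on the affected box prints the three numbers. `free` and `cat /proc/cpuinfo` give the same information with more detail.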
 
Old 04-25-2003, 12:06 PM   #14
david_ross
Moderator
 
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047

Rep: Reputation: 79
What was joe doing? It looks like he logged in then the problems started.
 
  


All times are GMT -5.