LinuxQuestions.org

svar 08-26-2004 06:01 AM

System freeze-why?
 
Lately my machine has been freezing more and more often.
The main processes I have running are:
1) a shell script on a remote machine that checks a given directory for files
and, if any are found, gzips them and ftps them to my machine
2) a perl script on my machine that looks in a specific directory,
gunzips every file there, parses it and stores the relevant info
in memory as specific data structures. After a certain time, these
data structures are serialized and written to disk. Lately
I sometimes get garbage, and even pieces of the file being read, written
into the file with the data structure dump; presumably this happens at the time
of the crash. The app has been
running for some 3 years now, generally without problems. I would get a freeze
every now and then, but now it is more frequent.
Here are the typical /var/log/messages entries around a freeze.
Note that
a) sometimes I get an __alloc_pages message, lately not
b) sometimes I get some @@@ messages, sometimes not
My machine has a SCSI disk that is between 75% and 90% full (the /home partition).
Kernel is SuSE 8 (2.4.10, I think). I would just build a more
recent kernel, but I am uncertain whether this would break anything else,
and this is a production machine.
My guess is that the page allocation error is an old kernel bug.
I'm not at all sure, though, that all the crashes are related to it.
I'm more worried that the freeze occurs when the app tries to write
to a bad disk sector. The filesystem is ReiserFS.
Does anyone have any suggestions on what to check and do?
Thanks
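
On the garbage in the data structure dump: if a dump is overwritten in place, a crash during the write can leave a half-written file. A common way to limit this is to write to a temporary file in the same directory, force it to disk, and only then rename it over the old dump. Below is a minimal sketch of that idea in Python rather than the original Perl (which is not shown in this thread); the function name `write_dump_atomically` and the temporary file naming are hypothetical.

```python
import os
import tempfile


def write_dump_atomically(path, data):
    """Write bytes to `path` via a temporary file and a rename.

    Assumption: the temporary file lives in the same directory (and so on
    the same filesystem) as `path`, so the final rename does not copy data.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Hypothetical naming scheme for the temporary file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".dump-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the kernel to push the data to disk
        os.replace(tmp_path, path)  # rename over the previous dump
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this pattern a crash should leave either the previous dump or the complete new one, though how strong that guarantee is depends on the filesystem and its journaling mode.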

Jul 5 11:45:59 quality5 in.ftpd[16516]: connect from 172.16.157.2 (172.16.157.2)
Jul 5 11:46:07 quality5 in.ftpd[16531]: connect from 172.16.157.2 (172.16.157.2)
Jul 5 11:50:32 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 5 11:51:10 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 5 11:51:59 quality5 last message repeated 3 times
Jul 5 11:51:59 quality5 kernel: VM: killing process httpd
[My QUESTION: WHAT DOES httpd have to do here???]
Jul 5 11:51:59 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 5 11:52:31 quality5 last message repeated 2 times
Jul 5 11:53:00 quality5 last message repeated 3 times
Jul 5 11:53:00 quality5 kernel: VM: killing process httpd
Jul 5 11:53:00 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 5 11:54:18 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 5 11:54:19 quality5 kernel: VM: killing process httpd
Jul 5 11:54:19 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 5 11:54:19 quality5 kernel: VM: killing process httpd
Jul 5 11:54:19 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 5 11:54:19 quality5 kernel: VM: killing process httpd
Jul 5 11:54:19 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 5 11:54:19 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 5 11:54:19 quality5 kernel: VM: killing process httpd
.....
Jul 13 07:28:31 quality5 syslogd 1.4.1: restart.
Jul 13 07:48:04 quality5 in.ftpd[5894]: connect from 172.16.157.2 (172.16.157.2)
Jul 13 07:59:00 quality5 /USR/SBIN/CRON[5905]: (root) CMD ( rm -f /var/spool/cron/lastrun/cron.hourly)
Jul 13 08:28:31 quality5 -- MARK --
Jul 13 08:30:46 quality5 kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jul 13 08:48:33 quality5 -- MARK --
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
Jul 13 10:21:01 quality5 syslogd 1.4.1: restart.
....
Aug 4 07:23:11 quality5 in.ftpd[10047]: connect from 172.16.157.2 (172.16.157.2)
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
/usr/lib/qt-2.3.2/examples/xml/tagreader-with-features/tagreader.cpp
/usr/lib/qt-2.3.2/examples/xml/tagreader-with-features/tagreader.pro
/usr/lib/qt-2.3.2/examples/xml/tagreader/animals.xml
/usr/lib/qt-2.3.2/examples/xml/tagreader/Makefile
/usr/lib/qt-2.3.2/examples/xml/tagreader/structureparser.cpp
/usr/lib/qt-2.3.2/examples/xml/tagreader/structureparser.h
/usr/lib/qt-2.3.2/examples/xml/tagreader/tagreader.cpp
/usr/lib/qt-2.3.2/examples/xml/tagreader/tagreader.pro
/usr/lib/qt-2.3.2/examples/xmlquotes
/usr/lib/qt-2.3.2/examples/xmlquotes/main.cpp
/usr/lib/qt-2.3.2/examples/xmlquotes/Makefile
/usr/lib/qt-2.3.2/examples/xmlquotes/moc_richtext.cpp
/usr/lib/qt-2.3.2/examples/xmlquotes/quoteparser.cpp
/usr/lib/qt-2.3.2/examples/xmlquotes/quoteparser.h
/usr/lib/qt-2.3.2/examples/xmlquotes/quotes.xml
/usr/lib/qt-2.3.2/examples/xmlquotes/README
/usr/lib/qt-2.3.2/examples/xmlquotes/richtext.cpp
/usr/lib/qt-2.3.2/examples/xmlquotes/richtext.h
/usr/lib/qt-2.3.2/examples/xmlquotes/xmlquotes
/usr/lib/qt-2.3.2/examples/xmlquotes/xmlquotes.pro
/usr/lib/qt-2.3.2/extensions
/usr/lib/qt-2.3.2/extensions/nsplugin
/usr/lib/qt-2.3.2/extensions/nsplugin/doc
/usr/lib/qt-2.3.2/extensions/nsplugin/doc/examples.doc
/usr/lib/qt-2.3.2/extensions/nsplugin/doc/index.doc
/usr/lib/qt-2.3.2/extensions/nsplugin/examples
/usr/lib/qt-2.3.2/extensions/nsplugin/examples/grapher
/usr/lib/qt-2.3.2/extensions/nsplugin/examples/grapher/graph.cgi
/usr/lib/qt-2.3.2/extensions/nsplugin/examples/grapher/graph.g1n
Aug 4 16:05:07 quality5 syslogd 1.4.1: restart.
.....
Aug 13 00:07:06 quality5 in.ftpd[22292]: connect from 172.16.157.2 (172.16.157.2)
Aug 13 00:07:08 quality5 in.ftpd[22296]: connect from 172.16.157.2 (172.16.157.2)
Aug 13 00:07:09 quality5 in.ftpd[22301]: connect from 172.16.157.2 (172.16.157.2)
Aug 13 00:07:11 quality5 in.ftpd[22303]: connect from 172.16.157.2 (172.16.157.2)
Aug 13 00:07:13 quality5 in.ftpd[22312]: connect from 172.16.157.2 (172.16.157.2)
Aug 13 07:03:17 quality5 syslogd 1.4.1: restart.
Aug 13 07:03:18 quality5 modprobe: modprobe: Can't locate module block-major-11
Aug 13 07:03:19 quality5 last message repeated 31 times
......
Aug 13 21:52:57 quality5 in.ftpd[8373]: connect from 172.16.157.2 (172.16.157.2)
Aug 13 21:53:03 quality5 in.ftpd[8384]: connect from 172.16.157.2 (172.16.157.2)
Aug 13 21:59:00 quality5 /USR/SBIN/CRON[9042]: (root) CMD ( rm -f /var/spool/cron/lastrun/cron.hourly)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
/usr/X11R6/share/xmclock/xmconvert.tcl
/var
/var/account
/var/account/pacct
/var/adm
/var/adm/autoinstall
/var/adm/autoinstall/files
/var/adm/autoinstall/logs
/var/adm/autoinstall/scripts
/var/adm/backup
/var/adm/current_package_descr
/var/adm/current_pa
Aug 14 08:50:16 quality5 syslogd 1.4.1: restart.
....
Aug 14 22:01:34 quality5 in.ftpd[29065]: connect from 172.16.157.2 (172.16.157.2)
Aug 14 22:01:38 quality5 in.ftpd[29075]: connect from 172.16.157.2 (172.16.157.2)
Aug 15 17:01:33 quality5 syslogd 1.4.1: restart.
Aug 15 17:01:34 quality5 modprobe: modprobe: Can't locate module block-major-11
Aug 15 17:01:35 quality5 last message repeated 31 times
.......
Aug 22 17:41:15 quality5 in.ftpd[19094]: connect from 172.16.157.2 (172.16.157.2)
Aug 22 17:41:17 quality5 in.ftpd[19096]: connect from 172.16.157.2 (172.16.157.2)
Aug 22 17:41:22 quality5 in.ftpd[19106]: connect from 172.16.157.2 (172.16.157.2)
Aug 23 07:08:43 quality5 syslogd 1.4.1: restart.
Aug 23 07:08:43 quality5 modprobe: modprobe: Can't locate module block-major-11
Aug 23 07:08:45 quality5 last message repeated 31 times
.....
Aug 23 21:31:34 quality5 in.ftpd[7138]: connect from 172.16.157.2 (172.16.157.2)
Aug 23 21:31:40 quality5 in.ftpd[7151]: connect from 172.16.157.2 (172.16.157.2)
Aug 24 15:30:49 quality5 syslogd 1.4.1: restart.
Aug 24 15:30:49 quality5 modprobe: modprobe: Can't locat
......
Aug 25 12:55:45 quality5 in.ftpd[23009]: connect from 172.16.157.2 (172.16.157.2)
Aug 25 12:55:50 quality5 in.ftpd[23022]: connect from 172.16.157.2 (172.16.157.2)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@Aug 25 13:52:14 quality5 syslogd 1.4.1: restart.
Aug 25 13:52:15 quality5 modprobe: modprobe: Can't locate module block-major-11
Aug 25 13:52:16 quality5 last message repeated 31 times
Aug 25 13:52:19 quality5 kernel: klogd 1.4.1, log source =
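
The freeze times can be pulled out of a log like the one above mechanically: each `syslogd 1.4.1: restart.` line marks syslogd starting again, and the last timestamp before it bounds when the machine stopped logging. Below is a minimal sketch in Python, assuming the log format shown above and assuming that the `^@` characters are NUL bytes in the actual file; `find_reboots` is a hypothetical helper name, not an existing tool.

```python
import re

# Timestamp at the start of a syslog line, e.g. "Aug 25 13:52:14".
STAMP = re.compile(rb"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) ")
# The line syslogd writes when it starts up.
RESTART = re.compile(
    rb"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ syslogd [\d.]+: restart\."
)


def find_reboots(lines):
    """Yield (last_stamp_before, restart_stamp, saw_nul) per syslogd restart.

    `lines` is an iterable of bytes, such as a log file opened in binary mode.
    `saw_nul` is True when NUL bytes appeared since the previous restart.
    """
    last = None
    saw_nul = False
    for raw in lines:
        if b"\x00" in raw:
            saw_nul = True
            # Drop the NUL padding so a timestamp glued to it still parses.
            raw = raw.replace(b"\x00", b"").lstrip()
        restart = RESTART.match(raw)
        if restart:
            yield (last, restart.group(1).decode(), saw_nul)
            saw_nul = False
        stamp = STAMP.match(raw)
        if stamp:
            last = stamp.group(1).decode()


# Usage (the path is the one mentioned in this thread):
# with open("/var/log/messages", "rb") as log:
#     for before, after, nul in find_reboots(log):
#         print(before, "->", after, "(NUL bytes seen)" if nul else "")
```

A restart line is also written after a clean reboot or a manual syslogd restart, so a long gap between the two timestamps is only a hint that the machine was frozen in between.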

bluefire 08-27-2004 03:25 PM

I was having freezes for a while when running Debian Testing and Mandrake Cooker. Hardware is an AMD K6. I "solved" it by re-installing vanilla Mandrake 10. My guess is that something in the Xorg libraries was causing it (since that was the major change between my vanilla and testing setups), but I was unable to get to the bottom of it. Do any of those circumstances apply to you?

Crashed_Again 08-28-2004 05:02 AM

What kernel sources are you using and what chipset does your motherboard have?

ambayah 08-28-2004 05:34 AM

KDE 3.3?

svar 08-28-2004 08:26 AM

Kernel is 2.4.18-4GB (SuSE 8.0).
MB is an ASUS CUv4x; ASUS says the chipsets are
VIA 694XDP and VIA 686B.

HDs are 3 Seagate Cheetah 36.7GB Ultra (SCSI); one of them holds /home and seems to be the one giving the problem.
Controller is a SCSI AHA 29160.

No, I do not have KDE 3.3; this is an older distribution.
Neither do I have an AMD, and this is not Debian. It is very puzzling though. I've had
these freezes at irregular intervals (sometimes a month or more with no problems, sometimes twice a day). Initially I thought (because there is writing to disk) that this must be some disk error, but now I am not so sure.

J.W. 08-28-2004 10:34 PM

IMO, three very frequent causes of freezes are:

1. Overheating
2. Overclocking
3. Bad or incompatible memory

with #1 being the most likely culprit. What are your system temps, and do you use something like gkrellm to monitor them? -- J.W.

svar 08-29-2004 08:26 AM

Overheating - the fan seems to work
Overclocking - definitely not
Bad/incompatible memory


Could it be bad disk sectors?
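
One low-risk way to look for unreadable sectors is to read a suspect file (or, as root, the whole partition device) from start to end and note where reads fail. Below is a minimal sketch in Python, assuming Python is available on the machine; `scan_for_read_errors` and the 64 KB block size are hypothetical choices, not an existing tool. It only reads and never writes.

```python
def scan_for_read_errors(path, block_size=64 * 1024):
    """Read `path` block by block; return a list of (offset, errno) failures.

    `path` may be a regular file or a device node; it is opened read-only.
    """
    bad = []
    offset = 0
    # buffering=0 so each read() maps to one request of at most block_size.
    with open(path, "rb", buffering=0) as f:
        while True:
            try:
                chunk = f.read(block_size)
            except OSError as err:
                # Record the failing block, then skip past it and carry on.
                bad.append((offset, err.errno))
                offset += block_size
                f.seek(offset)
                continue
            if not chunk:
                break  # end of file or device
            offset += len(chunk)
    return bad
```

An empty result only shows that every block could be read at the time of the scan; it does not rule out controller, cabling or memory problems.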

ambayah 08-31-2004 01:40 AM

When it freezes, does the red light / HDD activity light stay on?

svar 08-31-2004 01:43 AM

The LEDs above Pause/Scroll Lock are on (green)

ambayah 08-31-2004 01:48 AM

Maybe it's the hard disk.

Have you upgraded your system?

Try using swaret to upgrade and see if it still hangs.

svar 08-31-2004 01:53 AM

Meaning the kernel? No, I'm still with 2.4.22
Are you suggesting 2.6.7?

ambayah 08-31-2004 02:06 AM

Nope, I'm still on 2.4.22. I've had hangs before, but not since I upgraded a few things. I'm on Slackware and I upgrade using Swaret.

ambayah 08-31-2004 02:06 AM

And try reinstalling the drivers, or reconfigure alsoconfig.

svar 08-31-2004 02:10 AM

Thanks about swaret, I was not aware.
Not sure I follow though... What drivers are you talking about? Drivers for the SCSI disks?
You mean reconfigure alsaconfig? Please be more specific, if you can(or remember)
Thanks again

ambayah 08-31-2004 02:13 AM

:D

run alsoconfig on the terminal, and follow the menu.

I don't know how SuSeans upgrade, apt-get or something, but on Slackware we use Swaret to automagically upgrade.

Upgrade Disk, Sound, Display if possible.


All times are GMT -5.