LinuxQuestions.org

asidarin 06-12-2007 08:13 AM

FC5 slows down after period of time
 
We had a disk crash last week and since we re-installed with Fedora Core 5, the system seems to work fine while it is busy, but when we come in, in the morning, it is very, very slow.

I had to turn the power off yesterday morning. After the system rebooted, the system seemed to be fine. I monitored it all day. When we came in this morning, same problem. It takes minutes to execute commands.

A friend suggested it might be a syslog problem. I restarted syslog, that didn't help. It took over 5 minutes for "service syslog restart" to complete. This box is only used for a Samba share. There are no direct users logged in on the system.

When I run TOP, it takes a long time to redisplay. It alternates between 0's in every CPU field and 100% for wait. The top app in the list of apps is init. After I reboot, the system seems to routinely stay at 95% idle and the top app is Xorg.

The system is an IBM ThinkCentre Model 8194-A4U. It has a 120GB drive. It is a Pentium 4, 2.4Ghz with 768MB RAM.

Amazingly, I set up a different ThinkCentre last night that is already doing the same thing. It is a Model 8198-A2U with a 160GB drive. I believe it has a Pentium 4, 3.0Ghz with 256MB RAM. I put FC6 on and then did a yum update. The update was still running when I left. This morning yum had prompted for Y/N question and after I answered it, it is proceeding with the update, but it is very, very slow. I can't even get TOP to load.

What I'm asking is if anyone has seen this in the past and what I can check to see what the problem is. I am guessing it might have to do with something in the powersave area, where maybe the CPU or the disk is shut down due to inactivity but now it isn't re-awakening.

I've just noticed something else, on the primary system, before I reboot, I noticed that it only had about 9MB of memory left. After I reboot, it has about 331MB free. I'm not sure what would be taking up the memory.

Thanks

macemoneta 06-12-2007 12:04 PM

Do you see any errors in /var/log/messages and/or dmesg? The high I/O wait makes me suspect one or more of your drives is experiencing a problem.

asidarin 06-12-2007 01:11 PM

Quote:

Originally Posted by macemoneta
Do you see any errors in /var/log/messages and/or dmesg? The high I/O wait makes me suspect one or more of your drives is experiencing a problem.

I get these in /var/log/messages frequently:

Jun 12 12:51:49 acsfs smbd[2264]: [2007/06/12 12:51:49, 0] smbd/service.c:make_connection_snum(592)
Jun 12 12:51:49 acsfs smbd[2264]: Can't become connected user!
Jun 12 12:54:18 acsfs smbd[2269]: [2007/06/12 12:54:18, 0] smbd/service.c:make_connection_snum(592)

I don't see anything that jumps out on dmesg.

I have been watching TOP. Shortly after I booted, I had approx 331MB of memory free. I did a ps aux to a file to record the processes. It quickly went down to around 188MB as the first 4 shares were loaded by users. It has slowly gone down now to 45MB of free memory as reported by TOP. When I do a ps aux to a separate file and compare the two, they are almost identical except the latest one has 7 shares and two ssh connections and the first one only had 4 shares and 1 ssh connection. The sizes for the shares are slightly larger. For example, the largest of the 4 smbd's from the first ps aux was 12220 and now the largest of the smbd's from the second ps aux is 13672. These are the VSZ column numbers, not the RSS size.

Top shows uptime at 5:11 and 4 users.

This seems like it is a huge memory leak. I guess I don't understand why ps aux doesn't show some program's size increasing dramatically. Is there a way to better track memory size of apps to see where it is all going? At this pace, I'm not sure I'm going to make it to 5:00 this afternoon.
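
One way to compare two ps aux snapshots like the ones described above is to line up the RSS column per PID and sort by growth. The following is only a rough sketch in Python, assuming the usual ps aux column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND). The function names are made up for illustration, and the sample rows are invented apart from the two VSZ figures (12220 and 13672) quoted in the post. Memory used by the kernel's disk cache is not charged to any process, so it will not show up in this kind of comparison.

```python
def parse_ps_aux(text):
    """Return {pid: (command, vsz_kb, rss_kb)} from `ps aux` output."""
    procs = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)  # COMMAND (last field) may contain spaces
        if len(fields) < 11:
            continue
        pid, vsz, rss = int(fields[1]), int(fields[4]), int(fields[5])
        procs[pid] = (fields[10], vsz, rss)
    return procs


def rss_growth(before, after):
    """List (growth_kb, pid, command) for PIDs in both snapshots, largest first."""
    changes = []
    for pid, (command, _vsz, rss_after) in after.items():
        if pid in before:
            changes.append((rss_after - before[pid][2], pid, command))
    return sorted(changes, reverse=True)


# Illustrative snapshots only: RSS values and the Xorg row are hypothetical.
BEFORE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      2264  0.0  0.5  12220  4100 ?        S    08:05   0:00 smbd -D
root      5289  4.0  2.1  30000 16000 tty7     Ss+  08:01   0:40 /usr/bin/Xorg :0
"""
AFTER = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      2264  0.1  0.6  13672  4900 ?        S    08:05   0:03 smbd -D
root      5289  4.0  2.1  30000 16400 tty7     Ss+  08:01   9:12 /usr/bin/Xorg :0
"""

growth = rss_growth(parse_ps_aux(BEFORE), parse_ps_aux(AFTER))
for delta_kb, pid, command in growth:
    print(f"{delta_kb:>8} kB  pid {pid}  {command}")
```

If no process shows meaningful growth while "free" memory keeps shrinking, the memory is probably going to buffers and cache rather than to a leaking program.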

I also looked through /var/log/cron and I can see where cron.hourly ran without any problems through 4:00 am. But when cron.daily started a minute later, it doesn't seem to have finished. That was one of the things I noticed this morning was that the time in TOP showed around 4:00 am. I thought it was just a problem with the actual date of the system. But I wonder if there is something in the cron.daily that is killing the system.

I had left TOP up and running from the night before. It is refreshing every 3 seconds. When I checked it this morning, it would show an updated time of every 3 or 4 seconds, but the refreshes were taking more than 30 seconds. It was like they were queued up.

Thanks

macemoneta 06-12-2007 01:50 PM

There's no indication of a memory leak. Instead of using top, use free:

Code:

# free
            total      used      free    shared    buffers    cached
Mem:      1032788    966756      66032          0      37544    611692
-/+ buffers/cache:    317520    715268
Swap:      265064        176    264888

The value you're interested in is the free column of the "-/+ buffers/cache" line (715268 in this example). While top would show "66032" free in this example, it doesn't take into consideration the amount of memory being used for cache and buffers, which avoids disk I/O and can be reclaimed if memory is actually needed.

asidarin 06-12-2007 02:50 PM

Quote:

Originally Posted by macemoneta
There's no indication of a memory leak. Instead of using top, use free:

Code:

# free
            total      used      free    shared    buffers    cached
Mem:      1032788    966756      66032          0      37544    611692
-/+ buffers/cache:    317520    715268
Swap:      265064        176    264888

The value you're interested in is the free column of the "-/+ buffers/cache" line (715268 in this example). While top would show "66032" free in this example, it doesn't take into consideration the amount of memory being used for cache and buffers, which avoids disk I/O and can be reclaimed if memory is actually needed.

[root@acsfs ~]# free
total used free shared buffers cached
Mem: 767208 758140 9068 0 73820 480320
-/+ buffers/cache: 204000 563208
Swap: 1540088 0 1540088

I had just recently logged out of the console, then relogged in and then started a terminal session and started TOP and it fell from around 38MB free to about 9MB free. So, I'm seeing about the same thing as free is reporting.

Is there a way to flush the cache?

Thanks

asidarin 06-12-2007 02:59 PM

Sorry, didn't format the way it should have

Code:

             total       used       free     shared    buffers     cached
Mem:        767208     758140       9068          0      73820     480320
-/+ buffers/cache:     204000     563208
Swap:      1540088          0    1540088


macemoneta 06-12-2007 03:00 PM

Two thirds of your RAM is free, and you are not using any swap. There is no memory bottleneck on your system. If you were to "flush the cache", your system would grind to a halt (you think it's bad now), as every file I/O would require a real disk I/O.
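
The arithmetic behind the "-/+ buffers/cache" line can be reproduced with a few lines of code. This is a minimal sketch in Python, assuming the older free column layout shown in this thread (newer versions of free lay the columns out differently). The function names are illustrative, and the sample text is the output posted above.

```python
def parse_free(text):
    """Parse the Mem: row of classic `free` output (values in kB)."""
    names = ("total", "used", "free", "shared", "buffers", "cached")
    for line in text.splitlines():
        if line.startswith("Mem:"):
            values = [int(v) for v in line.split()[1:7]]
            return dict(zip(names, values))
    raise ValueError("no Mem: line found")


def without_cache(mem):
    """Return (used, free) with buffers and cache counted as reclaimable."""
    used = mem["used"] - mem["buffers"] - mem["cached"]
    free = mem["free"] + mem["buffers"] + mem["cached"]
    return used, free


SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:        767208     758140       9068          0      73820     480320
-/+ buffers/cache:     204000     563208
Swap:      1540088          0    1540088
"""

mem = parse_free(SAMPLE)
used_kb, free_kb = without_cache(mem)
percent_free = round(100 * free_kb / mem["total"])
print(used_kb, free_kb, percent_free)  # 204000 563208 73
```

The computed figures match the second line of the free output, and about 73% of RAM is available once cache is counted as reclaimable.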

macemoneta 06-12-2007 03:05 PM

I suggest you run "smartctl -A" on each of your drives. If they are not reporting errors, they may be experiencing high recoverable counts.
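
When reading the attribute table that smartctl -A prints, the usual rule is that an attribute counts as failed once its normalized VALUE drops to or below THRESH. The sketch below, in Python, applies that rule and also lists which attributes have the smallest margin between WORST and THRESH. It assumes the ten-column table layout of smartctl 5.33 as posted later in this thread, the function names are illustrative, and the sample rows are copied from that output.

```python
def parse_smart_attributes(text):
    """Parse attribute rows from `smartctl -A` output into dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": fields[9],
        })
    return rows


def failing(rows):
    """Attributes whose normalized value is at or below a non-zero threshold."""
    return [r for r in rows if r["thresh"] > 0 and r["value"] <= r["thresh"]]


def closest_to_threshold(rows):
    """Attributes with a non-zero threshold, smallest WORST-THRESH margin first."""
    watched = [r for r in rows if r["thresh"] > 0]
    return sorted(watched, key=lambda r: r["worst"] - r["thresh"])


SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027  201   201   063    Pre-fail  Always   -           14851
  5 Reallocated_Sector_Ct   0x0033  253   253   063    Pre-fail  Always   -           0
  8 Seek_Time_Performance   0x0027  252   244   187    Pre-fail  Always   -           48484
 11 Calibration_Retry_Count 0x002b  253   252   223    Pre-fail  Always   -           0
195 Hardware_ECC_Recovered  0x000a  253   252   000    Old_age   Always   -           2502
"""

rows = parse_smart_attributes(SAMPLE)
print("failing:", [r["name"] for r in failing(rows)])
for r in closest_to_threshold(rows):
    print(r["name"], "margin", r["worst"] - r["thresh"])
```

None of the sample attributes is at or below its threshold, which agrees with the drive not reporting a SMART failure. A healthy table does not rule out every kind of drive or controller problem.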

asidarin 06-12-2007 03:48 PM

Code:

[root@acsfs ~]# smartctl -A /dev/hda
smartctl version 5.33 [i386-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027  201  201  063    Pre-fail  Always      -      14851
  4 Start_Stop_Count        0x0032  242  242  000    Old_age  Always      -      23369
  5 Reallocated_Sector_Ct  0x0033  253  253  063    Pre-fail  Always      -      0
  6 Read_Channel_Margin    0x0001  253  253  100    Pre-fail  Offline      -      0
  7 Seek_Error_Rate        0x000a  253  252  000    Old_age  Always      -      0
  8 Seek_Time_Performance  0x0027  252  244  187    Pre-fail  Always      -      48484
  9 Power_On_Minutes        0x0032  212  212  000    Old_age  Always      -      26h+50m
 10 Spin_Retry_Count        0x002b  253  252  157    Pre-fail  Always      -      0
 11 Calibration_Retry_Count 0x002b  253  252  223    Pre-fail  Always      -      0
 12 Power_Cycle_Count      0x0032  253  253  000    Old_age  Always      -      267
192 Power-Off_Retract_Count 0x0032  253  253  000    Old_age  Always      -      0
193 Load_Cycle_Count        0x0032  253  253  000    Old_age  Always      -      0
194 Temperature_Celsius    0x0032  253  253  000    Old_age  Always      -      47
195 Hardware_ECC_Recovered  0x000a  253  252  000    Old_age  Always      -      2502
196 Reallocated_Event_Count 0x0008  253  253  000    Old_age  Offline      -      0
197 Current_Pending_Sector  0x0008  253  253  000    Old_age  Offline      -      0
198 Offline_Uncorrectable  0x0008  253  253  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x0008  199  199  000    Old_age  Offline      -      0
200 Multi_Zone_Error_Rate  0x000a  253  252  000    Old_age  Always      -      0
201 Soft_Read_Error_Rate    0x000a  253  242  000    Old_age  Always      -      320
202 TA_Increase_Count      0x000a  253  252  000    Old_age  Always      -      0
203 Run_Out_Cancel          0x000b  253  252  180    Pre-fail  Always      -      3
204 Shock_Count_Write_Opern 0x000a  253  252  000    Old_age  Always      -      0
205 Shock_Rate_Write_Opern  0x000a  253  252  000    Old_age  Always      -      0
207 Spin_High_Current      0x002a  253  252  000    Old_age  Always      -      0
208 Spin_Buzz              0x002a  253  252  000    Old_age  Always      -      0
209 Offline_Seek_Performnce 0x0024  191  189  000    Old_age  Offline      -      0
 99 Unknown_Attribute      0x0004  253  253  000    Old_age  Offline      -      0
100 Unknown_Attribute      0x0004  253  253  000    Old_age  Offline      -      0
101 Unknown_Attribute      0x0004  253  253  000    Old_age  Offline      -      0

[root@acsfs ~]#


macemoneta 06-12-2007 07:39 PM

That looks good. How about:

cat /proc/interrupts

asidarin 06-12-2007 08:19 PM

Code:

[root@acsfs ~]# cat /proc/interrupts
          CPU0
  0:  11076110    IO-APIC-edge  timer
  1:        244    IO-APIC-edge  i8042
  7:    2119934    IO-APIC-edge  parport0
  8:          1    IO-APIC-edge  rtc
  9:          1  IO-APIC-level  acpi
 12:      9609    IO-APIC-edge  i8042
 14:    153636    IO-APIC-edge  ide0
 15:    382769    IO-APIC-edge  ide1
 16:          0  IO-APIC-level  uhci_hcd:usb3
 17:    3748946  IO-APIC-level  uhci_hcd:usb1, uhci_hcd:usb4, i915@pci:0000:00:02.0
 18:          0  IO-APIC-level  uhci_hcd:usb2
 19:          0  IO-APIC-level  ehci_hcd:usb5
 20:        23  IO-APIC-level  Intel ICH5
 21:  10548523  IO-APIC-level  eth0
NMI:          0
LOC:  11076366
ERR:          0
MIS:          0
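
To see at a glance which sources generate the most interrupts, the /proc/interrupts listing can be sorted by count. This is a small sketch in Python, assuming the layout shown above (a header row naming the CPUs, then one row per IRQ). The function name is illustrative and the sample rows are taken from the listing above.

```python
def parse_interrupts(text):
    """Return (total_count, irq_label, description) tuples, busiest first."""
    lines = [l for l in text.splitlines() if l.strip()]
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    rows = []
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:n_cpus]:  # one count column per CPU
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        rows.append((sum(counts), label.strip(), description))
    return sorted(rows, reverse=True)


SAMPLE = """\
           CPU0
  0:   11076110    IO-APIC-edge  timer
  7:    2119934    IO-APIC-edge  parport0
 14:     153636    IO-APIC-edge  ide0
 15:     382769    IO-APIC-edge  ide1
 17:    3748946   IO-APIC-level  uhci_hcd:usb1, uhci_hcd:usb4, i915@pci:0000:00:02.0
 21:   10548523   IO-APIC-level  eth0
"""

ranked = parse_interrupts(SAMPLE)
for total, irq, description in ranked:
    print(f"{total:>10}  IRQ {irq:>3}  {description}")
```

Running the same ranking on two listings taken some minutes apart, and comparing the totals, would show which counters are actually climbing rather than just which are large since boot.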

Also, well after working hours, I see TOP is active as it has been all day. The CPU us field is bouncing around 25-40%, and idle rarely stays up around 90%+ as it had all day. At peak, there were 9 smbd processes, now there are two. Most users turn their PCs off at night. Again, there are no direct logins.

The apps that are staying at the very top are Xorg, which now has CPU time of over 135:00:00, and floaters, which I think is a screen saver.

It looks busier now than it has most of the day.

We didn't have a console on the system until yesterday but it may be turned off. Any chance that could be a problem?

Thanks

macemoneta 06-12-2007 08:47 PM

Generally servers don't run X, and they certainly don't run screensavers - they just burn CPU for no good reason. However, while that may provide crappy response to your users, it won't cause your problem.

I'm not seeing any reason for the bad response time. You wouldn't happen to have an email server running on this system with an open relay?

asidarin 06-12-2007 09:22 PM

Not unless it comes that way from the install. I basically installed FC5, copied over my smb.conf file, turned on Samba and let it rip. It is a very vanilla install.

Thanks,

macemoneta 06-12-2007 09:48 PM

Assuming you have all the maintenance applied, I see no reason for the performance problem you're having. My last suggestion would be to check for a compromised machine (very unlikely):

yum -y install chkrootkit

Then run:

chkrootkit -q -n

asidarin 06-12-2007 09:56 PM

Code:

[root@acsfs ~]# chkrootkit -q -n

/usr/lib/perl5/5.8.8/i386-linux-thread-multi/.packlist

 The tty of the following user process(es) were not found
 in /var/run/utmp !
! RUID          PID TTY    CMD
! root        1977 tty1  /sbin/mingetty tty1
! root        1980 tty2  /sbin/mingetty tty2
! root        1983 tty3  /sbin/mingetty tty3
! root        1986 tty4  /sbin/mingetty tty4
! root        1990 tty6  /sbin/mingetty tty6
! root        5289 tty7  /usr/bin/Xorg :0 -audit 0 -auth /var/gdm/:0.Xauth -nolisten tcp vt7

I don't have any ttys installed on this system other than the regular serial port(s) on the back of the PC.

Thanks,

macemoneta 06-12-2007 10:16 PM

That response looks normal (the tty1-6 are the console sessions on ctrl-alt-f1 through f6). There doesn't appear to be anything wrong. The I/O activity of your samba users may simply be keeping the hard drive occupied.

asidarin 06-12-2007 11:23 PM

For the most part, the problem isn't during the day, it seems to work fine during the day and as I watch it tonight it is fine. There is no one during the night. But when we come in, in the morning, around 6:00, that is when we find it where it seems to be all jammed up to where it almost seems unresponsive.

Thanks,

asidarin 06-14-2007 10:23 AM

I'm back, another problem. I made it a little over two days, but it is a problem again.

If I do a ps -ef, it responds instantly. If I do a TOP, it takes a very long time to display the window.

This started at 9:21. I asked the main user at 9:45 how it was going and she said it just seemed to slow down again. I checked my TOP window and it still had 9:21 as the time. I checked the date command and it also said that the system date was 9:22 but the current wall clock time was after 9:45. I had checked the system around 8:30-8:45 and both times were in sync.

While watching TOP, it was doing the same thing: about every 20-30 seconds it will redisplay, even though the refresh rate is set to 3 seconds. But the system time displayed in TOP only goes up about 3-4 seconds on each refresh.
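
Those refresh figures imply that the system clock itself is advancing more slowly than real time, not merely that TOP is slow to repaint. A back-of-the-envelope sketch in Python, using only the rough numbers from this post (the function name is illustrative):

```python
def clock_rate(system_elapsed_s, wall_elapsed_s):
    """Fraction of real time by which the system clock advanced."""
    return system_elapsed_s / wall_elapsed_s


# Observed: TOP's clock moves 3-4 s for every 20-30 s of wall-clock time.
slowest = clock_rate(3, 30)
fastest = clock_rate(4, 20)
print(f"system clock runs at {slowest:.0%} to {fastest:.0%} of real time")
```

A clock running at a fraction of real speed would also explain why a 3-second refresh timer appears to take 20-30 seconds to fire.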

I am tarring the data up to save it off, it scrolled the first 50 or so files then it paused for a long time and is repeating. The total size of the data is only around 471MB, so it should only take a minute or two to tar it up, but it has already taken over 5 minutes and still isn't finished.

Here is a copy of my smartctl output. At first I thought it was a problem with the drive, but now I'm not so sure. Below this is a copy of the ps -ef and top output.

Please help, the Microsofties here want to just switch over to Win2003 and be done with it. I don't want to give up.

Thanks,

Code:

[root@acsfs log]# smartctl -A /dev/hda
smartctl version 5.33 [i386-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027  201  201  063    Pre-fail  Always      -      14851
  4 Start_Stop_Count        0x0032  242  242  000    Old_age  Always      -      23369
  5 Reallocated_Sector_Ct  0x0033  253  253  063    Pre-fail  Always      -      0
  6 Read_Channel_Margin    0x0001  253  253  100    Pre-fail  Offline      -      0
  7 Seek_Error_Rate        0x000a  253  252  000    Old_age  Always      -      0
  8 Seek_Time_Performance  0x0027  253  244  187    Pre-fail  Always      -      55335
  9 Power_On_Minutes        0x0032  212  212  000    Old_age  Always      -      68h+55m
 10 Spin_Retry_Count        0x002b  253  252  157    Pre-fail  Always      -      0
 11 Calibration_Retry_Count 0x002b  253  252  223    Pre-fail  Always      -      0
 12 Power_Cycle_Count      0x0032  253  253  000    Old_age  Always      -      267
192 Power-Off_Retract_Count 0x0032  253  253  000    Old_age  Always      -      0
193 Load_Cycle_Count        0x0032  253  253  000    Old_age  Always      -      0
194 Temperature_Celsius    0x0032  253  253  000    Old_age  Always      -      45
195 Hardware_ECC_Recovered  0x000a  253  252  000    Old_age  Always      -      1763
196 Reallocated_Event_Count 0x0008  253  253  000    Old_age  Offline      -      0
197 Current_Pending_Sector  0x0008  253  253  000    Old_age  Offline      -      0
198 Offline_Uncorrectable  0x0008  253  253  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x0008  199  199  000    Old_age  Offline      -      0
200 Multi_Zone_Error_Rate  0x000a  253  252  000    Old_age  Always      -      0
201 Soft_Read_Error_Rate    0x000a  253  242  000    Old_age  Always      -      245
202 TA_Increase_Count      0x000a  253  252  000    Old_age  Always      -      0
203 Run_Out_Cancel          0x000b  253  252  180    Pre-fail  Always      -      1
204 Shock_Count_Write_Opern 0x000a  253  252  000    Old_age  Always      -      0
205 Shock_Rate_Write_Opern  0x000a  253  252  000    Old_age  Always      -      0
207 Spin_High_Current      0x002a  253  252  000    Old_age  Always      -      0
208 Spin_Buzz              0x002a  253  252  000    Old_age  Always      -      0
209 Offline_Seek_Performnce 0x0024  191  189  000    Old_age  Offline      -      0
 99 Unknown_Attribute      0x0004  253  253  000    Old_age  Offline      -      0
100 Unknown_Attribute      0x0004  253  253  000    Old_age  Offline      -      0
101 Unknown_Attribute      0x0004  253  253  000    Old_age  Offline      -      0

[root@acsfs log]#
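
Comparing this listing with the one posted on 06-12 is easier if only the raw values that changed are shown. The sketch below, in Python, does that for a handful of attributes whose raw values are copied by hand from the two smartctl listings in this thread. The function name is illustrative.

```python
# RAW_VALUE column from the 06-12 listing (selected attributes).
FIRST = {
    "Reallocated_Sector_Ct": 0,
    "Seek_Time_Performance": 48484,
    "Temperature_Celsius": 47,
    "Hardware_ECC_Recovered": 2502,
    "Current_Pending_Sector": 0,
    "Soft_Read_Error_Rate": 320,
    "Run_Out_Cancel": 3,
}

# RAW_VALUE column from the 06-14 listing (same attributes).
SECOND = {
    "Reallocated_Sector_Ct": 0,
    "Seek_Time_Performance": 55335,
    "Temperature_Celsius": 45,
    "Hardware_ECC_Recovered": 1763,
    "Current_Pending_Sector": 0,
    "Soft_Read_Error_Rate": 245,
    "Run_Out_Cancel": 1,
}


def changed_raw_values(before, after):
    """Return {name: (old, new)} for attributes whose raw value differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


changes = changed_raw_values(FIRST, SECOND)
for name, (old, new) in changes.items():
    print(f"{name:<24} {old:>6} -> {new:>6}")
```

The reallocated and pending sector counts stay at 0 in both listings. Several of the raw values that did change went down rather than up, which suggests they are not simple lifetime counters. Raw value meanings are vendor-specific, so this comparison is a hint rather than a verdict on the drive.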

Code:

[root@acsfs log]# top
top - 09:21:08 up 2 days,  1:27,  3 users,  load average: 0.91, 1.19, 1.21
Tasks: 107 total,  2 running, 105 sleeping,  0 stopped,  0 zombie
Cpu(s):  0.0% us,  0.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:    767208k total,  756408k used,    10800k free,    93720k buffers
Swap:  1540088k total,      124k used,  1539964k free,  494840k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      16  0  1992  680  572 S  0.0  0.1  0:01.37 init
    2 root      34  19    0    0    0 S  0.0  0.0  0:00.23 ksoftirqd/0
    3 root      RT  0    0    0    0 S  0.0  0.0  0:00.00 watchdog/0
    4 root      10  -5    0    0    0 S  0.0  0.0  0:02.00 events/0
    5 root      11  -5    0    0    0 S  0.0  0.0  0:00.00 khelper
    6 root      10  -5    0    0    0 S  0.0  0.0  0:00.00 kthread
    8 root      10  -5    0    0    0 S  0.0  0.0  0:00.23 kblockd/0
    9 root      20  -5    0    0    0 S  0.0  0.0  0:00.00 kacpid
  98 root      10  -5    0    0    0 S  0.0  0.0  0:00.00 khubd
  156 root      11  -5    0    0    0 S  0.0  0.0  0:00.00 aio/0
  155 root      15  0    0    0    0 S  0.0  0.0  0:03.36 kswapd0
  243 root      10  -5    0    0    0 S  0.0  0.0  0:00.00 kseriod
  315 root      11  -5    0    0    0 S  0.0  0.0  0:00.00 kpsmoused
  336 root      11  -5    0    0    0 S  0.0  0.0  0:00.00 kmirrord
  345 root      15  0    0    0    0 D  0.0  0.0  0:03.10 kjournald
  384 root      11  -5    0    0    0 S  0.0  0.0  0:00.00 kauditd
  408 root      13  -4  2204  692  384 S  0.0  0.1  0:00.24 udevd
 1145 root      25  0    0    0    0 S  0.0  0.0  0:00.00 kjournald
 1441 rpc      16  0  1732  564  464 S  0.0  0.1  0:00.00 portmap
 1460 rpcuser  20  0  1740  716  620 S  0.0  0.1  0:00.00 rpc.statd
 1488 root      16  0  4728  588  308 S  0.0  0.1  0:00.25 rpc.idmapd
 1502 dbus      16  0  3064 1000  760 S  0.0  0.1  0:00.53 dbus-daemon
 1511 root      16  0  2288  884  784 S  0.0  0.1  0:00.00 hcid
 1514 root      15  0  1664  488  420 S  0.0  0.1  0:00.00 sdpd
 1526 root      10 -10    0    0    0 S  0.0  0.0  0:00.00 krfcommd
 1560 root      16  0  1820  476  396 S  0.0  0.1  0:00.86 hidd
 1643 root      16  0  1872  724  600 S  0.0  0.1  0:00.60 automount
 1657 root      16  0  1900  516  292 S  0.0  0.1  0:00.02 smartd
 1666 root      16  0  1596  476  404 S  0.0  0.1  0:00.00 acpid
 1675 root      15  0 25456  576  396 S  0.0  0.1  0:00.00 hpiod
 1680 root      16  0 11844 5072 1168 S  0.0  0.7  0:02.12 python
 1692 root      16  0  9688 3380 1808 S  0.0  0.4  0:02.16 cupsd
 1721 root      16  0  4972 1116  788 S  0.0  0.1  0:00.00 sshd
 1733 ntp      16  0  4248 4248 3240 S  0.0  0.6  0:01.58 ntpd
 1751 root      16  0  8276 2076 1032 S  0.0  0.3  0:01.14 sendmail
 1759 smmsp    16  0  7328 1684  852 S  0.0  0.2  0:00.00 sendmail
 1768 root      16  0  1820  468  396 S  0.0  0.1  0:01.80 gpm
 1809 xfs      16  0  3644 1656  812 S  0.0  0.2  0:00.01 xfs

Code:

[root@acsfs log]# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root        1    0  0 Jun12 ?        00:00:01 init [5]
root        2    1  0 Jun12 ?        00:00:00 [ksoftirqd/0]
root        3    1  0 Jun12 ?        00:00:00 [watchdog/0]
root        4    1  0 Jun12 ?        00:00:02 [events/0]
root        5    1  0 Jun12 ?        00:00:00 [khelper]
root        6    1  0 Jun12 ?        00:00:00 [kthread]
root        8    6  0 Jun12 ?        00:00:00 [kblockd/0]
root        9    6  0 Jun12 ?        00:00:00 [kacpid]
root        98    6  0 Jun12 ?        00:00:00 [khubd]
root      156    6  0 Jun12 ?        00:00:00 [aio/0]
root      155    1  0 Jun12 ?        00:00:03 [kswapd0]
root      243    6  0 Jun12 ?        00:00:00 [kseriod]
root      315    6  0 Jun12 ?        00:00:00 [kpsmoused]
root      336    6  0 Jun12 ?        00:00:00 [kmirrord]
root      345    1  0 Jun12 ?        00:00:03 [kjournald]
root      384    6  0 Jun12 ?        00:00:00 [kauditd]
root      408    1  0 Jun12 ?        00:00:00 /sbin/udevd -d
root      1145    1  0 Jun12 ?        00:00:00 [kjournald]
rpc      1441    1  0 Jun12 ?        00:00:00 portmap
rpcuser  1460    1  0 Jun12 ?        00:00:00 rpc.statd
root      1488    1  0 Jun12 ?        00:00:00 rpc.idmapd
dbus      1502    1  0 Jun12 ?        00:00:00 dbus-daemon --system
root      1511    1  0 Jun12 ?        00:00:00 hcid: processing events
root      1514    1  0 Jun12 ?        00:00:00 sdpd
root      1526    1  0 Jun12 ?        00:00:00 [krfcommd]
root      1560    1  0 Jun12 ?        00:00:00 /usr/bin/hidd --server
root      1643    1  0 Jun12 ?        00:00:00 /usr/sbin/automount --timeout=60
root      1657    1  0 Jun12 ?        00:00:00 /usr/sbin/smartd
root      1666    1  0 Jun12 ?        00:00:00 /usr/sbin/acpid
root      1675    1  0 Jun12 ?        00:00:00 ./hpiod
root      1680    1  0 Jun12 ?        00:00:02 python ./hpssd.py
root      1692    1  0 Jun12 ?        00:00:02 cupsd
root      1721    1  0 Jun12 ?        00:00:00 /usr/sbin/sshd
ntp      1733    1  0 Jun12 ?        00:00:01 ntpd -u ntp:ntp -p /var/run/ntpd
root      1751    1  0 Jun12 ?        00:00:01 sendmail: accepting connections
smmsp    1759    1  0 Jun12 ?        00:00:00 sendmail: Queue runner@01:00:00
root      1768    1  0 Jun12 ?        00:00:01 gpm -m /dev/input/mice -t exps2
xfs      1809    1  0 Jun12 ?        00:00:00 xfs -droppriv -daemon
root      1840    1  0 Jun12 ?        00:00:00 /usr/sbin/atd
avahi    1904    1  0 Jun12 ?        00:00:00 avahi-daemon: running [acsfs.loc
avahi    1905  1904  0 Jun12 ?        00:00:00 avahi-daemon: chroot helper proc
root      1914    1  0 Jun12 ?        00:00:00 cups-config-daemon
68        1924    1  0 Jun12 ?        00:00:01 hald
root      1925  1924  0 Jun12 ?        00:00:00 hald-runner
68        1931  1925  0 Jun12 ?        00:00:00 /usr/libexec/hald-addon-acpi
68        1939  1925  0 Jun12 ?        00:00:00 /usr/libexec/hald-addon-keyboard
root      1965  1925  0 Jun12 ?        00:00:33 /usr/libexec/hald-addon-storage
root      1977    1  0 Jun12 tty1    00:00:00 /sbin/mingetty tty1
root      1980    1  0 Jun12 tty2    00:00:00 /sbin/mingetty tty2
root      1983    1  0 Jun12 tty3    00:00:00 /sbin/mingetty tty3
root      1986    1  0 Jun12 tty4    00:00:00 /sbin/mingetty tty4
root      1989    1  0 Jun12 tty5    00:00:00 /sbin/mingetty tty5
root      1990    1  0 Jun12 tty6    00:00:00 /sbin/mingetty tty6
root      1993    1  0 Jun12 ?        00:00:00 /bin/sh /etc/X11/prefdm -nodaemo
root      1998  1993  0 Jun12 ?        00:00:00 /usr/sbin/gdm-binary -nodaemon
root      2073  1998  0 Jun12 ?        00:00:00 /usr/sbin/gdm-binary -nodaemon
root      2230    1  0 Jun12 ?        00:00:30 /usr/libexec/gam_server
root      5273    1  0 Jun12 ?        00:00:00 /usr/bin/esd -terminate -nobeeps
root      5289  2073 46 Jun12 tty7    19:59:09 /usr/bin/Xorg :0 -audit 0 -auth
root      5310  2073  0 Jun12 ?        00:00:00 /usr/bin/gnome-session
root      5353  5310  0 Jun12 ?        00:00:00 /usr/bin/ssh-agent /usr/bin/dbus
root      5356    1  0 Jun12 ?        00:00:00 /usr/bin/dbus-launch --exit-with
root      5357    1  0 Jun12 ?        00:00:00 dbus-daemon --fork --print-pid 8
root      5363    1  0 Jun12 ?        00:00:00 /usr/libexec/gconfd-2 5
root      5366    1  0 Jun12 ?        00:00:00 /usr/bin/gnome-keyring-daemon
root      5368    1  0 Jun12 ?        00:00:00 /usr/libexec/bonobo-activation-s
root      5370    1  0 Jun12 ?        00:00:10 /usr/libexec/gnome-settings-daem
root      5375    1  0 Jun12 ?        00:00:00 /usr/bin/metacity --sm-client-id
root      5381    1  0 Jun12 ?        00:00:10 gnome-panel --sm-client-id defau
root      5383    1  0 Jun12 ?        00:00:09 nautilus --no-default-window --s
root      5387    1  0 Jun12 ?        00:00:00 /usr/libexec/wnck-applet --oaf-a
root      5393    1  0 Jun12 ?        00:00:07 /usr/libexec/gnome-vfs-daemon --
root      5395    1  0 Jun12 ?        00:00:07 /usr/libexec/trashapplet --oaf-a
root      5397    1  0 Jun12 ?        00:00:00 eggcups --sm-client-id default4
root      5399    1  0 Jun12 ?        00:00:00 bluez-pin --dbus
root      5414    1  0 Jun12 ?        00:00:07 nm-applet --sm-disable
root      5418    1  0 Jun12 ?        00:00:00 /usr/libexec/mapping-daemon
root      5422    1  0 Jun12 ?        00:00:00 pam-panel-icon --sm-client-id de
root      5427    1  0 Jun12 ?        00:00:05 /usr/libexec/mixer_applet2 --oaf
root      5430    1  0 Jun12 ?        00:00:04 /usr/libexec/clock-applet --oaf-
root      5432    1  0 Jun12 ?        00:00:00 /usr/libexec/notification-area-a
root      5433    1  0 Jun12 ?        00:00:07 gnome-power-manager
root      5435  5422  0 Jun12 ?        00:00:03 /sbin/pam_timestamp_check -d roo
root      5439    1  0 Jun12 ?        00:00:13 gnome-terminal
root      5442    1  0 Jun12 ?        00:00:01 gnome-screensaver
root      5443  5439  0 Jun12 ?        00:00:00 gnome-pty-helper
root      5444  5439  0 Jun12 pts/1    00:00:00 bash
root      5476  5444  0 Jun12 pts/1    00:03:57 top
root      5871    1  0 Jun12 ?        00:00:00 crond
root      5927  5442  2 Jun12 ?        00:51:26 /usr/libexec/gnome-screensaver/f
root      7503    6  0 Jun12 ?        00:00:00 [pdflush]
root      7505    6  0 Jun12 ?        00:00:00 [pdflush]
root    11929  1721  0 Jun13 ?        00:00:05 sshd: root@pts/2
root    11932 11929  0 Jun13 pts/2    00:00:00 -bash
root    14035    1  0 Jun13 ?        00:00:00 syslogd -m 0
root    14038    1  0 Jun13 ?        00:00:00 klogd -x
root    14053    1  0 Jun13 ?        00:00:00 smbd -D
root    14056    1  0 Jun13 ?        00:00:01 nmbd -D
root    14058 14053  0 Jun13 ?        00:00:00 smbd -D
root    16597 14053  0 05:56 ?        00:00:00 smbd -D
root    17034 14053  0 07:38 ?        00:00:00 smbd -D
root    17095 14053  0 07:52 ?        00:00:09 smbd -D
laurat  17121 14053  0 07:57 ?        00:00:00 smbd -D
root    17146 14053  0 08:03 ?        00:00:06 smbd -D
root    17149 14053  0 08:03 ?        00:00:00 smbd -D
root    17472  1692  0 09:20 ?        00:00:00 parallel:/dev/lp0 298 JULIET smb
root    17478 11932  0 09:21 pts/2    00:00:00 ps -ef
root    17479  5442  0 09:21 ?        00:00:00 gnome-screensaver


macemoneta 06-14-2007 02:07 PM

Unfortunately, it sounds like the system is "broken" - non-compliant in some way that is causing problems for the kernel.

You can try booting with the kernel options acpi=off, noapic, mem=1000M to see if it helps. Edit your /etc/grub.conf and add the options to the kernel line. For example:

kernel /vmlinuz-2.6.xxx ro root=/dev/hda quiet acpi=off noapic mem=1000M

Then reboot. You can also try updating your BIOS, and swapping the slots PCI cards are in (clear the BIOS PCI data if you do this).
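
If the same options need to go on several kernel lines, the edit can be scripted. The following is a minimal sketch in Python that works on the text of a GRUB legacy style config and appends any options that are not already present. The function name is made up for illustration, the sample stanza around the kernel line is hypothetical, and only two of the suggested options are used in the example. It returns the new text without touching any file, so the original grub.conf can be backed up before anything is written.

```python
def add_kernel_options(grub_conf_text, options):
    """Append missing options to every `kernel` line of the config text."""
    out = []
    for line in grub_conf_text.splitlines():
        if line.strip().startswith("kernel "):
            present = line.split()
            missing = [opt for opt in options if opt not in present]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical stanza; the kernel line follows the example in this post.
SAMPLE = """\
title Fedora Core
        root (hd0,0)
        kernel /vmlinuz-2.6.xxx ro root=/dev/hda quiet
        initrd /initrd-2.6.xxx.img
"""

OPTIONS = ["noapic", "mem=1000M"]
updated = add_kernel_options(SAMPLE, OPTIONS)
print(updated)
```

Running the function a second time on its own output leaves the text unchanged, so it is safe to re-run.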

asidarin 06-25-2007 10:46 AM

Just as a followup, I had a brand new, never used 80GB drive that was two years old. I put that into the original sys, replacing the used 120GB drive, and it has been up for 6+ days now. The other sys that has the 160GB drive still needs to be rebooted once or twice a night to make it through the day.

So, the only thing I can think of is that the 120 & 160GB drives have a small incompatibility with the hardware and RH FC5/6. Both systems exhibited the same symptoms: they would run for up to 2 days without a problem, then "slow" down. I tried both FC5 & 6, and openSuse 10.2, and got the same symptoms.

All three drives are the cheap Maxtor drives from BBuy.

I guess I forgot that part also, on the 80GB drive is FC2. That was where we started from when the original drive died.

Thanks to all who helped, especially macemoneta.


All times are GMT -5.