LinuxQuestions.org


djortz 10-11-2005 02:58 AM

Using swap causes system to stop responding
 
Used hardware
Maxtor Sata hdd
Kingston 2x512MB
AMD 64 Athlon 3000+

Used software
Suse 9.3
kernel 2.6.11.4-21.9-default (original)

Situation
New hardware and a fresh installation of Suse 9.3 on a 64-bit AMD processor. Swap space on the SATA drive is 1GB.

Problem
When the system needs swap space, the process kswapd0 takes up about 30% of the CPU (according to top) and drives the load up to at least 8. After a few seconds it is impossible to halt the system cleanly or even to switch to a console (ctrl+alt+f2). The problem occurs at the very moment the swap daemon starts using swap. Apart from rebooting, I haven't found a way to get the system responding again.

Attempted solutions
* I found a post in a RH bug list that advised the topic starter to upgrade the kernel (to 2.4.9!) and run "echo 2 10 25 > /proc/sys/vm/buffermem" "to force the your system to give up buffermem more easily when more than 25% of lowmem is consumed by buffermem."
The kernel upgrade did not sound useful in my situation, and the given command resulted in "no such file or directory".
* Rebooting :S
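
The "no such file or directory" error means this kernel does not expose a buffermem file under /proc/sys/vm. A minimal sketch for seeing which tunables the running kernel does offer before writing to one; the function name list_tunables and its optional directory argument are illustrative, not part of any existing tool:

```shell
#!/bin/sh
# list_tunables [DIR]: print "name = value" for each regular file in DIR.
# DIR defaults to /proc/sys/vm; the argument is only there so the function
# can be tried against any other directory.
list_tunables() {
    dir=${1:-/proc/sys/vm}
    for f in "$dir"/*; do
        # Skip subdirectories and entries that cannot be read.
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s = %s\n' "${f##*/}" "$(cat "$f" 2>/dev/null)"
    done
}
```

Calling list_tunables with no argument reads /proc/sys/vm on the running system.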

I'm not a newbie, though I'm not an expert either. I've been using Suse as a desktop system for about 2 years now, so there should be some knowledge. Is there someone out there willing to help me fix this problem?

Update
It seems that I had been rebooting too soon. The system came back up and is running fine now:
Code:

             total       used       free     shared    buffers     cached
Mem:          1000        499        501          0         28        242
-/+ buffers/cache:        228        772
Swap:         1004        146        857

Apparently starting the swap daemon is very CPU-intensive. How can I solve that problem?

bigrigdriver 10-11-2005 08:44 PM

Check the /etc/fstab entry for the swap partition. Change the pri=xx number to a lower number to give swap a lower priority (it should use less CPU, but may be slower).

See 'man swapon' for more details, especially the -p priority option.

foo_bar_foo 10-11-2005 09:37 PM

I might be wrong, but I think the priority setting is for the priority order among multiple swap partitions.
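
For reference, swapon(8) describes the priority as deciding which swap area gets used first, with areas of equal priority used in turn; it does not mention CPU usage. A small sketch that prints each active swap area with its priority, assuming the usual /proc/swaps layout (one header row, priority in the last column); the function name swap_priorities is made up for illustration:

```shell
#!/bin/sh
# swap_priorities [FILE]: print "device priority" for every active swap area.
# FILE defaults to /proc/swaps; the argument only exists so a saved copy of
# that file can be inspected instead.
swap_priorities() {
    # Skip the header row; the priority is the last whitespace-separated field.
    awk 'NR > 1 && NF >= 5 { print $1, $NF }' "${1:-/proc/swaps}"
}
```

With a single swap partition, as in this thread, there is only one line of output, so the priority value has nothing to be ordered against.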

As for the other issue: when the system is using swap, it's because something is using up / allocating the memory for itself. Often that "something" is the problem.
If the system seems not to respond for a second, it's because the data needed for things to function has been pushed out and has to be reread. It's most likely not a problem with kswapd being too hoggish. kswapd is just trying to clean up the mess.

syg00 10-11-2005 10:18 PM

Mmmmm - bit unlikely I would think. Presuming that free listing is just after the system "came back" (without a boot), there is no apparent stress on the memory system at that point.
The culprit would have to start, eat the memory system up, and terminate - even then you'd expect the free lists to show some impact, as I don't believe the reclaim is done immediately at memory free.
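
To make that reading of the free listing concrete: the "-/+ buffers/cache" row is the Mem row with buffers and cache taken out of "used". A rough sketch of the arithmetic, assuming the old procps column order shown in this thread (total, used, free, shared, buffers, cached); the function name app_used is hypothetical:

```shell
#!/bin/sh
# app_used: read `free -m` style output on stdin and print the memory held
# by applications, i.e. used - buffers - cached taken from the "Mem:" row.
# Assumes the old column order: total used free shared buffers cached.
app_used() {
    awk '$1 == "Mem:" { print $3 - $6 - $7 }'
}

# Numbers from the first post: 499 - 28 - 242, prints 229
echo "Mem: 1000 499 501 0 28 242" | app_used
```

The result is 229 against the 228 that free printed; the 1 MB gap is presumably rounding. Either way, applications held less than a quarter of the 1000 MB at that moment.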

Odd - definitely odd.

djortz, are you getting anything in the logs? What about a kernel oops?

foo_bar_foo 10-12-2005 01:52 AM

I was just trying to explain the mechanism.

djortz 10-12-2005 04:58 AM

Thanks for your replies.

"I might be wrong, but I think the priority setting is for the priority order among multiple swap partitions."
I read that in the man page too.

That "something" foo_bar_foo is talking about could be true. I discovered this problem while updating via YOU. After updating, YOU executes some config scripts and the system doesn't like that. While executing the last config (tetex) or shortly after that, the system eats up all the swap (1GB) and all the memory (1GB), causing it to stop responding.
You'd think that that's the problem... BUT
* This doesn't happen when updating in text mode
* This does happen when I begin to use swap in general, though I haven't been able to reproduce that situation.

"Presuming that free listing is just after the system "came back" (without a boot),"
I did restart X (ctrl+alt+backspace), but I think that only sped up the "coming back".

I've checked /var/log/messages but don't see anything unusual. /var/log/warn reports something about my fdd, but that's fine with me.

Could it be possible that my hdd is too slow for some reason and swap can't write fast enough to disk?
I'm not a debugging expert, so tell me what logs or programs you need and I'll look for them.

syg00 10-12-2005 05:53 AM

Quote:

Originally posted by djortz
I discovered this problem while updating via YOU. After updating, YOU executes some config scripts and the system doesn't like that. While executing the last config (tetex) or shortly after that, the system eats up all the swap (1GB) and all the memory (1GB), causing it to stop responding.
What do you base this assertion on - do you have evidence/numbers?
May we see them, please?
Quote:

Could it be possible that my hdd is too slow for some reason and swap can't write fast enough to disk?
Unfortunately most desktop users tend to allocate swap on the same disk as all their data. If you need to read data, and this requires a swapout of a page (or more likely pages), to make room for the (new) data, performance is awful.
No matter how fast the disk is.

Linux doesn't seem to have the metrics to measure and/or tune this.
If you were to see (meaningful) messages, I'd expect them in syslog - other than an oops, which should be self-evident.

djortz 10-12-2005 06:05 AM

Quote:

Originally posted by syg00
What do you base this assertion on - do you have evidence/numbers?
May we see them, please?
I monitored free -m every 2 secs with watch and at the end it said something like:
Code:

             total       used       free     shared    buffers     cached
Mem:          1000        992          8          ?          ?          ?
-/+ buffers/cache:          ?          ?
Swap:         1004       1004          0

If the question marks hold valuable information, let me know, and I'll capture them too.
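
watch only redraws the terminal, so the last readings are lost once the machine hangs. A minimal sketch of writing timestamped samples to a file instead; log_samples and its argument order are invented here for illustration and are not an existing tool:

```shell
#!/bin/sh
# log_samples LOGFILE COUNT INTERVAL COMMAND [ARGS...]
# Run COMMAND COUNT times, INTERVAL seconds apart, appending a timestamp
# and the command output to LOGFILE after every run.
log_samples() {
    logfile=$1 count=$2 interval=$3
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            date '+%Y-%m-%d %H:%M:%S'
            "$@"
        } >> "$logfile" 2>&1
        sleep "$interval"
        i=$((i + 1))
    done
}

# Example (not run here): sample memory every 2 seconds for 10 minutes.
#   log_samples free.log 300 2 free -m &
```

Because every sample is appended as it is taken, whatever was written before the hang can be read after a reboot, including the columns that showed up as question marks above.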

Quote:

Unfortunately most desktop users tend to allocate swap on the same disk as all their data. If you need to read data, and this requires a swapout of a page (or more likely pages), to make room for the (new) data, performance is awful.
No matter how fast the disk is.
So it would be better to use another hdd for swap? Though I don't think it will be the solution to my problem, it is a good thing to keep in mind.

syg00 10-12-2005 06:23 AM

Quote:

Originally posted by djortz
I monitored free -m every 2 secs with watch and at the end it said something like:
Code:

             total       used       free     shared    buffers     cached
Mem:          1000        992          8          ?          ?          ?
-/+ buffers/cache:          ?          ?
Swap:         1004       1004          0

If the question marks hold valuable information, let me know, and I'll capture them too.
Nope, that seems pretty conclusive - that looks like a well and truly screwed up system .... :eek:
Do you have "top" batch numbers for similar times so you can see the memory consumption increase?
Quote:

So it would be better to use another hdd for swap? Though I don't think it will be the solution to my problem, it is a good thing to keep in mind.
Agreed - your options seem to be non-existent.
There is a project under development to provide some controls in this arena, but the minimum kernel level is currently 2.6.13 - soon to be 2.6.14 I suspect. And it is a significant kernel update - so significant the kernel devs have sent it back to be re-written (predominantly) in userspace rather than kernel space.

In your case, I'd suggest opening a bug with Suse, and seeing what they have to say.

djortz 10-12-2005 06:47 AM

I'll reproduce the problem tonight (in about 5h).
I'll log:
-top
-free -m
-anything else?

I'll post this as a bug after I get the information from top and free.

Thanks for your help

syg00 10-12-2005 07:09 AM

No, that should be a good start - if they want anything else, they'll ask for it.
Especially if it's easily reproducible.

djortz 10-12-2005 02:34 PM

How do I save the output of top?
top >> file.log does not work

syg00 10-12-2005 06:45 PM

"Man top" indicated the following might work
Code:

top -b -d 5 -n 3 >> file.log
It did.
The delay (-d, in seconds) and iteration count (-n) are up to you. It's easy enough to send it to the background with "&".
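
Once such a log exists, the biggest memory consumer of each iteration can be pulled out of it. A rough sketch, assuming the default top layout: a header row starting with PID that contains a %MEM column, the command name in the last column, and a blank line after each process table. The name top_mem_hogs is made up:

```shell
#!/bin/sh
# top_mem_hogs FILE...: for each iteration in a `top -b` log, print
# "PID COMMAND %MEM" of the process with the highest %MEM value.
top_mem_hogs() {
    awk '
        function report() {
            if (max >= 0) printf "%s %s %.1f\n", pid, name, max
        }
        # Header row of a process table: find out which column holds %MEM.
        $1 == "PID" {
            if (in_table) report()
            for (i = 1; i <= NF; i++) if ($i == "%MEM") col = i
            in_table = 1; max = -1
            next
        }
        # A blank line ends the table: report the biggest consumer seen.
        in_table && NF == 0 { report(); in_table = 0; next }
        # Process row: remember it if it beats the current maximum.
        in_table && col && $col + 0 > max { max = $col + 0; name = $NF; pid = $1 }
        # The last table in the file may not be followed by a blank line.
        END { if (in_table) report() }
    ' "$@"
}
```

Run as "top_mem_hogs file.log", this gives one line per iteration, which makes a steadily growing process easy to spot.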

foo_bar_foo 10-12-2005 10:56 PM

Another small piece of info:
I think the restart of X could be the reason for the strange free output in the beginning.
X would recreate the mmap that maps the video card memory, and that would most likely show up as free, but I'm not sure.
Also about swap: kswapd is also responsible for simply dropping pages that are already on disk and map to executable code.
That covers X and everything else that is running, so there is not necessarily a problem with disk writes, because that stuff (the really crucial stuff for system response) doesn't get written anyway.

What you have there, I think, is just a utility that gets stuck in a memory allocation loop.
That's a very easy thing to write. I did it once for fun and got exactly the same results you describe.
No problem, just don't use the darn thing anymore after you figure out what it is.

syg00 10-13-2005 02:06 AM

Quote:

Originally posted by foo_bar_foo
... just don't use the darn thing anymore after you figure out what it is.
Reasonable sentiment - 'cept I presume YOU is YaST Online Update.
If so, they probably need to at least be told they may have a problem.


All times are GMT -5.