LinuxQuestions.org - how to limit swapping - prevent processes allocating huge amounts of memory

Hi!

I have the following problem: a (SUSE) Linux machine, which several users use as a login machine, frequently freezes because processes allocate too much memory. Freezing here means that the machine becomes unresponsive for 30 minutes or more while doing extensive hard disk activity (swapping, I suppose), which is intolerable since people need to work on it.
A simple program that just allocates memory and writes to it reproduces the effect.

I have found several threads advising the use of ulimit/limit and /etc/security/limits.conf. The former helps if vmemoryuse is limited, but the latter does not seem to allow this sort of limit, as no such parameter appears in the explanation in the file.

In addition, sessions created through 'login' do not even seem to be affected by the other settings in /etc/security/limits.conf (such as rss); only those created through ssh are.
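
For reference, pam_limits lists the address-space limit under the item name `as` (in KB), which is the limits.conf counterpart of the shell's virtual-memory limit, so the parameter may simply appear under a different name than expected. The limits are applied only to services whose PAM configuration loads pam_limits.so, which could explain why ssh and 'login' sessions behave differently. A sketch, assuming a 500 MB cap for all users; the value and the choice of `*` are illustrative only:

```
# /etc/security/limits.conf
# <domain>  <type>  <item>  <value>
*           hard    as      512000    # per-process address space, in KB (assumed value)

# /etc/pam.d/login must also load the module for the limit to apply there:
# session  required  pam_limits.so
```

The limit is per process, not per user, so several large processes together can still exceed physical RAM.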

So, what to do? How can I globally limit, for every user process, the amount of 'swap' memory used to something like 500 MB?

Please keep in mind that I'm not very knowledgeable about Linux ... but I am the one (currently) causing most of those freezes :(

Thanks!

David

Moved: This thread is more suitable in the Linux General forum (by my taxonomy, "performance" is not necessarily security-related) and has been moved accordingly to help your thread/question get the exposure it deserves.

As for your title "how to limit swapping", swapping is the last way out for the VM, so I don't think you want to restrict that. I would suggest posting what your specific HW specs and machine requirements are since you talk about memory but you show no specs or requirements. For all I know this may be a machine with limited amounts of RAM or one forced to sustain humongous herds of users or one using warped sysctl's. I also suggest running some form of SAR to get a better grip on what resources are used for real. Chances are if you add things up you just need to add RAM (which is always a Good Thing) or the bottleneck isn't RAM but say IO.

Just my thoughts so MMMV(VM).

As unSpawn says, we need more info.
Swapping and high disk activity are generally symptoms rather than the problem itself.

As an aside, various projects are contributing to a common patchset to achieve this (and much more) using userspace limit definitions. Much argument has (maybe) finally led to a common approach, though it is still being developed.

Thank you! The system has 1 GB of RAM, which isn't very much for multi-user use, I admit. But the same happens with my home PC (no network, no multi-user) with 512 MB, although it freezes only for a few minutes (which would, again, be intolerable on the multi-user machine).
I think memory upgrades are not an option for the multi-user machine, for budget reasons and because there are no free slots. There are ~3-6 (different) users logged in to the machine, and it is always short of RAM, but freezes do not happen with common applications.

SAR seems to be some surveillance tool, as I gather from google, but would not prevent freezes. Is that correct?

One of the applications leading to freezes is GNUPLOT, when processing data files with some error that causes the pm3d routines to go mad and allocate huge amounts of memory. If one has top running while GNUPLOT works, and one is lucky (that is, the top refresh happens before the freeze), one can see the amount of memory allocated to GNUPLOT rising to more than 70% or so, until the moment the system becomes unresponsive. I think MATLAB was another piece of software that would cause freezes, but I'm not sure, and it was moved to another machine anyway.

So, is there a possibility to set vmemoryuse to say 500 megs per process (not per user!)?

Thanks!

David

Quote:

Originally Posted by david@linuxquestions

So, is there a possibility to set vmemoryuse to say 500 megs per process (not per user!)?

Thanks!

David

That wouldn't remedy your problem, though. If 6
users decided to have a process with 495MB each
you'd have the very same situation; if a RAM upgrade
is not an option, how about 8GB of swap on solid-state? ;}

Cheers,
Tink

Quote:

Originally Posted by david@linuxquestions

SAR seems to be some surveillance tool, as I gather from google, but would not prevent freezes. Is that correct?

Nope - what is being referred to would be this. Have a look around the site, very good info all round.
As for your problem, I'm surprised the OOM killer isn't being invoked. Have a look at what the sysstat package shows. Short term, I'd be looking to isolate your database and running system libraries from the swap space. Having all that I/O hitting the same drive(s) would be a disaster. Allocate several swap spaces on individual (and separate) drives - make them all the same priority to get some "striping" effect.
I'm guessing Tink's suggestion is a no-go if you can't afford more RAM.
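
For reference, sar ships in the sysstat package and can sample live as well as report collected history. A small sketch, assuming sysstat may or may not be installed (it prints a message if not); `sar_sample` is just a wrapper name made up for this example:

```shell
#!/bin/sh
# sar_sample (hypothetical helper): take a few live samples if sar is available.
sar_sample() {
    if command -v sar >/dev/null 2>&1; then
        sar -r 1 3    # memory utilisation, 3 samples at 1-second intervals
        sar -W 1 3    # pages swapped in and out per second
    else
        echo "sar not found; it is part of the sysstat package"
    fi
}

sar_sample
```

Sustained non-zero pswpin/s and pswpout/s values during a slowdown would point at swapping; values near zero would suggest looking elsewhere, for example at disk I/O.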

Quote:

Originally Posted by syg00

I'm guessing Tink's suggestion is a no-go if you can't afford more RAM.

It was a joke. Limiting RAM usage per process to
any value that, even at one process per user times the
number of users, significantly exceeds physical RAM is
always going to lead to excessive swapping.

The only solution to lack of RAM is RAM. Or remove the
tools that can cause those problems. If you could (would)
set hard limits the folks would be just as frustrated (if not
more) if 20 minutes into processing their graph the tool
borked out because it hit the limit.

Cheers,
Tink

Hmmm, thank you!

1) I suppose that at least at home I'm patient enough for the OOM killer to be invoked (I hope Google told me right this time). On the MU machine we never waited long enough; half an hour was not enough, at least ... It has just one physical disk, too, so multiple disks aren't an option either (and neither is buying one).
While searching about the OOM killer, I found that there seems to be a commit limit, which by default seems to be RAM+swap, which is reasonable. Would lowering this solve the problem? How is it done?

2) limit vmemoryuse 500000

gives roughly the desired behaviour, except that every user would have to enter the command before launching their applications, and apps started directly from KDE seem to be unaffected, which, again, does not help.

3)

Quote:

Originally Posted by Tinkster

...
The only solution to lack of RAM is RAM. Or remove the
tools that can cause those problems. If you could (would)
set hard limits the folks would be just as frustrated (if not
more) if 20 minutes into processing their graph the tool
borked out because it hit the limit.

Since Gnuplot never finishes the graph anyway (the user working locally on the machine usually hits the power button after a few minutes, and I can't blame him), or would probably end up with the OOM killer after a night of swapping or so (never tried that, I was only just introduced to the guy ;) ), no one benefits from such large processes. On other (ancient) machines on the network (running some Unix) it is done like that anyway, but no one seems to know how ...
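
The `limit vmemoryuse` command in point 2 is csh/tcsh syntax; the sh/bash counterpart is `ulimit -v`, which also takes kilobytes. A minimal sketch of capping a single command this way, where `run_limited` is a name invented for this example:

```shell
#!/bin/sh
# run_limited KB CMD [ARGS...]   (hypothetical helper)
# Runs CMD in a subshell whose virtual address space is capped at KB kilobytes.
# The calling shell keeps its own, unchanged limit.
run_limited() {
    limit_kb=$1
    shift
    (
        ulimit -v "$limit_kb" || exit 1   # applies to this subshell and its children
        exec "$@"
    )
}

# Example: the child shell reports the limit it inherited.
run_limited 512000 sh -c 'ulimit -v'
```

Because the limit is inherited by child processes, a `ulimit -v` line in a system-wide login script such as /etc/profile would likely cover login shells, although programs started from the desktop may not pass through that script.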

If it were me I might be looking to "fiddle" the vm sysctls. This will no doubt invoke the OOM killer though, so you'd better be prepared for your highest page user to be sacrificed.
And you get no say in the decision making. Have a look at the documentation in the kernel source tree.
Maybe try the following as root - all care, no responsibility; you know the deal.

Code:

echo 2 > /proc/sys/vm/overcommit_memory

Presumes no (big) memory consumer that you want to keep running at all costs. If it all goes to hell, just set it back to zero, or reboot ...
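
Before changing the setting, it may help to see what strict accounting would allow. According to the kernel's overcommit-accounting documentation, in mode 2 the commit limit is swap plus `overcommit_ratio` percent of RAM (50 by default), and allocations beyond it are refused with an error. The sketch below only reads values; `commit_estimate` is a name made up for this example, and it assumes `vm.overcommit_kbytes` is 0 and no huge pages are reserved:

```shell
#!/bin/sh
# Read-only look at the overcommit settings; nothing here changes the system.

# commit_estimate (hypothetical helper): swap + RAM * overcommit_ratio / 100, in kB.
commit_estimate() {
    ratio=$(cat /proc/sys/vm/overcommit_ratio)
    awk -v ratio="$ratio" '
        /^MemTotal:/  { ram = $2 }
        /^SwapTotal:/ { swap = $2 }
        END { printf "%.0f kB\n", swap + ram * ratio / 100 }
    ' /proc/meminfo
}

echo "overcommit_memory: $(cat /proc/sys/vm/overcommit_memory)"   # 0, 1 or 2
echo "estimated limit:   $(commit_estimate)"
# What the kernel itself reports, and how much is committed right now:
grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo
```

If Committed_AS is already close to the estimate, switching to mode 2 would probably make new allocations fail straight away, so it seems worth checking on a loaded system first.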

Maybe it's just me, but I can't help thinking this may be a symptom of a completely different issue. 1 GB of RAM covers things pretty well for most users in most cases, but of course it depends on how much bloat there is: are a bunch of extraneous daemons and services running, and are the programs in use bloated, from the WM/DE down to the applications? What I really wonder is whether something else is going on, because a system that sat there looking like it was thrashing is what I used to see on my own machine, and the cause was something else entirely. In my case I was getting 'dma_timer_expiry' errors whenever there was a decent amount of activity on one particular HD. I never bothered to narrow down whether it was the HD itself or the system not liking that particular model of HD sharing an IDE channel with a particular model of CD-ROM on my particular northbridge chipset. There are many occurrences of systems with completely different hardware from mine that have run into this mysterious problem. I will probably narrow it down to the definite culprit one day, but right now I don't care to bother since I don't even use that third HD anymore.

Anyway, I would suggest checking your system logs to see if everything is fine on that front. It is worth a look before you start making tweaks that address things from the wrong angle and possibly create a different set of issues, or end up working against performance because the cause wasn't what you assumed. Also, when you get this thrashing, did you try looking at what is going on with memory by running 'free' and looking at /proc/swaps, etc.? If you can't because the system is unresponsive during this thrashing-like behavior, that seems rather odd for Linux, since it would mean there is an insane amount of load on the system. It generally takes a lot to do this.

For instance, I run GNOME and lots of notoriously bloat-heavy apps like Firefox and OOo all at the same time on my system with only 128 MB of RAM (sometimes in the background while my girlfriends log into a new GNOME session on top of all that, and sometimes even with something compiling through all of this), and that never happens unless some process has gone super haywire. For instance, I learned one day that it was a very bad idea to use aoss with the version of gaim I'm using, since it ended up spawning a new gaim process every time a sound played. This ate up all the memory and CPU resources in a short amount of time and made the system act just like what you're describing (thrashing like mad and being extremely sluggish in response). I caught it so late that killing off the processes led to disk thrashing hell; I got fed up and brought the system down hard with my fingers crossed, since it seemed like it wasn't going to pull out of it until maybe the next day.

Anyway, what I'm trying to get at is that I would suggest some more analysis to see whether things are happening the way you think they are. Once you have concluded, without a doubt, what the cause is, proceed to making the necessary changes.
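
One way to do that analysis without needing a responsive terminal at the moment of the freeze is to record memory and swap figures beforehand. A rough sketch that reads only /proc; `mem_snapshot` is a name made up for this example:

```shell
#!/bin/sh
# mem_snapshot (hypothetical helper): print a timestamped view of memory and swap.
mem_snapshot() {
    date '+%Y-%m-%d %H:%M:%S'
    # Totals and free amounts, in kB
    grep -E '^(MemTotal|MemFree|SwapTotal|SwapFree):' /proc/meminfo
    # Cumulative pages swapped in and out since boot
    grep -E '^(pswpin|pswpout) ' /proc/vmstat
    # Configured swap areas and how much of each is in use
    cat /proc/swaps 2>/dev/null
}

mem_snapshot
```

Appending the output to a log every few seconds, for example `while :; do mem_snapshot >> /var/tmp/mem.log; sleep 10; done` (the log path is only an example), should leave a record of the last readings before the machine stopped responding.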

Just my two cents on this. The bottom line is that it's your system, so it's all your call. I would just suggest not ruling out other causes that don't seem like they could be in play at first thought. Symptoms sometimes have a way of manifesting themselves in weird ways due to other variables coming into play and influencing things.