maximum committed memory?
Hello,
I wrote a gcc program and I am running it on a unix server that has 8 cpus. The program is a simulation and I have to run it for different sets of input parameters. I run the program 6 times simultaneously with 6 different sets of input parameters. When I do this the programs crash. However when I run only 3 of them simultaneously, and then run the other 3, they all finish successfully.

Below is the output of the top command when only one of my programs is running.

Code:
top - 17:58:23 up 8 days, 2:45, 1 user, load average: 1.00, 1.00, 1.00
Tasks: 120 total, 2 running, 118 sleeping, 0 stopped, 0 zombie
Cpu(s): 36.6%us, 0.1%sy, 0.0%ni, 63.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 10267424k total, 2039184k used, 8228240k free, 153588k buffers
Swap: 2963952k total, 35404k used, 2928548k free, 104768k cached
  PID USER      PR  NI   VIRT   RES  SHR S %CPU %MEM      TIME+ COMMAND
23308 sidera    20   0  8809m  1.5g  720 R  101 15.8  592:06.37 fir2p
    1 root      20   0  10312   200  168 S    0  0.0    0:05.50 init
    2 root      15  -5      0     0    0 S    0  0.0    0:00.00 kthreadd
    3 root      RT  -5      0     0    0 S    0  0.0    0:20.02 migration/0
    4 root      15  -5      0     0    0 S    0  0.0    0:00.20 ksoftirqd/0
    5 root      RT  -5      0     0    0 S    0  0.0    0:08.72 watchdog/0
    6 root      RT  -5      0     0    0 S    0  0.0    0:08.64 migration/1
    7 root      15  -5      0     0    0 S    0  0.0    0:00.00 ksoftirqd/1
    8 root      RT  -5      0     0    0 S    0  0.0    0:00.72 watchdog/1
    9 root      RT  -5      0     0    0 S    0  0.0    0:09.16 migration/2
   10 root      15  -5      0     0    0 S    0  0.0    0:00.00 ksoftirqd/2
   11 root      RT  -5      0     0    0 S    0  0.0    0:00.76 watchdog/2
   12 root      RT  -5      0     0    0 S    0  0.0    0:03.08 migration/3
   13 root      15  -5      0     0    0 S    0  0.0    0:00.08 ksoftirqd/3
   14 root      RT  -5      0     0    0 S    0  0.0    0:00.80 watchdog/3
   15 root      RT  -5      0     0    0 S    0  0.0    0:00.06 migration/4
   16 root      15  -5      0     0    0 S    0  0.0    0:00.00 ksoftirqd/4

I was told that the committed memory is the memory my program has asked for. The committed memory may be used or not. At this time my program has approximately 8809MB committed memory and approximately 1.5GB used committed memory. My programs will crash if they try to use more memory than the RAM+swap the system has. They will also crash if they try to commit a very large amount of memory.

Can you tell me how much memory my programs are allowed to commit?

Thanks,
Anna
Anyone else joining this discussion may want to know that it moved here from a GCC mailing list (which was very much the wrong place for it). For background info, here is a link to an interesting point in the middle of that previous discussion:
http://gcc.gnu.org/ml/gcc-help/2010-03/msg00069.html

To get some basic info on the current state, post the output from the following commands:

Code:
grep Commit /proc/meminfo

Code:
CommitLimit: 32213000 kB

The Committed_AS value is the (small at the moment) total amount committed for all the processes on this system.

The vm.overcommit_memory is the mode Linux is in for deciding how much to over commit memory (allow the Committed_AS value to exceed the CommitLimit). 0 is the default mode and the most complicated. I did a few searches on it just now for you and I don't quite understand its rules nor what else affects them.

One thing you could do is switch to root and give the command

Code:
/sbin/sysctl vm.overcommit_memory=1

Mode 1 tells the kernel to always allow the over commit.

In the previous discussion, you never answered my question about the top output showing one of your processes with about 1.5GB actually used. Do you have good reason to believe that top was run when that process was near its maximum memory use?

Based on the info you have provided, I think it is very likely your attempt to run six of those crashed due to the over commit and not due to actual memory use. As I described above, you could completely stop the over commit crash from happening. But that does not guarantee your six processes will run correctly to completion. Maybe they use more than 1.5GB each sometime later in their processing.

If the six processes use more than your 10GB of physical ram, they might slow down due to swapping, but probably not. Most likely some significant fraction of their memory use at any moment is stale and won't noticeably affect performance if it is in the swap partition.

But if the six processes use more than the 10GB of ram plus the 3GB of swap, the out of memory killer will kill something. As I said in the other discussion, 3GB of swap is not enough margin for error. I suggest significantly increasing the swap size.
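One way to answer the question of whether top caught the process near its maximum memory use is to read the peak values the kernel keeps per process. The following is only a sketch: it assumes a Linux kernel that reports the VmPeak (peak virtual size) and VmHWM (peak resident size) fields in /proc/<pid>/status, the function name peak_memory_kb is made up for illustration, and the sample values are invented rather than taken from the real process.

```shell
#!/bin/sh
# Print the peak virtual size (VmPeak) and peak resident size (VmHWM), in kB,
# from text in the format of /proc/<pid>/status read on standard input.
# peak_memory_kb is a hypothetical helper name, not a standard tool.
peak_memory_kb() {
    awk '/^VmPeak:/ { peak = $2 }
         /^VmHWM:/  { hwm = $2 }
         END { printf "VmPeak=%s VmHWM=%s\n", peak, hwm }'
}

# Sample input with made-up values, so the sketch runs anywhere.
sample='Name:   fir2p
VmPeak:  9020416 kB
VmSize:  9020416 kB
VmHWM:   1601536 kB
VmRSS:   1572864 kB'

printf '%s\n' "$sample" | peak_memory_kb
```

On a live system the same function could be fed the real file, for example `peak_memory_kb < /proc/23308/status`, where 23308 is the PID shown by top. If VmHWM stays close to the RES column over a long run, the 1.5GB figure is probably representative.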
I stopped the program I was running when I wrote the previous messages. However I am now running very similar programs.

Now I have the following outputs:

Code:
sidera@olympus:~$ top
top - 18:04:07 up 10 days, 2:51, 1 user, load average: 4.00, 4.00, 4.00
Tasks: 122 total, 5 running, 117 sleeping, 0 stopped, 0 zombie
Cpu(s): 39.1%us, 0.0%sy, 0.0%ni, 60.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 10267424k total, 7700440k used, 2566984k free, 154936k buffers
Swap: 2963952k total, 34776k used, 2929176k free, 105076k cached
  PID USER      PR  NI   VIRT   RES  SHR S %CPU %MEM      TIME+ COMMAND
23531 sidera    20   0  8809m  1.7g  716 R  103 17.2    2835:24 saw
23529 sidera    20   0  8809m  1.6g  716 R   99 16.3    2835:45 saw
23532 sidera    20   0  8809m  1.8g  712 R   99 18.1    2833:51 saw
23533 sidera    20   0  8809m  1.9g  712 R   99 18.9    2834:49 saw
    1 root      20   0  10312   200  168 S    0  0.0    0:12.82 init
    2 root      15  -5      0     0    0 S    0  0.0    0:00.00 kthreadd
    3 root      RT  -5      0     0    0 S    0  0.0    0:21.26 migration/0
    4 root      15  -5      0     0    0 S    0  0.0    0:00.44 ksoftirqd/0
    5 root      RT  -5      0     0    0 S    0  0.0    0:10.48 watchdog/0
    6 root      RT  -5      0     0    0 S    0  0.0    0:11.66 migration/1
    7 root      15  -5      0     0    0 S    0  0.0    0:00.00 ksoftirqd/1
    8 root      RT  -5      0     0    0 S    0  0.0    0:00.92 watchdog/1
    9 root      RT  -5      0     0    0 S    0  0.0    0:11.08 migration/2
   10 root      15  -5      0     0    0 S    0  0.0    0:00.00 ksoftirqd/2
   11 root      RT  -5      0     0    0 S    0  0.0    0:01.00 watchdog/2
   12 root      RT  -5      0     0    0 S    0  0.0    0:03.12 migration/3
   13 root      15  -5      0     0    0 S    0  0.0    0:00.12 ksoftirqd/3
   14 root      RT  -5      0     0    0 S    0  0.0    0:01.08 watchdog/3
   15 root      RT  -5      0     0    0 S    0  0.0    0:00.06 migration/4
   16 root      15  -5      0     0    0 S    0  0.0    0:00.00 ksoftirqd/4
   17 root      RT  -5      0     0    0 S    0  0.0    0:01.06 watchdog/4
   18 root      RT  -5      0     0    0 S    0  0.0    0:00.00 migration/5
   19 root      15  -5      0     0    0 S    0  0.0    0:00.26 ksoftirqd/5
   20 root      RT  -5      0     0    0 S    0  0.0    0:01.00 watchdog/5
   21 root      RT  -5      0     0    0 S    0  0.0    0:00.00 migration/6
   22 root      15  -5      0     0    0 S    0  0.0    0:00.00 ksoftirqd/6

Code:
sidera@olympus:~$ grep Commit /proc/meminfo
CommitLimit: 8097664 kB
Committed_AS: 36143244 kB

Can you explain the following command? What does the 'vm' do?

Code:
/sbin/sysctl vm | grep commit

I believe that the memory the simulations are using will not change. After running for some time they reach a steady state. However I have to run them for different input parameters. If I change the input parameters the used memory will change.

Thanks,
Anna
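As a cross-check on the numbers above: the kernel documentation describes CommitLimit as swap plus a percentage of RAM, where the percentage is vm.overcommit_ratio, and says the limit is only enforced in overcommit mode 2. The sketch below assumes the ratio is at its default of 50 and that no huge pages are reserved (neither is shown in the thread); the function names are invented for illustration.

```shell
#!/bin/sh
# Expected CommitLimit in kB, assuming no huge pages are reserved:
#   swap + RAM * overcommit_ratio / 100
# commit_limit_kb and over_limit_kb are hypothetical helper names.
commit_limit_kb() {
    ram_kb=$1
    swap_kb=$2
    ratio=$3
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# How far Committed_AS is above CommitLimit, in kB (negative means headroom).
over_limit_kb() {
    committed_as_kb=$1
    limit_kb=$2
    echo $(( committed_as_kb - limit_kb ))
}

# Mem and Swap totals from the top output; ratio assumed to be the default 50.
commit_limit_kb 10267424 2963952 50
# Committed_AS and CommitLimit from the grep output.
over_limit_kb 36143244 8097664
```

The first call prints 8097664, which matches the reported CommitLimit, so the default ratio looks like a fair assumption here. The second prints 28045580: Committed_AS is already about 28GB above the limit while four processes keep running, which is what one would expect if the system is in mode 0 and the limit is not being enforced as a hard cap.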
Quote:
I'm far from sure I understand that correctly, but your large value for Committed_AS supports that view.

In your situation, if you don't modify vm.overcommit_memory you would need the excess (unused) swap space to be at least as large as the largest single task's excess (committed but unused) memory.

The processes you are now describing seem to be actually using 1.9GB and committing nearly 8.9GB. If you had six of those, you might expect them to use about 9.5GB of ram plus 1.9GB of swap. Then you would need a minimum of 7GB of extra swap space to satisfy the commit limit heuristic rules (as I think I might understand them).

So increasing swap space from the current 2.9GB up to 9GB or more ought to let you run those six copies of the current task, and probably they would swap little enough that running six at a time would be faster than running fewer at a time more times.

As I said before, I think changing vm.overcommit_memory to 1 would also let you run six at once without needing you to increase swap space. But that plan has far less margin for error if the processes are a little bigger than you think.

Quote:

On the system where I tried it before suggesting it, it listed the values of every parameter (there were only two) whose names started with vm. and included commit. But on the system where I am now, it just didn't work.

I was hoping to see any vm.*commit* parameters your system might have without needing to learn sysctl well enough to ask for a full list.

The one that really matters is vm.overcommit_memory. You can get its value with

Code:
/sbin/sysctl vm.overcommit_memory

and after switching to be root you can change its value with

Code:
/sbin/sysctl vm.overcommit_memory=1

Seeing if there were other vm.*commit* parameters was just a matter of curiosity. I don't know how much that might vary by kernel or distribution, but I have no specific reason to expect any relevant info there.
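The swap estimate in this reply can be written out as a small calculation. This only restates the poster's own rule of thumb (swap must hold whatever does not fit in ram, plus the largest single task's committed but unused memory); it is not a description of what the kernel actually does in mode 0. The function name swap_estimate_gb is made up, and the 9.5GB figure for ram available to the six processes is the reply's own estimate.

```shell
#!/bin/sh
# Rough swap estimate in GB under the rule of thumb from the reply:
#   spill  = memory actually used that does not fit in ram
#   excess = committed-but-unused memory of the largest single task
# swap_estimate_gb is a hypothetical helper name.
swap_estimate_gb() {
    awk -v n="$1" -v used="$2" -v committed="$3" -v ram="$4" 'BEGIN {
        spill = n * used - ram
        if (spill < 0) spill = 0
        excess = committed - used
        printf "%.1f\n", spill + excess
    }'
}

# Six processes, 1.9GB used and 8.9GB committed each, about 9.5GB of usable ram.
swap_estimate_gb 6 1.9 8.9 9.5
```

With these inputs the result is 8.9, in line with the suggestion to grow swap to 9GB or more. Changing the first argument shows how the estimate moves if fewer copies are run at once.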