LinuxQuestions.org > Linux - Kernel
This forum is for all discussion relating to the Linux kernel.
#1 - 03-20-2009, 09:17 AM - Ralfredo (LQ Newbie)

Help me understand why oom-killer kicks in


Hi,

I have some kind of memory-related problem that invokes the
oom-killer. I hope that some kernel or memory-management guru has
some ideas about what the cause could be and what I can do to solve the
problem!

The box is running Debian/Lenny 32-bit, which is now also
Debian/Stable. The CPU is "CPU0: Intel(R) Core(TM)2 Quad CPU Q9550 @
2.83GHz stepping 07" and the box has 8 GB of memory. The kernel is a
standard Debian "Linux big 2.6.26-1-686-bigmem #1 SMP Sat Jan 10
19:13:22 UTC 2009 i686 GNU/Linux". All software is up to date.

I have more or less daily problems with memory management, resulting
in the oom-killer kicking in and killing some processes.

The box is very lightly loaded and the only memory-hungry application
is VirtualBox. As I understand it, there are always huge amounts of free
memory, and I can't understand why the kernel doesn't agree with me...

When the oom-killer does its job I get really extensive information
in /var/log/messages. I hope that someone more capable than me can
help me analyze the log and maybe help me understand what's going
on and what I can do to resolve the problem.

So far I have tried some memory-management tuning by adding the
following lines to /etc/sysctl.conf:

vm.min_free_kbytes = 16384
vm.lowmem_reserve_ratio = "128 32 32"

I can't really say whether that has made any improvement, and since
I don't really know what I'm doing, I'm probably on the wrong track...
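For reference, this is how such settings can be applied and checked at runtime; `sysctl -p` and the /proc/sys/vm paths are standard on 2.6 kernels (run as root):

```shell
# Reload everything in /etc/sysctl.conf without rebooting.
sysctl -p /etc/sysctl.conf

# Verify the values the kernel is actually using.
cat /proc/sys/vm/min_free_kbytes
cat /proc/sys/vm/lowmem_reserve_ratio
```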

I'm attaching the relevant part of /var/log/messages from one occasion
when the oom-killer was invoked.

Please let me know if you need any more information. I'd be very
grateful to anyone who could help me resolve this problem!

The complete log is here: http://pastebin.com/m6fab85c6 and the most important (I think) parts are below:

Code:
Mar 20 00:18:56 big -- MARK --
Mar 20 00:38:56 big -- MARK --
Mar 20 00:58:56 big -- MARK --
Mar 20 01:17:34 big kernel: [1591579.037846] gkrellm invoked oom-killer: gfp_mask=0x800d0, order=0, oomkilladj=0
Mar 20 01:17:34 big kernel: [1591579.037852] Pid: 4514, comm: gkrellm Tainted: P          2.6.26-1-686-bigmem #1
Mar 20 01:17:34 big kernel: [1591579.037871]  [<c015fe1e>] oom_kill_process+0x4f/0x195
Mar 20 01:17:34 big kernel: [1591579.037887]  [<c0160248>] out_of_memory+0x14e/0x17f
Mar 20 01:17:34 big kernel: [1591579.037900]  [<c01621aa>] __alloc_pages_internal+0x2b8/0x34e
Mar 20 01:17:34 big kernel: [1591579.037911]  [<c01aeb39>] proc_file_read+0x0/0x1ff
Mar 20 01:17:34 big kernel: [1591579.037916]  [<c016224c>] __alloc_pages+0x7/0x9
Mar 20 01:17:34 big kernel: [1591579.037921]  [<c016225d>] __get_free_pages+0xf/0x1b
Mar 20 01:17:34 big kernel: [1591579.037925]  [<c01aebad>] proc_file_read+0x74/0x1ff
Mar 20 01:17:34 big kernel: [1591579.037935]  [<c01aeb39>] proc_file_read+0x0/0x1ff
Mar 20 01:17:34 big kernel: [1591579.037940]  [<c01ab436>] proc_reg_read+0x58/0x6b
Mar 20 01:17:34 big kernel: [1591579.037948]  [<c01ab3de>] proc_reg_read+0x0/0x6b
Mar 20 01:17:35 big kernel: [1591579.037952]  [<c017e88e>] vfs_read+0x81/0x11e
Mar 20 01:17:35 big kernel: [1591579.037960]  [<c017ecdf>] sys_read+0x3c/0x63
Mar 20 01:17:35 big kernel: [1591579.037968]  [<c0108853>] sysenter_past_esp+0x78/0xb1
Mar 20 01:17:35 big kernel: [1591579.037984]  =======================
Mar 20 01:17:35 big kernel: [1591579.037986] Mem-info:
Mar 20 01:17:35 big kernel: [1591579.037987] DMA per-cpu:
Mar 20 01:17:35 big kernel: [1591579.037989] CPU    0: hi:    0, btch:   1 usd:   0
Mar 20 01:17:35 big kernel: [1591579.037991] CPU    1: hi:    0, btch:   1 usd:   0
Mar 20 01:17:35 big kernel: [1591579.037993] CPU    2: hi:    0, btch:   1 usd:   0
Mar 20 01:17:35 big kernel: [1591579.037994] CPU    3: hi:    0, btch:   1 usd:   0
Mar 20 01:17:35 big kernel: [1591579.037996] Normal per-cpu:
Mar 20 01:17:35 big kernel: [1591579.037998] CPU    0: hi:  186, btch:  31 usd: 151
Mar 20 01:17:35 big kernel: [1591579.038000] CPU    1: hi:  186, btch:  31 usd: 117
Mar 20 01:17:35 big kernel: [1591579.038002] CPU    2: hi:  186, btch:  31 usd: 166
Mar 20 01:17:35 big kernel: [1591579.038003] CPU    3: hi:  186, btch:  31 usd: 172
Mar 20 01:17:35 big kernel: [1591579.038005] HighMem per-cpu:
Mar 20 01:17:35 big kernel: [1591579.038007] CPU    0: hi:  186, btch:  31 usd:  49
Mar 20 01:17:35 big kernel: [1591579.038009] CPU    1: hi:  186, btch:  31 usd: 140
Mar 20 01:17:35 big kernel: [1591579.038010] CPU    2: hi:  186, btch:  31 usd:  26
Mar 20 01:17:35 big kernel: [1591579.038012] CPU    3: hi:  186, btch:  31 usd: 163
Mar 20 01:17:35 big kernel: [1591579.038015] Active:585080 inactive:487333 dirty:17 writeback:0 unstable:0
Mar 20 01:17:35 big kernel: [1591579.038017]  free:794249 slab:186448 mapped:32269 pagetables:2309 bounce:0
Mar 20 01:17:35 big kernel: [1591579.038020] DMA free:7592kB min:584kB low:728kB high:876kB active:0kB inactive:0kB present:16256kB pages_scanned:0 all_unreclaimable? no
Mar 20 01:17:35 big kernel: [1591579.038023] lowmem_reserve[]: 0 1746 17748 17748
Mar 20 01:17:35 big kernel: [1591579.038028] Normal free:28260kB min:32180kB low:40224kB high:48268kB active:14124kB inactive:14044kB present:894080kB pages_scanned:53611 all_unreclaimable? no
Mar 20 01:17:35 big kernel: [1591579.038031] lowmem_reserve[]: 0 0 64008 64008
Mar 20 01:17:35 big kernel: [1591579.038035] HighMem free:3141144kB min:512kB low:74240kB high:147968kB active:2326196kB inactive:1935288kB present:8193024kB pages_scanned:0 all_unreclaimable? no
Mar 20 01:17:35 big kernel: [1591579.038038] lowmem_reserve[]: 0 0 0 0
Mar 20 01:17:35 big kernel: [1591579.038042] DMA: 84*4kB 52*8kB 31*16kB 18*32kB 16*64kB 9*128kB 2*256kB 2*512kB 0*1024kB 1*2048kB 0*4096kB = 7584kB
Mar 20 01:17:35 big kernel: [1591579.038050] Normal: 1695*4kB 3*8kB 2*16kB 0*32kB 1*64kB 36*128kB 17*256kB 6*512kB 3*1024kB 1*2048kB 1*4096kB = 28148kB
Mar 20 01:17:35 big kernel: [1591579.038058] HighMem: 41687*4kB 10581*8kB 881*16kB 17144*32kB 13508*64kB 6787*128kB 1470*256kB 197*512kB 12*1024kB 9*2048kB 21*4096kB = 3141268kB
Mar 20 01:17:35 big kernel: [1591579.038067] 633828 total pagecache pages
Mar 20 01:17:35 big kernel: [1591579.038069] Swap cache: add 37, delete 35, find 0/1
Mar 20 01:17:35 big kernel: [1591579.038071] Free swap  = 8008256kB
Mar 20 01:17:35 big kernel: [1591579.038073] Total swap = 8008392kB
Mar 20 01:17:35 big kernel: [1591579.081640] 2293760 pages of RAM
Mar 20 01:17:35 big kernel: [1591579.081640] 2064384 pages of HIGHMEM
Mar 20 01:17:35 big kernel: [1591579.081640] 228611 reserved pages
Mar 20 01:17:35 big kernel: [1591579.081640] 509557 pages shared
Mar 20 01:17:35 big kernel: [1591579.081640] 2 pages swap cached
Mar 20 01:17:35 big kernel: [1591579.081640] 17 pages dirty
Mar 20 01:17:35 big kernel: [1591579.081640] 0 pages writeback
Mar 20 01:17:35 big kernel: [1591579.081640] 32269 pages mapped
Mar 20 01:17:35 big kernel: [1591579.081640] 186473 pages slab
Mar 20 01:17:35 big kernel: [1591579.081640] 2309 pages pagetables

Last edited by Ralfredo; 03-20-2009 at 09:29 AM. Reason: Added pastebin URL
 
#2 - 03-21-2009, 07:30 PM - TBC Cosmo (Member)
How much memory have you allocated to guests in VirtualBox (during these OOM kills)? What are you looking at that makes you think there are huge amounts of free memory?

Last edited by TBC Cosmo; 03-21-2009 at 07:32 PM.
 
#3 - 03-22-2009, 09:37 AM - Ralfredo (original poster)
Quote:
Originally Posted by TBC Cosmo View Post
How much memory have you allocated to guests in VirtualBox (during these OOM kills)? What are you looking at that makes you think there are huge amounts of free memory?
Hi, thanks for your interest in trying to help me!

I have 1 GB allocated to the VM. The box has 8 GB of memory and around 1 GB is used by other processes, so I should have around 6 GB to play with. I also normally have no problem starting a couple of additional VMs. But sometimes the oom-killer kicks in.

The info below was taken while one VM with 1 GB was running. No memory tuning has been added to /etc/sysctl.conf; only default values apply.

Code:
$ free -lm
             total       used       free     shared    buffers     cached
Mem:          8116       4939       3177          0        275       2381
Low:           821        812          8
High:         7295       4126       3169
-/+ buffers/cache:       2282       5834
Swap:         7820          0       7820
Code:
$ sar -r 1 4
Linux 2.6.26-1-686-bigmem (big)         03/22/2009      _i686_

03:31:50 PM kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad
03:31:51 PM   3251900   5059868     60.88    281872   2438784   8008264       128      0.00         0
03:31:52 PM   3251900   5059868     60.88    281872   2438784   8008264       128      0.00         0
03:31:53 PM   3250536   5061232     60.89    281884   2438776   8008264       128      0.00         0
03:31:54 PM   3250536   5061232     60.89    281884   2438784   8008264       128      0.00         0
Average:      3251218   5060550     60.88    281878   2438782   8008264       128      0.00         0
Code:
top - 15:32:16 up 1 day,  3:00, 10 users,  load average: 0.13, 0.19, 0.19
Tasks: 215 total,   1 running, 214 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.5%us,  0.3%sy,  0.0%ni, 98.1%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8311768k total,  5062396k used,  3249372k free,   282036k buffers
Swap:  8008392k total,      128k used,  8008264k free,  2438916k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
13341 gb        20   0 1141m 1.0g  19m S    1 13.1  17:57.31 VirtualBox         
 3439 root      20   0  635m 349m  21m S    3  4.3  41:59.78 Xorg               
13412 gb        20   0  199m  85m  21m S    1  1.0   3:40.71 firefox-bin        
24455 gb        20   0  177m  57m  28m S    1  0.7  10:34.35 amarokapp          
 3998 gb        20   0 53148  26m  13m S    0  0.3   0:09.49 mono               
 4839 gb        20   0 56668  22m  14m S    0  0.3   0:11.94 VirtualBox         
 4054 gb        20   0 28720  21m 5264 S    0  0.3   1:57.83 emacs22            
 3981 gb        20   0 38924  21m  12m S    1  0.3   2:47.30 gnome-panel
Code:
~$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  0    128 3254676 281188 2438948    0    0    46    62   30  105  3  3 94  1
 0  0    128 3254644 281188 2438948    0    0     0     0  163 1318  0  0 100  0
 0  0    128 3253440 281188 2438948    0    0     0     0  156 1299  1  0 99  0
 0  0    128 3253536 281196 2438944    0    0     0    24  164 1339  0  0 100  0
 0  0    128 3253452 281200 2438948    0    0     0    64  158 1661  0  0 99  0
 
#4 - 03-22-2009, 10:56 AM - TBC Cosmo (Member)
I don't immediately have an answer. From some quick poking around yesterday, I saw that this has happened because of a bug in a driver for one user's hardware. So when I have a few minutes I want to keep looking and hopefully will have something helpful to add. There are some other threads about this here, too (in "similar threads").
 
#5 - 03-22-2009, 09:38 PM - syg00 (LQ Veteran)
Looks like a classic "lowmem" exhaustion - the simplest solution is to run a 64-bit system. 32-bit Linux wasn't designed for large-memory systems; 64-bit linear addressing works much better.
On 32-bit, having lots of free memory above 1 GB is no help if the allocations insist on being below it. You are attacking the symptom, not the problem.
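The numbers in the OOM report in post #1 bear this out; a quick sanity check with values copied from that Mem-info dump (a reader's arithmetic, not from the thread):

```shell
# From the OOM report: "Normal free:28260kB min:32180kB" and
# "HighMem free:3141144kB".  An order-0 kernel allocation must come
# from DMA/Normal, so HighMem's free pages don't help once the Normal
# zone drops below its min watermark.
normal_free_kb=28260
normal_min_kb=32180
highmem_free_kb=3141144
if [ "$normal_free_kb" -lt "$normal_min_kb" ]; then
    echo "Normal zone is $(( normal_min_kb - normal_free_kb ))kB below min despite ${highmem_free_kb}kB free in HighMem"
fi
```

With those values the Normal zone was 3920 kB below its watermark at the moment of the kill, which is exactly the situation that makes the kernel reach for the OOM killer.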
 
#6 - 03-23-2009, 06:13 AM - Ralfredo (original poster)
Quote:
Originally Posted by syg00 View Post
Looks like a classic "lowmem" exhaustion - the simplest solution is to run a 64-bit system. 32-bit Linux wasn't designed for large-memory systems; 64-bit linear addressing works much better.
Seems like 64-bit is the right solution, but for various reasons I'd rather avoid that at the moment.

Does anyone have any idea whether "CONFIG_HIGHPTE=y" (a kernel config parameter) might help? I found some documentation saying "The VM uses one page of memory for each page table. For systems with a lot of RAM, this can be wasteful of precious low memory. Setting this option will put user-space page tables in high memory."
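For what it's worth, whether the stock kernel was already built with that option can be checked without recompiling anything; Debian ships the build configuration in /boot (a diagnostic sketch, assuming the standard config-* naming):

```shell
# Inspect the running kernel's build configuration for HIGHPTE.
grep CONFIG_HIGHPTE /boot/config-"$(uname -r)"
```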
 
#7 - 03-23-2009, 06:55 AM - syg00 (LQ Veteran)
For that discussion 8 GB is not a lot of memory - have a look at /proc/meminfo to see how much your page tables are occupying.
 
#8 - 03-23-2009, 07:24 AM - Ralfredo (original poster)
Quote:
Originally Posted by syg00 View Post
For that discussion 8 GB is not a lot of memory
I don't really follow you here.

Quote:
Originally Posted by syg00 View Post
have a look at /proc/meminfo to see how much your page tables are occupying.
The contents of /proc/meminfo are below. I'm really not capable of interpreting those figures into something usable.

Code:
$ cat /proc/meminfo
MemTotal:      8311768 kB
MemFree:       5831424 kB
Buffers:        468196 kB
Cached:         802088 kB
SwapCached:      18020 kB
Active:        1837964 kB
Inactive:       265056 kB
HighTotal:     7470528 kB
HighFree:      5819352 kB
LowTotal:       841240 kB
LowFree:         12072 kB
SwapTotal:     8008392 kB
SwapFree:      7984620 kB
Dirty:            7208 kB
Writeback:           0 kB
AnonPages:      829996 kB
Mapped:         118324 kB
Slab:           332648 kB
SReclaimable:   315240 kB
SUnreclaim:      17408 kB
PageTables:       6872 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
WritebackTmp:        0 kB
CommitLimit:  12164276 kB
Committed_AS:  1815112 kB
VmallocTotal:   116728 kB
VmallocUsed:     48320 kB
VmallocChunk:    60404 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
HugePages_Surp:      0
Hugepagesize:     2048 kB
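One thing those figures do show (an observation from the numbers above, not something raised in the thread): on i686 the slab caches are allocated from lowmem, and Slab dwarfs PageTables here:

```shell
# From the meminfo above: Slab is 332648 kB against a LowTotal of only
# 841240 kB, so the slab caches occupy a large share of the low zone.
# SReclaimable (315240 kB) says most of that is reclaimable cache
# (dentries/inodes), not leaked memory.
slab_kb=332648
lowtotal_kb=841240
echo "slab uses $(( slab_kb * 100 / lowtotal_kb ))% of lowmem"
```

With those values this prints "slab uses 39% of lowmem", versus well under 1% for the 6872 kB of page tables.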
 
#9 - 03-23-2009, 07:36 AM - syg00 (LQ Veteran)
This is the entry of interest
Code:
PageTables:       6872 kB
Not worth wasting your time trying to reclaim less than 7 Meg.
 
#10 - 03-23-2009, 08:30 AM - Ralfredo (original poster)
Quote:
Originally Posted by syg00 View Post
This is the entry of interest
Code:
PageTables:       6872 kB
Not worth wasting your time trying to reclaim less than 7 Meg.
Now I understand what you mean, and no, 7 MB isn't much to save. The only reason I mentioned CONFIG_HIGHPTE is that I found that recommendation in a thread where someone had problems similar to mine.

Anyway, I think my problems are in some way related to VirtualBox, since the oom-killer situation started after I upgraded VirtualBox from 2.0.6 to 2.1.4. I had already started (before this one) a thread on the VirtualBox forum,
http://forums.virtualbox.org/viewtop...fce8320cc8e1e5, but so far no response. When I ran version 2.0.6 or lower I never had any oom-killer problems.

Even though I understand that the preferred solution is to move to a 64-bit OS, it feels strange if that's the only solution. I mean, people do run 32-bit Linux with rather large amounts of memory successfully, don't they?
 
#11 - 03-23-2009, 04:46 PM - syg00 (LQ Veteran)
Given that you are a Debian user, does that imply you are using OSE?
 
#12 - 03-24-2009, 12:22 AM - syg00 (LQ Veteran)
I should have had a look at that full log earlier - have you always run that many instances?
I found an old 32-bit server I had lying around unused in the office - 4 GB RAM. I installed VBox 2.1.4 on CentOS 5.2 and fired up a couple of guests.

Each instance contributes better than 100 MB of dirty private (non-shared) storage below the 1 GB boundary. One (non-X) image that was actually running chewed up 182 MB (the other was stopped partway through an install). Multiply that by a few times, and it's easy to see that storage becoming short.
 
#13 - 03-24-2009, 03:45 AM - Ralfredo (original poster)
First of all, I really appreciate that you are trying to help me. Really!

Quote:
Originally Posted by syg00 View Post
Given that you are a Debian user, does that imply you are using OSE?
No, it's the one under the "VirtualBox Personal Use and Evaluation License (PUEL)", installed by aptitude after adding "deb http://download.virtualbox.org/virtualbox/debian lenny non-free" to /etc/apt/sources.list.

Quote:
Originally Posted by syg00 View Post
I should have had a look at that full log earlier - have you always run that many instances ?.
I found an old 32-bit server I had laying around unused in the office - 4 Gig RAM. Installed VBox 2.1.4 on Centos 5.2. and fired up a couple of guests.

Each instance contributes better than 100 Meg dirty private (non shared) storage below the 1 Gig boundary. One (non-X) image that was actually running chewed up 182 Meg (the other was stopped partway through an install). Multiply that by a few times, and it's easy to see storage becoming short.
Normally, when I have my oom-killer problems, only one instance is running. I have run, if I remember correctly, as many as four instances at once, each with around 1 to 1.5 GB allocated to the guest. That worked without any oom-killings, go figure...

Usually the oom-killer kicks in in the middle of the night with only one VM running. I upgraded VirtualBox to 2.1.4 on March 7. Before that I never had any problems. After:

Code:
$ sudo zgrep 'Killed process' /var/log/messages*
/var/log/messages:Mar 23 01:19:32 big kernel: [215148.388898] Killed process 13341 (VirtualBox)
/var/log/messages:Mar 23 01:19:33 big kernel: [215148.416920] Killed process 13412 (firefox-bin)
/var/log/messages:Mar 23 01:19:35 big kernel: [215148.441370] Killed process 8874 (apache2)
/var/log/messages.1.gz:Mar 19 01:18:03 big kernel: [1447935.391583] Killed process 9992 (VirtualBox)
/var/log/messages.1.gz:Mar 19 01:18:04 big kernel: [1447935.431522] Killed process 13205 (apache2)
/var/log/messages.1.gz:Mar 19 22:16:09 big kernel: [1574246.270236] Killed process 15779 (firefox-bin)
/var/log/messages.1.gz:Mar 19 22:21:13 big kernel: [1574697.342575] Killed process 15665 (VirtualBox)
/var/log/messages.1.gz:Mar 20 01:17:40 big kernel: [1591579.081641] Killed process 16939 (VirtualBox)
/var/log/messages.1.gz:Mar 20 01:17:45 big kernel: [1591579.122579] Killed process 15169 (firefox-bin)
/var/log/messages.1.gz:Mar 20 13:26:30 big kernel: [1667397.282821] Killed process 29030 (VirtualBox)
/var/log/messages.1.gz:Mar 21 01:17:31 big kernel: [1734166.511986] Killed process 14507 (VirtualBox)
/var/log/messages.2.gz:Mar  9 01:17:27 big kernel: [193049.583556] Killed process 16917 (VirtualBox)
/var/log/messages.2.gz:Mar  9 01:17:29 big kernel: [193049.641312] Killed process 4253 (firefox-bin)
/var/log/messages.3.gz:Mar  7 17:39:51 big kernel: [ 8357.910234] Killed process 5374 (VirtualBox)
/var/log/messages.3.gz:Mar  7 17:39:52 big kernel: [ 8357.932204] Killed process 6691 (winedevice.exe)
As you can see, around 01:17 at night is a popular time (I guess many serial killers like that time of night). This might be related to a cron job that starts at 01:15 and basically does "/bin/ls -alR /". This machine has just over ten local file systems and six remote ones mounted over NFS (version 3). One theory is that this "big ls", which takes a couple of minutes, demands memory in a way that doesn't mix well with VirtualBox 2.1.4.
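If that theory holds, the dentry and inode slab caches, which sit in lowmem on a 32-bit kernel, should balloon around 01:15. A sketch of how to check, plus a stock sysctl knob that biases reclaim toward those caches (the cache names below are the usual 2.6 ones and may differ per system; run as root):

```shell
# Watch the dentry/inode slab caches during the nightly scan
# (/proc/slabinfo is not world-readable everywhere).
grep -E '^(dentry|nfs_inode_cache|ext3_inode_cache)' /proc/slabinfo

# vm.vfs_cache_pressure defaults to 100; raising it makes the kernel
# reclaim dentries and inodes more aggressively relative to page cache.
sysctl vm.vfs_cache_pressure
sysctl -w vm.vfs_cache_pressure=200
```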
 
#14 - 03-24-2009, 07:56 AM - TBC Cosmo (Member)
Quote:
Originally Posted by Ralfredo View Post
One theory is that this "big ls", which takes a couple of minutes, demands memory in a way that doesn't mix well with VirtualBox 2.1.4
It will certainly fill up the filesystem cache, but that would be freed immediately as other new memory demands need it. Not to go off on a complete tangent, but what else is in that cron job? Is there any description of what its purpose in life is?
 
#15 - 03-24-2009, 11:25 AM - Ralfredo (original poster)
Quote:
Originally Posted by TBC Cosmo View Post
It will certainly fill up the filesystem cache, but that would be freed immediately as other new memory demands need it. Not to go off on a complete tangent, but what else is in that cron job? Is there any description of what its purpose in life is?
It just creates a kind of very simple "system history file".

Code:
$ sudo crontab -l | grep file
15 01   * * * /usr/bin/nice /usr/local/sbin/make_file_list.sh


$ cat /usr/local/sbin/make_file_list.sh
#!/bin/sh

umask 277

/bin/ls -alR / 2>/dev/null | gzip -9 > /var/file_list/file_list.gz

cp /var/file_list/file_list.gz /var/file_list/file_list.gz_year_`date +%Y`
cp /var/file_list/file_list.gz /var/file_list/file_list.gz_month_`date +%m`
mv /var/file_list/file_list.gz /var/file_list/file_list.gz_day_`date +%d`
It helps me answer questions like when I installed XXX, or when I was stupid enough to remove YYY. The precision gets worse as time goes on, but I often find the history helpful. Nothing I can't live without, but useful from time to time. Besides burning quite a few CPU cycles and massaging the disks a little, the script ought to be harmless.

BTW, I know slightly smarter programming could avoid a couple of the "cp" calls, but hey, the machine needs to feel useful even in the middle of the night
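A sketch of that idea, demonstrated against a scratch directory rather than /var/file_list: build the archive once and hard-link the rotating names to it, so nothing is copied.

```shell
# All three names end up sharing one inode, so there is no extra disk
# space or copy I/O; mktemp stands in for the real /var/file_list.
dir=$(mktemp -d)
printf 'demo listing\n' | gzip -9 > "$dir/file_list.gz"
ln -f "$dir/file_list.gz" "$dir/file_list.gz_year_$(date +%Y)"
ln -f "$dir/file_list.gz" "$dir/file_list.gz_month_$(date +%m)"
mv "$dir/file_list.gz" "$dir/file_list.gz_day_$(date +%d)"
ls "$dir"    # three names, one underlying file
```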

The files created look like this, if that's not obvious:

Code:
$ ls -lrt /var/file_list/
total 394192
-r-------- 1 root root 11834720 2008-12-31 01:21 file_list.gz_year_2008
-r-------- 1 root root 11834720 2008-12-31 01:21 file_list.gz_month_12
-r-------- 1 root root 10707905 2009-01-29 01:20 file_list.gz_day_29
-r-------- 1 root root 10533698 2009-01-30 12:37 file_list.gz_day_30
-r-------- 1 root root 10599206 2009-01-31 01:19 file_list.gz_day_31
-r-------- 1 root root 10599206 2009-01-31 01:19 file_list.gz_month_01
-r-------- 1 root root 10488819 2009-02-25 01:19 file_list.gz_day_25
-r-------- 1 root root 10552811 2009-02-26 01:19 file_list.gz_day_26
-r-------- 1 root root 10529033 2009-02-27 01:19 file_list.gz_day_27
-r-------- 1 root root 10584826 2009-02-28 01:20 file_list.gz_day_28
-r-------- 1 root root 10584826 2009-02-28 01:20 file_list.gz_month_02
-r-------- 1 root root 10570093 2009-03-01 01:20 file_list.gz_day_01
-r-------- 1 root root 10552937 2009-03-02 01:20 file_list.gz_day_02
-r-------- 1 root root 10567805 2009-03-03 01:20 file_list.gz_day_03
-r-------- 1 root root 10615756 2009-03-04 01:20 file_list.gz_day_04
-r-------- 1 root root 10681525 2009-03-05 01:20 file_list.gz_day_05
-r-------- 1 root root 10618591 2009-03-06 01:19 file_list.gz_day_06
-r-------- 1 root root 10653995 2009-03-07 01:20 file_list.gz_day_07
-r-------- 1 root root 10599288 2009-03-08 01:19 file_list.gz_day_08
-r-------- 1 root root 10622728 2009-03-09 01:20 file_list.gz_day_09
-r-------- 1 root root 10623800 2009-03-10 01:20 file_list.gz_day_10
-r-------- 1 root root 10671923 2009-03-11 01:20 file_list.gz_day_11
-r-------- 1 root root 10936737 2009-03-12 01:20 file_list.gz_day_12
-r-------- 1 root root 11331504 2009-03-13 01:20 file_list.gz_day_13
-r-------- 1 root root 11334635 2009-03-14 01:20 file_list.gz_day_14
-r-------- 1 root root 11316461 2009-03-15 01:20 file_list.gz_day_15
-r-------- 1 root root 11308168 2009-03-16 01:20 file_list.gz_day_16
-r-------- 1 root root 11335552 2009-03-17 01:20 file_list.gz_day_17
-r-------- 1 root root 11360195 2009-03-18 01:20 file_list.gz_day_18
-r-------- 1 root root 11407681 2009-03-19 01:20 file_list.gz_day_19
-r-------- 1 root root 11156277 2009-03-20 01:20 file_list.gz_day_20
-r-------- 1 root root 10708876 2009-03-21 01:20 file_list.gz_day_21
-r-------- 1 root root 10682126 2009-03-22 01:20 file_list.gz_day_22
-r-------- 1 root root 10886243 2009-03-23 01:22 file_list.gz_day_23
-r-------- 1 root root 11190555 2009-03-24 01:20 file_list.gz_year_2009
-r-------- 1 root root 11190555 2009-03-24 01:20 file_list.gz_day_24
-r-------- 1 root root 11190555 2009-03-24 01:20 file_list.gz_month_03
While we're talking about cron jobs: the only other non-standard-Debian cron job running around 01:17 is the one below. It only takes a few seconds (if even that) and I can't see how it could have anything to do with my problem, but who knows. Stranger things have happened...

Code:
$ sudo crontab -l|grep disk
14 01   * * * /usr/local/sbin/disk_lvm_info.sh


$ cat /usr/local/sbin/disk_lvm_info.sh
#!/bin/sh

OUTFILE=/usr/local/etc/disk_lvm_info
exec > $OUTFILE 2>&1

set -x

/sbin/sfdisk -l /dev/sda
/sbin/pvdisplay 
/sbin/vgdisplay
/sbin/lvdisplay
cat /etc/fstab
 
  

