
randomdude706 03-26-2020 09:03 PM

Query regarding High Inactive(file) usage
 
Hi kernel experts

I'm looking for help with what seems to be a problem in the MM or FS (EXT4/JBD2) subsystem, though it could very well be something in userland that I'm failing to catch.

I'm debugging an issue where my virtual machine, running kernel 3.10.19 on QEMU 1.5.3 (the host runs the same kernel version), gets low on memory. The VM is launched with 2.4 GB of RAM, and typical process usage is around 1.4 GB.

When the low-memory condition is detected, I notice that /proc/meminfo reports a very high Inactive(file) value while Cached is fairly low. For example:

MemTotal: 2459784 kB
MemFree: 89544 kB
Buffers: 3316 kB
Cached: 108872 kB
SwapCached: 0 kB
Active: 1119204 kB
Inactive: 926884 kB
Active(anon): 1104440 kB
Inactive(anon): 1896 kB
Active(file): 14764 kB
Inactive(file): 924988 kB
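
The odd part is that Active(file) + Inactive(file) is far larger than Buffers + Cached, even though both roughly count file-backed pages (tmpfs pages land in Cached but on the anon LRU, so the comparison is only approximate, and our tmpfs usage is tiny anyway). A minimal sketch of the kind of check that flags this gap; the 100 MB threshold is an arbitrary number, not anything the kernel defines:

#!/usr/bin/env python
# Minimal sketch: flag the gap between the file LRU and the page cache.
# Active(file)+Inactive(file) and Buffers+Cached both count file-backed
# pages (give or take tmpfs), so a large difference is the anomaly shown
# above. The 100 MB threshold is arbitrary.

def meminfo():
    vals = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, rest = line.split(":", 1)
            vals[key] = int(rest.split()[0])  # everything is reported in kB
    return vals

m = meminfo()
file_lru = m["Active(file)"] + m["Inactive(file)"]
cache = m["Buffers"] + m["Cached"]
print("file LRU: %d kB, Buffers+Cached: %d kB, gap: %d kB"
      % (file_lru, cache, file_lru - cache))
if file_lru - cache > 100 * 1024:
    print("WARNING: file LRU far exceeds the page cache")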


At this point, the resident usage of the processes, checked via /proc/<pid>/smaps, hasn't gone up.
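
For completeness, this is roughly the kind of tally I mean (a sketch, not exact accounting; Pss is used so shared pages aren't double-counted):

#!/usr/bin/env python
# Rough tally of process memory: sum Pss over every mapping in every
# /proc/<pid>/smaps. Pss divides shared pages among their users, so the
# total is a fair estimate of resident usage. Reading other processes'
# smaps needs root (which everything here already is).

import glob

total_kb = 0
for path in glob.glob("/proc/[0-9]*/smaps"):
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("Pss:"):
                    total_kb += int(line.split()[1])  # kB
    except IOError:
        pass  # the process went away mid-read
print("total Pss across all processes: %d kB" % total_kb)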

Dropping caches doesn't help at all, and I don't see anything unusual in the kernel logs.
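
For reference, this is what I mean by dropping caches, with a before/after diff of the interesting meminfo fields. It's only a sketch: drop_caches frees clean, unpinned page-cache pages and reclaimable slab by walking each inode's mapping, so (as far as I understand it) pages that are no longer attached to any mapping are never even looked at.

#!/usr/bin/env python
# Drop the page cache and reclaimable slab, then diff the interesting
# /proc/meminfo fields. Writing "3" to drop_caches frees clean page cache
# plus dentries/inodes; dirty or pinned pages are left alone.

import os

FIELDS = ("MemFree", "Buffers", "Cached", "Active(file)", "Inactive(file)")

def snapshot():
    vals = {}
    with open("/proc/meminfo") as f:
        for line in f:
            parts = line.split()
            key = parts[0].rstrip(":")
            if key in FIELDS:
                vals[key] = int(parts[1])  # kB
    return vals

before = snapshot()
os.system("sync")                          # flush dirty data first
with open("/proc/sys/vm/drop_caches", "w") as f:
    f.write("3\n")                         # 1=pagecache, 2=slab, 3=both
after = snapshot()

for key in FIELDS:
    print("%-16s %9d kB -> %9d kB" % (key, before[key], after[key]))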

I've looked at /dev/shm and other tmpfs usage; it all looks normal. Disk usage on the other partitions looks normal too. By normal, I mean it's pretty much what I see even when the problem isn't present.
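
(For the record, this is the kind of tmpfs check I mean. tmpfs pages are charged to Cached and sit on the anon LRU, so a runaway tmpfs would be the usual explanation for "cached but not freeable" memory, which is exactly what isn't happening here.)

#!/usr/bin/env python
# Tally tmpfs usage from /proc/mounts via statvfs.

import os

total_kb = 0
with open("/proc/mounts") as f:
    for line in f:
        dev, mnt, fstype = line.split()[:3]
        if fstype in ("tmpfs", "devtmpfs"):
            st = os.statvfs(mnt)
            used_kb = (st.f_blocks - st.f_bfree) * st.f_frsize // 1024
            total_kb += used_kb
            print("%-24s %8d kB used" % (mnt, used_kb))
print("total tmpfs usage: %d kB" % total_kb)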

So I did the next logical thing: I took a core of the VM and analyzed it in crash.

A few things I noticed:

1. Going through the inode list via the superblocks, the inodes don't have that many pages mapped, which explains why dropping caches doesn't help.
2. From the kmem output, I see a huge number of pages on the inactive LRU that have no mapping associated with them. This is consistent with the first point.
3. Most of these pages do, however, have an associated private entry, and these private/FS-related entries are buffer_head objects.

crash> kmem -i
                 PAGES        TOTAL      PERCENTAGE
    TOTAL MEM   614948       2.3 GB         ----
         FREE    44698     174.6 MB    7% of TOTAL MEM
         USED   570250       2.2 GB   92% of TOTAL MEM
       SHARED    24930      97.4 MB    4% of TOTAL MEM
      BUFFERS      415       1.6 MB    0% of TOTAL MEM
       CACHED    24965      97.5 MB    4% of TOTAL MEM
         SLAB    59052     230.7 MB    9% of TOTAL MEM

   TOTAL HUGE        0            0         ----
    HUGE FREE        0            0    0% of TOTAL HUGE

   TOTAL SWAP        0            0         ----
    SWAP USED        0            0    0% of TOTAL SWAP
    SWAP FREE        0            0    0% of TOTAL SWAP

 COMMIT LIMIT   307474       1.2 GB         ----
    COMMITTED  1389447       5.3 GB  451% of TOTAL LIMIT
crash>
crash> kmem -V
VM_STAT:
NR_FREE_PAGES: 44698
NR_INACTIVE_ANON: 3554
NR_ACTIVE_ANON: 294558
NR_INACTIVE_FILE: 186610
NR_ACTIVE_FILE: 5002


For example, here is one of the pages I looked at.

crash> kmem ffffea000004cf20
      PAGE        PHYSICAL   MAPPING   INDEX  CNT  FLAGS
ffffea000004cf20   15fc000         0     698    1  1ffc000000082c referenced,uptodate,lru,private
crash> struct page.lru -x ffffea000004cf20
  lru = {
    next = 0xffffea0000322dc8,
    prev = 0xffffea000004cf78
  }
crash> list 0xffffea0000322dc8 | wc -l
186611

crash> struct page.private -x ffffea000004cf20
  private = 0xffff8800282b2540
crash> kmem 0xffff8800282b2540
CACHE             OBJSIZE  ALLOCATED   TOTAL  SLABS  SSIZE  NAME
ffff880095c16200      104     170225  172383   4659     4k  buffer_head
  SLAB              MEMORY            TOTAL  ALLOCATED  FREE
  ffff8800282b2000  ffff8800282b20c8     37         37     0
  FREE / [ALLOCATED]
   [ffff8800282b2540]

      PAGE        PHYSICAL   MAPPING   INDEX  CNT  FLAGS
ffffea00008c96f0   282b2000         0       0    1  1ffc0000000080 slab
crash> buffer_head -x ffff8800282b2540
struct buffer_head {
  b_state = 0x100001,
  b_this_page = 0xffff8800282b2540,
  b_page = 0xffffea000004cf20,
  b_blocknr = 0x11698,
  b_size = 0x1000,
  b_data = 0xffff8800015fc000 "",
  b_bdev = 0x0,
  b_end_io = 0x0,
  b_private = 0x0,
  b_assoc_buffers = {
    next = 0xffff8800282b2588,
    prev = 0xffff8800282b2588
  },
  b_assoc_map = 0x0,
  b_count = {
    counter = 0x0
  }
}


The number of in-use buffer_head slab objects is close to the number of pages on the inactive file LRU (~170K buffer_heads vs ~186K pages).

The pages have a non-zero refcount, while the buffer_heads themselves have a refcount of zero.

These buffer_heads are used by the EXT4 and JBD2 kernel modules (of the ones we use in our VM), neither of which I know much about. From the looks of it, whoever used the buffer_head has cleared its fields and dropped its reference, but the slab object was never freed, and the page still points to it through its private entry.
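
Since I won't always have a core handy, I'm thinking of watching for the same signature on the live system via /proc/kpageflags: pages that are on the LRU, inactive, not anonymous, not mapped, but still have PG_private set. A rough sketch is below. The flag bit positions are taken from Documentation/vm/pagemap.txt (PRIVATE is one of the "kernel hacking" bits, so please double-check that this kernel exports it), it assumes CONFIG_PROC_PAGE_MONITOR is enabled, and it needs root (which everything here already is). Note it will also count ordinary unmapped page-cache pages that happen to have buffers, so what matters is the trend: a count that keeps growing while Cached stays flat.

#!/usr/bin/env python
# Count pages that look like the ones in the core: on the LRU, inactive,
# not anon, not mapped, but with PG_private set (i.e. buffer_heads still
# attached). Bit positions follow Documentation/vm/pagemap.txt -- verify
# them against your kernel before trusting the numbers.

import struct

KPF_LRU, KPF_ACTIVE, KPF_MMAP, KPF_ANON = 5, 6, 11, 12
KPF_PRIVATE = 35   # one of the "kernel hacking" bits

def bit(flags, n):
    return (flags >> n) & 1

suspects = 0
total = 0
with open("/proc/kpageflags", "rb") as f:
    while True:
        chunk = f.read(8 * 4096)          # one u64 of flags per page frame
        if not chunk:
            break
        flags_list = struct.unpack("<%dQ" % (len(chunk) // 8), chunk)
        for flags in flags_list:
            total += 1
            if (bit(flags, KPF_LRU) and not bit(flags, KPF_ACTIVE)
                    and not bit(flags, KPF_ANON) and not bit(flags, KPF_MMAP)
                    and bit(flags, KPF_PRIVATE)):
                suspects += 1

print("inactive, unmapped, non-anon pages with private data: %d / %d"
      % (suspects, total))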

I've been combing through upstream commits to see if anything remotely resembles this symptom, but I haven't found anything yet.

I can't reproduce the issue, which makes debugging harder. The last time I saw it was about a month ago, and it hasn't happened since.

Has anyone seen or debugged an issue like this? Or what would be a good approach to nail this down?

Is it possible that this is somehow tied to some funky userspace trickery?

TIA

PS: Everything in the VM runs as root. I know it's not safe, but it wasn't my decision.

