What metrics is oom-killer using to determine memory usage in Cgroup
Linux - Kernel. This forum is for all discussion relating to the Linux kernel.
I am trying to find a metric that represents the memory usage logged in syslog when a container reaches its threshold and gets killed.
This is the message I refer to:
Nov 6 10:16:24 pool-a53hsbota-7h3co kernel: [2111341.288726] memory: usage 524288kB, limit 524288kB, failcnt 118
Nov 6 10:16:24 pool-a53hsbota-7h3co kernel: [2111341.289672] memory+swap: usage 524288kB, limit 9007199254740988kB, failcnt 0
Nov 6 10:16:24 pool-a53hsbota-7h3co kernel: [2111341.298582] kmem: usage 5800kB, limit 9007199254740988kB, failcnt 0
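For reference, the usage and limit figures can be pulled out of such a syslog line with a quick one-liner; this is just a sketch against the first line above:

```shell
# Sketch: extract the usage/limit figures (in kB) from an OOM-killer syslog line.
line='Nov 6 10:16:24 pool-a53hsbota-7h3co kernel: [2111341.288726] memory: usage 524288kB, limit 524288kB, failcnt 118'
echo "$line" | sed -n 's/.*usage \([0-9]*\)kB, limit \([0-9]*\)kB.*/usage=\1 limit=\2/p'
```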
I tried to collect different metrics using Prometheus and compare their values to the value in the log, but I couldn't find a metric, or combination of metrics, that matches the value logged at that point in time.
I tried:
- /sys/fs/cgroup/memory/kubepods/burstable/<pod>/<container>/memory.stat
- ps command
All I am trying to do is show in Grafana, using the proper metrics, that memory usage for the container grew and that the container was killed when it reached the limit.
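One thing worth checking: if I understand the cgroup v1 accounting correctly, the "usage" figure in the OOM log corresponds to memory.usage_in_bytes, which counts page cache as well as anonymous memory, while dashboards often plot a "working set" (usage minus total_inactive_file from memory.stat) that can sit well below the limit even as usage hits it. A minimal sketch of that arithmetic, with entirely made-up sample numbers standing in for the real sysfs files:

```shell
# Sketch with made-up numbers: in cgroup v1, the OOM log's "usage" matches
# memory.usage_in_bytes (anonymous memory plus page cache). The working set
# is that usage minus total_inactive_file from memory.stat, and can be far
# below the limit at the moment the limit is hit.
usage=536870912                       # hypothetical memory.usage_in_bytes (512 MiB)

cat > memory.stat.sample <<'EOF'
cache 335544320
rss 201326592
total_inactive_file 301989888
EOF

awk -v usage="$usage" \
    '/^total_inactive_file/ {print "working_set_bytes=" (usage - $2)}' memory.stat.sample
```

On a real node the two inputs would come from /sys/fs/cgroup/memory/.../memory.usage_in_bytes and .../memory.stat for the container's cgroup.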
If using cgroup2, add this to your reading list. Yes, I know it says Facebook, but those folks did all the work for PSI and then released it for public consumption.
Go get a beverage of choice before starting.
Thank you for sharing the article. My problem is not understanding OOM, but showing that the application misbehaves. I know for a fact that the application is being killed by the OOM killer when it reaches its limit of 512MB. I am looking for a metric which clearly shows that consumed memory reached that limit at that moment. Right now, when I check the cgroup memory used at the time of the OOM invocation, it shows only 100MB, which is very far from the actual limit.
Basically, I would like to show on a chart that used memory was rising and that just before the OOM invocation the consumed memory was close to the limit. I cannot find a single metric which would show that.
I briefly scanned through the page and haven't found what I am looking for, but I will read it in more detail to see if it gives me the information I need.
Thank you for sharing
Well, that article does mention the exact data sources used by the OOM killer, which is why I put it there in the first place.
After answering last time, I remembered reading up on some specific cgroup "issues" with the OOM killer, but most of it was about how to solve them by splitting cgroup memory so that, when the limit is reached, the high-memory consumer inside the container is killed rather than the whole container.
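On cgroup2, the PSI interface that article is built around can also be graphed directly. A minimal sketch parsing a sample pressure line; the sample below stands in for what /proc/pressure/memory or a cgroup's memory.pressure file would return:

```shell
# Sketch: pull the 60-second "some" memory pressure average out of PSI output.
# A real reading comes from /proc/pressure/memory (system-wide) or a cgroup2
# memory.pressure file; this sample line stands in for one.
psi='some avg10=0.00 avg60=1.52 avg300=0.87 total=4181579'
echo "$psi" | awk '$1 == "some" { sub("avg60=", "", $3); print $3 }'
```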
The article you are referencing sounds very interesting and could be very helpful in solving the problem I am working on. Any chance you could find a link to that article?