LinuxQuestions.org
Musings on technology, philosophy, and life in the corporate world
Hi. I'm a Unix Administrator, mathematics enthusiast, and amateur philosopher. This is where I rant about that which upsets me, laugh about that which amuses me, and jabber about that which holds my interest most: Unix.
OOM Killer and acquisitions...a match made in heaven!

Posted 11-25-2009 at 10:17 AM by rocket357
Updated 11-25-2009 at 10:28 AM by rocket357

Yesterday afternoon one of my coworkers enlisted the help of employees company-wide to load test a project we'd worked on together about a year ago. He'd made significant changes to the application server and wanted to see how much raw traffic the server could handle. So at 2:30 PM, half of the company I work for started hitting the application server with bogus reports...

I watched the PostgreSQL database backend while all of this was going on. By 2:34 PM, the system load was at 20+. Seconds later, the machine went from 8 GB used RAM + 3 GB used swap to 150 MB RAM used. Damn. PostgreSQL restarted.
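For anyone wanting to watch the same numbers, here is a minimal sketch in Python, assuming a Linux host whose /proc/meminfo exposes the MemTotal, MemFree, Buffers, Cached, SwapTotal, and SwapFree fields. The function names are made up for illustration, and "used" here excludes buffers and page cache, which may not match what any particular monitoring tool reports.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_mb(fields):
    """Return (ram_used_mb, swap_used_mb); buffers and cache are not counted as used."""
    ram_kb = (fields["MemTotal"] - fields["MemFree"]
              - fields.get("Buffers", 0) - fields.get("Cached", 0))
    swap_kb = fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)
    return ram_kb // 1024, swap_kb // 1024


def sample(path="/proc/meminfo"):
    """Take one reading from the live system (Linux only)."""
    with open(path) as f:
        return used_mb(parse_meminfo(f.read()))
```

Calling sample() once a second during a load test would show the sort of sudden drop described above, when a large process disappears and its memory is released.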

Three things got overlooked:
1) The number of people who participated...it was much higher than he expected.
2) The punctual nature of the people who participated...he anticipated a trickle-in of people, not a blast right at 2:30.
3) The test db was 10 GB in size, which is more than the 8 GB of RAM the db backend had...meaning we were really testing the db backend's disk speed.
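The arithmetic behind the third point can be sketched as below. This is a rough upper bound only: the function name and the reserved_gb parameter are hypothetical, and the 2 GB reserve in the second call is an assumed figure for the OS and PostgreSQL processes, not something measured on this machine.

```python
def cacheable_fraction(db_size_gb, ram_gb, reserved_gb=0.0):
    """Upper bound on the fraction of the database that can sit in RAM."""
    if db_size_gb <= 0:
        raise ValueError("db_size_gb must be positive")
    # Memory left for caching after the (assumed) reserve is taken out.
    available = max(ram_gb - reserved_gb, 0.0)
    return min(available / db_size_gb, 1.0)


# 10 GB database on an 8 GB machine, with nothing else using memory.
print(cacheable_fraction(10, 8))
# Same machine, assuming 2 GB goes to the OS and database processes.
print(cacheable_fraction(10, 8, reserved_gb=2))
```

Even in the best case a fifth of the data has to come from disk, and the real share is larger once everything else on the box takes its cut.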

Looking through the logs, it appears that the OOM killer hacked off one of PostgreSQL's child processes. A backend killed that abruptly may have left shared memory in an inconsistent state, so PostgreSQL shut down the remaining backends and restarted. Since this is a dedicated machine (albeit not an exceptionally powerful one...it has a single quad-core CPU, 8 GB of RAM, and about ten 15k RPM SAS drives), the OOM killer hitting PostgreSQL is sorta like putting up an array of flat panel screens just so you can see anything that might be blocked by that massive array of flat panel screens...if I don't have PostgreSQL, I don't care if the machine is running or not. (Solaris is starting to look good as a db server OS...I'll have to look into that).
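On a dedicated database box, two kernel knobs are the usual places to look: the vm.overcommit_memory setting and the per-process oom_score_adj file. The sketch below is a hedged illustration, not what was done on this machine. It assumes a kernel new enough to have /proc/<pid>/oom_score_adj (older kernels used /proc/<pid>/oom_adj with a different range), and the helper names are made up for the example.

```python
from pathlib import Path

# Meanings of /proc/sys/vm/overcommit_memory, per the kernel documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Translate the raw contents of /proc/sys/vm/overcommit_memory."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def oom_score_adj_path(pid):
    """Path of the per-process OOM adjustment file for the given PID."""
    return Path("/proc") / str(pid) / "oom_score_adj"


def read_oom_score_adj(pid):
    """Current adjustment, from -1000 (never kill) to 1000 (kill first)."""
    return int(oom_score_adj_path(pid).read_text().strip())


def exempt_from_oom_killer(pid):
    """Ask the kernel never to pick this PID. Requires root privileges."""
    oom_score_adj_path(pid).write_text("-1000\n")
```

Something like describe_overcommit(Path("/proc/sys/vm/overcommit_memory").read_text()) shows which mode the box is in. Exempting only the parent PostgreSQL process, rather than every backend, leaves the kernel something to kill if memory really does run out.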

The load test was successful, if you can say "the application server can outlast the database server" 3 times really fast...which, ironically, led to management approving the purchase of two new PostgreSQL database machines that are **much** more powerful than the one used in the test. Go team.

I may have to employ the OOM killer again when I need a replacement machine =)
All times are GMT -5.
