LinuxQuestions.org - [SOLVED] MySQL server gets restarted continually (CentOS 6.5 VPS 1 GB RAM)

- Linux - Newbie (https://www.linuxquestions.org/questions/linux-newbie-8/)

- - MySQL server gets restarted continually (CentOS 6.5 VPS 1 GB RAM) (https://www.linuxquestions.org/questions/linux-newbie-8/mysql-server-gets-restarted-continually-centos-6-5-vps-1-gb-ram-4175486713/)

MySQL server gets restarted continually (CentOS 6.5 VPS 1 GB RAM)

Hi everyone,
I have a CentOS 6.5 VPS with 1 GB RAM. Over the last 10 days or so, I noticed that the mysql server had gone down because I got the following error message while trying to log in:

Quote:

ERROR! mysql manager or server PID could not be found

I checked the logs and found the following (this is just an extract):

Code:

131203  7:07:50 [Note] /usr/sbin/mysqld: Normal shutdown



131203  7:07:50 [Note] Event Scheduler: Purging the queue. 0 events

131203  7:07:52  InnoDB: Starting shutdown...

131203  7:07:52  InnoDB: Shutdown completed; log sequence number 0 192491

131203  7:07:52 [Note] /usr/sbin/mysqld: Shutdown complete



131203 07:07:52 mysqld_safe mysqld from pid file /var/lib/mysql/vps-1146563-18940.manage.myhosting.com.pid ended

131203 07:08:09 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

131203  7:08:09 [Note] Plugin 'FEDERATED' is disabled.

131203  7:08:09  InnoDB: Initializing buffer pool, size = 8.0M

131203  7:08:09  InnoDB: Completed initialization of buffer pool

131203  7:08:09  InnoDB: Started; log sequence number 0 192491

131203  7:08:09 [Note] Event Scheduler: Loaded 0 events

131203  7:08:09 [Note] /usr/sbin/mysqld: ready for connections.

Version: '5.1.70-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

131203  7:10:00 [Note] /usr/sbin/mysqld: Normal shutdown



131203  7:10:00 [Note] Event Scheduler: Purging the queue. 0 events

131203  7:10:02  InnoDB: Starting shutdown...

131203  7:10:05  InnoDB: Shutdown completed; log sequence number 0 192491

131203  7:10:05 [Note] /usr/sbin/mysqld: Shutdown complete

I restarted the server

Code:

service mysql restart

and got the same error message as show above plus:

Code:

Starting MySQL. SUCCESS!

And then it started working like a charm.
I talked to a support representative and the only thing he was able to come up with was that the database server seemed to be getting restarted because of insufficient memory:

Code:

root@centosvps [~]# free -m

            total      used      free    shared    buffers    cached

Mem:          1024        997        26          0          0        782

-/+ buffers/cache:        215        808

Swap:            0          0

And then of course, he offered me more RAM. It is not that expensive, but I want to make sure that there is something in this issue that would justify buying more resources, especially when mysql only uses around 4% RAM as shown by the top command.
However, I noticed that when running a simple query (note that this database has over 6 million records) the CPU usage of mysql went up to 99.7%! but the memory usage did not increase significantly.
I appreciate any comments. They will be more than welcome.

Quote:

Originally Posted by gacanepa (Post 5074701)

Hi everyone,

I talked to a support representative and the only thing he was able to come up with was that the database server seemed to be getting restarted because of insufficient memory:

Code:

root@centosvps [~]# free -m

            total      used      free    shared    buffers    cached

Mem:          1024        997        26          0          0        782

-/+ buffers/cache:        215        808

Swap:            0          0

This is a completely false statement and a pretty common misconception with Linux. If you look at the above output you see 26 MB of free memory and you're like WTF is going on with my system. But without going into the gory details Linux kernel manages memory differently than Windows and will only "clean" the memory when it is required for use with another application. Until then memory that had been used by the system at one point in time will go into a "dirty inactive" status.

Using free -m always provides confusing output.

Run a cat /proc/meminfo to see what is actually going on with your memory.

Here is one of my prod servers running with a bunch of memory. Running a free -m shows that there is little to no free memory:

Code:

 free -m

            total      used      free    shared    buffers    cached

Mem:        19939      19834        105          0        666      15043

-/+ buffers/cache:      4125      15814

Swap:        9983          0      9983

So I want to see whats really going on I dig into my /proc directory:

Code:

cat /proc/meminfo

MemTotal:    20418256 kB

MemFree:        107732 kB

Buffers:        679172 kB

Cached:      15432136 kB

SwapCached:          0 kB

Active:        6396636 kB

Inactive:    12651936 kB

HighTotal:          0 kB

HighFree:            0 kB

LowTotal:    20418256 kB

LowFree:        107732 kB

SwapTotal:    10223608 kB

SwapFree:    10223036 kB

Dirty:          33524 kB

Writeback:          4 kB

AnonPages:    2937276 kB

Mapped:          82772 kB

Slab:          1134120 kB

PageTables:      56380 kB

NFS_Unstable:        0 kB

Bounce:              0 kB

CommitLimit:  20432736 kB

Committed_AS: 11044808 kB

VmallocTotal: 34359738367 kB

VmallocUsed:    269708 kB

VmallocChunk: 34359468443 kB

HugePages_Total:    0

HugePages_Free:      0

HugePages_Rsvd:      0

Hugepagesize:    2048 kB

Reviewing the above output I can see the following:

Code:

MemTotal:    20418256 kB

MemFree:        107732 kB

Inactive:    12651936 kB

This tells me I actually have 12759668 KB free of my 201418256 KB.

Another note. You're cached memory which is the second line in the free -m output can give you an idea of how much you actually have free but its not 100% accurate.

As you can see in my meminfo file, 15432136 KB of memory are cached but 6396636 KB are still active so only 12651936KB is available to be cleaned for use.

Sounds like the support rep honestly didnt know what the problem was and I feel for those folks as I've done that job before working for a hosting company. And at the end of the day most of their scope's of support dont provide troubleshooting services. They will say, oh your MySQL is dying? Well reprovision it and it wont happen again. Basically its the data and config you are doing to the system that will cause it to crash. If you take their out of the box image I'm sure MySQL runs fine and unless you pay them or they provide actual server administration services you won't get very far with support.

What does your /var/log/messages file report around the time the service dies?

What exists in /var/lib/mysql when the service dies before you restart it?

You can find plenty of these links on Google but here's one with some more info on the memory management:

http://knowledgelayer.softlayer.com/...ory-management

@Kustom42:
Thank you SO VERY MUCH for that detailed, in-depth explanation.
As to your questions,
What does your /var/log/messages file report around the time the service dies? I checked the /var/log/messages file around the time MySQL died and there was nothing about MySQL.
What exists in /var/lib/mysql when the service dies before you restart it? I had to recreate the issue by killing mysql and then restarting it again.
AFTER KILLING MYSQL:

Code:

root@vps-centos [/var/lib/mysql]# ls -l

total 20560

drwxr-x--x 10 mysql mysql    4096 Dec  3 15:39 ./

drwxr-xr-x 20 root  root      4096 Dec  3 03:41 ../

drwx------  2 mysql mysql    4096 Dec 21  2012 cphulkd/

drwx------  2 mysql mysql    4096 Nov 29 05:30 eximstats/

drwx------  2 mysql mysql    4096 Nov 27 02:29 horde/

-rw-rw----  1 mysql mysql 10485760 Dec  3 15:39 ibdata1

-rw-rw----  1 mysql mysql  5242880 Dec  3 15:39 ib_logfile0

-rw-rw----  1 mysql mysql  5242880 Sep 20  2012 ib_logfile1

drwx------  2 mysql mysql    4096 Dec 21  2012 leechprotect/

drwx------  2 mysql mysql    4096 Dec 21  2012 logaholicDB_vps_1111842_12395/

drwx--x--x  2 mysql mysql    4096 Oct 21 02:35 mysql/

-rw-r--r--  1 mysql mysql        6 Oct 21 02:35 mysql_upgrade_info

drwx------  2 mysql mysql    4096 Oct 21 02:36 roundcube/

drwx------  2 mysql mysql    4096 Nov  1 15:38 voterdatanow/

-rw-rw----  1 mysql mysql    7593 Dec 21  2012 vps-1111842-12395.manage.myhosting.com.err

-rw-rw----  1 mysql mysql    12964 Dec  3 15:39 vps-1146563-18940.manage.myhosting.com.err

AFTER RESTARTING IT:

Code:

root@vps-centos [/var/lib/mysql]# ls -l

total 20564

drwxr-x--x 10 mysql mysql    4096 Dec  3 15:40 ./

drwxr-xr-x 20 root  root      4096 Dec  3 03:41 ../

drwx------  2 mysql mysql    4096 Dec 21  2012 cphulkd/

drwx------  2 mysql mysql    4096 Nov 29 05:30 eximstats/

drwx------  2 mysql mysql    4096 Nov 27 02:29 horde/

-rw-rw----  1 mysql mysql 10485760 Dec  3 15:39 ibdata1

-rw-rw----  1 mysql mysql  5242880 Dec  3 15:40 ib_logfile0

-rw-rw----  1 mysql mysql  5242880 Sep 20  2012 ib_logfile1

drwx------  2 mysql mysql    4096 Dec 21  2012 leechprotect/

drwx------  2 mysql mysql    4096 Dec 21  2012 logaholicDB_vps_1111842_12395/

drwx--x--x  2 mysql mysql    4096 Oct 21 02:35 mysql/

srwxrwxrwx  1 mysql mysql        0 Dec  3 15:40 mysql.sock=

-rw-r--r--  1 mysql mysql        6 Oct 21 02:35 mysql_upgrade_info

drwx------  2 mysql mysql    4096 Oct 21 02:36 roundcube/

drwx------  2 mysql mysql    4096 Nov  1 15:38 voterdatanow/

-rw-rw----  1 mysql mysql    7593 Dec 21  2012 vps-centos.err

-rw-rw----  1 mysql mysql    13517 Dec  3 15:40 vps-centos.err

-rw-rw----  1 mysql mysql        6 Dec  3 15:40 vps-centos.pid

At first glance, there are 2 lines in the last listing that are not present in the first one:

Code:

srwxrwxrwx 1 mysql mysql 0 Dec 3 15:40 mysql.sock=

and

Code:

-rw-rw---- 1 mysql mysql 6 Dec 3 15:40 vps-centos.pid

Quote:

Originally Posted by gacanepa (Post 5074837)

At first glance, there are 2 lines in the last listing that are not present in the first one:

Code:

srwxrwxrwx 1 mysql mysql 0 Dec 3 15:40 mysql.sock=

and

Code:

-rw-rw---- 1 mysql mysql 6 Dec 3 15:40 vps-centos.pid

Now you're starting to do some troubleshooting. So in the Linux world everything exists as a flat file, including processes. The issue here is that the .pid and .sock files are being removed somehow and that causes MySQL to think it has died and stop functioning. Now all you have to do is figure out what is actually causing it to be deleted.

The .pid file is what is used by MySQL to establish that a mysql process is running. It uses this as a check to prevent MySQL from starting multiple daemons as well as to store some info about the current running process.

The .sock file is for establishing the socket connection to bind an open port.(This is actually done via one of the /proc/pid/fd/* files as each process will have a file in its "fd" directory for each socket used but MySQL uses this additional file as well)

I see a .err file which usually indicates there is something being logged. Is there a separate mysql.err log file? What are the contents of your /var/log/mysql/ directory?

Also, this being a VPS it is quite possible that you are in fact over utilizing your VPS and the process is being automatically killed by the hypervisor host. 9/10 support folks will have no idea what this means but if you push your way up the food chain you can have them check for an OOM or Out of Memory kill error logged on the VPS hypervsior. A VPS hypervisor with most cloud hosting solutions is a little finicky and 99% have no hard utilization limits set. What that means is that you are sitting on a box with 64GB of memory that you share with a bunch of other folks. You pay for 1GB so they setup alerts and auto OOM(out of memory) kills to kill processes that consume more resources than they are allocated via the soft limits. They can also put hard limits in place but the downside to this is it can render an entire VM unusable until it is rebooted if its the hard limit in some circumstances.

Now here are a couple of next troubleshooting steps before getting back on the phone with support for another headache:

Install sar if it is not already installed. Run "sar -q" as root to see if is installed.

Code:



yum install sar -y

Setup a file watch for your .pid file. BE CAREFUL with this one, if you forget about it it will fill up your filesystem which should take basically forever but I have no idea how much space you have available.

Code:



while true; do fuser -v /var/lib/mysql/vps-centos.pid >> /tmp/temporary_proc_watch.log; done

done

The above command will check for all processes who are using the .pid file every couple seconds and log it to a file. If some other process is using the file you will see it logged there.

Again monitor your filesystem closely and kill this if space is an issue although one line of text every few seconds isn't much space.

Thanks again!

Quote:

What are the contents of your /var/log/mysql/ directory?

There is no mysql directory inside /var/log:

Code:

root@vps-centos [/var/log]# ls

./            dcpumon/        lastlog  stunnel-4.15-build.log

../          dmesg          maillog  tallylog

bandwidth/    dmesg.old      messages  wtmp

boot.log      exim_mainlog    sa/      xferlog.offset

btmp          exim_paniclog  samba/    xferlog.offsetftpsep

chkservd.log  exim_rejectlog  secure    yum.log

cron          httpd/          spooler

root@vps-centos [/var/log]#

sar is installed on the VPS. I will take a closer look at the man page for additional details.
And thanks again for the tip on the watchdog for the .pid file.
I will mark this thread as solved as I received more help than I even expected, and will add to your reputation.
Have a nice day!