LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Newbie (https://www.linuxquestions.org/questions/linux-newbie-8/)
-   -   MySQL server gets restarted continually (CentOS 6.5 VPS 1 GB RAM) (https://www.linuxquestions.org/questions/linux-newbie-8/mysql-server-gets-restarted-continually-centos-6-5-vps-1-gb-ram-4175486713/)

gacanepa 12-03-2013 11:16 AM

MySQL server gets restarted continually (CentOS 6.5 VPS 1 GB RAM)
 
Hi everyone,
I have a CentOS 6.5 VPS with 1 GB RAM. Over the last 10 days or so, I noticed that the mysql server had gone down because I got the following error message while trying to log in:
Quote:

ERROR! mysql manager or server PID could not be found
I checked the logs and found the following (this is just an extract):
Code:

131203  7:07:50 [Note] /usr/sbin/mysqld: Normal shutdown

131203  7:07:50 [Note] Event Scheduler: Purging the queue. 0 events
131203  7:07:52  InnoDB: Starting shutdown...
131203  7:07:52  InnoDB: Shutdown completed; log sequence number 0 192491
131203  7:07:52 [Note] /usr/sbin/mysqld: Shutdown complete

131203 07:07:52 mysqld_safe mysqld from pid file /var/lib/mysql/vps-1146563-18940.manage.myhosting.com.pid ended
131203 07:08:09 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
131203  7:08:09 [Note] Plugin 'FEDERATED' is disabled.
131203  7:08:09  InnoDB: Initializing buffer pool, size = 8.0M
131203  7:08:09  InnoDB: Completed initialization of buffer pool
131203  7:08:09  InnoDB: Started; log sequence number 0 192491
131203  7:08:09 [Note] Event Scheduler: Loaded 0 events
131203  7:08:09 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.70-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)
131203  7:10:00 [Note] /usr/sbin/mysqld: Normal shutdown

131203  7:10:00 [Note] Event Scheduler: Purging the queue. 0 events
131203  7:10:02  InnoDB: Starting shutdown...
131203  7:10:05  InnoDB: Shutdown completed; log sequence number 0 192491
131203  7:10:05 [Note] /usr/sbin/mysqld: Shutdown complete

I restarted the server
Code:

service mysql restart
and got the same error message as show above plus:
Code:

Starting MySQL. SUCCESS!
And then it started working like a charm.
I talked to a support representative and the only thing he was able to come up with was that the database server seemed to be getting restarted because of insufficient memory:
Code:

root@centosvps [~]# free -m
            total      used      free    shared    buffers    cached
Mem:          1024        997        26          0          0        782
-/+ buffers/cache:        215        808
Swap:            0          0

And then of course, he offered me more RAM. It is not that expensive, but I want to make sure that there is something in this issue that would justify buying more resources, especially when mysql only uses around 4% RAM as shown by the top command.
However, I noticed that when running a simple query (note that this database has over 6 million records) the CPU usage of mysql went up to 99.7%! but the memory usage did not increase significantly.
I appreciate any comments. They will be more than welcome.

Kustom42 12-03-2013 12:16 PM

Quote:

Originally Posted by gacanepa (Post 5074701)
Hi everyone,

I talked to a support representative and the only thing he was able to come up with was that the database server seemed to be getting restarted because of insufficient memory:
Code:

root@centosvps [~]# free -m
            total      used      free    shared    buffers    cached
Mem:          1024        997        26          0          0        782
-/+ buffers/cache:        215        808
Swap:            0          0

And then of course, he offered me more RAM. It is not that expensive, but I want to make sure that there is something in this issue that would justify buying more resources, especially when mysql only uses around 4% RAM as shown by the top command.
However, I noticed that when running a simple query (note that this database has over 6 million records) the CPU usage of mysql went up to 99.7%! but the memory usage did not increase significantly.
I appreciate any comments. They will be more than welcome.


This is a completely false statement and a pretty common misconception with Linux. If you look at the above output you see 26 MB of free memory and you're like WTF is going on with my system. But without going into the gory details Linux kernel manages memory differently than Windows and will only "clean" the memory when it is required for use with another application. Until then memory that had been used by the system at one point in time will go into a "dirty inactive" status.

Using free -m always provides confusing output.


Run a cat /proc/meminfo to see what is actually going on with your memory.

Here is one of my prod servers running with a bunch of memory. Running a free -m shows that there is little to no free memory:

Code:

free -m
            total      used      free    shared    buffers    cached
Mem:        19939      19834        105          0        666      15043
-/+ buffers/cache:      4125      15814
Swap:        9983          0      9983

So I want to see whats really going on I dig into my /proc directory:

Code:

cat /proc/meminfo
MemTotal:    20418256 kB
MemFree:        107732 kB
Buffers:        679172 kB
Cached:      15432136 kB
SwapCached:          0 kB
Active:        6396636 kB
Inactive:    12651936 kB
HighTotal:          0 kB
HighFree:            0 kB
LowTotal:    20418256 kB
LowFree:        107732 kB
SwapTotal:    10223608 kB
SwapFree:    10223036 kB
Dirty:          33524 kB
Writeback:          4 kB
AnonPages:    2937276 kB
Mapped:          82772 kB
Slab:          1134120 kB
PageTables:      56380 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  20432736 kB
Committed_AS: 11044808 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    269708 kB
VmallocChunk: 34359468443 kB
HugePages_Total:    0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:    2048 kB


Reviewing the above output I can see the following:

Code:

MemTotal:    20418256 kB
MemFree:        107732 kB
Inactive:    12651936 kB


This tells me I actually have 12759668 KB free of my 201418256 KB.


Another note. You're cached memory which is the second line in the free -m output can give you an idea of how much you actually have free but its not 100% accurate.

As you can see in my meminfo file, 15432136 KB of memory are cached but 6396636 KB are still active so only 12651936KB is available to be cleaned for use.




Sounds like the support rep honestly didnt know what the problem was and I feel for those folks as I've done that job before working for a hosting company. And at the end of the day most of their scope's of support dont provide troubleshooting services. They will say, oh your MySQL is dying? Well reprovision it and it wont happen again. Basically its the data and config you are doing to the system that will cause it to crash. If you take their out of the box image I'm sure MySQL runs fine and unless you pay them or they provide actual server administration services you won't get very far with support.



What does your /var/log/messages file report around the time the service dies?

What exists in /var/lib/mysql when the service dies before you restart it?

Kustom42 12-03-2013 12:18 PM

You can find plenty of these links on Google but here's one with some more info on the memory management:

http://knowledgelayer.softlayer.com/...ory-management

gacanepa 12-03-2013 02:48 PM

@Kustom42:
Thank you SO VERY MUCH for that detailed, in-depth explanation.
As to your questions,
What does your /var/log/messages file report around the time the service dies? I checked the /var/log/messages file around the time MySQL died and there was nothing about MySQL.
What exists in /var/lib/mysql when the service dies before you restart it? I had to recreate the issue by killing mysql and then restarting it again.
AFTER KILLING MYSQL:
Code:

root@vps-centos [/var/lib/mysql]# ls -l
total 20560
drwxr-x--x 10 mysql mysql    4096 Dec  3 15:39 ./
drwxr-xr-x 20 root  root      4096 Dec  3 03:41 ../
drwx------  2 mysql mysql    4096 Dec 21  2012 cphulkd/
drwx------  2 mysql mysql    4096 Nov 29 05:30 eximstats/
drwx------  2 mysql mysql    4096 Nov 27 02:29 horde/
-rw-rw----  1 mysql mysql 10485760 Dec  3 15:39 ibdata1
-rw-rw----  1 mysql mysql  5242880 Dec  3 15:39 ib_logfile0
-rw-rw----  1 mysql mysql  5242880 Sep 20  2012 ib_logfile1
drwx------  2 mysql mysql    4096 Dec 21  2012 leechprotect/
drwx------  2 mysql mysql    4096 Dec 21  2012 logaholicDB_vps_1111842_12395/
drwx--x--x  2 mysql mysql    4096 Oct 21 02:35 mysql/
-rw-r--r--  1 mysql mysql        6 Oct 21 02:35 mysql_upgrade_info
drwx------  2 mysql mysql    4096 Oct 21 02:36 roundcube/
drwx------  2 mysql mysql    4096 Nov  1 15:38 voterdatanow/
-rw-rw----  1 mysql mysql    7593 Dec 21  2012 vps-1111842-12395.manage.myhosting.com.err
-rw-rw----  1 mysql mysql    12964 Dec  3 15:39 vps-1146563-18940.manage.myhosting.com.err

AFTER RESTARTING IT:
Code:

root@vps-centos [/var/lib/mysql]# ls -l
total 20564
drwxr-x--x 10 mysql mysql    4096 Dec  3 15:40 ./
drwxr-xr-x 20 root  root      4096 Dec  3 03:41 ../
drwx------  2 mysql mysql    4096 Dec 21  2012 cphulkd/
drwx------  2 mysql mysql    4096 Nov 29 05:30 eximstats/
drwx------  2 mysql mysql    4096 Nov 27 02:29 horde/
-rw-rw----  1 mysql mysql 10485760 Dec  3 15:39 ibdata1
-rw-rw----  1 mysql mysql  5242880 Dec  3 15:40 ib_logfile0
-rw-rw----  1 mysql mysql  5242880 Sep 20  2012 ib_logfile1
drwx------  2 mysql mysql    4096 Dec 21  2012 leechprotect/
drwx------  2 mysql mysql    4096 Dec 21  2012 logaholicDB_vps_1111842_12395/
drwx--x--x  2 mysql mysql    4096 Oct 21 02:35 mysql/
srwxrwxrwx  1 mysql mysql        0 Dec  3 15:40 mysql.sock=
-rw-r--r--  1 mysql mysql        6 Oct 21 02:35 mysql_upgrade_info
drwx------  2 mysql mysql    4096 Oct 21 02:36 roundcube/
drwx------  2 mysql mysql    4096 Nov  1 15:38 voterdatanow/
-rw-rw----  1 mysql mysql    7593 Dec 21  2012 vps-centos.err
-rw-rw----  1 mysql mysql    13517 Dec  3 15:40 vps-centos.err
-rw-rw----  1 mysql mysql        6 Dec  3 15:40 vps-centos.pid

At first glance, there are 2 lines in the last listing that are not present in the first one:
Code:

srwxrwxrwx  1 mysql mysql        0 Dec  3 15:40 mysql.sock=
and
Code:

-rw-rw----  1 mysql mysql        6 Dec  3 15:40 vps-centos.pid

Kustom42 12-03-2013 05:14 PM

Quote:

Originally Posted by gacanepa (Post 5074837)
At first glance, there are 2 lines in the last listing that are not present in the first one:
Code:

srwxrwxrwx  1 mysql mysql        0 Dec  3 15:40 mysql.sock=
and
Code:

-rw-rw----  1 mysql mysql        6 Dec  3 15:40 vps-centos.pid



Now you're starting to do some troubleshooting. So in the Linux world everything exists as a flat file, including processes. The issue here is that the .pid and .sock files are being removed somehow and that causes MySQL to think it has died and stop functioning. Now all you have to do is figure out what is actually causing it to be deleted.

The .pid file is what is used by MySQL to establish that a mysql process is running. It uses this as a check to prevent MySQL from starting multiple daemons as well as to store some info about the current running process.

The .sock file is for establishing the socket connection to bind an open port.(This is actually done via one of the /proc/pid/fd/* files as each process will have a file in its "fd" directory for each socket used but MySQL uses this additional file as well)

I see a .err file which usually indicates there is something being logged. Is there a separate mysql.err log file? What are the contents of your /var/log/mysql/ directory?


Also, this being a VPS it is quite possible that you are in fact over utilizing your VPS and the process is being automatically killed by the hypervisor host. 9/10 support folks will have no idea what this means but if you push your way up the food chain you can have them check for an OOM or Out of Memory kill error logged on the VPS hypervsior. A VPS hypervisor with most cloud hosting solutions is a little finicky and 99% have no hard utilization limits set. What that means is that you are sitting on a box with 64GB of memory that you share with a bunch of other folks. You pay for 1GB so they setup alerts and auto OOM(out of memory) kills to kill processes that consume more resources than they are allocated via the soft limits. They can also put hard limits in place but the downside to this is it can render an entire VM unusable until it is rebooted if its the hard limit in some circumstances.


Now here are a couple of next troubleshooting steps before getting back on the phone with support for another headache:


Install sar if it is not already installed. Run "sar -q" as root to see if is installed.

Code:


yum install sar -y


Setup a file watch for your .pid file. BE CAREFUL with this one, if you forget about it it will fill up your filesystem which should take basically forever but I have no idea how much space you have available.


Code:


while true; do fuser -v /var/lib/mysql/vps-centos.pid >> /tmp/temporary_proc_watch.log; done
done

The above command will check for all processes who are using the .pid file every couple seconds and log it to a file. If some other process is using the file you will see it logged there.

Again monitor your filesystem closely and kill this if space is an issue although one line of text every few seconds isn't much space.

gacanepa 12-05-2013 08:46 AM

Thanks again!
Quote:

What are the contents of your /var/log/mysql/ directory?
There is no mysql directory inside /var/log:
Code:

root@vps-centos [/var/log]# ls
./            dcpumon/        lastlog  stunnel-4.15-build.log
../          dmesg          maillog  tallylog
bandwidth/    dmesg.old      messages  wtmp
boot.log      exim_mainlog    sa/      xferlog.offset
btmp          exim_paniclog  samba/    xferlog.offsetftpsep
chkservd.log  exim_rejectlog  secure    yum.log
cron          httpd/          spooler
root@vps-centos [/var/log]#

sar is installed on the VPS. I will take a closer look at the man page for additional details.
And thanks again for the tip on the watchdog for the .pid file.
I will mark this thread as solved as I received more help than I even expected, and will add to your reputation.
Have a nice day!

alicemaris558 07-24-2014 03:45 AM

Nice post.


All times are GMT -5. The time now is 08:30 AM.