LinuxQuestions.org

Linux - Server: Server Reboot During Daily Backup (https://www.linuxquestions.org/questions/linux-server-73/server-reboot-during-daily-backup-4175526992/)

Rahul77 12-01-2014 01:13 PM

Server Reboot During Daily Backup
 
Hi, we have an unusual problem. We are running a database server on which an incremental file system backup runs every morning.
The backup agent finishes the / file system and then moves on to /boot. This is where things go wrong: first there is high I/O wait and a spike in CPU utilisation, and then the NMI watchdog triggers and the kernel restarts the system.

This happens every day unless we exclude the /boot file system from the backup. The backup agent we are running is Idera 5.8. Our kernel version is 2.6.32-431.29.2, and we are running Red Hat 6.5.

What might cause the high I/O wait when the backup starts working on the /boot file system?

Will greatly appreciate any help.
Many Thanks
R

JeremyBoden 12-02-2014 08:59 AM

Does /boot require a fresh backup every single day?

Rahul77 12-02-2014 09:19 AM

Hey Jeremy, Many Thanks for your response!

That is another question we need answered: how useful is a daily backup of the /boot file system for a bare metal restore?
The daily backup is an incremental one. What problems would it cause for a restore if we exclude /boot from the daily incremental backup?

Thanks in advance
R

JeremyBoden 12-02-2014 10:52 AM

/boot should only have things like grub and kernel(s) in it.
It should change only when you modify either of these.
You can confirm this by checking file dates with
Code:

sudo ls -lR --full-time /boot
On my PC, I find no problem with backing up anything (including /boot) to an alternative disk (or other location) with rsync.
However, I find it necessary to boot from a "live" DVD or USB to do this.
This guarantees that the main system is not in use at the time of the backup.

Probably something you don't want to mess around doing for a server...
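
As a rough sketch of the same check in script form, the function below lists only the files that changed recently, which is easier to read than a full listing. The function name and the defaults (/boot, 1 day) are assumptions for illustration, not anything the backup agent provides.

```shell
# List regular files under a directory that were modified within the
# last N days. Name and defaults are illustrative assumptions.
changed_since() {
    dir=${1:-/boot}
    days=${2:-1}
    # -xdev keeps find on that one file system;
    # -mtime -N matches files modified less than N*24 hours ago.
    find "$dir" -xdev -type f -mtime "-$days" -print
}
```

Running `changed_since /boot 1` (with sudo if some directories are not readable) and getting no output would suggest that a daily incremental backup has nothing new to pick up from /boot.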

padeen 12-02-2014 11:18 PM

Nothing should be using /boot. I don't even have its partition mounted.

Is /boot on a separate partition? Is the filesystem corrupt on that partition, causing the high i/o wait states?
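
One way to answer the first question without reading the partition table is to compare device numbers. This is only a sketch: the function name is made up, it assumes GNU stat, and a bind mount of the same file system would not be detected.

```shell
# Succeeds if the directory is on a different device than its parent,
# i.e. it is the mount point of its own file system.
# Assumes GNU stat (-c %d prints the device number).
is_own_filesystem() {
    dir=${1:-/boot}
    [ "$(stat -c %d "$dir")" != "$(stat -c %d "$dir/..")" ]
}
```

For example, `is_own_filesystem /boot && echo "separate file system"` prints the message only when /boot is mounted from its own partition.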

Rahul77 12-03-2014 01:46 PM

First off, JeremyBoden, thanks for your response. I have checked, and we don't see any changes in /boot since we last changed the kernel.

Padeen, it is a separate partition and the file system appears clean; we have run fsck on it a few times. Also, when we stop the databases running on the server and run the backup with /boot included, it completes successfully. The I/O wait is still high then, but it peaks at about 30% and does not reach the 50% we see when the backup includes /boot while the databases are running.
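
For anyone wanting to reproduce figures like the 30% and 50% above, here is a minimal sketch that derives the I/O wait percentage from two samples of the aggregate "cpu" line in /proc/stat. The function names and the 5 second default are assumptions; tools such as iostat or vmstat report the same kind of figure.

```shell
# Percentage of CPU time spent in iowait between two "cpu" lines
# taken from /proc/stat. Columns after the label are:
# user nice system idle iowait irq softirq steal (guest columns ignored).
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0; for (i = 2; i <= 9; i++) total += $i }
        NR == 1 { t0 = total; w0 = $6 }
        NR == 2 {
            dt = total - t0
            if (dt > 0) printf "%.1f\n", 100 * ($6 - w0) / dt
            else print "0.0"
        }'
}

# Take two samples "interval" seconds apart and print the percentage.
sample_iowait() {
    interval=${1:-5}
    before=$(grep '^cpu ' /proc/stat)
    sleep "$interval"
    after=$(grep '^cpu ' /proc/stat)
    iowait_pct "$before" "$after"
}
```

Calling `sample_iowait 5` in a loop while the backup runs would show how the figure moves as the agent switches from / to /boot.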

Hope this gives a clear picture. And thanks for responding.

We have decided to run a full backup every time we perform a kernel update. That should at least keep the backup of the kernel current.

However, the real problem is the high spike in I/O wait when the backup runs with the /boot file system included.

It would be really helpful if anyone with insights could offer advice.

Thanks
R

rayfward 12-05-2014 12:39 AM

Hi R
Backups are very intensive CPU users. I'm assuming you're using a DLT drive? The watchdog man page mentions that a CPU crisis will cause a reboot. top only reports the average. You can disable or suspend the watchdog and let the backup finish.
See the watchdog man page: "The watchdog daemon can be stopped without causing a reboot if the device /dev/watchdog is closed correctly, unless your kernel is compiled with the CONFIG_WATCHDOG_NOWAYOUT option enabled."
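
Note that the quote above is about the userspace watchdog daemon, while the original post mentions the NMI watchdog, which is a kernel feature. Assuming it is the latter, a quick sketch for inspecting the relevant kernel settings could look like this; the helper name is made up, and the two /proc/sys paths are the standard locations of the kernel.nmi_watchdog and kernel.panic sysctls.

```shell
# Print "path = value" for a kernel setting exposed under /proc/sys,
# or say so if this kernel does not expose it.
show_setting() {
    file=$1
    if [ -r "$file" ]; then
        printf '%s = %s\n' "$file" "$(cat "$file")"
    else
        printf '%s is not available\n' "$file"
    fi
}

# 0 = NMI watchdog disabled, non-zero = enabled
show_setting /proc/sys/kernel/nmi_watchdog
# seconds the kernel waits before rebooting after a panic; 0 = no reboot
show_setting /proc/sys/kernel/panic
```

Both values can be changed at runtime with sysctl -w as root, but that only hides the symptom while the cause of the I/O wait is investigated.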

It is unlikely to be file system corruption.
I would consider whether:
The tapes are damaged.
The tapes are out of space.
The tapes are being used at double density (Double density doesn’t !)
All of these need CPU time.
In the end the watchdog needs CPU too, and if it causes a reboot the backup may not be complete. This would render the backup useless.
Incremental backups:
These are fine if you have lots of single files, but bad for databases. Whatever databases you are backing up, an incremental backup will copy only the files that have changed, not the whole package. The first time you implement your DR plan you will find all the flaws, leading to some very late nights.

Regards
Ray


All times are GMT -5.