Old 07-22-2016, 08:03 AM   #1
PastulioLive
Member
 
How do I test if mysql dump succeeds while piping it to gzip


Hey everybody,

Small question that may have a very simple answer, but Google has failed me (I'm probably not using the correct search terms).

I am writing a script that dumps a MySQL database and pipes it into gzip.
The problem is that I want to test whether the dump was successful, but the piped gzip command always returns success.

Here is the test command I'm using:
Code:
if [ `ssh root@${SERVER} "mysqldump --all-databases" 2>> ${LOG_FILE} | gzip > ${DESTINATION_PATH}/mysql/mysql_$(date +%F).gz` ]
then
  echo "ERROR: Database dump failed!" | tee -a ${LOG_FILE}
  ((ERRORS++))
fi
I've switched the "${SERVER}" variable to a server that does not exist.
The log file does not show any error, and the gzip file contains just the letter "W".

Does anybody have any advice on how best to solve this?
(I also could try checking the filesize after the dump, but then I still don't know if the database dump is complete)

Thanks!
 
Old 07-22-2016, 09:47 AM   #2
gda
Member
 
You could use the internal bash array variable $PIPESTATUS, which contains the exit statuses of the last executed foreground pipeline:

Code:
ssh root@${SERVER} "mysqldump --all-databases" 2>> ${LOG_FILE} | gzip > ${DESTINATION_PATH}/mysql/mysql_$(date +%F).gz
if [ ${PIPESTATUS[0]} -ne 0 ]; then
  echo "ERROR: Database dump failed!" | tee -a ${LOG_FILE}
  ((ERRORS++))
fi
Or even better, check both commands in the pipeline (bash overwrites $PIPESTATUS as soon as the next command runs, which is why it is copied into a buffer first):
Code:
ssh root@${SERVER} "mysqldump --all-databases" 2>> ${LOG_FILE} | gzip > ${DESTINATION_PATH}/mysql/mysql_$(date +%F).gz 2>> ${LOG_FILE}
buffer=( ${PIPESTATUS[*]} )
if [ ${buffer[0]} -ne 0 ] || [ ${buffer[1]} -ne 0 ]; then
  echo "ERROR: Database dump failed!" | tee -a ${LOG_FILE}
  ((ERRORS++))
fi

Last edited by gda; 07-22-2016 at 10:03 AM.
 
1 member found this post helpful.
Old 07-22-2016, 12:41 PM   #3
rknichols
Senior Member
 
You can also use the pipefail option ("set -o pipefail") to get a failure return from a pipe if any command in the pipe fails.
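For example, a minimal sketch reusing the variables from the original script (untested; with pipefail, the pipeline's exit status is that of the rightmost failing command instead of just gzip's):
Code:
#!/bin/bash
set -o pipefail

# The if now fails when either ssh/mysqldump or gzip fails,
# not only when gzip does.
if ! ssh root@${SERVER} "mysqldump --all-databases" 2>> ${LOG_FILE} \
     | gzip > ${DESTINATION_PATH}/mysql/mysql_$(date +%F).gz
then
  echo "ERROR: Database dump failed!" | tee -a ${LOG_FILE}
  ((ERRORS++))
fi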
 
Old 07-22-2016, 01:10 PM   #4
TB0ne
LQ Guru
 
Personally...I trust none of those methods, but I'm old-school. The only way I'm EVER happy with a backup is if I have tested it, and successfully restored it. I've seen too many gremlins over the years, where you get ZERO error messages, but can't seem to get things working again because <reasons>.

I'd not trust it unless I was able to import that dump file somewhere else, pull up the database, and verify it worked.
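For what it's worth, even a crude restore pass is cheap to script. A rough sketch (the host and file names here are placeholders, not anything from the original script):
Code:
# first make sure the archive itself is structurally intact
gzip -t /backups/mysql/mysql_2016-07-22.gz || echo "corrupt archive"

# then load it into a scratch MySQL instance and poke around
gunzip -c /backups/mysql/mysql_2016-07-22.gz | mysql -h testbox -u root -p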
 
Old 07-22-2016, 01:43 PM   #5
gda
Member
 
You are completely right: a zero return value does not necessarily mean that everything went fine (even if, in my experience, that is unusual). At the same time, there are not many other ways for a script to check for errors programmatically. The checks you mentioned are typically done manually...

The use of pipefail is also valid, but has the disadvantage of not telling you which command in the pipe failed.
 
Old 07-22-2016, 01:52 PM   #6
TB0ne
LQ Guru
 
Quote:
Originally Posted by gda View Post
You are completely right: a zero return value does not necessarily mean that everything went fine (even if, in my experience, that is unusual). At the same time, there are not many other ways for a script to check for errors programmatically. The checks you mentioned are typically done manually...

The use of pipefail is also valid, but has the disadvantage of not telling you which command in the pipe failed.
Absolutely...you SHOULD get your script to check things, and take every advantage you can get. But there is NEVER a substitute for doing a real test.
 
Old 07-25-2016, 02:07 AM   #7
PastulioLive
Member
 
Original Poster
Thanks for the replies, guys!

I am also very skeptical when it comes to backups, and I realise that the only way to truly test it is to restore the backup.
We unfortunately don't have the time or resources to try a restore on a regular basis (even though we really should).
So I am just looking to catch quick and obvious errors (unable to connect to host, disk full, etc.) so I am notified if the backup fails.

I'm hoping that in the future, we will have more time to get it done properly, but I imagine that is only done by automating a backup/restore operation in a test-lab?

I think the pipefail option should do for now (since I will want to check the backup if any of the commands in the pipe fails)

Thanks!
 
Old 07-25-2016, 08:59 AM   #8
TB0ne
LQ Guru
 
Quote:
Originally Posted by PastulioLive View Post
Thanks for the replies, guys!

I am also very skeptical when it comes to backups, and I realise that the only way to truly test it is to restore the backup. We unfortunately don't have the time or resources to try a restore on a regular basis (even though we really should). So I am just looking to catch quick and obvious errors (unable to connect to host, disk full, etc.) so I am notified if the backup fails.

I'm hoping that in the future, we will have more time to get it done properly, but I imagine that is only done by automating a backup/restore operation in a test-lab? I think the pipefail option should do for now (since I will want to check the backup if any of the commands in the pipe fails)
No, the pipefail is a great thing to do, and you don't have to test EVERY backup...but if you take a little time and do one every month or so, you'll at least feel a bit better. I wouldn't automate the restore, but that's just me. In the event of a failure, you're probably going to do the restore manually. If it was ME, I would:
  • Document the ENTIRE PROCESS, with pictures and screen-shots. Put this into PDF form, in a knowledgebase online (if you have one), and in a hard-copy binder somewhere. You want anyone with at least one good eye and the ability to read, to be able to restore system functionality. Single points of failure aren't ever good...even if they're a PERSON. If you're the only one who knows how to do a restore, and it dies when you're on vacation...guess what?
  • Run some basic queries on the database...pull out records from random tables, run a stored-procedure or two (if you have them), and make sure you get sane data (see the sketch below).
  • Ask the end-user to verify this data.
I'd only do this periodically...chances are, if you have one solid test, you can do spot-checks from then on.
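A spot-check along those lines might look like this (the database, table, and procedure names are placeholders):
Code:
# eyeball a handful of random rows
mysql -u root -p mydb -e "SELECT * FROM some_table ORDER BY RAND() LIMIT 5;"

# exercise a stored procedure, if the schema has any
mysql -u root -p mydb -e "CALL some_procedure();"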
 
Old 07-26-2016, 05:16 AM   #9
rknichols
Senior Member
 
At some point with every new system or new backup scheme I do a "bare metal restore" test. I take the disk drive out of the machine, replace it with an empty drive of similar size, and see what it takes to get the system up and running by booting from a live cd and restoring from my backups. There are typically some "interesting" moments along the way. (Just how was that disk partitioned? What was the LVM structure? What options were used in making those filesystems? How can I get the same UUIDs back so I don't have to edit configuration files? How many of those files that I don't back up are more annoying to lose than I expected? ...)
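Capturing that information ahead of time takes some of the surprise out of the test. A sketch (device and path names are assumptions; adjust for the machine):
Code:
# snapshot the partition table, LVM metadata, and filesystem UUIDs
mkdir -p /root/layout
sfdisk -d /dev/sda > /root/layout/sda.partitions
vgcfgbackup -f /root/layout/vg-%s.cfg
blkid > /root/layout/blkid.txt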
 
Old 07-26-2016, 07:26 AM   #10
TB0ne
LQ Guru
 
Quote:
Originally Posted by rknichols View Post
At some point with every new system or new backup scheme I do a "bare metal restore" test. I take the disk drive out of the machine, replace it with an empty drive of similar size, and see what it takes to get the system up and running by booting from a live cd and restoring from my backups. There are typically some "interesting" moments along the way. (Just how was that disk partitioned? What was the LVM structure? What options were used in making those filesystems? How can I get the same UUIDs back so I don't have to edit configuration files? How many of those files that I don't back up are more annoying to lose than I expected? ...)
Yes...all of this, with copious documentation. That's the ONLY way you're going to know what's up, and what it takes to get things back, truly.

I worked somewhere years ago, where their 'disaster recovery' tests consisted of:
  • Taking the senior team members who installed everything to start with off site
  • Having THEM rebuild everything
  • Patting themselves on the back because stuff worked.
We pointed out each time, that in the event of a DISASTER (the "D" in "DR Test"), the senior people probably would NOT be available because...you know....disaster? They'd never let us build a documentation set or test anything in a lab at the office either, because we 'didn't have time'. Amazingly, when folks began quitting, those 'tests' began failing.
 
Old 07-27-2016, 08:00 AM   #11
PastulioLive
Member
 
Original Poster
Quote:
Originally Posted by TB0ne View Post
No, the pipefail is a great thing to do, and you don't have to test EVERY backup...but if you take a little time and do one every month or so, you'll at least feel a bit better. I wouldn't automate the restore, but that's just me. In the event of a failure, you're probably going to do the restore manually. If it was ME, I would:
  • Document the ENTIRE PROCESS, with pictures and screen-shots. Put this into PDF form, in a knowledgebase online (if you have one), and in a hard-copy binder somewhere. You want anyone with at least one good eye and the ability to read, to be able to restore system functionality. Single points of failure aren't ever good...even if they're a PERSON. If you're the only one who knows how to do a restore, and it dies when you're on vacation...guess what?
  • Run some basic queries on the database...pull out records from random tables, run a stored-procedure or two (if you have them), and make sure you get sane data.
  • Ask the end-user to verify this data.
I'd only do this periodically...chances are, if you have one solid test, you can do spot-checks from then on.
Documentation:
I have a Wiki (FOSWIKI) where I document the process of setting up each service (at least the ones I did not inherit).
It includes everything needed to set up the service from scratch (Linux, packages, dependencies, selinux, firewall, resources (nfs), etc...)
It also lists the people responsible for the service (and for which parts), plus common tasks and issues like forgotten passwords or locking yourself out of the application.
The restore procedure will also go here.
I have not gone so far as to put it in a binder (preferably off-site), but I understand that I should.

Basic Queries:
I find this to be a bit tricky. As a sysadmin, you usually don't really use the service; your end users do.
The application I'm trying to back up here is Phabricator, for instance.
So I guess I could try to find out how a certain SVN commit is stored in the MySQL database and retrieve it.
Doing this for every service we run seems like an awful lot of work though, and the procedure would be entirely different for each of the different services we offer.
Still, I guess that's the only way to be sure.
How do you guys get confident that a restore is working properly?

Ask users to verify the data
+1 to this :-)

Quote:
Originally Posted by TB0ne View Post
Yes...all of this, with copious documentation. That's the ONLY way you're going to know what's up, and what it takes to get things back, truly.

I worked somewhere years ago, where their 'disaster recovery' tests consisted of:
  • Taking the senior team members who installed everything to start with off site
  • Having THEM rebuild everything
  • Patting themselves on the back because stuff worked.
We pointed out each time, that in the event of a DISASTER (the "D" in "DR Test"), the senior people probably would NOT be available because...you know....disaster? They'd never let us build a documentation set or test anything in a lab at the office either, because we 'didn't have time'. Amazingly, when folks began quitting, those 'tests' began failing.
Time is a big issue for me too! I work at an SME; we have about 100 people working here, but only 2 sysadmins.
I am the "Linux guy / IT support guy" (helping our 100 employees with random Windows/application/network/etc. troubleshooting).
I also have to set up new PCs and take care of the networking backend (Cisco switches, firewall, DMZ, VPN, patching, port forwards, etc.).
My colleague is the Senior Sysadmin, but he basically only has time to take care of his Windows environment plus a lot of administrative stuff (licensing and so on).
(All of this doesn't even mention the time we should be taking to read up on security, documentation, etc.)

I just can't imagine being able to do all of this in a 40-hour work-week. How do you guys do it?!

Also, our company uses Jenkins for continuous integration, so downtime is also not something we can really afford, but I don't have the time to look into/set up high availability.
 
Old 07-27-2016, 08:13 AM   #12
TB0ne
LQ Guru
 
Quote:
Originally Posted by PastulioLive View Post
Documentation:
I have a Wiki (FOSWIKI) where I document the process of setting up each service (at least the ones I did not inherit).
It includes everything needed to set up the service from scratch (Linux, packages, dependencies, selinux, firewall, resources (nfs), etc...)
It also lists the people responsible for the service (and for which parts), plus common tasks and issues like forgotten passwords or locking yourself out of the application.
The restore procedure will also go here. I have not gone so far as to put it in a binder (preferably off-site), but I understand that I should.
I use a binder somewhere, because if things *REALLY* go wrong, you at least have that. Web server problems? Network problems? Can't access that site?? The wiki is useless at that point, and you have no idea what's in it. Hard copies NEVER go down, unless you toss them in the shredder.
Quote:
Basic Queries:
I find this to be a bit tricky. As a sysadmin, you usually don't really use the service; your end users do. The application I'm trying to back up here is Phabricator, for instance.
So I guess I could try to find out how a certain SVN commit is stored in the MySQL database and retrieve it. Doing this for every service we run seems like an awful lot of work though, and the procedure would be entirely different for each of the different services we offer. Still, I guess that's the only way to be sure. How do you guys get confident that a restore is working properly?
By doing:
Quote:
Ask users to verify the data
+1 to this :-)
...this. Don't overcomplicate it; if you know you're using SVN/MySQL, then focus on getting those two SERVICES back up. From there, grab an end-user and ask them for five minutes of their time, and get THEM to pull up a record. They probably know how to use their application better than you do.
Quote:
Time is a big issue for me too! I work at an SME; we have about 100 people working here, but only 2 sysadmins. I am the "Linux guy / IT support guy" (helping our 100 employees with random Windows/application/network/etc. troubleshooting). I also have to set up new PCs and take care of the networking backend (Cisco switches, firewall, DMZ, VPN, patching, port forwards, etc.). My colleague is the Senior Sysadmin, but he basically only has time to take care of his Windows environment plus a lot of administrative stuff (licensing and so on). (All of this doesn't even mention the time we should be taking to read up on security, documentation, etc.)
Again, don't overcomplicate it. This is a marathon, not a sprint. Set up a machine somewhere; shove a DVD in, and let it install while you're doing something else. Now you've got a test-bed. Copy your mysqldump to it at some point...take two minutes and SSH into it, and start a copy running if it's huge, and the next day/week/whenever, restore that dump file, and bring up the DB. Look at it. Repeat with any other services you want to test. Spending 15-30 minutes a day doing small things adds up BIG on your testing/documentation procedures.
Quote:
I just can't imagine being able to do all of this in a 40-hour work-week. How do you guys do it?!
Simple...most of us don't work just 40 hours a week, and automate the crap out of things. If I had SVN/SQL tests, I'd sure cron something to pull a backup file onto my test system, AND cron a restore of it too, so my test rig would 'just be there' whenever I wanted it. Let the machines work for you, don't work for them.
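For instance, something like this in the test box's crontab (a sketch only; the host name and paths are made up, and % must be escaped as \% in crontab entries):
Code:
# fetch last night's dump at 06:00, restore it into the test DB at 06:30
0  6 * * * scp backuphost:/backups/mysql/mysql_$(date +\%F).gz /var/tmp/
30 6 * * * gunzip -c /var/tmp/mysql_$(date +\%F).gz | mysql -u root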
Quote:
Also, our company uses Jenkins for continuous integration, so downtime is also not something we can really afford, but I don't have the time to look into/set up high availability.
This is something to beat your boss over the head with, as this makes no sense. Either you CAN afford downtime (because if you can't test/set up/look into things, you are GOING TO HAVE IT), or you cannot (in which case, they need to pony up some $$$ for temps/another employee to take the load off you two). There's no other way.
 
Old 07-27-2016, 08:34 AM   #13
PastulioLive
Member
 
Original Poster
Thanks TB0ne, you are giving me some great advice!

I've been working at this company for 5 years now, but the consultant who ran the Linux and network side left a year or two back, and I'm still trying to catch up with everything this company is running. We provide a ton of services (like 11 Bugzilla instances with slightly different configurations for all the subcontractors/clients we are working with).
Besides that, we have a lot of in-house applications that need to be maintained for our development/production/testing process to operate.
There are also a lot of other things: SVN servers, Git servers, Jenkins, Phabricator, storage/file/FTP servers and shares, webservers, wikis...
The list goes on, as I'm sure it does in a lot of other companies.

I didn't use to work 40-hour weeks either, but since I'm underpaid, I try to keep my after-hours work to a minimum (which, at the end of the day, is just shooting myself in the foot).
I have a pay negotiation coming up soon, so I hope my motivation will return.
(I love the work I do, but if you have to work another job to have a decent standard of living, and have about a 2-3 hour total commute, you are not going to be working on backups/scripts after hours.)

I just feel like I need an extra me to get things done and was wondering if the problem is me or the workload.
The 15 min/day working on tests may indeed get me a long way there, so I'll definitely try that instead of looking at it as a "Project" that I need to take a whole day for.


I'm sorry for going so far off topic, but it's interesting to get your point of view. These things may be better suited to another thread.
 
  

