I have a 500 GB system hard disk that I will mount at / and use as my main partition. It also has a smaller partition for swap. Then I have four 1 TB hard drives. I have a website that is linked to a database via PHP. So far I have about 400 GB of content on the site, which grows daily, though slowly. Growth will probably slow down in a year or so, so I don't feel a need to have more than 2 TB in the long run. Also, I can only fit 5 hard drives in my computer, so I do not want to add any more. So far I can think of two options:
Option 1: Set up two of the 1 TB hard drives as a single LVM volume. Then I can use one or both of the other 1 TB hard drives for backup.
I have a question here. If I do this, I will probably set up a cron job to back up weekly, maybe monthly. It's not an active site. From testing, a backup tgz file only compresses about 15% with the type of content I use, which is mostly binary. Also, is there any difference between a tar.gz file and a tgz file in terms of compression? If I made a backup right now, the file would be about 370 GB.
So this setup would definitely require space, and I see a flaw. Because the backup would be so big, each new backup would just overwrite the previous one. What if my production data gets blown away and the backup.tgz file is corrupt? I understand that the bigger the tgz file is, the higher the chance that it will be corrupt. I won't be transferring the tgz file; I will just back up to one of those 1 TB hard drives. However, I've seen times when just the process of creating a tgz file ends in the dreaded 'unexpected EOF reached.'
I've tried using 'tar -tvv filename.tgz', which is supposed to test the integrity of the file, but whenever I run this, the process always seems to freeze. Also, I've heard about bzip2recover. Does anybody have any experience with it?
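
On the integrity question, here is a minimal sketch of one way to test an archive, offered as an illustration rather than as what the poster ran. It streams the gzip layer to the end and reports whether it decompresses cleanly. The function name `verify_gzip` is hypothetical. Two notes that are safe to state: `.tgz` and `.tar.gz` are the same format under two names, so compression is identical; and this check covers the gzip stream only, not the tar structure inside it. As an assumption about the freeze described above: `tar -tvv filename.tgz` has no `-f`, so tar may be reading from its default archive (often standard input) rather than from the named file.

```python
import gzip
import zlib


def verify_gzip(path, chunk_size=1024 * 1024):
    """Return True if the gzip stream at `path` decompresses cleanly to the end."""
    try:
        with gzip.open(path, "rb") as stream:
            # Read and discard in chunks so a very large archive never sits in memory.
            while stream.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        # EOFError: the stream was truncated.
        # OSError (includes gzip.BadGzipFile): bad header or failed CRC check.
        # zlib.error: corrupt compressed data.
        return False
    return True
```

On an archive of several hundred GB this has to read every byte, so it will take a long time even when nothing is wrong.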
So yes, this is one viable option. I don't lose any space but I risk being left with a backup tgz file that could be corrupt.
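
One way to reduce the risk of overwriting the only good backup with a bad one is to write the new archive under a temporary name and swap it in only after it passes a check. The sketch below assumes that approach; `promote_backup` and its `verify` argument are hypothetical names, and `verify` can be any function that returns True for a healthy archive. It needs room for two copies on the backup disk, which two archives of about 370 GB would still fit on a 1 TB drive.

```python
import os


def promote_backup(new_path, current_path, verify):
    """Replace `current_path` with `new_path` only if `verify(new_path)` passes."""
    if not verify(new_path):
        # Keep the previous backup untouched and discard the failed attempt.
        os.remove(new_path)
        return False
    # os.replace overwrites the destination; on the same filesystem it is atomic.
    os.replace(new_path, current_path)
    return True
```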
Option 2: Set up two of the 1 TB hard drives as a mirrored RAID. Then set up the other two 1 TB hard drives as a second mirrored RAID. Then merge the two mirrors into a striped RAID. I think they call this RAID 10 because it's a combination of the two. I like this, but at the same time I lose 2 TB of space. Also, does anybody know the procedure if I lose one hard disk? What programs do I use to insert a brand new 1 TB hard drive into this RAID 10 array?
Another big flaw is that if both hard drives in the same mirror fail, I lose everything. So I'm not liking this idea.
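
To put a number on that flaw, the following sketch enumerates every way two of the four disks can fail at once and counts how many of those cases destroy the array. The disk labels are hypothetical, and the model is simplified: it treats all disks as equally likely to fail and ignores rebuild time.

```python
from itertools import combinations

# Hypothetical disk labels: two mirrored pairs striped together (RAID 10).
MIRRORS = [("a", "b"), ("c", "d")]


def array_survives(failed, mirrors=MIRRORS):
    """The stripe survives as long as every mirror keeps at least one working disk."""
    return all(any(disk not in failed for disk in pair) for pair in mirrors)


disks = [disk for pair in MIRRORS for disk in pair]
two_disk_failures = list(combinations(disks, 2))
fatal = [f for f in two_disk_failures if not array_survives(set(f))]

print(f"{len(fatal)} of {len(two_disk_failures)} two-disk failures are fatal")
# prints "2 of 6 two-disk failures are fatal"
```

So under these assumptions the array survives any single failure and four of the six possible double failures.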
What do you guys think?
I'm thinking Option 1, making one backup on one 1 TB hard drive and then another on the other 1 TB hard drive. That way I improve my chances of having a good copy. However, once my storage goes above 1 TB, which it will, I can no longer do double backups. I would then merge the other two 1 TB hard drives into another LVM volume and make just one backup that hopefully won't be corrupt.
Anyhow, I'm excited to hear what you guys think. The data is very valuable to me so I want to do as much as I can to minimize the risk of loss.
No advice regarding your options as I don't know what I would use in your scenario.
If data is very valuable, I would not use onsite backup only. Use a couple of external HDs and store offsite as well (bank, at work if the location differs from where the machine is); cycle them.
Also consider incremental backups instead of full backups all the time; this can save you considerable space, especially if the content does not change significantly. For example, do a full backup once a month and, in the other weeks of the month, back up only the new or changed files.
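
As a rough sketch of the incremental idea, the code below archives only files modified after a given timestamp. The function names are hypothetical, and selecting by modification time is an assumption made for illustration; it does not record deleted files, and the archive should be written outside the directory being backed up.

```python
import os
import tarfile


def changed_since(root, last_backup_time):
    """Yield paths under `root` modified after `last_backup_time` (seconds since epoch)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_backup_time:
                yield path


def write_incremental(root, last_backup_time, archive_path):
    """Write the changed files to a gzip-compressed tar archive; return the file count."""
    count = 0
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in changed_since(root, last_backup_time):
            # Store paths relative to `root` so the archive restores cleanly elsewhere.
            archive.add(path, arcname=os.path.relpath(path, root))
            count += 1
    return count
```

A weekly cron job could call `write_incremental` with the time of the last full backup, while the monthly job archives everything.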
The path you follow here depends on what your end goal is and whether you have a bit of money to spend in addition to what you already have.
To me, what the previous two posters have mentioned is paramount: you need to ship your backups off site somewhere, either by purchasing a secondary storage device for yourself or by buying some hosting space somewhere to store your backups.
As for the Option 1 shortcoming of a backup being corrupted: if you're using an Oracle database, this is easy to counter, since Oracle uses redo logs and RMAN, which can allow you to recover to a specific point in time. I'm not sure whether MySQL or other relational databases have similar features.
As for Option 2, instead of doing RAID 1+0 with your 4 disks, what about RAID 5? That way you only lose 1 disk's worth of space and thus have more data space available. The difference would be that with RAID 5 you cannot lose more than one disk without losing all your data.
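
The space trade-off can be worked out with a small calculation. This is a simplified sketch that assumes equal-sized disks and ignores filesystem and metadata overhead; `usable_tb` is a hypothetical helper name.

```python
def usable_tb(level, disks, size_tb):
    """Usable capacity in TB for an array of equal-sized disks (simplified model)."""
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        # Half of the disks hold mirror copies.
        return disks // 2 * size_tb
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of space goes to parity.
        return (disks - 1) * size_tb
    raise ValueError(f"unknown level: {level}")


print(usable_tb("raid10", 4, 1))  # prints 2
print(usable_tb("raid5", 4, 1))   # prints 3
```

With four 1 TB disks, RAID 5 yields 3 TB against 2 TB for RAID 10, at the cost that any second disk failure is fatal.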
In my opinion you should go for a combination of Option 1 and Option 2. Use all 4 drives for your database in a RAID 5 configuration. Ship your backups off site and keep, say, up to 7 days' worth to allow recovery within a wider window.
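
A retention window like this can be expressed as a small rule for deciding which backups to delete. The sketch below assumes one backup per day identified by its date; `expired` is a hypothetical helper, and the 7-day default simply mirrors the figure suggested above.

```python
from datetime import date, timedelta


def expired(backup_dates, today, keep_days=7):
    """Return, oldest first, the backup dates that fall outside the retention window."""
    # With keep_days=7, today and the six days before it are kept.
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in backup_dates if d <= cutoff)
```

A cleanup job would delete the archives for the returned dates and leave the rest in place.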
If you have cash to spend you may want to consider a small storage solution (HP MSA2000 or EMC Clariion AX or similar) which will cost a bit but be much more robust to preserve your data.
I'm thinking about going with LVM; that way I can always add another hard disk. Then I will make incremental backups of modified or new files and a full monthly backup, both to one of those internal 1 TB hard disks, which are separate from the main LVM, and to a hosted backup. Thanks for all the info, guys.