[SOLVED] Software raid between local drive and network drive
Linux - Networking: This forum is for any issue related to networks or networking.
Routing, network cards, OSI, etc. Anything is fair game.
What is “network drive” in your definition? With iSCSI devices it should work. Do you want to mirror data for safety reasons or what’s the goal behind it?
Yes, this is just for safety reasons. By network drive I mean any drive accessed over TCP/IP, even Dropbox cloud storage.
Disk performance does not matter too much because this is for a PC devoted to reading email with Thunderbird. But the local disk is old and made a mess of the NTFS filesystem the other day, resulting in the loss of some email directories, so some redundancy would be nice.
If this RAID is possible, striping with it might even beat the local drive performance-wise, considering the local drive is an ancient ATA/33 IDE drive.
RAID is not the right solution when one of the drives is not local (removable USB, network, etc). That's not what RAID is built for, and you will have a nightmare of a time trying to set it up and maintain it without the RAID controller constantly freaking out and behaving unpredictably.
RAID 1 between two partitions on the same drive also isn't a very good idea. If you're losing sectors, that drive is going to be trash very soon (the entire drive, not just one partition).
Bottom line is RAID (any form of RAID) is not the answer when you already have a failing drive. What you want is a strong backup system, and a replacement drive ready to go when your current one fails, which it will, soon. Ideally you would ghost your current drive onto a new one and swap it in immediately.
Last edited by suicidaleggroll; 01-30-2012 at 09:55 AM.
Then why is one of Sun's filesystems designed to be fault tolerant, with plenty of redundancy, even on a single drive? That's driven by the same principle as RAID.
Things are a little more complicated here; I haven't said what the exact situation is, but redundancy across network storage can't possibly be a new idea.
No such thing as automatic duplication of data across servers in cloud computing? Hard to believe.
No such thing because of a "nightmare of a time trying to set it up and maintain it without the RAID controller constantly freaking out and behaving unpredictably"? Who said anything about a RAID controller anyway? This is software RAID, with as much fault tolerance as is required.
You can't be fault tolerant on a single drive when it's RAIDed with itself... you can correct for small errors here and there, but as soon as a drive starts to lose sectors, it spirals out of control quickly. Once the drive goes kaput, it's gone, and you'll lose both halves of the mirror simultaneously, which means your RAID is useless.
I maintain 14 different RAID arrays on a daily basis (RAID 1, 5, 6, and 10, both software and hardware), and have for the last 6 years. I can count on zero hands the number of times a RAID 1 array using multiple partitions on a single drive would have been useful on any of my systems.
There are many ways of keeping data duplicated across multiple filesystems, including over the network, without using RAID. RAID is designed to be used with fixed storage, fixed device names, where no member will ever be removed except in the event of a fault, for replacement.
"RAID controller" doesn't necessarily mean a hardware card, it could just as easily mean the software application doing the controlling.
Last edited by suicidaleggroll; 01-30-2012 at 12:29 PM.
Can I suggest that you read what is posted and ponder the implications before replying? The filesystem I mentioned is called ZFS, I recall now, and a Google search reveals that it can be mounted with copies=2 (or copies=3), whereby it maintains 2 (or 3) copies of everything on a single disk. The purpose of this is not to keep disks forever, of course, but to prevent loss of data until a disk with bad sectors is replaced, as you said. That is equivalent to the RAID 1 array using multiple partitions on a single drive that would have been useful zero times on any of your systems, but plenty of times on Sun's systems and for millions of their customers.
If RAID is useful in the twilight of a disk before it is replaced, a network version of it makes sense, and there's nothing to stop its software developer from calling it network RAID (redundant array of independent disks).
Crucially, it is not the name that matters; the question is: what software does it?
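For reference, the ZFS copies property mentioned above is set per dataset. A minimal sketch, assuming a pool named tank and a dataset for mail (both names hypothetical); this obviously needs a real ZFS pool:

```shell
# Keep two copies of every block in this dataset; ZFS reads the surviving
# copy whenever one fails its checksum (pool/dataset names are examples).
zfs create -o copies=2 tank/mail

# Or enable it on an existing dataset; only newly written data is affected.
zfs set copies=2 tank/mail
```

Note that copies=2 protects against bad sectors and checksum errors, not against losing the whole disk.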
I don't know about enterprisey stuff, but you could use Linux's NBD (network block device) to gain access to a remote hard drive, then set up software RAID the usual way (mdadm), e.g. RAID 1 on top of /dev/sdb and /dev/nbd0.
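A rough sketch of that NBD route, assuming a server exporting an image file on the default NBD port and a (hypothetical) local partition /dev/sdb1; all of this needs root and the nbd kernel module:

```shell
# On the remote machine: export a spare disk or image file over NBD.
nbd-server 10809 /srv/exports/mirror.img

# On the mail PC: attach the export as a local block device...
modprobe nbd
nbd-client server.example.com 10809 /dev/nbd0

# ...then mirror it with the local partition the usual mdadm way.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/nbd0
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt/mail
```

Be warned that a network hiccup will kick /dev/nbd0 out of the array; a write-intent bitmap (mdadm --grow --bitmap=internal /dev/md0) at least makes the subsequent re-add and resync fast, which speaks to the maintenance headaches mentioned earlier in the thread.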
I'm simply trying to explain that you're searching for the wrong terms. You want redundancy, you don't want RAID. RAID is a very specific redundancy implementation, one that is not well suited for networked or removable drives. Like I said before, "There are many ways of keeping data duplicated across multiple filesystems, including over the network, without using RAID. RAID is designed to be used with fixed storage, fixed device names, where no member will ever be removed except in the event of a fault, for replacement." You're getting too caught up in what the acronym "RAID" stands for, and losing sight of the fact that RAID is a specific version of redundancy, one that does not fit with your setup.
It would be like if somebody wanted to share a networked drive with a windows box. NFS stands for "Network File System", sounds perfect, but NFS is a specific implementation of a network file system, one that does not work with Windows. Even though the acronym "NFS" stands for exactly what they want, NFS won't actually work for them, they would have to use something like Samba instead.
I'm just trying to get you off of the term "RAID", since searching for different versions of RAID, ways of implementing RAID, ways of using RAID, etc., will just end up being a huge waste of time. It won't work for you, at least not in the way you want. You should be looking for ways of duplicating a filesystem onto a remote drive, such as regular ghosting/backups, etc. You could even do something as simple as creating an image of your drive, then doing nightly rsyncs to keep track of changes in the filesystem. If you start to lose files you copy them back over from the daily rsync, and if you lose the entire drive you restore the image you made onto a new drive, then update the files from your latest rsync.
Last edited by suicidaleggroll; 01-31-2012 at 10:29 AM.
Other people might be searching for RAID-inspired setups over networks right now and landing on this thread; it's already on Google. If someone is stuck on terminology and consequently failing to answer the question, guess who that is.
I get the impression that a local copy on one and the same disk is mainly meant to handle corruption in the data path while storing/retrieving the information.
And, as they say: "...while the device is still largely operational." To me it's just personal taste whether you judge that sufficient for your specific requirement.
I could imagine that once you have some abrasion flying around inside a disk it will die soon, damaging one block after another, while on a flash device one broken block doesn't affect the others.
BTW, I'm not used to ZFS: does it give any output in that case, something like "Attention: could only read by accessing a copy"?
ZFS's extra copies are superfluous if RAID is used underneath. They were probably added because Sun's drives were too expensive to throw away at the first sign of aging. Maybe this is still true today for high-end server drives.
In the blog it's described differently: you use a RAID, but the data got corrupted on its way to the RAID controller, so the RAID stores only the wrong information. Later, while reading, ZFS can detect this thanks to the checksum; the transfer of the copy went fine, so you can still access the data.
Check out MirrorFolder (http://www.techsoftpl.com). It does software RAID 1 to a network drive. Here is their description of how it works:
Real-time mirroring is implemented in a file system filter driver that performs RAID-1 type of mirroring in software on per file basis. Like a software-only RAID-1 system, it duplicates individual file I/O requests in memory and sends them to both source and mirror devices. That means the same data is written to both source and mirror files at the same time whenever there is any change in the source file. Other file operations like move, rename, delete, attribute change, etc., are also performed in the mirror folder simultaneously with the source folder. So, at any point of time the content of the source and mirror folders remains identical.
For example, if you have a large database file in your source folder and modify only one record in that file using the application interface of that database, the driver included with MirrorFolder will modify only that same record in the mirror database file simultaneously, and will NOT copy the entire source database file to its mirror.