LinuxQuestions.org

guanx 10-02-2012 08:34 PM

Union filesystem with read-only branch on top?
 
Hello,

I tried to find a way to unify two filesystems, one read-only and one writable. The writable filesystem is complete but very slow, while the read-only filesystem is much faster but some files may be missing or not up to date.

I have read about union filesystems, but every setup I found stacks an RW filesystem on top of an RO one. So when a file exists on both filesystems, it is read from the RW fs, which is the slow one.

Instead of this, I would like to stack an RO fs on an RW fs so that
  • when reading a file which exists on both the RO and the RW branch, the newer file will be read;
  • when writing, data goes into the RW fs.

How to achieve this goal?
Thanks!

dru8274 10-03-2012 07:18 AM

The main union filesystems for Linux are unionfs and aufs (Another Union File System). I have only used the latter, so that is the only one I can comment on...

The common scenario is for an aufs filesystem comprised of two branches, with RW as the topmost branch, and RO as the bottom. And you desire a read behavior where...
Quote:

Originally Posted by guanx (Post 4795543)
  • when reading a file which exists on both the RO and the RW branch, the newer file will be read;

But having looked at the aufs manpage, it won't do that. The ages of files on different branches are never compared for a read operation; it just isn't that intelligent. Read priority always goes to the copy on the topmost branch that has the file. So reading on aufs will sometimes give you the older copy :-(
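To make that concrete, here is a minimal Python model of a "topmost branch wins" lookup. It is only a sketch of the behaviour described above, not aufs code: `resolve_read` and `demo` are made-up names, and things like whiteouts are ignored entirely.

```python
import os
import tempfile


def resolve_read(branches, relpath):
    """Return the path a first-match union lookup would serve, or None.

    `branches` is ordered top to bottom. Modification times are never
    consulted, which is the whole point of this model.
    """
    for branch in branches:
        candidate = os.path.join(branch, relpath)
        if os.path.lexists(candidate):
            return candidate
    return None


def demo():
    """Put an older copy on the top branch and a newer one on the bottom."""
    with tempfile.TemporaryDirectory() as top, \
            tempfile.TemporaryDirectory() as bottom:
        for branch, text in ((top, "old"), (bottom, "new")):
            with open(os.path.join(branch, "a.txt"), "w") as f:
                f.write(text)
        # Force the top copy to be clearly older than the bottom copy.
        os.utime(os.path.join(top, "a.txt"), (1_000_000_000, 1_000_000_000))
        os.utime(os.path.join(bottom, "a.txt"), (2_000_000_000, 2_000_000_000))
        with open(resolve_read([top, bottom], "a.txt")) as f:
            return f.read()
```

In this model `demo()` hands back the content of the top copy even though the bottom copy has the later timestamp.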

You have mentioned perhaps mounting RO as the top branch, with RW as the bottom. But even if you do that, the aufs will always read a file from the RO branch, even if a newer version exists on the RW branch. So once again, an older file is read - not the behavior that you want.

Returning to the scenario where the top branch is RW... You might write a script to prune your RW filesystem of those extra older files. Once they are removed, read operations on the aufs will always return the newest file, and any new files will always be written to the top (RW) branch, which is exactly the read/write behaviour that you want. As for those older files on RW, you can archive them as a dated tar or squashfs file, or store them somewhere entirely separate from the aufs. Or, since those older files are now obsolete, perhaps you don't need them anymore?
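A pruning script along those lines could look roughly like this in Python. It is a sketch under assumptions: "older" is judged by modification time only, the function names and the three directory arguments are placeholders of mine, and I would run it against the branch directories themselves while the union is not mounted.

```python
import os
import shutil


def find_stale(rw_root, ro_root):
    """List relative paths whose RW copy is older than the RO copy."""
    stale = []
    for dirpath, _dirnames, filenames in os.walk(rw_root):
        for name in filenames:
            rw_path = os.path.join(dirpath, name)
            rel = os.path.relpath(rw_path, rw_root)
            ro_path = os.path.join(ro_root, rel)
            # Only files present on both branches are candidates.
            if (os.path.isfile(ro_path)
                    and os.path.getmtime(rw_path) < os.path.getmtime(ro_path)):
                stale.append(rel)
    return sorted(stale)


def archive_stale(rw_root, ro_root, archive_root):
    """Move stale RW files under archive_root, keeping relative paths."""
    moved = find_stale(rw_root, ro_root)
    for rel in moved:
        dest = os.path.join(archive_root, rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.move(os.path.join(rw_root, rel), dest)
    return moved
```

Calling `find_stale` first gives a dry run, so you can check the list before `archive_stale` actually moves anything.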

So an aufs can work perfectly for you, but only if those older files on the RW filesystem are somehow set aside and out of sight. Good luck with that.

Happy with ur solution... then tick "yes" and mark as Solved!

guanx 10-03-2012 08:18 AM

Hi dru8274, thanks for your reply!

Probably I did not make everything clear in my previous post.

My thought is to maintain a complete, up-to-date RW filesystem, to which I can fall back without the help of the RO filesystem. The RO filesystem is only there to accelerate file reads. Therefore some files that exist on both will have an older version on the RO branch, particularly after they have been written to.

dru8274 10-03-2012 09:27 AM

Quote:

Originally Posted by guanx (Post 4795990)
My thought is to maintain a complete, up-to-date RW filesystem, to which I can fall back without the help of the RO filesystem. The RO filesystem is only there to accelerate file reads. Therefore some files that exist on both will have an older version on the RO branch, particularly after they have been written to.

Okay, I understand: you want to speed up the overall system by reading from RO whenever possible. But if you use a simple two-branch aufs scheme, then you simply cannot avoid sometimes reading an older file from RO.

I'm considering a possible solution, but again it depends on your specifics, so let me make some educated guesses here. I do not know when or how your RO is created. But suppose that initially the files in RO are all newer than the files in RW, that some new files are then written into RW, and that as a result some files in RO that were the newest become the older copies. If your situation works like that, then ...

An aufs that always reads the newest file would require 3 branches: one RO branch and two RW branches. The two RW branches might be on two separate slow partitions, or just two separate folders within one slow partition. The aufs branches, top to bottom, would be...

RW1 -- contains only newly written files, but on a slow filesystem. (Top)
RO -- contains some files, on a fast filesystem.
RW2 -- contains the bulk of files, but on a slow filesystem. (Bottom)

With this arrangement, aufs will always read the newest file. RW1 is initially empty, and only newly written files will be written there. No files are ever written to RW2, as that would sometimes lead to older files being read. Overall, the use of RO for reading files is maximized, which makes for faster reads on average.
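Here is a toy Python model of that three-branch layout, just to check the reasoning. It assumes exactly the two rules stated above (reads take the first copy found from top to bottom, and every write lands in RW1). How a real aufs mount picks the branch for a write depends on its configuration, which this sketch does not capture. All names in it are made up for illustration.

```python
import os
import tempfile


def union_read(branches, relpath):
    """Serve the first copy found, scanning branches top to bottom."""
    for branch in branches:
        path = os.path.join(branch, relpath)
        if os.path.isfile(path):
            with open(path) as f:
                return f.read()
    raise FileNotFoundError(relpath)


def union_write(top_rw, relpath, text):
    """Rule assumed by the scheme: every write goes to the top branch."""
    path = os.path.join(top_rw, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


def demo():
    """Build RW1 / RO / RW2, then read, write and read again."""
    with tempfile.TemporaryDirectory() as base:
        rw1, ro, rw2 = (os.path.join(base, n) for n in ("rw1", "ro", "rw2"))
        for branch in (rw1, ro, rw2):
            os.makedirs(branch)
        # RW2 holds everything; RO holds a fast copy of one file.
        for branch, name, text in ((rw2, "cached.txt", "v1-rw2"),
                                   (ro, "cached.txt", "v1-ro"),
                                   (rw2, "bulk.txt", "bulk-rw2")):
            with open(os.path.join(branch, name), "w") as f:
                f.write(text)
        branches = [rw1, ro, rw2]
        before = union_read(branches, "cached.txt")    # served by RO
        uncached = union_read(branches, "bulk.txt")    # falls through to RW2
        union_write(rw1, "cached.txt", "v2")           # new version in RW1
        after = union_read(branches, "cached.txt")     # RW1 now shadows RO
        return before, uncached, after
```

In the model, the cached file is served from RO until it is rewritten, after which the RW1 copy shadows both older copies.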

The downside is that you need two RW branches, and that periodically you may want to merge the contents of RW1 into RO and/or RW2. Otherwise, RW1 will just grow and grow...

That said, I know it's not the simple solution you hoped for. Anyway, I'm now headed off to bed :-)

All times are GMT -5.