LinuxQuestions.org


msound 02-16-2009 06:25 PM

kernel patch to enable filesystem compression
 
Hey hey! I have a 1.5TB drive that stores tens of thousands of csv and text files. I'd like to use chattr -R +c to enable compression at the filesystem level. This would be a much more elegant solution than gzipping every individual file that comes in.

While reading about chattr -R +c, I keep reading that I need to patch the kernel before filesystem compression is fully supported by my system.

Does anyone have any links or advice on how to patch the kernel to fully support chattr +c?

My system is running CentOS 4.7 (Final) with the following kernel: 2.6.9-78.0.5.ELsmp

Thanks!

unSpawn 02-16-2009 07:38 PM

There's http://sourceforge.net/projects/e3compr/ but you really should read the FAQ before applying it. Other stuff to try if you can live with FUSE: FuseCompress (http://miio.net/fusecompress/, 2009), FUSE-zip (http://code.google.com/p/fuse-zip/, 2008), avf (http://sourceforge.net/projects/avf, 2007), compFUSEd (http://freshmeat.net/projects/compf/, 2007), LZOlayer (http://north.one.pl/~kazik/pub/LZOlayer/, 2006). You'll find code maintenance and maturity varies wildly and I haven't seen any benchmarking apart from http://code.google.com/p/fuse-zip/wiki/PerformancePage. YMMV(VM).

ErV 02-16-2009 08:10 PM

Quote:

Originally Posted by msound (Post 3446142)
tens of thousands of csv and text files.

That's not much (my system has somewhere between 70,000 and 130,000 files, and it isn't on a 1 terabyte drive). I'd start to worry if there were a few dozen million of them.

i92guboj 02-17-2009 06:18 AM

You might want to check btrfs, which supports it natively since 0.17. However it's still a young fs, and there's no guarantee that it's stable or that the on disk format won't change in the future. It supports transparent compression via zlib, though.

I would forget about those kernel patches unless you don't mind living with old kernels (that is, you don't need support for newer hardware, aren't affected by old bugs, and don't care about security). I doubt that stuff is actively supported and developed (otherwise it would probably have been included in the kernel at some point). Looking at sourceforge, the last e3compr update was almost 1 year ago.
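
For a rough idea of what transparent zlib compression might save on CSV-like text before committing to a filesystem, here is a minimal sketch. It compresses data in independent chunks with Python's zlib module. The 128 KiB chunk size, the function names, and the sample rows are all assumptions for illustration, and the numbers won't match any filesystem's real on-disk behaviour exactly.

```python
import zlib

CHUNK_SIZE = 128 * 1024  # assumed chunk size, for illustration only


def compressed_size(data: bytes, level: int = 6) -> int:
    """Return the total size of data after compressing each chunk independently."""
    total = 0
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        packed = zlib.compress(chunk, level)
        # Count the chunk as stored uncompressed if compression does not help.
        total += min(len(packed), len(chunk))
    return total


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (1.0 for empty input)."""
    if not data:
        return 1.0
    return compressed_size(data, level) / len(data)


# Made-up CSV rows standing in for real data.
sample = b"".join(
    b"%d,sensor-%d,%d.%02d\n" % (i, i % 10, i * 7 % 1000, i % 100)
    for i in range(20000)
)
print(f"ratio: {compression_ratio(sample):.2f}")
```

Running this over a handful of real files (read with open(path, "rb")) instead of the made-up sample should give a more honest estimate.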

msound 02-17-2009 10:12 AM

Hey guys thanks for the replies.

Quote:

Looking at sourceforge, the last e3compr update was almost 1 year ago.

That was definitely one trend I noticed while searching around for a solution. I think one of them only mentioned the ext2 filesystem and I'm running ext3.

Quote:

That's not much (my system has somewhere between 70,000 and 130,000 files, and it isn't on a 1 terabyte drive). I'd start to worry if there were a few dozen million of them.

Well, as of 7:59am PST, the drive is storing 394,860 files. That slightly exceeds your 130,000 :P

We're currently adding between 1,000 and 4,000 new files per day, and that number is expected to go up.

The other catch is that we make these files downloadable via a web interface. Another drawback to gzipping each of the files individually is that we'd then have to gunzip each file before we pass it to the user for download (end users seem to have enough trouble handling plain text files; asking them to unzip a file first would be unreasonable at this scope).
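
To make that gunzip-before-download step concrete, here is a minimal sketch in Python. It decompresses a gzipped file into an output stream in fixed-size pieces so the whole file never sits in memory. The function name, buffer size, and file name are hypothetical, and this is not how our site is actually written.

```python
import gzip
import io
import os
import shutil
import tempfile
from typing import BinaryIO


def stream_gunzipped(gz_path: str, out: BinaryIO, bufsize: int = 64 * 1024) -> None:
    """Decompress gz_path into out, copying bufsize bytes at a time."""
    with gzip.open(gz_path, "rb") as src:
        shutil.copyfileobj(src, out, bufsize)


# Demo with a throwaway file; a real download handler would pass its
# response stream as `out` instead of an in-memory buffer.
payload = b"id,value\n1,42\n2,43\n"
with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, "example.csv.gz")  # hypothetical file name
    with gzip.open(gz_path, "wb") as f:
        f.write(payload)
    buf = io.BytesIO()
    stream_gunzipped(gz_path, buf)

print(buf.getvalue() == payload)
```

Every download still pays the decompression CPU cost, which matters on a box that is also running the scripts, the web site and the database.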

The other issue is that this application, the scripts that retrieve the data, the web site and the database are all being run from a single machine. So server load is definitely a concern.

I've set up a few test CentOS 4.7 VMs on my VM server at home. I think the next logical step would be to set up a few configurations in a test environment. The only thing I wouldn't be able to mirror at home is the fact that the production drive is actually a partition on a SAN, and not a local hard disk array.

I'll post an update later this week after I've done a few tests.

Thanks!


All times are GMT -5.