LinuxQuestions.org > Forums > Linux Forums > Linux - Software
Old 09-13-2017, 04:21 PM   #1
IsaacKuo
Senior Member
 
Registered: Apr 2004
Location: Baton Rouge, Louisiana, USA
Distribution: Debian 9 Stretch
Posts: 2,278
Blog Entries: 8

Rep: Reputation: 362
Is it safe to dedupe "/" with jdupes/btrfs?


I'm a btrfs newbie, and I'm trying it out on my PXE NFS file server because of the dedupe potential. The file server serves up multiple PXE NFS root filesystems - all Debian amd64 - and the file server itself is a Debian amd64 install.

I have installed onto a single btrfs partition. Here's a description of some of the directories:
Code:
/ = normal local Debian 64 install
/srv/nfs = various file shares
/srv/nfs/snow = root for PXE client "snow"
/srv/nfs/anna = root for PXE client "anna"
/srv/nfs/elsa = root for PXE client "elsa"
Now, there are a LOT of duplicate files here! The majority of the files will actually be extracted files from deb packages - same size, date, everything. So there's a lot of potential for greatly reducing the disk space used by deduping.

So. I'm pretty confident I figured out the syntax for the deduping I want to do:

Code:
jdupes --dedupe -R /srv/nfs
or
Code:
jdupes --dedupe -R /
My question is - is it safe to run this on a "live" running OS?

If it's not safe to do this, then I'm okay with only occasionally running "jdupes -B -R /srv/nfs" after shutting down all of the clients. I'm not actually hurting for space on the server's 30GB SSD yet.
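That occasional offline pass could be wrapped in a small dry-run helper (a hypothetical sketch, not from the thread - flag meanings assumed from the jdupes manual: -r recurse, -m summarize duplicates only, -B btrfs dedupe):

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints the jdupes commands it would run
# so the plan can be checked before anything on disk is modified.
dedupe_shares() {
    target=${1:-/srv/nfs}
    for cmd in "jdupes -r -m $target" "jdupes -r -B $target"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "would run: $cmd"   # default: only show the plan
        else
            $cmd                     # DRY_RUN=0 executes for real
        fi
    done
}
```

Calling `dedupe_shares /srv/nfs` just prints the plan; `DRY_RUN=0 dedupe_shares /srv/nfs` would run the duplicate summary and then the dedupe pass.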

(I have previously only installed just the software I needed on specific clients, to ensure everything fits comfortably. With deduping, I can afford to simply give all clients a full suite of all software I ever use. If it's installed on one, it might as well be installed on all of them!)

Thanks!
 
Old 09-13-2017, 04:25 PM   #2
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 8,607
Blog Entries: 4

Rep: Reputation: 2998
Not entirely familiar with what you are referring to here ... but ... "regularly scheduled maintenance downtime" is a classically good idea. It's probably not a good idea to abruptly change the environment seen by running software that could be negatively impacted by such a change.

Also: "in general, don't do something that doesn't, for some good reason, need to be done."
 
Old 09-13-2017, 04:38 PM   #3
IsaacKuo
Senior Member
 
Registered: Apr 2004
Location: Baton Rouge, Louisiana, USA
Distribution: Debian 9 Stretch
Posts: 2,278
Blog Entries: 8

Original Poster
Rep: Reputation: 362
In my case, I feel that learning Linux system administration by trial-by-fire is a good enough reason.

Still, I don't like doing things for that reason alone. For example, I developed my RAMBOOT hack because it could give my computers a good performance boost, and reduce heat/noise/power. I'm able to stretch out what I can do with my computers without spending any money on SSD upgrades.

In this case, I have found the 30GB size of this SSD to be limiting. But deduping might let me pretty much contain the OS partitions of ALL of my computers on that SSD! I still maintain a backup on another (slower) drive via rsync. But this could let me reclaim a decent chunk of disk space on a number of computers - as well as letting me spin down their hard drives.

Currently, that backup is on an ext4 partition, so I don't take advantage of deduping there. Not yet. I'll convert the backup server to btrfs soon enough if this all works out.
 
Old 09-15-2017, 12:22 PM   #4
IsaacKuo
Senior Member
 
Registered: Apr 2004
Location: Baton Rouge, Louisiana, USA
Distribution: Debian 9 Stretch
Posts: 2,278
Blog Entries: 8

Original Poster
Rep: Reputation: 362
I decided that performance would be better by splitting up the dedupe job into smaller sections. Something like:

Code:
jdupes -B -R /bin         /srv/nfs/*/bin
jdupes -B -R /boot        /srv/nfs/*/boot
jdupes -B -R /etc         /srv/nfs/*/etc
jdupes -B -R /lib         /srv/nfs/*/lib
jdupes -B -R /lib64       /srv/nfs/*/lib64
jdupes -B -R /opt         /srv/nfs/*/opt
jdupes -B -R /sbin        /srv/nfs/*/sbin

find /usr/ -type d -exec jdupes -B {} /srv/nfs/*{} \;
That last "find" command gets a bit messy because not all directories exist in all installs, but it seems to work okay.
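One caveat with that last command: in most shells the glob /srv/nfs/*{} is expanded before find runs, and since no filename ends in a literal {}, the pattern is passed through unexpanded. A sketch that defers the expansion to an inner shell, so the glob is matched only after find has substituted {} (ROOT and NFS are parameterized here purely for illustration; in this thread's layout they would be /usr and /srv/nfs - and it prints the jdupes commands rather than running them):

```shell
#!/bin/sh
# Sketch: per-directory dedupe pass with glob expansion deferred to an
# inner shell, so /srv/nfs/*<dir> matches the clients that actually
# have that directory. Drop the "echo" to dedupe for real.
dedupe_tree() {
    root=${ROOT:-/usr}        # local tree to walk
    nfs=${NFS:-/srv/nfs}      # parent directory of the PXE client roots
    find "$root" -type d -exec sh -c '
        dir=$1; nfs=$2
        for copy in "$nfs"/*"$dir"; do
            [ -d "$copy" ] || continue   # this client lacks the directory
            echo jdupes -B "$dir" "$copy"
        done
    ' _ {} "$nfs" \;
}
```

This also handles the "not all directories exist in all installs" problem: clients missing a directory simply don't match the glob and are skipped.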
 
Old 09-18-2017, 12:53 PM   #5
IsaacKuo
Senior Member
 
Registered: Apr 2004
Location: Baton Rouge, Louisiana, USA
Distribution: Debian 9 Stretch
Posts: 2,278
Blog Entries: 8

Original Poster
Rep: Reputation: 362
I'm going to mark this as "solved" even though I still really don't know whether or not it is safe to dedupe on "/".

What I have decided to do works for me. I had to split things up into very small chunks because this computer doesn't have much RAM or CPU power, relatively speaking (only 3GB of RAM and a slower old Core 2 Duo T5600 @ 1.83GHz).
 
Old 09-18-2017, 07:50 PM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 15,948

Rep: Reputation: 2210
Dedupe is all well and good until you have a fault - how many files/systems are impacted if a (very) common block becomes inaccessible?

I use rsnapshot on my "normal" systems to a common backup system. But I back that up on a regular basis to a non-hard-linked copy for this very reason. Everything has a cost.
 
1 member found this post helpful.
Old 09-18-2017, 08:11 PM   #7
IsaacKuo
Senior Member
 
Registered: Apr 2004
Location: Baton Rouge, Louisiana, USA
Distribution: Debian 9 Stretch
Posts: 2,278
Blog Entries: 8

Original Poster
Rep: Reputation: 362
Thanks for the tip!

I currently only use btrfs with deduping on the "live" system. All of the backups are currently on ext4 (with no hard link deduping). But I will have one backup on btrfs with deduping after I set it up, to get used to it.
 
  


Tags
btrfs, deduping, jdupes

