Old 05-18-2007, 01:53 PM   #1
Cerox
LQ Newbie
 
Registered: Dec 2006
Location: Cologne, Germany
Distribution: SuSe 10.2
Posts: 9

Rep: Reputation: 0
[BackupPC] Exclusion of Directories doesn't work


Hello everyone,

I have been using BackupPC for a while to back up my Windows clients; these clients provide their data through SMB shares.

Now I want to back up a Linux machine; I have configured OpenSSH correctly, and my Linux server (where BackupPC is running) has a private key that gives it root access on the client machine.

Because rsync was running very slowly (the transfer reached at most 200 KB/s, even though the server is on my LAN with a Fast Ethernet connection), I have set $Conf{XferMethod} to "tar".
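
In config.pl that change is just:

Code:
$Conf{XferMethod} = 'tar';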

Now I want to exclude particular directories from the client backup, because these directories are also present on a Samba server.

I have set the following directive:

Code:
$Conf{BackupFilesExclude} = ['/home/sebastian/Archiv', '/home/sebastian/Star Wars',
                             '/home/sebastian/Videos', '/home/sebastian/vmware',
                             '/home/sebastian/wallpaper', '/media', '/mnt', '/proc', '/tmp'];
Apparently I haven't understood how to use this directive correctly, because BackupPC excludes the /home directory completely from the backup. There is no trailing slash, just as the comment in the config file explains.
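
The comments in config.pl also describe a hash form of this directive, keyed by share name. I haven't tried that form yet, but assuming the share is "/" it would look something like this:

Code:
$Conf{BackupFilesExclude} = {
    '/' => ['/home/sebastian/Archiv', '/media', '/mnt', '/proc', '/tmp'],
};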

I must admit that this is the first time I have used this directive; can anyone help me?
 
Old 05-18-2007, 06:40 PM   #2
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 48
I use BackupPC across my networks, and every time I thought its comments or config.pl file was wrong, it turned out that it was me who was wrong!

That being said, try to simplify things to work it out. Rather than trying to tar up everything, set tar to just do /home/sebastian, then set the exclude to be /home/sebastian/Archiv. If that fails to exclude /home/sebastian/Archiv, then try making the exclude /home/sebastian/Archiv/ instead. Then add more back in piece by piece.
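
Untested, but using your paths that first step might look like this in config.pl:

Code:
# share to back up, and an exclude given relative to that share (untested guess)
$Conf{TarShareName} = ['/home/sebastian'];
$Conf{BackupFilesExclude} = ['/Archiv'];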

Also, I wouldn't give up on rsync or rsyncd. I use it on 100% of my hosts, including the Windows hosts. It does take a while to get the first backup done, but after that it flies.

If you still encounter problems, post the relevant parts of the config.pl here, and I'll try it on my machines and see what happens. Also, are you running 2.11, or 3.0?

Peace,
JimBass
 
Old 05-19-2007, 09:02 AM   #3
Cerox
LQ Newbie
 
Registered: Dec 2006
Location: Cologne, Germany
Distribution: SuSe 10.2
Posts: 9

Original Poster
Rep: Reputation: 0
Hi and thanks for your answer.

I have now had a look at the Xfer error log, and there are many errors when BackupPC tries to back up the /sys directory. So I decided to exclude this virtual filesystem from the client backup.

I don't understand why, but now everything works correctly; the server has successfully backed up 30 GB from the client machine. Do you have any idea why the server previously excluded the other directories like /bin and /home?

Regarding rsync, I have to point out that the transfer rate during backup reached at most 200 KB/s. With tar the server transfers ~6 MB/s. At 200 KB/s, the backup process would need ~44 hours for 30 GB. That's too long^^

I have another question. During the last successful backup with tar, the server load increased to ~6. When the server backs up Windows machines, the load reaches at most ~4. Do you know why the load is higher when BackupPC backs up Linux machines? Maybe it's because of the SSH encryption; the server is an old AMD Thunderbird at 800 MHz...
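
If it is the encryption, one thing I could try (just an untested guess, with a placeholder host name) is timing a raw transfer with a cheaper cipher and comparing:

Code:
# hypothetical test: does a cheaper cipher raise the transfer rate?
ssh -c blowfish root@client 'tar -cf - /home/sebastian' > /dev/null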

I'm using BackupPC 2.1.2pl1.
 
Old 05-19-2007, 09:58 AM   #4
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 48
Without seeing your config.pl I don't know what was happening, but BackupPC always backs up what you tell it to back up, and excludes what you don't include (or specifically exclude). Sorry to be vague, but it just does what you tell it. If you installed it from a SuSE repository, the package may include some base setup, and that can influence what is happening. Also be aware that there are often multiple files that govern the behavior.

I install it either from source or from a Debian repository. Debian puts the main config file in /etc/backuppc/config.pl, but each host being backed up can also have its own config, which lives in /etc/backuppc/pc/$HOSTNAME.pl or a similar directory. You may be running into something like that. In general BackupPC looks at the host's own config file first, and anything not explicitly set there falls back to the base config.pl.
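
For example, a per-host override could look like this (hypothetical host name and values):

Code:
# /etc/backuppc/pc/myclient.pl -- settings here override config.pl for this host only
$Conf{XferMethod} = 'tar';
$Conf{BackupFilesExclude} = ['/proc', '/sys', '/tmp'];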

You may want to try getting version 3.0.0 on your machine. They integrated a setup feature into the CGI script, so you can add new hosts through the web interface rather than through a plain text file.

I don't think you understand the way rsync works. It is a much more "intelligent" setup than tar; I'm not referring to the intelligence of the user who selects rsync, just to the process itself. Tar has virtually no intelligence: every time you do a tar backup, it wraps all of the data into a tarball and sends it across the network. What was taking rsync so long is that it computes checksums for every file (and for parts of large files), compares the checksums from the host against those on the BackupPC server, and then transfers a whole file (or part of a large file) only when a difference is found. What makes that intelligent is that all future backups through rsync are extremely fast, simply because it doesn't need to transfer all of the data. Imagine you have a 200 MB spreadsheet. The first time it is transferred it is a nightmare, because you need to get all 200 MB across the wire and compute checksums for each chunk of data; say, for example, that each page gets its own checksum. The next day, if you've made no changes to the file, it doesn't transfer the data at all; it just copies it from the previous day. If on the third day you change one entry on the 137th page, rsync says, "OK, page 137 has a different checksum, send just that," whereas tar would just say, "OK, send all 200 MB over again."
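
You can see the same effect outside of BackupPC (illustration only, with made-up paths):

Code:
# the first run transfers everything; if nothing changed, the second run's
# --stats output shows almost no "Literal data" going over the wire
rsync -av --stats /home/sebastian/ backupserver:/srv/backup/sebastian/
rsync -av --stats /home/sebastian/ backupserver:/srv/backup/sebastian/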

As an example, here's a bit of info on an rsyncd backup one of my hosts made last night:
Code:
                Totals                       Existing Files       New Files
Backup#  Type   #Files   Size/MB   MB/sec    #Files   Size/MB     #Files   Size/MB
93       full   176051   43741.8   5.75      175783   43601.0     453      141.0
So I backed up 43 GB of data in 126.7 minutes through rsync. I used to do cygwin+tar+ssh to that client, which is a Windows 2000 server, and it would take 2-3 hours to do an incremental backup and 6 to do a full. I believe this is the same situation you have with your machines: they are on the same LAN, so they are capped at whatever the network will allow, which is 100 Mb/s for the two hosts I'm using as an example. What takes time for rsync is the checksum calculation (which I believe uses MD5 sums). Running a ton of MD5 sums on an 800 MHz processor will take time. It isn't impossible by any means, but it will take a while. After the first backup completes, all the future ones go much faster, often improving by a factor of 10-20.

As for the load, yes, BackupPC is a fair amount of heavy lifting for the processors involved. I don't know why you'd see different load values depending on what is being backed up. If one backup is on the localhost, that would make sense, because then it is both performing the tar and running the checksums, whereas when you ssh to a remote host, that host does the tarring and your BackupPC server does the checksumming. That's just a guess; I really have no true idea.

Again, I encourage you to install version 3.0.0 over 2.1.2. The improvements are excellent, and since you can edit all the config files through the web, it is easy even for end users to change what is being backed up, if you want them to; or it can just make things simpler for yourself!

Peace,
JimBass

Last edited by JimBass; 05-19-2007 at 10:01 AM.
 
Old 05-19-2007, 10:50 AM   #5
Cerox
LQ Newbie
 
Registered: Dec 2006
Location: Cologne, Germany
Distribution: SuSe 10.2
Posts: 9

Original Poster
Rep: Reputation: 0
Hi and thanks for your detailed explanations regarding rsync.

First I want to explain something. My most important data is backed up daily with a shell script to a Samba share on my server. These important files have a total size of 700 MB, so it takes only a few minutes every day to back them up.

BackupPC is only responsible for 30 GB on this client machine and for ~5 GB on a Windows machine. A weekly backup of these 30 GB is sufficient, because these files don't change very often.

For me it isn't worth doing a 48-hour initial backup with rsync just so that I could use rsync for this machine. A weekly backup, which takes ~4 hours with tar, is OK. Moreover, since rsync calculates so many checksums, I think 800 MHz is too slow for satisfying rsync performance.

I am running SuSE 10.2 on my client machine and Debian Sarge on my server. I installed BackupPC with apt. I made the changes for tar (and $Conf{BackupFilesExclude}) in config.pl. For the Windows host I use a $hostname.pl, which overrides the settings for that host so it is backed up via SMB.
 
Old 05-19-2007, 11:19 AM   #6
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 48
It is only the first rsync that is time-consuming; the second one and onward will be faster than tar. But that is your choice.

Since you have Debian on the server, I would suggest doing this:

Code:
apt-get -t testing install backuppc
That will get you a 3.0.0 version of BackupPC. It should just upgrade your current setup, though you may want to back up the configuration before doing the upgrade. There is nothing in BackupPC that will screw up the rest of your system; all the tools like tar, df, and rsync will stay at the Debian stable packages. Just the backuppc package will be moved "up" to testing.
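
Backing up the configuration first can be as simple as (pick your own destination name):

Code:
cp -a /etc/backuppc /etc/backuppc.pre-3.0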

I have found daily backups to be beneficial for my clients. Even things that don't change often do change, and with a once-a-week backup, a file that ends up being needed can appear on Tuesday and be gone by Friday when the backup is done. The space optimization of BackupPC is brilliant. The machine I showed the example from earlier has this info:

Code:
Pool is 40.72GB comprising 261835 files and 4369 directories (as of 5/18 18:05)
That is with 3 full backups of 3 hosts, all of them in the 20-40 GB neighborhood. The math is unbelievable. I have roughly 170-180 GB of data compressed to 40 on this machine. For the tiny increase in space, I strongly suggest doing daily backups.

Peace,
JimBass
 
Old 05-19-2007, 11:37 AM   #7
Cerox
LQ Newbie
 
Registered: Dec 2006
Location: Cologne, Germany
Distribution: SuSe 10.2
Posts: 9

Original Poster
Rep: Reputation: 0
Quote:
For the tiny increase in space, I strongly suggest doing daily backups.
There is no need. As I already said, my important data like documents, websites, etc. is backed up daily with a shell script. I also keep a few generations of these important files.

I tried to install BackupPC 3 via apt, but it said BackupPC was already the newest version.

I have Debian Sarge; that's oldstable now, because Etch is the current stable distribution.

Here's my /etc/apt/sources.list, but that should be unimportant because I used "-t testing" in the apt-get command.

Code:
deb http://ftp.de.debian.org/debian/ stable main
deb-src http://ftp.de.debian.org/debian/ stable main
deb http://security.debian.org/debian-security stable/updates main contrib non-free
Another question:

I want to run BackupPC_nightly manually...

Quote:
server:/# su backuppc -c "/usr/share/backuppc/bin/BackupPC_nightly"
usage: /usr/share/backuppc/bin/BackupPC_nightly [-m] poolRangeStart poolRangeEnd
Can you tell me what I have to enter for the arguments "poolRangeStart" and "poolRangeEnd"?
 
Old 05-19-2007, 12:59 PM   #8
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 48
I have never used the BackupPC commands at the terminal, with the exception of __INSTALLDIR__/bin/BackupPC_dump -v -f hostName for troubleshooting. Once I got that running, I haven't had to play with the individual commands available to BackupPC. I did find this page with a description of the options you have for that command:

http://svn.rot13.org/index.cgi/Backu..._nightly?rev=1
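
From that usage text, the two arguments appear to be the range of pool subdirectories to traverse, from 0 to 255, so a full manual run would presumably be:

Code:
su backuppc -c "/usr/share/backuppc/bin/BackupPC_nightly 0 255"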

I don't know why "apt-get -t testing install backuppc" would tell you that you have the current version, because the version in testing is 3.0.0. Did you maybe not do apt-get update first? Maybe that will bork things for you, unless you change your sources.list to old-stable. If you leave it at stable and update, you're going to get bumped to etch/stable when you upgrade.
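
One more guess based on the sources.list you posted: "-t testing" can only pick from releases apt already knows about, so you would also need a testing line plus an apt-get update before the install, something like:

Code:
deb http://ftp.de.debian.org/debian/ testing main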

Peace,
JimBass
 
Old 05-19-2007, 01:44 PM   #9
Cerox
LQ Newbie
 
Registered: Dec 2006
Location: Cologne, Germany
Distribution: SuSe 10.2
Posts: 9

Original Poster
Rep: Reputation: 0
Hi,

I didn't change the sources.list when Debian Etch was released, and apt upgraded many packages after the Etch release via "apt-get upgrade". So I have Etch, but my kernel is still 2.6.8 (the latest kernel I can get via apt). I don't want to compile the kernel myself on this machine, but that's not the topic...

I will update this thread if I find a way to install the latest BackupPC from the testing repository.

Thank you for your help.

Regards
Sebastian
 
  

