LinuxQuestions.org
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - General
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

Old 12-23-2023, 04:16 AM   #16
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,579

Original Poster
Blog Entries: 19

Rep: Reputation: 4453

Quote:
Originally Posted by IsaacKuo View Post
Yes, there are parts that a normal user will not have read access to. What's more important, though, is that a normal user will not be able to replicate the user:group ownership nor replicate the proper permissions on the copied files. Even if the user could read all of the files, the user couldn't create a usable backup because the backup won't have the proper ownership and permissions.
Right! So the way to get stuck in is to get the data backup set up first and run it for a couple of weeks. Then, when I feel more confident, find a way to add in backups of the two system partitions later.
 
Old 12-23-2023, 07:52 AM   #17
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,579

Original Poster
Blog Entries: 19

Rep: Reputation: 4453
Well, it looks as if my initial rsync command for dumping data should be something like:
Code:
rsync -axPvn /home/data littleboy:/home/hazel/dumps
for the first time (testing) and then
Code:
rsync -axP /home/data littleboy:/home/hazel/dumps
to do it in anger. That should create a tree under /home/hazel/dumps/data. Except that from what I've read, the -x option will not copy the contents of data because it's actually a mount point! Maybe not use -x for the data dump; I'll certainly need it for system dumps because I'll want to start at / but exclude the data partition and the dynamic filesystems. And I don't know if I need -H. What's a sparse file anyway?

Once I have got it to work, I'd like to progress to turbocapitalist's 7-day system with hard links, which looks really cool and ensures that I will have father and grandfather copies if needed.

For the two system partitions, dumping after the first boot following an update should be good enough, but I'll need to find out how to do it as root.

PS: Just booted littleboy with ethernet connection. Both eth0 and wlan0 came up with different local ip addresses. So I assume I can choose my connection by setting the appropriate address for littleboy in Slackware's /etc/hosts file.
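The /etc/hosts idea would look something like this on bigboy (the addresses here are invented placeholders, not the real ones):

```
# /etc/hosts on bigboy -- addresses invented for illustration
192.168.1.10    littleboy     # eth0 address: the one we want rsync to use
#192.168.1.11   littleboy     # wlan0 address, commented out
```

Whether the resolver actually consults /etc/hosts before DNS depends on the "hosts" line in /etc/nsswitch.conf.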

PPS: Bah! I didn't actually have rsync on my very stripped-down Slackware. Just installed it, now I have to chase up its dependencies. Well, I'm not doing that by day; too expensive! I'll do it tomorrow morning in the early hours when I get cheap rates, then try again and report back.

Last edited by hazel; 12-23-2023 at 08:52 AM.
 
Old 12-23-2023, 08:48 AM   #18
Petri Kaukasoina
Senior Member
 
Registered: Mar 2007
Posts: 1,791

Rep: Reputation: 1469
Quote:
Originally Posted by hazel View Post
What's a sparse file anyway?
For example, /var/log/lastlog is a sparse file. It contains mostly null bytes, and the file system does not need to store the nulls. rsync without -S makes a copy containing all those nulls verbatim, taking 1596 1k blocks. But rsync with -S needs only 8 blocks, the same as the original.

Code:
# rsync -aS /var/log/lastlog /tmp 
# ls -ls /var/log/lastlog /tmp/lastlog 
8 -rw-r--r-- 1 root root 1632864 2023-11-29 17:47 /tmp/lastlog
8 -rw-r--r-- 1 root root 1632864 2023-11-29 17:47 /var/log/lastlog
# rm /tmp/lastlog
# rsync -a /var/log/lastlog /tmp 
# ls -ls /var/log/lastlog /tmp/lastlog 
1596 -rw-r--r-- 1 root root 1632864 2023-11-29 17:47 /tmp/lastlog
   8 -rw-r--r-- 1 root root 1632864 2023-11-29 17:47 /var/log/lastlog
 
1 member found this post helpful.
Old 12-23-2023, 08:56 AM   #19
Petri Kaukasoina
Senior Member
 
Registered: Mar 2007
Posts: 1,791

Rep: Reputation: 1469
Quote:
Originally Posted by hazel View Post
And I don't know if I need -H.
For example, in /usr/lib64/dri there are many hard linked files. Without -H, rsync creates separate files, and with -H it preserves the hard links, using the disk space only once.
Code:
# rsync -a /usr/lib64/dri /tmp
# du -sh /usr/lib64/dri /tmp/dri
61M     /usr/lib64/dri
411M    /tmp/dri
# rm -rf /tmp/dri 
# rsync -aH /usr/lib64/dri /tmp
# du -sh /usr/lib64/dri /tmp/dri
61M     /usr/lib64/dri
61M     /tmp/dri
 
1 member found this post helpful.
Old 12-23-2023, 09:03 AM   #20
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,579

Original Poster
Blog Entries: 19

Rep: Reputation: 4453
So I will need both -H and -S for my system dumps. Thanks a lot; that's good to know. But I doubt if I need either for the data dump. All the files on that partition were created by me and they are all standard data formats. Certainly no hard links, although there might be one or two soft ones.

I feel I have an itinerary now, thanks to all you lovely people. Last night when I started this thread, it just felt like ERR!?

Hey! When I've finished this project, I'll write a blog on it.

Last edited by hazel; 12-23-2023 at 09:08 AM.
 
Old 12-23-2023, 09:24 AM   #21
jailbait
LQ Guru
 
Registered: Feb 2003
Location: Virginia, USA
Distribution: Debian 12
Posts: 8,337

Rep: Reputation: 548
Quote:
Originally Posted by hazel View Post

Doesn't rsync keep successive copies automatically unless you explicitly delete them?
When a file changes, rsync overwrites the backup copy with the latest version.

If the live file is deleted, what rsync does depends on whether or not you have specified --delete in the rsync command. If you haven't specified --delete, rsync does nothing to the backup file, and over time the backup partition will fill up with garbage. If you specify --delete, rsync will delete any backup files that no longer have corresponding live files. That is why you need multiple generations of backup: to give you a grace period, in case a file is deleted forever or overwritten before you discover that you need to fall back to a previous version of it.

Last edited by jailbait; 12-23-2023 at 09:28 AM.
 
1 member found this post helpful.
Old 12-23-2023, 10:10 AM   #22
rclark
Member
 
Registered: Jul 2008
Location: Montana USA
Distribution: KUbuntu, Fedora (KDE), PI OS
Posts: 482

Rep: Reputation: 179
As root user, I back up multiple directories in a bash script. I find that -av --delete is all I need to keep it 'simple'.
FYI, the man page shows that -a is the same as -rlptgoD. I don't back up the OS, just /home and any data directories outside of /home that I might have created.

Sample:

...
rsync -av --delete /homedata/virtualBoxVMs /mnt/usbdrive/
rsync -av --delete /homedata2/development /mnt/usbdrive/
...
 
Old 12-25-2023, 09:39 AM   #23
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,579

Original Poster
Blog Entries: 19

Rep: Reputation: 4453
Yippee! I just rsynced the data directory on bigboy over to littleboy. It took 49 minutes by wireless. I was trying to get it to use the ethernet connection, which would have been faster, but it didn't do that for some reason (advice needed in due course). I am planning to do differential dumps weekly rather than daily as there isn't a big data churn on my machine compared to Turbocapitalist's.

Now if I do that for a month, using the --link-dest option to hard link to files that haven't changed, and from then on delete the oldest one every week, will that work? I gather from what I know about hard links that deleting the directories that contain them doesn't delete the corresponding files as long as there is at least one other hard link to each one. So deleting today's dump directory (data-25-12-2023) shouldn't affect the files I have just copied, once there are later dump directories with hard links to them. But files that have been deleted on bigboy will eventually disappear on littleboy too when the last directory dump to contain a link to them is erased a month later. Is that correct? If so, there is no need to explicitly delete old files.

I have found that I need to specify hazel@littleboy in my target for a data dump. If I just put littleboy, it asks for a root password but doesn't say which machine! I assumed it meant littleboy's root but when I used that password, it was rejected. When I get to dumping system partitions, I will need to be root, maybe on both machines (?)

Last edited by hazel; 12-25-2023 at 09:46 AM.
 
Old 12-25-2023, 10:14 AM   #24
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,311
Blog Entries: 3

Rep: Reputation: 3722
Yes, --link-dest links unchanged files in the two directories to avoid making a whole new copy, so deleting the oldest directory will leave the other directories unaffected. Files stick around until the last reference is deleted:

Code:
cd /tmp/
touch x
ln x y
stat -c '%i\n' x y
rm x
stat -c '%i\n' x y
If you have the date utility from GNU coreutils, then you can compute relative dates easily:

Code:
#!/bin/sh

d=$(date +'%F')

thisweek=$(date -d $d +'%V')
linkdest=$(date -d "$d - 1 week" +'%V')
deleteweek=$(date -d "$d - 5 weeks" +'%V')

echo $thisweek
echo $linkdest
echo $deleteweek

exit 0
 
1 member found this post helpful.
Old 12-25-2023, 10:39 AM   #25
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,579

Original Poster
Blog Entries: 19

Rep: Reputation: 4453
Good! I want eventually to have a script that does this and run it as a cron job (probably via anacron as I keep irregular hours). Using your functions will allow me to automatically name the destination directories. Of course I will then have to put a slash after "data" to ensure that only the contents get transferred. But for a month or so, I want to do it by hand just to check that everything goes smoothly.

Now why did rsync use wifi rather than ethernet? I put lines into /etc/hosts for littleboy with both ip addresses and commented out the one corresponding to wlan0, so why did the router still use that one?

Last edited by hazel; 12-25-2023 at 10:49 AM.
 
Old 12-25-2023, 10:50 AM   #26
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,311
Blog Entries: 3

Rep: Reputation: 3722
Quote:
Originally Posted by hazel View Post
Now why did rsync use wifi rather than ethernet? I put lines into /etc/hosts for littleboy with both ip addresses and commented out the one corresponding to wlan0, so why did the router still use that one?
It could be that your main machine uses the router's DNS service/proxy as its first choice, and picks up the WLAN address for littleboy through it. Many wi-fi routers are set up so that self-reported host names go into the router's DNS service/proxy, and other machines on the WLAN can then look them up by that name.

However, someone with networking knowledge would have to say whether there is another reason; maybe the following work-around would then be unnecessary:

One possible work-around would be to use the -B option with the SSH client call using Rsync's -e option.

Code:
rsync -e 'ssh -B eth0 -l hazel' ...
That can be put into the SSH client's configuration file using the BindInterface option. That file is good for making a lot of shortcuts, with multiple options for specific connections. The SSH client's configuration file is one of the more seriously underappreciated capabilities among the common, everyday tools.
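For example, a per-host entry in ~/.ssh/config on the sending machine (user and host names from this thread; the interface name is an assumption) could read:

```
Host littleboy
    User hazel
    BindInterface eth0
```

With that in place, a plain "rsync ... littleboy:..." picks up both the user name and the interface binding. Note that BindInterface needs a reasonably recent OpenSSH (7.7 or later).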
 
Old 12-25-2023, 11:04 AM   #27
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,579

Original Poster
Blog Entries: 19

Rep: Reputation: 4453
Actually it's only a problem for the very first dump of a partition. All the later ones will be much smaller. So probably not worth chasing up.

I'm appending a first draft of a script that incorporates your very useful dating suggestions.

Oops! The line for deleting the oldest dump needs rm -fR, not just rm.
Attached Files
File Type: txt rsync_script.txt (590 Bytes, 6 views)
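The attachment itself isn't reproduced here, but a first draft along the lines discussed in this thread might look roughly like this (a hypothetical sketch: host, user, paths, directory naming and the five-week retention are all assumptions, and it is untested):

```
#!/bin/sh
# Hypothetical sketch -- the real attached script may differ.
set -e

fmt='+%d-%m-%Y'                       # matches names like data-25-12-2023
today=$(date "$fmt")
lastweek=$(date -d '1 week ago' "$fmt")
oldest=$(date -d '5 weeks ago' "$fmt")

# Trailing slash on the source so only the contents of data are transferred.
# On the very first run the --link-dest directory won't exist yet; rsync
# just warns and copies everything. A relative --link-dest is resolved
# against the destination directory.
rsync -aP --link-dest="../data-$lastweek" \
    /home/data/ hazel@littleboy:/home/hazel/dumps/data-$today/

# Prune the dump from five weeks ago (rm -fR, as noted above).
ssh hazel@littleboy rm -fR "/home/hazel/dumps/data-$oldest"

exit 0
```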

Last edited by hazel; 12-25-2023 at 11:06 AM.
 
Old 12-25-2023, 04:56 PM   #28
jailbait
LQ Guru
 
Registered: Feb 2003
Location: Virginia, USA
Distribution: Debian 12
Posts: 8,337

Rep: Reputation: 548
Quote:
Originally Posted by hazel View Post

I'm appending a first draft of a script that incorporates your very useful dating suggestions.
Your logic is complicated enough that I would have to set up a test and run it to make sure exactly what the script does over a five week interval. Scanning the overall logic I think that you may often have the situation where the only backup copy of a file is in the oldest backup and all the newer backups have a hard link to that file. When you delete the oldest backup what will the hard links point to?
 
Old 12-25-2023, 09:19 PM   #29
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,311
Blog Entries: 3

Rep: Reputation: 3722
Quote:
Originally Posted by jailbait View Post
When you delete the oldest backup what will the hard links point to?
The hard links would still point to the same inode as before. See the exercise in #24 above. Now symbolic links would be another matter, but Rsync produces hard links and not symbolic links so that potential concern would be moot.
 
Old 12-25-2023, 11:36 PM   #30
hazel
LQ Guru
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: LFS, AntiX, Slackware
Posts: 7,579

Original Poster
Blog Entries: 19

Rep: Reputation: 4453
Quote:
Originally Posted by jailbait View Post
Scanning the overall logic I think that you may often have the situation where the only backup copy of a file is in the oldest backup and all the newer backups have a hard link to that file. When you delete the oldest backup what will the hard links point to?
That's precisely what I initially asked myself. It's because we are encouraged to think of files as being in directories. The visual metaphor of the folder encourages this. And then the first dump becomes mystically special because it has the actual files in it while the later ones only have links. But it just ain't so. The files are just somewhere on the partition, we don't need to know where. Only their locations are stored inside the parent directory as hard links.

Normally when a directory is deleted, the contents are deleted too because the only hard links to them are stored in the directory file. Every file has a link count field and when the value of that drops to zero, the filesystem driver knows it can recycle those blocks. But if you have another set of hard links to the same files in another directory, they won't be deleted.

Last edited by hazel; 12-26-2023 at 12:03 AM.
 
1 member found this post helpful.
  

