LinuxQuestions.org
Old 02-15-2011, 02:32 PM   #1
forbin
Member
 
Registered: Apr 2010
Posts: 43

Rep: Reputation: 0
Understanding LVM Snapshots


There's a lot of noise in GoogleSpace on the subject of LVM snapshots. I'm starting to grasp it, but one thing still really puzzles me. Where does the data come from when I run my backups?

Based on what I've been reading, a snapshot volume does not have to be as large as the source volume. The snapshot volume only needs to be large enough to accommodate the maximum number of changes that might occur to the source volume during the life of the snapshot.

Suppose I have a 600GB LV...

/dev/vg01/data

...but I only expect a maximum of 5GB worth of changes on any given day. So just to be safe, I create a 20GB snapshot volume...

/dev/vg01/snapshot.

At the end of the day, I shut down my database service, take the snapshot, and start the database service.

Supposedly I can now safely take a backup of /dev/vg01/snapshot, which gives me a full backup of /dev/vg01/data at the time the snapshot was taken.

Umm... how? Any way you look at it, /dev/vg01/snapshot is only 20GB in size, so how can it give me a full backup of /dev/vg01/data?
 
Old 02-15-2011, 02:42 PM   #2
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
the snapshot is really just a copy of the inode tree at the point in time you created it. It is not 20GB in your example, 20GB is the space which is reserved for changes once a snapshot is taken. you can have looooooads of snapshots if you want, as long as the largest delta between any version of the filesystem doesn't exceed that snapshot reserve size.

Walking through a snapshot we have...

- 20gb LV partition exists, which is configured to include a 5gb snapshot allocation
- files keep changing in the remaining 15gb ext3 partition
- snapshot is taken, inode tree is copied and used for references to the newly created snapshot block device when it is mounted.
- files keep changing on the live filesystem, but any changes to the data are written in the 5gb snapshot space
- once 5gb of changes are made to the filesystem, further changes must be written to other parts of the LV, thus invalidating the snapshot.

It's pretty clever when you get it, hopefully you do now. Or I made things a lot worse, and TBH I assumed the thing about the inode tree... Think of it a bit like a Venn diagram. Two overlapping circles, one for the snapshot, one for the main filesystem. They are both the same size, but share the majority of the disk space where they overlap. There's no data copies etc, it's *exactly* the same disk sectors.

Last edited by acid_kewpie; 02-15-2011 at 02:45 PM.
 
1 members found this post helpful.
Old 02-15-2011, 03:01 PM   #3
z1p
Member
 
Registered: Jan 2011
Location: the right coast of the US
Distribution: Ubuntu 10.04
Posts: 80

Rep: Reputation: 23
LVM snapshots use copy-on-write (CoW) technology. That is, data is only written to the snapshot area when a block is changed on the original: the block's old contents are copied there before the change is applied. So when it is time to read from the snapshot (for making a backup or whatever), any changed blocks are read from the snapshot area, while any blocks that haven't changed since the snapshot are read from the original.
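To make the read and write paths concrete, here is a minimal toy model in Python. It is only a sketch of the copy-on-write idea, not how LVM is actually implemented; the `Volume` class and its method names are made up for illustration.

```python
class Volume:
    """Toy block device with one optional copy-on-write snapshot (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live (origin) data
        self.cow = None             # snapshot area: block index -> old contents

    def take_snapshot(self):
        # Creating the snapshot copies no data; the CoW area starts empty.
        self.cow = {}

    def write(self, index, data):
        # First write to a block since the snapshot: save the old contents.
        if self.cow is not None and index not in self.cow:
            self.cow[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, index):
        # Changed blocks come from the CoW area, unchanged ones from the origin.
        if index in self.cow:
            return self.cow[index]
        return self.blocks[index]


vol = Volume(["a", "b", "c", "d"])
vol.take_snapshot()
vol.write(1, "B")
vol.write(1, "B2")  # second write to the same block copies nothing more

snapshot_view = [vol.read_snapshot(i) for i in range(4)]
print(vol.blocks)     # ['a', 'B2', 'c', 'd']
print(snapshot_view)  # ['a', 'b', 'c', 'd']
print(len(vol.cow))   # 1
```

The snapshot view still shows the full original volume even though only one block was ever copied, which is why the snapshot area can be much smaller than the origin.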
 
1 members found this post helpful.
Old 02-15-2011, 03:11 PM   #4
forbin
Member
 
Registered: Apr 2010
Posts: 43

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by acid_kewpie View Post
the snapshot is really just a copy of the inode tree at the point in time you created it. It is not 20GB in your example, 20GB is the space which is reserved for changes once a snapshot is taken. you can have looooooads of snapshots if you want, as long as the largest delta between any version of the filesystem doesn't exceed that snapshot reserve size.

Walking through a snapshot we have...

- 20gb LV partition exists, which is configured to include a 5gb snapshot allocation
- files keep changing in the remaining 15gb ext3 partition
- snapshot is taken, inode tree is copied and used for references to the newly created snapshot block device when it is mounted.
- files keep changing on the live filesystem, but any changes to the data are written in the 5gb snapshot space
- once 5gb of changes are made to the filesystem, further changes must be written to other parts of the LV, thus invalidating the snapshot.

It's pretty clever when you get it, hopefully you do now. Or I made things a lot worse, and TBH I assumed the thing about the inode tree... Think of it a bit like a Venn diagram. Two overlapping circles, one for the snapshot, one for the main filesystem. They are both the same size, but share the majority of the disk space where they overlap. There's no data copies etc, it's *exactly* the same disk sectors.
This is helpful information and mostly confirms my suspicions. Hopefully I can ask a couple of follow-up questions:

1. If it is exactly the same physical disk sectors, then backing up the system during production hours will cause a performance hit, correct? If that is true, then the only benefit I see from using snapshots is that it minimizes system downtime.

2. It can't be 100% EXACTLY the same disk sectors. Changed sectors must be kept in two places, right? The version that was current at the time of the snapshot plus the new version after changes. Those would be different sectors.

3. Keeping snapshots running (i.e., tracking changes) during production hours would cause a small performance hit, I assume.
 
Old 02-15-2011, 03:17 PM   #5
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
1. Well it's always going to be accessing the same disk isn't it? it's not going to matter if it's the same sectors or different sectors, as chances are you're not going to be actively reading the same sectors are you? So I think that's a non-issue.

2. well it's 100% until you make a change to the filesystem, then it's 99%..... 95.... 80... until your percentage is lower than the percentage of snapshot volume, at which point the snapshot is invalid.

3. no, there are no changes being tracked. LVM simply marks different parts of the disk as eligible to be written to. It's a one time thing and just leaves a copy of the old data once it's masked off where all new changes need to go. The only difference is that a change to a file will not be written back to where it was read from, but to this new snapshot reserve area. So the live inode tree points to the sectors for the new version of the file, the snapshot tree still points to the old location, and that's at a partial level, not the entire file. If 95% of the file is untouched on disk (rounded to inode size), then both filesystems still point to 95% of the inodes for that file, and only 5% differ. It's all implicit other than knowing when you need to invalidate the snapshot, but that's fairly straightforward. A snapshot doesn't "run", it's totally passive.

Last edited by acid_kewpie; 02-15-2011 at 03:19 PM.
 
Old 02-15-2011, 03:21 PM   #6
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
I have a badly formed analogy of stencils and spray paints in my head, but i'll spare you that one.
 
Old 02-15-2011, 03:27 PM   #7
forbin
Member
 
Registered: Apr 2010
Posts: 43

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by acid_kewpie View Post
1. Well it's always going to be accessing the same disk isn't it? it's not going to matter if it's the same sectors or different sectors, as chances are you're not going to be actively reading the same sectors are you? So I think that's a non-issue.
Running a backup causes disk iops, which subtracts from the total iops available for production use, so I don't see how running a backup could fail to cause a performance hit.

Quote:
Originally Posted by acid_kewpie View Post
3. no, there are no changes being tracked.
I think I'm referring to the Copy-on-Write operations that are going on while the snapshot exists, which z1p mentioned above. Or are you guys saying different things?
 
Old 02-15-2011, 03:36 PM   #8
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
Quote:
Originally Posted by forbin View Post
Running a backup causes disk iops, which subtracts from the total iops available for production use, so I don't see how running a backup could fail to cause a performance hit.
Well I meant the hit is the same all the time the data is on the same disk. It's just not "interesting" here I don't think.
Quote:
I think I'm referring to the Copy-on-Write operations that are going on while the snapshot exists, which z1p mentioned above. Or are you guys saying different things?
No, I think I'm just simplifying things a little too much TBH, so yes, there's the hit because that block is copied, but if that is able to have a significant impact, then something isn't right surely
 
Old 02-15-2011, 03:45 PM   #9
forbin
Member
 
Registered: Apr 2010
Posts: 43

Original Poster
Rep: Reputation: 0
Thanks so much for your comments. I will now go to my quiet place and try to make sure I understand them better before I ask anything else. :-)
 
Old 02-15-2011, 08:36 PM   #10
z1p
Member
 
Registered: Jan 2011
Location: the right coast of the US
Distribution: Ubuntu 10.04
Posts: 80

Rep: Reputation: 23
Just a few comments related to the discussion going on.

Whether or not accessing the snapshot will hit the same disk depends on how the volume group is laid out. Also, the impact can vary depending on the volume group setup and use. [Google "LVM performance" and you can see what I mean.]


Quote:
Originally Posted by acid_kewpie
3. no, there are no changes being tracked. LVM simply marks different parts of the disk as eligible to be written to. It's a one time thing and just leaves a copy of the old data once it's masked off where all new changes need to go. The only difference is that a change to a file will not be written back to where it was read from, but to this new snapshot reserve area. So the live inode tree points to the sectors for the new version of the file, the snapshot tree still points to the old location, and that's at a partial level, not the entire file. If 95% of the file is untouched on disk (rounded to inode size), then both filesystems still point to 95% of the inodes for that file, and only 5% differ. It's all implicit other than knowing when you need to invalidate the snapshot, but that's fairly straightforward. A snapshot doesn't "run", it's totally passive.
My understanding is that it does track changes. When the original is modified, a check is first made to determine whether the block has already changed since the snapshot. If it hasn't, the block's original contents are added to the 'changed blocks' on the snapshot and then the original is updated. If that same block is updated again, no further write to the snapshot is needed.

I personally don't think that anything is done at the inode level, it is all done at the disk block level which doesn't know the difference between a block used for data or a block used for filesystem metadata (inodes).

I think the easiest way to see how that might work is to think of the case where a file is grown. In that case a number of blocks will be written with data, and these blocks will have their original data copied to the snapshot. Also, as a result of the file growing, the inode structures [which reside in disk blocks, possibly scattered throughout the disk] will be modified. This means that the blocks on disk that contain the changed inode info will be updated. When this happens the original data in these blocks will be copied to the snapshot, preserving the filesystem structure [inodes] as it was at the time of the snapshot.

If the modified file is later read through the actual volume, then the new inode structure is read and the new data is retrieved. If the file is read through the snapshot, then the original inode structure as captured by the snapshot is read, which will point to the blocks used by the file at the time of the snapshot. So some blocks may come off the real volume and some may come off the snapshot. This is all taken care of under the covers. Remember, the snapshot is of a volume, not a filesystem. In theory you could snapshot a raw volume.
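One consequence of copying each block only on its first modification is that snapshot space is consumed by the number of distinct blocks touched, not by the total volume of writes. The Python sketch below estimates that under simplifying assumptions: the function name and the 4096-byte chunk size are hypothetical, and any bookkeeping overhead in the snapshot area is ignored.

```python
def cow_bytes_used(writes, chunk_size):
    """Estimate CoW space consumed by a list of (offset, length) writes.

    Each chunk of the origin is copied at most once, on its first
    modification, so only distinct chunks count.
    """
    touched = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // chunk_size
        last = (offset + length - 1) // chunk_size
        touched.update(range(first, last + 1))
    return len(touched) * chunk_size


CHUNK = 4096  # hypothetical chunk size in bytes, chosen for illustration
writes = [
    (0, 100),       # touches chunk 0
    (50, 100),      # chunk 0 again: no extra copy
    (4000, 200),    # straddles chunks 0 and 1
    (40960, 4096),  # exactly chunk 10
]
used = cow_bytes_used(writes, CHUNK)
print(used)  # 12288: chunks 0, 1 and 10
```

Note that the estimate makes no distinction between file data and filesystem metadata; a write that lands in a block holding inodes counts the same as any other.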
 
2 members found this post helpful.
Old 02-15-2011, 11:06 PM   #11
forbin
Member
 
Registered: Apr 2010
Posts: 43

Original Poster
Rep: Reputation: 0
z1p, your explanation seems very natural and makes a huge amount of sense to me. However, whether you or acid_kewpie are more correct, there is enough agreement between you that I now realize that LVM snapshots will not accomplish what I want. A snapshot is NOT a backup in itself. It just makes backups possible. However, I am quite happy with solutions such as rsync and rdiff-backup, which do make backups of your data while also maximizing system availability.
 
Old 02-16-2011, 12:10 AM   #12
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,251

Rep: Reputation: 4159
Not true - snapshot offers the benefit of a "point-in-time" consistent image of the data. The others don't.
Whether you consider this important is your decision.

However if you take a snap, and then back that up, *all* the data is known to be (time) consistent - regardless of when the actual backup runs. I don't use LVM, but I do similar on btrfs and consider time-consistent backups absolutely essential.
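For LVM, the "take a snap, then back that up" sequence might look roughly like the following. This is a sketch only: the Python helper just builds the command lines rather than running them, the volume names come from the example at the top of the thread, and the mount point and destination path are hypothetical. Any application quiescing (such as stopping the database before the `lvcreate` and starting it again afterwards) is left out.

```python
import shlex


def snapshot_backup_commands(vg, origin, snap, snap_size, mountpoint, dest):
    """Return the command lines for a snapshot-based backup, in order."""
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # Create a CoW snapshot of the origin volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap, f"/dev/{vg}/{origin}"],
        # Mount the frozen view read-only.
        ["mount", "-o", "ro", snap_dev, mountpoint],
        # Copy the point-in-time data to independent storage.
        ["rsync", "-a", f"{mountpoint}/", dest],
        ["umount", mountpoint],
        # Drop the snapshot so it stops consuming CoW space.
        ["lvremove", "--force", snap_dev],
    ]


cmds = snapshot_backup_commands(
    vg="vg01", origin="data", snap="snapshot", snap_size="20G",
    mountpoint="/mnt/snap",   # hypothetical mount point
    dest="/backup/data/",     # hypothetical backup destination
)
for cmd in cmds:
    print(shlex.join(cmd))
```

However long the `rsync` step takes, it reads the volume as it was when the snapshot was created, which is the time-consistency described above.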
 
Old 02-16-2011, 12:44 AM   #13
forbin
Member
 
Registered: Apr 2010
Posts: 43

Original Poster
Rep: Reputation: 0
syg00, I admit that I don't know much about LVM snapshots, but everything I've read here and elsewhere confirms that snapshots are NOT backups. They are simply a MEANS to a backup. A backup is a full copy of the data. When you are done with a backup, you end up with TWO copies of your data, like you would if you made a backup copy of a DVD. You can completely destroy the original and still have a good full copy. Taking a snapshot does not give you two copies of your data. It gives you a point-in-time view of a filesystem plus a delta of changes. Snapshots make it POSSIBLE to make consistent copies of the data, but a snapshot is not, in itself, a copy of the data. To prove my point, consider the case of a disk sector that goes bad. Unless that particular sector changed, your snapshot probably does NOT contain a copy of the data in that sector. The snapshot just points to the original sector. Hence the snapshot itself cannot be used to restore the data.

When I use rsync or rdiff-backup, I get a complete second copy of the data.
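The bad-sector argument can be illustrated with a small Python toy model. It is a sketch, not LVM's actual behaviour in detail: blocks are list entries, the snapshot area is a dict, and a "bad sector" is simulated by overwriting an origin block with `None`.

```python
origin = ["a", "b", "c", "d"]  # blocks on the physical disk
cow = {}                       # snapshot area: index -> preserved old block


def write(index, data):
    cow.setdefault(index, origin[index])  # preserve old contents once
    origin[index] = data


def snapshot_read(index):
    # Preserved blocks come from the snapshot area, the rest from the origin.
    return cow.get(index, origin[index])


write(1, "B")
backup = [snapshot_read(i) for i in range(len(origin))]  # full second copy

origin[2] = None  # simulate a bad sector in a block that never changed

snapshot_now = [snapshot_read(i) for i in range(len(origin))]
print(snapshot_now)  # ['a', 'b', None, 'd']
print(backup)        # ['a', 'b', 'c', 'd']
```

The snapshot still serves the preserved block 1, but block 2 was never copied, so the damage shows through; only the separate copy made from the snapshot survives intact.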
 
Old 02-16-2011, 01:37 AM   #14
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,251

Rep: Reputation: 4159
No-one has suggested a snap is a "secure" backup - me included
Quote:
Originally Posted by me
However if you take a snap, and then back that up...
You're not listening to what has been said.
 
Old 02-16-2011, 02:15 AM   #15
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
Quote:
Originally Posted by forbin View Post
syg00, I admit that I don't know much about LVM snapshots, but everything I've read here and elsewhere confirms that snapshots are NOT backups. They are simply a MEANS to a backup. A backup is a full copy of the data. When you are done with a backup, you end up with TWO copies of your data, like you would if you made a backup copy of a DVD. You can completely destroy the original and still have a good full copy. Taking a snapshot does not give you two copies of your data. It gives you a point-in-time view of a filesystem plus a delta of changes. Snapshots make it POSSIBLE to make consistent copies of the data, but a snapshot is not, in itself, a copy of the data. To prove my point, consider the case of a disk sector that goes bad. Unless that particular sector changed, your snapshot probably does NOT contain a copy of the data in that sector. The snapshot just points to the original sector. Hence the snapshot itself cannot be used to restore the data.

When I use rsync or rdiff-backup, I get a complete second copy of the data.
Well I'd say that it is a genuinely valid form of backup. If you have a system where you can run a process, perform significant changes to a set of files, totally wreck the joint, and then perform another process to get back to your starting point, that surely is a backup by any other name? It just depends what you are trying to protect from. You'll not be protected from physical disk failure with this as your "backup" method, but then neither would you be if you, for example, dd'd a partition into a file on a different partition on the same drive, but that is a full data copy....
 
  

