LinuxQuestions.org


forbin 02-15-2011 02:32 PM

Understanding LVM Snapshots
 
There's a lot of noise in GoogleSpace on the subject of LVM snapshots. I'm starting to grasp it, but one thing still really puzzles me. Where does the data come from when I run my backups?

Based on what I've been reading, a snapshot volume does not have to be as large as the source volume. The snapshot volume only needs to be large enough to accommodate the maximum number of changes that might occur to the source volume during the life of the snapshot.

Suppose I have a 600GB LV...

/dev/vg01/data

...but I only expect a maximum of 5GB worth of changes on any given day. So just to be safe, I create a 20GB snapshot volume...

/dev/vg01/snapshot.

At the end of the day, I shut down my database service, take the snapshot, and start the database service.

Supposedly I can now safely take a backup of /dev/vg01/snapshot, which gives me a full backup of /dev/vg01/data at the time the snapshot was taken.

Umm... how? Any way you look at it, /dev/vg01/snapshot is only 20GB in size, so how can it give me a full backup of /dev/vg01/data?

acid_kewpie 02-15-2011 02:42 PM

The snapshot is really just a copy of the inode tree at the point in time you created it. It is not 20GB in your example; 20GB is the space which is reserved for changes once a snapshot is taken. You can have looooooads of snapshots if you want, as long as the largest delta between any version of the filesystem doesn't exceed that snapshot reserve size.

Walking through a snapshot we have...

- 20gb LV partition exists, which is configured to include a 5gb snapshot allocation
- files keep changing in the remaining 15gb ext3 partition
- snapshot is taken, inode tree is copied and used for references to the newly created snapshot block device when it is mounted.
- files keep changing on the live filesystem, but any changes to the data are written in the 5gb snapshot space
- once 5gb of changes are made to the filesystem, further changes must be written to other parts of the LV, thus invalidating the snapshot.

It's pretty clever when you get it, hopefully you do now. Or I made things a lot worse, and TBH I assumed the thing about the inode tree... Think of it a bit like a Venn diagram. Two overlapping circles, one for the snapshot, one for the main filesystem. They are both the same size, but share the majority of the disk space where they overlap. There's no data copies etc, it's *exactly* the same disk sectors.

z1p 02-15-2011 03:01 PM

LVM snapshots use copy-on-write (CoW) technology. That is, data is only written to the snapshot area when a block is changed on the original: the block's old contents are copied there before the change is made. So when it is time to read from the snapshot (for making a backup or whatever), any changed blocks are read from the snapshot area, but any blocks that haven't changed since the snapshot are read from the original.
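The read path described above is what answers the original question of how a small snapshot can present a full image. Here is a minimal Python sketch of the idea. It is a toy model, not LVM code: the `Origin` and `Snapshot` classes and their methods are made up for illustration, and blocks are just list entries.

```python
class Origin:
    """Toy origin volume: a list of blocks plus any snapshots taken of it."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def write(self, index, data):
        # Copy-on-write: save the old contents for each snapshot first.
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data


class Snapshot:
    """Point-in-time view that stores only blocks changed since creation."""

    def __init__(self, origin):
        self.origin = origin
        self.saved = {}  # block index -> contents at snapshot time
        origin.snapshots.append(self)

    def preserve(self, index, old_data):
        # Only the first change to a block needs to be preserved.
        self.saved.setdefault(index, old_data)

    def read(self, index):
        # Changed blocks come from the snapshot area, the rest from the origin.
        if index in self.saved:
            return self.saved[index]
        return self.origin.blocks[index]

    def read_all(self):
        return [self.read(i) for i in range(len(self.origin.blocks))]


data = Origin(["a", "b", "c", "d", "e", "f"])
snap = Snapshot(data)
data.write(1, "B")
data.write(4, "E")

print(snap.read_all())   # ['a', 'b', 'c', 'd', 'e', 'f']
print(data.blocks)       # ['a', 'B', 'c', 'd', 'E', 'f']
print(len(snap.saved))   # 2
```

The snapshot hands back all six original blocks even though it only stores two of them. The other four are read straight from the origin.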

forbin 02-15-2011 03:11 PM

Quote:

Originally Posted by acid_kewpie (Post 4259466)
The snapshot is really just a copy of the inode tree at the point in time you created it. It is not 20GB in your example; 20GB is the space which is reserved for changes once a snapshot is taken. You can have looooooads of snapshots if you want, as long as the largest delta between any version of the filesystem doesn't exceed that snapshot reserve size.

Walking through a snapshot we have...

- 20gb LV partition exists, which is configured to include a 5gb snapshot allocation
- files keep changing in the remaining 15gb ext3 partition
- snapshot is taken, inode tree is copied and used for references to the newly created snapshot block device when it is mounted.
- files keep changing on the live filesystem, but any changes to the data are written in the 5gb snapshot space
- once 5gb of changes are made to the filesystem, further changes must be written to other parts of the LV, thus invalidating the snapshot.

It's pretty clever when you get it, hopefully you do now. Or I made things a lot worse, and TBH I assumed the thing about the inode tree... Think of it a bit like a Venn diagram. Two overlapping circles, one for the snapshot, one for the main filesystem. They are both the same size, but share the majority of the disk space where they overlap. There's no data copies etc, it's *exactly* the same disk sectors.

This is helpful information and mostly confirms my suspicions. Hopefully I can ask a couple of follow-up questions:

1. If it is exactly the same physical disk sectors, then backing up the system during production hours will cause a performance hit, correct? If that is true, then the only benefit I see from using snapshots is that it minimizes system downtime.

2. It can't be 100% EXACTLY the same disk sectors. Changed sectors must be kept in two places, right? The version that was current at the time of the snapshot plus the new version after changes. Those would be different sectors.

3. Keeping snapshots running (i.e., tracking changes) during production hours would cause a small performance hit, I assume.

acid_kewpie 02-15-2011 03:17 PM

1. Well, it's always going to be accessing the same disk, isn't it? It's not going to matter if it's the same sectors or different sectors, as chances are you're not going to be actively reading the same sectors, are you? So I think that's a non-issue.

2. Well, it's 100% until you make a change to the filesystem, then it's 99%... 95... 80... until the amount that has changed is more than the snapshot reserve can hold, at which point the snapshot is invalid.

3. No, there are no changes being tracked. LVM simply marks different parts of the disk as eligible to be written to. It's a one-time thing and just leaves a copy of the old data once it's masked off where all new changes need to go. The only difference is that a change to a file will not be written back to where it was read from, but to this new snapshot reserve area. So the live inode tree points to the sectors for the new version of the file, the snapshot tree still points to the old location, and that's at a partial level, not the entire file. If 95% of the file is untouched on disk (rounded to inode size), then both filesystems still point to 95% of the inodes for that file, and only 5% differ. It's all implicit other than knowing when you need to invalidate the snapshot, but that's fairly straightforward. A snapshot doesn't "run", it's totally passive.
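Point 2 above, the snapshot staying usable only while the changes fit in the reserve, can be sketched with a toy model. It assumes the copy-on-write behaviour z1p describes earlier in the thread. The `CowSnapshot` class, its `reserve_chunks` parameter and the `write` helper are hypothetical names for illustration, and real LVM also spends some of the reserve on its own metadata.

```python
class CowSnapshot:
    """Toy snapshot with a fixed number of chunks reserved for old data."""

    def __init__(self, origin, reserve_chunks):
        self.origin = origin
        self.reserve_chunks = reserve_chunks
        self.saved = {}    # chunk index -> contents at snapshot time
        self.valid = True

    def before_origin_write(self, index):
        """Called before chunk `index` of the origin is overwritten."""
        if not self.valid or index in self.saved:
            return  # nothing to preserve
        if len(self.saved) >= self.reserve_chunks:
            # No room left to keep the old data, so the snapshot can no
            # longer present a consistent point-in-time view.
            self.valid = False
            return
        self.saved[index] = self.origin[index]

    def used_percent(self):
        return 100.0 * len(self.saved) / self.reserve_chunks


origin = list("abcdefgh")
snap = CowSnapshot(origin, reserve_chunks=2)


def write(index, data):
    snap.before_origin_write(index)
    origin[index] = data  # the write to the origin always goes through


write(0, "A")
write(0, "AA")                        # same chunk again, no extra space used
usage_after_one_chunk = snap.used_percent()
write(1, "B")                         # reserve is now exactly full
valid_when_full = snap.valid
write(2, "C")                         # a third distinct chunk does not fit
valid_after_overflow = snap.valid

print(usage_after_one_chunk, valid_when_full, valid_after_overflow)
# 50.0 True False
```

Note that what counts against the reserve is the number of distinct chunks changed, not the number of writes.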

acid_kewpie 02-15-2011 03:21 PM

I have a badly formed analogy of stencils and spray paints in my head, but i'll spare you that one.

forbin 02-15-2011 03:27 PM

Quote:

Originally Posted by acid_kewpie (Post 4259497)
1. Well, it's always going to be accessing the same disk, isn't it? It's not going to matter if it's the same sectors or different sectors, as chances are you're not going to be actively reading the same sectors, are you? So I think that's a non-issue.

Running a backup causes disk iops, which subtracts from the total iops available for production use, so I don't see how running a backup could fail to cause a performance hit.

Quote:

Originally Posted by acid_kewpie (Post 4259497)
3. no, there are no changes being tracked.

I think I'm referring to the Copy-on-Write operations that are going on while the snapshot exists, which z1p mentioned above. Or are you guys saying different things?

acid_kewpie 02-15-2011 03:36 PM

Quote:

Originally Posted by forbin (Post 4259508)
Running a backup causes disk iops, which subtracts from the total iops available for production use, so I don't see how running a backup could fail to cause a performance hit.

Well I meant the hit is the same all the time the data is on the same disk. It's just not "interesting" here I don't think.

Quote:

I think I'm referring to the Copy-on-Write operations that are going on while the snapshot exists, which z1p mentioned above. Or are you guys saying different things?

No, I think I'm just simplifying things a little too much TBH, so yes, there's the hit because that block is copied, but if that is able to have a significant impact, then something isn't right, surely.

forbin 02-15-2011 03:45 PM

Thanks so much for your comments. I will now go to my quiet place and try to make sure I understand them better before I ask anything else. :-)

z1p 02-15-2011 08:36 PM

Just a few comments related to the discussion going on.

Whether or not accessing the snapshot will hit the same disk depends on how the volume group is laid out. Also, the impact can vary depending on the volume group setup and use. [Google "LVM performance" and you can see what I mean.]


Quote:

Originally Posted by acid_kewpie
3. No, there are no changes being tracked. LVM simply marks different parts of the disk as eligible to be written to. It's a one-time thing and just leaves a copy of the old data once it's masked off where all new changes need to go. The only difference is that a change to a file will not be written back to where it was read from, but to this new snapshot reserve area. So the live inode tree points to the sectors for the new version of the file, the snapshot tree still points to the old location, and that's at a partial level, not the entire file. If 95% of the file is untouched on disk (rounded to inode size), then both filesystems still point to 95% of the inodes for that file, and only 5% differ. It's all implicit other than knowing when you need to invalidate the snapshot, but that's fairly straightforward. A snapshot doesn't "run", it's totally passive.

My understanding is that it does track changes. When the original is modified, a check is first made to determine whether the block has already changed since the snapshot. If it hasn't, its old contents are added to the 'changed blocks' on the snapshot and then the original is updated. If that same block is updated again, no further write to the snapshot is needed.
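The write path just described can be sketched as follows, which also gives a feel for the performance question raised earlier. This is a toy model, not LVM code: the `CountingOrigin` class is made up, and counting each preserved block as one extra read plus one extra write is an assumption (real LVM works in chunks and also updates snapshot metadata).

```python
class CountingOrigin:
    """Toy origin volume that counts disk operations while a snapshot exists."""

    def __init__(self, nblocks):
        self.blocks = [0] * nblocks
        self.preserved = set()  # blocks already copied to the snapshot area
        self.io_ops = 0

    def write(self, index, value):
        if index not in self.preserved:
            # First change since the snapshot: read the old block and
            # write it to the snapshot area (assumed cost: 2 operations).
            self.io_ops += 2
            self.preserved.add(index)
        # The write to the origin itself.
        self.blocks[index] = value
        self.io_ops += 1


vol = CountingOrigin(nblocks=8)
vol.write(5, 1)   # first change to block 5: copy, then write
ops_after_first = vol.io_ops
vol.write(5, 2)   # same block again: plain write
vol.write(5, 3)
ops_after_third = vol.io_ops

print(ops_after_first, ops_after_third, sorted(vol.preserved))  # 3 5 [5]
```

In this model only the first write to a block pays the extra cost. After that, writes to the same block cost the same as they would without a snapshot.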

I personally don't think that anything is done at the inode level; it is all done at the disk block level, which doesn't know the difference between a block used for data and a block used for filesystem metadata (inodes).

I think the easiest way to see how that might work is to think of the case where a file is grown. In that case a number of blocks will be written to with data; these blocks will have their original data copied to the snapshot. Also, as a result of the file growing, the inode structure [which resides in disk blocks, possibly scattered throughout the disk] will be modified. This means that the blocks on disk that contain the changed inode info will be updated. When this happens the original data in these blocks will be copied to the snapshot, preserving the filesystem structure [inodes] as it was at the time of the snapshot.

If the modified file is later read through the actual volume, then the new inode structure is read and the new data is retrieved. If the file is read through the snapshot, then the original inode structure as captured by the snapshot is read, which will point to the blocks used by the file at the time of the snapshot. So some blocks may come off the real volume and some may come off the snapshot. This is all taken care of under the covers. Remember, the snapshot is of a volume, not a filesystem. In theory you could snapshot a raw volume.
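The growing-file case can be played through with a very small model. Everything here is hypothetical and heavily simplified: block 0 stands in for the inode and simply lists which blocks hold the file's data, and `saved` plays the role of the snapshot area. The point is that the copy-on-write logic never looks at what a block contains.

```python
# Toy volume: block 0 is the "inode" (a tuple of data-block numbers),
# the remaining blocks hold file data or are unused.
volume = [(1, 2), "hello ", "world", None, None]
saved = {}  # snapshot area: block number -> contents at snapshot time


def write_block(index, data):
    """Write to the live volume, preserving the old block on first change."""
    saved.setdefault(index, volume[index])
    volume[index] = data


def read_block(index, through_snapshot):
    """Read one block from the live volume or through the snapshot."""
    if through_snapshot and index in saved:
        return saved[index]
    return volume[index]


def read_file(through_snapshot):
    """Follow the inode in block 0 and join the data blocks it lists."""
    inode = read_block(0, through_snapshot)
    return "".join(read_block(i, through_snapshot) for i in inode)


# The snapshot is "taken" here, with an empty snapshot area.
# Now the file is changed and grown on the live volume:
write_block(2, "there")      # overwrite an existing data block
write_block(3, ", world")    # fill a previously unused block
write_block(0, (1, 2, 3))    # the inode now lists three blocks

live_view = read_file(through_snapshot=False)
snap_view = read_file(through_snapshot=True)
print(live_view)      # hello there, world
print(snap_view)      # hello world
print(sorted(saved))  # [0, 2, 3]
```

When the file is read through the snapshot, block 1 comes off the real volume while blocks 0 and 2 come from the snapshot area, which matches the mix described above.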

forbin 02-15-2011 11:06 PM

z1p, your explanation seems very natural and makes a huge amount of sense to me. However, whether you or acid_kewpie are more correct, there is enough agreement between you that I now realize that LVM snapshots will not accomplish what I want. A snapshot is NOT a backup in itself. It just makes backups possible. However, I am quite happy with solutions such as rsync and rdiff-backup, which do make backups of your data while also maximizing system availability.

syg00 02-16-2011 12:10 AM

Not true - snapshot offers the benefit of a "point-in-time" consistent image of the data. The others don't.
Whether you consider this important is your decision.

However if you take a snap, and then back that up, *all* the data is known to be (time) consistent - regardless of when the actual backup runs. I don't use LVM, but I do similar on btrfs and consider time-consistent backups absolutely essential.

forbin 02-16-2011 12:44 AM

syg00, I admit that I don't know much about LVM snapshots, but everything I've read here and elsewhere confirms that snapshots are NOT backups. They are simply a MEANS to a backup. A backup is a full copy of the data. When you are done with a backup, you end up with TWO copies of your data, like you would if you made a backup copy of a DVD. You can completely destroy the original and still have a good full copy. Taking a snapshot does not give you two copies of your data. It gives you a point-in-time view of a filesystem plus a delta of changes. Snapshots make it POSSIBLE to make consistent copies of the data, but a snapshot is not, in itself, a copy of the data. To prove my point, consider the case of a disk sector that goes bad. Unless that particular sector changed, your snapshot probably does NOT contain a copy of the data in that sector. The snapshot just points to the original sector. Hence the snapshot itself cannot be used to restore the data.

When I use rsync or rdiff-backup, I get a complete second copy of the data.
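The bad-sector argument above can be illustrated with a toy model. It is only a sketch under simplifying assumptions: a "bad sector" is modelled as a block set to `None`, and the helper names `write` and `snapshot_read` are made up. It also shows why a copy made from the snapshot is a different thing from the snapshot itself.

```python
origin = ["a", "b", "c", "d"]
saved = {}  # snapshot area: old contents of blocks changed since the snapshot


def write(index, data):
    """Copy-on-write: keep the old block on first change, then overwrite."""
    saved.setdefault(index, origin[index])
    origin[index] = data


def snapshot_read(index):
    """Read a block as it was at snapshot time."""
    block = saved[index] if index in saved else origin[index]
    if block is None:
        raise OSError(f"block {index} is unreadable")
    return block


write(0, "A")  # one block changes after the snapshot is taken

# A backup made *from* the snapshot is a real second copy of every block.
backup = [snapshot_read(i) for i in range(len(origin))]

origin[2] = None  # simulate a bad sector in a block that never changed

try:
    snapshot_read(2)
    snapshot_ok = True
except OSError:
    snapshot_ok = False

print(snapshot_ok)   # False
print(backup)        # ['a', 'b', 'c', 'd']
```

The snapshot loses block 2 along with the origin because it never held its own copy, while the list copied out of the snapshot beforehand is unaffected.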

syg00 02-16-2011 01:37 AM

No-one has suggested a snap is a "secure" backup - me included.
Quote:

Originally Posted by me
However if you take a snap, and then back that up...

You're not listening to what has been said.

acid_kewpie 02-16-2011 02:15 AM

Quote:

Originally Posted by forbin (Post 4259914)
syg00, I admit that I don't know much about LVM snapshots, but everything I've read here and elsewhere confirms that snapshots are NOT backups. They are simply a MEANS to a backup. A backup is a full copy of the data. When you are done with a backup, you end up with TWO copies of your data, like you would if you made a backup copy of a DVD. You can completely destroy the original and still have a good full copy. Taking a snapshot does not give you two copies of your data. It gives you a point-in-time view of a filesystem plus a delta of changes. Snapshots make it POSSIBLE to make consistent copies of the data, but a snapshot is not, in itself, a copy of the data. To prove my point, consider the case of a disk sector that goes bad. Unless that particular sector changed, your snapshot probably does NOT contain a copy of the data in that sector. The snapshot just points to the original sector. Hence the snapshot itself cannot be used to restore the data.

When I use rsync or rdiff-backup, I get a complete second copy of the data.

Well I'd say that it is a genuinely valid form of backup. If you have a system where you can run a process, perform significant changes to a set of files, totally wreck the joint, and then perform another process to get back to your starting point, that surely is a backup by any other name? It just depends what you are trying to protect from. You'll not be protected from physical disk failure with this as your "backup" method, but then neither would you be if you, for example, dd'd a partition into a file on a different partition on the same drive, but that is a full data copy....


All times are GMT -5.