LinuxQuestions.org

Linux - Server: File is larger when being copied, than what it actually is (https://www.linuxquestions.org/questions/linux-server-73/file-is-larger-when-being-copied-than-what-it-actually-is-4175706908/)

Red Squirrel 01-24-2022 07:35 PM

File is larger when being copied, than what it actually is
 
EDIT: changed the title so it's more relevant to what the issue actually is, as at first I did not realize what was going on.

So I have this VM file I want to rsync to another server to back it up, because I need to format and rebuild the host and I want to avoid having to recreate the VM and rebuild that too.

According to "du" the file is 17GB, which makes sense for a fairly standard Linux distro with only a bit of data on it.

But according to everything else, it's essentially as big as the entire file system. For example if I try to copy it, it just keeps copying even when it gets to 17GB. In stat as well as dir, it also shows it's much bigger.

What is going on?

Quote:

root@server04:100# du -h vm-100-disk-0.qcow2
17G vm-100-disk-0.qcow2


root@server04:100# stat vm-100-disk-0.qcow2
File: vm-100-disk-0.qcow2
Size: 3221717254144 Blocks: 33723240 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 179961860 Links: 1
Access: (0640/-rw-r-----) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2022-01-25 13:07:43.003980187 +0000
Modify: 2022-01-24 02:40:04.772365413 +0000
Change: 2022-01-24 02:40:04.772365413 +0000
Birth: -



root@server04:100# dir -h
total 17G <--- here it's normal
0 -rw-r--r-- 1 root root 0 Jan 24 21:45 test
17G -rw-r----- 1 root root 3.0T Jan 24 02:40 vm-100-disk-0.qcow2 <--- but here it's not?


In my 20+ years of using Linux I have never seen this before and it's messing with me. This is a server at OVH running Proxmox VE, and the partitioning scheme is really weird. It caused me to run out of disk space at some point, and then I had a weird issue where the disk space was not being released properly after deleting the file. That caused some weird corruption all over as well, and nothing really works properly. So I will be rebuilding the server, this time choosing custom partitioning so I can do something more sane than their default. Their default only allocates 20GB to the / partition and allocates most of the rest to some oddball folder deep within /var. That's where the file is now, as I had to move it when I realized that, but the damage was done at that point: / ran out of space and then I started getting corruption.

I can't copy this file for the life of me though because instead of only creating a 17GB file at the destination it just keeps growing until the destination runs out of space.




Old post:


I never figured moving a 32GB file would be this hard...

Trying to get a VM off a server so I can format that server and then move the VM back once I reinstall everything. The file system is all messed up since I did not pay attention when configuring it and I ended up with only 20GB on the / file system. When it got full, I ended up with a lot of corruption.

I started rsyncing the file to another server that has over 1TB of free space but it errored out after 10 hours with this:

Quote:

rsync: [receiver] write failed on "/tmp/vm-100-disk-0.qcow2": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(378) [receiver=3.2.3]

rsync: [sender] write error: Broken pipe (32)

What would cause this and how can I ensure it won't happen again? It's a long wait to copy... only for it to fail at the halfway mark.

I don't have physical access to this server, it's a dedicated server at OVH, so using USB or other storage device is not an option.


EDIT:

From further research, apparently rsync copies the file to a temp location before it copies to the actual location, so I'm not sure where that location is and which side it's on (client or server) but maybe that was the problem. I used --inplace which supposedly stops that. Will let it go overnight and see what happens and report back.

I'm only getting like 28MB/sec over what is supposed to be a gig network so really not sure what's happening.

jefro 01-24-2022 08:46 PM

The host may have enough space to add a drive to the existing VM then clone??

I assume the network is having issues.

rknichols 01-24-2022 09:43 PM

Quote:

Originally Posted by Red Squirrel (Post 6321737)
From further research, apparently rsync copies the file to a temp location before it copies to the actual location, so I'm not sure where that location is and which side it's on (client or server) but maybe that was the problem.

That temporary location is in the same destination directory where the file will reside, so that the final action is a simple rename within the directory.

If you are copying large files and deleting destination large files (recursing into directories and using one of the "--delete" options), there might not be enough space for all versions. You might want to use the "--delete-before" option to avoid that problem. See the manpage for other considerations about "--delete-before".

Another thing to check is whether you are running out of inodes at the destination ("df -i").
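
Both checks can be run before starting a long transfer. A minimal sketch, assuming GNU coreutils and, for simplicity, that the file and the destination directory are on the same machine; the helper name `fits_in` and the scratch paths are made up for illustration. It compares the file's apparent size (the number of bytes a plain copy will try to write) with the free space `df` reports for the destination, then prints the inode usage.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE's apparent size fits in the free space of DIR.
fits_in() {
    file=$1
    dir=$2
    need_bytes=$(stat -c %s "$file")                        # apparent size in bytes
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')  # free space in 1 KiB blocks
    [ "$need_bytes" -le $((avail_kb * 1024)) ]
}

# Scratch file standing in for the real image.
demo_dir=$(mktemp -d)
printf 'hello\n' > "$demo_dir/small.img"

if fits_in "$demo_dir/small.img" "$demo_dir"; then
    fits_result=yes
else
    fits_result=no
fi
echo "fits: $fits_result"

# Inode usage for the same file system (the "df -i" check).
df -i "$demo_dir"
```

For a remote destination the `df` half would have to run on the receiving host, for example through ssh.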

Turbocapitalist 01-24-2022 10:19 PM

There is also the --inplace option to consider.

pan64 01-25-2022 12:37 AM

Quote:

Originally Posted by Red Squirrel (Post 6321737)

EDIT:

From further research, apparently rsync copies the file to a temp location before it copies to the actual location, so I'm not sure where that location is and which side it's on (client or server) but maybe that was the problem.

The error message tells you: rsync has a sender and a receiver side, and this message was reported by the receiver. The write also failed on /tmp/<filename>, so the location is /tmp. Sometimes /tmp is configured as a ramdisk and sometimes its size is limited. I don't really know your configuration, but that could be the reason.

shruggy 01-25-2022 05:16 AM

With -T/--temp-dir, you can specify where rsync should create its temporary files.

Red Squirrel 01-25-2022 07:20 AM

Quote:

Originally Posted by jefro (Post 6321750)
The host may have enough space to add a drive to the existing VM then clone??

I assume the network is having issues.

The server I'm dealing with is the actual host. I need to completely format the server because OVH's default partition settings only allocate 20GB to /, which screwed me over: I ran out of space, which caused a bunch of stuff to corrupt. So I need to move the VM to another OVH server I happen to still have so I can format the host through the control panel. But yeah, not sure why it's transferring so slowly. Both servers are in the same DC and have a gig connection so in theory I should be getting gig speeds.

I may ask tech support if they can stick a USB stick in the server or something but not sure if that's something they would do.


Oh and /tmp is where I was actually writing the file. There is over 1TB of space on / and tmp is part of that partition. But I wonder if that folder is treated differently so I made another folder that is separate from tmp and trying again, and also used --inplace, so we'll see.

It failed at around 50% last time, I'm around 1-2 hours away from hitting 50% so I'll see if it keeps going after that.

Red Squirrel 01-25-2022 12:48 PM

FFS. Happened again. But here is the weird part, the file is only 17GB at the source, but at the destination it's 1.7TB so it really is filling up the drive. Why is it doing that? It's making the file bigger at the other end somehow. It's as if it's just copying nonstop even when it's done. I have never seen that ever.

Code:

root@server04:100# rsync --inplace --verbose --progress vm-100-disk-0.qcow2 debian@192.99.10.155:/localdata/
vm-100-disk-0.qcow2
1,816,239,013,888  56%  28.39MB/s  13:25:44
rsync: [receiver] write failed on "/localdata/vm-100-disk-0.qcow2": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(378) [receiver=3.2.3]

rsync: [sender] write error: Broken pipe (32)




Code:

total 16861620
      0 -rw-r--r-- 1 root root            0 Jan 24 21:45 test
16861620 -rw-r----- 1 root root 3221717254144 Jan 24 02:40 vm-100-disk-0.qcow2
root@server04:100#
root@server04:100#
root@server04:100# du -h
17G        .
root@server04:100#



At this point I think I'm just going to cut my losses and format the server without saving the VM and just rebuild the VM from scratch. I was really trying to save myself the trouble because I had set up SSL certs, Apache config, etc. and a lot of tedious crap, and now I'm going to have to do it over again. But I have wasted more time trying to save this file at this point than it would have taken me to just rebuild it.

But I'm just stumped at what is going on. I have never seen this before.

boughtonp 01-25-2022 01:11 PM

Quote:

Originally Posted by Red Squirrel (Post 6321984)
the file is only 17GB at the source

It might only be using 17GB of disk space, but that listing clearly shows the file is ~3TB.

This is a sign of an optimization called "thin provisioning" (the file is sparse), and possibly means you want rsync's --sparse option, or possibly to convert/compress the file before transfer (it seems the sparse option might only apply after transfer, so it still sends the full 3TB of data).
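
The gap between the two numbers can be reproduced on a scratch file. A minimal sketch, assuming GNU coreutils and a file system that supports holes; `sparse-demo.img` is a made-up name. `truncate` sets the length without writing any data, so the apparent size (what `ls -l`, `stat`, and a plain copy see) is 1 GiB while next to nothing is allocated (what `du` counts).

```shell
#!/bin/sh
demo_dir=$(mktemp -d)
img="$demo_dir/sparse-demo.img"

# Set the length to 1 GiB without writing any data blocks.
truncate -s 1G "$img"

# Length in bytes, as ls -l and stat's "Size:" field show it.
apparent=$(stat -c %s "$img")
# Bytes actually allocated: block count times the size of each counted block.
allocated=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))

echo "apparent size:  $apparent bytes"
echo "allocated size: $allocated bytes"
```

The same arithmetic applies to the stat output earlier in the thread: Blocks: 33723240 at 512 bytes each is about 17.3 GB, in line with du's figure, while Size: 3221717254144 is the apparent length that a plain copy tries to reproduce.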


Red Squirrel 01-25-2022 02:43 PM

That is odd. Is there a way to disable this thin provisioning? Never seen that before, at least not at a normal file level. It is a VM and it is thin provisioned but I thought that was handled by the VM software? The raw file itself should just be a normal file as far as rsync is concerned.

shruggy 01-25-2022 02:58 PM

I don't think this is enabled by default unless you specify -S/--sparse.
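
For reference, a sketch of what -S/--sparse changes on the receiving side, assuming rsync and GNU coreutils are installed; the paths are scratch files and the copy is local rather than over SSH. As far as I know the sender still reads the file's full apparent length and the progress counter follows that length, so seeing the counter run past the allocated size is expected. What the option changes is that runs of zeros are written as holes at the destination.

```shell
#!/bin/sh
demo_dir=$(mktemp -d)
src="$demo_dir/src.img"
dst="$demo_dir/dst.img"

# 1 MiB of data followed by a hole: 64 MiB apparent size in total.
head -c 1048576 /dev/urandom > "$src"
truncate -s 64M "$src"

if command -v rsync >/dev/null 2>&1; then
    rsync --sparse "$src" "$dst"
else
    # Assumption: fall back to GNU cp so the sketch still runs without rsync.
    cp --sparse=always "$src" "$dst"
fi

cmp -s "$src" "$dst" && same=yes || same=no
echo "identical: $same"
du -k --apparent-size "$src" "$dst"   # apparent size in KiB
du -k "$src" "$dst"                   # allocated size in KiB
```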

Red Squirrel 01-25-2022 03:11 PM

Trying sparse with and without --inplace but that does not seem to be working either. It just keeps copying even when it's past 17GB. This is the weirdest thing.

"stat" also seems to say the file is much larger than it actually is. I had not noticed that at first in the dir command. Only du seems to be reporting the proper size.

Red Squirrel 01-25-2022 03:27 PM

Tried to see if I can use dd. But even dd goes over the file size! This is the weirdest thing ever. I'm ready to just cut my losses and not bother saving the VM but I need to figure out what is going on in case I run into this again. I've never seen anything like this before.

boughtonp 01-25-2022 06:04 PM


 
If dd is acting on the larger file, then it means your filesystem is aware of sparse files.

Presumably there's a way to get dd to behave on the actual data - it might be conv=sparse or it might be another option.
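
With GNU dd the option is indeed conv=sparse. A minimal sketch on scratch files, assuming GNU coreutils; the file names are illustrative. dd still reads the input's full apparent length, so its byte counter runs past the allocated size, but output blocks that are entirely zero are skipped with a seek instead of being written.

```shell
#!/bin/sh
demo_dir=$(mktemp -d)
src="$demo_dir/src.img"
dst="$demo_dir/dst.img"

# 1 MiB of data, a 62 MiB hole, then another 1 MiB of data at the end.
head -c 1048576 /dev/urandom > "$src"
dd if=/dev/urandom of="$src" bs=1M count=1 seek=63 conv=notrunc iflag=fullblock 2>/dev/null

# Copy, turning all-zero 1 MiB output blocks into holes.
dd if="$src" of="$dst" bs=1M conv=sparse 2>/dev/null

cmp -s "$src" "$dst" && same=yes || same=no
echo "identical: $same"
du -k "$src" "$dst"   # allocated KiB for both files
```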

Something else that might work is converting the file with qemu-img.

If the file is mostly empty, it should compress well - you could try rsync's --compress option and see if that helps, though the target machine might still run out of space; if so manually compressing and transferring that file might be needed.

(Also, du has --apparent-size to have it report larger sizes of sparse files instead of actual usage.)


Red Squirrel 01-25-2022 06:37 PM

So what is going on exactly though that it's doing this? I've never run into this at all before.

I've also tried to grab the file from the other end, but it still tries to pull down more data. --sparse on rsync did not seem to help either way, and neither did conv=sparse on dd. Though with dd I was testing it locally, just copying it right next to the same file. Is there a way to do dd over a network? I can try that.
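
On the network question: dd can be piped through ssh, but it would still push every zero across the wire. An alternative is GNU tar with -S/--sparse, which records holes in the archive instead of their contents. A sketch under those assumptions, using scratch files and a local pipe standing in for ssh; over a network the right-hand side of the pipe would run as the remote command (something like `ssh user@host 'tar -xf - -C /dest'`, where the host and path are placeholders).

```shell
#!/bin/sh
src_dir=$(mktemp -d)
dst_dir=$(mktemp -d)

# 1 MiB of data followed by a hole: 64 MiB apparent size.
head -c 1048576 /dev/urandom > "$src_dir/disk.img"
truncate -s 64M "$src_dir/disk.img"

# Pack with hole detection on one side of the pipe, unpack on the other.
tar -C "$src_dir" -Scf - disk.img | tar -C "$dst_dir" -xf -

# Measure how many bytes the stream carries for this file.
stream_bytes=$(tar -C "$src_dir" -Scf - disk.img | wc -c)

cmp -s "$src_dir/disk.img" "$dst_dir/disk.img" && same=yes || same=no
echo "identical: $same, bytes streamed: $stream_bytes"
```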

Before I format the server I will try to go in rescue mode and do it from there. Maybe this is something the OS is doing and it won't happen in rescue mode.


Googled what sparse means, as it's the first I've heard of it. So I guess it's a special kind of file that uses the file system in a more raw way and is preallocated to specific blocks? I wonder how it got created that way. I will play around with qemu-img to see if I can convert it to a normal file.

This is what I'm trying and it's having trouble though, am I doing something wrong?

Quote:

root@server04:100# qemu-img convert -f qcow2 -O raw vm-100-disk-0.qcow2 vm-100-disk-0.img -S 18432M
qemu-img: Invalid buffer size for sparse output specified. Valid sizes are multiples of 512 up to 16777216. Select 0 to disable sparse detection (fully allocates output).

Also tried dd options such as conv=sparse and conv=notrunc, no go.
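
Going by the error text quoted above, -S does not take the size of the output file: it expects a small buffer size, a multiple of 512 no larger than 16777216 bytes (16 MiB), and 18432M is far beyond that. A sketch of that rule as a shell check; `valid_sparse_size` is a hypothetical helper written for illustration, not part of qemu-img, and the commented commands at the end are untested suggestions (4096 is only an example value, and the `-compact` output name is made up).

```shell
#!/bin/sh
# Hypothetical helper mirroring the rule in the error message:
# valid values are multiples of 512 up to 16777216 (0 disables sparse detection).
valid_sparse_size() {
    bytes=$1
    [ "$bytes" -ge 0 ] && [ "$bytes" -le 16777216 ] && [ $((bytes % 512)) -eq 0 ]
}

# The 18432M from the failing command, in bytes (assuming M means MiB).
requested=$((18432 * 1024 * 1024))

if valid_sparse_size "$requested"; then
    echo "$requested: accepted"
else
    echo "$requested: rejected"
fi

if valid_sparse_size 4096; then
    echo "4096: accepted"
fi

# A value inside the allowed range would look like this (not run here):
# qemu-img convert -f qcow2 -O raw -S 4096 vm-100-disk-0.qcow2 vm-100-disk-0.img
# Rewriting to a fresh qcow2 file is another option worth trying (not run here):
# qemu-img convert -f qcow2 -O qcow2 vm-100-disk-0.qcow2 vm-100-disk-0-compact.qcow2
```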


All times are GMT -5.