How to virtually join files (i.e. without directing cat to a new file)?
I have a few large files that together form an image of a disk partition. In other words, if combined they would be the image of a partition, but it has been split so that each individual file is smaller.
I want to join them together in order to mount and view the contents of the partition, but I don't have enough space on my drive to cat them together and write them as a new file on the drive. Another benefit could be that if the end result is very large then you wouldn't have to wait forever for it all to be catted together.
It seems like such a simple thing to just virtually join them together without having to actually go through the process of using cat, but if I search for how to do this I don't find anything about it. Is there a simple way to do this?
If the files are small enough to fit in memory, then the solution is easy: just create a big tmpfs mount and cat the files together onto that. But I guess for something as big as a filesystem image, that's not going to be the case.
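For the case where the pieces do fit in memory, the tmpfs idea could look roughly like the sketch below. The function names, the mount point /mnt/joined, the output name image.img and the 1 MiB of headroom are all assumptions made for illustration. It also assumes GNU coreutils (`stat -c %s`) and root privileges for the mount.

```shell
#!/bin/bash
# Print the combined size, in bytes, of the given files.
total_bytes() {
    local total=0 f
    for f in "$@"; do
        total=$(( total + $(stat -c %s -- "$f") ))
    done
    echo "$total"
}

# Mount a tmpfs just large enough for the pieces and cat them into it.
# Must be run as root. /mnt/joined and image.img are hypothetical names.
join_in_tmpfs() {
    local mnt=/mnt/joined size
    size=$(( $(total_bytes "$@") + 1048576 ))   # 1 MiB headroom (assumption)
    mkdir -p "$mnt" || return 1
    mount -t tmpfs -o size="$size" tmpfs "$mnt" || return 1
    cat -- "$@" > "$mnt/image.img"
}
```

Comparing the output of `total_bytes /tmp/one /tmp/two /tmp/three` with the free memory reported by `free -b` should show quickly whether this route is realistic before anything gets mounted.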
It should be possible to do this with a simple program written on top of FUSE. I started trying to rig one up using Perl's Fuse.pm module, but it turned out to be a bit more complicated than I thought. It seems to work for me, and it's largely based on the Fuse example class, but be warned: I haven't tested it extensively. I doubt that it will damage any data so long as you treat the virtual file as read-only, but it's not pretty, I make no promises about it being fast (it could be very slow for large files, or not), and there may be bugs in it that could result in bad data in the virtual file. Oh, and don't try to write to the file (i.e. no fsck).
To run this, you will need a directory named "temp" in your home directory (this is where the virtual file will go), and some software that you should be able to install through your package manager: the fuse kernel module (which you probably have already), the libfuse libraries, Perl, and the Fuse.pm module for Perl.
To install Fuse.pm, open a root shell and type "perl -MCPAN -e shell", then "install Fuse". You may be prompted for defaults, and you can exit when finished. On Ubuntu that didn't work for me, but there's a package you can install instead: "sudo apt-get install libfuse-perl".
The array at the top of the script (just under where it says "my @files = ") contains the names of the files that make up the virtual file, in order. You will probably want to change this.
To mount the filesystem, just run the script - but be warned that this will lock your terminal, so have another one ready to access the files in. To unmount and free up the first terminal, use "fusermount -u ~/temp" (you should also run this if the script crashes for any reason to remove the mount point so you can start again).
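Since the script is largely untested, it may be worth checking the virtual file against the pieces before trusting it. The sketch below streams the pieces through `cmp`, so nothing is written to disk. The function name `verify_joined` is hypothetical, and reading standard input via `-` assumes a `cmp` that supports it (GNU diffutils does).

```shell
#!/bin/bash
# verify_joined JOINED PIECE...
# Succeeds only if JOINED is byte-for-byte identical to the concatenation
# of the PIECEs. The pieces are streamed, so no joined copy is written.
verify_joined() {
    local joined=$1; shift
    cat -- "$@" | cmp -s - "$joined"
}
```

With the filesystem mounted, this would be called as `verify_joined ~/temp/catenated /tmp/one /tmp/two /tmp/three`. It reads every byte once, so expect it to take a while on large images.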
Finally, here's the script:
Code:
#!/usr/bin/perl -w
use warnings;
use strict;
use Fuse qw(:all);

# The pieces that make up the virtual file, in order.
my @files = (
    "/tmp/one",
    "/tmp/two",
    "/tmp/three",
);

# Get the size of each piece and the total size of the virtual file.
my %filesize = map { $_ => sizeof($_) } @files;
my $totalfilesize = 0;
$totalfilesize += $filesize{$_} for @files;
warn "Starting with total file size $totalfilesize; do not modify files while filesystem mounted!";

# Size of a file in bytes; dies if the file cannot be stat'ed.
sub sizeof {
    my $file = shift;
    my $size = -s $file;
    die "Cannot get size of $file: $!\n" unless defined $size;
    return $size;
}

# Entries of the virtual filesystem (not to be confused with @files above):
# the root directory and the single joined file.
my (%files) = (
    '.' => {
        type => 0040,
        mode => 0755,
        ctime => time()-1000
    },
    catenated => {
        type => 0100,
        mode => 0644,
        ctime => time()-1000
    },
);

# Turn a FUSE path ("/" or "/name") into a key of %files.
sub filename_fixup {
    my ($file) = shift;
    $file =~ s,^/,,;
    $file = '.' unless length($file);
    return $file;
}

sub e_getattr {
    my ($file) = filename_fixup(shift);
    return -ENOENT() unless exists($files{$file});
    # The joined file reports the combined size of all pieces.
    my ($size) = ($files{$file}{type} & 0040) ? 0 : $totalfilesize;
    my ($modes) = ($files{$file}{type}<<9) + $files{$file}{mode};
    my ($dev, $ino, $rdev, $blocks, $gid, $uid, $nlink, $blksize) = (0,0,0,1,0,0,1,1024);
    my ($atime, $ctime, $mtime);
    $atime = $ctime = $mtime = $files{$file}{ctime};
    # 2 possible types of return values:
    #return -ENOENT(); # or any other error you care to
    return ($dev,$ino,$modes,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,$blksize,$blocks);
}

sub e_getdir {
    # return as many text filenames as you like, followed by the retval.
    print((scalar keys %files)."\n");
    return (keys %files),0;
}

sub e_open {
    # VFS sanity check; it keeps all the necessary state, not much to do here.
    my ($file) = filename_fixup(shift);
    print("open called\n");
    return -ENOENT() unless exists($files{$file});
    return -EISDIR() if $files{$file}{type} & 0040;
    print("open ok\n");
    return 0;
}

sub e_read {
    # return an error numeric, or binary/text string. (note: 0 means EOF, "0" will
    # give a byte (ascii "0") to the reading program)
    my ($file) = filename_fixup(shift);
    my ($buflen,$off) = @_;
    return -ENOENT() unless exists($files{$file});
    return -EISDIR() if $files{$file}{type} & 0040;
    return -EINVAL() if $off > $totalfilesize;
    return 0 if $off == $totalfilesize;

    # Find the piece that contains $off; $o ends up as the offset within it.
    my ($o, $i) = ($off, 0);
    while ($o >= $filesize{$files[$i]}) {
        $o -= $filesize{$files[$i]};
        $i++;
    }

    my $read = 0;
    my $ret = "";
    # Read up to min($buflen,$totalfilesize-$off) bytes, moving on to the
    # next piece whenever the current one is exhausted.
    while ($read < $buflen && defined $files[$i]) {
        open(my $in, '<', $files[$i]) or return -EINVAL();
        binmode($in);
        seek($in, $o, 0);
        my $chunk = '';
        my $r = read($in, $chunk, $buflen - $read);
        close($in);
        return -EINVAL() unless defined $r;
        $read += $r;
        $ret .= $chunk;
        $i++;
        $o = 0; # one iteration per file, so the next piece is read from its start
    }
    return "$ret";
}

sub e_statfs { return 255, 1, 1, 1, 1, 2 }

Fuse::main(
    "mountpoint" => "$ENV{HOME}/temp",
    "getattr"=>"main::e_getattr",
    "getdir" =>"main::e_getdir",
    "open" =>"main::e_open",
    "statfs" =>"main::e_statfs",
    "read" =>"main::e_read",
);
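The core of the read handler is mapping an offset in the virtual file to a piece and an offset inside that piece. As a rough illustration of that arithmetic outside of FUSE, here is a small shell sketch. The function name `read_span` is hypothetical, and the code assumes GNU coreutils (`stat -c %s`, `tail -c +N`, `head -c N`). It is meant for understanding and spot checks, not as a replacement for the script.

```shell
#!/bin/bash
# read_span OFFSET LENGTH PIECE...
# Write up to LENGTH bytes, starting at virtual OFFSET of the concatenation
# of the PIECEs, to stdout without ever creating the joined file.
read_span() {
    local off=$1 len=$2; shift 2
    local f size n
    for f in "$@"; do
        (( len > 0 )) || break
        size=$(stat -c %s -- "$f")
        if (( off >= size )); then
            off=$(( off - size ))    # the span starts in a later piece
            continue
        fi
        n=$(( size - off ))          # bytes available in this piece
        (( n > len )) && n=$len
        # tail -c +N starts output at byte N, counting from 1
        tail -c +"$(( off + 1 ))" -- "$f" | head -c "$n"
        len=$(( len - n ))
        off=0                        # later pieces are read from their start
    done
}
```

For example, `read_span 1048576 512 /tmp/one /tmp/two /tmp/three | od -c` dumps 512 bytes starting 1 MiB into the virtual image, which can be compared against the same range read from ~/temp/catenated.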