LinuxQuestions.org
Old 08-21-2009, 05:43 PM   #1
chadwick
Member
 
Registered: Apr 2005
Location: At the 100th Meridian where the great plains begin
Distribution: Debian Testing on T60 laptop
Posts: 105

Rep: Reputation: 17
How to virtually join files (i.e. without directing cat to a new file)?


I have a few large files that together form an image of a disk partition. In other words, if combined they would be the image of a partition, but it has been split so that each individual file is smaller.

I want to join them together in order to mount and view the contents of the partition, but I don't have enough space on my drive to cat them together and write them as a new file on the drive. Another benefit could be that if the end result is very large then you wouldn't have to wait forever for it all to be catted together.

It seems like such a simple thing to just virtually join them together without having to actually go through the process of using cat, but if I search for how to do this I don't find anything about it. Is there a simple way to do this?

Last edited by chadwick; 08-21-2009 at 07:58 PM.
 
Old 08-21-2009, 09:54 PM   #2
rjlee
Senior Member
 
Registered: Jul 2004
Distribution: Ubuntu 7.04
Posts: 1,994

Rep: Reputation: 76
There seems to have been some discussion on this on the kernel mailing lists (http://lkml.indiana.edu/hypermail/li...02.3/0464.html) but it didn't amount to much.

If the files are small enough to fit in memory, then the solution is easy: just create a big tmpfs partition and cat the files together onto that. But I guess for something as big as a filesystem, that's not going to be the case.
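Whether the tmpfs route is viable comes down to comparing the combined size of the parts with the machine's RAM. The following is only a rough sketch of that check, written in Python for illustration; the helper names and the 50% headroom are hypothetical choices, and it ignores swap, which tmpfs can also use.

```python
import os

def total_size(paths):
    """Sum of the sizes of the given files, in bytes."""
    return sum(os.path.getsize(p) for p in paths)

def physical_ram():
    """Physical memory in bytes (uses sysconf names available on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")

def fits_in_ram(paths, headroom=0.5):
    """True if the parts together need at most `headroom` of physical RAM.

    The 0.5 default is an arbitrary safety margin (an assumption),
    not a limit imposed by tmpfs.
    """
    return total_size(paths) <= headroom * physical_ram()
```

If the check passes, a tmpfs mounted with a "size=" option slightly larger than the total should be able to hold the joined image.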

It should be possible to do this using a simple program written on top of fuse. I started trying to rig up a simple program using perl's Fuse.pm module, but it turned out to be a bit more complicated than I thought. It seems to work for me, and it's largely based on the Fuse example class - but be warned: I haven't extensively tested it. I doubt that it will damage any data so long as you treat the virtual files as read-only, but it's not pretty, I make no promises about it being fast (it could be very slow for large files - or not) and there may be bugs in it that could result in bad data in the virtual file. Oh, and don't try and write to the file (i.e. no fsck).

To run this, you will need a directory named "temp" in your home directory (this is where the virtual file will go), and some software that you should be able to install through your package manager: the fuse kernel module (which you probably have already), the libfuse libraries, Perl, and the Fuse.pm module for Perl.

To install Fuse.pm, open a root shell and type "perl -MCPAN -e shell", then "install Fuse". You may be prompted for defaults, and you can exit when finished. On Ubuntu, that didn't work for me, but there's a package you can install instead: "sudo apt-get install libfuse-perl".

The array at the top of the script (just under where it says "my @files = ") contains the names of the files that make up the virtual file, in order. You will probably want to change this.

To mount the filesystem, just run the script; the joined file then appears as ~/temp/catenated. Be warned that this will lock your terminal, so have another one ready to access the file in. To unmount and free up the first terminal, use "fusermount -u ~/temp" (you should also run this if the script crashes for any reason, to remove the mount point so you can start again).

Finally, here's the script:
Code:
#!/usr/bin/perl -w

use warnings;
use strict;
use Fuse qw(:all);
use POSIX qw(ENOENT EISDIR EINVAL); # errno constants, returned negated on failure

my @files = (
    "/tmp/one",
    "/tmp/two",
    "/tmp/three",
    );

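# Size of each part in bytes
my %filesize = map { $_ => sizeof($_) } @files;

my $totalfilesize = 0;
$totalfilesize += $filesize{$_} for @files;

warn "Starting with total file size $totalfilesize; do not modify files while filesystem mounted!";

# Use stat rather than shelling out to wc, so that file names containing
# spaces or shell metacharacters are handled safely.
sub sizeof {
    my $file = shift;
    my $size = (stat $file)[7];
    die "Cannot stat $file: $!\n" unless defined $size;
    return $size;
}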
my (%files) = (
        '.' => {
                type => 0040,
                mode => 0755,
                ctime => time()-1000
        },
        catenated => {
                cont => "This is file 'b'.\n",
                type => 0100,
                mode => 0644,
                ctime => time()-1000
        },
);

sub filename_fixup {
        my ($file) = shift;
        $file =~ s,^/,,;
        $file = '.' unless length($file);
        return $file;
}

sub e_getattr {
        my ($file) = filename_fixup(shift);
        return -ENOENT() unless exists($files{$file});
        # Only the virtual file carries data: report the combined size of all parts for it.
        my ($size) = exists($files{$file}{cont}) ? $totalfilesize : 0;
        my ($modes) = ($files{$file}{type}<<9) + $files{$file}{mode};
        my ($dev, $ino, $rdev, $blocks, $gid, $uid, $nlink, $blksize) = (0,0,0,1,0,0,1,1024);
        my ($atime, $ctime, $mtime);
        $atime = $ctime = $mtime = $files{$file}{ctime};
        # 2 possible types of return values:
        #return -ENOENT(); # or any other error you care to
        #print(join(",",($dev,$ino,$modes,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,$blksize,$blocks)),"\n");
        return ($dev,$ino,$modes,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,$blksize,$blocks);
}
sub e_getdir {
        # return as many text filenames as you like, followed by the retval.
        print((scalar keys %files)."\n");
        return (keys %files),0;
}

sub e_open {
        # VFS sanity check; it keeps all the necessary state, not much to do here.
        my ($file) = filename_fixup(shift);
        print("open called\n");
        return -ENOENT() unless exists($files{$file});
        return -EISDIR() if $files{$file}{type} & 0040;
        print("open ok\n");
        return 0;
}

sub e_read {
        # return an error numeric, or binary/text string.  (note: 0 means EOF, "0" will
        # give a byte (ascii "0") to the reading program)
        my ($file) = filename_fixup(shift);
        my ($buflen,$off) = @_;
        return -ENOENT() unless exists($files{$file});
        if(!exists($files{$file}{cont})) {
                return -EINVAL() if $off > 0;
                my $context = fuse_get_context();
                return sprintf("pid=0x%08x uid=0x%08x gid=0x%08x\n",@$context{'pid','uid','gid'});
        }
        return -EINVAL() if $off > $totalfilesize; #length($files{$file}{cont});
        return 0 if $off == $totalfilesize; #length($files{$file}{cont});
        # Find the part that contains $off, and the offset inside that part.
        # >= matters: an offset exactly on a boundary belongs to the next part.
        my ($o, $i) = ($off, 0);
        while ($i < @files && $o >= $filesize{$files[$i]}) {
            $o -= $filesize{$files[$i]};
            $i++;
        }
        my $read = 0;
        my $offset = $o; # offset inside the current part, not the global offset
        my $ret = "";
        # Read up to min($buflen,$totalfilesize-$off) bytes
        while ($read < $buflen && $read+$off < $totalfilesize && defined $files[$i]) {
            open(my $in, '<', $files[$i]) or return -EINVAL();
            binmode $in;
            seek $in, $offset, 0;
            my $r = read $in, my $rtn, $buflen - $read;
            close $in;
            return -EINVAL() unless defined $r;
            $read += $r;
            $i++;
            $offset = 0; # one iteration per file, so next iteration reads from the start
            $ret .= $rtn;
        }
        return $ret;
}

sub e_statfs { return 255, 1, 1, 1, 1, 2 }

Fuse::main(
    "mountpoint" => "$ENV{HOME}/temp",
    "getattr"=>"main::e_getattr",
    "getdir" =>"main::e_getdir",
    "open"   =>"main::e_open",
    "statfs" =>"main::e_statfs",
    "read"   =>"main::e_read",
    );
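
The heart of e_read is translating an offset in the virtual file into a part and an offset inside that part, then reading across part boundaries. As a cross-check on that arithmetic, here is the same idea as a small standalone sketch in Python; it is not part of the Perl script, and the names locate and read_joined are hypothetical.

```python
import os

def locate(sizes, offset):
    """Map a global offset to (part index, offset inside that part).

    `sizes` lists the part sizes in order. An offset that lands exactly
    on a boundary belongs to the start of the next part.
    """
    for index, size in enumerate(sizes):
        if offset < size:
            return index, offset
        offset -= size
    raise ValueError("offset is at or beyond the end of the joined file")

def read_joined(paths, offset, length):
    """Read up to `length` bytes at `offset` of the virtual concatenation."""
    sizes = [os.path.getsize(p) for p in paths]
    if offset >= sum(sizes):
        return b""  # at or past EOF
    index, local = locate(sizes, offset)
    chunks = []
    while length > 0 and index < len(paths):
        with open(paths[index], "rb") as part:
            part.seek(local)
            data = part.read(length)
        chunks.append(data)
        length -= len(data)
        index += 1
        local = 0  # every later part is read from its beginning
    return b"".join(chunks)
```

A simple sanity test for the mounted filesystem along the same lines is to compare a checksum of ~/temp/catenated with a checksum of the parts piped through cat.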
 
Old 08-23-2009, 03:12 PM   #3
TimothyEBaldwin
Member
 
Registered: Mar 2009
Posts: 249

Rep: Reputation: 27
If the joins are reasonably aligned (512 bytes?), use losetup to map the files to loop devices, then use dmsetup to join the block devices.
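
To make the dmsetup step concrete: a device-mapper "linear" table has one line per segment, of the form "start length linear device offset", with all numbers in 512-byte sectors. The sketch below, in Python for illustration, builds such a table from the part sizes and rejects parts that are not sector-aligned. The helper name linear_table and the loop device names in the usage note are assumptions, not output from a real system.

```python
SECTOR = 512  # device-mapper tables count in 512-byte sectors

def linear_table(parts):
    """Build a device-mapper 'linear' table as a string.

    `parts` is an ordered list of (device, size_in_bytes) pairs, for
    example the loop devices that losetup attached to each part.
    """
    lines = []
    start = 0
    for device, size in parts:
        if size % SECTOR:
            raise ValueError(
                f"{device}: size {size} is not a multiple of {SECTOR}")
        sectors = size // SECTOR
        # start sector, length, target type, backing device, offset into it
        lines.append(f"{start} {sectors} linear {device} 0")
        start += sectors
    return "\n".join(lines)
```

For two parts of 1 MiB and 2 MiB attached as /dev/loop0 and /dev/loop1, this yields the lines "0 2048 linear /dev/loop0 0" and "2048 4096 linear /dev/loop1 0". The loop devices would come from something like "losetup -r -f --show /tmp/one" for each part, and the table can be fed on standard input to "dmsetup create joined" (the name is arbitrary), after which the joined device should show up under /dev/mapper and can be mounted read-only.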
 
  


All times are GMT -5. The time now is 08:06 AM.
