LinuxQuestions.org
05-07-2010, 02:07 PM   #1
prochobo
LQ Newbie
 
Registered: May 2010
Posts: 2

Rep: Reputation: 0
Indexing multiple drives with Perl


I have 4 separate drives, all un-RAIDed. Each drive has almost the same root directories, but different contents in each folder.

For example:
/mnt/sda/TV Shows/Lost/
/mnt/sdb/TV Shows/Your_Favorite_Show/

The goal is to have a single folder that has symlinks to all the files in each of the drives. Pretty much a poor man's JBOD.

Previously, I had problems with cases like two drives having the same subfolder contents, but the script I'm using now solves that.

What I'm looking for now is speed. I'm very new to Perl and the script takes about 12 minutes to complete with the current drives.

Basically, the script makes a list of all directories and files on each drive. First, it makes the directories. I didn't use any validation because mkdir simply does nothing if a directory already exists. For the files, however, I used a hash to keep only the unique ones. Then I use the key/value pairs with ln to create a link for every file (files only, not directories).

Here's my very simple code to get that done. Any ideas on how to speed it up, or any insight into a different approach?

Code:
#!/usr/bin/perl

use strict;
use warnings;

my @drives_to_sync = qw( /mnt/sda/ /mnt/sdb/ /mnt/sdc/ /mnt/sdd/ );
my %create_folders;	# link-tree directory => source directory
my %file_links;		# link path => source file (hash keys keep links unique)

foreach my $drive (@drives_to_sync) {
	my @folder_to_create = `find $drive -type d`;
	my @files_to_create = `find $drive -type f`;

	foreach my $src (@folder_to_create) {
		chomp($src);
		my $dest = link_location($src);
		$create_folders{$dest} = $src;
		mkdir $dest;	# fails harmlessly if the directory already exists
	}

	foreach my $src (@files_to_create) {
		chomp($src);
		$file_links{ link_location($src) } = $src;
	}
}

# Map a path on a drive to its place in the link tree, e.g.
# /mnt/sda/TV Shows/Lost -> /mnt/test/TV Shows/Lost
sub link_location {
	my ($path) = @_;
	$path =~ s/^\/mnt\/sd./\/mnt\/test/;
	return $path;
}

foreach my $key (keys %file_links) {
	my $value = $file_links{$key};
	#print "$key => $value\n";
	`ln -s "$value" "$key"`;
}
 
05-07-2010, 02:29 PM   #2
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by prochobo
Code:
foreach $key (keys %file_links) {
	$value = $file_links{$key};
	#print "$key => $value\n";
	`ln -s "$value" "$key"`;
}
The item in red (the backtick `ln -s "$value" "$key"` line in the final loop) is bad. Backticks connect you to the child process's STDOUT, which you then discard for no good reason. On top of that, starting a child process for every file is time-consuming.

To replace the item in red, read

perldoc -f symlink
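A minimal sketch of what that replacement could look like, assuming the same %file_links layout as the original script (link path => source file). The built-in symlink takes the existing file first and the new link name second, and returns 1 on success and 0 on failure. The scratch directory and file names below are made up for illustration so the example runs on its own; they are not from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-in for the real drives: one source file in a scratch directory.
my $dir    = tempdir( CLEANUP => 1 );
my $source = "$dir/source.txt";
open( my $fh, '>', $source ) or die "Cannot create $source: $!";
print {$fh} "data\n";
close($fh);

# Same shape as %file_links in the original script: link path => source file.
my %file_links = ( "$dir/link.txt" => $source );

foreach my $key ( keys %file_links ) {
	my $value = $file_links{$key};
	# Built-in symlink(OLDFILE, NEWFILE): no child process and no shell quoting.
	symlink( $value, $key ) or warn "Cannot link $key -> $value: $!\n";
}
```

Because no shell is involved, file names containing quotes or dollar signs no longer need escaping, and a failed link reports its reason through $!.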
 
05-07-2010, 03:07 PM   #3
prochobo
LQ Newbie
 
Registered: May 2010
Posts: 2

Original Poster
Rep: Reputation: 0
Thanks! I didn't know there was a symlink function. It now takes just over a minute.
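For anyone looking for a further speed-up: the script still starts find twice per drive, once with -type d and once with -type f. The following is only a sketch of a single-pass alternative using the core File::Find module together with symlink. The mirror_drive name, the command-line layout, and the "first drive wins" rule for duplicate paths are assumptions for illustration, not part of the original script, and it has not been timed against the real drives.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Mirror one drive under $root in a single traversal: directories are
# created, regular files are symlinked, anything already present is kept.
sub mirror_drive {
	my ( $drive, $root ) = @_;
	$drive =~ s{/+\z}{};    # "/mnt/sda/" and "/mnt/sda" behave the same

	find(
		{
			no_chdir => 1,    # $_ holds the full path instead of the basename
			wanted   => sub {
				my $src = $_;
				return if -l $src;    # like find -type d / -type f, skip symlinks
				( my $dest = $src ) =~ s/^\Q$drive\E/$root/;
				if ( -d $src ) {
					mkdir $dest unless -d $dest;
				}
				elsif ( -f $src ) {
					# First drive to provide a given path wins.
					symlink( $src, $dest ) unless -e $dest || -l $dest;
				}
			},
		},
		$drive
	);
	return;
}

# Hypothetical usage: perl mirror.pl /mnt/test /mnt/sda /mnt/sdb /mnt/sdc /mnt/sdd
my ( $link_root, @drives ) = @ARGV;
foreach my $drive (@drives) {
	mirror_drive( $drive, $link_root );
}
```

File::Find visits a directory before its contents, so each parent directory exists in the link tree by the time its files are linked.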
 
  


All times are GMT -5.
