LinuxQuestions.org

-   -   Need help with a short Linux command line script - working with split files (http://www.linuxquestions.org/questions/linux-newbie-8/need-help-with-a-short-linux-command-line-script-working-with-split-files-488868/)

bpmee 10-02-2006 08:42 PM

Need help with a short Linux command line script - working with split files
 
Hi All,

First off, thanks :) in advance to anyone who can help me on this one.

The following is what I want to script. I know bits and pieces of it already, but I am missing the more complicated sections that work with variables, etc. Here goes:

Code:

#!/bin/sh
## First, I want to split the main file into smaller files of 850 lines each

cd /splitfiles

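split -l 850 --additional-suffix=.txt /sourcefolder/mainfile.txt partx

## Result: partxaa.txt , partxab.txt etc.
## (--additional-suffix is a GNU split option; without it the pieces have no .txt extension)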

## Next, I want to count how many pieces were made

ls *.txt | wc -w

## Result: x number of text files
## ie. two text files would be output: 2

** Now I need help!!!

## Next, I want to create x number of folders in a directory in the amount
## of the number given from amount of text files (following our example from above
## this would be 2). I have a list of folder names in file dirnames.txt, each on its
## own line. From our example I want to create 2 folders with 2 random name selections from
## the folder name list, for example names: foo foo2

cat dirnames.txt ...some command - pull 2 random names from the list

... some command sequence as described above then:

cd /home/var/www/mysite
mkdir foo *randomfoldername1* foo2 *randomfoldername2*

## once the folders have been created, I want to copy files to each folder
## The files for each folder are the same, EXCEPT for our split text files from the first step

[root@ mysite]#
cp /path/to/filesandfolderstocopy/*.* mysite/foo *randomfoldername1* then
cp /path/to/filesandfolderstocopy/*.* mysite/foo2 *randomfoldername2*

## If, possible, could the two lines above be condensed to something with a variable? $1?

## Now grab first split file from our first step, and put it into the first folder created:

... some command to identify the first file in the split files folder

mv /splitfiles/partxaa.txt /home/var/www/mysite/foo/partxaa.txt

## Now continue in the same fashion for the next file or files in the "splitfiles" folder:

mv /splitfilespartx..etc /home/var/www/mysite/foo2/partxab.txt ..etc

## continue until all split files are moved to their folders. There will be enough folders present
## because we counted the number of splits in the second step.

## Finally, chmod all folders 755 - using a helpful code snippet I found:

find /home/var/www/mysite -maxdepth 1 -type d ...collect the directory names ie. foo , foo2
chmod 755 foo , foo2 , others

## DONE!


bulliver 10-02-2006 11:05 PM

This seems a rather arbitrary thing to do. Can you explain what you are trying to accomplish?

Anyway, plain sh has no built-in way to pick a random number (bash does have $RANDOM), so try something like:
Code:

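# Number of candidate names (one per line)
RAND_UPPER=$(wc -l < dirnames.txt)
# rand(n) returns 0..n-1, so add 1 to get a line number in 1..n
RAND_NUMBER=$(ruby -e "puts(rand(${RAND_UPPER}) + 1)")
# Print only that line of the file
RAND_filename=$(sed -n "${RAND_NUMBER}p" dirnames.txt)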

Here, we set a lower bound of 1 and an upper bound equal to the number of names in your file.
We get a random number from Ruby, and grab a line from your file using said random number.

You will of course need to loop over this to grab as many names from your file as you have 'chunks'.
Also, this makes no provisions to make sure it doesn't grab the same name twice. You will have to keep track of your first and subsequent 'random' line numbers and try for another if you pull a duplicate.
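
One way to avoid both the loop and the duplicate bookkeeping is to shuffle the whole list once and take the first N lines. The following is only a minimal sketch, assuming a POSIX awk plus sort, cut and head, and assuming each name appears once in dirnames.txt. The function name pick_random_names is hypothetical, not something from the thread.

```shell
#!/bin/sh
# pick_random_names FILE COUNT   (hypothetical helper name)
# Prints COUNT distinct lines of FILE in random order.
# Each line gets a random numeric key, the lines are sorted by that key,
# and the key is cut off again, so no line can be picked twice.
# Note: awk's srand() seeds from the clock, so two runs within the same
# second may produce the same order.
pick_random_names() {
    awk 'BEGIN { srand() } { printf "%.9f\t%s\n", rand(), $0 }' "$1" |
        LC_ALL=C sort -n |
        cut -f2- |
        head -n "$2"
}
```

Something like `pick_random_names dirnames.txt 2` would then print two different names, one per line.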

As for your 'find' line, have a look at the '-exec' action in the find man page. It runs a command on the files that find matches.
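
A sketch of that idea, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do; they are not in POSIX). The function name chmod_site_dirs is hypothetical.

```shell
#!/bin/sh
# chmod_site_dirs ROOT   (hypothetical helper name)
# Sets mode 755 on every directory directly under ROOT, but not on ROOT
# itself and not on plain files. "{} +" passes many names to one chmod call.
chmod_site_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -exec chmod 755 {} +
}
```

With the path from the first post this would be called as `chmod_site_dirs /home/var/www/mysite`.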

This would really be _so_ much easier if you were using a 'real' language like Ruby, Python, or Perl...

Anyway, hopefully I have given you a few ideas. Work some more on it, and post back if you get stuck...

bpmee 10-03-2006 11:54 PM

Thanks Bulliver, I came across this script in Perl which may help
 
Hi Bulliver, others,

Thanks for your tip about my script. I think Perl might be the way to go on this one, since I believe it can be run from the command line or as a cron job. Admittedly, I really have no clue how to code Perl; I only have a general understanding of it from countless trial-and-error efforts.

I've found a script for my first step, named splitfile.pl, which is invoked as follows:

Step 1:
[root@]# perl splitfile.pl -l 500 wordlist.txt file

Here -l sets the number of lines per output file, followed by the source file and then the output file prefix.

I'd list the URL for the source, but I don't have sufficient privileges yet.

In my next step, I originally wanted to call a command line function that counts the number of files created, then uses that count to pull random new folder names from a text list of possible names.

Bulliver pointed out the error with this step: there is a chance that the same new folder name could be pulled from the list, thus screwing up the whole process.

So, now I will simplify it, but I don't know the perl for doing the rest:

Step 2: Collect all file names created by the split process, for example: fooaaa, fooaab, fooaac but *without* the .txt extension for now - basename I think...

Step 3: Go to a selected web directory and create new folders named according to the filenames collected in step 2; for example /var/www/user1/web/fooaaa , then make /var/www/user1/web/fooaab , /var/www/user1/web/fooaac etc. Make sure to create a folder for each name...

Step 4: After all folders have been created, go to the first folder, fooaaa, and copy files from a predefined source on the server: example copy /home/sourcefilesandfolders/*.* to /var/www/user1/web/fooaaa/ .

Step 5: Among the folders and files copied to these new directories will be a sub-folder named: abc , such that the complete directory path to folder abc would be: /var/www/user1/web/fooaaa/abc . Now return to the directory containing the split files and move file fooaaa to /var/www/user1/web/fooaaa/abc/mylist.txt - making sure to rename it to mylist.txt .

Repeat in the same fashion for each remaining split file, moving it to sub-directory abc with name mylist.txt of the folder that was created with its original "split name"...

In the end, we would have:
/var/www/user1/web/fooaaa/abc/mylist.txt
/var/www/user1/web/fooaab/abc/mylist.txt
etc.

Step 6: Chmod all /var/www/user1/web/foo* folders to 755

Step 7: Open crontab for user1 and begin with the following commands:

Code:

0 0 * * * /usr/bin/php /var/www/user1/web/fooaaa/abc/myphpfunction.php

Step 8: Starting at cron time 0 0 * * *, write a command for myphpfunction.php for each foo* directory created, incrementing the cron execution time by 5 minutes for each successive command. When 12 commands have been written to the crontab, increment the hour value by 1 (since 12 x 5 minutes = one hour has passed), reset the minutes to 0, and continue until commands for *every* folder created have been written...

It is not likely that 24 hours worth of new folder commands will be needed, since this would be 12/hour x 24 hours equaling 288 unique commands, which is essentially 288 new foo* created folders.
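
The schedule in steps 7 and 8 comes down to simple arithmetic: counting entries from 0, entry i runs at minute (i mod 12) x 5 of hour i div 12. Below is a minimal sketch under that reading. It only prints the crontab lines and installs nothing. The function name print_cron_lines is hypothetical; the php path and the abc/myphpfunction.php layout are the ones given above.

```shell
#!/bin/sh
# print_cron_lines WEBROOT   (hypothetical helper name)
# Prints one crontab line per WEBROOT/foo* directory, 5 minutes apart,
# starting at 00:00. The lines are printed only, never installed.
print_cron_lines() {
    i=0
    for dir in "$1"/foo*/; do
        [ -d "$dir" ] || continue      # glob matched nothing
        if [ "$i" -ge 288 ]; then      # 12 per hour x 24 hours
            echo "more than 288 directories; stopping" >&2
            return 1
        fi
        minute=$(( (i % 12) * 5 ))     # 0, 5, ..., 55
        hour=$(( i / 12 ))             # goes up by 1 every 12 entries
        # $dir already ends in "/", so abc/ is appended directly
        printf '%d %d * * * /usr/bin/php %sabc/myphpfunction.php\n' \
            "$minute" "$hour" "$dir"
        i=$(( i + 1 ))
    done
}
```

Running `print_cron_lines /var/www/user1/web` lets the lines be checked by eye before they are pasted into the crontab for user1 with `crontab -e`.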

Thanks again for the help!

bpmee 10-04-2006 03:08 AM

OK...I switched to Perl, and this is what I have so far, but I get an error
 
Hi,

OK, I switched over to Perl and was able to scrounge up some snippets for the following process, which effectively takes care of steps 1-3 in my amended plan (see the post immediately above).

Now, however, I want to copy the contents of the directory named by the following variable: $sourcefiles = "/home/source"; . Inside folder "source" are all the files I wish to copy over to my new folders...

I tried to use "foreach" twice, being naive and not aware of any other way of copying (not moving) files..

I tried File::Copy, but it has no wildcard for copying all files the way you can from the command line...

Code:

#!/usr/bin/perl

use File::Find;
use File::Copy;

$userdir = "/var/www/web12/web/";
$sourcefiles = "/home/source";
$sourcedir = "play";
$ftd1 = "bm";
opendir $sourcedir, ".";
@contents = grep /$ftd1/, readdir $sourcedir;
closedir $sourcedir;
foreach $listitem ( @contents )
{
  chdir $userdir;
  mkdir $listitem, 0755;
         
        ### WORKS UP TO HERE, BELOW IS WHERE I GOOF -
        ### PERHAPS A File::Find AND File::Copy COMBO?

        opendir $sourcefiles, ".";
        @contents2 = grep !/^\.\.?$/, readdir $sourcefiles;
        closedir $sourcefiles;
        foreach $listitem2 ( @contents2 )
        {
                copy($listitem2, $userdir.$listitem."/")
        }
}

Thanks again to all for any help!
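
In the Perl above, the second opendir opens "." (by then the web directory) rather than $sourcefiles, and copy() is handed bare file names instead of full paths, so nothing under /home/source is ever read. File::Copy's copy() also copies one file at a time and does not descend into sub-folders such as abc. For comparison, here is a minimal shell sketch of steps 3 to 6 that leans on cp -R for the recursive copy. The function name deploy_parts and its three arguments are hypothetical, and it assumes the split directory holds nothing but the split files.

```shell
#!/bin/sh
# deploy_parts SPLITDIR SOURCEDIR WEBROOT   (hypothetical helper name)
# For every plain file in SPLITDIR: create WEBROOT/<name>, copy the
# contents of SOURCEDIR into it (sub-folders included), move the split
# file to WEBROOT/<name>/abc/mylist.txt, and set the folder to mode 755.
deploy_parts() {
    for part in "$1"/*; do
        [ -f "$part" ] || continue            # plain files only
        name=$(basename "$part" .txt)         # drops a .txt suffix if present
        dest="$3/$name"
        mkdir -p "$dest" || return 1
        cp -R "$2"/. "$dest"/ || return 1     # "/." copies the contents
        mkdir -p "$dest/abc" || return 1      # in case SOURCEDIR has no abc
        mv "$part" "$dest/abc/mylist.txt" || return 1
        chmod 755 "$dest" || return 1
    done
}
```

With the paths used in this thread, a call might look like `deploy_parts /splitfiles /home/source /var/www/web12/web`.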


All times are GMT -5.