AIX: This forum is for the discussion of IBM AIX.
eserver and other IBM related questions are also on topic.
I would be very thankful if you could help me out.
Here is the scenario. I have an ftp script that runs every day and fetches the sysdate-1 files (yesterday's files) from the ftp server. Once the files for each day have been processed into the data warehouse, we archive them.
Now I need a script that looks at the archive folder and finds the date of the latest processed file (say it finds 11/27/2007 in the archive folder), then pulls every file dated after that, up to and including yesterday, from the ftp server.
That is my requirement. Do you have a sample script that does this check?
Could you please send it to me if you have one?
I would suggest using perl for this, and in particular the nice Date::Calc library. Here is a simple script, not as compact as a typical perl hacker might like, but spelled out so that you can see what's going on. The Date_to_Days function in Date::Calc converts a date to a day number counted from Jan 1, 1 A.D., which makes the decision about which files to fetch easy. You would also need the ncftp package, which contains the useful commands ncftpls and ncftpget.
Code:
#!/usr/bin/perl -w
#
# Usage: getRecent username password localDir
use Date::Calc qw( Decode_Date_EU Today This_Year Date_to_Days Decode_Month);
my ( $user, $pass, $localDir, $remoteFTPsite );
my ( $latestLocalDate, $latestDateAsNum, $todayAsNum );
my ( $day, $month, $year, $fileName, $fileDateAsNum, $wantedFiles, $nWanted );
my @field;
# grab some vital parameters from the command line:
$user = shift || die "must specify user password and local dir\n";
$pass = shift || die "must specify password and local dir\n";
$localDir = shift || die "must specify local dir\n";
# where the remote stuff can be found:
$remoteFTPsite = "ftp://remote-data-store.com/reports/";
# get the date of the most recent file in $localDir. The date will
# be expressed as DD/MM/YYYY, see 'man ls' and 'man date'. We grab
# the second line (after 'total' stuff) and its sixth field:
$latestLocalDate = `/bin/ls -lt --time-style=+'%d/%m/%Y' $localDir | gawk 'NR==2{print \$6}'`;
chomp $latestLocalDate;
$latestDateAsNum = Date_to_Days(Decode_Date_EU($latestLocalDate));
# get today as a number. The remote listing does not include the year,
# so we assume the current year (check this for your application):
$todayAsNum = Date_to_Days(Today());
$year = This_Year();
# fetch a listing of the remote files. Lines usually look like this:
# -rw-r--r-- 1 fred users 9327 Oct 17 14:45 CustomerReport0977.txt
# field: 0 1 2 3 4 5 6 7 8
# Note that's a minus-el not a minus-one in the command below:
open(REMOTE_LS, "/usr/bin/ncftpls -u $user -p $pass -E -l $remoteFTPsite |")
    or die "cannot run ncftpls: $!\n";
$wantedFiles = "";
$nWanted = 0;
while(<REMOTE_LS>) {
    chomp;                       # clean up the line
    @field = split;              # split line into fields
    next unless @field >= 9;     # skip "total" and other short lines
    $month = Decode_Month($field[5]);
    next unless $month;          # skip lines without a recognisable month
    $day = $field[6];
    $fileName = $field[8];
    $fileDateAsNum = Date_to_Days($year, $month, $day);
    print "$day/$month/$year $fileName $fileDateAsNum\n";
    if($fileDateAsNum > $latestDateAsNum && $fileDateAsNum < $todayAsNum) {
        # ok, we want this file:
        $wantedFiles = "$wantedFiles $fileName";
        $nWanted++;
        printf("... %02d/%02d/%d %s\n", $day, $month, $year, $fileName);
    }
}
close(REMOTE_LS);
# Once you are happy, change print to system in the line below:
if($nWanted > 0) {
    print "ncftpget -u $user -p $pass -E $remoteFTPsite $localDir $wantedFiles\n";
} else {
    print "no files to fetch\n";
}
Also keep in mind that you don't need to stay all perl... I often use bash scripts as wrappers. Let me share some of what I do as an example case:
Code:
14:59 ~/Catalog$ cat catalog.sh
#!/bin/bash
umask 007
#
# catalog.sh -- a script to call legacy catalog compile scripts.
#
EX_IOERR=74   # "input/output error" exit code from sysexits.h
echo "$0: catalog compile and update script"
echo ""
echo "Running compile_catalog.pl to build html files"
# test the command itself: [ ! $? ] is never true, because $? is never empty
if ! ./compile_catalog.pl; then
    echo "catalog compile failed; edit source file or catalog.pl"
    exit $EX_IOERR
fi
echo " ... done."
echo ""
echo "Running lftp to update website catalog using commands from ftp_script.scp"
if ! lftp -f ftp_script.scp; then
    echo "lftp failed"
    exit $EX_IOERR
else
    echo " ... done."
    echo ""
    echo "website catalog update is complete"
fi
14:59 ~/Catalog$ cat ftp_script.scp
debug 0
set cmd:parallel 20
set dns:cache-enable
set net:connection-limit 20
open <host>
user <username pass>
CD catalog
LCD /home/www/catalog
MPUT *.html
CD some_dir
LCD /home/www/catalog/some_dir
MPUT *.html
CD ../some_dir2
LCD /home/www/catalog/some_dir2
MPUT *.html
CLOSE
exit
lftp is part of the standard install on ubuntu and allows parallel connections, which really speeds up bulk uploads.
Additionally, you can combine the find command with the above .sh and .pl scripts to really get it going.
Here's an example of where I use find in an .sh script to delete intermediately compiled html as well as source file backups. In this example, the backups were named file_name.<numeric_date_code>
Obviously, you can test the find commands with cp rather than rm... I always make about 3 tests and a backup before rm! See how I set the NUM variable? That's a real good test. Find has real useful time testing, absolute and relative -- and can be compared to a given file's date. Find allows for some amazing single line entries in crontabs. I did a cron's find once that must have been 200 characters long. :P
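The find example itself is not reproduced in this copy of the thread, so the following is only a minimal sketch of the "test first, delete later" idea, not the original script. The function name clean_backups, the name pattern and the age cutoff are all assumptions made for illustration; only the backup naming (file_name.<numeric_date_code>) comes from the post.

```bash
#!/bin/bash
# Hypothetical sketch: count what a find expression matches, and delete
# only when explicitly asked to. Assumes backups are named like
# source_file.txt.20071127 (a dot followed by a numeric date code).
clean_backups() {
    local dir=$1 days=$2 mode=${3:-dry-run}
    local num
    # Dry run: count the files older than $days days that match the pattern.
    num=$(find "$dir" -type f -name '*.[0-9]*' -mtime +"$days" | wc -l)
    num=$((num))   # strip the padding some wc implementations add
    echo "$num file(s) match"
    # Delete only on request, and only if something matched.
    if [ "$mode" = "delete" ] && [ "$num" -gt 0 ]; then
        find "$dir" -type f -name '*.[0-9]*' -mtime +"$days" -exec rm -f {} \;
    fi
}
```

Calling clean_backups /some/dir 30 only reports the count; adding delete as a third argument removes the matches. To compare against a given file's date instead of an age in days, find's -newer test could replace -mtime.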
BTW, I used a perl script to read source_file.txt and generate the HTML pages, which the scripts above copy to a local "current" folder and then ftp up to the site. All of this is currently being replaced with PHP + MySQL.
Anyhow, the point of my post is that I tend to combine methods to form the shortest solution to code. Sometimes a shell command is the best (.sh, ftp, and find) or sometimes perl (to handle more complex text parsing).
Also, it looks like you are finding the file date from the time stamp, right? Shouldn't it come from the file name, if the name itself contains the date, like abc_09282007.txt?
Actually I am on Solaris. Does this work there, or do you know what I would need to modify?
Thanks a lot for your help.
Really appreciate it.
Last edited by suneeladdala; 12-03-2007 at 11:00 PM.
Reason: Updated version
What are -E and -l after $pass in the above line, and what does the pipe symbol | before the closing quote indicate?
The -E tells ncftpls to use an active connection -- see http://www.slacksite.com/other/ftp.html for an explanation. This works better for some data providers, but you may not need it. The -l instructs ncftpls to request a detailed listing, like "ls -l"; without it you just get a list of files but without date information. The | at the end of the command string tells perl's open to run the command and make its output readable through the REMOTE_LS filehandle.
@field = split;
Actually, split here is a function; one of the things that people find hard to get used to about perl is that if you don't supply an argument, it uses a default argument, the variable $_, which in this case holds the current line. The while(<REMOTE_LS>) loop sets $_ to the contents of each line in turn. Then split chops it into separate fields, using the default assumption that fields are delimited by white space (any number of spaces and/or tabs). The @field is an array; after the split, the fields on the line can be found in $field[0], $field[1], $field[2] etc. See the comment in the perl script just above that point, which gives an example line and shows how it would be chopped into separate fields.
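If you want to try the same whitespace splitting from a bash wrapper, a rough equivalent is read -a, which fills an array from one line. This is only a sketch for experimenting: it assumes bash (not plain sh) and reuses the example listing line from the comment in the perl script.

```bash
#!/bin/bash
# Split one listing line on runs of spaces/tabs, much as perl's bare
# split does, and pick out the same fields the perl script uses.
line='-rw-r--r-- 1 fred users 9327 Oct 17 14:45 CustomerReport0977.txt'
read -r -a field <<< "$line"
echo "month=${field[5]} day=${field[6]} name=${field[8]}"
```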
What does "... %02d/%02d/%d %s\n" mean here?
This is a formatting specification. Each % marks the start of an instruction about how to present the next argument in the sequence of arguments given to printf. Here %02d prints an integer padded with zeros to at least two digits, %d prints a plain integer, %s prints a string, and \n ends the line. The details can be found by looking at one of several man pages; try the command man printf.
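The shell's own printf understands the same conversions, so it is a quick way to see what a format does before putting it in a script. A small sketch, where the function name format_line is made up for illustration:

```bash
#!/bin/bash
# Print day, month, year and a file name with the format from the perl
# script. Note: bash reads arguments with a leading zero (such as 08)
# as octal, so pass plain numbers like 8.
format_line() {
    printf '... %02d/%02d/%d %s\n' "$1" "$2" "$3" "$4"
}
format_line 7 3 2007 CustomerReport0977.txt
# prints: ... 07/03/2007 CustomerReport0977.txt
```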
If you don't know much about this sort of thing, it's worth trying to learn. Try man perlintro, or do what everyone else does: get a book, preferably choosing one whose style and price you like, start copying the examples and then modifying them. There is nothing like making a lot of mistakes to really help you learn something new, so start trying to enjoy making strange mistakes to see what happens!
perl has been around a long time, but there are enormous numbers of libraries freely available for it, and it is pretty efficient. For example, there are libraries that you can use to generate or read Excel spreadsheets without ever using a M*cr*s*ft product.
Of course, don't learn just perl. It's a good place to start learning some things that you will find reappearing in lots of other computer languages.
Hi Ross,
It's pretty clear except for one question, which I guess you forgot to answer.
It looks like you are finding the file date from the time stamp, right? Shouldn't it come from the file name, if the name itself contains the date, like abc_09282007.txt?
Does this process still get the correct files?
Thanks a lot for your suggestions. I will start learning more perl.
It looks like you are finding the file date from the time stamp, right? Shouldn't it come from the file name, if the name itself contains the date, like abc_09282007.txt?
The perl script does use the time stamp rather than the name. Extracting a date from a filename depends, of course, on the particular format of the name. Here is a hint:
Code:
@parts = split(/([-.])/, "file-23-07-2007.txt");
This splits up the name wherever a dash or dot appears and puts the chunks into the array, including the separators (so $parts[0] is "file", $parts[1] is "-" etc). Then $parts[2], $parts[4] and $parts[6] are the date ingredients.
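For the abc_09282007.txt style of name mentioned in the question, the same idea can be sketched in bash without Date::Calc: turn the MMDDYYYY part into a YYYYMMDD key, which compares correctly as a plain number. This is a sketch under assumptions, not a tested replacement for the perl script. The function names are made up, and it assumes every data file ends in _MMDDYYYY before its extension. Because it never looks at time stamps, it also avoids the --time-style option, which is specific to GNU ls.

```bash
#!/bin/bash
# Assumed name layout: prefix, underscore, MMDDYYYY, extension.

# Print the date in a file name as YYYYMMDD; fail if the name has none.
name_to_key() {
    local base=${1##*/}       # drop any directory part
    base=${base%.*}           # drop the extension
    local stamp=${base##*_}   # keep what follows the last underscore
    [[ $stamp =~ ^[0-9]{8}$ ]] || return 1
    echo "${stamp:4:4}${stamp:0:2}${stamp:2:2}"
}

# Print the newest key among the files in an archive directory (0 if none).
latest_key() {
    local f key newest=0
    for f in "$1"/*; do
        key=$(name_to_key "$f") || continue
        [ "$key" -gt "$newest" ] && newest=$key
    done
    echo "$newest"
}

# Read file names on stdin; print those dated after $1 and before $2.
select_wanted() {
    local after=$1 before=$2 name key
    while read -r name; do
        key=$(name_to_key "$name") || continue
        if [ "$key" -gt "$after" ] && [ "$key" -lt "$before" ]; then
            echo "$name"
        fi
    done
}
```

A wrapper could then pipe a plain remote listing (file names only) into select_wanted "$(latest_key "$archiveDir")" "$(date +%Y%m%d)", where archiveDir is a hypothetical variable holding the archive folder, and hand the resulting names to ncftpget.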