LinuxQuestions.org


wondergirl 02-15-2008 03:57 AM

Perl with LWP and NTLM
 
Hi guys!
Another newbie question...

I'm trying to fetch a CSV file from a link on a website automatically, instead of logging in to that website manually and downloading the file every day.

I could do this with curl :

---------------------------------------------------
curl -k -A "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" -u username:******** --ntlm https://egsite/Reports/20080212/Enti...t_20080212.csv

---------------------------------------------------

This works OK on Linux, but the curl available on my Solaris machine (where I want to do this automatically) does not support NTLM.

So I'm trying to do it with the Perl LWP and NTLM modules, but since I'm a Perl beginner I'm finding it difficult :(

From documentation I got this :

==========================================================
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(
agent=>'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)',
keep_alive=>'1'
);

my $host = 'server:port';
my $user = 'myDomain\\myUserName';
my $pass = 'password';

# note the empty '' is required.
$ua->credentials($host, '', $user, $pass);

my $req = new HTTP::Request GET => "http://winhost/path/to/file";
my $res = $ua->request( $req );

if ( $res->is_success() ) {
# Success !
}
else {
# Failure
}

=============================================================

But there is no mention of NTLM here... Can anyone help me out? I would really appreciate it. Thanks!
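
For what it's worth, NTLM is not named in that snippet because LWP picks the authentication handler from the server's reply: when the server answers 401 with a `WWW-Authenticate: NTLM` header, LWP loads LWP::Authen::Ntlm by itself (which in turn needs Authen::NTLM installed). A minimal sketch under some assumptions: `netloc_for` and `fetch_with_ntlm` are made-up helper names, the host, path and credentials are placeholders taken from the post, and an HTTPS URL additionally needs an SSL-capable LWP installation.

```perl
use strict;
use warnings;

# Hypothetical helper: build the "host:port" string that credentials()
# expects, falling back to the default port of the URL scheme.
sub netloc_for {
    my ($url) = @_;
    my ($scheme, $host, $port) = $url =~ m{^(https?)://([^/:]+)(?::(\d+))?}i
        or return;
    $port ||= lc($scheme) eq 'https' ? 443 : 80;
    return "$host:$port";
}

# Hypothetical helper: fetch a URL, letting LWP handle the NTLM handshake.
sub fetch_with_ntlm {
    my ($url, $user, $pass) = @_;
    require LWP::UserAgent;    # loaded here so netloc_for() works without LWP

    my $ua = LWP::UserAgent->new(
        agent      => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)',
        keep_alive => 1,    # NTLM authenticates the connection, so it must persist
    );

    # The realm is the empty string for NTLM; host:port must match the URL.
    $ua->credentials(netloc_for($url), '', $user, $pass);
    return $ua->get($url);    # an HTTP::Response object
}

# Example call (placeholders, not executed here):
# my $res = fetch_with_ntlm('https://egserver/Reports/Userlist_20080212.csv',
#                           'myDomain\\myUserName', 'password');
# print $res->is_success ? $res->content : $res->status_line, "\n";
```

If the request still comes back as 401, the usual suspects are a host:port string that does not match the URL, or a missing `keep_alive`.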

wondergirl 02-15-2008 05:38 AM

I actually managed to do this, although it's a bit messy:


----------------------------------------------
#!/opt/perl/5.8.0/bin/perl

$|++;
use strict;

use FindBin;
use lib "$FindBin::Bin/lib";

use File::Basename;
use POSIX qw(strftime);
use LWP::UserAgent;
use HTTP::Headers;
use HTTP::Request::Common;
use Authen::NTLM;
use HTML::TableExtract;
use HTML::Form;
use HTML::Template;
use MIME::Entity;

my $Options = {
user => "me",
password => "xxxx",
domain => "\\",
timeout => 30,
protocol => "https",
AuthMethod => "NTLM",
BrowserAgent => "MSIE 6.0; Windows NT 5.0",
RequestMethod => "GET",
DataDir => "/tmp",
};

my $log = "/var/tmp/get_url.log";
my $DataDir = "/tmp";

my $browser = LWP::UserAgent->new(
agent=>'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)',
keep_alive=>'1'
);

my $header = HTTP::Headers->new(
Content_Type => 'text/html',
'WWW-Authenticate' => $Options->{'AuthMethod'}
);

#
# First stage of NTLM authentication
#

my $url = "https://egserver/Reports/Userlist_20080212.csv";

ntlm_domain($Options->{'domain'});
ntlm_user($Options->{'user'});
ntlm_password($Options->{'password'});

my $Authorization = Authen::NTLM::ntlm();

# $header was already built above; reuse it here rather than
# declaring it with "my" a second time.

$header->header('Authorization' => "NTLM $Authorization");
my $request = HTTP::Request->new($Options->{'RequestMethod'} => $url, $header);
my $res = $browser->request( $request );

#
# Second stage of authentication
#
my $Challenge = $res->header('WWW-Authenticate');
$Challenge =~ s/^NTLM //g;

$Authorization = Authen::NTLM::ntlm($Challenge);
$header->header('Authorization' => "NTLM $Authorization");
$request = HTTP::Request->new($Options->{'RequestMethod'} => $url, $header);

$res = $browser->request($request);
#
# NTLM state needs to be reset after the second stage
#
ntlm_reset();

if($res->is_success) {
&dump2file("$Options->{'DataDir'}/test_url", $res->content);
}
else {
&out2logfile($log, "ERROR 1 : Can not dump data from $url\n Returned code: " . $res->code . " (" . $res->status_line . ")\n");
}


sub dump2file {
my ($FileName, $Message) = @_;
my $Flag;

my $LogDir = dirname($FileName);
system("/bin/mkdir -p $LogDir") unless(-d $LogDir);

if(open(FILE, "> $FileName")){
for(my $i=0 ; $i<10 ; $i++){
$Flag = flock(FILE,2);
last if($Flag);
sleep(1);
}

unless($Flag){
return 1;
}

print FILE $Message, "\n";

unless(flock(FILE,8)){
return 1;
}

close(FILE);
}
return 0;
}

sub out2logfile {
my ($FileName, $Message, $PrintError, $Option) = @_;
my $Flag;

my $LogDir = dirname($FileName);
system("/bin/mkdir -p $LogDir") unless(-d $LogDir);

$Option=">>" if(!defined($Option) || $Option ne ">");

print $Message if($PrintError);

if(open(FILE, "$Option $FileName")){
for(my $i=0 ; $i<10 ; $i++){
$Flag = flock(FILE,2);
last if($Flag);
sleep(1);
}

unless($Flag){
return 1;
}

my $time=strftime "%Y-%m-%d %H:%M ", localtime;
print FILE $time, $Message;

unless(flock(FILE,8)){
return 1;
}

close(FILE);
}
return 0;
}

==========================================================
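
One fragile spot in the script above is the challenge extraction: it assumes the `WWW-Authenticate` value is exactly `NTLM <token>`. A server may offer several schemes, and HTTP::Headers joins repeated headers with ", " when `header()` is called in scalar context. A slightly more defensive helper could look like the sketch below; `ntlm_challenge` is a made-up name and the token in the usage note is dummy data.

```perl
use strict;
use warnings;

# Return the base64 token from the first "NTLM <token>" value,
# or undef if no NTLM challenge carrying a token is present.
# Accepts one or more WWW-Authenticate header values.
sub ntlm_challenge {
    my @values = @_;
    for my $value (map { split(/\s*,\s*/, $_) } @values) {
        return $1 if $value =~ /^NTLM\s+([A-Za-z0-9+\/=]+)\s*$/i;
    }
    return;
}

# Usage inside the script above (list context yields one value per header):
# my $Challenge = ntlm_challenge($res->header('WWW-Authenticate'));
# die "Server sent no NTLM challenge\n" unless defined $Challenge;
```

Checking for `undef` before the second stage also gives a clearer error than passing an empty challenge on to `Authen::NTLM::ntlm()`.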

Any comments on this? It could use some improving. :)

Now I need to figure out how to get the latest report automatically based on the date of the report (the latest is always yesterday's report). Right now I'm hardcoding it :

my $url = "https://egserver/Reports/Userlist_20080212.csv";

Any suggestions on how I can make it download the right file automatically by manipulating the _20080212 part? It's a stupid (and lazy) question, I guess; I'm going to look into the books now and see (maybe strftime can be used somehow).

Thanks!
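
On the request for comments: one part of the script that could be tightened is the file writing. In `dump2file`, a failed `open` still falls through to `return 0`, the numeric flock arguments are hard to read, and the extra "\n" changes the downloaded data. A hedged sketch of an alternative using only core modules follows; `write_locked` is a made-up name, and `make_path` assumes File::Path 2.x (bundled with Perl 5.10 and later; older Perls spell it `mkpath`).

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Basename qw(dirname);
use File::Path qw(make_path);

# Write $data to $path under an exclusive lock.
# Returns 1 on success and 0 on any failure.
sub write_locked {
    my ($path, $data) = @_;

    my $dir = dirname($path);
    if (!-d $dir) {
        eval { make_path($dir); 1 } or return 0;
    }

    # Open for append so the file is not truncated before the lock is held.
    open(my $fh, '>>', $path) or return 0;
    binmode($fh);

    # Retry for roughly ten seconds; LOCK_NB makes each attempt return at once.
    my $locked = 0;
    for my $attempt (1 .. 10) {
        if (flock($fh, LOCK_EX | LOCK_NB)) { $locked = 1; last; }
        sleep 1;
    }
    if (!$locked) { close($fh); return 0; }

    truncate($fh, 0)   or do { close($fh); return 0 };
    print {$fh} $data  or do { close($fh); return 0 };
    return close($fh) ? 1 : 0;    # close flushes and releases the lock
}

# Usage in the script above (note the inverted return convention):
# write_locked("$Options->{'DataDir'}/test_url", $res->content)
#     or out2logfile($log, "ERROR: could not write output file\n");
```

Note that this returns true on success, the opposite of the 0-on-success convention in the original subs, so callers would need adjusting.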

chrism01 02-17-2008 07:25 PM

Well, here's a way I get yesterday:

Code:


use Date::Calc qw(Add_Delta_Days);   # CPAN module that provides Add_Delta_Days

$today = get_todays_date();
($curr_yr, $curr_mth, $curr_day) = (substr($today, 0, 4),
                                    substr($today, 4, 2),
                                    substr($today, 6, 2));
print "today: $curr_yr, $curr_mth, $curr_day\n";
# calc prev date
($prev_yr, $prev_mth, $prev_day) = Add_Delta_Days($curr_yr, $curr_mth,
                                                        $curr_day, -1);
print "prev: $prev_yr, $prev_mth, $prev_day\n";

sub get_todays_date
{
    my (
        $year,  # year ( -1900 )
        $mth,  # month ( -1 )
        $mday,  # day-of-mth
        $today  # today string: yyyymmdd
        );


    # Get today
    # NB: Mth array starts at zero
    ($mday, $mth, $year) = (localtime)[3..5];
    $today = sprintf( "%04d%02d%02d", $year+1900, $mth+1, $mday );
    return $today;
}

You might also want to look at http://search.cpan.org/~petdance/WWW...W/Mechanize.pm "WWW::Mechanize, or Mech for short, helps you automate interaction with a website"
I've found it very useful for a very similar job to yours.
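
Add_Delta_Days comes from Date::Calc, which is a CPAN module. If installing it on the Solaris box is a hassle, a core-modules-only variant is possible with POSIX and Time::Local. This is only a sketch: `yesterday_stamp` and `report_url` are made-up names, and the URL pattern is the one hardcoded in the script earlier in the thread.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timelocal);

# Return yesterday's date as YYYYMMDD, relative to the given epoch time
# (defaults to now). Working from local noon avoids landing on the wrong
# day when a daylight-saving change makes a day 23 or 25 hours long.
sub yesterday_stamp {
    my ($now) = @_;
    $now = time unless defined $now;
    my ($mday, $mon, $year) = (localtime($now))[3, 4, 5];
    my $noon_today = timelocal(0, 0, 12, $mday, $mon, $year + 1900);
    return strftime('%Y%m%d', localtime($noon_today - 24 * 60 * 60));
}

# Build the report URL for a given YYYYMMDD stamp.
sub report_url {
    my ($stamp) = @_;
    return "https://egserver/Reports/Userlist_${stamp}.csv";
}

my $url = report_url(yesterday_stamp());
print "$url\n";
```

The resulting `$url` can replace the hardcoded one in the download script.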

deisecairo 05-22-2010 02:32 PM

Hi all,

I found an easier way. It uses the simple library LWP::Authen::Ntlm, but that needs a small fix first.

To use simple NTLM authentication over HTTP in a script, we need to make sure these three modules are installed on the system:

LWP::Authen::Ntlm
Authen::NTLM
MIME::Base64

Then we need to edit the file LWP/Authen/Ntlm.pm from the LWP::Authen::Ntlm module, wherever it is installed on the system, and change these lines from:

use Authen::NTLM "1.02";
use MIME::Base64 "2.12";

to

use Authen::NTLM;
use MIME::Base64;

and that's it.

Inside the file LWP/Authen/Ntlm.pm anyone can follow the complete sequence of steps for NTLM authentication over HTTP. Anyway, wondergirl showed us a good sequence for making an NTLM request by hand, and it's good reading for the curious.

Regards.
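
Before editing an installed module, it may be worth checking which versions are actually present, since the version requirements on those two lines should only get in the way when the installed modules do not satisfy them. A small sketch for that check; `installed_version` is a made-up helper name, not part of any of these modules.

```perl
use strict;
use warnings;

# Return the installed version of a module, 'unknown' if it loads but
# declares no version, or undef if it cannot be loaded at all.
sub installed_version {
    my ($module) = @_;
    return unless $module =~ /^\w+(?:::\w+)*$/;    # plain module names only
    (my $file = "$module.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

for my $module (qw(LWP::Authen::Ntlm Authen::NTLM MIME::Base64)) {
    my $version = installed_version($module);
    printf "%-20s %s\n", $module, defined $version ? $version : 'not loadable';
}
```

If a module shows up as not loadable or older than required, upgrading it from CPAN is probably safer than patching Ntlm.pm in place, because a later LWP upgrade would overwrite the edit.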

