Linux - Newbie
This Linux forum is for members that are new to Linux. Just starting out and have a question? If it is not in the man pages or the how-tos, this is the place!
I have a site that I log in to in order to check for updates. It does not have RSS because users need to authenticate before getting access to the page.
Is there a way to write a script that can log in to the page, check whether the HTML has changed, and then send me an email?
You could maybe script wget to do this. If wget asks for a login and password for the website, you could automate that using expect. Another way is to use the urllib or urllib2 modules in Python. Depending on how you have to log in to the page, these should do what you are looking for.
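For the urllib route, a minimal sketch might look like the following. Note the URLs and form field names here are placeholders, not the real site's; check the login form's HTML source for the actual action URL and input names (this uses Python 3's urllib.request, the successor to urllib/urllib2):

```python
import http.cookiejar
import urllib.parse
import urllib.request

def fetch_after_login(login_url, page_url, fields):
    """POST a login form, then fetch a protected page reusing the
    session cookie that the login response set."""
    # Cookie jar + opener so the session cookie survives between requests
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    # urlencode turns {"username": "me", "password": "pw"}
    # into "username=me&password=pw"
    data = urllib.parse.urlencode(fields).encode("ascii")
    opener.open(login_url, data=data)   # 1st request: POST the login form
    return opener.open(page_url).read()  # 2nd request: GET the protected page
```

You would call it with something like `fetch_after_login("https://example.com/login.php", "https://example.com/protected.php", {"username": "xxxxx", "password": "xxxxx"})`.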
I cannot get wget to successfully log in to the page. I try to submit the details by POST to a login form, but it doesn't pick up the correct page afterwards, just the standard page shown when not logged in.
I picked up the form name from the source of this page; not sure if it's the correct one or not: https://www.inthemoneystocks.com/pro...watch_list.php
Here is my current cURL script, but it doesn't seem to download the file:
Code:
#!/usr/bin/php
<?php
// INIT CURL
$ch = curl_init();
// SET URL FOR THE POST FORM LOGIN
curl_setopt($ch, CURLOPT_URL,
'https://www.inthemoneystocks.com/login.php');
// ENABLE HTTP POST
curl_setopt($ch, CURLOPT_POST, 1);
// SET POST PARAMETERS: FORM VALUES FOR EACH FIELD
curl_setopt($ch, CURLOPT_POSTFIELDS,
'username=xxxxx&currentpassword=xxxxx');
// IMITATE CLASSIC BROWSER BEHAVIOUR: STORE COOKIES SET BY THE
// LOGIN RESPONSE AND SEND THEM BACK ON LATER REQUESTS
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
// CURLOPT_RETURNTRANSFER makes curl_exec() return the response
// body as a string instead of printing it and returning true/false.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// EXECUTE 1st REQUEST (FORM LOGIN)
$store = curl_exec($ch);
// SET FILE TO DOWNLOAD, AND SWITCH BACK TO GET SO THE LOGIN
// FIELDS ARE NOT POSTED AGAIN ON THE SECOND REQUEST
curl_setopt($ch, CURLOPT_HTTPGET, 1);
curl_setopt($ch, CURLOPT_URL,
'https://www.inthemoneystocks.com/pro_trader_watch_list_prem.php');
// EXECUTE 2nd REQUEST (FILE DOWNLOAD)
$content = curl_exec($ch);
// CLOSE CURL
curl_close($ch);
?>
I guess I need to write $content to a file? How do I do that?
I've got it into a file, and I am now using file_get_contents to check the new file every 10 minutes; if it changes, I email it to myself.
Unfortunately, curl or something is changing the file size by a few bytes even when nothing has changed.
Any ideas?
Is there a date, or perhaps a hit counter, on the page? curl should fetch the contents unmodified, so it's probably the page itself that is changing. I would do a diff on two versions of the page and see what changed. Then you can remove those sections from your checks. Just a thought.
Well, what I meant was do a diff manually just to see what's different. If what's different is minor, then you can program your php script to ignore those minor types of changes. Like, if the line that's different looks like "DATE: Thu Feb 27 2010", then you can tell the script to ignore that line when determining if the file changed. But, other than that...
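That filtering step might be sketched like this; the `DATE:` pattern is only an example of a volatile line, so build the real list from whatever diff shows on your page:

```python
import re

# Lines matching any of these patterns are treated as "volatile" and
# ignored when deciding whether the page really changed. Add patterns
# here for whatever diff showed to be varying between fetches.
VOLATILE = [re.compile(r"^DATE:")]

def normalize(html):
    """Drop volatile lines so two fetches of an unchanged page compare equal."""
    return "\n".join(
        line for line in html.splitlines()
        if not any(p.match(line) for p in VOLATILE)
    )

def page_changed(old_html, new_html):
    return normalize(old_html) != normalize(new_html)
```

With this, two copies of the page that differ only on the `DATE:` line compare as unchanged, and the email only goes out for a real content change.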