check if a website file has changed
I have a site that I log in to in order to check for updates. It does not have RSS because users need to authenticate before getting access to the page.
Is there a way to write a script that can log in to the page, check whether the HTML has changed, and then send me an email? |
You could maybe script wget to do this. If wget asks for a login and password for the website, you could automate that using expect. Another way is to use urllib or urllib2 in Python. Depending on how you have to log in to the page, these should do what you are looking for.
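Here is a minimal sketch of the Python route, assuming the site uses a plain HTML form login that sets a session cookie. It uses urllib.request, the Python 3 successor to urllib2. The form field names 'username' and 'password' and the helper names are placeholders of mine, not taken from the real site.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener():
    """Return an opener that remembers cookies between requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def build_login_request(login_url, username, password):
    """Build a POST request carrying the login form fields.

    The field names 'username' and 'password' are assumptions; check the
    <form> in the page source for the real ones.
    """
    form = urllib.parse.urlencode({"username": username, "password": password})
    # Passing data= makes urllib send the request as a POST.
    return urllib.request.Request(login_url, data=form.encode("ascii"))


def fetch_after_login(login_url, page_url, username, password):
    """Log in, then fetch the protected page with the same cookie jar."""
    opener = make_opener()
    opener.open(build_login_request(login_url, username, password), timeout=30).close()
    with opener.open(page_url, timeout=30) as response:
        return response.read()
```

You would call fetch_after_login() with the login URL, the URL of the protected page, and your credentials, and get the page back as bytes. If the login form has hidden fields or a token, those would have to be added to the dictionary as well.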
|
Quote:
I picked up the form name from the source on this page, not sure if it's the correct one or not: https://www.inthemoneystocks.com/pro...watch_list.php Code:
[root@server ~]# wget --post-data='username=DUMMYUSER&password=DUMMYPASSWORD' --save-cookies=my-cookies.txt --keep-session-cookies https://www.inthemoneystocks.com/swing_trade_month.php |
Any thoughts on the login part?
|
After the initial request, it is redirected to the login page again:
Code:
[root@server ~]# wget --save-cookies cookies.txt --keep-session-cookies --post-data 'username=MYUSERN&currentpassword=MYPASS' http://www.inthemoneystocks.com/login.php |
For that sort of stuff I use Perl + http://search.cpan.org/~petdance/WWW...W/Mechanize.pm
|
Quote:
Could I use PHP as well? I'm more familiar with that than Perl. What library do I need for PHP? |
Here is my current curl script, but it doesn't seem to download the file:
Code:
#! /usr/bin/php |
I've got it into a file, but I am now using file_get_contents to check the new file every 10 minutes, and if it changes I email it to myself.
Unfortunately, curl or something else is changing the file size by a few bytes even when nothing has changed. Any ideas? |
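For reference, the check-and-email step described above could look roughly like this in Python. It compares a hash of the content rather than the file size. This is only a sketch: the state file, the addresses, and the function names are made up for illustration.

```python
import hashlib
from email.message import EmailMessage
from pathlib import Path


def has_changed(content: bytes, state_file: Path) -> bool:
    """Compare the SHA-256 of content with the digest saved last time.

    Saves the new digest, and returns True only when a previous digest
    existed and differs from the new one.
    """
    new_digest = hashlib.sha256(content).hexdigest()
    old_digest = state_file.read_text().strip() if state_file.exists() else None
    state_file.write_text(new_digest)
    return old_digest is not None and old_digest != new_digest


def build_alert(sender: str, recipient: str, page_url: str) -> EmailMessage:
    """Compose the notification mail; sending is left to smtplib."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Page changed: " + page_url
    msg.set_content("The page at " + page_url + " has changed since the last check.")
    return msg
```

A cron entry running the script every 10 minutes would replace the polling loop, and the message could be sent with smtplib.SMTP("localhost").send_message(msg), assuming a mail server is listening locally.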
Is there a date or perhaps a hit counter on the page? Curl should just fetch the contents unmodified, so it's probably the page itself that is changing. I would do a diff on two versions of the page and see what changed. Then you can exclude those sections from your checks. Just a thought.
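A rough sketch of that idea in Python: drop the lines that are known to change on every request, then hash what is left. The two patterns here are hypothetical examples; the real ones would come from diffing two downloads of the page.

```python
import hashlib
import re

# Hypothetical patterns for lines that change on every request.
VOLATILE_PATTERNS = [
    re.compile(r"^\s*DATE:"),
    re.compile(r"hits:\s*\d+", re.IGNORECASE),
]


def stable_digest(html: str, patterns=VOLATILE_PATTERNS) -> str:
    """Hash the page after dropping lines that match any volatile pattern."""
    kept = [
        line for line in html.splitlines()
        if not any(p.search(line) for p in patterns)
    ]
    return hashlib.sha256("\n".join(kept).encode("utf-8")).hexdigest()
```

Two downloads that differ only in a date line or a counter line then produce the same digest, so the comparison only fires on real content changes.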
|
Well, what I meant was to do a diff manually, just to see what's different. If the differences are minor, you can program your PHP script to ignore them. For example, if the line that differs looks like "DATE: Thu Feb 27 2010", you can tell the script to ignore that line when determining whether the file changed. But, other than that...
Code:
exec("diff <file1> <file2> > /tmp/difftmp.txt"); |
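If shelling out to diff is not convenient, the same manual check can be sketched with Python's difflib. This is just an illustration; the function name is mine, and it returns only the added and removed lines so the volatile ones are easy to spot.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list:
    """Return only the added/removed lines of a unified diff."""
    diff = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="old", tofile="new", lineterm="", n=0,
    )
    # Drop the '--- old' / '+++ new' header pair emitted before the hunks.
    body = list(diff)[2:]
    # Keep '-' (removed) and '+' (added) lines; skip the '@@' hunk markers.
    return [line for line in body if line.startswith(("+", "-"))]
```

Running it on two saved copies of the page shows exactly which lines differ between downloads.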