LinuxQuestions.org
12-16-2008, 12:52 PM   #1
ewingtux
LQ Newbie
 
Registered: Nov 2007
Posts: 10

Rep: Reputation: 0
Wget or cURL code for checking changes to a web page?


Does anyone know what command I could use to check a product page such as http://www.amazon.com/Tales-Beedle-B.../dp/0545128285 for price changes and be notified of them? For example, if the price changes to $7.50, I need it to detect that and notify me.

Wget or cURL seem like the best tools, but I don't know where to start or which command to use.

Any help is much appreciated.
 
12-16-2008, 03:16 PM   #2
MensaWater
Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 5,974
Blog Entries: 5

Rep: Reputation: 778
You could do it with wget but it would be a little involved.

If it were me I'd do it with lynx instead:

Code:
lynx -dump http://www.amazon.com/Tales-Beedle-B.../dp/0545128285 |grep "  Price:" |awk '{print $1,$2}'
Note that in the grep pattern there are TWO spaces before "Price:". This ensures it matches the current price line rather than the "List Price:" line, which has only one space before "Price:".
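
To see why the two spaces matter, here is a small Python 3 sketch that mimics the grep and awk steps on a made-up text sample. The sample lines and the function name `price_fields` are assumptions for illustration only; real `lynx -dump` output for the page will look different.

```python
# Hypothetical sample resembling `lynx -dump` output; the real page text will differ.
SAMPLE_DUMP = """\
   List Price: $12.99
   Price: $7.14
   Availability: In Stock.
"""

def price_fields(dump_text, needle="  Price:"):
    """Mimic `grep "  Price:" | awk '{print $1,$2}'` on a block of text."""
    results = []
    for line in dump_text.splitlines():
        if needle in line:                 # grep: plain substring match
            fields = line.split()          # awk: split on runs of whitespace
            results.append(" ".join(fields[:2]))
    return results

# Two spaces: "List Price:" has a letter before its single space, so it is skipped.
print(price_fields(SAMPLE_DUMP))
# One space: the "List Price:" line matches as well.
print(price_fields(SAMPLE_DUMP, " Price:"))
```

With the two-space pattern only the current price line survives; with a single space the list price line slips through too.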

In my awk I'm printing both the "Price:" and the current price ($7.14 when I ran it).

You could just print the current price ($2 in the awk print statement) and strip off the $ so you can do a numeric comparison using something like bc -l.

Code:
lynx -dump http://www.amazon.com/Tales-Beedle-B.../dp/0545128285 |grep "  Price:" |awk '{print $2}' |cut -c2-
The cut at the end strips the $: it is always the first character, so cut -c2- keeps everything from the second character onward. (You could do it with sed or awk, but then you have to figure out how to escape the dollar sign, since it has a special meaning there.)
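
If you would rather do the comparison outside the shell, the same idea can be sketched in Python 3. This is only an illustration under assumptions: the helper names `parse_price`, `price_changed` and `at_or_below` are made up here, and the input is assumed to be a plain string such as "$7.14" with no thousands separators.

```python
from decimal import Decimal

def parse_price(text):
    """Turn a string like '$7.14' into Decimal('7.14')."""
    # lstrip("$") drops the leading dollar sign; no escaping is needed here.
    return Decimal(text.strip().lstrip("$"))

def price_changed(current, previous):
    """True if the two price strings differ numerically."""
    return parse_price(current) != parse_price(previous)

def at_or_below(current, target):
    """True if the current price is less than or equal to the target."""
    return parse_price(current) <= parse_price(target)

print(price_changed("$7.14", "$7.50"))
print(at_or_below("$7.14", "$7.50"))
```

Decimal is used instead of float so that prices compare exactly, which is roughly what bc gives you in the shell version.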
 
12-16-2008, 04:46 PM   #3
jcookeman
Member
 
Registered: Jul 2003
Location: London, UK
Distribution: FreeBSD, OpenSuse, Ubuntu, RHEL
Posts: 417

Rep: Reputation: 33
Quick and nasty:

Code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import urllib2
import re
import sys
try:
    pricefile = open('pricefile.txt')
    initial_price = pricefile.readline()
    pricefile.close()
except IOError:
    initial_price = 0
# Setup urllib to look like Firefox on Ubuntu so those
# clever Amazon engineers don't catch on (as fast)
hdrs = {'User-Agent':'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.4) Gecko/2008111318 Ubuntu/8.04 (hardy) Firefox/3.0.4'}
req = urllib2.Request(sys.argv[1], headers=hdrs)
# Grab the page
try:
    pg = urllib2.urlopen(req)
except urllib2.HTTPError, err:
    print "%s: %s" % (sys.argv[1], err)
    sys.exit(1)
# Look through the page
try:
    while True:
        if re.search('.*("priceBlockLabelPrice").*', pg.next()):
            pricere = re.search('.*>([$]+[0-9]+\.[0-9]+)<.*', pg.next())
            price = pricere.group(1)
            break
except StopIteration:
    print "Could not find price"
    sys.exit(1)
if price != initial_price:
    print "Price has changed from %s to %s" % (initial_price, price)
else:
    print "Price is still %s" % price
try:
    pricefile = open('pricefile.txt', 'w+')
    pricefile.write(price)
    pricefile.close()
except IOError, (errno, strerr):
    print "cannot write to pricefile: [%s] %s" % (errno, strerr)
    sys.exit(1)
sys.exit(0)
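
The script above is Python 2. For anyone on Python 3, the two core steps (pull the price out of the page lines, then compare it with the saved one) might look roughly like the sketch below. It is a sketch under assumptions, not a drop-in replacement: the function names `find_price` and `compare_and_store` are made up, the HTML sample is hypothetical, the `priceBlockLabelPrice` marker is taken from the script above and may no longer match the live page, and the download step is left out.

```python
import re
from pathlib import Path

# Same idea as the regex in the script above: a dollar amount between '>' and '<'.
PRICE_RE = re.compile(r">([$]+[0-9]+\.[0-9]+)<")

def find_price(lines, marker="priceBlockLabelPrice"):
    """Return the price found on the line after the first marker line, or None."""
    it = iter(lines)
    for line in it:
        if marker in line:
            match = PRICE_RE.search(next(it, ""))
            return match.group(1) if match else None
    return None

def compare_and_store(price, path):
    """Compare price with the one saved at path, save the new one, return a message."""
    pricefile = Path(path)
    previous = pricefile.read_text().strip() if pricefile.exists() else None
    pricefile.write_text(price)
    if previous is None:
        return "No previous price; recorded %s" % price
    if price != previous:
        return "Price has changed from %s to %s" % (previous, price)
    return "Price is still %s" % price

# Hypothetical markup for illustration only; the real page will differ.
sample = [
    '<td class="priceBlockLabelPrice">Price:</td>',
    '<td><b class="priceLarge">$7.14</b></td>',
]
print(find_price(sample))
```

Calling compare_and_store(find_price(lines), 'pricefile.txt') on each run would then report whether the price moved since the last check. Unlike the script above, a marker line without a price after it yields None here instead of raising an exception.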

Last edited by jcookeman; 12-17-2008 at 03:23 AM. Reason: Too nasty for my tastes
 
  

