LinuxQuestions.org - Wget or cURL code for checking changes to a web page?

Does anyone know what command I could use to check for, and be notified of, price changes on a product page such as http://www.amazon.com/Tales-Beedle-B.../dp/0545128285? If the price changes to, say, $7.50, I need it to notice and notify me.

Wget or cURL seem the best, but I don't know where to start or what command to use.

Any help is much appreciated.

You could do it with wget but it would be a little involved.

If it were me I'd do it with lynx instead:

Code:

lynx -dump http://www.amazon.com/Tales-Beedle-B.../dp/0545128285 | grep "  Price:" | awk '{print $1,$2}'

Note that in the grep there are TWO spaces before the "Price:". This ensures it matches the price line rather than the "List Price:" line.

In my awk I'm printing both the "Price:" and the current price ($7.14 when I ran it).
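For anyone who would rather do the same filtering in Python, here is a minimal sketch of what the grep and awk stages do. The sample text is made up to resemble a text dump of the page, not captured from the real one, and find_price is a hypothetical helper name.

```python
def find_price(dump_text):
    """Return the second field of the first line containing '  Price:'.

    Mirrors: grep "  Price:" | awk '{print $2}'
    """
    for line in dump_text.splitlines():
        # Two leading spaces, as in the grep pattern, so that
        # "List Price:" (one space before "Price:") is skipped.
        if "  Price:" in line:
            fields = line.split()  # whitespace split, like awk
            if len(fields) >= 2:
                return fields[1]
    return None


# Hypothetical sample resembling a text dump of the page
sample = """\
   List Price: $12.99
        Price: $7.14 & eligible for free shipping
   You Save: $5.85 (45%)
"""

print(find_price(sample))
```

If the page layout changes, the function returns None instead of a price, which is worth checking for before comparing anything.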

You could just print the current price ($2 in the awk print statement) and strip off the $ so you can do a numeric comparison using something like bc -l.

Code:

lynx -dump http://www.amazon.com/Tales-Beedle-B.../dp/0545128285 | grep "  Price:" | awk '{print $2}' | cut -c2-

The cut at the end strips off the $, since it is always the first character (cut -c2- keeps everything from the second character onward). (You could do it with sed or awk, but then you have to escape the dollar sign, since it has a special meaning there.)
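The numeric comparison itself could also be done in Python rather than bc. A rough sketch, assuming the price arrives as a string like "$7.14" and that "notify me" means the price is at or below a target such as 7.50 (the original question does not say which direction matters). parse_price and reached_target are hypothetical names.

```python
from decimal import Decimal


def parse_price(text):
    """Turn a string such as '$7.14' into Decimal('7.14')."""
    # Drop a leading currency symbol and any thousands separators
    return Decimal(text.strip().lstrip("$£").replace(",", ""))


def reached_target(price_text, target):
    """True when the scraped price is at or below the target (assumed rule)."""
    return parse_price(price_text) <= Decimal(target)


print(reached_target("$7.14", "7.50"))
print(reached_target("$8.00", "7.50"))
```

Decimal is used instead of float so that prices compare exactly as written.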

Quick and nasty:

Code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import re
import sys
import urllib.error
import urllib.request

# Read the price saved by the previous run, if any
try:
    with open('pricefile.txt', encoding='utf-8') as pricefile:
        initial_price = pricefile.readline().strip()
except IOError:
    initial_price = ''

# Setup urllib to look like Firefox on Ubuntu so those
# clever Amazon engineers don't catch on (as fast)
hdrs = {'User-Agent':'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.4) Gecko/2008111318 Ubuntu/8.04 (hardy) Firefox/3.0.4'}
req = urllib.request.Request(sys.argv[1], headers=hdrs)

# Grab the page
try:
    pg = urllib.request.urlopen(req)
except urllib.error.URLError as err:
    print("%s: %s" % (sys.argv[1], err))
    sys.exit(1)

# Look through the page: the price is on the line after the label
lines = (raw.decode('utf-8', 'replace') for raw in pg)
price = None
for line in lines:
    if 'priceBlockLabelPrice' in line:
        pricere = re.search(r'>([$£]+[0-9]+\.[0-9]+)<', next(lines, ''))
        if pricere:
            price = pricere.group(1)
        break
if price is None:
    print("Could not find price")
    sys.exit(1)

if price != initial_price:
    print("Price has changed from %s to %s" % (initial_price or 'unknown', price))
else:
    print("Price is still %s" % price)

# Save the price for the next run
try:
    with open('pricefile.txt', 'w', encoding='utf-8') as pricefile:
        pricefile.write(price)
except IOError as err:
    print("cannot write to pricefile: [%s] %s" % (err.errno, err.strerror))
    sys.exit(1)
sys.exit(0)
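The "remember the last price in a file" part of the script above can be pulled out and tried on its own without touching Amazon. This is only a sketch for Python 3: check_price is a hypothetical name, and the demo writes to a throwaway temporary file rather than a real pricefile.txt.

```python
import os
import tempfile


def check_price(price, state_path):
    """Compare price with the one stored in state_path, then store it.

    Returns a message describing what happened.
    """
    try:
        with open(state_path, encoding="utf-8") as f:
            previous = f.readline().strip()
    except FileNotFoundError:
        previous = None

    # Record the current price for the next run
    with open(state_path, "w", encoding="utf-8") as f:
        f.write(price)

    if previous is None:
        return "First run, recorded %s" % price
    if price != previous:
        return "Price has changed from %s to %s" % (previous, price)
    return "Price is still %s" % price


# Demo against a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pricefile.txt")
    for p in ("$7.14", "$7.14", "$7.50"):
        print(check_price(p, path))
```

Run from cron or a similar scheduler, a check like this only needs to produce output when the message says the price changed.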