LinuxQuestions.org


stardotstar 07-12-2006 01:13 AM

simple script to grab an image from a web page and set background
 
I have a need to set the desktop to an image that appears as the only img tag on an html webpage.

The windows system being used for those clients is:

go to the page
find the image by identifying the img tags in the source
copy file to location
set desktop background

call this prog on startup and execute and exit

This has been done in dotnet I think.

Someone must have done this - know any links or what approach would be best - I am not very used to tackling scripting problems beyond basic local bash stuff.

TIA
Will.*

unSpawn 07-12-2006 03:40 AM

I am not very used to tackling scripting problems beyond basic local bash stuff.
Still it would have been cool to post whatever you tried, just to show us you did *try*.
Also look at Freshmeat.net and Sourceforge.org. There's a true cornucopia of tools waiting to be discovered...

go to the page, find the image by identifying the img tags in the source, copy file to location
Anyway, since I feel like reinventing some wheels today, maybe something along the lines of:
Code:

#!/bin/sh
uri="$1"; wget -O - "$uri" 2>/dev/null|grep "\<img.*src=" 2>/dev/null|\
head -1|sed -e "s/.*><img/<img/g" -e "s/><.*$/>/g"|while read l; do
l=(${l}); for i in $(seq 0 $[${#l[@]}-1]); do if [ "${l[$i]:0:3}" = "src" ]; then
img="$(dirname "$uri")$(echo ${l[$i]:4}|tr -d "\"")"; fi; done;
wget -q "$img"; done

YMMV(VM)


set desktop background
You figure that out using something like "xloadimage -onroot file.ext".

Tinkster 07-12-2006 03:51 AM

And I thought he was talking about a windows box with .NET ...


Cheers,
Tink

unSpawn 07-12-2006 05:26 AM

And I thought he was talking about a windows box with .NET
Was he? Hmm. Shame. I'll go find some other wheels to reinvent, then ;-p

muha 07-12-2006 05:30 AM

@unSpawn: cool! Did you get this script working on a page? Which page? I was trying it out but it did nothing for me ...

stardotstar 07-12-2006 05:47 AM

Thanks guys, I actually didn't even try - I didn't know where to begin - so I asked the question here; My scripts start and stop networking processes with simple config changes and do things to VirtualMachines, launch rsyncs and what not so I am at a loss with real programming.

Thank you for the links and example script - you are most generous and I appreciate the effort you put in.

And no it is not for a windows box - linux systems that need to coexist on a windows centric network...
I can see that the original post was very unclear on that score :oops:
Will

stardotstar 07-12-2006 06:10 AM

that script is great and I think I get the gist of it - I can't make it actually get an image but maybe that is because the tagging of the image is difficult...

I added some echoing so i could see if the $img was correctly populated but it never output... The $uri is correctly set.

Here are the properties of the image in the browser:
Code:

http://intrasv01.homx.casga/picup/image/cata/splash_12_07_06_lr.jpg
and here are the definitions in the HTML:

Code:

<br>
<a href="image/cata/splash_12_07_06_hr.jpg">
<IMG SRC="image/cata/splash_12_07_06_lr.jpg"
alt="Splash Group Nd.  Click for Hi_Res for Backgrounds Splash on cata relief."></a>
</td>

<td VALIGN="TOP">
<br>
<center>

Does this make it clearer.

I understand you have made a substantial effort to assist and I will work to understand what you have done better. I am very new to this level of scripting and I appreciate your time.

I basically need to automate the grabbing and saving of the high res version of the single image displayed on that intranet page...

Will
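
For markup like the snippet above, where the tag is written as IMG SRC in capitals and sits on its own line, a rough sketch of the extraction step could look like this. It assumes a grep that supports -o (GNU and BSD grep do), and the function name first_img_src is made up for illustration. It only pulls the src of the first img tag; the hi-res link in the surrounding a href is not handled.

```shell
#!/bin/sh
# Print the value of the first src="..." attribute found in an <img> tag
# read from standard input. Matching is case-insensitive, so <IMG SRC=...>
# works too, and the case of the path itself is left untouched.
# Assumption: grep supports -o (not required by POSIX).
first_img_src() {
    tr '\n' ' ' |                              # join lines so a tag split over lines still matches
    grep -oi '<img[^>]*src="[^"]*"' |          # keep only the part of each img tag up to the src value
    head -n 1 |                                # first image only
    sed 's/.*[Ss][Rr][Cc]="\([^"]*\)"/\1/'     # strip everything but the attribute value
}
```

Usage would be something like: wget -q -O - "$uri" | first_img_src. Unlike lowercasing the whole page with tr, this keeps upper-case letters in the image path, which matters on servers with case-sensitive file names.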

stardotstar 07-12-2006 06:36 AM

BTW thanks for going the extra mile for me - don't feel like you have to reinvent the reinvention - I will be scavenging for those other resources as well. Just need to accomplish this as it's a part of local intranet compliance on machines and several of us run linux in a windows world which is quite unforgiving. Thanks for your help :)

unSpawn 07-12-2006 06:40 AM

And no it is not for a windows box - linux systems that need to coexist on a windows centric network
That's cool. Let's support that effort by finishing off the script and add some minor checking:
Code:

#!/bin/sh
uri="$1"; wget -O - "$uri" 2>/dev/null|tr [A-Z] [a-z]|grep -ie "\<img.*src=" 2>/dev/null|head -1\
|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:3}"|grep -qie src && { img="$(dirname "$uri")$(echo ${l[$i]:4}|tr -d "\"")"; break; };
 done
 rand=$(date|sha1sum|cut -c 1-26); wget -q "$img" -O "/tmp/img.$rand" && \
 { file -bi /tmp/setimgfile|egrep -qe "^image\/(gi|jp|pn)" \
  && xloadimage -onroot "/tmp/img.$rand" && rm -f "/tmp/img.$rand"; };
done; exit 0

If saved as "getImg.sh" and testrun like "./getImg.sh http://www.google.com/ncr" it fills your desktop with that abomination ;-p If it doesn't work run as "sh -x ./getImg.sh uri-of-choice 2>&1 | tee getImg.tee" and post the contents of getImg.tee (you're invited to do so as well muha).

muha 07-12-2006 07:05 AM

Hmm, it grabs the image like you said. (i did not realize it would be in the /tmp dir)
It doesn't manage to stick it to my desktop though. Which command in your script would do that?
Quote:

$: sh -x ./getImg.sh http://www.google.com/ncr 2>&1 | tee getImg.tee
+ uri=http://www.google.com/ncr
+ wget -O - http://www.google.com/ncr
+ tr '[A-Z]' '[a-z]'
+ grep -ie '\<img.*src='
+ head -1
+ sed -e 's/.*<img/<img/g' -e 's/><.*$/>/g'
+ read l
+ l=(${l})
++ seq 0 4
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo '<im'
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo src
+ grep -qie src
++ dirname http://www.google.com/ncr
++ echo '"/intl/en/images/logo.gif"'
++ tr -d '"'
+ img=http://www.google.com/intl/en/images/logo.gif
+ break
++ date
++ sha1sum
++ cut -c 1-26
+ rand=e0e755ffc570f6f48ed2f50aa7
+ wget -q http://www.google.com/intl/en/images/logo.gif -O /tmp/img.e0e755ffc570f6f48ed2f50aa7
+ file -bi /tmp/setimgfile
+ egrep -qe '^image\/(gi|jp|pn)'
+ read l
+ exit 0

unSpawn 07-12-2006 08:01 AM

It doesn't manage to stick it to my desktop though.
Yes, that's because I know just enough scripting to fsck things up ;-p
Replace the lines:
Code:

{ file -bi /tmp/setimgfile|egrep -qe "^image\/(gi|jp|pn)" \
  && xloadimage -onroot "/tmp/img.$rand" && rm -f "/tmp/img.$rand"; };

with:
Code:

{ file -bi "/tmp/img.$rand"|egrep -qe "^image\/(gi|jp|pn)" \
  && xloadimage -onroot "/tmp/img.$rand" && rm -f "/tmp/img.$rand"; break; }


muha 07-12-2006 08:54 AM

Quote:

./getImg.sh: line 18: xloadimage: command not found
Which probably makes sense since i don't have it installed on my suse 10.0
Thanks for the script, might come in handy later on.

stardotstar 07-12-2006 06:54 PM

Fantastic - thank you, it works exactly as promised, with one problem which I have identified as something to do with the line where the file type is set - it seems to work fine with gifs like the google one but as soon as I point to a jpg xloadimage fails:

Code:

stardotstar@geko ~ $ xloadimage -onroot  -center /tmp/img.7df45b6f2058c9d840ee211413
Warning: unknown JFIF revision number 0.00
/tmp/img.7df45b6f2058c9d840ee211413: unknown or unsupported image type
  Building XImage...done

Code:

{ file -bi "/tmp/img.$rand"|egrep -qe "^image\/(gi|jp|pn)" \
Code:


stardotstar@geko ~ $ file /tmp/daily_splash.jpg
/tmp/daily_splash.jpg: JPEG image data, JFIF standard 1.02, comment: "Adobe ImageReady\377"
stardotstar@geko ~ $ file /tmp/img.               
img.7df45b6f2058c9d840ee211413  img.dbd44c6e7596033f76f3794d42
img.ceb3972fea047f409ebec9323d  img.e464b9f0e3baf5821aaa286bc9
stardotstar@geko ~ $ file /tmp/img.7df45b6f2058c9d840ee211413
/tmp/img.7df45b6f2058c9d840ee211413: empty
stardotstar@geko ~ $ file /tmp/img.dbd44c6e7596033f76f3794d42
/tmp/img.dbd44c6e7596033f76f3794d42: empty
stardotstar@geko ~ $ file /tmp/img.e464b9f0e3baf5821aaa286bc9
/tmp/img.e464b9f0e3baf5821aaa286bc9: empty
stardotstar@geko ~ $ file /tmp/img.ceb3972fea047f409ebec9323d
/tmp/img.ceb3972fea047f409ebec9323d: GIF image data, version 89a, 276 x 110
stardotstar@geko ~ $

So the gif is properly populated with file information but the jpgs are not...

I am researching the way that file works and how it relates to "magic" and will report back if I can hack it out myself :)

Since I am so new to the pipes and management of so many bash commands I have unfortunately been unable to fix this and present an answer. I'm sure it is simple since xloadimage clearly works on the instance of the file that is saved from the browser as a jpg.

Thanks again for your help :)
Will

unSpawn 07-12-2006 07:26 PM

If you can't be bothered with the "file" check just replace
Code:

rand=$(date|sha1sum|cut -c 1-26); wget -q "$img" -O "/tmp/img.$rand" && \
 { file -bi /tmp/setimgfile|egrep -qe "^image\/(gi|jp|pn)" \
  && xloadimage -onroot "/tmp/img.$rand" && rm -f "/tmp/img.$rand"; };
done; exit 0

with
Code:

rand=$(date|sha1sum|cut -c 1-26); wget -q "$img" -O "/tmp/img.$rand" && \
 { xloadimage -onroot "/tmp/img.$rand" && rm -f "/tmp/img.$rand"; };
done; exit 0


stardotstar@geko ~ $ file /tmp/daily_splash.jpg
If you want to fix it you need to add output from file -bi, not just "file".
If the outcome is "image/jfif" then it'll prolly look like
Code:

egrep -qe "^image\/(gi|j[pf]|pn)" \

unSpawn 07-12-2006 07:31 PM

BTW: i did not realize it would be in the /tmp dir
Yes, that is a flaw. While it would be not so easy under normal(?) circumstances to predict the value of $rand (and for instance symlink it to a file to overwrite if the script was run as root) it should use "mktemp" instead of the $rand kludge to be safer.
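
A minimal sketch of what the mktemp variant could look like, assuming a mktemp that accepts a template argument (GNU coreutils and the BSDs do). The names make_img_tmp and tmpimg are placeholders, not part of the script above.

```shell
#!/bin/sh
# Create a private, uniquely named temporary file for the downloaded image.
# mktemp creates the file itself (mode 0600) and fails rather than reusing an
# existing name, which closes the symlink trick described above.
make_img_tmp() {
    mktemp "${TMPDIR:-/tmp}/img.XXXXXX"
}

tmpimg=$(make_img_tmp) || exit 1
# Remove the temporary file whenever the script exits.
trap 'rm -f "$tmpimg"' EXIT

# The download step would then write to it, for example:
# wget -q "$img" -O "$tmpimg"
```

The trap also takes care of the cleanup, so the separate rm -f at the end of the script would no longer be needed.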

stardotstar 07-12-2006 08:32 PM

This has me confused because:

Code:

stardotstar@geko ~ $ file -bi /tmp/splash_daily.jpg
image/jpeg

and yet despite changing and/or removing the file section I get no joy.

Code:

stardotstar@geko ~ $ xloadimage -onroot /tmp/img.264c8bb569b018170488a59aa1
Warning: unknown JFIF revision number 0.00
/tmp/img.264c8bb569b018170488a59aa1: unknown or unsupported image type

stardotstar@geko ~ $ file -bi /tmp/img.264c8bb569b018170488a59aa1
application/x-empty

I think I am understanding what the egrep does - detects from the original file name what the extension is and attaches that to the file command. But I can't actually see what this is asking file to do... man file does not tell me about "setting" the file type this way and so:

Code:

stardotstar@geko ~ $ file -bi /tmp/img.264c8bb569b018170488a59aa1 image/jpeg
application/x-empty
cannot open `image/jpeg' (No such file or directory)

so clearly I am not understanding the purpose of the file command in the script - But then again I don't really know how all the escape characters and conditional statements work.

Without the file statement google gif works just fine:
Code:

stardotstar@geko ~ $ ./backgrounder.2 http://www.google.com/ncr
/tmp/img.74b42f74785bd34f07611f44da is a 276x110 GIF image with 256 colors
  Building XImage...done

so somehow the jpeg is not wanting to play at all...

Will

stardotstar 07-12-2006 09:38 PM

Strange, it is not quite doing what I expected. I'll try to summarise my findings and if you can see the benefit of pursuing it I appreciate it - but I understand the effort you have gone to already and don't want to turn this into a new sf project :)

I have removed the file removal so I can compare what is captured. It seems that anything except gifs get 0 file size and are unknown type.

Here is the result of executing the script on several pages and the list of files generated, IBM and Google worked anything that found a png or jpg failed to write any content:

Code:

stardotstar@geko ~ $ sh -x ./backgrounder.1 http://www.kernel.org 2>&1             
+ uri=http://www.kernel.org
+ wget -O - http://www.kernel.org
+ tr '[A-Z]' '[a-z]'
+ grep -ie '\<img.*src='
+ head -1
+ sed -e 's/.*<img/<img/g' -e 's/><.*$/>/g'
+ read l
+ l=(${l})
++ seq 0 7
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo '<im'
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo sty
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo sol
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo '#bb'
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo src
+ grep -qie src
++ dirname http://www.kernel.org
++ echo '"http://www1.kernel.org/bw-zeus1.png"'
++ tr -d '"'
+ img=http:http://www1.kernel.org/bw-zeus1.png
+ break
++ sha1sum
++ cut -c 1-26
++ date
+ rand=5c9cbd7d07df9a7eb87bf46873
+ wget -q http:http://www1.kernel.org/bw-zeus1.png -O /tmp/img.5c9cbd7d07df9a7eb87bf46873
+ read l
+ exit 0
stardotstar@geko ~ $ ls -l /tmp/img*
-rw-r--r-- 1 stardotstar stardotstar 0 Jul 13 12:10 /tmp/img.5c9cbd7d07df9a7eb87bf46873

Code:

stardotstar@geko ~ $ sh -x ./backgrounder.2 http://www.ibm.org 2>&1
+ uri=http://www.ibm.org
+ tr '[A-Z]' '[a-z]'
+ grep -ie '\<img.*src='
+ head -1
+ sed -e 's/.*<img/<img/g' -e 's/><.*$/>/g'
+ read l
+ wget -O - http://www.ibm.org
+ l=(${l})
++ seq 0 6
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo '<im'
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo hei
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo wid
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo bor
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo alt
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo src
+ grep -qie src
++ dirname http://www.ibm.org
++ echo '"//www.ibm.com/i/v14/t/ibm-logo.gif"'
++ tr -d '"'
+ img=http://www.ibm.com/i/v14/t/ibm-logo.gif
+ break
++ date
++ sha1sum
++ cut -c 1-26
+ rand=9eff1d95e7b8154bd10389f9d4
+ wget -q http://www.ibm.com/i/v14/t/ibm-logo.gif -O /tmp/img.9eff1d95e7b8154bd10389f9d4
+ file -bi /tmp/img.9eff1d95e7b8154bd10389f9d4
+ egrep -qe '^image\/(gi|j[pf]|pn)'
+ xloadimage -onroot /tmp/img.9eff1d95e7b8154bd10389f9d4
/tmp/img.9eff1d95e7b8154bd10389f9d4 is a 110x52 GIF image with 16 colors
  Building XImage...done
+ break
+ exit 0
stardotstar@geko ~ $ ls -l /tmp/img*
-rw-r--r-- 1 stardotstar stardotstar  0 Jul 13 12:10 /tmp/img.5c9cbd7d07df9a7eb87bf46873
-rw-r--r-- 1 stardotstar stardotstar 430 Oct  8  2004 /tmp/img.9eff1d95e7b8154bd10389f9d4

as you can see the IBM gif saved at 430 bytes and xloadimage worked fine.
so too below for google:

Code:

stardotstar@geko ~ $ sh -x ./backgrounder.2 http://www.google.com/ncr 2>&1
+ uri=http://www.google.com/ncr
+ wget -O - http://www.google.com/ncr
+ tr '[A-Z]' '[a-z]'
+ grep -ie '\<img.*src='
+ head -1
+ sed -e 's/.*<img/<img/g' -e 's/><.*$/>/g'
+ read l
+ l=(${l})
++ seq 0 4
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ grep -qie src
+ echo '<im'
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo src
+ grep -qie src
++ dirname http://www.google.com/ncr
++ echo '"/intl/en/images/logo.gif"'
++ tr -d '"'
+ img=http://www.google.com/intl/en/images/logo.gif
+ break
++ date
++ sha1sum
++ cut -c 1-26
+ rand=29a9da6b4c41a55add118c1f7d
+ wget -q http://www.google.com/intl/en/images/logo.gif -O /tmp/img.29a9da6b4c41a55add118c1f7d
+ file -bi /tmp/img.29a9da6b4c41a55add118c1f7d
+ egrep -qe '^image\/(gi|j[pf]|pn)'
+ xloadimage -onroot /tmp/img.29a9da6b4c41a55add118c1f7d
/tmp/img.29a9da6b4c41a55add118c1f7d is a 276x110 GIF image with 256 colors
  Building XImage...done
+ break
+ exit 0
stardotstar@geko ~ $ ls -l /tmp/img*
-rw-r--r-- 1 stardotstar stardotstar 8558 Jun  8 05:38 /tmp/img.29a9da6b4c41a55add118c1f7d
-rw-r--r-- 1 stardotstar stardotstar    0 Jul 13 12:10 /tmp/img.5c9cbd7d07df9a7eb87bf46873
-rw-r--r-- 1 stardotstar stardotstar  430 Oct  8  2004 /tmp/img.9eff1d95e7b8154bd10389f9d4
-rw-r--r-- 1 stardotstar stardotstar    0 Jul 13 12:06

When I point at a jpg - like the png it always just produces a null file type and content:

Code:

stardotstar@geko ~ $ file -bi /tmp/img*
image/gif
application/x-empty
image/gif
application/x-empty
application/x-empty
application/x-empty

Thanks

Will

UPDATE - I simplified the script further to try and see what was not working:

Code:

stardotstar@geko ~ $ sh -x ./backgrounder.4 http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ uri=http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ wget -O - http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ tr '[A-Z]' '[a-z]'
+ grep -ie '\<img.*src='
+ head -1
+ sed -e 's/.*<img/<img/g' -e 's/><.*$/>/g'
+ read l
+ l=(${l})
++ seq 0 1
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo '<im'
+ grep -qie src
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo src
+ grep -qie src
++ dirname http://antwrp.gsfc.nasa.gov/apod/astropix.html
++ echo '"image/0607/shuttlego_nasa.gif"'
++ tr -d '"'
+ img=http://antwrp.gsfc.nasa.gov/apodimage/0607/shuttlego_nasa.gif
+ break
+ wget -q http://antwrp.gsfc.nasa.gov/apodimage/0607/shuttlego_nasa.gif -O /tmp/img.daily
+ xloadimage -onroot /tmp/img.daily
Warning: unknown JFIF revision number 0.00
/tmp/img.daily: unknown or unsupported image type
+ read l
+ exit 0
stardotstar@geko ~ $ cat backgrounder.4
#!/bin/sh
uri="$1"; wget -O - "$uri" 2>/dev/null|tr [A-Z] [a-z]|grep -ie "\<img.*src=" 2>/dev/null|head -1\
|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:3}"|grep -qie src && { img="$(dirname "$uri")$(echo ${l[$i]:4}|tr -d "\"")"; break; };
 done
wget -q "$img" -O "/tmp/img.daily";
xloadimage -onroot "/tmp/img.daily" ;
done; exit 0
stardotstar@geko ~ $ ls -l /tmp/img.daily
-rw-r--r-- 1 stardotstar stardotstar 0 Jul 13 15:50 /tmp/img.daily

This is a good site to test because it has daily changing images and they vary in file type... The current gif there is not working but it is an animation so maybe that is why.

unSpawn 07-15-2006 06:03 AM

"File" doesn't set anything, it just reports the MIME type it detects from the file's contents (in this case, due to -i). Egrep then (silently: -q) tries to match any string starting with "image/" followed by gi, jp or pn. "application/x-empty" means wget created the output file but it contains no data (with -O the file is created even when the download itself fails). I'll clean up some other stuff, rewrite in full and add comments. Note you lose PNG because xloadimage doesn't support it. You can of course always substitute a loader that does. Also note you need to provide a URI in the format protocol://domain.tld/file.extension so protocol://domain.tld/ will not do. I could fix it by checking for and counting slashes or TLD position in "${uri:7}" and then not using "dirname" but not now. I also kept in recognition using "file" because I think that's the minimal check to do. I don't want to rely on extension alone.

Code:

#!/bin/sh
# Whatever first parameter is supplied serves as URI to grab.
uri="$1"
# If no arg given or protocol doesn't match that for teh intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 echo "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting." > /dev/stderr
 exit 1
fi
# Grab file and dump on stdout, make all lowercase, grep for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.
wget -O - "$uri" 2>/dev/null|tr [A-Z] [a-z]|grep -ie "\<img.*src=" 2>/dev/null|head -1\
|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})
 # for each element in the array check if it matches the HTML image "src" tag,
 # first found match (break) gets stuffed in the "img" variable.
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:3}"|grep -qie src
  if [ "$?" = "0" ]; then
    # Make compound of URI base, strip the src= part.
    img="$(dirname "$uri")$(echo ${l[$i]:4}|tr -d "\"")"
    # Decidedly lame way to strip not having a page:
    if [ "${img:0:10}" = "http:http:" ]; then
    img=${img:5}
    fi
    break
  fi
 done
 # Lame way to get a randomised string for temp file.
 rand=$(date|sha1sum|cut -c 1-26)
 # Grab image and output to temp file.
 wget -q "$img" -O "/tmp/img.$rand"
 if [ -s "/tmp/img.$rand" ]; then
 # If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "/tmp/img.$rand" 2>/dev/null|egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then
    loadImgRes=$(xloadimage -onroot "/tmp/img.$rand" 2>&1)
    # If xloadimage fails dump some info to stderr.
    if [ "$?" != "0" ]; then
      echo "${0##*/}: xloadimage failed with message \"${loadImgRes}\"." 1>&2
      echo "${0##*/}: grabbed image was \"${img}\"." 1>&2
      echo "${0##*/}: temp image stats:" 1>&2
      stat "/tmp/img.$rand" 1>&2
      file "/tmp/img.$rand" 1>&2
    fi
  else
    # Issue warning for unsupported image
    echo "${0##*/}: grabbed image is unsupported by xloadimage." > /dev/stderr
  fi
 else
 # Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." > /dev/stderr
 fi
 # Remove the temporary image regardless.
 rm -f "/tmp/img.$rand"
# End of outer loop.
done
exit 0

BTW, thanks for the debug output and for helping make *your* application "better".
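
For reference, the "not empty and really a GIF or JPEG" test can also be pulled out into a small function. This is only a sketch under the assumption that your "file" understands --mime-type (current versions do; on older ones -i prints the same type, sometimes followed by a charset). The function name is_gif_or_jpeg is made up here.

```shell
#!/bin/sh
# Succeed only if the given file is non-empty and "file" reports it as
# a GIF or JPEG image. The file name and its extension are never consulted.
# Assumption: "file" supports --mime-type.
is_gif_or_jpeg() {
    [ -s "$1" ] || return 1                      # empty or missing: failed download
    case "$(file -b --mime-type "$1" 2>/dev/null)" in
        image/gif|image/jpeg) return 0 ;;
        *) return 1 ;;
    esac
}
```

In the script it would be used as: if is_gif_or_jpeg "/tmp/img.$rand"; then xloadimage -onroot "/tmp/img.$rand"; fi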

stardotstar 07-16-2006 12:22 AM

Wow, thanks heaps. I really appreciate your taking the time to work on this project for me :) I am following the code and it is very informative - one day I hope to be able to tackle stuff like this.

I am still running into the zero size get problem. The astropix site is a good reference because they have only one pic and it changes from gif to jpg regularly which is the situation on my intranet (mainly catalogue pics that could be jpeg straight from a digital camera or gifs needing transparency for inclusion in a website. etc.)

http://antwrp.gsfc.nasa.gov/apod/astropix.html
I am using that for reference whilst at home over the weekend and at nights because the intranet is not available.

I find that I am still continually getting zero file sizes but when I point wget directly at the image link it works...

Code:

geko stardotstar # xloadimage -onroot gcenter_2mass_big.jpg
gcenter_2mass_big.jpg is a 620x1214 JPEG image, color space YCbCr, 3 comps, Huffman coding.
  Building XImage...done

works manually but when the backgrounder script is called:

Code:

geko stardotstar # wget http://antwrp.gsfc.nasa.gov/apod/image/0607/gcenter_2mass_big.jpg
--15:20:11--  http://antwrp.gsfc.nasa.gov/apod/image/0607/gcenter_2mass_big.jpg
          => `gcenter_2mass_big.jpg.1'
Resolving antwrp.gsfc.nasa.gov... 128.183.17.121
Connecting to antwrp.gsfc.nasa.gov|128.183.17.121|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 626,425 (612K) [image/jpeg]

100%[====================================>] 626,425      53.18K/s    ETA 00:00

15:20:24 (50.48 KB/s) - `gcenter_2mass_big.jpg.1' saved [626425/626425]

perhaps it is a timeout thing... wget the image works and then the root window allows it to be set.
Code:

geko stardotstar # sh -x /home/stardotstar/backgrounder.4 http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ uri=http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ '[' 1 '!=' 1 -o http:// '!=' http:// ']'
+ wget -O - http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ tr '[A-Z]' '[a-z]'
+ grep -ie '\<img.*src='
+ head -1
+ sed -e 's/.*<img/<img/g' -e 's/><.*$/>/g'
+ read l
+ l=(${l})
++ seq 0 1
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo '<im'
+ grep -qie src
+ '[' 1 = 0 ']'
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo src
+ grep -qie src
+ '[' 0 = 0 ']'
++ dirname http://antwrp.gsfc.nasa.gov/apod/astropix.html
++ echo '"image/0607/gcenter_2mass.jpg"'
++ tr -d '"'
+ img=http://antwrp.gsfc.nasa.gov/apodimage/0607/gcenter_2mass.jpg
+ '[' http://ant = http:http: ']'
+ break
++ date
++ sha1sum
++ cut -c 1-26
+ rand=b0a9622f6ca5ad0e12610caa41
+ wget -q http://antwrp.gsfc.nasa.gov/apodimage/0607/gcenter_2mass.jpg -O /tmp/img.b0a9622f6ca5ad0e12610caa41
+ '[' -s /tmp/img.b0a9622f6ca5ad0e12610caa41 ']'
+ echo 'backgrounder.4: grabbed image was empty.'
backgrounder.4: grabbed image was empty.
+ rm -f /tmp/img.b0a9622f6ca5ad0e12610caa41
+ read l
+ exit 0

I am not at work now but the symptoms are the same as they were previously - so I haven't bothered to turn off image cleanup.
BTW I have been prowling around sf and googling this type of solution and found chbg and such but they are not nearly as simple and convenient as this script.

Will

unSpawn 07-16-2006 04:34 AM

getImg.sh 1.4 (final)
 
I am still running into the zero size get problem.
That's because of how the img src links work, sometimes relative, sometimes full paths:
Code:

./getImg-v2.sh http://www.google.com/ncr
base: "http://www.google.com"
img: "/intl/en/images/logo.gif"
./getImg-v2.sh http://www.kernel.org
base: "http:"
img: "http://www1.kernel.org/bw-zeus1.png"
./getImg-v2.sh http://www.ibm.org
base: "http:"
img: "//www.ibm.com/i/v14/t/ibm-logo.gif"
./getImg-v2.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html
base: "http://antwrp.gsfc.nasa.gov/apod"
img: "image/0607/gcenter_2mass.jpg"


I've corrected that and added some checks. After testing with your submitted URI's I consider this somewhat "final" but it still isn't optimised etc, etc. Please respect the License and retain comment headers when using/distributing.
Code:

#!/bin/bash
# getImg.sh 1.4 unspawn (www.linuxquestions.org) for stardotstar
# Purpose: grab web-based image from page (URI)
# License: GPLv2
# Args: 1: http://doma.in/HTML-rendered-page.ext
# Deps: Bash, GNU utils, wget, xloadimage
# Run from: manual or cron

# Checks
# 0. wget, xloadimage
for b in wget xloadimage; do
 which $b >/dev/null 2>&1; case "$?" in
  0) ;;
  *) echo "${0##*/}: $b not found or not in PATH, exiting"\
      > /dev/stderr; exit 127;;
 esac
done

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"
# 2. If no arg given or protocol doesn't match that for teh intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 echo "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting." > /dev/stderr
 exit 1
fi
# Grab file and dump on stdout, make all lowercase, grep for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.
wget -O - "$uri" 2>/dev/null|tr [A-Z] [a-z]|grep -ie "\<img.*src=" 2>/dev/null|head -1\
|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})
 # for each element in the array check if it matches the HTML image "src" tag,
 # first found match (break) gets stuffed in the "img" variable.
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:3}"|grep -qie src
  if [ "$?" = "0" ]; then
    # Strip the src= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
    img="$(echo ${l[$i]:4}|tr -d "\"")"
    break
  fi
 done

 # Correct wrong approach:
 # 3. Get base
 base=$(dirname "$uri")
 # 4. Correct ifempty
 if [ "${#base}" = "5" -a "${base}" = "http:" ]; then
  base=EMPTY
  # 5. Follows whole URI is contained in img tag
  if [ "${img:0:2}" = "//" ]; then
  img="http://${img:2}"
  fi
 else
  img="${base}/${img}"
 fi
 # Minor checks
 # 6. Bail out if img is empty
 if [ "${#img}" -lt "18" -o -z "${img}" ]; then
  echo "${0##*/}: img URI too short or empty (\""${img}"\")." > /dev/stderr
  exit 1
 fi
 # Lame way to get a randomised string for temp file.
 rand=$(date|sha1sum|cut -c 1-26)
 # Grab image and output to temp file.
 wget -q "$img" -O "/tmp/img.$rand"
 if [ -s "/tmp/img.$rand" ]; then
 # 7. If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "/tmp/img.$rand" 2>/dev/null|egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then
    loadImgRes=$(xloadimage -onroot "/tmp/img.$rand" 2>&1)
    # 8. If xloadimage fails dump some info to stderr.
    if [ "$?" != "0" ]; then
      echo "${0##*/}: xloadimage failed with message \"${loadImgRes}\"." 1>&2
      echo "${0##*/}: grabbed image was \"${img}\"." 1>&2
      echo "${0##*/}: temp image stats:" 1>&2
      stat "/tmp/img.$rand" 1>&2
      file "/tmp/img.$rand" 1>&2
    fi
  else
    # Issue warning for unsupported image
    echo "${0##*/}: grabbed image is unsupported by xloadimage." > /dev/stderr
  fi
 else
 # Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." > /dev/stderr
 fi
 # Remove the temporary image regardless.
 rm -f "/tmp/img.$rand"
# End of outer loop.
done
exit 0


* If you (stardotstar) want to do something in return (though no pressure, this script is provided for free so if you can't / won't I understand), then post the name of a (GNU/Linux distro-agnostic!) background image loader that does handle PNG and other image formats. That's all. Have fun!

stardotstar 07-16-2006 06:34 AM

Of course I will respect the GPLv2 Licence and I am very appreciative of your efforts and insight into my particular problem. This, to me, is the very heart of the open source community spirit and I revere you for your skill and admire the willingness you exhibit in sharing it.

If I can find a programme which
Quote:

a (GNU/Linux distro-agnostic!) background image loader that does handle PNG and other image formats.
I will most certainly publish it here. Certainly your assistance has assisted me in making some substantial steps toward being able to tackle such things myself as an aspiring programmer.

Best Regards,
Will.*

stardotstar 07-20-2006 05:44 PM

This application is working a treat and I want first of all to thank you unSpawn for taking the time to develop it for me - and teaching me a new level in scripting. I am proud to say that this has led to my actually being able to demonstrate that I have *tried* to solve the current requirement I need to add to the script to finish it and make it match the windows clients...

The script currently loads the picture of the day perfectly (well I can't get it to work as well with multiple screens as Windows but that is way out of scope :lol: ) but the product or content is accompanied by a part number/name and description.

I am trying to get ImageMagick to print the name of the image on the picture and have so far got it to print static text but can't get it to write the ${img} - which I think is what it should be...

the relevant part is:

Code:

addTitleText=$(convert -fill white -draw 'text 100,100 "${img}"' /tmp/img.$rand /tmp/titled 2>&1)
        loadImgRes=$(xloadimage -onroot -center -border black -fullscreen "/tmp/titled" 2>&1)

All this does is write ${img} on the image in white at 100,100

I have also not been able to get convert to output over the top of the img.$rand or write *to* titled.$rand because of syntax problems (I must get a good tutorial on scripting). I also tried to echo the output of the file name to a text file with:

'text 100,100 ${img}'

appended and then call that in the script with convert using the @ modifier. to no avail...

I'm sure all this is just knowledge about how to insert variables in scripts with respect to "'\/ etc - makes my head spin :)

Ideally if I can strip the .jpg or .gif or whatever off the name and print it on the image it will be adequate - I will work out how to add the comments later.

Can I get some assistance with this please?
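
The literal ${img} on the picture comes from the single quotes around the -draw argument: the shell does not expand variables inside single quotes. A sketch of one way around it follows, swapping the quoting so the shell sees double quotes and ImageMagick sees single ones. The helper names caption_from_url and draw_arg are invented for this example, and a caption that itself contains a single quote would still break it.

```shell
#!/bin/sh
# Turn an image URL into a caption: keep the file name, drop the extension.
caption_from_url() {
    name=${1##*/}                  # strip everything up to the last slash
    printf '%s\n' "${name%.*}"     # strip the last ".ext", if there is one
}

# Build the argument for convert's -draw option.
# $1 is the position (e.g. 100,100), $2 the text to print. The single quotes
# in the output are for ImageMagick, which accepts them around a text string.
draw_arg() {
    printf "text %s '%s'\n" "$1" "$2"
}

# Example use inside the script (needs ImageMagick's convert):
# title=$(caption_from_url "$img")
# convert "/tmp/img.$rand" -fill white -draw "$(draw_arg 100,100 "$title")" "/tmp/titled.$rand"
```

With the image from the earlier trace, caption_from_url gives gcenter_2mass, so the picture is labelled with the bare name rather than the whole URL.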

This is the current full listing (you will see that I have butchered some of it ;) by writing the daily randomised file name to a wallpaper directory so the screensaver uses it, and removal of the titled temp file as well as the img.$rand at the end.

Oh, and root tail since I can now use it with nautilus background management turned off :grin:

Finally, can we not just use convert/ImageMagick to handle any file types that are not compatible with xloadimage? That would satisfy the request you made in your last post? (just a thought)


Code:

#!/bin/bash
# getImg.sh 1.4 unspawn (www.linuxquestions.org) for stardotstar
# Purpose: grab web-based image from page (URI)
# License: GPLv2
# Args: 1: http://doma.in/HTML-rendered-page.ext
# Deps: Bash, GNU utils, wget, xloadimage
# Run from: manual or cron

# Checks
# 0. wget, xloadimage
for b in wget xloadimage; do
 which $b >/dev/null 2>&1; case "$?" in
  0) ;;
  *) echo "${0##*/}: $b not found or not in PATH, exiting"\
      > /dev/stderr; exit 127;;
 esac
done

# Set a pretty repeat pattern while we wait for the image to load.

xloadimage -onroot /home/stardotstar/wallpaper.png

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"
# 2. If no arg given or protocol doesn't match that for teh intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 echo "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting." > /dev/stderr
 exit 1
fi
# Grab file and dump on stdout, make all lowercase, grep for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.
wget -O - "$uri" 2>/dev/null|tr [A-Z] [a-z]|grep -ie "\<img.*src=" 2>/dev/null|head -1\
|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})
 # for each element in the array check if it matches the HTML image "src" tag,
 # first found match (break) gets stuffed in the "img" variable.
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:3}"|grep -qie src
  if [ "$?" = "0" ]; then
    # Strip the src= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
    img="$(echo ${l[$i]:4}|tr -d "\"")"
    break
  fi
 done

 # Correct wrong approach:
 # 3. Get base
 base=$(dirname "$uri")
 # 4. Correct ifempty
 if [ "${#base}" = "5" -a "${base}" = "http:" ]; then
  base=EMPTY
  # 5. Follows whole URI is contained in img tag
  if [ "${img:0:2}" = "//" ]; then
  img="http://${img:2}"
  fi
 else
  img="${base}/${img}"
 fi
 # Minor checks
 # 6. Bail out if img is empty
 if [ "${#img}" -lt "18" -o -z "${img}" ]; then
  echo "${0##*/}: img URI too short or empty (\""${img}"\")." > /dev/stderr
  exit 1
 fi
 # Lame way to get a randomised string for temp file.
 rand=$(date|sha1sum|cut -c 1-26)
 # Grab image and output to temp file.
 wget -q "$img" -O "/tmp/img.$rand"
 if [ -s "/tmp/img.$rand" ]; then
 # 7. If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "/tmp/img.$rand" 2>/dev/null|egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then
#    loadImgRes=$(xloadimage -onroot -at 5,30 -border black "/tmp/img.$rand" 2>&1)
        addTitleText=$(convert -fill white -draw 'text 100,100 "${img}"' /tmp/img.$rand /tmp/titled 2>&1)
        loadImgRes=$(xloadimage -onroot -center -border black -fullscreen "/tmp/titled" 2>&1)
    # 8. If xloadimage fails dump some info to stderr.
    if [ "$?" != "0" ]; then
      echo "${0##*/}: xloadimage failed with message \"${loadImgRes}\"."
      echo "${0##*/}: grabbed image was \"${img}\"."
      echo "${0##*/}: temp image stats:"
      stat "/tmp/img.$rand" 1>&2
      file "/tmp/img.$rand" 1>&2
    fi
  else
    # Issue warning for unsupported image
    echo "${0##*/}: grabbed image is unsupported by xloadimage." > /dev/stderr
  fi
 else
 # Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." > /dev/stderr
 fi
 # Remove the temporary image regardless.
 cp "/tmp/img.$rand" /home/wparker/AV/dailywallpaper/
 rm -f "/tmp/img.$rand"
 rm -f /tmp/titled
# End of outer loop.
done
# Call root-tail to finish screen setup
root-tail -fn -misc-fixed-*-*-*-*-10 --wordwrap -g 1000x150+5+610 /var/log/acpid,darkred /var/log/messages,purple &
exit 1


stardotstar 07-20-2006 06:16 PM

OK by hacking around and talking to the windows developer I have made some progress:

I reset img early to newimg and use that to title the image with convert:

Code:

    # Strip the src= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
    img="$(echo ${l[$i]:4}|tr -d "\"")"
    newimg="${img}"

...

addTitleText=$(convert -fill white -draw 'text 100,100 '${newimg} /tmp/img.$rand /tmp/titled 2>&1)

...so now the titling of the picture is "image/0607/pictureoday.jpg" - perhaps I should be grepping the short image description from the web page...

konsolebox 07-20-2006 09:11 PM

Code:

#!/bin/bash
# getImg.sh 1.4 unspawn (www.linuxquestions.org) for stardotstar
# Purpose: grab web-based image from page (URI)
# License: GPLv2
# Args: 1: http://doma.in/HTML-rendered-page.ext
# Deps: Bash, GNU utils, wget, xloadimage
# Run from: manual or cron

# konsolebox mod 1.4.1
# please tell me if you find bugs

# i suggest you use some variables
#XLOADIMAGE=/home/stardotstar/wallpaper.png
#DEBUG=1 OR novalue

# sends error message and exit
error() { echo "$@" >&2; exit 1; }
# you'll find this helpful when debugging
#debug() { [ "$DEBUG" ] && echo "$@" >&2; }

# Checks for wget and xloadimage
for b in wget xloadimage; do
 if ! which $b >/dev/null 2>&1; then
  error "$b not found or not in PATH"
 fi
done

# Set a pretty repeat pattern while we wait for the image to load.
xloadimage -onroot /home/stardotstar/wallpaper.png

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"
# 2. If no arg given or protocol doesn't match that for teh intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 error "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting."
fi

# Grab file and dump on stdout and grep for HTML image tags and make HtTp: lowercase
declare -a imgs
declare -i imgcount=0
for a in $(wget -O - "$uri" 2>/dev/null | grep -oi "<img src=[^ ]*" | sed -e s/"<img src="// -e s/"http:\/\/"/"http:\/\/"/i); do
 imgs[$((++imgcount))]="$a"
done

if [ -z "${imgs[1]}" ]; then
 error "no image tag found from $uri"
fi

# Print the results
for ((a=1; a<=imgcount; a++)); do
 echo "[$a] ${imgs[a]}"
done

while read l; do
 # Pick the selected image
 img="${imgs[l]}"
 if [ -z "$img" ]; then
  error "invalid image number"
  continue
 fi
 
 # Correct wrong approach:
 if ! [ "${img:0:5}" = "http:" ]; then
  # 3. Get base
  base=$(dirname "$uri")
  # appends img to base with // cleared
  img="${base}/${img/*\/\/}"
 fi
 
 # Minor checks
 # 6. Bail out if img is empty
 if [ "${#img}" -lt "18" -o -z "${img}" ]; then
  error "${0##*/}: img URI too short or empty (\""${img}"\")."
 fi
 
 # Still a lame way to get a randomised string for temp file.
 # We can also do some checks like unused files if we like but not for now
 declare -i rand=0
 until ((rand++ )); [ ! -e "/tmp/img.${rand}" ]; do
  :
 done
 touch "/tmp/img.$rand" >/dev/null 2>&1 || error "we can't create a new temp file"
 
 # Grab image and output to temp file.
 wget -q "$img" -O "/tmp/img.$rand"
 
 # no modifications from here and not yet tested
 
 if [ -s "/tmp/img.$rand" ]; then
 # 7. If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "/tmp/img.$rand" 2>/dev/null | egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then
  # loadImgRes=$(xloadimage -onroot -at 5,30 -border black "/tmp/img.$rand" 2>&1)
  addTitleText=$(convert -fill white -draw 'text 100,100 "${img}"' /tmp/img.$rand /tmp/titled 2>&1)
  loadImgRes=$(xloadimage -onroot -center -border black -fullscreen "/tmp/titled" 2>&1)
  # 8. If xloadimage fails dump some info to stderr.
  if [ "$?" != "0" ]; then
    echo "${0##*/}: xloadimage failed with message \"${loadImgRes}\"."
    echo "${0##*/}: grabbed image was \"${img}\"."
    echo "${0##*/}: temp image stats:"
    stat "/tmp/img.$rand" 1>&2
    file "/tmp/img.$rand" 1>&2
  fi
  else
  # Issue warning for unsupported image
  echo "${0##*/}: grabbed image is unsupported by xloadimage." > /dev/stderr
  fi
 else
  # Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." > /dev/stderr
 fi
 
 # Remove the temporary image regardless.
 cp "/tmp/img.$rand" /home/wparker/AV/dailywallpaper/
 rm -f "/tmp/img.$rand"
 
 rm -f /tmp/titled
 # End of outer loop.
done

# Call root-tail to finish screen setup
root-tail -fn -misc-fixed-*-*-*-*-10 --wordwrap -g 1000x150+5+610 /var/log/acpid,darkred /var/log/messages,purple &
exit 1


konsolebox 07-20-2006 09:13 PM

sorry i immediately pressed the submit button without even making an intro. i just made some modifications to the script above. hope you'll find it useful. nice script by the way unspawn.
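On the "still a lame way" comment for the temp file in the listing above: where mktemp is available (it ships with GNU coreutils), it creates a unique file in one step. A rough sketch, with tmpimg as a made-up variable name:

```shell
#!/bin/sh
# Create a unique, empty temp file; mktemp replaces the XXXXXX part.
tmpimg=$(mktemp "${TMPDIR:-/tmp}/img.XXXXXX") || {
    echo "cannot create temp file" >&2
    exit 1
}

# Remove the file again when the script exits, whatever the reason.
trap 'rm -f "$tmpimg"' EXIT

printf '%s\n' "$tmpimg"
```

The wget call would then write to "$tmpimg" instead of /tmp/img.$rand.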

stardotstar 07-20-2006 11:40 PM

Hi Konsolebox;

there seem to be some problems with this version:

Code:

geko stardotstar # chmod a+x /usr/bin/getImg.sh
geko stardotstar # getImg.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html
/home/stardotstar/wallpaper.png is 14x14 PNG image, color type RGB, 8 bit
  Building XImage...done
[1] <IMG
[2] SRC="image/0607/PIA08576marsmeteorites45.jpg"
2
getImg.sh: grabbed image was empty.

invalid image number
geko stardotstar # getImg.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html
/home/stardotstar/wallpaper.png is 14x14 PNG image, color type RGB, 8 bit
  Building XImage...done
[1] <IMG
[2] SRC="image/0607/PIA08576marsmeteorites45.jpg"
1
getImg.sh: grabbed image was empty.

invalid image number
geko stardotstar # gvim /usr/bin/getImg.sh
(vim:1296): GnomeUI-WARNING **: While connecting to session manager:
Authentication Rejected, reason : None of the authentication protocols specified are supported and host-based authentication failed.
geko stardotstar # getImg.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html
/home/stardotstar/wallpaper.png is 14x14 PNG image, color type RGB, 8 bit
  Building XImage...done
[1] <IMG
[2] SRC="image/0607/PIA08576marsmeteorites45.jpg"

invalid image number
geko stardotstar # getImg.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html
/home/stardotstar/wallpaper.png is 14x14 PNG image, color type RGB, 8 bit
  Building XImage...done
[1] <IMG
[2] SRC="image/0607/PIA08576marsmeteorites45.jpg"
2
getImg.sh: grabbed image was empty.

invalid image number

I am inexperienced and so can't put my finger on it but it always presents an incorrect image selection and then either gets a zero size file or reports an invalid image number.
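The "[1] <IMG" / "[2] SRC=..." listing looks like word splitting: a for loop over command output breaks it at every blank, so a single match becomes two entries. A small sketch of the effect, using a sample string in place of the wget output:

```shell
#!/bin/sh
# One matched tag fragment, as grep -o would print it.
matches='<IMG SRC="image/0607/PIA08576marsmeteorites45.jpg"'

# for splits the unquoted text on blanks: two items for one match.
n=0
for a in $matches; do n=$((n + 1)); done
echo "for loop saw $n items"

# read -r takes a whole line at a time: one item per match.
m=0
while IFS= read -r a; do m=$((m + 1)); done <<EOF
$matches
EOF
echo "while read saw $m items"
```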

Although this could be useful, since there are applications that require selecting the correct image, in this case I don't want any interactivity and the page I am grabbing from has only one image. (Well, it has been known to have mouseovers, and the second one is desirable, but that is out of scope for now :) )

Thank you for contributing. I will continue to try to grok your additions and integrate them if I can ...

Further to the zero image size problem the jpg in the above instance is also downloading as zero size... and yet it can be grabbed happily from browser etc...

Code:

geko stardotstar # wget http://antwrp.gsfc.nasa.gov/apod/image/0607/PIA08576marsmeteorites45.jpg -O thisimage
--14:42:41--  http://antwrp.gsfc.nasa.gov/apod/image/0607/PIA08576marsmeteorites45.jpg
          => `thisimage'
Resolving antwrp.gsfc.nasa.gov... 128.183.17.121
Connecting to antwrp.gsfc.nasa.gov|128.183.17.121|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 199,174 (195K) [image/jpeg]

100%[====================================>] 199,174      53.22K/s    ETA 00:00

14:42:46 (47.33 KB/s) - `thisimage' saved [199174/199174]

geko stardotstar # xloadimage thisimage
thisimage is a 727x540 JPEG image, color space YCbCr, 3 comps, Huffman coding.
  Building XImage...done

I wonder if it is the // in this example:

Code:


geko stardotstar # sh -x getImg.sh http://www.google.com/ncr
+ for b in wget xloadimage
+ which wget
+ case "$?" in
+ for b in wget xloadimage
+ which xloadimage
+ case "$?" in
+ xloadimage -onroot /home/stardotstar/wallpaper.png
/home/stardotstar/wallpaper.png is 14x14 PNG image, color type RGB, 8 bit
  Building XImage...done
+ uri=http://www.google.com/ncr
+ '[' 1 '!=' 1 -o http:// '!=' http:// ']'
+ wget -O - http://www.google.com/ncr
+ tr '[A-Z]' '[a-z]'
+ grep -ie '\<img.*src='
+ head -1
+ sed -e 's/.*<img/<img/g' -e 's/><.*$/>/g'
+ read l
+ l=(${l})
++ seq 0 4
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo '<im'
+ grep -qie src
+ '[' 1 = 0 ']'
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo src
+ grep -qie src
+ '[' 0 = 0 ']'
++ echo '"/intl/en/images/logo.gif"'
++ tr -d '"'
+ img=/intl/en/images/logo.gif
+ newimg=/intl/en/images/logo.gif
+ break
++ dirname http://www.google.com/ncr
+ base=http://www.google.com
+ '[' 21 = 5 -a http://www.google.com = http: ']'
+ img=http://www.google.com//intl/en/images/logo.gif
+ '[' 46 -lt 18 -o -z http://www.google.com//intl/en/images/logo.gif ']'
++ date
++ sha1sum
++ cut -c 1-26
+ rand=cb0894b63afc2d1d663abe40fa
+ wget -q http://www.google.com//intl/en/images/logo.gif -O /tmp/img.cb0894b63afc2d1d663abe40fa
+ '[' -s /tmp/img.cb0894b63afc2d1d663abe40fa ']'
+ echo 'getImg.sh: grabbed image was empty.'
getImg.sh: grabbed image was empty.
+ cp /tmp/img.cb0894b63afc2d1d663abe40fa /home/wparker/AV/dailywallpaper/
+ rm -f /tmp/img.cb0894b63afc2d1d663abe40fa
+ rm -f /tmp/titled
+ read l
+ root-tail -fn '-misc-fixed-*-*-*-*-10' --wordwrap -g 1000x150+5+610 /var/log/acpid,darkred /var/log/messages,purple
+ exit 0

again it seems to be related to relative and explicit paths...
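Whether the doubled slash is really what breaks the download is not clear from the trace, but it can be avoided by treating the different kinds of src separately. The helper below is a hypothetical sketch (resolve_img is not part of getImg.sh); it handles http:// only, like the script's own argument check, and assumes the page URI has at least one slash after the host:

```shell
#!/bin/sh
# resolve_img PAGE_URI SRC -> prints an absolute image URL.
resolve_img() {
    page=$1
    src=$2
    case "$src" in
        http://*)   # already a full URL
            printf '%s\n' "$src" ;;
        //*)        # scheme-relative: only the protocol is missing
            printf 'http:%s\n' "$src" ;;
        /*)         # root-relative: keep only the host of the page
            rest=${page#http://}
            printf 'http://%s%s\n' "${rest%%/*}" "$src" ;;
        *)          # relative to the directory of the page
            printf '%s/%s\n' "${page%/*}" "$src" ;;
    esac
}

resolve_img http://antwrp.gsfc.nasa.gov/apod/astropix.html image/0607/PIA08576marsmeteorites45.jpg
resolve_img http://www.google.com/ncr /intl/en/images/logo.gif
```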


Will

konsolebox 07-21-2006 12:10 AM

sorry about that. this should fix the problem now

Code:

#!/bin/bash
# getImg.sh 1.4 unspawn (www.linuxquestions.org) for stardotstar
# Purpose: grab web-based image from page (URI)
# License: GPLv2
# Args: 1: http://doma.in/HTML-rendered-page.ext
# Deps: Bash, GNU utils, wget, xloadimage
# Run from: manual or cron

# konsolebox mod 1.4.2
# please tell me if you find bugs

# i suggest you use some variables
#XLOADIMAGE=/home/stardotstar/wallpaper.png
#DEBUG=1 OR novalue

# sends error message and exit
error() { echo "$@" >&2; exit 1; }
# you'll find this helpful when debugging
#debug() { [ "$DEBUG" ] && echo "$@" >&2; }

# Checks for wget and xloadimage
for b in wget xloadimage; do
 if ! which $b >/dev/null 2>&1; then
  error "$b not found or not in PATH"
 fi
done

# Set a pretty repeat pattern while we wait for the image to load.
xloadimage -onroot /home/stardotstar/wallpaper.png

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"
# 2. If no arg given or protocol doesn't match that for teh intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 error "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting."
fi

# Grab file and dump on stdout and grep for HTML image tags and make HtTp: lowercase
declare -a imgs
declare -i imgcount=0
for a in $(wget -O - "$uri" 2>/dev/null | grep -oi "<img src=[^ ]*" | sed -e s/"<img src="//i -e s/"http:\/\/"/"http:\/\/"/i); do
 eval "imgs[$((++imgcount))]=$a"
done

if [ -z "${imgs[1]}" ]; then
 error "no image tag found from $uri"
fi

# Print the results
for ((a=1; a<=imgcount; a++)); do
 echo "[$a] ${imgs[a]}"
done

while read l; do
 # Pick the selected image
 img="${imgs[l]}"
 if [ -z "$img" ]; then
  error "invalid image number"
  continue
 fi
 
 # Correct wrong approach:
 if ! [ "${img:0:5}" = "http:" ]; then
  # 3. Get base
  base=$(dirname "$uri")
  # appends img to base with // cleared
  img="${base}/${img/*\/\/}"
 fi
 
 # Minor checks
 # 6. Bail out if img is empty
 if [ "${#img}" -lt "18" -o -z "${img}" ]; then
  error "${0##*/}: img URI too short or empty (\""${img}"\")."
 fi
 
 # Still a lame way to get a randomised string for temp file.
 # We can also do some checks like unused files if we like but not for now
 declare -i rand=0
 until ((rand++ )); [ ! -e "/tmp/img.${rand}" ]; do
  :
 done
 touch "/tmp/img.$rand" >/dev/null 2>&1 || error "we can't create a new temp file"
 
 # Grab image and output to temp file.
 wget -q "$img" -O "/tmp/img.$rand"

 # no modifications from here and not yet tested
 
 if [ -s "/tmp/img.$rand" ]; then
 # 7. If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "/tmp/img.$rand" 2>/dev/null | egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then
  # loadImgRes=$(xloadimage -onroot -at 5,30 -border black "/tmp/img.$rand" 2>&1)
  addTitleText=$(convert -fill white -draw 'text 100,100 "${img}"' /tmp/img.$rand /tmp/titled 2>&1)
  loadImgRes=$(xloadimage -onroot -center -border black -fullscreen "/tmp/titled" 2>&1)
  # 8. If xloadimage fails dump some info to stderr.
  if [ "$?" != "0" ]; then
    echo "${0##*/}: xloadimage failed with message \"${loadImgRes}\"."
    echo "${0##*/}: grabbed image was \"${img}\"."
    echo "${0##*/}: temp image stats:"
    stat "/tmp/img.$rand" 1>&2
    file "/tmp/img.$rand" 1>&2
  fi
  else
  # Issue warning for unsupported image
  echo "${0##*/}: grabbed image is unsupported by xloadimage." > /dev/stderr
  fi
 else
  # Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." > /dev/stderr
 fi
 
 # Remove the temporary image regardless.
 cp "/tmp/img.$rand" /home/wparker/AV/dailywallpaper/
 rm -f "/tmp/img.$rand"
 
 rm -f /tmp/titled
 # End of outer loop.
done

# Call root-tail to finish screen setup
root-tail -fn -misc-fixed-*-*-*-*-10 --wordwrap -g 1000x150+5+610 /var/log/acpid,darkred /var/log/messages,purple &
exit 1

regards :)

stardotstar 07-21-2006 12:12 AM

Ahhhhh, big problem we have been having here is that some of the files that are not making it down have mixed lower and upper case in file name...

so:

Code:

# Grab file and dump on stdout, make all lowercase, grep for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.

#wget -O - "$uri" 2>/dev/null|tr [A-Z] [a-z]|grep -ie "\<img.*src=" 2>/dev/null|head -1\
#|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})

# Here I allowed the file name to preserve its case...
wget -O - "$uri" 2>/dev/null|grep -ie "\<img.*src=" 2>/dev/null|head -1\
|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})

and this worked fine for the image file that was complaining before.

:)

konsolebox 07-21-2006 12:29 AM

Good job there. :) Have you already tried my new mod? I've already tested the script and was able to download the image files properly. The reason the script didn't work before was the uppercase IMG and SRC. I also had to use eval to re-parse the assignment so that the double quotes are not included in the variable's value.
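As a possible alternative to eval (which would run whatever text the page happens to contain), both points can be handled with grep -i and parameter expansion. This is only a sketch on a sample line, assuming the src value is wrapped in double quotes as on the APOD page:

```shell
#!/bin/sh
# Sample line with upper-case tag and attribute; real input would come from wget.
line='<center><IMG SRC="image/0607/PIA08576marsmeteorites45.jpg" alt="x"></center>'

# -i matches SRC, src or Src; -o prints only the matched text,
# so the file name keeps its mixed case.
src=$(printf '%s\n' "$line" | grep -io 'src="[^"]*"' | head -n 1)

# Drop the attribute name and the surrounding double quotes with
# parameter expansion instead of eval.
src=${src#*=\"}
src=${src%\"}

printf '%s\n' "$src"
```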

But anyways. We already have two scripts here. So I hope you'll be successful building the rpm script. Best luck then. :)

Edit: batch script I mean

stardotstar 07-21-2006 01:41 AM

OK I am forging ahead with the title grabbing;

By modifying unspawn's greps and parsing I have managed to come up with a separate script that successfully looks for the first HTML <b> tag and head -1's that output from wget; then writes it to a text file which I try to strip the tags off...

Code:

#!/bin/bash

# Checks
# 0. wget, xloadimage
for b in wget xloadimage; do
 which $b >/dev/null 2>&1; case "$?" in
  0) ;;
  *) echo "${0##*/}: $b not found or not in PATH, exiting"\
      > /dev/stderr; exit 127;;
 esac
done

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"
# 2. If no arg given or protocol doesn't match that for the intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 echo "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting." > /dev/stderr
 exit 1
fi


# Grab file and dump on stdout, make all lowercase, grep for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.
#wget -O - "$uri" 2>/dev/null|tr [A-Z] [a-z]|grep -ie "\<img.*src=" 2>/dev/null|head -1\
#|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1 > /tmp/title.txt
cat /tmp/title.txt
 # for each element in the array check if it matches the HTML image "src" tag,
 # first found match (break) gets stuffed in the "img" variable.
cat /tmp/title.txt|tr -d "<b>/" > /tmp/title.txt
cat /tmp/title.txt
    # Strip the src= part.
    #title="$(echo ${l[$i]:4}|tr -d "\"")"
    #title="$(echo ${l[$i]:4})"
    break
echo "$title" > /dev/stderr
exit 0

so what I get when I run it is *nearly* the title of my picture:

Code:

geko stardotstar # sh -x ./greptitle.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ for b in wget xloadimage
+ which wget
+ case "$?" in
+ for b in wget xloadimage
+ which xloadimage
+ case "$?" in
+ uri=http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ '[' 1 '!=' 1 -o http:// '!=' http:// ']'
+ wget -O - http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ grep -ie '<b>'
+ head -1
+ cat /tmp/title.txt
<b> Strangers on Mars </b> <br>
+ cat /tmp/title.txt
+ tr -d '<b>/'
+ cat /tmp/title.txt
 Strangers on Mars  r
+ break
+ echo ''

+ exit 0

You can see that the

<b> Strangers on Mars </b> <br>

almost gets properly formed - removing <b>, </b> and <br> using tr (badly :lol: ),
but I am left with the r, and if I delete that it ruins the text.

What I need is a way now of stripping leading white space and *all* the tags after the first <
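For what it's worth, the stray r comes from tr -d deleting single characters rather than whole tags. A sketch of the difference on the sample line, with sed removing complete tags and then trimming the blanks:

```shell
#!/bin/sh
line='<b> Strangers on Mars </b> <br>'

# tr -d deletes every listed character wherever it occurs, so any b in
# the title would vanish too, and the r of <br> is left behind.
printf '%s\n' "$line" | tr -d '<b>/'

# sed can remove whole tags instead, then trim blanks at both ends.
title=$(printf '%s\n' "$line" |
    sed -e 's/<[^>]*>//g' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
printf '%s\n' "$title"
```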

I will keep hacking at it but I would appreciate some gentle nudges in the right direction.

Because our team formats the intranet page that displays the pic of the day (actually based on a very similar layout to that astronomy page - because it is so simple - image, title and description) I am using the astronomy page as a reference - because it can be tested naturally.

Will

stardotstar 07-21-2006 01:56 AM

OK I got my title in a text file with an evil use of sed:

Code:

cat /tmp/title.txt|tr -d "<b>/" > /tmp/title.txt
cat /tmp/title.txt|sed 's/^[ \t]*//;s/[ \t]*$//' > /tmp/title.txt
cat /tmp/title.txt|sed 's/r*$//' > /tmp/title.txt
cat /tmp/title.txt

now I will try to get it to work as the title in the image script...
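A word of caution on the pattern above: reading a file and redirecting into the same file in one pipeline is fragile, because the shell may truncate the output before cat has read it. A safer sketch writes to a second file and then renames it (the tmp name and the sample text are made up):

```shell
#!/bin/sh
# Stand-in for /tmp/title.txt with some sample content.
tmp=$(mktemp "${TMPDIR:-/tmp}/title.XXXXXX") || exit 1
printf '%s\n' '  Strangers on Mars  ' > "$tmp"

# Write the filtered text to a new file, then replace the original.
sed -e 's/^[ ]*//' -e 's/[ ]*$//' "$tmp" > "$tmp.new" && mv "$tmp.new" "$tmp"

title=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$title"
```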

konsolebox 07-21-2006 02:00 AM

hey you can try this one:

echo "<b> Strangers on Mars </b> <br" | grep -o "[[:alpha:]][ a-zA-Z]\+[[:alpha:]]"

Edit: or this one. this is the most perfect i can suggest
echo "<b> Strangers on Mars </b> <br" | sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i

stardotstar 07-21-2006 06:28 AM

Quote:

Originally Posted by konsolebox
hey you can try this one:

echo "<b> Strangers on Mars </b> <br" | grep -o "[[:alpha:]][ a-zA-Z]\+[[:alpha:]]"

Edit: or this one. this is the most perfect i can suggest
echo "<b> Strangers on Mars </b> <br" | sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i

Thank you konsolebox, I appreciate your help and will work with your solutions after I sleep:)

In the mean time I am really proud to display my (clunky and primitive) working script - I have hacked the original around and learnt heaps and managed to title the image when it is run against this page:

Code:

#!/bin/bash
# getImg.sh 1.4 unspawn (www.linuxquestions.org) for stardotstar
# expanded on by stardotstar 1.4.1 2getImg.sh
# Purpose: grab web-based image from page (URI)
# License: GPLv2
# Args: 1: http://doma.in/HTML-rendered-page.ext
# Deps: Bash, GNU utils, wget, xloadimage
# Run from: manual or cron

# Checks
# 0. wget, xloadimage
for b in wget xloadimage; do
 which $b >/dev/null 2>&1; case "$?" in
  0) ;;
  *) echo "${0##*/}: $b not found or not in PATH, exiting"\
      > /dev/stderr; exit 127;;
 esac
done

# Set a pretty repeat pattern while we wait for the image to load.

xloadimage -onroot /home/stardotstar/wallpaper.png

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"
# 2. If no arg given or protocol doesn't match that for the intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 echo "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting." > /dev/stderr
 exit 1
fi

# Grab file and dump on stdout, make all lowercase, grep for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.

#wget -O - "$uri" 2>/dev/null|tr [A-Z] [a-z]|grep -ie "\<img.*src=" 2>/dev/null|head -1\
#|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})

# Here I retain case for filename
wget -O - "$uri" 2>/dev/null|grep -ie "\<img.*src=" 2>/dev/null|head -1\
|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})
 
 # for each element in the array check if it matches the HTML image "src" tag,
 # first found match (break) gets stuffed in the "img" variable.
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:3}"|grep -qie src
  if [ "$?" = "0" ]; then
    # Strip the src= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
    img="$(echo ${l[$i]:4}|tr -d "\"")"
    newimg="${img}"
break
fi
done
# grep and write the title to a text file and then sed out the unwanted stuff
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1 > /tmp/title.txt
cat /tmp/title.txt|tr -d "<b>/" > /tmp/title.txt
cat /tmp/title.txt|sed 's/^[ \t]*//;s/[ \t]*$//' > /tmp/title.txt
cat /tmp/title.txt|sed 's/r*$//' > /tmp/title.txt
cat /tmp/title.txt|sed 's/[ \t]*$//' > /tmp/title.txt
title=`< /tmp/title.txt`
# Correct wrong approach:
 # 3. Get base
 base=$(dirname "$uri")
 # 4. Correct ifempty
 if [ "${#base}" = "5" -a "${base}" = "http:" ]; then
  base=EMPTY
  # 5. Follows whole URI is contained in img tag
  if [ "${img:0:2}" = "//" ]; then
  img="http://${img:2}"
  fi
 else
  img="${base}/${img}"
 fi
 # Minor checks
 # 6. Bail out if img is empty
 if [ "${#img}" -lt "18" -o -z "${img}" ]; then
  echo "${0##*/}: img URI too short or empty (\""${img}"\")." > /dev/stderr
  exit 1
 fi
 # Lame way to get a randomised string for temp file.
 rand=$(date|sha1sum|cut -c 1-26)
 # Grab image and output to temp file.
 wget -q "$img" -O "/tmp/img.$rand"
 if [ -s "/tmp/img.$rand" ]; then
 # 7. If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "/tmp/img.$rand" 2>/dev/null|egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then
#    loadImgRes=$(xloadimage -onroot -at 5,30 -border black "/tmp/img.$rand" 2>&1)
        addTitleText=$(convert -fill white -pointsize 24 -draw "text 300,20 '$title'" /tmp/img.$rand /tmp/titled 2>&1)
        loadImgRes=$(xloadimage -onroot -center -border black -fullscreen "/tmp/titled" 2>&1)
    # 8. If xloadimage fails dump some info to stderr.
    if [ "$?" != "0" ]; then
      echo "${0##*/}: xloadimage failed with message \"${loadImgRes}\"."
      echo "${0##*/}: grabbed image was \"${img}\"."
      echo "${0##*/}: temp image stats:"
      stat "/tmp/img.$rand" 1>&2
      file "/tmp/img.$rand" 1>&2
    fi
  else
    # Issue warning for unsupported image
    echo "${0##*/}: grabbed image is unsupported by xloadimage." > /dev/stderr
  fi
 else
 # Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." > /dev/stderr
 fi
 # Remove the temporary image regardless.
 cp "/tmp/titled" "/home/wparker/AV/dailywallpaper/img.$rand"
 rm -f "/tmp/img.$rand"
 rm -f /tmp/titled
 rm -f /tmp/title.txt
# End of outer loop.
done
# Call root-tail to finish screen setup
root-tail -fn -misc-fixed-*-*-*-*-10 --wordwrap -g 1000x150+5+610 /var/log/acpid,darkred /var/log/messages,purple &
exit 1

This works well, although doesn't have quite the smarts it will need to place the text properly etc - but at least all features work. I will apply all suggested optimisations over the next few days or so and truly appreciate all the help and suggestions. I have really been bitten by bashing - I must find a good book or source to teach me the fundamentals!

Will

konsolebox 07-21-2006 06:42 AM

you're welcome and thank you too :)

if you want to learn more about bash i suggest:

The Bash Beginner's Guide http://www.tldp.org/LDP/Bash-Beginne...tml/index.html

or the Advanced Bash-Scripting Guide
http://www.tldp.org/LDP/abs/html/

that should help you a lot

stardotstar 07-21-2006 11:25 PM

BTW konsolebox, your sed and grep worked beautifully (I settled on the sed) - thank you for the optimisation. I have also made the temp tile a variable and used your debug section.

Just got to work out more about convert to centre the text and get it all nice and tidy.

I have written date onto the title too - seemed good for a pic'o'day

Code:

# set the date of the picoday
date +%D > /tmp/date.txt
date=`< /tmp/date.txt`
rm -f /tmp/date.txt
...
        addTitleText=$(convert -fill white -pointsize 18 -draw "text 50 50 '$date $title'" /tmp/img.$rand /tmp/titled 2>&1)
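
The date could probably also be captured without the temp file by using command substitution directly. A small sketch; today and stamp are made-up variable names:

```shell
#!/bin/sh
# Capture the command's output directly; no temp file needed.
today=$(date +%D)        # mm/dd/yy, the same format as date +%D above
stamp=$(date +%d%m%y)    # slash-free variant, handy for file names

printf '%s\n' "$today"
printf '%s\n' "$stamp"
```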

Will

stardotstar 07-23-2006 04:30 AM

OK guys, for the sake of completeness and to show off how proud I am of my finished product and thank you both for your help here is the script.

It is still a total mess from the POV of anyone with experience in programming and I can see that but I am happy that the final result works well and reliably for me for my application.

Code:

geko dailywallpaper # cat /usr/bin/2getImg.sh
#!/bin/bash
# 2getImg.sh 2.0 stardotstar and unspawn (www.linuxquestions.org)
# adapted by stardotstar from an original script by unspawn getImg.sh 1.4.1
# Purpose: grab web-based image from page (URI) and title
# License: GPLv2
# Args: 1: http://doma.in/HTML-rendered-page.ext
# Deps: Bash, GNU utils, wget, xloadimage, convert
# Run from: manual or cron

# Checks
# 0. wget, xloadimage, convert
for b in convert wget xloadimage; do
 which $b >/dev/null 2>&1; case "$?" in
  0) ;;
  *) echo "${0##*/}: $b not found or not in PATH, exiting"\
      > /dev/stderr; exit 127;;
 esac
done

DEBUG=1  # upper case, because debug() below tests "$DEBUG"

# sends error message and exit - contributed by konsolebox
error() { echo "$@" >&2; exit 1; }
# you'll find this helpful when debugging
debug() { [ "$DEBUG" ] && echo "$@" >&2; }

# Set a pretty repeat pattern while we wait for the image to load.
temptile=/home/stardotstar/wallpaper.png
xloadimage -onroot $temptile

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"
# 2. If no arg given or protocol doesn't match that for the intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 echo "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting." > /dev/stderr
 exit 1
fi

# Grab file and dump on stdout, grep (case-insensitively, keeping the original case) for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.
wget -O - "$uri" 2>/dev/null|grep -ie "\<img.*src=" 2>/dev/null|head -1\
|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})

 # for each element in the array check if it matches the HTML image "src" tag,
 # first found match (break) gets stuffed in the "img" variable.
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:3}"|grep -qie src
  if [ "$?" = "0" ]; then
    # Strip the src= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
    img="$(echo ${l[$i]:4}|tr -d "\"")"
    newimg="${img}"
break
fi
done

# grep and write the title to a text file and then sed out the unwanted stuff - this is really clunky - almost to the point of being pure evil ;)
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1 > /tmp/title.txt
cat /tmp/title.txt|sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i > /tmp/title.txt
title=`< /tmp/title.txt`

# set the date of the picoday
date +%d/%m/%y > /tmp/date.txt
date=`< /tmp/date.txt`

# Correct wrong approach:
# 3. Get base
 base=$(dirname "$uri")

# 4. Correct ifempty
 if [ "${#base}" = "5" -a "${base}" = "http:" ]; then
  base=EMPTY
  # 5. Follows whole URI is contained in img tag
  if [ "${img:0:2}" = "//" ]; then
  img="http://${img:2}"
  fi
 else
  img="${base}/${img}"
 fi

# Minor checks
# 6. Bail out if img is empty
 if [ "${#img}" -lt "18" -o -z "${img}" ]; then
  echo "${0##*/}: img URI too short or empty (\""${img}"\")." > /dev/stderr
  exit 1
 fi

# Lame way to get a randomised string for temp file.
 rand=$(date|sha1sum|cut -c 1-26)

# Grab image and output to temp file.
 wget -q "$img" -O "/tmp/img.$rand"
 if [ -s "/tmp/img.$rand" ]; then

# 7. If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "/tmp/img.$rand" 2>/dev/null|egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then

# make a white canvas for the title block and append it to the grabbed pic
convert xc:white -resize 1x18! /tmp/blank.ppm 2>&1
convert -append /tmp/blank.ppm /tmp/img.$rand /tmp/intermediate_file.jpg 2>&1

# add the title text centered with the date, write as a jpg to maintain quality and xloadimage to root window
#addTitleText=$(convert -fill black -font Dragonwick-Regular -gravity "North Center" -pointsize 14 -draw "text 0 0 'Astronomy Picture of the day for $date \"$title\"'" /tmp/intermediate_file.jpg /tmp/titled.jpg 2>&1)
addTitleText=$(convert -fill black -font Verdana-Regular -gravity "North Center" -pointsize 12 -draw "text 0 0 'Astronomy Picture of the day for $date \"$title\"'" /tmp/intermediate_file.jpg /tmp/titled.jpg 2>&1)

loadImgRes=$(xloadimage -onroot -center -border black -fullscreen "/tmp/titled.jpg" 2>&1)

# 8. If xloadimage fails dump some info to stderr.
    if [ "$?" != "0" ]; then
      echo "${0##*/}: xloadimage failed with message \"${loadImgRes}\"."
      echo "${0##*/}: grabbed image was \"${img}\"."
      echo "${0##*/}: temp image stats:"
      stat "/tmp/img.$rand" 1>&2
      file "/tmp/img.$rand" 1>&2
    fi
  else

# Issue warning for unsupported image
    echo "${0##*/}: grabbed image is unsupported by xloadimage." > /dev/stderr
  fi
 else

# Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." > /dev/stderr
 fi

# Remove the temporary images and text files regardless but copy titled file to wallpaper directory for screensaver with date and title with whitespace and irregular characters removed.
 cat /tmp/title.txt|tr " :\/" "_" > /tmp/title.txt
 cat /tmp/date.txt|tr -d "/" > /tmp/date.txt
 title=`< /tmp/title.txt`
 date=`< /tmp/date.txt`
 name="$title-$date.jpg"
 cp "/tmp/titled.jpg" "/home/wparker/AV/wallpapers/dailywallpaper/$name"
 rm -f "/tmp/img.$rand"
 rm -f /tmp/titled.jpg
 rm -f /tmp/title.txt
 rm -f /tmp/date.txt
 rm -f /tmp/blank.ppm
 rm -f /tmp/intermediate_file.jpg
# End of outer loop.
done
exit 0

It does the titling with ImageMagick, so I have added convert to the dependency checks, and I have tried to use variables where I had previously hard-coded values. I also renamed the final file: when the cleanup happens it is copied to another directory under a name built from the title and the date. I also wanted a fixed name because I will cron this job to run, and if I am not on the network I don't want the program writing a new file to the wallpapers directory each time it succeeds. This way the name will always be the same, as long as the day is the same when the cron job runs and the image has not changed.
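
The naming step can also be done without the temporary text files. A minimal sketch, assuming a hypothetical helper called `wallpaper_name` (not part of the script above) that takes the title and an already-formatted date as arguments:

```shell
#!/bin/bash
# wallpaper_name is a hypothetical helper, not part of the posted script.
# It squeezes every run of characters outside A-Z, a-z, 0-9, '.', '_' and '-'
# into a single underscore, then appends the date and a .jpg suffix.
wallpaper_name() {
  local title="$1" stamp="$2"
  local safe
  safe=$(printf '%s' "$title" | tr -cs 'A-Za-z0-9._-' '_')
  printf '%s-%s.jpg\n' "$safe" "$stamp"
}

# Example: date +%d%m%y prints day, month and year without slashes,
# so no separate step is needed to strip them.
wallpaper_name "Moon: Rise/Set" "$(date +%d%m%y)"
```

Because quotes and apostrophes are outside the allowed set, they end up as underscores too, which keeps odd titles from producing odd file names.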

Thanks for all your help - especially unSpawn - I am definitely on the way to knowing how to research my bash needs myself - this is what Linux Questions is all about.

PS unSpawn: I hope you don't mind my changing the comments - I still credit you for the original script, but I am so stoked on the outcome and the time I have spent getting it just right that I wanted to claim the latest version for myself - I could use a short lecture on the etiquette of this if you don't mind - because if I am out of line from an open source POV I would like to know! :) Thanks again. I am indebted to you.
Also, I use convert to rewrite the file as a jpg, and this seems not only to improve the quality of the display when using xloadimage but may also mean we can use other image formats that are not natively supported??

Will

konsolebox 07-23-2006 04:42 AM

that's so nice to hear from you.
hey i wonder why unspawn's no longer making a reply. i hope he's not mad at me for modifying his code.

anyway congratulations dude. nice work there. :)

stardotstar 07-24-2006 04:09 AM

OK in an effort to get the desktop in Gnome/Nautilus back I dug around in the gconftool-2 and the graphic configuration editor.

So if anyone stumbles on this some time in the future and wants to run this as a native Gnome background I have added these lines instead of using xloadimage:

Code:

gconftool-2 --type string  --set /desktop/gnome/background/color_shading_type "vertical-gradient"
gconftool-2 --type string  --set /desktop/gnome/background/picture_filename /tmp/titled.jpg
gconftool-2 --type string  --set /desktop/gnome/background/picture_options "centered"
gconftool-2 --type string  --set /desktop/gnome/background/primary_color "#0B0685"
gconftool-2 --type string  --set /desktop/gnome/background/secondary_color "#000000"

This basically sets the necessary keys in the GNOME desktop wallpaper configuration and puts a vertical gradient from black to dark blue behind the image.
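
The five calls differ only in the key name and the value, so they can be wrapped in a small function. A minimal sketch, assuming a hypothetical helper `set_bg_key` and a hypothetical `GCONF` variable that lets the calls be dry-run with `echo` before touching the desktop:

```shell
#!/bin/bash
# GCONF is a hypothetical override so the helper can be dry-run with "echo".
GCONF="${GCONF:-gconftool-2}"

# set_bg_key is a hypothetical helper: write one string key below
# /desktop/gnome/background using the same gconftool-2 options as above.
set_bg_key() {
  "$GCONF" --type string --set "/desktop/gnome/background/$1" "$2"
}

# Dry run: print the arguments instead of changing the desktop.
GCONF=echo
set_bg_key picture_filename /tmp/titled.jpg
set_bg_key picture_options centered
```

Dropping the `GCONF=echo` line makes the same calls go to gconftool-2 for real.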

I have run into a major show stopper at this point for three reasons:

1) gnome config editing seems to be more magic, smoke, mirrors and here's hoping (what with the gnome settings daemon restarting and erroring all the time on restarting sessions :roll: ) so I can't confirm that the keys are perfect - for example:
the configuration editor makes the following comment about setting the picture options:
Quote:

Originally Posted by /desktop/gnome/background/picture_options 'long description'
Determines how the image set by wallpaper_filename is rendered. Possible values are "none", "wallpaper", "centered", "scaled", "stretched".

Notice that the comment refers to wallpaper_filename but the only valid existing key is picture_filename

Quote:

/desktop/gnome/background/picture_filename
File to use for the background image
anyway that may be a simple matter of interpretation but worth noting.

2) I found that the cron system can't open the display - so this script is not going to cut it when using xloadimage from a crontab call....

3) This appears to be fixed by using the gnome xml calls and punching the values in that way but for some reason I can't get it to work yet...

Code:

stardotstar@geko ~ $ cat /etc/crontab
# for vixie cron
#
# $Header: /var/cvsroot/gentoo-x86/sys-process/vixie-cron/files/crontab-3.0.1-r4,v 1.1 2005/03/04 23:59:48 ciaranm Exp $
#
#

# Global variables
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# check scripts in cron.hourly, cron.daily, cron.weekly and cron.monthly
0  *  * * *    root    rm -f /var/spool/cron/lastrun/cron.hourly
1  3  * * *    root    rm -f /var/spool/cron/lastrun/cron.daily
15 4  * * 6    root    rm -f /var/spool/cron/lastrun/cron.weekly
30 5  1 * *    root    rm -f /var/spool/cron/lastrun/cron.monthly
*/10  *  * * *  root    test -x /usr/sbin/run-crons && /usr/sbin/run-crons
4 15 * * *      root    /usr/bin/2getImg.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html
27 17 * * *    stardotstar sh -x /usr/bin/2getImg.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html > /var/log/getimg.log

This is still problematic when run as my user or as root, and it outputs nothing (or at best nothing useful).
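
Two things may explain the silence, offered as guesses rather than a diagnosis. `sh -x` writes its trace to stderr, so `> file` on its own captures only stdout; adding `2>&1` after it sends both streams to the log. And cron starts jobs without the desktop session's environment, so X clients such as xloadimage have no `DISPLAY` to connect to. A minimal sketch, assuming the session runs on display `:0` and using a hypothetical helper named `setup_display`:

```shell
#!/bin/bash
# setup_display is a hypothetical helper. It keeps DISPLAY if the caller
# already set one and otherwise falls back to :0, which is an assumption;
# run "echo $DISPLAY" in a terminal inside the session to check yours.
setup_display() {
  : "${DISPLAY:=:0}"
  export DISPLAY
}

setup_display
echo "using display $DISPLAY"

# Crontab entry that logs stdout and the -x trace (stderr) to one file:
# 27 17 * * * stardotstar sh -x /usr/bin/2getImg.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html > /tmp/getimg.log 2>&1
```

The log path also has to be writable by the user the job runs as. Since this crontab sets `HOME=/`, an X client may additionally need `XAUTHORITY` pointed at the session user's `.Xauthority` file; that part is an assumption about the setup and is not tested here.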

Also, by writing things this way we no longer need a random string - or it can be completely revised - and better checks are needed to prevent weird filenames being generated for the wallpaper directory (for use by the screensaver), so I am inclined to just get it running and then, when I have more skills, do a complete rewrite so that it works better this way.

I prefer to use the system this way - writing to the gnome desktop - because I can get my desktop icons and possibly call it from cron...

thoughts and contributions naturally appreciated but if this thread turns into a personal blog then that is just as cool :)

Will

stardotstar 07-24-2006 09:41 PM

OK here is the current Gnome Version - still messy but does a good job when incorporated with the cron call I have also copied:

Code:

stardotstar@geko ~ $ cat /usr/bin/2getImg.sh
#!/bin/bash
# 2getImg.sh 2.0 stardotstar and unspawn (www.linuxquestions.org)
# adapted by stardotstar from an original script by unspawn getImg.sh 1.4.1
# Purpose: grab web-based image from page (URI) and title
# License: GPLv2
# Args: 1: http://doma.in/HTML-rendered-page.ext
# Deps: Bash, GNU utils, wget, xloadimage, convert
# Run from: manual or cron

# Checks
# 0. wget, xloadimage, convert
for b in convert wget xloadimage; do
 which $b >/dev/null 2>&1; case "$?" in
  0) ;;
  *) echo "${0##*/}: $b not found or not in PATH, exiting"\
      > /dev/stderr; exit 127;;
 esac
done

DEBUG=1 # upper case, because debug() below tests "$DEBUG"

# sends error message and exit - contributed by konsolebox
error() { echo "$@" >&2; exit 1; }
# you'll find this helpful when debugging
debug() { [ "$DEBUG" ] && echo "$@" >&2; }

# Set a pretty repeat pattern while we wait for the image to load.
temptile=/home/stardotstar/wallpaper.png
#xloadimage -onroot $temptile
rm -f /tmp/titled.jpg
gconftool-2 --type string  --set /desktop/gnome/background/color_shading_type "solid"
gconftool-2 --type string  --set /desktop/gnome/background/picture_filename $temptile
gconftool-2 --type string  --set /desktop/gnome/background/picture_options "wallpaper"
gconftool-2 --type string  --set /desktop/gnome/background/primary_color "#000000"
gconftool-2 --type string  --set /desktop/gnome/background/secondary_color "#000000"

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"

# 2. If no arg given or protocol doesn't match that for the intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 echo "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting." > /dev/stderr
 exit 1
fi

# Grab file and dump on stdout, make all lowercase, grep for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.
wget -O - "$uri" 2>/dev/null|grep -ie "\<img.*src=" 2>/dev/null|head -1\
|sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})

 # for each element in the array check if it matches the HTML image "src" tag,
 # first found match (break) gets stuffed in the "img" variable.
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:3}"|grep -qie src
  if [ "$?" = "0" ]; then
    # Strip the src= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
    img="$(echo ${l[$i]:4}|tr -d "\"")"
    newimg="${img}"
break
fi
done

# grep and write the title to a text file and then sed out the unwanted stuff - this is really clunky - almost to the point of being pure evil ;)
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1 > /tmp/title.txt
cat /tmp/title.txt|sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i > /tmp/title.txt
title=`< /tmp/title.txt`

# set the date of the picoday
date +%d/%m/%y > /tmp/date.txt
date=`< /tmp/date.txt`

# Correct wrong approach:
# 3. Get base
 base=$(dirname "$uri")

# 4. Correct ifempty
 if [ "${#base}" = "5" -a "${base}" = "http:" ]; then
  base=EMPTY
  # 5. Follows whole URI is contained in img tag
  if [ "${img:0:2}" = "//" ]; then
  img="http://${img:2}"
  fi
 else
  img="${base}/${img}"
 fi

# Minor checks
# 6. Bail out if img is empty
 if [ "${#img}" -lt "18" -o -z "${img}" ]; then
  echo "${0##*/}: img URI too short or empty (\""${img}"\")." > /dev/stderr
  exit 1
 fi

# Lame way to get a randomised string for temp file.
 rand=$(date|sha1sum|cut -c 1-26)

# Grab image and output to temp file.
 wget -q "$img" -O "/tmp/img.$rand"
 if [ -s "/tmp/img.$rand" ]; then

# 7. If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "/tmp/img.$rand" 2>/dev/null|egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then

# make a white canvas for the title block and append it to the grabbed pic
convert xc:white -resize 1x18! /tmp/blank.ppm 2>&1
convert -append /tmp/blank.ppm /tmp/img.$rand /tmp/intermediate_file.jpg 2>&1

# add the title text centered with the date, write as a jpg to maintain quality and xloadimage to root window
#addTitleText=$(convert -fill black -font Dragonwick-Regular -gravity "North Center" -pointsize 14 -draw "text 0 0 'Astronomy Picture of the day for $date \"$title\"'" /tmp/intermediate_file.jpg /tmp/titled.jpg 2>&1)
addTitleText=$(convert -fill black -font Verdana-Regular -gravity "North Center" -pointsize 12 -draw "text 0 0 'Astronomy Picture of the day for $date \"$title\"'" /tmp/intermediate_file.jpg /tmp/titled.jpg 2>&1)

#loadImgRes=$(xloadimage -onroot -center -border black "/tmp/titled.jpg" 2>&1)
#gconftool-2 --type string  --set /desktop/gnome/background/wallpaper_filename /tmp/titled.jpg
gconftool-2 --type string  --set /desktop/gnome/background/color_shading_type "vertical-gradient"
gconftool-2 --type string  --set /desktop/gnome/background/picture_filename /tmp/titled.jpg
gconftool-2 --type string  --set /desktop/gnome/background/picture_options "centered"
gconftool-2 --type string  --set /desktop/gnome/background/primary_color "#0B0685"
gconftool-2 --type string  --set /desktop/gnome/background/secondary_color "#000000"


# 8. If xloadimage fails dump some info to stderr.
    if [ "$?" != "0" ]; then
      echo "${0##*/}: xloadimage failed with message \"${loadImgRes}\"."
      echo "${0##*/}: grabbed image was \"${img}\"."
      echo "${0##*/}: temp image stats:"
      stat "/tmp/img.$rand" 1>&2
      file "/tmp/img.$rand" 1>&2
    fi
  else

# Issue warning for unsupported image
    echo "${0##*/}: grabbed image is unsupported by xloadimage." > /dev/stderr
  fi
 else

# Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." > /dev/stderr
 fi

# Remove the temporary images and text files regardless but copy titled file to wallpaper directory for screensaver with date and title with whitespace and irregular characters removed.
 cat /tmp/title.txt|tr " :\/" "_" > /tmp/title.txt
 cat /tmp/date.txt|tr -d "/" > /tmp/date.txt
 title=`< /tmp/title.txt`
 date=`< /tmp/date.txt`
 name="$title-$date.jpg"
 cp "/tmp/titled.jpg" "/home/wparker/AV/wallpapers/dailywallpaper/$name"
 rm -f "/tmp/img.$rand"
 #rm -f /tmp/titled.jpg
 rm -f /tmp/title.txt
 rm -f /tmp/date.txt
 rm -f /tmp/blank.ppm
 rm -f /tmp/intermediate_file.jpg
# End of outer loop.
 done
exit 0

Yes, there is a lot of redundant stuff in there now, but it does work, and as long as it is called from cron as your gnome user it should update the background automatically without intervention. This method also keeps all the nautilus icons and desktop behaviours such as transparency in terminals.

Code:

39 22 * * *    stardotstar    sh -x /usr/bin/2getImg.sh http://antwrp.gsfc.nasa.gov/apod/astropix.html > /tmp/getimg.log
:)

stardotstar 07-25-2006 07:11 AM

Yesterday's and today's pics on the astronomy site showed that the image on the main page is not going to cut it, so I set about grabbing the larger image. I also sorted out the problem with ' in the titles and in the script lines and file names built from them.

At konsolebox's suggestion I made many things variables - though imperfectly for sure.

I couldn't get my if/then/else block to work with a flag set for big=0/1 so that you could specify whether you want the big image or the one from the main page... I have some work and research to do :)
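
On the `big` flag: `if $big=0; then` expands to `if 1=0; then`, which asks the shell to run a command named `1=0` rather than compare anything. A test needs `[ ... ]` with spaces around the operator. A minimal sketch, using a hypothetical helper `image_pattern` and simplified versions of the grep patterns from the script:

```shell
#!/bin/bash
big=1   # 1 = follow the link to the large image, 0 = use the inline <img>

# image_pattern is a hypothetical helper: print the grep pattern
# that matches the tag wanted for the chosen mode.
image_pattern() {
  if [ "$1" -eq 1 ]; then
    echo '<a.*href="image'
  else
    echo '<img.*src='
  fi
}

image_pattern "$big"
```

The result could then be passed to grep, for example `grep -i -e "$(image_pattern "$big")"`, so that only one copy of the download pipeline is needed.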

anyway it works pretty well and as a cron job too so here is the current and relatively finished gnome version:

Code:

cat /usr/X11R6/bin/2getImg.sh
#!/bin/bash
# 2getImg.sh 2.0 stardotstar and unspawn (www.linuxquestions.org)
# adapted by stardotstar from an original script by unspawn getImg.sh 1.4.1
# Purpose: grab web-based image from page (URI) and title
# License: GPLv2
# Args: 1: http://doma.in/HTML-rendered-page.ext
# Deps: Bash, GNU utils, wget, gconftool-2, convert
# Run from: manual or cron

# Checks
# 0. wget, convert
for b in convert wget; do
 which $b >/dev/null 2>&1; case "$?" in
  0) ;;
  *) echo "${0##*/}: $b not found or not in PATH, exiting"\
      > /dev/stderr; exit 127;;
 esac
done

# Set some variables:

debug=1
big=1 # if true the program will grab the larger version of the image otherwise only the one off the page itself
ibcst="solid" # initial background color shading type
ibpf=/home/stardotstar/wallpaper.png # tile to load as temp background
ibpo="wallpaper" # initial temp background image style/option
ibpc="#000000" # initial background primary color
ibsc="#000000" # initial background secondary color - I am using black but it would be useful to change these if we want a gradient and centered image that says please wait or something
tmpdir=/tmp # specify the temporary directory

bcst="vertical-gradient" # background color shading type
bpo="scaled" # background image style/option
bpc="#0B0685" # background primary color
bsc="#000000" # background secondary color - I am using black but it would be useful to change these if we want a gradient and centered image that says please wait or something
slideshwdir=/home/stardotstar/AV/wallpapers/dailywallpaper # specify the output directory for the renamed copies of the images

# remove yesterday's temporary image
rm -f $tmpdir/apod.jpg

# sends error message and exit - contributed by konsolebox
error() { echo "$@" >&2; exit 1; }
# you'll find this helpful when debugging
#debug() { [ "$DEBUG" ] && echo "$@" >&2; }

# Set a pretty background while we wait for the image to load.

gconftool-2 --type string  --set /desktop/gnome/background/color_shading_type "$ibcst"
gconftool-2 --type string  --set /desktop/gnome/background/picture_filename $ibpf
gconftool-2 --type string  --set /desktop/gnome/background/picture_options "$ibpo"
gconftool-2 --type string  --set /desktop/gnome/background/primary_color "$ibpc"
gconftool-2 --type string  --set /desktop/gnome/background/secondary_color "$ibsc"

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"

# 2. If no arg given or protocol doesn't match that for the intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 echo "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting." > /dev/stderr
 exit 1
fi

# Grab file and dump on stdout, grep for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.
# the following wget is to grab the image directly off the page - which is set with big=0

#if $big=0; then
#wget -O - "$uri" 2>/dev/null|grep -ie "\<img.*src=" 2>/dev/null|head -1 | sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})
# for each element in the array check if it matches the HTML image "src" tag,
# first found match (break) gets stuffed in the "img" variable.
# for i in $(seq 0 $[${#l[@]}-1]); do
#  echo "${l[$i]:0:3}"|grep -qie src
#  if [ "$?" = "0" ]; then
    # Strip the href= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
#    img="$(echo ${l[$i]:5}|tr -d "\"")"
#    newimg="${img}"
#    echo $newimg > /tmp/img2get.txt
#break
#fi
#  done
#else
#if $big=1; then
wget -O - "$uri" 2>/dev/null|grep -ie "\<a.*href=\"image" 2>/dev/null|head -1 |sed -e "s/.*<a/<a/g" -e "s/>.*$//g"|while read l; do l=(${l})

 # for each element in the array check if it matches the HTML image "href" tag,
 # first found match (break) gets stuffed in the "img" variable.
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:4}"|grep -qie href
  if [ "$?" = "0" ]; then
    # Strip the href= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
    img="$(echo ${l[$i]:5}|tr -d "\"")"
    newimg="${img}"
    echo $newimg > /tmp/img2get.txt
break
fi
done
# grep and write the title to a text file and then sed out the unwanted stuff - contributed by konsolebox and update by stardotstar to handle ' ;)
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1 > /tmp/title.txt
cat /tmp/title.txt|sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i -e s/"'"/"\\\'"/i > $tmpdir/title.txt
title=`< $tmpdir/title.txt`
cp $tmpdir/title.txt $tmpdir/titletext.txt
# set the date of the picoday
date +%d/%m/%y > /tmp/date.txt
date=`< $tmpdir/date.txt`

# Correct wrong approach:
# 3. Get base
 base=$(dirname "$uri")

# 4. Correct ifempty
 if [ "${#base}" = "5" -a "${base}" = "http:" ]; then
  base=EMPTY
  # 5. Follows whole URI is contained in img tag
  if [ "${img:0:2}" = "//" ]; then
  img="http://${img:2}"
  fi
 else
  img="${base}/${img}"
 fi

# Minor checks
# 6. Bail out if img is empty
 if [ "${#img}" -lt "18" -o -z "${img}" ]; then
  echo "${0##*/}: img URI too short or empty (\""${img}"\")." >&2
  exit 1
 fi

# Grab image and output to temp file.
 wget -q "$img" -O "$tmpdir/apod_orig"
 if [ -s "$tmpdir/apod_orig" ]; then

# 7. If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "$tmpdir/apod_orig" 2>/dev/null|egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then

# make a white canvas for the title block and append it to the grabbed pic
convert xc:white -resize 1x18! $tmpdir/canvas.ppm 2>&1
convert -append $tmpdir/canvas.ppm $tmpdir/apod_orig $tmpdir/intermediate_file.jpg 2>&1

# add the title text centered with the date, write as a jpg to maintain quality

addTitleText=$(convert -fill black -font Verdana-Regular -gravity "North Center" -pointsize 12 -draw "text 0 0 'Astronomy Picture of the day for $date \"$title\"'" $tmpdir/intermediate_file.jpg $tmpdir/apod.jpg 2>&1)

# Poke the necessary stuff into the gnomeconftool-2

gconftool-2 --type string  --set /desktop/gnome/background/color_shading_type "$bcst"
gconftool-2 --type string  --set /desktop/gnome/background/picture_filename $tmpdir/apod.jpg
gconftool-2 --type string  --set /desktop/gnome/background/picture_options "$bpo"
gconftool-2 --type string  --set /desktop/gnome/background/primary_color "$bpc"
gconftool-2 --type string  --set /desktop/gnome/background/secondary_color "$bsc"

  else

# Issue warning for unsupported image
    echo "${0##*/}: grabbed image is unsupported." >> /tmp/2getImg.log
  fi
 else

# Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." >> /tmp/2getImg.log
 fi

# Remove the temporary images and text files regardless but copy titled file to wallpaper directory for screensaver with date and title with whitespace and irregular characters removed.
 cat $tmpdir/title.txt|tr " :\/" "_" > $tmpdir/title.txt
 cat $tmpdir/date.txt|tr -d "/" > $tmpdir/date.txt
 title=`< $tmpdir/title.txt`
 date=`< $tmpdir/date.txt`
 name="$title-$date.jpg"
 cp "$tmpdir/apod.jpg" "$slideshwdir/$name"
 rm -f "$tmpdir/apod_orig"
 rm -f $tmpdir/title.txt
 rm -f $tmpdir/date.txt
 rm -f $tmpdir/canvas.ppm
 rm -f $tmpdir/intermediate_file.jpg
# End of outer loop.
done
exit 0

I have removed the random string and hard-coded image names in favor of variables, and the script now tidies up after itself as well as possible. I am sure there will still be problems with error catching, filetypes and filenames but I will continue to work on it as a learning project.

Will

stardotstar 09-10-2006 02:37 AM

Need to raise the dead on this one:

I am finding that the script fails to grab the title from the page reliably. I can't fathom why, but mostly the run fails to get a title (and so fails to save the file under the correct name), while I always get the date and other text. Something must be going on in the sed of the title section.


I don't know what it is - often it works second time around.

Can someone please have a quick look at this statement and suggest why:
Code:


# grep and write the title to a text file and then sed out the unwanted stuff - contributed by konsolebox and update by stardotstar to handle ' ;)
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1 > $tmpdir/title.txt
cat $tmpdir/title.txt|sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i -e s/":"/"\:"/i -e s/"\""/"\\\""/i -e s/"'"/"\\\'"/i > $tmpdir/title.txt
title=''
title=`< $tmpdir/title.txt`
cat $tmpdir/title.txt
cp $tmpdir/title.txt $tmpdir/titletext.txt
cat $tmpdir/titletxt.txt
# set the date of the picoday
date +%d/%m/%y > /tmp/date.txt
date=`< $tmpdir/date.txt`

this is the debugging output:

Code:

wget -O - http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ grep -ie '<b>'
+ head -1
+ cat /tmp/title.txt
+ sed -e 's/.*<b> \{,1\}//i' -e 's/ \{,1\}<\/b>.*//i' -e 's/:/\:/i' -e 's/"/\"/i' -e 's/'\''/\\'\''/i'
+ title=
+ title=
+ cat /tmp/title.txt
+ cp /tmp/title.txt /tmp/titletext.txt
+ cat /tmp/titletxt.txt
cat: /tmp/titletxt.txt: No such file or directory
+ date +%d/%m/%y
+ date=10/09/06
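
A likely culprit, going by the trace, is the line that pipes `cat $tmpdir/title.txt` through sed and redirects the result back into `$tmpdir/title.txt`. The shell opens the output file (and empties it) while setting up the pipeline, so whether `cat` gets to read the old contents first is a race; that would fit `title=` coming out empty on most runs and working now and then. The `No such file` message is a separate typo (`titletxt.txt` versus `titletext.txt`). A minimal sketch that avoids the temp file altogether, assuming a hypothetical helper `extract_title` that reads the page on stdin:

```shell
#!/bin/bash
# extract_title is a hypothetical helper, not part of the posted script.
# It prints the text between <b> and </b> on the first line containing <b>.
extract_title() {
  grep -i '<b>' | head -n 1 | sed -e 's/.*<[bB]> *//' -e 's/ *<\/[bB]>.*//'
}

# Stand-in for the downloaded page so the example runs offline.
page='<center>
<b> Example Title </b> <br>
</center>'

title=$(printf '%s\n' "$page" | extract_title)
echo "$title"

# Against the live page the call would look like:
# title=$(wget -q -O - "$uri" | extract_title)
```

If the temp file is still wanted, writing sed's output to a second file and then moving it over the first with `mv` avoids the same race. The extra escaping of `:`, `"` and `'` for convert is left out of this sketch.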

The whole script and entire debug output:

Code:

stardotstar@spitfire /tmp $ cat /usr/local/bin/2getImg.sh
#!/bin/bash
# 2getImg.sh 2.0 stardotstar and unspawn (www.linuxquestions.org)
# adapted by stardotstar from an original script by unspawn getImg.sh 1.4.1
# Purpose: grab web-based image from page (URI) and title
# License: GPLv2
# Args: 1: http://doma.in/HTML-rendered-page.ext
# Deps: Bash, GNU utils, wget, xloadimage, convert
# Run from: manual or cron

# Checks
# 0. wget, xloadimage, convert
for b in convert wget; do
 which $b >/dev/null 2>&1; case "$?" in
  0) ;;
  *) echo "${0##*/}: $b not found or not in PATH, exiting"\
      > /dev/stderr; exit 127;;
 esac
done

# Set some variables:

#debug=1
big=1 # if true the program will grab the larger version of the image otherwise only the one off the page itself
ibcst="solid" # initial background color shading type
ibpf=/usr/local/share/images/wallpaper.png # tile to load as temp background
ibpo="wallpaper" # initial temp background image style/option
ibpc="#000000" # initial background primary color
ibsc="#000000" # initial background secondary color - I am using black but it would be useful to change these if we want a gradient and centered image that says please wait or something
tmpdir=/tmp # specify the temporary directory

bcst="vertical-gradient" # background color shading type
bpo="scaled" # background image style/option
bpc="#0B0685" # background primary color
bsc="#000000" # background secondary color - I am using black but it would be useful to change these if we want a gradient and centered image that says please wait or something
slideshwdir=/home/stardotstar/AV/Images/wallpapers/dailywallpaper # specify the output directory for the renamed copies of the images

# remove yesterday's temporary image
rm -f $tmpdir/apod.jpg

# sends error message and exit - contributed by konsolebox
error() { echo "$@" >&2; exit 1; }
# you'll find this helpful when debugging
#debug() { [ "$DEBUG" ] && echo "$@" >&2; }

# Set a pretty background while we wait for the image to load.

gconftool-2 --type string  --set /desktop/gnome/background/color_shading_type "$ibcst"
gconftool-2 --type string  --set /desktop/gnome/background/picture_filename $ibpf
gconftool-2 --type string  --set /desktop/gnome/background/picture_options "$ibpo"
gconftool-2 --type string  --set /desktop/gnome/background/primary_color "$ibpc"
gconftool-2 --type string  --set /desktop/gnome/background/secondary_color "$ibsc"

# Whatever first parameter is supplied serves as URI to grab.
uri="$1"

# 2. If no arg given or protocol doesn't match that for the intarweb we exit with a warning.
if [ "$#" != "1" -o "${uri:0:7}" != "http://" ]; then
 echo "${0##*/}: only one parameter needed: (URI + HTML rendered page: http://doma.in/file.ext), exiting." > /dev/stderr
 exit 1
fi

# Grab file and dump on stdout, grep for HTML image tag, and only use first line.
# (not grep -m1), sed strips off tags before and after. Now dump the string in variable "l" and turn it into array.
# the following wget is to grab the image directly off the page - which is set with big=0

#if $big=0; then
#wget -O - "$uri" 2>/dev/null|grep -ie "\<img.*src=" 2>/dev/null|head -1 | sed -e "s/.*<img/<img/g" -e "s/><.*$/>/g"|while read l; do l=(${l})
# for each element in the array check if it matches the HTML image "src" tag,
# first found match (break) gets stuffed in the "img" variable.
# for i in $(seq 0 $[${#l[@]}-1]); do
#  echo "${l[$i]:0:3}"|grep -qie src
#  if [ "$?" = "0" ]; then
    # Strip the href= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
#    img="$(echo ${l[$i]:5}|tr -d "\"")"
#    newimg="${img}"
#    echo $newimg > /tmp/img2get.txt
#break
#fi
#  done
#else
#if $big=1; then
wget -O - "$uri" 2>/dev/null|grep -ie "\<a.*href=\"image" 2>/dev/null|head -1 |sed -e "s/.*<a/<a/g" -e "s/>.*$//g"|while read l; do l=(${l})

 # for each element in the array check if it matches the HTML image "href" tag,
 # first found match (break) gets stuffed in the "img" variable.
 for i in $(seq 0 $[${#l[@]}-1]); do
  echo "${l[$i]:0:4}"|grep -qie href
  if [ "$?" = "0" ]; then
    # Strip the href= part.
    #img="$(echo ${l[$i]:4}|tr -d "\"")"
    img="$(echo ${l[$i]:5}|tr -d "\"")"
    newimg="${img}"
    echo $newimg > $tmpdir/img2get.txt
break
fi
done
# grep and write the title to a text file and then sed out the unwanted stuff - contributed by konsolebox and update by stardotstar to handle ' ;)
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1 > $tmpdir/title.txt
cat $tmpdir/title.txt|sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i -e s/":"/"\:"/i -e s/"\""/"\\\""/i -e s/"'"/"\\\'"/i > $tmpdir/title.txt
title=''
title=`< $tmpdir/title.txt`
cat $tmpdir/title.txt
cp $tmpdir/title.txt $tmpdir/titletext.txt
cat $tmpdir/titletxt.txt
# set the date of the picoday
date +%d/%m/%y > /tmp/date.txt
date=`< $tmpdir/date.txt`

# Correct wrong approach:
# 3. Get base
 base=$(dirname "$uri")

# 4. Correct ifempty
 if [ "${#base}" = "5" -a "${base}" = "http:" ]; then
  base=EMPTY
  # 5. Follows whole URI is contained in img tag
  if [ "${img:0:2}" = "//" ]; then
  img="http://${img:2}"
  fi
 else
  img="${base}/${img}"
 fi

# Minor checks
# 6. Bail out if img is empty
 if [ "${#img}" -lt "18" -o -z "${img}" ]; then
  echo "${0##*/}: img URI too short or empty (\""${img}"\")." >&2
  exit 1
 fi

# Grab image and output to temp file.
 wget -q "$img" -O "$tmpdir/apod_orig"
 if [ -s "$tmpdir/apod_orig" ]; then

# 7. If size is not nil, check magic and find GIF or JPEG-type image (case insensitive).
  file -bi "$tmpdir/apod_orig" 2>/dev/null|egrep -qie "^image/(gi|jp)"
  if [ "$?" = "0" ]; then

# make a white canvas for the title block and append it to the grabbed pic
convert xc:white -resize 1x18! $tmpdir/canvas.ppm 2>&1
convert -append $tmpdir/canvas.ppm $tmpdir/apod_orig $tmpdir/intermediate_file.jpg 2>&1

# add the title text centered with the date, write as a jpg to maintain quality
addTitleText=$(convert -fill black -font Verdana-Regular -gravity "North Center" -pointsize 12 -draw "text 0 0 'Astronomy Picture of the day for $date \"$title\"'" $tmpdir/intermediate_file.jpg $tmpdir/apod.jpg 2>&1)

# Poke the necessary stuff into the gnomeconftool-2

gconftool-2 --type string  --set /desktop/gnome/background/color_shading_type "$bcst"
gconftool-2 --type string  --set /desktop/gnome/background/picture_filename $tmpdir/apod.jpg
gconftool-2 --type string  --set /desktop/gnome/background/picture_options "$bpo"
gconftool-2 --type string  --set /desktop/gnome/background/primary_color "$bpc"
gconftool-2 --type string  --set /desktop/gnome/background/secondary_color "$bsc"

  else

# Issue warning for unsupported image
    echo "${0##*/}: grabbed image is unsupported." >> /tmp/2getImg.log
  fi
 else

# Add a warning for zero-sized images.
  echo "${0##*/}: grabbed image was empty." >> /tmp/2getImg.log
 fi

# Remove the temporary images and text files regardless but copy titled file to wallpaper directory for screensaver with date and title with whitespace and irregular characters removed.
 cat $tmpdir/title.txt|tr " \/" "_" > $tmpdir/title.txt
 cat $tmpdir/date.txt|tr -d "/" > $tmpdir/date.txt
 title=`< $tmpdir/title.txt`
 echo $title
 date=`< $tmpdir/date.txt`
 name="$title-$date.jpg"
 cp "$tmpdir/apod.jpg" "$slideshwdir/$name"
 rm -f "$tmpdir/apod_orig"
 rm -f $tmpdir/img2get.txt
 rm -f $tmpdir/title.txt
 rm -f $tmpdir/date.txt
 rm -f $tmpdir/canvas.ppm
 rm -f $tmpdir/intermediate_file.jpg
# End of outer loop.
done
exit 0
stardotstar@spitfire /tmp $


stardotstar@spitfire /tmp $ getdaily
+ for b in convert wget
+ which convert
+ case "$?" in
+ for b in convert wget
+ which wget
+ case "$?" in
+ big=1
+ ibcst=solid
+ ibpf=/usr/local/share/images/wallpaper.png
+ ibpo=wallpaper
+ ibpc='#000000'
+ ibsc='#000000'
+ tmpdir=/tmp
+ bcst=vertical-gradient
+ bpo=scaled
+ bpc='#0B0685'
+ bsc='#000000'
+ slideshwdir=/home/stardotstar/AV/Images/wallpapers/dailywallpaper
+ rm -f /tmp/apod.jpg
+ gconftool-2 --type string --set /desktop/gnome/background/color_shading_type solid
+ gconftool-2 --type string --set /desktop/gnome/background/picture_filename /usr/local/share/images/wallpaper.png
+ gconftool-2 --type string --set /desktop/gnome/background/picture_options wallpaper
+ gconftool-2 --type string --set /desktop/gnome/background/primary_color '#000000'
+ gconftool-2 --type string --set /desktop/gnome/background/secondary_color '#000000'
+ uri=http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ '[' 1 '!=' 1 -o http:// '!=' http:// ']'
+ wget -O - http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ grep -ie '\<a.*href="image'
+ head -1
+ sed -e 's/.*<a/<a/g' -e 's/>.*$//g'
+ read l
+ l=(${l})
++ seq 0 1
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo '<a'
+ grep -qie href
+ '[' 1 = 0 ']'
+ for i in '$(seq 0 $[${#l[@]}-1])'
+ echo href
+ grep -qie href
+ '[' 0 = 0 ']'
++ echo '"image/0609/m46m47_hetlage_big.jpg"'
++ tr -d '"'
+ img=image/0609/m46m47_hetlage_big.jpg
+ newimg=image/0609/m46m47_hetlage_big.jpg
+ echo image/0609/m46m47_hetlage_big.jpg
+ break
+ wget -O - http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ grep -ie '<b>'
+ head -1
+ cat /tmp/title.txt
+ sed -e 's/.*<b> \{,1\}//i' -e 's/ \{,1\}<\/b>.*//i' -e 's/:/\:/i' -e 's/"/\"/i' -e 's/'\''/\\'\''/i'
+ title=
+ title=
+ cat /tmp/title.txt
+ cp /tmp/title.txt /tmp/titletext.txt
+ cat /tmp/titletxt.txt
cat: /tmp/titletxt.txt: No such file or directory
+ date +%d/%m/%y
+ date=10/09/06
++ dirname http://antwrp.gsfc.nasa.gov/apod/astropix.html
+ base=http://antwrp.gsfc.nasa.gov/apod
+ '[' 32 = 5 -a http://antwrp.gsfc.nasa.gov/apod = http: ']'
+ img=http://antwrp.gsfc.nasa.gov/apod/image/0609/m46m47_hetlage_big.jpg
+ '[' 66 -lt 18 -o -z http://antwrp.gsfc.nasa.gov/apod/image/0609/m46m47_hetlage_big.jpg ']'
+ wget -q http://antwrp.gsfc.nasa.gov/apod/image/0609/m46m47_hetlage_big.jpg -O/tmp/apod_orig
+ '[' -s /tmp/apod_orig ']'
+ file -bi /tmp/apod_orig
+ egrep -qie '^image/(gi|jp)'
+ '[' 0 = 0 ']'
+ convert xc:white -resize '1x18!' /tmp/canvas.ppm
+ convert -append /tmp/canvas.ppm /tmp/apod_orig /tmp/intermediate_file.jpg
++ convert -fill black -font Verdana-Regular -gravity 'North Center' -pointsize 12 -draw 'text 0 0 '\''Astronomy Picture of the day for 10/09/06 ""'\''' /tmp/intermediate_file.jpg /tmp/apod.jpg
+ addTitleText=
+ gconftool-2 --type string --set /desktop/gnome/background/color_shading_type vertical-gradient
+ gconftool-2 --type string --set /desktop/gnome/background/picture_filename /tmp/apod.jpg
+ gconftool-2 --type string --set /desktop/gnome/background/picture_options scaled
+ gconftool-2 --type string --set /desktop/gnome/background/primary_color '#0B0685'
+ gconftool-2 --type string --set /desktop/gnome/background/secondary_color '#000000'
+ echo '2getImg.sh: grabbed image is unsupported.'
+ cat /tmp/title.txt
+ tr ' \/' _
+ cat /tmp/date.txt
+ tr -d /
+ title=
+ echo

+ date=100906
+ name=-100906.jpg
+ cp /tmp/apod.jpg /home/stardotstar/AV/Images/wallpapers/dailywallpaper/-100906.jpg
+ rm -f /tmp/apod_origa
+ rm -f /tmp/img2get.txt
+ rm -f /tmp/title.txt
+ rm -f /tmp/date.txt
+ rm -f /tmp/blank.ppm
+ rm -f /tmp/intermediate.jpg
+ read l
+ exit 1

My understanding of sed is very basic and I can't really see why the get-image section works fine but the title fails. The title often works when the script is run a second time. Interestingly, on this notebook the title almost never works, while on my other machine it generally works only the second time around, and not when called by cron...

Hmmm.
Will

ghostdog74 09-10-2006 06:31 AM

hi
That's a good effort in bash. However, there are modules out there that make parsing HTML pages easy: in Perl there are modules like LWP, and in Python you can use urllib2 and the BeautifulSoup module. E.g. a snippet:

>>> import urllib2
>>> from BeautifulSoup import BeautifulSoup
>>> info = urllib2.urlopen("http://antwrp.gsfc.nasa.gov/apod")
>>> htmlsource = info.read()
>>> soup = BeautifulSoup(htmlsource)
>>> soup.head.title
<title>Astronomy Picture of the Day
</title>
>>> soup.findAll('img')
[<img src="image/0609/m46m47_hetlage.jpg" alt="See Explanation. Clicking on the picture will download the highest resolution version available." />]

For examples on Perl LWP you can see here
http://www.perl.com/pub/a/2002/08/20/perlandlwp.html

just a suggestion for your future web projects..;)
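
For comparison, if you do stay in shell, the link extraction can be squeezed into a single sed expression. This is only a rough sketch: the sample HTML line is an assumption modelled on the trace and the img tag shown above (it is not fetched from the site), and the variable names html, base and rel are made up for the example.

```shell
#!/bin/bash
# Sample input only: a made-up line shaped like the APOD page, not a live fetch.
html='<a href="image/0609/m46m47_hetlage_big.jpg"><IMG SRC="image/0609/m46m47_hetlage.jpg" alt="See Explanation."></a>'
base="http://antwrp.gsfc.nasa.gov/apod"

# Print only the captured href value when it starts with "image";
# -n plus the p flag suppresses lines that do not match.
rel=$(printf '%s\n' "$html" \
  | sed -n 's/.*<[aA] [^>]*[hH][rR][eE][fF]="\(image[^"]*\)".*/\1/p' \
  | head -1)

# Build the absolute URL the same way the script does with dirname of the page URI.
img="$base/$rel"
echo "$img"
```

It is still line-oriented and will break if the tag is split across lines, which is exactly where a real HTML parser earns its keep.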

unSpawn 09-11-2006 11:36 AM

Need to raise the dead on this one
Nice to see you expand stuff, though it does need a bit of tending to, like what you're doing here:
Code:

# grep and write the title to a text file and then sed out the unwanted stuff - contributed by konsolebox and update by stardotstar to handle ' ;)
# 0. grab string to file
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1 > $tmpdir/title.txt
# 1. cat file, string ops and write to *same* file. That won't work.
cat $tmpdir/title.txt|sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i -e s/":"/"\:"/i -e s/"\""/"\\\""/i -e s/"'"/"\\\'"/i > $tmpdir/title.txt
# 2. which gives empty title.
title=''
title=`< $tmpdir/title.txt`

while you could:
Code:

# 0. grab string to file
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1 > $tmpdir/title.tmp
# 1. string ops and write to new file.
sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i -e s/":"/"\:"/i -e s/"\""/"\\\""/i -e s/"'"/"\\\'"/i \
$tmpdir/title.tmp > $tmpdir/title.txt

...or:
Code:

# 0. grab string, string ops and write to file
wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1\
|sed -e s/".*<b> \{,1\}"//i -e s/" \{,1\}<\/b>.*"//i -e s/":"/"\:"/i -e s/"\""/"\\\""/i -e s/"'"/"\\\'"/i \
> $tmpdir/title.txt

...or even "better":
Code:

# 0. Grab string to array:
array=($(wget -O - "$uri" 2>/dev/null|grep -ie "<b>" 2>/dev/null|head -1))
# 1. Check and expect more than three elements:
if [ "${#array[@]}" -gt "3" ]; then
 # 2. Ooh, this is lame! Walk through all elements in the array, checking if the
 # first char is a left angle bracket. If not, echo it, so building the title
 title=$(n=${#array[@]}; c=0; until [ "$c" = "$n" ]; do [ \
 "${array[$c]:0:1}" != "<" ] && echo "${array[$c]}"; ((c++)); done)
else
        title="empty title"
fi

...and if that doesn't do it then it appears you're going to learn Perl or Python? ;-p
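
To see the "same file" problem in isolation, here is a minimal sketch that runs without the network. The temp directory, the title.new name and the sample title line are all assumptions made up for the demo; only the sed expressions follow the ones used in the thread (written with \{0,1\} rather than \{,1\}).

```shell
#!/bin/bash
# Demo only: the temp directory and the sample title line are made up.
demo=$(mktemp -d)
printf '<b> Sample Title </b> <br>\n' > "$demo/title.txt"

# Risky form (left commented out): the shell truncates title.txt while setting
# up the redirect, often before cat has read it, so the result may be empty.
#   cat "$demo/title.txt" | sed -e 's/<b>//' > "$demo/title.txt"

# Safer form: write to a second file, then move it over the original.
sed -e 's/.*<b> \{0,1\}//' -e 's/ \{0,1\}<\/b>.*//' "$demo/title.txt" \
  > "$demo/title.new" && mv "$demo/title.new" "$demo/title.txt"

title=$(< "$demo/title.txt")
echo "title is: $title"
rm -rf "$demo"
```

Whether the risky form loses the data depends on timing between the two ends of the pipe, which would fit a title that shows up on one run and not on another.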

stardotstar 09-11-2006 10:52 PM

Perfect UnSpawn ! Thanks heaps. Titles are coming in consistently and file names are cleaning up great!

Will


All times are GMT -5.