Ladies & Gents
It only took me half a day to get this working, which is proof that I am learning.
But there has to be a better way to do this in bash. It seems pointless to wget an htm file, save it, use w3m to parse it to text, save that too, read it into an array, and grep it twice just to get a single address:port out of it, and then delete the two saved files. I wouldn't go to the trouble except that periodically the site has streaming issues and the address:port gets changed, which breaks my script. Then I have to parse the file by hand again and change the hard-coded value in my script to the new data before it will work. The worst part is that I don't know it needs changing until after it has broken; by then it is too late and all the automation I have worked for is borked.
I did not have any luck trying to just grep the htm file for the data, and from what I have read, regex does not work well on HTML files, especially if what you are trying to do is complicated.
What I find most irritating is that w3m will not translate HTML to text without the file being saved locally first, or at least I have not figured out how to do it yet; man w3m has not helped there. If I point it at the http address instead of a local file, it doesn't work.
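For what it's worth, w3m can usually be fed HTML on stdin if you tell it the content type with -T, which would drop the "save it locally first" step entirely. A minimal sketch, using a made-up snippet of HTML (the real page's markup will differ) and guarded so it still runs on a machine without w3m installed:

```shell
#!/bin/bash
# Hypothetical stand-in for the page body; against the live site the printf
# would be replaced by the wget fetch writing to stdout.
html='<p>stream is at <a href="http://203.0.113.7:1935/live">here</a></p>'

if command -v w3m >/dev/null 2>&1; then
    # -T text/html tells w3m what the piped stream is, so no local
    # .htm file is ever written; -dump sends plain text to stdout.
    txt=$(printf '%s\n' "$html" | w3m -dump -T text/html -cols 40)
else
    # Fallback so the sketch still produces output without w3m.
    txt=$html
fi
echo "$txt"
```

If that works against the real page, the two temp files and the rm at the end all go away.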
Code:
#!/bin/bash
# -q keeps wget's progress output out of the way
wget -q http://www.shaareyzedek.mb.ca/service/serviceslive/liveservice.htm
w3m -cols 40 < liveservice.htm > liveservice.txt
# Save and restore IFS as a plain string; the original array
# assignments (oldifs=($IFS) / IFS=($oldifs)) clobber IFS
oldifs=$IFS
IFS=
# Only one address comes back, so read it into a plain variable
read -r livestream <<<"$(grep FlashVars liveservice.txt | grep -Eo '(http|https)://[^/"]+')"
echo "$livestream"
IFS=$oldifs
rm liveservice*
It just seems that there should be a way to get rid of five or six steps here.
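Along those lines, the whole fetch-save-parse-grep dance can collapse into one pipeline with nothing written to disk, since the address:port never appears to need the w3m pass at all; the grep works on the raw HTML. A sketch, using invented FlashVars markup since I don't know the page's exact source:

```shell
#!/bin/bash
# Hypothetical page body; against the live site, replace the printf with:
#   wget -qO- http://www.shaareyzedek.mb.ca/service/serviceslive/liveservice.htm
# (-O- writes the page to stdout instead of a file, -q keeps it quiet)
page='<param name="FlashVars" value="src=http://203.0.113.7:1935/live/stream">'

# Same two greps as before, fed from the pipe: the regex keeps
# scheme://host:port and stops at the first / or ", so no temp files,
# no w3m pass, and no IFS juggling are needed.
livestream=$(printf '%s\n' "$page" | grep FlashVars | grep -Eo 'https?://[^/"]+')
echo "$livestream"   # prints http://203.0.113.7:1935
```

That would also make the breakage self-healing: every run re-reads the current address instead of relying on a hard-coded one.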