I figured I would share something I am working on at the moment: a program to download a whole photostream. This is the parser part of it.
The file contains 1243 words, so I can't post it here. So, here's a link ->
http://www.4shared.com/file/UMrpIUld/photostream.html
That's just a file consisting of a bunch of JSON objects, with the newlines and tabs removed. I have parsed out all of the http(s) URLs.
The parser ->
awk '{gsub(",", "\n"); print}' photostream | sed -n 's/["{\\]//g;s/^[a-zA-Z0-9]*://;/^https\{0,1\}:.*jpg$/p'
The file is obtained with an access token that lets you fetch the JSON file. The URLs are then parsed out with the parser above. What do you think? If anyone wants to tighten up the parser, that would be cool, and I welcome any suggestions!
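For anyone who wants to poke at the idea without an access token, here is a rough sketch of the same split-and-filter approach run on a made-up one-line sample. The sample keys (`id`, `source`, `next`), the example.com URLs, and the function name `extract_photo_urls` are all invented for illustration and are not taken from the real file. It also strips `}`, `[` and `]`, which may or may not matter for the real data.

```shell
# Made-up sample: one line of JSON with escaped slashes.
# Keys and URLs here are hypothetical, not from the real photostream file.
sample='{"data":[{"id":"1","source":"https:\/\/example.com\/a.jpg"},{"id":"2","source":"http:\/\/example.com\/b.jpg"}],"paging":{"next":"https:\/\/example.com\/page2"}}'

# Split on commas, strip quotes/braces/brackets/backslashes, drop the
# leading "key:" part, then print only http(s) URLs that end in jpg.
extract_photo_urls() {
    tr ',' '\n' |
        sed -n 's/[]["{}\\]//g; s/^[a-zA-Z0-9_]*://; /^https\{0,1\}:.*jpg$/p'
}

printf '%s\n' "$sample" | extract_photo_urls
```

Note that `^https\{0,1\}:` matches the literal prefix `http:` or `https:`, whereas a bracket expression like `[http]` only matches a single character out of `h`, `t`, `p`. The same goes for `jpg$` versus `[jpg]$`.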
I keep procrastinating and pushing this aside because the next step is difficult. I wrote the downloader, but it doesn't yet account for the URL of the next set, which sits at the bottom of the file but is currently parsed out. I would have to separate the http URLs from the next-set URL, then run the parsed URLs through the downloader and fetch the next photoset before proceeding. It's a bigger pain than it sounds!
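One way the next-set step might look, as a sketch only: pull the next URL out of the raw JSON before splitting on commas, print the photo URLs for the current page, then loop until there is no next URL. Everything named here is an assumption: the `"next"` key, the helper names (`extract_photo_urls`, `extract_next_url`, `fetch_page`, `crawl`), and the idea that the access token is already part of the starting URL.

```shell
# Print http(s) URLs ending in jpg from JSON on stdin (same split-and-filter idea).
extract_photo_urls() {
    tr ',' '\n' |
        sed -n 's/[]["{}\\]//g; s/^[a-zA-Z0-9_]*://; /^https\{0,1\}:.*jpg$/p'
}

# Print the value of the (assumed) "next" key, with JSON backslashes removed.
# This runs on the raw line, so commas inside the URL are not a problem.
extract_next_url() {
    sed -n 's/.*"next":"\([^"]*\)".*/\1/p' | tr -d '\\' | head -n 1
}

# Fetch one page of JSON. Assumes curl is installed and that the URL
# already carries whatever access token the service needs.
fetch_page() {
    curl -fsS "$1"
}

# Walk the pages: print photo URLs for each page, then follow "next"
# until it is missing or points back at the current page.
crawl() {
    url=$1
    while [ -n "$url" ]; do
        page=$(fetch_page "$url") || return 1
        printf '%s\n' "$page" | extract_photo_urls
        next=$(printf '%s\n' "$page" | extract_next_url)
        if [ "$next" = "$url" ]; then
            break
        fi
        url=$next
    done
}

# Usage (hypothetical): crawl "$START_URL" > urls.txt, then feed urls.txt to the downloader.
```

Keeping `crawl` limited to printing URLs means the existing downloader can stay as it is and just read the list.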
Maybe someone wants to help write this? It could incorporate some Perl to automate the access token process, and a GUI with GTK!
THOUGHTS? ADVICE?