copying/dumping guestbook - login problem in lynx
hello
The following task is driving me crazy: I want to dump a guestbook from a quite old Mambo CMS to a text file. By now there are several thousand pages, and of course you have to log in to see the entries.

First I wanted to use curl, but then I need the parsed content, so a browser seemed more appropriate. Lynx has the nice "dump" and "crawl" options, so it seemed like an easy task. But what about the login? This would be the command if no login were required (without crawl for starters):

$ lynx -accept_all_cookies -dump -nolist "http://www.somepage....&startpage=1" >test.txt

With this I get the login page instead of the page of interest. Looking for a solution I found the -post_data option, but I could not find the proper syntax for the data file it needs. There are some hints out there, but they are all way too cryptic for me.

Is there a way to dump and crawl from within lynx? Then I could log in using lynx and then somehow run the "print" command for all the pages. Or is there a completely different way, for example using Firefox or Opera to automate Ctrl-C + Ctrl-V and calling up the next page?

thanks in advance
Rainer
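For what it's worth, when -post_data is given, lynx reads the POST body from standard input: URL-encoded key=value pairs, ended by a line containing only "---". A minimal sketch (the field names "username" and "passwd" are guesses; check the name attributes of the real login form's input fields):

```shell
# POST body in the shape lynx expects on stdin:
# URL-encoded form fields, then a line containing only "---".
# "username"/"passwd" are assumed field names; inspect the real form.
post_data='username=myuser&passwd=mypass
---'
printf '%s\n' "$post_data"

# Pipe it into lynx like this (commented out here, since it needs
# the real login URL of the site):
# printf '%s\n' "$post_data" | \
#   lynx -accept_all_cookies -post_data -dump -nolist 'http://www.somepage/index.php'
```

Whether the session cookie lynx gets back survives into a second lynx invocation depends on cookie-file settings (-cookie_file / -cookie_save_file), so fetching the login and the guestbook page in one go may be the safer route.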
If you can access the database, it might be easier to get a dump from there.
Another approach would be wget, which can spider through websites. It's also a bit easier to pass POST data for a login with it. Or you could log in manually, grab the cookie or auth string, and feed that to wget. I did that once, but I'm not sure of the exact routine. As far as the -post_data option goes, I believe you use key=value pairs, one per line, with "---" at the end of the post data.
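The cookie route described above might look something like the sketch below. The URL, the form field names, and the page count are all placeholders, not taken from the site in question:

```shell
# Step 1 (run once): log in and save the session cookie.
# Field names username/passwd are guesses; inspect the login form to confirm.
# Commented out here because it needs the real login URL:
#   wget --save-cookies cookies.txt --keep-session-cookies \
#        --post-data 'username=myuser&passwd=mypass' \
#        -O /dev/null 'http://www.somepage/index.php'

# Step 2: walk the guestbook pages, reusing the saved cookie.
base='http://www.somepage/index.php?option=guestbook'
for n in 1 2 3; do            # extend the range to cover all pages
  url="$base&startpage=$n"
  echo "fetching $url"        # replace echo with the wget call below:
  # wget --load-cookies cookies.txt -O "page$n.html" "$url"
done
```

The saved pages would still be HTML; piping each one through lynx -dump -nolist afterwards (reading the local file) would give the plain-text version the original poster wants.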