Click 'submit' in a remote web form from Bash, really
I have created a simple web form which pokes a PHP script:
Code:
<form name="update" action="http://www.mysite.com/script.php" method="POST">
  <input name="submit" type="submit" value="submit">
</form>
I can easily surf along to the URL http://www.mysite.com/update.html and click the submit button, which pokes script.php, and everyone's happy. Automating this in iMacros or Selenium - no problem. I want to automate this in a bash cron job on the same server - preferably as a one-liner. I *thought* something like this might work: Code:
wget --spider --post-data="submit=submit" http://www.mysite.com/wget.html
or Code:
curl -d "submit=submit" http://www.mysite.com/wget.html
Is there ANY way I can click the submit button from a Bash command line? Curl, wget, lynx - one of these simply MUST be capable of this. I do not want to run script.php from the command line - there are loads of relative links etc. |
The HTTP request must be sent to the PHP URL 'http://www.mysite.com/script.php', not to the web page. A request sent to the web page will simply return the page. Of course, the corresponding POST data must be attached to the request, but you seem to have figured that out.
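To make that concrete, here is a minimal sketch. The URL and the single field name/value come from the form markup above; the actual network requests are commented out so the sketch runs without touching the server:

```shell
#!/bin/sh
# Build the POST body from the form's one field ("submit=submit").
postdata="submit=submit"

# Either of these sends it straight to the PHP handler, not the HTML page
# (uncomment to actually fire the request):
#   curl -s -d "$postdata" http://www.mysite.com/script.php
#   wget -q -O - --post-data="$postdata" http://www.mysite.com/script.php

echo "$postdata"
```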
You should be able to easily see the details of what the HTTP request needs to look like by running tcpdump with an appropriate filter to capture and disassemble the HTTP request packet(s) being sent from a real browser. --- rod. |
I think the problem is an authentication routine getting in the way. I have tried this:
Code:
/usr/bin/wget --save-cookies cook.txt --post-data 'email=me@mysite.com&password=letmeindamnyou' http://www.mysite.com/
This doesn't work. |
Are you getting that cookie back from the server? If you are, does the cookie contain the correct information (whatever that is)? Maybe the cookie alone is not enough for this configuration and you need to pass the session id via GET or POST... It's hard to tell, since I don't know how your authentication mechanism works. You're on the right track, though. Get Firebug and check what is sent with your requests when you use the browser, then try to replicate that with wget.
|
Quote:
Code:
[theNbomr@myhost ]$ sudo /usr/sbin/tcpdump -i eth0 -XX -n port 80 |
tcpdump is a bit too low-level for analyzing high-level protocols like HTTP, though it's still an invaluable tool in many other cases ;)
I strongly recommend Firebug and its Net panel. It shows nicely formatted request headers etc. |
The authentication system is a simple MySQL lookup.
Code:
tcpdump -i eth1 -XX -n port 80
returns a huge amount of gibberish in which I couldn't grep anything relevant, e.g. POST, mysite, etc. Firebug, however, rocks. Net > All > Post gives me: Code:
Source: user=me@mysite.com&password=letmeindamnyou&login=Login Code:
wget --save-cookies=cookie --post-data='user=me@mysite.com&password=letmeindamnyou&login=Login' http://www.mysite.com/script.php
However, this is not the original site. I've somehow diddled the real site with my experimentation, and now I cannot log in through a browser. I suspect it's a permissions issue. As soon as I solve that, I'll try to poke the script and post the results. Thanks so far, guys! |
If the site is not your own site, perhaps they've banned you due to too many failed login attempts. Sometimes the ban is only temporary, and automatically lifted in a day or so.
--- rod. |
That may be the case. Look more closely at the response; maybe it will give you some hints about what is going wrong.
|
I thought I had it but I don't. The problem is the authentication step. If I remove the authentication from the target script, this works fine:
Code:
/usr/bin/wget --spider http://mysite.com/script.php
and so does this: Code:
/usr/bin/wget --post-data 'username=me&password=letmeindamnyou' http://www.mysite.com/
I need to know how I can authenticate to the site and then separately access the protected pages; wedge the door open, then steal the gold. |
I guess your authentication is based on the PHP session id. If you don't get the cookie, that means the server doesn't send one. In PHP, when the session cookie is disabled, the session id is attached to the URL; it may also be passed in a hidden field. First, you must fully understand how your authentication works. Can you disable cookies in your browser and then see if you can still use the page?
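One quick way to check whether the server offers a session cookie at all is to print the response headers and look for a Set-Cookie line. A sketch, with a canned header block standing in for the real output of wget -S (the cookie value here is made up):

```shell
#!/bin/sh
# "headers" stands in for the output of a real request such as:
#   wget -S -O /dev/null http://www.mysite.com/ 2>&1
headers='HTTP/1.1 200 OK
Set-Cookie: PHPSESSID=abc123; path=/
Content-Type: text/html'

# If this prints nothing, the server never sent a session cookie.
printf '%s\n' "$headers" | grep -i '^Set-Cookie'
```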
You may need to fetch the session id from the response and attach it to the next request (POST or GET, whichever is used). |
It is based on the PHP session ID, and it is not sent by the server. I don't notice anything in the URL, but I know what the ID is from Firebug. How can I use it with wget?
|
Try adding this to your request: --keep-session-cookies --cookies=on and check whether anything gets saved. I still think you are getting that session cookie.
If you can see the session id in Firebug, then you can also see where it comes from (a cookie? a hidden field in the HTML?). Also, you are using the --spider option, so if the session id is passed in an HTML hidden field you cannot fetch it, as you simply discard the page. You may have to use "-O -" with wget to print the page to stdout and then grep it to fetch the session id. Then, in your following wget request, you may want to add something like --post-data="PHPSESSID=<your session id>". PHPSESSID is the default variable name that PHP uses, so you can give it a blind shot. Code:
wget -O - --post-data="your login info" http://yourURL | grep "PHPSESSID" |
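A runnable sketch of that extraction step, assuming the id shows up in a hidden field. The HTML literal below stands in for the real login response, and the session id value is made up:

```shell
#!/bin/sh
# "response" stands in for the output of a real login request such as:
#   wget -O - --post-data='user=...&password=...' http://www.mysite.com/script.php
response='<input type="hidden" name="PHPSESSID" value="abc123def456">'

# Extract the value of the PHPSESSID hidden field.
sid=$(printf '%s\n' "$response" | sed -n 's/.*PHPSESSID" value="\([^"]*\)".*/\1/p')
echo "$sid"

# The id could then be attached to the follow-up request, e.g.:
#   wget -O - --post-data="PHPSESSID=$sid" http://www.mysite.com/script.php
```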
I had an old version of wget (1.9.1) installed, so I upgraded to 1.11.4, which gave me the --keep-session-cookies option. So, I can now do it from the command line successfully:
Code:
/usr/bin/wget -nd --keep-session-cookies --cookies=on --save-cookies /root/cook.txt \
    --post-data 'username=me&password=letmeindamnyou' http://www.mysite.com/ && \
/usr/bin/wget -nd --load-cookies /root/cook.txt -p http://www.mysite.com/script.php && \
rm -f /root/cook.txt /root/index.html /root/script.php
However, I cannot get this to work as a cron job - it doesn't execute. Do I need to enclose bits in quotation marks? |
Popping it into a file and calling it from cron works fine.
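For the record, that arrangement looks something like this: the two wget calls from the post above saved to a script file (the path /root/poke-wget.sh is a placeholder), so cron has no quoting to trip over. Cron also runs with a minimal environment, which is why absolute paths matter here.

```shell
#!/bin/sh
# /root/poke-wget.sh -- log in, poke the protected script, clean up.
# URL, credentials and paths are the placeholders from the thread.

/usr/bin/wget -nd --keep-session-cookies --cookies=on \
    --save-cookies /root/cook.txt \
    --post-data 'username=me&password=letmeindamnyou' \
    http://www.mysite.com/ && \
/usr/bin/wget -nd --load-cookies /root/cook.txt -p http://www.mysite.com/script.php && \
rm -f /root/cook.txt /root/index.html /root/script.php
```

Then a crontab entry such as 0 2 * * * /bin/sh /root/poke-wget.sh >/dev/null 2>&1 (the schedule is a placeholder) calls it.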
|