LinuxQuestions.org


geokker 05-26-2011 09:22 AM

Click 'submit' in a remote web form from Bash, really
 
I have created a simple web form which pokes a PHP script:

<form name="update" action="http://www.mysite.com/script.php" method="POST">
<input name="submit" type="submit" value="submit">
</form>

I can easily surf along to the URL http://www.mysite.com/update.html and click the submit button which pokes script.php and everyone's happy. Automating this in iMacros or Selenium - no problem.

I want to automate this in a bash cron on the same server - preferably as a one-liner.

I *thought* something like this might work:

Code:

wget --spider --post-data="submit=submit" http://www.mysite.com/wget.html
or

Code:

curl -d "submit=submit" http://www.mysite.com/wget.html
and many variations thereof but no luck.

Is there ANY way I can click the submit button from a Bash command line? Curl, wget, lynx - one of these simply MUST be capable of this.



I do not want to run script.php from the command line - there are loads of relative links etc.

theNbomr 05-26-2011 11:07 AM

The HTTP request must be sent to the PHP URL 'http://www.mysite.com/script.php' (the form's action), not to the web page that contains the form. A request sent to the web page will simply return the page. Of course, the corresponding POST data must be attached to the request, but you seem to have figured this out.
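
A minimal sketch of that point, assuming the form has no fields besides the submit button and that nothing else (such as a login) stands in the way. The helper name submit_form is made up for illustration; the URL and POST body come from the form in the first post.

```shell
#!/bin/sh
# Target is the form's action attribute, not the page that holds the form.
FORM_ACTION="http://www.mysite.com/script.php"
# name=value pair of the submit button in the original form.
POST_DATA="submit=submit"

# Hypothetical helper: POST the form body, discard the response body,
# and print only the HTTP status code returned by the server.
submit_form() {
    curl -s -o /dev/null -w '%{http_code}\n' -d "$POST_DATA" "$FORM_ACTION"
}
```

Calling submit_form after sourcing this sends the request. A 200 only says the script answered; it does not prove the script did its work.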

You should be able to easily see the details of what the HTTP request needs to look like by running tcpdump with an appropriate filter to capture and disassemble the HTTP request packet(s) being sent from a real browser.

--- rod.

geokker 05-26-2011 11:53 AM

I think the problem is an authentication routine getting in the way. I have tried this:


Code:

/usr/bin/wget --save-cookies cook.txt --post-data 'email=me@mysite.com&password=letmeindamnyou' http://www.mysite.com/ \
  && /usr/bin/wget --load-cookies cook.txt -p http://www.mysite.com/script.php

Authenticating, saving any cookie giblets into a file then poking the script as a separate process.

This doesn't work.

krizzz 05-26-2011 01:42 PM

Are you getting that cookie back from the server? If you are, does the cookie contain the correct information (whatever that is)? Maybe the cookie is not enough for this configuration and you need to pass the session id over GET or POST... It's hard to tell since I don't know how your authentication mechanism works. You're on the right track though. Get Firebug and check what is sent with your requests when you use a browser, then try to replicate that with wget.

theNbomr 05-26-2011 01:53 PM

Quote:

I think the problem is an authentication routine getting in the way.
All the more reason to snoop online to see how a real browser handles the job. This (truncated) sample shows what a request to this thread on LQ looks like. If there was authentication going on, it should show up:
Code:

[theNbomr@myhost ]$ sudo /usr/sbin/tcpdump -i eth0 -XX -n port 80
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
11:46:37.697506 IP XX.XX.XX.XX.60883 > 75.126.162.205.80: Flags [SEW], seq 4081877752, win 5840, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
        0x0000:  0016 3e00 004c 001c c0ea e73f 0800 4500  ..>..L.....?..E.
        0x0010:  0034 8514 4000 4006 a4f2 8e5a 9417 4b7e  .4..@.@....Z..K~
        0x0020:  a2cd edd3 0050 f34c 82f8 0000 0000 80c2  .....P.L........
        0x0030:  16d0 e25a 0000 0204 05b4 0101 0402 0103  ...Z............
        0x0040:  0307                                    ..
11:46:37.765204 IP 75.126.162.205.80 > XX.XX.XX.XX.60883: Flags [S.], seq 4126949465, ack 4081877753, win 5840, options [mss 1460], length 0
        0x0000:  001c c0ea e73f 0016 3e00 004c 0800 4500  .....?..>..L..E.
        0x0010:  002c 0000 4000 3106 390f 4b7e a2cd 8e5a  .,..@.1.9.K~...Z
        0x0020:  9417 0050 edd3 f5fc 4059 f34c 82f9 6012  ...P....@Y.L..`.
        0x0030:  16d0 d5c8 0000 0204 05b4 0000            ............
11:46:37.765249 IP XX.XX.XX.XX.60883 > 75.126.162.205.80: Flags [.], ack 1, win 5840, length 0
        0x0000:  0016 3e00 004c 001c c0ea e73f 0800 4500  ..>..L.....?..E.
        0x0010:  0028 8515 4000 4006 a4fd 8e5a 9417 4b7e  .(..@.@....Z..K~
        0x0020:  a2cd edd3 0050 f34c 82f9 f5fc 405a 5010  .....P.L....@ZP.
        0x0030:  16d0 ed85 0000                          ......
11:46:37.765413 IP XX.XX.XX.XX.60883 > 75.126.162.205.80: Flags [.], seq 1:2921, ack 1, win 5840, length 2920
        0x0000:  0016 3e00 004c 001c c0ea e73f 0800 4500  ..>..L.....?..E.
        0x0010:  0b90 8516 4000 4006 9994 8e5a 9417 4b7e  ....@.@....Z..K~
        0x0020:  a2cd edd3 0050 f34c 82f9 f5fc 405a 5010  .....P.L....@ZP.
        0x0030:  16d0 1c40 0000 504f 5354 202f 7175 6573  ...@..POST./ques
        0x0040:  7469 6f6e 732f 6e65 7772 6570 6c79 2e70  tions/newreply.p
        0x0050:  6870 3f64 6f3d 706f 7374 7265 706c 7926  hp?do=postreply&
        0x0060:  743d 3838 3238 3339 2048 5454 502f 312e  t=882839.HTTP/1.
        0x0070:  310d 0a48 6f73 743a 2077 7777 2e6c 696e  1..Host:.www.lin
        0x0080:  7578 7175 6573 7469 6f6e 732e 6f72 670d  uxquestions.org.

--- rod.

krizzz 05-26-2011 02:04 PM

tcpdump is a bit too low level for analyzing high-level protocols like HTTP, though it is still an invaluable tool in many other cases ;)
I strongly recommend Firebug and its Net panel. It shows nicely formatted request headers and so on.

geokker 05-26-2011 04:27 PM

The authentication system is a simple MySQL lookup.

tcpdump -i eth1 -XX -n port 80

Returns a huge amount of gibberish in which I couldn't grep anything related e.g. POST, mysite etc.
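
One way to cut the noise, as a rough sketch: let tcpdump itself keep only packets whose TCP payload starts with the bytes "POST" and print them as ASCII. The helper name watch_posts and the default interface are assumptions. The filter only matches the segment that begins the request, so a long form body split over later segments would not show up.

```shell
#!/bin/sh
# pcap filter: TCP to port 80 whose payload starts with the bytes "POST".
# (tcp[12:1] & 0xf0) >> 2 is the TCP header length, i.e. the payload offset;
# 0x504f5354 is "POST" in hex.
POST_FILTER='tcp dst port 80 and tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x504f5354'

# Hypothetical helper; needs root. -A prints payloads as ASCII, -s 0 captures
# whole packets, -n skips name lookups. The interface defaults to eth1 here.
watch_posts() {
    tcpdump -i "${1:-eth1}" -A -s 0 -n "$POST_FILTER"
}
```

Run watch_posts as root on the machine that sends or receives the request, then submit the form from the browser.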

Firebug however rocks. Net > All > Post gives me:

Code:

Source: user=me@mysite.com&password=letmeindamnyou&login=Login
so

Code:

wget --save-cookies=cookie --post-data='user=me@mysite.com&password=letmeindamnyou&login=Login' http://www.mysite.com/script.php
works!

However, this is not the original site. I've somehow diddled the real site with my experimentation and now I cannot login through a browser. Suspect it's a permissions issue.

As soon as I solve that, I'll try to poke the script and post the results.

Thanks so far guys!

theNbomr 05-26-2011 07:11 PM

If the site is not your own site, perhaps they've banned you due to too many failed login attempts. Sometimes the ban is only temporary, and automatically lifted in a day or so.
--- rod.

krizzz 05-27-2011 08:34 AM

That may be the case. Look closer at the response, maybe it will give you some hints on what is going wrong.

geokker 05-31-2011 05:50 AM

I thought I had it but I don't. The problem is the authentication step. If I remove the authentication from the target script, this works fine:

Code:

/usr/bin/wget --spider http://mysite.com/script.php
But the authentication seems to fail if I do:

Code:

/usr/bin/wget --post-data 'username=me&password=letmeindamnyou' http://www.mysite.com/ \
  && /usr/bin/wget http://www.mysite.com/script.php

When I use --save-cookies cookie.txt the file is empty. Clearly, && /usr/bin/wget http://www.mysite.com/script.php is hitting the authentication form again.

I need to know how I can authenticate to the site then separately access the protected pages; wedge the door open then steal the gold.

krizzz 05-31-2011 08:28 AM

I guess that your authentication is based on the PHP session id. If you don't get the cookie, that means the server doesn't send it. In PHP, when session cookies are disabled, the session id can be attached to the URL instead. It may also be passed in a hidden field. First, you must fully understand how your authentication works. Can you disable cookies in your browser and then see if you can still use the page?
You may need to fetch the session id from the response and attach it to the next request (POST or GET, whichever is used).

geokker 05-31-2011 10:09 AM

It is based on PHP session ID. It is not sent by the server. I don't notice anything in the URL, but I know what the ID code is from Firebug. How can I use it with wget?

krizzz 05-31-2011 11:05 AM

Try adding this to your request: --keep-session-cookies --cookies=on, then check whether anything gets saved. I still think you are getting that session cookie.
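
For background: by default wget's --save-cookies skips cookies that have no expiry time (session cookies), which would explain an empty file. A quick way to check what ended up in the jar, sketched against a made-up cookie file; the domain, cookie value and the cookie_names helper are illustrative only.

```shell
#!/bin/sh
# Hypothetical cookie jar, roughly as wget might write it with
# --keep-session-cookies. Fields are tab-separated:
# domain, subdomain flag, path, secure, expiry, name, value.
JAR=$(mktemp)
printf '# HTTP cookie file.\n' > "$JAR"
printf 'www.mysite.com\tFALSE\t/\tFALSE\t0\tPHPSESSID\tabc123\n' >> "$JAR"

# Print the name of every cookie in the jar, skipping comment lines.
cookie_names() {
    awk -F '\t' '!/^#/ && NF >= 7 { print $6 }' "$1"
}

NAMES=$(cookie_names "$JAR")
echo "$NAMES"
rm -f "$JAR"
```

Pointed at a real cookie file, an empty result means no cookie was saved at all, and the session id has to be travelling some other way.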

If you can see the session id in Firebug then you can also see where it comes from (a cookie? a hidden field in the HTML?). Also, you are using the "spider" option, so if the session id is passed in an HTML hidden field you cannot fetch it, because the response body is simply discarded. You may have to use "-O -" with wget to print the response to stdout and then grep it to fetch the session id. Then, in the wget requests that follow, you may want to add something like --post-data="PHPSESSID=<your session id>". PHPSESSID is the default variable name that PHP uses, so you can give it a blind shot.

wget -O - --post-data="your login info" http://yourURL | grep "PHPSESSID"
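
A sketch of the extraction step, run against a made-up response so it works offline. The HTML, the id value and the extract_sid helper are assumptions; the sed pattern expects name= to come before value= on the same line, which a real page may not honour.

```shell
#!/bin/sh
# Hypothetical login response; a real one would come from
#   wget -q -O - --post-data="your login info" http://yourURL
RESPONSE='<form action="script.php" method="POST">
<input type="hidden" name="PHPSESSID" value="abc123">
<input name="submit" type="submit" value="submit">
</form>'

# Pull the value attribute out of the PHPSESSID hidden field.
extract_sid() {
    sed -n 's/.*name="PHPSESSID"[^>]*value="\([^"]*\)".*/\1/p'
}

SID=$(printf '%s\n' "$RESPONSE" | extract_sid)
echo "PHPSESSID=$SID"
```

The resulting string can then be handed to the next request, for example with --post-data="PHPSESSID=$SID".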

geokker 06-08-2011 12:22 PM

I had an old version of wget installed (1.9.1), so I upgraded to 1.11.4, which gave me the --keep-session-cookies option. I can now do it from the command line successfully:

Code:

/usr/bin/wget -nd --keep-session-cookies --cookies=on --save-cookies /root/cook.txt --post-data 'username=me&password=letmeindamnyou' http://www.mysite.com/ && /usr/bin/wget -nd --load-cookies /root/cook.txt -p http://www.mysite.com/script.php && rm -f /root/cook.txt && rm -f /root/index.html && rm -f /root/script.php
This runs the script and deletes the downloaded files. I can't use --spider here because it prevents the script from running.
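
The same two steps can be wrapped up so that nothing needs deleting afterwards. This is only a sketch: the login_and_run helper is invented here, it keeps the cookie jar in a temporary file, sends both responses to /dev/null with -O, and assumes a plain GET of script.php (without -p) is enough to trigger it.

```shell
#!/bin/sh
# Hypothetical helper: log in, then fetch the protected script in the same session.
# Arguments: login URL, login POST body, protected URL.
login_and_run() {
    jar=$(mktemp) || return 1
    # Step 1: POST the credentials and keep the session cookie.
    # Step 2: replay the cookie against the protected URL.
    /usr/bin/wget -q --keep-session-cookies --save-cookies "$jar" \
        --post-data "$2" -O /dev/null "$1" \
      && /usr/bin/wget -q --load-cookies "$jar" -O /dev/null "$3"
    status=$?
    rm -f "$jar"
    return $status
}
```

Usage would look like: login_and_run http://www.mysite.com/ 'username=me&password=letmeindamnyou' http://www.mysite.com/script.php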

However, I cannot get this to work as a cron job - it doesn't execute. Do I need to enclose parts of it in quotes?

geokker 06-21-2011 04:27 AM

Popping it into a file and calling it from cron works fine.
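
For anyone landing here later, a sketch of what that can look like. The script path, the schedule and the use of -O /dev/null instead of deleting downloaded files are assumptions, not the exact file used above. The script is written to /tmp only so the example is harmless to run; somewhere permanent such as /root would be the sensible place.

```shell
#!/bin/sh
# Hypothetical location for the wrapper script; move it somewhere permanent.
SCRIPT="${SCRIPT:-/tmp/poke-script.sh}"

cat > "$SCRIPT" <<'EOF'
#!/bin/sh
# cron runs with a minimal environment, so use absolute paths throughout.
JAR=/root/cook.txt
/usr/bin/wget -q --keep-session-cookies --save-cookies "$JAR" \
    --post-data 'username=me&password=letmeindamnyou' \
    -O /dev/null http://www.mysite.com/ \
  && /usr/bin/wget -q --load-cookies "$JAR" -O /dev/null http://www.mysite.com/script.php
rm -f "$JAR"
EOF
chmod +x "$SCRIPT"

# Line to add with `crontab -e`; the daily 02:30 schedule is an arbitrary example.
CRON_LINE="30 2 * * * $SCRIPT"
echo "$CRON_LINE"
```

Keeping the command in a file also sidesteps crontab's own parsing rules, such as the special meaning of % on a crontab line.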


All times are GMT -5.