Programming: This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
This should be an interesting question, I believe (if not, I'm sorry).
There are so many webpages with action buttons in them. When a button is clicked, the action registered to it gets executed. My question is: how can the process of manually finding a button on the page and clicking it be done through a script?
Basically, we have an internal page with many action buttons. Clicking a button navigates to another page that contains the vital information, which can be parsed (HTML parsing is easy enough). But that page also contains other action buttons, and based on the values parsed I need to click the right button(s).
Even if the values can be parsed, how do I script the job of clicking the right button?
If you are only parsing the HTML and not rendering the page, you can't simply trigger a mouse click over the button's area. However, if the button calls some JavaScript, you could analyse that function; if it submits a form, you can issue a POST or GET request to the same URL with the expected parameters. Hopefully each button has a name or a reliable location.
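As a rough illustration of the "issue a POST or GET request with the expected parameters" idea, here is a minimal Python sketch using only the standard library. The URL, the field names and the helper name build_form_request are all made up for the example; the real values would have to be read from the page's form.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_request(url, fields, method="POST"):
    """Build (but do not send) a request that mimics a form submission."""
    encoded = urlencode(fields)
    if method.upper() == "GET":
        # GET forms carry their fields in the query string.
        separator = "&" if "?" in url else "?"
        return Request(url + separator + encoded, method="GET")
    # POST forms carry their fields in the request body.
    return Request(
        url,
        data=encoded.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical internal URL and field names, for illustration only.
req = build_form_request(
    "http://intranet.example/next_page.html",
    {"record_id": "42", "action": "Approve"},
)
print(req.get_method(), req.full_url, req.data)
```

Nothing is sent here; passing the request to urllib.request.urlopen() would actually submit it, and the response body could then be parsed like any other page.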
Just to elaborate a bit more on Proud's answer:
Usually information is sent through forms, and somehow the form gets submitted. Often there is a submit button, but it can also be done by any event that triggers the JavaScript submit() method, usually something like this.form.submit() or myformname.submit().
What you'd have to look for is something like
Code:
<FORM blabla action="next_page.html" bla bla>
.....
.....
</FORM>
So you would have to create a small parser which parses the <FORM> tag.
Additionally, within the form you will encounter form elements (input boxes, radio buttons, etc.) which can be given a value. Once you have figured those out, you can compose a POST string and pass it to wget so it can actually be posted.
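The small <FORM> parser described above might look something like this in Python, again with only the standard library. It is a sketch under simplifying assumptions: it only looks at <input> elements (no <select> or <textarea>), it records every named input whether or not a radio button or checkbox is checked, and the sample form and its field names are invented.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormCollector(HTMLParser):
    """Collect each form's action, method and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            name = attrs.get("name")
            if name:
                self._current["fields"][name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Invented sample page, for illustration only.
SAMPLE = """
<FORM name="orders" action="next_page.html" method="post">
  <input type="text" name="customer" value="acme">
  <input type="hidden" name="token" value="abc123">
  <input type="submit" name="go" value="Show">
</FORM>
"""

collector = FormCollector()
collector.feed(SAMPLE)
form = collector.forms[0]
form["fields"]["customer"] = "initech"  # override a default value
post_string = urlencode(form["fields"])
print(form["action"], form["method"], post_string)
```

The resulting string is the kind of value that could be handed to wget through its --post-data option, with the form's action resolved against the page URL as the target.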
Interesting project, I have sometimes faced the same need, but lacked the time to implement it.
I think this mechanism is also used by spammers who create spam bots to post on forums (hence the pattern you have to recognize before you can register). But maybe something has already been published on this subject.
My website actually watches for this sort of behavior, because it is typical spambot behavior.
If you do any of the things my site watches for, it'll blacklist your IP permanently, and send me an email. And I will contact your hosting service with a complaint.
A big 'NO' to the question of whether this is for spam.
Actually, it's part of a project where we are trying to automate the process of clicking and copying the data.
Since the base application has been successfully tested manually, we would like to test it against different kinds of data, which is not possible to do by hand.
This is an interesting topic.
Just for interest's sake, I wrote a little Perl script that uses HTML::Parser to parse a web page that I grabbed with wget (this very page was the example I tested with). I was easily able to find all of the JavaScript elements. Some are inline JavaScript, and others are links. I don't know any JavaScript, but I couldn't see much hope of recognizing anything that looked like it was creating a button or handling a button press.
Can someone give an example or description of what to look for?
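As one possible answer to what to look for: in plain HTML, buttons usually appear as <button> elements or as <input> elements whose type is submit, button, image or reset, and click handlers often show up as an onclick attribute on any element. The Python sketch below lists those; it will not see handlers attached purely from script code, and the sample markup (including the showDetails function) is invented.

```python
from html.parser import HTMLParser

# <input> types that render as something clickable.
BUTTON_INPUT_TYPES = {"submit", "button", "image", "reset"}


class ButtonFinder(HTMLParser):
    """Record elements that look like buttons or carry a click handler."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        input_type = (attrs.get("type") or "").lower()
        is_button = tag == "button" or (
            tag == "input" and input_type in BUTTON_INPUT_TYPES
        )
        if is_button or "onclick" in attrs:
            self.found.append(
                (tag, attrs.get("name"), attrs.get("value"), attrs.get("onclick"))
            )


# Invented sample markup, for illustration only.
SAMPLE = """
<input type="submit" name="save" value="Save">
<input type="button" value="Next" onclick="this.form.submit()">
<a href="#" onclick="showDetails(7)">details</a>
<input type="text" name="q">
"""

finder = ButtonFinder()
finder.feed(SAMPLE)
for tag, name, value, onclick in finder.found:
    print(tag, name, value, onclick)
```

The name and value of a submit button matter because they are sent along with the form, which is how a server can tell which button was pressed.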
I don't know if this helps, but you can try lynx (the command-line browser). It can accept a command file with keystrokes.
Read man lynx, specifically the -cmd_log and -cmd_script options.
And the source code for lynx is available, so you can always go through that to get your own simulation.
If you don't mind, could you please post your parser?
I have implemented a parser myself, but it's too complicated because it doesn't use any package such as HTML::Parser.
Okay, here it is. It was never written with the intention of publishing it, so it isn't pretty. This was hacked together on the fly, so lots of useless code, but I don't want to break it by trying to pretty it up. A lot of this was just cribbed from the documentation page on CPAN.
Code:
#!/usr/bin/perl
#
# jscriptParser.pl
#
# Finds all javascript references in an HTML file
#
# Usage: jscriptParser.pl <filename.html>
#
use strict;
use warnings;
use HTML::Parser ();

my @scripts;                 # src values of external scripts
my $expectingJscript = 0;    # true while inside an inline <script> element

# Called for every start tag; only <script> tags are of interest.
sub start_handler {
    my ( $tag, $attr, $text, $self ) = @_;
    return unless $tag eq 'script';    # HTML::Parser lowercases tag names

    # A <script> tag without a type attribute is treated as javascript.
    my $type = exists $attr->{type} ? $attr->{type} : 'text/javascript';
    if ( $type =~ m{text/javascript}i ) {
        if ( exists $attr->{src} ) {
            push @scripts, $attr->{src};
        }
        else {
            $expectingJscript = 1;
        }
    }
    else {
        $expectingJscript = 0;
    }
    return;
}

# A closing </script> ends any inline javascript.
sub end_handler {
    my ( $tag, $text, $self ) = @_;
    $expectingJscript = 0 if $tag eq 'script';
    return;
}

# Comments are only reported while inside a <script> element.
sub comment_handler {
    my ( $text, $self ) = @_;
    print "Javascript CommentText: $text\n" if $expectingJscript;
    return;
}

# Text inside a <script> element is the inline javascript itself.
sub text_handler {
    my ( $text, $self ) = @_;
    if ($expectingJscript) {
        print "\n\nInline javascript : \n";
        print "===================\n";
        print "$text\n";
    }
    return;
}

my $file = shift or die "Usage: $0 <filename.html>\n";
my $p = HTML::Parser->new( api_version => 3 );
#
# Assign handlers for various HTML element types
#
$p->handler( start   => \&start_handler,   "tagname,attr,text,self" );
$p->handler( end     => \&end_handler,     "tagname,text,self" );
$p->handler( comment => \&comment_handler, "text,self" );
$p->handler( text    => \&text_handler,    "text,self" );
$p->parse_file($file) or die "Cannot parse $file: $!\n";
#
# Dump the list of found jscript references
#
print "\n\nExternal scripts named:\n",
      "===========================\n";
foreach my $scriptSource (@scripts) {
    print $scriptSource, "\n";
}
Just run the script with the filename of an HTML page as an argument.
chrism01's reference to WWW::Mechanize, which I'd completely forgotten about, suggests that this problem is by no means trivial to solve.