Is There a Way in Perl to Locate the X,Y Coordinates of Links on a Web Page?
Distribution: Fedora on the desk / Gentoo in the Racks
Posts: 36
I'm trying to parse an HTML page and output a list of its links with their x,y coordinates. I'm already using getLinks from the DOM object in PHP, as described here: http://www.phpro.org/examples/Get-Links-With-DOM.html ... it works wonders. I trim the list and return only the text description of each link.
From everything I can find and everything I've tried, I can get the x,y coordinates using JavaScript on the client, but this won't be running on the client, so that's no good... and I really don't like JavaScript.
Does anyone know how I could grab the x,y coordinates of these links, maybe in Perl? Any help would be appreciated. Again, I'm trying to keep this server side.
This is only possible with JavaScript, because JavaScript "knows" the actual state of the rendered HTML/CSS, the actual window size, and the other things you need to calculate XY coordinates within a web page.
It is possible to run a virtual browser from within Perl, though AFAIK JavaScript is still necessary.
I forgot what the project was called, but it was made to measure users' click-and-drag behavior in a browser.
And a short reminder: on the server side there are no XY coordinates _yet_. They exist only after the browser has rendered the page, and they all depend on window size, font size, and the type of layout of the web page, not to mention the possibility that user X filters the links...
If you just need the links, at least 10 Perl modules will deliver them to you without ever calculating a single coordinate.
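For illustration, here is a minimal sketch of coordinate-free link extraction. It is written in Python with only the standard library rather than with one of the Perl modules mentioned above, and the names `LinkCollector`, `extract_links` and `SAMPLE` are made up for this example. It returns each link's href and text and never touches layout.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup, only for demonstration.
SAMPLE = (
    '<p><a href="/a">First <b>link</b></a> and '
    '<a href="http://example.com/">second</a></p>'
)

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE):
        print(href, "|", text)
```

This only parses markup. It says nothing about where a link ends up on screen, which is the point made above.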
Only the browser, with its built-in JavaScript interpreter, knows how the HTML is rendered, and browsers will render the page differently depending on many things. Perhaps you can tell us what you really want to do, or why you think you want to do this, so that alternative solutions can be suggested.
--- rod.
Distribution: Fedora on the desk / Gentoo in the Racks
Posts: 36
Original Poster
Actually, I don't need the client browser to generate the page view. I'm using CutyCapt (http://cutycapt.sourceforge.net/), which emulates a Safari browser environment through the use of WebKit. It requires that the server have X running, so in essence it is viewing the page being requested. Additionally, it lets me set a fixed browser size such as 1024x768.
Perhaps I can get the JavaScript to do its thing within the WebKit environment...
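As a rough sketch of that setup, the helper below (Python, with a made-up name `cutycapt_command`) only builds the argument list for a CutyCapt run at a chosen size. It assumes the `--url`, `--out`, `--min-width` and `--min-height` options from the CutyCapt documentation; note that those set a minimum size rather than a strictly fixed one, and the URL and output path are placeholders.

```python
def cutycapt_command(url, out_path, width=1024, height=768):
    """Build the argv list for one CutyCapt capture (nothing is executed)."""
    return [
        "CutyCapt",
        "--url=%s" % url,
        "--out=%s" % out_path,
        "--min-width=%d" % width,    # minimum capture width in pixels
        "--min-height=%d" % height,  # minimum capture height in pixels
    ]


if __name__ == "__main__":
    # Placeholder URL and file name, for demonstration only.
    print(" ".join(cutycapt_command("http://www.example.org/", "page.png")))
```

On a server with X available, the list could be handed to `subprocess.run`; the sketch deliberately stops short of running anything.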
There is no such thing as a fixed XY-coordinate of some link within a website.
This depends entirely on how the browser renders the page. Even if a page has a fixed layout with fixed pixel widths and heights, I can still change the font size in my browser, which will move any text, including links, to different coordinates.
Not to mention that many people don't use 1024x768 as a resolution (me, for example), don't open their browser in full-screen mode (me, for example), and don't use WebKit-based browsers (all Firefox, Opera and IE users, for example).
You would have to grab all these values from within a user's browser environment (which I would consider _highly_ intrusive to my privacy), and it would require that the user _actually_ has JavaScript enabled and doesn't filter content or use NoScript...
And I haven't even started yet with userContent.css manipulations...
If you just want to grab links, use _anything_ capable of handling the DOM. There are at least 10 Perl modules on CPAN that do that, as Sergei suggested.
Okay, I think I can read between the lines. You want to use CutyCapt to grab web pages, save them to image files, and then turn the links into image maps associated with the bitmap images. My recommendation is to dig into the source for CutyCapt and add your requirement as a feature. It sounds like something a lot of users could make use of, and the renderer is the definitive place to acquire the information. Since the tool can store web pages as SVG, and SVG is an XML-formatted text file, you may be able to extract XY coordinate information from that format.
--- rod.
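To make the image-map idea concrete, here is a small Python sketch of the last step only: turning link rectangles into an HTML `<map>`. The function name `build_image_map` and the input tuples are hypothetical, and it assumes the pixel rectangles have already been obtained from the renderer by some means; it does not show how to get them out of CutyCapt or its SVG output.

```python
from html import escape


def build_image_map(name, links):
    """Return an HTML <map> with one rectangular <area> per link.

    `links` is a list of (href, x, y, width, height) tuples in pixels,
    measured from the top-left corner of the screenshot. Obtaining
    those numbers is outside the scope of this sketch.
    """
    lines = ['<map name="%s">' % escape(name, quote=True)]
    for href, x, y, w, h in links:
        # A rect area takes left, top, right, bottom.
        coords = "%d,%d,%d,%d" % (x, y, x + w, y + h)
        lines.append(
            '  <area shape="rect" coords="%s" href="%s">'
            % (coords, escape(href, quote=True))
        )
    lines.append("</map>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical rectangle for a single link, for demonstration only.
    print(build_image_map("page", [("/a?x=1&y=2", 10, 20, 100, 15)]))
```

The resulting map would be attached to the screenshot with `<img src="..." usemap="#page">`, and it is only valid for the exact rendering the screenshot came from.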