My question is this: I translated a historical novel a while back while on a Microsoft OS and hand-coded everything in HTML.
Well, a lot of the links are coded for a Microsoft OS. Now I'm on Linux. Is there a tool out there that will extract all hyperlinks?
This is a large tome, and a tool that extracts the hyperlinks would be heaven-sent.
HTML is a universal language and should be the same on both OSes. If you want to strip all the HTML, try running the pages through a lynx dump:
lynx -dump page.html
Yes, the language is the same, but the hyperlinks aren't. One OS uses / to separate directories and the other uses \. That's why I'm looking for an application that extracts hyperlinks, so I can go ahead and change them.
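For what it's worth, here is a minimal sketch of a link extractor, assuming Python is available and the pages are ordinary hand-written HTML. It uses only the standard library's html.parser. The names LinkExtractor and extract_links, and the sample markup, are made up for illustration and are not taken from the actual pages.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the value of every href and src attribute it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(html_text):
    """Return all href/src values in document order."""
    parser = LinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Hypothetical sample line, not from the real novel
    sample = '<p><a href="C:\\Enriquillo\\ch1.html">One</a> <img src="img\\map.jpg"></p>'
    for link in extract_links(sample):
        print(link)
```

To run it over a real page, read the file into a string and pass it to extract_links. That gives a list of every link to review before changing anything.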
Come on, guys... "Just simply write a tool that does it for you"? "Simply write a program that searches through a text file and replaces the slashes for you"? I mean, really? The guy said he hand-coded it all in HTML. Anyone who hand-codes an entire historical novel probably doesn't know how to compile their own parsing program. And even if he does, let's not assume so. Let's actually try to help him.
I'm confused why your links would not work on Linux. Are you using link locations such as "img\imagefile.jpg" or "c:\www\img\image.jpg"? The second is not needed; you can use the first type of linking, with forward slashes ("img/imagefile.jpg"). I believe these are referred to as relative links: the location is relative to the page that contains the link. Does this help at all?
As for finding a way to replace all of your tags, it depends on what you have written. It's possible to use a find-and-replace function inside most editors, such as KWrite, but that also depends on what GUI you're running on your machine. Want to post some specifics of your problem?
Well, thanks, but you all aren't thinking things through, I think.
It's not just / vs \, it's the directory structure itself. This was written a while ago, and C:\ Enriquillo\Enriquillo etc\ doesn't translate well to /usr/local/apache2.
And wh33t, thank you for allowing not only for my newbieness but also for the possibility of my complete inability to write a tool for replacing one character with another in a body of text.
But I think I should do this the right way, and not depend on the first kind of link vs. relative links. The text in question is a static thing but shouldn't be treated as a static thing.
So, thanks.
Well, you see, your problem is that you used absolute links, meaning that you pointed to the absolute location of the files you were linking to. This is not necessary. Next time, just link relatively to your documents. So what you need to do is load up your favourite text editor (it can be on Windows or Linux) and simply find the parts you need to replace. So if you need to replace <a href="C:\ Enriquillo\Enriquillo\file.html"> with <a href="file.html">, use a "find and replace" function in the text editor.
(Keep in mind you do not need to use /usr/local/apache2. In fact, I think if you tried to replace your links with that, it would not work. You have to use relative linking.)
::Find and Replace Functions::
I know TextPad for Windows does this (not Notepad or WordPad; it's a separate program). So does KWrite, a free text editor that comes with the default installation of KDE, which is a GUI for Linux and could be what you're running. Let me know if this helps at all.
Personally I would still use a global search and replace to aid your efforts - and wh33t, I wasn't suggesting he write something all by himself - if you look at the example I gave, it should work quite well. If all your links are absolute now - i.e. they start with "C:\Enriquillo\Enriquillo\" - then you only need to run another find and replace first to delete "C:\Enriquillo\Enriquillo\".
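The two-step find and replace described above could be sketched roughly like this in Python. This is only an illustration, not the script mentioned earlier in the thread: the prefix is the one quoted in the post, while ATTR, fix_link and relativize are hypothetical names. It only touches text inside href and src attributes, so backslashes in the novel's body text are left alone.

```python
import re

# Prefix quoted in the thread; adjust it to match the real pages (assumption)
OLD_PREFIX = "C:\\Enriquillo\\Enriquillo\\"

# Matches href="..." or src='...' and captures the quote and the value
ATTR = re.compile(r'''(\b(?:href|src)\s*=\s*)(["'])(.*?)\2''',
                  re.IGNORECASE | re.DOTALL)


def fix_link(value, old_prefix=OLD_PREFIX):
    """Drop the old absolute prefix, then turn backslashes into slashes."""
    if value.lower().startswith(old_prefix.lower()):
        value = value[len(old_prefix):]
    return value.replace("\\", "/")


def relativize(html_text, old_prefix=OLD_PREFIX):
    """Rewrite every href/src value in html_text with fix_link."""
    def repl(match):
        prefix, quote, value = match.group(1), match.group(2), match.group(3)
        return prefix + quote + fix_link(value, old_prefix) + quote
    return ATTR.sub(repl, html_text)


if __name__ == "__main__":
    # Hypothetical sample line
    line = '<a href="C:\\Enriquillo\\Enriquillo\\ch2\\page.html">Chapter 2</a>'
    print(relativize(line))
```

Note that the result is relative to the old top-level folder, so it is only right as-is for pages that sit directly in that folder. Work on a copy of the files and spot-check a few pages first.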
This way you can have all links relative to their own location and not to the root of any webserver - this is useful if you want to make the pages available in downloadable archive for offline viewing.
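As a rough illustration of what "relative to their own location" means, the sketch below computes the link a page would need in order to reach another file, given both paths relative to the same top folder. It uses posixpath from the Python standard library so the result always uses forward slashes. relative_link and the example paths are hypothetical.

```python
import posixpath


def relative_link(page_path, target_path):
    """Link text that reaches target_path from the folder holding page_path.

    Both arguments are paths relative to the same top-level folder,
    written with forward slashes.
    """
    page_dir = posixpath.dirname(page_path) or "."
    return posixpath.relpath(target_path, start=page_dir)


if __name__ == "__main__":
    # A page in book/ch1 linking to an image in book/img (made-up layout)
    print(relative_link("book/ch1/page.html", "book/img/map.jpg"))
    # A page linking to a file in its own folder
    print(relative_link("book/index.html", "book/toc.html"))
```

Links built this way keep working wherever the whole folder is moved, which is why they also suit a downloadable archive.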
ergo_sum - if you are unsure whether my script above can help, feel free to post or e-mail me one of your pages and I'll check it and write up a simple instruction set for you. I certainly don't think I would want to edit a whole novel by hand.
Alright, sorry mate. I just understand how frustrating it is to come into these forums, ask a simple question, and end up more confused by the "answers" people give you. I think he should be well on his way now.