LinuxQuestions.org
Linux - Software This forum is for Software issues.
Old 11-07-2003, 12:35 PM   #1
ergo_sum
Member
 
Registered: Aug 2003
Posts: 253

Rep: Reputation: 30
re: writing html on linux and windows


Hello All:

My question is this. I translated a historical novel a while back while on an MS OS and hand-coded everything in HTML.
Well, a lot of the links are coded for a Microsoft OS. Now I'm on Linux. Is there a tool out there that will extract all hyperlinks?

This is a large tome here, and a tool that extracts html would be heaven sent.

Thanks,

ergo_sum
 
Old 11-07-2003, 01:17 PM   #2
david_ross
Moderator
 
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047

Rep: Reputation: 67
HTML is a universal language and should be the same on both OSes. If you want to strip all the HTML, try running the pages through a lynx dump:
lynx -dump page.html
 
Old 11-07-2003, 01:38 PM   #3
ergo_sum
Member
 
Registered: Aug 2003
Posts: 253

Original Poster
Rep: Reputation: 30
Yes, the language is the same but the hyperlinks aren't. One os uses / to delineate a directory and the other uses \. That's why I'm looking for an application that extracts hyperlinks so I can go ahead and change them.

ergo_sum
 
Old 11-07-2003, 02:13 PM   #4
david_ross
Moderator
 
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047

Rep: Reputation: 67
Just write a script or something to do it for you - something like this should work:
Code:
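#!/usr/bin/perl
# Reads HTML on STDIN and writes it to STDOUT with the backslashes in
# href="..." and src="..." values turned into forward slashes.
use strict;
use warnings;

while (my $line = <STDIN>) {
    # [^"]* stops at the closing quote, so two links on one line stay separate.
    $line =~ s/\b(href|src)="([^"]*)"/swap_slashes($1, $2)/gie;
    print $line;
}

# Rebuild the attribute with every backslash in its value replaced by "/".
sub swap_slashes {
    my ($attr, $value) = @_;
    $value =~ tr{\\}{/};
    return qq{$attr="$value"};
}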
 
Old 11-07-2003, 02:27 PM   #5
stickman
Senior Member
 
Registered: Sep 2002
Location: Nashville, TN
Posts: 1,552

Rep: Reputation: 53
If you know that \ is only used in the directory trees and not anywhere else:
perl -pi -e 's^\\^/^g' *
perl -pi -e 's^\\^/^g' filename

Be sure to make a backup before you do a mass search and replace.
 
Old 11-07-2003, 03:32 PM   #6
wh33t
Member
 
Registered: Oct 2003
Location: Canada
Posts: 802

Rep: Reputation: 58
Come on guys... Just simply write a tool that does it for you? Simply write a program that searches through a text file and replaces the /'s for you... I mean really? The guy said he hand coded it all in html... anyone who hand codes an entire historic novel doesn't necessarily know how to compile their own parsing program. And even if he does, let's not assume he does. Let's actually try to help him.

I'm confused why your links would not work for Linux. Are you using link locations such as "img\imagefile.jpg" or "c:\www\img\image.jpg"? The second is not needed and you can use the first type of linking. I believe it's referred to as relative locations. It's relative to the location of the page that contains the link. Does this help at all?

As for finding a way to replace all of your tags... it depends what you have written. It's possible to use a find and replace function inside most editors, such as Kwrite or something, but that also depends on what GUI you're running on your machine. Wanna post some specifics about your problem?
 
Old 11-07-2003, 06:18 PM   #7
ergo_sum
Member
 
Registered: Aug 2003
Posts: 253

Original Poster
Rep: Reputation: 30
Well, thanks, but you all aren't thinking things through, I think.
It's not just / vs \, but it's the directory structure itself. This was written a while ago, and C:\ Enriquillo\Enriquillo etc\ doesn't translate well to /user/local/apache2.

And wh33t, thank you for allowing for not only my newbieness but discerning the possibility of my complete inability to write a tool for replacing one character w/ another in a body of text.
But I think I should do this the right way, and not depend on first links vs. relative links. The text in question is a static thing but shouldn't be considered a static thing.
So, thanks.

Now, what do I do?

ergo_sum
 
Old 11-07-2003, 06:30 PM   #8
wh33t
Member
 
Registered: Oct 2003
Location: Canada
Posts: 802

Rep: Reputation: 58
Well, you see, your problem is that you used absolute links, meaning that you pointed to the absolute location of the files you were linking to. This is not necessary. Instead, just link relatively to your documents. So what you need to do is load up your favourite text editor (it can be in Windows or Linux) and simply find the parts you need to replace. So if you need to replace <a href="C:\ Enriquillo\Enriquillo\file.html"> with <a href="file.html"> then use a "find and replace" function in the text editor.

(Keep in mind you do not need to use /user/local/apache2. In fact, I think if you tried to replace your links with that, it would not work. You have to use relative linking.)

::Find and Replace Functions::
I know textpad (not Notepad or WordPad, it's a separate program) for Windows does this. And so does Kwrite, which is a free text editing program that comes with the default installation of KDE, which is a GUI for Linux and could be what you're running. Let me know if this helps at all.
 
Old 11-07-2003, 06:40 PM   #9
ergo_sum
Member
 
Registered: Aug 2003
Posts: 253

Original Poster
Rep: Reputation: 30
Cool!

Thanks, and yes, it certainly does help.

I'm a newbie but also completely weaned from MS. So it'll probably be either OpenOffice or Star Office.

ergo_sum
 
Old 11-08-2003, 10:34 AM   #10
david_ross
Moderator
 
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047

Rep: Reputation: 67
Personally I would still use a global search and replace to aid your efforts - and wh33t, I wasn't suggesting he write something all by himself - if you look at the example I gave it should work quite well. If all your links are absolute now - i.e. they start with "C:\Enriquillo\Enriquillo\" - then you only need to run another find and replace first to delete "C:\Enriquillo\Enriquillo\".

This way you can have all links relative to their own location and not to the root of any webserver - this is useful if you want to make the pages available in downloadable archive for offline viewing.

ergo_sum - if you are unsure whether my script above can help then feel free to post or e-mail me one of your pages and I'll check and write up a simple instruction set for you. I certainly don't think I would want to edit a whole novel by hand.
 
Old 11-08-2003, 12:44 PM   #11
wh33t
Member
 
Registered: Oct 2003
Location: Canada
Posts: 802

Rep: Reputation: 58
Alright, sorry mate, I just understand how frustrating it is to come into these forums and ask a simple question and definitely get more confused by the "answers" people give you. I think he should be well on his way now.
 
  

