LinuxQuestions.org

tigerflag 06-25-2009 04:35 PM

How to Change Multiple Webpages at Once?
 
I write websites in HTML using Kwrite. Some of the sites have more than 300 pages. I need a program or a way to make a change one time and have it propagate over all the pages at once. Example: adding a new link to the navigation panel, changing the copyright date at the bottom of each page, etc.

There's a program for Windows called MultiUpdate 3.0. Is there an equivalent app for Linux? Or a command I can use?

I'm using PCLinuxOS, which uses RPMs via Synaptic. If the solution involves the command line, could you please give me a step-by-step howto, since I know almost nothing about the command line.

(I've tried to learn command line but it's just beyond me. I can hardly figure out how to use software, either. Thus the Kwrite.)

TIA!

shane25119 06-25-2009 04:41 PM

When I used to maintain websites I used CSS and server side includes to take care of sitewide changes. With that setup, the links menu is just one document, call it links.shtml, which each individual page includes. Very convenient, but it depends on the server you are hosting on, and it works best when implemented before the site gets to 300 pages.

tigerflag 06-25-2009 06:29 PM

Thanks, Shane, but that's way above my skill level; I have no clue how to do it. The sites are on a shared hosting server and I don't know anything about shtml.

I just know that a friend who uses Windows has a little program that lets him make a change once, and it instantly changes on all the pages on his website. I just want something simple like that, if possible.

pljvaldez 06-25-2009 07:11 PM

To be perfectly honest, the best thing you could do for yourself is learn to use HTML, CSS, and server side includes. Kwrite probably doesn't generate clean HTML code to the W3C standards. Thus it may not display correctly in all browsers.

Short of that, you probably need a script utilizing find, grep, and awk.

Searching through my package directory on Debian I came across regexxer or rpl which may do the trick. Or maybe bluefish has a find and replace function.
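As a rough idea of what the find and grep part of such a script might look like, here is a minimal sketch that only lists the pages containing a given string and changes nothing. The directory layout, file names, and the search text are invented for the demo.

```shell
#!/bin/bash
# Sketch only: directory layout and page contents are invented for the demo.
site_dir=$(mktemp -d)
mkdir "$site_dir/products"
echo 'copyright 2009' > "$site_dir/index.html"
echo 'copyright 2009' > "$site_dir/products/item.html"
echo 'no footer here' > "$site_dir/about.html"

# find walks the site (subdirectories included); grep -l prints the name of
# each file that contains the text. Nothing is modified here: the list only
# shows which pages a later sitewide change would touch.
matches=$(find "$site_dir" -type f -name '*.html' -exec grep -l 'copyright 2009' {} +)

echo "$matches"
echo "pages that would change: $(echo "$matches" | wc -l)"
```

Running a listing like this first makes it easier to spot pages that are missing the text before any replacement is attempted.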

Wim Sturkenboom 06-25-2009 07:32 PM

Maybe you should start redesigning (rewriting) your websites so they use 'reusable' modules, especially when your websites run to tens or hundreds of pages. And yes, that probably means adding some new skills or additional knowledge about html (html iframes come to mind).

With that said, you can probably use sed to change the copyright and possibly to add menu entries or links. To give you the idea:
Code:

wim@desktop1:~$ cat abc.txt
abc
copyright 2009
wim@desktop1:~$ sed -e s/2009/2010/ abc.txt
abc
copyright 2010
wim@desktop1:~$

Please note that the above code will replace ANY 2009 in the document with 2010 (the first one on each line, to be exact), so you have to fine-tune it. sed prints the result and leaves abc.txt untouched; you can redirect the output to another file.
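One way to fine-tune it is to match the year only where it follows the word "copyright", and to let sed edit every page in place while keeping a backup. A minimal sketch, assuming a sed that supports `-i` (GNU sed does) and a footer that reads `copyright 2009`; the temporary directory and page contents are made up for illustration.

```shell
#!/bin/bash
# Sketch only: the directory and page contents are invented for the demo.
demo_dir=$(mktemp -d)

# Two sample pages; page1 also mentions 2009 outside the copyright line.
printf '%s\n' 'Founded in 2009' 'copyright 2009' > "$demo_dir/page1.html"
printf '%s\n' 'abc' 'copyright 2009' > "$demo_dir/page2.html"

# Match "copyright 2009" rather than any 2009, so other years stay untouched.
# -i.bck edits each file in place and saves the original as FILE.bck.
sed -i.bck -e 's/copyright 2009/copyright 2010/' "$demo_dir"/*.html

cat "$demo_dir/page1.html"
```

If something goes wrong, the `.bck` copies still hold the original pages.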

Quote:

I just know that a friend who uses Windows has a little program that lets him make a change once, and it instantly changes on all the pages on his website. I just want something simple like that, if possible.
Have you tried his program (on his computer) with one of your websites? Just curious, because his websites might be designed completely differently from yours.

tigerflag 06-25-2009 08:06 PM

All the .html pages use the same navigation column. If I wanted to add a link in that column between two other links, could I do it using sed? And how could I make it a sitewide change?

I do know that "MultiUpdate" works on my type of site. The pages are just standard HTML and CSS, nothing proprietary. I actually made my friend a simple 10 page site, using the same basic CSS template I used for my other sites, with some style modifications. He uses MultiUpdate to make sitewide changes, like adding an image to the same place on each page, or changing the text of a hyperlink. I know it's just a front end for some sort of batch command.

I would love to learn more, have a better toolset. Learning new stuff is very hard and slow for me. Most instructions are like gibberish to my brain, and I have trouble retaining things. Can't tell you how many hours I've tried to learn Vi or command line and gotten nowhere. Not an excuse, just an explanation.

*Sigh* I'm so much better with hardware than software.

Thanks.

Wim Sturkenboom 06-26-2009 12:40 AM

Just because I wanted to have some fun !!


Code:

<!-- navigation column start -->
<a href="index.html">Home</a>
<a href="page1.html">Page1</a>
<a href="page2.html">Page2</a>
<!-- navigation column end -->

Assuming that your html looks like the above, the script below can insert an additional hyperlink:
Code:

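#!/bin/bash

echo "insert menu item"
echo "================"

# check if there are three arguments
if [ $# -ne 3 ]; then
        echo "requires 3 arguments"
        echo "filename; use double quotes when using wildcards (e.g. \"*.html\")"
        echo "last part of existing hyperlink between double quotes"
        echo "new hyperlink between double quotes; will be placed immediately after the existing hyperlink"
        exit 1
fi

# $1 is left unquoted on purpose so the shell expands the wildcard here
# (the pattern itself must not contain spaces)
for f in $1; do
        # skip a pattern that matched no file
        [ -f "$f" ] || continue
        echo "$f"
        # create backup
        bckfile="$f.bck"
        cp "$f" "$bckfile"
        # combine arg2 and arg3 to create new argument
        # (GNU sed turns \n in the replacement into a newline)
        newentry="$2\n$3"
        # replace arg2 by new argument; arg2 and arg3 are used as sed
        # expressions, so slashes in them must be escaped as \/
        sed -e 's/'"$2"'/'"$newentry"'/' <"$bckfile" >"$f"
done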

What the code basically does:
- it finds the second argument and replaces it with that same argument followed by a newline and the third argument (so it is not limited to hyperlinks)
- it does this for the file (wildcards allowed) that you specified as the first argument
- it creates a backup of each file as well

In the example below, I check for the last part of a hyperlink (Page1</a>) and append a newline and a new hyperlink for Page3
Code:

wim@btd-techweb01:~/progs/lq735654$ cat page1.html
<!-- navigation column start -->
<a href="index.html">Home</a>
<a href="page1.html">Page1</a>
<a href="page2.html">Page2</a>
<!-- navigation column end -->
wim@btd-techweb01:~/progs/lq735654$ ./insert_menu.sh "*.html" "Page1<\/a>" "<a href=\"page3.html\">Page3<\/a>"
insert menu item
================
page1.html
page2.html
wim@btd-techweb01:~/progs/lq735654$ cat page1.html
<!-- navigation column start -->
<a href="index.html">Home</a>
<a href="page1.html">Page1</a>
<a href="page3.html">Page3</a>
<a href="page2.html">Page2</a>
<!-- navigation column end -->
wim@btd-techweb01:~/progs/lq735654$

Please note that you have to fine-tune it for your situation. This applies to the script (e.g. you may not want the newline) as well as to the arguments (they depend on the content of your html file).

Also be aware that it will do this on every line where it finds the second argument, so if that occurs in other places in the page, it will be replaced there as well.

PS:
- the code has only been tested with *.html for the file specification (argument 1)
- you can modify it yourself so it takes an additional argument indicating replace or append; in that case you can use the same script to replace your copyright stuff or remove a menu entry
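One possible way to keep the replacement from touching matches elsewhere in the page is to restrict it to the navigation block with a sed address range. This is a sketch under assumptions, not a drop-in replacement for the script: it assumes the pages really contain the two marker comments from the example html, and it relies on GNU sed for `\n` in the replacement. The extra paragraph at the bottom of the sample page is invented to show that it is left alone.

```shell
#!/bin/bash
# Sketch only: the sample page is invented; the marker comments are the ones
# from the example html above.
demo_dir=$(mktemp -d)
cat > "$demo_dir/page1.html" <<'EOF'
<!-- navigation column start -->
<a href="index.html">Home</a>
<a href="page1.html">Page1</a>
<a href="page2.html">Page2</a>
<!-- navigation column end -->
<p>See <a href="page1.html">Page1</a> for details.</p>
EOF

# The address range /start/,/end/ limits the s command to the lines between
# the two markers. & stands for the matched text, and \n (GNU sed) starts a
# new line, so the new link lands right after the Page1 link.
sed -i.bck \
    -e '/<!-- navigation column start -->/,/<!-- navigation column end -->/ s/Page1<\/a>/&\n<a href="page3.html">Page3<\/a>/' \
    "$demo_dir/page1.html"

cat "$demo_dir/page1.html"
```

The link inside the paragraph after the end marker also ends in `Page1</a>`, but it falls outside the range and is not changed.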

tigerflag 06-26-2009 12:59 AM

Thanks for putting so much time into this, Wim. I wish I understood it. It's just too advanced for me.

shane25119 06-26-2009 03:40 PM

Quote:

It's just too advanced for me.
That is the wrong mentality, tigerflag. It may seem intimidating, but let me summarize server side includes and shtml in a nutshell: you write an html document that includes links and only links. In each regular page you include a line that says 'include those links!' and it happens. If you need to make changes to all your links, you change that one link document.
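To make the nutshell version concrete, here is a small sketch. The file names and contents are invented, and the sed command at the end is only a rough local imitation of the expansion; on a real site the web server (for example Apache with includes enabled) does that step itself. The sketch assumes GNU sed.

```shell
#!/bin/bash
# Sketch only: file names and contents are invented for the demo.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# The one shared document that holds links and only links.
cat > links.shtml <<'EOF'
<a href="index.html">Home</a>
<a href="page1.html">Page1</a>
EOF

# A regular page: the include directive stands in for the menu.
cat > page1.shtml <<'EOF'
<h1>Page1</h1>
<!--#include virtual="links.shtml" -->
<p>Content</p>
EOF

# Rough imitation of what the server sends out: read links.shtml in at the
# directive line (r), then drop the directive line itself (d).
sed -e '/<!--#include virtual="links.shtml" -->/{r links.shtml' -e 'd' -e '}' \
    page1.shtml > served.html

cat served.html
```

Adding a link to `links.shtml` changes the menu on every page that carries the directive, which is the whole point of the approach.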

If you have ever used frames then you know how to do it.

Here is a tutorial on SSI
http://www.georgedillon.com/web/ssi.shtml

and on CSS

http://www.davesite.com/webstation/css/

I find it odd that you ask for our help but then tell us this is all too advanced.

tigerflag 06-26-2009 08:41 PM

I appreciate your time and help very much; I don't mean to sound ungrateful. But with all due respect, what I asked for was if anyone knew of a "simple" program or command that can do batch changes to my pages, not advice for changing my entire website. It seems like I asked for a recipe and I'm getting advice on how to remodel my kitchen.

I think I may have found a program: Kfilereplace. I just downloaded it and will see if I can figure out how to use it.

I'm not a full-time programmer. I AM a very slow learner; I have a whopping huge learning disorder. Doesn't mean I'm stupid or lazy. It just means that what seems trivial to you guys is like climbing Mount Everest for me, and I don't have time to climb Mount Everest right now. 'Nuff about that.

I read the tutorial on server side includes and actually understand it! It sounds great, but I'm not sure if search engine bots would be able to drill down into my site if the links weren't on the pages. If they can, I'll try it with my next site, but for the two sites I have now, all I want is a simple way to make changes to the pages I have already.

These are business sites. They have to pass vulnerability scans by Scan Alert in order to be able to accept credit cards. If I use techniques I don't fully understand and the site fails a scan or gets hacked, I could lose the ability to do business. So I've deliberately kept things as simple and secure as possible, and I'd like to keep it that way.

I have never used frames. I've learned how to write tableless sites in HTML and CSS that are standards-compliant, optimized for search and fully accessible.

I totally admire all of you who know so much and give so generously of your time. I wish I had the time right now to learn it all.

Thanks again for your help.

shane25119 06-27-2009 11:34 AM

Tigerflag-

If you use SSI, the server inserts the included code before it sends the page, so anyone who looks at your source (bots included) sees the complete html. No worries about bots drilling in, as you put it.

Let us know if that program you downloaded does the trick.

tigerflag 06-28-2009 12:23 PM

Thanks Shane.

Kfilereplace works GREAT! Here's what I did:

Open Kfilereplace > New Session.

Opened the directory where the HTML files were (all 267 of them).

In the Search field I pasted an unordered list (my nav menu). In the Replace field I pasted the same list, with an additional line for a new item.

Told it to only scan for *.html. Under Options I marked "Case sensitive" and "Allow regular expressions".

Clicked "Search Now". It only took a few seconds and every page was changed. Perfect.

I'm studying SSI. For my sites there is a security risk unless "IncludesNOEXEC" is used. From apache.org:

IncludesNOEXEC
Server-side includes are permitted, but the #exec command and #exec CGI are disabled. It is still possible to #include virtual CGI scripts from ScriptAliase'd directories.

Until I learn for certain how Scan Alert views SSI, I will stick to straight HTML and use Kfilereplace. Now I can easily change the pages to use SSI if I decide to later on.

Thanks everybody for all your help.


All times are GMT -5.