Old 03-26-2015, 09:46 PM   #1
Darrell22
Member
 
Registered: Nov 2003
Posts: 83

Rep: Reputation: 15
How to replace single word in a file, with a series of lines?


Dear Experts,


I have a number of HTML files.
Near the end of each file is the closing tag </body>.


I'd like to replace this tag with a series of lines, i.e.:


<!-- Start of StatCounter Code for Default Guide -->
<script type="text/javascript">
var sc_project=12345678;
var sc_invisible=1;
var sc_security="123c45db";
var scJsHost = (("https:" == document.location.protocol) ?
"https://secure." : "http://www.");
document.write("<sc"+"ript type='text/javascript' src='" +
scJsHost+
"statcounter.com/counter/counter.js'></"+"script>");
</script>
<noscript><div class="statcounter"><a title="web analytics"
href="http://statcounter.com/" target="_blank"><img
class="statcounter"
src="http://c.statcounter.com/01234567/0/890c12db/0/"
alt="web analytics"></a></div></noscript>
<!-- End of StatCounter Code for Default Guide -->


</body>



I've used awk and sed before, but only for short phrases on a single line.

What would be the best way to modify each file?


Thanks a lot!
 
Old 03-26-2015, 09:49 PM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
What have you tried? Is sed or awk giving you an error?
 
Old 03-27-2015, 08:14 AM   #3
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3939
"A series of lines" is simply a string of characters with "newline sequences" in them ... which on any given system could be "CR," "LF," "CR+LF," or "LF+CR." (CR = carriage-return; LF = linefeed.) All of which can be represented by backslash-escape sequences.

Exactly what have you tried?
 
Old 03-27-2015, 12:20 PM   #4
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,882
Blog Entries: 13

Rep: Reputation: 4930
I'm not really an awk person, but I have used sed, so give that a try. Basically you issue sed, plus a command, plus the file name. The output normally goes to stdout, which is the terminal you're typing in. That's also a pretty good way to test and verify that things work the way you want. You can then recall the correct command and add a redirect, "> new-filename". It's a good idea not to overwrite the original file, even though you can. Make a new version, or put a same-named output file into a subdirectory, so that a mistake can't destroy your original before you're sure you've got it right. An example sed command might be something like the one below, where I have a file abc.txt containing the text 123 and I wish to change that text to 456:
Code:
~/testcode$ cat abc.txt
123
~/testcode$ sed 's/123/456/g' abc.txt
456
So try writing the command yourself, and if you have trouble, post what you've tried and describe where you're stuck. Please also review the notes on how to use code tags.
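
The preview-then-redirect workflow described above can be sketched as follows. This is a minimal illustration, assuming a scratch directory created with mktemp; the `workdir` variable and the `out` subdirectory are made-up names, not part of the original post.

```shell
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)
printf '123\n' > "$workdir/abc.txt"

# Step 1: preview the result on stdout; abc.txt itself is not modified.
sed 's/123/456/g' "$workdir/abc.txt"

# Step 2: once the preview looks right, write the result to a subdirectory
# under the same file name, leaving the original untouched.
mkdir -p "$workdir/out"
sed 's/123/456/g' "$workdir/abc.txt" > "$workdir/out/abc.txt"
```

Comparing the two files afterwards (for example with diff) is a cheap way to confirm that only the intended text changed.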
 
Old 03-27-2015, 12:44 PM   #5
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
I have a good (tested) solution.
Show us what you already have and I will post my code.

Daniel B. Martin
 
Old 03-29-2015, 11:07 AM   #6
Darrell22
Member
 
Registered: Nov 2003
Posts: 83

Original Poster
Rep: Reputation: 15
OK, so the only example I got was sed on a single line,
which I was already familiar with.


Single line sed is different from a nix script,
which I thought some might recommend.



-----

Here's what I tried in cygwin on Windows.

As I feared, the syntax of the javascript code caused issues (below).
Which is why I posted the code.


Any ideas?


Thanks



--------------------



cat index.html | sed s/</body>/</body> \n
<!-- Start of StatCounter Code for Default Guide --> \n
<script type="text/javascript"> \n
var sc_project=12345678; \n
var sc_invisible=1; \n
var sc_security="123c45db"; \n
var scJsHost = (("https:" == document.location.protocol) ? \n
"https://secure." : "http://www."); \n
document.write("<sc"+"ript type='text/javascript' src='" + \n
scJsHost+ \n
"statcounter.com/counter/counter.js'></"+"script>"); \n
</script> \n
<noscript><div class="statcounter"><a title="web analytics" \n
href="http://statcounter.com/" target="_blank"><img \n
class="statcounter" \n
src="http://c.statcounter.com/01234567/0/890c12db/0/" \n
alt="web analytics"></a></div></noscript> \n
<!-- End of StatCounter Code for Default Guide --> \n
</body> \n
/g


Lots of errors, starting with:
$ cat index.html | sed s/</body>/</body> \n
bash: /body: No such file or directory


$ <!-- Start of StatCounter Code for Default Guide --> \n
bash: !--: event not found

--------------------


cat index.html | sed s/</body>/</body> \
<!-- Start of StatCounter Code for Default Guide --> \
<script type="text/javascript"> \
var sc_project=12345678; \
var sc_invisible=1; \
var sc_security="123c45db"; \
var scJsHost = (("https:" == document.location.protocol) ? \
"https://secure." : "http://www."); \
document.write("<sc"+"ript type='text/javascript' src='" + \
scJsHost+ \
"statcounter.com/counter/counter.js'></"+"script>"); \
</script> \
<noscript><div class="statcounter"><a title="web analytics" \
href="http://statcounter.com/" target="_blank"><img \
class="statcounter" \
src="http://c.statcounter.com/01234567/0/890c12db/0/" \
alt="web analytics"></a></div></noscript> \
<!-- End of StatCounter Code for Default Guide --> \
</body> \
/g


- lots of errors, starting with:
> var scJsHost = (("https:" == document.location.protocol) ? \
bash: syntax error near unexpected token `('


--------------------



sed s/</body>/</body> \
<!-- Start of StatCounter Code for Default Guide --> \
<script type="text/javascript"> \
var sc_project=12345678; \
var sc_invisible=1; \
var sc_security="123c45db"; \
var scJsHost = (("https:" == document.location.protocol) ? \
"https://secure." : "http://www."); \
document.write("<sc"+"ript type='text/javascript' src='" + \
scJsHost+ \
"statcounter.com/counter/counter.js'></"+"script>"); \
</script> \
<noscript><div class="statcounter"><a title="web analytics" \
href="http://statcounter.com/" target="_blank"><img \
class="statcounter" \
src="http://c.statcounter.com/01234567/0/890c12db/0/" \
alt="web analytics"></a></div></noscript> \
<!-- End of StatCounter Code for Default Guide --> \
</body> \
/g index.html


- lots of errors, starting with:

$ sed s/</body>/</body> \
> <!-- Start of StatCounter Code for Default Guide --> \
bash: !--: event not found
> <script type="text/javascript"> \
bash: syntax error near unexpected token `<'
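
The errors above come from bash rather than from sed. Because the sed script is unquoted, bash treats `<` and `>` as redirections (hence "/body: No such file or directory"), `!` as history expansion in an interactive shell, and `(` as shell syntax. A minimal sketch of the usual remedy is to wrap the whole sed script in single quotes so that it reaches sed as one argument. The sample text and replacement below are made up for illustration.

```shell
# Single quotes stop the shell from interpreting <, >, !, ( and ".
# The slash in </body> is escaped because / also delimits the s command.
result=$(printf '%s\n' 'text </body>' |
  sed 's/<\/body>/<!-- note --> (("a" == b) ? "c" : "d")/')
printf '%s\n' "$result"
```

An apostrophe inside the text would still end the quoted string early, so it has to be written as `'\''` or the text has to be kept in a separate file.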
 
Old 03-29-2015, 11:19 AM   #7
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
With this InFile ...
Code:
This is the way the world ends
This is the way the world ends
This is the way the world ends
Not with a bang but a whimper.
... and this Insert file ...
Code:
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
... this code ...
Code:
# Join the insert lines with ~ as a stand-in for newline (the text must not contain ~).
BashVar="~$(paste -sd~ "$Insert")~"
sed "s/\bway\b/$BashVar/g" "$InFile" | tr '~' '\n' > "$OutFile"
... produced this OutFile ...
Code:
This is the 
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
 the world ends
This is the 
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
 the world ends
This is the 
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
 the world ends
Not with a bang but a whimper.
Daniel B. Martin
 
Old 03-29-2015, 11:20 AM   #8
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
With this InFile ...
Code:
This is the way the world ends
This is the way the world ends
This is the way the world ends
Not with a bang but a whimper.
... and this Insert file ...
Code:
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
... this code ...
Code:
awk 'BEGIN{InsVar="\n"}
  {if (FNR==NR) InsVar=InsVar $0 "\n"
   else {gsub(/way/,InsVar); print}}' "$Insert" "$InFile" > "$OutFile"
... produced this OutFile ...
Code:
This is the 
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
 the world ends
This is the 
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
 the world ends
This is the 
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
 the world ends
Not with a bang but a whimper.
Daniel B. Martin

Last edited by danielbmartin; 03-29-2015 at 07:39 PM. Reason: Tightened code
 
Old 03-29-2015, 07:31 PM   #9
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
May I suggest you simply change the delimiters around your search and replace:
Code:
# you are using the default
sed 's/find/replace/' file

# but your data has '/' in it so you could use
sed 's#find#replace#' file
Obviously you will need to pick a delimiter that does not appear anywhere in your data, and then you should be good to go.
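
A small sketch of the delimiter swap, using made-up sample text. The `|` delimiter is only an example; sed accepts any character other than backslash or newline, and `|` is picked here because the sample contains both `/` and `#`.

```shell
# '|' delimits the s command, so neither '/' nor '#' in the data needs escaping.
out=$(printf '%s\n' 'see http://example.com/#top </body>' |
  sed 's|</body>|<!-- counter -->|')
printf '%s\n' "$out"
```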
 
Old 03-30-2015, 12:24 AM   #10
Darrell22
Member
 
Registered: Nov 2003
Posts: 83

Original Poster
Rep: Reputation: 15
Thanks for the examples.

I'm using Cygwin on Windows.
$ awk --version
GNU Awk 3.1.5

$ sed --version
GNU sed version 4.1.5

Upgrading is not an option.



I tried these examples, but still got errors.



Any other ideas, such as scripts?

It doesn't need to be a superfast one line wonder.

But it should be reliable with the HTML code, and readable.


Thanks again!

------------


awk 'BEGIN{InsVar="\n"}
{if (FNR==NR) InsVar=InsVar $0 "\n"
else {gsub(/way/,InsVar); print}}' $Insert $InFile >$OutFile

---

awk 'BEGIN{InsVar="\n"}
{if (FNR==NR) InsVar=InsVar $0 "\n"
else {gsub(/</body>/,InsVar); print}}' counter_code.txt Pic20_02.html


awk: cmd. line:2: else {gsub(/</body>/,InsVar); print}}
awk: cmd. line:2: ^ unterminated regexp
awk: cmd. line:3: else {gsub(/</body>/,InsVar); print}}
awk: cmd. line:3: ^ unexpected newline or end of string


---


awk 'BEGIN{InsVar="\n"}
{if (FNR==NR) InsVar=InsVar $0 "\n"
else {gsub(/</body>/,InsVar); print}}' $counter_code.txt $Pic20_02.html

awk: cmd. line:2: else {gsub(/</body>/,InsVar); print}}
awk: cmd. line:2: ^ unterminated regexp
awk: cmd. line:3: else {gsub(/</body>/,InsVar); print}}
awk: cmd. line:3: ^ unexpected newline or end of string




------------



BashVar="~"$(paste -sd~ $Insert)"~"
sed "s/\bway\b/$BashVar/g" $InFile |tr '~' '\n'


---

BashVar="~"$(paste -sd~ counter_code.txt )"~"
sed "s/\</body>\b/$BashVar/g" Pic20_02.html |tr '~' '\n'

$ sed "s/\</body>\b/$BashVar/g" Pic20_02.html |tr '~' '\n'
sed: -e expression #1, char 14: unknown option to `s'


---


BashVar="~"$(paste -sd~ $counter_code.txt )"~"
sed "s/\</body>\b/$BashVar/g" $Pic20_02.html |tr '~' '\n'

paste: .txt: No such file or directory

$ sed "s/\</body>\b/$BashVar/g" $Pic20_02.html |tr '~' '\n'
sed: -e expression #1, char 14: unknown option to `s'



---------
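
Two separate things appear to go wrong in the attempts above. First, the unescaped `/` in `</body>` ends the awk regex (and the sed pattern) early. Second, in the earlier examples `$Insert` and `$InFile` were shell variables holding file names, so writing `$counter_code.txt` asks the shell for a variable named `counter_code`, which is unset, leaving just `.txt`. A minimal sketch with both points addressed follows; the file contents are made up, and the scratch directory is an assumption for illustration.

```shell
workdir=$(mktemp -d)
Insert="$workdir/counter_code.txt"   # shell variable holding a file name
InFile="$workdir/Pic20_02.html"
printf '%s\n' 'line one' 'line two' > "$Insert"
printf '%s\n' '<p>hello</p>' '</body>' > "$InFile"

# The slash in the pattern is escaped so that awk does not end the regex early.
result=$(awk 'BEGIN{InsVar="\n"}
  {if (FNR==NR) InsVar=InsVar $0 "\n"
   else {gsub(/<\/body>/,InsVar); print}}' "$Insert" "$InFile")
printf '%s\n' "$result"
```

Plain file names such as `counter_code.txt` can also be passed directly, without any `$` in front.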
 
Old 03-30-2015, 01:04 AM   #11
propofol
Member
 
Registered: Nov 2007
Location: Seattle
Distribution: Debian Wheezy & Jessie; Ubuntu
Posts: 334

Rep: Reputation: 60
Grail gave you good advice, you just need to follow it:
Code:
# but your data has '/' in it so you could use
sed 's#find#replace#' file
 
Old 03-30-2015, 06:00 AM   #12
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,882
Blog Entries: 13

Rep: Reputation: 4930
Quote:
Originally Posted by propofol
Grail gave you good advice, you just need to follow it:
Code:
# but your data has '/' in it so you could use
sed 's#find#replace#' file
I concur with that recommendation too.
Quote:
Originally Posted by Darrell22
Any other ideas, such as scripts?
The issue here is that a script would use the same commands you'd use on the command line, so if you can't get results that work for you on the command line, you pretty much won't be able to get them from a script either. We understand that changing your awk and sed may not be an option; however, your versions aren't that old. Why don't you try what was recommended, and please post the attempt and the results within [code][/code] tags.
 
Old 03-30-2015, 09:18 AM   #13
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
With this InFile ...
Code:
Splish, splash
I was takin' a bath
Long about a Saturday night... yeah!
Rub-a-dub
Just relaxin' in the tub
</body>
Thinkin' everything was all right.
... and this Insert file ...
Code:
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
... this code ...
Code:
# Join the insert lines with ~ as a stand-in for newline (the text must not contain ~ or #).
BashVar="~$(paste -sd~ "$Insert")~"
sed "s#</body>#$BashVar#g" "$InFile" | tr '~' '\n' > "$OutFile"
... produced this OutFile ...
Code:
Splish, splash
I was takin' a bath
Long about a Saturday night... yeah!
Rub-a-dub
Just relaxin' in the tub

Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.

Thinkin' everything was all right.
Daniel B. Martin

---------- Post added 03-30-15 at 10:19 AM ----------

With this InFile ...
Code:
Splish, splash
I was takin' a bath
Long about a Saturday night... yeah!
Rub-a-dub
Just relaxin' in the tub
</body>
Thinkin' everything was all right.
... and this Insert file ...
Code:
Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.
... this code ...
Code:
awk 'BEGIN{InsVar="\n"}
  {if (FNR==NR) InsVar=InsVar $0 "\n"
   else {gsub(/<\/body>/,InsVar); print}}' "$Insert" "$InFile" > "$OutFile"
... produced this OutFile ...
Code:
Splish, splash
I was takin' a bath
Long about a Saturday night... yeah!
Rub-a-dub
Just relaxin' in the tub

Insert line 1, the first of 3.
Insert line 2, the middle child.
Insert line 3, the final line.

Thinkin' everything was all right.
Daniel B. Martin
 
Old 03-30-2015, 01:49 PM   #14
Darrell22
Member
 
Registered: Nov 2003
Posts: 83

Original Poster
Rep: Reputation: 15
Solution

Thanks all for your help.


Here is what finally worked, for any other poor soul who has to fight with nix or nix scripts.



# Loop over the files with a glob instead of parsing the output of ls.
for file in *.html
do
var1="l"
# echo $file
file2="$file$var1"   # temporary output name: the original name plus "l"
echo "$file2"


# Every apostrophe inside the single-quoted sed script is written as '\''
# so that the shell hands it to sed unchanged.
sed 's#</body>#<!-- Start of StatCounter Code for Default Guide --> \
<script type="text/javascript"> \
var sc_project=12345678; \
var sc_invisible=1; \
var sc_security="123c45db"; \
var scJsHost = (("https:" == document.location.protocol) ? \
"https://secure." : "http://www."); \
document.write("<sc"+"ript type='\''text/javascript'\'' src='\''" + \
scJsHost+ \
"statcounter.com/counter/counter.js'\''></"+"script>"); \
</script> \
<noscript><div class="statcounter"><a title="web analytics" \
href="http://statcounter.com/" target="_blank"><img \
class="statcounter" \
src="http://c.statcounter.com/01234567/0/890c12db/0/" \
alt="web analytics"></a></div></noscript> \
<!-- End of StatCounter Code for Default Guide --> \
\
</body> \
#' "$file" > "$file2"

mv "$file2" "$file"


done



----


The issue with this script is that it needs to be edited each time it's used.

You can't just refer to another filename.
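
One way around that limitation is to keep the text in its own file and pass the file name as an argument. The following is only a sketch: `insert_before_body` is a hypothetical helper name, the sample files are made up, and it assumes the snippet file is not empty (an empty first file would confuse the FNR == NR test).

```shell
# Hypothetical helper: print FILE with the contents of SNIPPET inserted
# before every line that contains </body>.
insert_before_body() {
  snippet=$1
  file=$2
  awk 'FNR == NR { ins = ins $0 "\n"; next }   # first file: collect the snippet
       /<\/body>/ { printf "%s", ins }          # second file: emit the snippet first
       { print }' "$snippet" "$file"
}

# Demonstration on scratch files.
workdir=$(mktemp -d)
printf '%s\n' '<!-- counter -->' '<script>var n=1;</script>' > "$workdir/snippet.txt"
printf '%s\n' '<html><body>' '<p>hi</p>' '</body>' '</html>' > "$workdir/page.html"
insert_before_body "$workdir/snippet.txt" "$workdir/page.html" > "$workdir/page.new"
cat "$workdir/page.new"
```

Because the snippet is read as data rather than typed into the command, its quotes, slashes and parentheses need no escaping.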


----


I don't do nix scripting much. For such a simple task,
there were a number of gotchas causing me grief.



What is the line continuation character?
LF? CR? \N, \n, \

You can't have spaces around the = sign.

Concatenation is weird.

Declaring versus using variables.


So different, compared to most other programming languages I've used.
Only in nix.
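
The gotchas listed above can be illustrated with a short sketch; the variable names simply mirror those in the script, and nothing here is specific to Cygwin.

```shell
# Assignment: no spaces around '='. Writing "file = index.html" would
# instead try to run a command called "file".
file="index.html"

# Using a variable: prefix the name with '$'. Concatenation is just writing
# the pieces next to each other; braces mark where the variable name ends.
file2="${file}l"
echo "$file2"

# Line continuation: a backslash as the very last character of the line.
echo one \
     two

# "\n" is not a continuation. It only becomes a newline where a program
# interprets it, for example in a printf format string.
printf 'first\nsecond\n'
```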
 
Old 03-30-2015, 07:29 PM   #15
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
Here is a slightly cleaner alternative:

1. Place all your inserted text into a separate file:
Code:
$ cat to_insert
<!-- Start of StatCounter Code for Default Guide -->
<script type="text/javascript">
var sc_project=12345678;
var sc_invisible=1;
var sc_security="123c45db";
var scJsHost = (("https:" == document.location.protocol) ?
"https://secure." : "http://www.");
document.write("<sc"+"ript type='text/javascript' src='" +
scJsHost+
"statcounter.com/counter/counter.js'></"+"script>");
</script>
<noscript><div class="statcounter"><a title="web analytics"
href="http://statcounter.com/" target="_blank"><img
class="statcounter"
src="http://c.statcounter.com/01234567/0/890c12db/0/"
alt="web analytics"></a></div></noscript>
<!-- End of StatCounter Code for Default Guide -->

</body>
All I have done is remove the additional line escapes, but you could of course format it so the final code looks cleaner.

2. The actual bash code can be cleaned up and simplified to:
Code:
for file in *.html
do
  sed -i '/<\/body>/{
    r to_insert
    d
    }' "$file"
done
As you can see, this is much simpler, and the -i option changes the file in place instead of your having to write to a temporary file and mv it back.
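
For a cautious first run, the same loop can be tried in a scratch directory with a backup suffix on -i. This is a sketch assuming GNU sed, where `-i.bak` keeps the original as a `.bak` copy; the sample file contents are made up.

```shell
workdir=$(mktemp -d)
printf '%s\n' '<!-- counter -->' '</body>' > "$workdir/to_insert"
printf '%s\n' '<html><body>' '<p>hi</p>' '</body>' '</html>' > "$workdir/index.html"

# Run in a subshell so the caller's working directory is left alone.
(
  cd "$workdir" || exit 1
  for file in *.html
  do
    # -i.bak edits in place and keeps the original as "$file.bak".
    sed -i.bak '/<\/body>/{
      r to_insert
      d
      }' "$file"
  done
)
cat "$workdir/index.html"
```

Note that this form replaces the whole line containing </body>, so any other markup on that line would be dropped.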
 
  

