LinuxQuestions.org
Old 12-07-2007, 09:02 AM   #1
zouriel
LQ Newbie
 
Registered: Dec 2007
Posts: 10

Rep: Reputation: 0
Search and replace across multiple files


I need help finding a way to search for a block of text across multiple files in multiple directories and replace each occurrence with a new block. Most of the solutions I have found so far are for single-line replacements.

What I will be editing are PHP pages, and of course the lovely corporate world changed the location of a few required data fields.

The sad part is that some of the lines are indented differently, so some are <tab><tab><tab><TAG>, some are <tab><TAG>, etc.
 
Old 12-07-2007, 09:42 AM   #2
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,395
Blog Entries: 2

Rep: Reputation: 903
First, to perform the recursive finding of files:
Code:
find /top/level/directory -name "*.php"
This should allow you to find all of the relevant files. find will also let you do work on the found files, using its '-exec' option.
Code:
find /top/level/directory -name "*.php" -exec searchAndReplaceCommand {} \;
For each matching filename, the 'searchAndReplaceCommand' will be invoked, and given the name of the matching file. In this case, a suitable search & replace tool is sed.
Code:
find /top/level/directory -name "*.php" -exec sed -i 's/what you have/what you want/g' {} \;
sed is given the '-i' option to edit 'in-place', along with an expression that substitutes all instances of 'what you have' with 'what you want'. The expression is quoted so that the shell passes it to sed as a single argument despite the spaces. 'what you have' may be a regular expression rather than simple literal text, and 'what you want' may refer back to parts of the match.
If you use this, definitely create a backup and a test directory tree first, and examine closely the changes that are made.
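As a minimal sketch of that advice (a made-up tree, not anyone's real one): the script below builds a throwaway directory, runs a quoted sed substitution on it, and diffs each edited file against the copy sed leaves beside it. The file names and the strings 'old text' and 'new text' are invented for illustration, and the '-i.bak' form assumes a sed that supports in-place editing with a backup suffix, as GNU and BSD sed do.

```shell
#!/bin/sh
# Sketch only: rehearse an in-place sed edit on a scratch tree.
# All paths and strings here are invented for illustration.
set -eu

work=$(mktemp -d)
mkdir -p "$work/site/sub"
printf 'line with old text\n' > "$work/site/a.php"
printf '\t\told text again\n' > "$work/site/sub/b.php"
printf 'old text in a non-php file\n' > "$work/site/notes.txt"

# -i.bak edits in place and keeps the original as <file>.bak.
# The expression is quoted so the shell passes it as one argument.
find "$work/site" -name "*.php" -exec sed -i.bak 's/old text/new text/g' {} \;

# Review the changes; diff exits non-zero when files differ, hence "|| true".
for f in "$work/site/a.php" "$work/site/sub/b.php"; do
  diff "$f.bak" "$f" || true
done
```

Only the .php files are touched; notes.txt is left alone because the -name test does not match it.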
--- rod.

Last edited by theNbomr; 12-07-2007 at 09:44 AM.
 
Old 12-07-2007, 10:04 AM   #3
zouriel
LQ Newbie
 
Registered: Dec 2007
Posts: 10

Original Poster
Rep: Reputation: 0
Will this allow multi-line edits?

such as

!-- stop main content --



p class="terms" a href="url">TESTER /a /p

/div



!-- stop content head --

!-- stop main content --


and make it

p class="terms" a href="url">TESTER /a /p

/div

!-- stop main content --

!-- stop content head --

!-- stop main content --

and other variations......
 
Old 12-07-2007, 10:53 AM   #4
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,395
Blog Entries: 2

Rep: Reputation: 903
No. That isn't what is typically described as 'search and replace'; it seems you want to re-order some lines, which I see as a different task. Still, if you can come up with a way of unambiguously describing how you want to re-order your code, a perl script (for example) could be used instead of sed. Can you put in words the edit(s) you need to make?
BTW, when posting source code, it is helpful to post using [C O D E] tags to preserve formatting and prevent unwanted translations to smilies. It is hard to tell where your code starts and your narrative ends.

--- rod.
 
Old 12-07-2007, 11:19 AM   #5
zouriel
LQ Newbie
 
Registered: Dec 2007
Posts: 10

Original Poster
Rep: Reputation: 0
the code snippets are

Code:
<!-- stop main content -->


<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

  </div>


<!-- stop content head -->
But it needs to read


Code:
<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

  </div>

<!-- stop main content -->

<!-- stop content head -->
Some of the formatting may be different as well, such as


Code:
<!-- stop main content -->



<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

  </div>





<!-- stop content head -->
Thank you for all of your help .....
 
Old 12-07-2007, 11:28 AM   #6
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,395
Blog Entries: 2

Rep: Reputation: 903
Okay, if I put into words the edit you want to make, it looks like you want to find the line that looks like '<!-- stop main content -->', and make it the first line in the file. If this is incorrect, please state in words (important) what the edit(s) need to look like. You will need to loosely express an algorithm/process which describes the required changes.
--- rod.
 
Old 12-07-2007, 11:45 AM   #7
zouriel
LQ Newbie
 
Registered: Dec 2007
Posts: 10

Original Poster
Rep: Reputation: 0
I want to move the
Code:
<p class="terms"><a href="http:\\url.here.for\link">test</a></p>
above the
Code:
<!-- stop main content -->

The reason I wanted to look for the entire block is that some of the 4000 pages are already correct. I only need to do it on less than half the pages, and I am unsure which pages are incorrect.


The block below is about 6-10 lines from the bottom; I want it to retain its location in the file itself.

Code:
<!-- stop main content -->


<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

  </div>


<!-- stop content head -->
and move the
Code:
<p class="terms"><a href="http:\\url.here.for\link">test</a></p>
above the
Code:
<!-- stop main content -->

Last edited by zouriel; 12-07-2007 at 11:46 AM.
 
Old 12-07-2007, 01:10 PM   #8
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,395
Blog Entries: 2

Rep: Reputation: 903
Okay, to rephrase your requirement: move a series of lines that are identified by a starting pattern matching
Code:
<p class="terms"><a href="http:\\url.here.for\link">test</a></p>
and an ending pattern matching
Code:
  </div>
and that follows a line matching
Code:
<!-- stop main content -->
to a position preceding the 'stop main content' line.
(This is definitely more complex than search & replace). In perl, the splice function is going to be useful, though. Awk also has the nice feature of being able to identify start-end patterns. Hmmm.
So, you mentioned the possibility of inconsistent use of tabs. It is fairly easy to deal with arbitrary amounts of whitespace in Perl regexes, so this should not be a big problem. Any other inconsistencies? Is the URL in each href attribute constant? What about upper/lower case? Your samples show all lower case. Is there always one and only one of each of the identifiable patterns? Now is the time to think about possible inconsistencies and other unexpected cases.
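Before changing anything, it may help to see which pages actually have the block in the wrong order, and whether any page has more than one of each marker. The following is only a sketch under assumptions: the two sample pages and the example.invalid URL are invented, check_page is a hypothetical helper name, and the patterns tolerate spaces and tabs only.

```shell
#!/bin/sh
# Sketch only: report marker counts and ordering for each page.
# Sample pages and URL are invented for illustration.
set -eu

work=$(mktemp -d)
{
  printf '<!-- stop main content -->\n\n'
  printf '\t\t<p class="terms"><a href="http://example.invalid/link">test</a></p>\n'
  printf '  </div>\n\n'
  printf '<!-- stop content head -->\n'
} > "$work/wrong.php"
{
  printf '\t<p   class="terms"><a href="http://example.invalid/link">test</a></p>\n'
  printf '  </div>\n'
  printf '<!--  stop main content -->\n'
  printf '<!-- stop content head -->\n'
} > "$work/right.php"

# check_page FILE: count both markers and flag a terms paragraph
# that appears after the first 'stop main content' comment.
check_page() {
  awk '
    /<!--[ \t]+stop[ \t]+main[ \t]+content[ \t]+-->/ {
      dest++
      if (!first_dest) first_dest = NR
    }
    /<p[ \t]+class="terms">/ {
      start++
      if (first_dest && NR > first_dest) misplaced = 1
    }
    END { printf "dest=%d start=%d misplaced=%d\n", dest, start, misplaced }
  ' "$1"
}

for f in "$work"/*.php; do
  echo "$f: $(check_page "$f")"
done
```

A page reporting misplaced=1 is a candidate for the edit; counts other than 1 point at the unexpected cases worth looking at by hand.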
--- rod.

Last edited by theNbomr; 12-07-2007 at 01:12 PM.
 
Old 12-07-2007, 01:19 PM   #9
zouriel
LQ Newbie
 
Registered: Dec 2007
Posts: 10

Original Poster
Rep: Reputation: 0
The only inconsistency is the whitespace. The URL and text are exactly as listed, except that it is a real URL.


And what you have listed is correct.


Again, thank you for all of your assistance.
 
Old 12-07-2007, 02:15 PM   #10
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,395
Blog Entries: 2

Rep: Reputation: 903
Okay, try this instead of 'searchAndReplaceCommand' in my original reply.
Code:
#! /usr/bin/perl -w
#
#   LQzouriel.pl - Move identified regions of the input file to another identified location 
#
#   Usage LQzouriel.pl  filespec
#
#
#
use strict;

my  $startPattern = "<p\\s+class=\\\"terms\\\"><a\\s+href=\\\"http:\\\\\\\\url.here.for\\\\link\\\"\\s*>test</a>\\s*</p>";
my  $endPattern = "\\s*</div>\\s*";
my  $destinationPattern = "<!--\\s+stop\\s+main\\s+content\\s+-->";

my  $startLine = undef;
my  $endLine = undef;
my  $destLine = undef;

    open( INFILE, $ARGV[0] ) || die "Cannot open ",$ARGV[0]," for input : $! \n";
    my @file = <INFILE>;
    close( INFILE );    
    
    for( my $i = 0; $i < @file; $i++ ){

        # Make sure we find patterns in the expected order...

        if( $file[$i] =~ m/$destinationPattern/ && ! defined( $startLine ) ){
            $destLine = $i;
        }
        elsif( $file[$i] =~ m/$startPattern/ && defined( $destLine ) ){
            if( ! defined( $startLine ) ){
                $startLine = $i;
            }
        }
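        elsif( $file[$i] =~ m/$endPattern/ && defined( $startLine ) && ! defined( $endLine ) ){
            # Record only the first end pattern after the start, so later </div> lines are not swept along
            $endLine = $i;
        }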
    }
    
    if( defined( $startLine ) && defined( $endLine ) && defined( $destLine ) ){
        #   print "Start : $startLine\n";
        #   print "Stop  : $endLine\n";
        #   print "Dest  : $destLine\n";
        my $segmentLen = 1 + $endLine - $startLine;
        #
        #   Chop out the chunk delimited from startLine to endLine
        #
        my @fileSegment = splice @file, $startLine, $segmentLen;
        # 
        #   Splice in at the destination location
        #
        splice @file, $destLine, 0, @fileSegment;
    }

    #   For testing, uncomment these three lines and comment out actual file output segment
    #   print "============\n";
    #   print @file;
    #   print "============\n";

    open( OUTFILE, ">$ARGV[0]" ) || die "Cannot open $ARGV[0] for writing : $!\n";
    print OUTFILE @file;
    close( OUTFILE );
You will have to edit the URL to match your real one (those backslashes in the URL are nasty). Clip this into a file, make it executable, and try it on one or two files, manually first. Then, as I said before, create a backup tree and a test tree, run it there and check for expected behavior. I have tested this in only a rudimentary manner, and it did what I expected it to do.
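The backup-tree-and-test-tree routine might look something like the sketch below. It is an illustration only: the directory names are invented, and a tiny stand-in script (edit.sh, hypothetical) takes the place of LQzouriel.pl so that the example runs on its own. For the real thing, the '-exec' part would call perl with wherever you saved the script.

```shell
#!/bin/sh
# Sketch only: run an editing script over a copy of the tree,
# then list which files it changed. All names are invented.
set -eu

work=$(mktemp -d)
mkdir -p "$work/web/sub"
printf 'keep me\n' > "$work/web/index.php"
printf 'please test me\n' > "$work/web/sub/home.php"

# Hypothetical stand-in for LQzouriel.pl: rewrites the file named by $1.
cat > "$work/edit.sh" <<'EOF'
sed 's/test/TEST/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
EOF

# Work on a copy so the original tree stays untouched.
cp -R "$work/web" "$work/web.test"
find "$work/web.test" -name "*.php" -exec sh "$work/edit.sh" {} \;

# List the files that differ between the original and the test copy.
# diff exits non-zero when it finds differences, hence "|| true".
diff -rq "$work/web" "$work/web.test" > "$work/changed.txt" || true
cat "$work/changed.txt"
```

Every file named in the list can then be compared by hand against its original before anything is run on the live tree.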
--- rod.

Last edited by theNbomr; 12-07-2007 at 02:25 PM. Reason: Code fix re: File handles
 
Old 12-07-2007, 03:25 PM   #11
zouriel
LQ Newbie
 
Registered: Dec 2007
Posts: 10

Original Poster
Rep: Reputation: 0
Thank you so much... but I think I'm borking something.


When I run it I'm doing

perl LQzouriel.pl /home/zouriel/Desktop/web/home.php

It runs without error, but when I look at the file it isn't altered.

I went to the "debug" segment at the bottom, uncommented it, and ran it. It is reading the file and scanning it but not writing out the rewrite.

I'll play with it a little and see if I can figure it out.


Again TY so very much
 
Old 12-07-2007, 04:36 PM   #12
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,395
Blog Entries: 2

Rep: Reputation: 903
Uncomment the lines that print the Start, Stop and Dest line numbers. If any of those are undefined, the script will not modify anything. I tested it by creating a file that was cut & pasted from your first code box in post #5 of this thread.
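A quick way to see which pattern is failing, without touching the script, is to try each one against the file with grep. This is a sketch under assumptions: the sample page is invented, first_match is a hypothetical helper, and Perl's \s is written as [[:space:]] for grep -E. Unlike the script, it checks each pattern independently of order. One possible cause, not confirmed anywhere in this thread, is a start pattern that still holds the placeholder URL while the page holds a real one, which is what the sample reproduces.

```shell
#!/bin/sh
# Sketch only: print the first line number each pattern matches, or "none".
# The sample page and its URL are invented for illustration.
set -eu

work=$(mktemp -d)
page="$work/home.php"
{
  printf '<!-- stop main content -->\n\n'
  printf '<p class="terms"><a href="http://www.example.invalid/terms">test</a></p>\n'
  printf '  </div>\n'
  printf '<!-- stop content head -->\n'
} > "$page"

# The script's three patterns, translated to POSIX extended regex syntax.
start_re='<p[[:space:]]+class="terms"><a[[:space:]]+href="http:\\\\url\.here\.for\\link"[[:space:]]*>test</a>[[:space:]]*</p>'
end_re='[[:space:]]*</div>[[:space:]]*'
dest_re='<!--[[:space:]]+stop[[:space:]]+main[[:space:]]+content[[:space:]]+-->'

# first_match REGEX FILE: line number of the first matching line, or "none".
first_match() {
  n=$(grep -En -- "$1" "$2" | head -n 1 | cut -d: -f1)
  echo "${n:-none}"
}

start_line=$(first_match "$start_re" "$page")
end_line=$(first_match "$end_re" "$page")
dest_line=$(first_match "$dest_re" "$page")
echo "Start: $start_line  Stop: $end_line  Dest: $dest_line"
```

Here the start pattern reports "none" because the placeholder URL does not occur in the page, and with an undefined start line the script leaves the file unchanged.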
--- rod.
 
  


All times are GMT -5.
