Search and replace across multiple files

zouriel · 12-07-2007, 09:02 AM

I need help finding a way to search for a block of text across multiple files in multiple directories and replace it with a new block. Most of the solutions I have found so far are for single-line replacements.

What I will be editing are PHP pages, and of course the lovely corporate world changed the location of a few required data fields.

The sad part is that some of the lines are indented differently,

so some are <tab><tab><tab><TAG>, some are <tab><TAG>, etc.

theNbomr · 12-07-2007, 09:42 AM

First, to perform the recursive finding of files:

Code:

find /top/level/directory -name "*.php"

This should allow you to find all of the relevant files. find will allow you to do work on the found files, using its '-exec' option.

Code:

find /top/level/directory -name "*.php" -exec searchAndReplaceCommand {} \;

For each matching filename, the 'searchAndReplaceCommand' will be invoked, and given the name of the matching file. In this case, a suitable search & replace tool is sed.

Code:

find /top/level/directory -name "*.php" -exec sed -i 's/what you have/what you want/g' {} \;

sed is given the '-i' option to edit in place, along with an expression to substitute all instances of 'what you have' with 'what you want'. You may use a regular expression, rather than simple literal text, to express 'what you have'.
If you use this, definitely create a backup and a test directory tree first, and examine closely the changes that are made.
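A minimal sketch of that advice, assuming a POSIX shell and a sed that accepts '-i.bak' (GNU and BSD sed both do). The function name try_replace is made up for illustration; it copies the tree, runs the substitution only on the copy, and keeps each original as a .bak file:

```shell
# try_replace SRC_DIR
# Hypothetical helper: copy SRC_DIR to a scratch location, run the
# substitution on the copy only, and print the path of the copy.
try_replace() {
    src=$1
    work=$(mktemp -d) || return 1
    cp -R "$src" "$work/test" || return 1

    # The sed expression is quoted so the shell passes it as one argument.
    # -i.bak edits in place and leaves the original next to it as *.bak.
    find "$work/test" -name '*.php' \
        -exec sed -i.bak 's/what you have/what you want/g' {} \;

    echo "$work/test"
}
```

Running copy=$(try_replace /top/level/directory) and then diff -r -x '*.bak' /top/level/directory "$copy" should show exactly which lines would change, without touching the real tree.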
--- rod.

zouriel · 12-07-2007, 10:04 AM

will this allow multiple line edits?

such as

!-- stop main content --

p class="terms" a href="url">TESTER /a /p

/div

!-- stop content head --

!-- stop main content --

and make it

p class="terms" a href="url">TESTER /a /p

/div

!-- stop main content --

!-- stop content head --

!-- stop main content --

and other variations......

theNbomr · 12-07-2007, 10:53 AM

No. That isn't what is typically described as 'search and replace'; it seems you want to re-order some lines, which I see as a different task. Still, if you can come up with a way of unambiguously describing how you want to re-order your code, a perl script (for example) could be used instead of sed. Can you put in words the edit(s) you need to make?
BTW, when posting source code, it is helpful to post using [C O D E] tags to preserve formatting and prevent unwanted translations to smilies. It is hard to tell where your code starts and your narrative ends.

--- rod.

zouriel · 12-07-2007, 11:19 AM

the code snippets are

Code:

<!-- stop main content -->


<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

  </div>


<!-- stop content head -->

But it needs to read

Code:

<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

  </div>

<!-- stop main content -->

<!-- stop content head -->

Some of the formatting may be different as well, such as

Code:

<!-- stop main content -->



<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

  </div>





<!-- stop content head -->

Thank you for all of your help .....

theNbomr · 12-07-2007, 11:28 AM

Okay, for me to put into words the edit you want to make, it looks like you want to find the line that looks like '', and make it the first line in the file. If this is incorrect, please state in words (important), what the edit(s) needs to look like. You will need to loosely express an algorithm/process which describes the required changes.
--- rod.

zouriel · 12-07-2007, 11:45 AM

i want to move the

Code:

<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

above the

Code:

<!-- stop main content -->

The reason I wanted to look for the entire block is that some of the 4000 pages are already correct. So I only need to do it on less than half the pages, and I am unsure which pages are incorrect.
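A rough sketch of one way to list the pages that still need the change. It assumes a page is wrong exactly when the "terms" paragraph appears somewhere after the 'stop main content' comment; the function names needs_fix and list_pages_to_fix are made up for illustration, and the pattern matches only the opening tag, not the full URL:

```shell
# needs_fix FILE
# Succeeds (exit 0) if a <p class="terms"> line appears after a
# "stop main content" comment; whitespace between words may vary.
needs_fix() {
    awk '
        /<!--[ \t]+stop[ \t]+main[ \t]+content[ \t]+-->/ { seen = 1 }
        seen && /<p[ \t]+class="terms">/ { found = 1; exit }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# list_pages_to_fix DIR
# Prints one path per line (assumes no newlines in file names).
list_pages_to_fix() {
    find "$1" -name '*.php' | while IFS= read -r f; do
        if needs_fix "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling list_pages_to_fix /top/level/directory prints only the files where the paragraph follows the comment, so the pages that are already correct are left out.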

The block below is about 6-10 lines from the bottom; I want it to retain its location in the file itself.

Code:

<!-- stop main content -->


<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

  </div>


<!-- stop content head -->

and move the

Code:

<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

above the

Code:

<!-- stop main content -->

theNbomr · 12-07-2007, 01:10 PM

Okay, to rephrase your requirement: move a series of lines that are identified by a starting pattern matching

Code:

<p class="terms"><a href="http:\\url.here.for\link">test</a></p>

and an ending pattern matching

Code:

  </div>

and that follows a line matching

Code:

<!-- stop main content -->

to a position preceding the 'stop main content' line.
(This is definitely more complex than search & replace). In perl, the splice function is going to be useful, though. Awk also has the nice feature of being able to identify start-end patterns. Hmmm.
So, you mentioned the possibility of inconsistent use of tabs. It is fairly easy to deal with arbitrary amounts of whitespace in perl regexes, so this should not be a big problem. Any other inconsistencies? Is the URL in each href attribute constant? What about upper/lower case? Your samples show all lower case. Is there always one and only one of each of the identifiable patterns? Now is the time to think about possible inconsistencies and other unexpected cases.
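One way to check the "one and only one" question before editing anything is to count the matches per file. This is only a sketch: report_odd_pages is a made-up name, the patterns are shortened to the opening tag and the comment, and '-i' is used so that upper-case variants are counted too:

```shell
# report_odd_pages DIR
# Prints every *.php file in which either pattern does not occur on
# exactly one line, together with the two counts.
report_odd_pages() {
    find "$1" -name '*.php' | while IFS= read -r f; do
        # grep -c prints the number of matching lines (0 if none)
        dest=$(grep -ciE '<!--[[:space:]]+stop[[:space:]]+main[[:space:]]+content[[:space:]]+-->' "$f")
        terms=$(grep -ciE '<p[[:space:]]+class="terms">' "$f")
        if [ "$dest" -ne 1 ] || [ "$terms" -ne 1 ]; then
            printf '%s: stop-main-content=%s terms=%s\n' "$f" "$dest" "$terms"
        fi
    done
}
```

Any file this reports probably deserves a look by hand before an automated edit is run over it.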
--- rod.

zouriel · 12-07-2007, 01:19 PM

The only inconsistency is the whitespace. The URL and text are exactly as listed, except that it is a real URL.

and what you have listed below is correct .....

Again, thank you for all of your assistance.

theNbomr · 12-07-2007, 02:15 PM

Okay, try this instead of 'searchAndReplaceCommand' in my original reply.

Code:

#! /usr/bin/perl -w
#
#   LQzouriel.pl - Move identified regions of the input file to another identified location 
#
#   Usage LQzouriel.pl  filespec
#
#
#
use strict;

my  $startPattern = "<p\\s+class=\\\"terms\\\"><a\\s+href=\\\"http:\\\\\\\\url.here.for\\\\link\\\"\\s*>test</a>\\s*</p>";
my  $endPattern = "\\s*</div>\\s*";
my  $destinationPattern = "<!--\\s+stop\\s+main\\s+content\\s+-->";

my  $startLine = undef;
my  $endLine = undef;
my  $destLine = undef;

    # Three-argument open with a lexical handle, so odd file names are not misread as modes
    open( my $in, '<', $ARGV[0] ) || die "Cannot open $ARGV[0] for input : $!\n";
    my @file = <$in>;
    close( $in );
    
    for( my $i = 0; $i < @file; $i++ ){

        # Make sure we find patterns in the expected order...

        if( $file[$i] =~ m/$destinationPattern/ && ! defined( $startLine ) ){
            $destLine = $i;
        }
        elsif( $file[$i] =~ m/$startPattern/ && defined( $destLine ) ){
            if( ! defined( $startLine ) ){
                $startLine = $i;
            }
        }
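        elsif( $file[$i] =~ m/$endPattern/ && defined( $startLine ) && ! defined( $endLine ) ){
            # Keep only the first </div> after the start line; later ones must not extend the segment
            $endLine = $i;
        }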
    }
    
    if( defined( $startLine ) && defined( $endLine ) && defined( $destLine ) ){
        #   print "Start : $startLine\n";
        #   print "Stop  : $endLine\n";
        #   print "Dest  : $destLine\n";
        my $segmentLen = 1 + $endLine - $startLine;
        #
        #   Chop out the chunk delimited from startLine to endLine
        #
        my @fileSegment = splice @file, $startLine, $segmentLen;
        # 
        #   Splice in at the destination location
        #
        splice @file, $destLine, 0, @fileSegment;
    }

    #   For testing, uncomment these three lines and comment out actual file output segment
    #   print "============\n";
    #   print @file;
    #   print "============\n";

    open( my $out, '>', $ARGV[0] ) || die "Cannot open $ARGV[0] for writing : $!\n";
    print $out @file;
    close( $out );

You will have to edit the URL to match your real one (those backslashes in the URL are nasty). Clip this into a file, make it executable, and try it on one or two files, manually first. Then, as I said before, create a backup tree and a test tree, run it there and check for expected behavior. I have tested this in only a rudimentary manner, and it did what I expected it to do.
--- rod.

zouriel · 12-07-2007, 03:25 PM

Thank you so much, but I think I'm borking something.

When I run it, I'm doing

perl LQzouriel.pl /home/zouriel/Desktop/web/home.php

It runs without error, but when I look at the file it isn't altered.

I went to the "debug" segment at the bottom, uncommented it and ran it; it is reading the file and scanning it, but not writing out the rewrite.

I'll play with it a little and see if I can figure it out.

Again TY so very much

theNbomr · 12-07-2007, 04:36 PM

Uncomment the lines that print the Start-End-Dest line numbers. If any of those are undefined, it will not modify anything. I tested it by creating a file that was cut & pasted from your first code box in article #5 of this thread.
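If editing the script is awkward, roughly the same check can be sketched from the shell. This is an approximation, not the script itself: show_matches is a made-up name, the start pattern is shortened to the opening tag (the real URL is not checked), and awk counts lines from 1 where the script counts from 0:

```shell
# show_matches FILE
# Looks for the three patterns in the same order the script expects
# (destination, then start, then end) and prints the line number of
# each, or "undefined" if it was not found in that order.
show_matches() {
    awk '
        /<!--[ \t]+stop[ \t]+main[ \t]+content[ \t]+-->/ && !start { dest = NR; next }
        /<p[ \t]+class="terms">/ && dest && !start             { start = NR; next }
        /<\/div>/ && start && !stop                            { stop = NR }
        END {
            print "Dest  : " (dest  ? dest  : "undefined")
            print "Start : " (start ? start : "undefined")
            print "Stop  : " (stop  ? stop  : "undefined")
        }
    ' "$1"
}
```

If show_matches /home/zouriel/Desktop/web/home.php reports "undefined" for any of the three, that is the pattern to compare against the actual text of the page.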
--- rod.