find and replace script
Is there a simple script (preferably unix shell, awk or perl, which I semi-understand) that can find and replace data in an ASCII file according to a second file (i.e. with the data to be replaced in one column and the new data in the second column)?
Many thanks |
If you semi-understand Perl, why not take a stab at writing a script? If it doesn't work, come back with questions.
|
s/pattern/replacement/
Could you please post sample input and output? |
substitute in files making a .bak backup file...
Code:
perl -pi.bak -e 's/this/that/g' file ... |
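For comparison, the same kind of in-place edit with a .bak backup can be sketched with Python's standard library `fileinput` module; the file name and contents below are just an illustration, not from the thread:

```python
import fileinput
import os
import tempfile

def inplace_replace(path, old, new):
    # Rewrite `path` in place, keeping the original as path + ".bak"
    # (roughly what perl -pi.bak -e 's/old/new/g' does).
    with fileinput.input(files=[path], inplace=True, backup=".bak") as f:
        for line in f:
            # With inplace=True, print() is redirected into the file.
            print(line.replace(old, new), end="")

# Small self-contained demo on a throwaway file:
demo = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(demo, "w") as f:
    f.write("this and this\n")
inplace_replace(demo, "this", "that")
with open(demo) as f:
    edited = f.read()          # "that and that\n"
with open(demo + ".bak") as f:
    backup = f.read()          # original "this and this\n"
```

Note that, like the perl one-liner, this does plain substring replacement, which matters later in the thread.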
Quote:
file A
1 4
2 5
3 6
file B
3 9
5 8
Output
1 4
2 8
9 6 |
It worked on a simple example, but when I tried to run it on my data it went haywire.
As far as I can tell (this is a stab in the dark, but it makes sense to me), when I have two numbers like 10 and 101 and I want 10 replaced with 8, it gives me 8 and 81, i.e. it replaces parts of numbers as well as whole numbers. Is there any way to make it match whole numbers only? |
Welcome to the world of regular expressions!
What are you using? |
If you want to match whole lines only, use ^ and $ at the start and end of your pattern to denote start of line and end of line. E.g.
Code:
s/^mypattern$/replacement/g;
Another option is to use one of the zero-width assertions Perl-style regular expressions provide. \b means "word boundary", which is very useful if you want to match whole words only. For example:
Code:
s/\bmypattern\b/replacement/g; |
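The difference \b makes can be demonstrated directly with Python's `re` module, using numbers like the 10-inside-101 case from the question above (the sample data here is illustrative):

```python
import re

data = "10 101 210"

# Plain substring replacement also rewrites the "10" hiding
# inside "101" and "210":
plain = data.replace("10", "8")        # -> "8 81 28"

# \b anchors the match at word boundaries, so only the
# standalone "10" is touched:
whole = re.sub(r"\b10\b", "8", data)   # -> "8 101 210"

print(plain)
print(whole)
```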
Samples of the actual data files I'm using this on (of course there are over 10000 records in this one and over 100000 in some of the others I have to look at). The numbers in the first 3 columns (where the find and replace should occur) of hwt.dat start at 1 and go up from there, which leads to the problems in other columns when 1, 10, etc. need replacing.
datafile1
789 G1985
193 G1988
hwt.dat
170 789 172 1 53.1 495 1 1 1 97 1985
143 382 188 1 69.0 446 2 2 2 21 1988
149 146 193 1 69.8 446 2 2 1 21 1988
148 332 197 1 71.8 446 2 2 2 21 1988
I initially used this statement to prepare the data for the sed script:
Code:
awk '{print "s/"$1"/"$2"/g"}' datafile1 > temp5
then this to run it:
Code:
sed -f temp5 datafile2 > newdata
I've just tried incorporating the ^ and $ into the file like this:
Code:
awk '{print "s/^"$1"$/"$2"/g"}' datafile1 > temp5
However, it made no changes. I guess the issue is some conflict between the ^ and $ and the column designators $1 and $2, or I stuffed up somewhere (again). Similarly, I tried the second option suggested and got the same problem:
Code:
awk '{print "s/\b"$1"\b/"$2"/g"}' datafile > temp5 |
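The first awk one-liner above turns each line of datafile1 into one sed s/// command; what ends up in temp5 can be sketched with the same generation step in Python (pairs taken from the datafile1 sample above):

```python
# Mimic: awk '{print "s/"$1"/"$2"/g"}' datafile1 > temp5
pairs = [("789", "G1985"), ("193", "G1988")]

# One sed substitution command per old/new pair:
sed_script = "\n".join("s/%s/%s/g" % (old, new) for old, new in pairs)
print(sed_script)
# s/789/G1985/g
# s/193/G1988/g
```

Seeing temp5's contents this way makes the later problem visible: each s/// command is a plain pattern, so 789 would also match inside a longer number.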
The first option would only work if 789 appeared on a line by itself in hwt.dat.
The second option doesn't work because the \b's in your awk expression get interpreted by awk as the backspace escape: if you open temp5 with vi you'll see funny ^H's. You need to double up the backslashes:
Code:
awk '{print "s/\\b"$1"\\b/"$2"/g"}' datafile > temp5
Code:
#!/usr/bin/env python
sample usage:
Code:
~/tmp% ./replace.py datafile1 <hwt.dat >newdata |
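Only the shebang and usage line of that replace.py survive above. A minimal sketch of what such a script could look like, assuming it reads old/new pairs from the mapping file and does whole-word replacement on the input lines (the function names here are my own; a real script would feed them `open(sys.argv[1])` and `sys.stdin`):

```python
import re

def build_table(mapping_lines):
    # Each mapping line is "<old> <new>"; build a dict old -> new.
    table = {}
    for line in mapping_lines:
        fields = line.split()
        if len(fields) >= 2:
            table[fields[0]] = fields[1]
    return table

def replace_lines(table, lines):
    # One alternation over all keys, anchored with \b on both sides,
    # so e.g. "10" never matches inside "101".
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, table)) + r")\b")
    for line in lines:
        yield pattern.sub(lambda m: table[m.group(1)], line)

# The original poster's file A / file B example from earlier in the thread:
table = build_table(["3 9", "5 8"])
output = list(replace_lines(table, ["1 4\n", "2 5\n", "3 6\n"]))
# output: ["1 4\n", "2 8\n", "9 6\n"]
```

Doing all substitutions in a single pass with one compiled alternation also avoids chained rewrites (e.g. 3 -> 9 later being caught by a 9 -> something rule), which a sequence of sed commands would not.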
Used the above.
Thanks for all your help, guys! |