LinuxQuestions.org - Complicated string substitution

Hi,

I have a file with a lot of the following occurrences:

denmark.handa.1-10
denmark.handa.1-12344
denmark.handa.1-4
denmark.handa.1-56

...

distributed randomly in a file. I need to convert each of these occurrences to:

denmark.handa.1-10_1
denmark.handa.1-12344_1
denmark.handa.1-4_1
denmark.handa.1-56_1

so basically I add "_1" at the end of each occurrence.

I thought about using sed, but as each "root" is different I have no clue how to go through this.

Any suggestions?

Thanks in advance.

By "root" do you mean the "denmark.handa." part of the string? That part is all the same in your examples, so the

Quote:

I thought about using sed, but as each "root" is different I have no clue how to go through this.

part leaves me puzzled.

Assuming, however, that you want to replace \.([[:digit:]]+-[[:digit:]]+)$, ignoring the "root", sed should have no problem doing so.
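A minimal sketch of that idea, assuming each occurrence sits at the very end of its line and that your sed accepts -E for extended regular expressions (GNU and BSD sed both do). The function name append_suffix is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: append "_1" to a trailing ".<digits>-<digits>" token.
# Assumes the token ends the line; reads stdin, writes stdout.
append_suffix() {
    # The group captures "<digits>-<digits>"; the leading dot is put back by hand.
    sed -E 's/\.([[:digit:]]+-[[:digit:]]+)$/.\1_1/'
}

printf '%s\n' 'denmark.handa.1-10' 'denmark.handa.1-12344' | append_suffix
```

Lines that do not end in such a token pass through unchanged, so the "root" in front of the digits never has to be spelled out.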

Which part is the "root" you mentioned: "denmark" or "denmark.handa"?
Could you supply more "real" examples? For example, is the ".[[:digit:]]-" part always ".1-"?
You need to be as precise as possible in defining the input pattern and its position in the line, to prevent false positives and to avoid missing a matching pattern.
Also, what is the encoding scheme? You may need to use [[:alpha:]] instead of [a-z], for example, to include accented characters.

Code:

sed 's/ \([[:alpha:]][[:alpha:]]*\.[[:alpha:]][[:alpha:]]*\.[[:digit:]]-[[:digit:]][[:digit:]]*\) / \1_1 /' file

This example assumes that the initial number is always exactly one digit (anything from 0 to 9), and that there will always be a space before and after the pattern. Sometimes you will need more than one sed expression to handle different input patterns. One thing to look out for is whether the pattern might be split across two lines. That complicates things greatly, because you then need to save some lines in the buffer and check for matching patterns depending on where the split is. You also need to decide whether to add the "_1" at the end and leave the line split, or to move the line split.
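If the surrounding spaces cannot be relied on, one possible variant is to match the token itself and let "&" reinsert the whole match. This is only a sketch: it assumes a sed with -E, keeps the single-digit assumption from above, and the name tag_all is invented for the example:

```shell
#!/bin/sh
# Hypothetical helper: tag every <word>.<word>.<digit>-<digits> token with "_1".
# "&" stands for the whole match, so no capture group is needed, and the
# "g" flag handles several tokens on the same line.
tag_all() {
    sed -E 's/[[:alpha:]]+\.[[:alpha:]]+\.[[:digit:]]-[[:digit:]]+/&_1/g'
}

printf '%s\n' 'see denmark.handa.1-10 and denmark.handa.1-4.' | tag_all
```

Because the final [[:digit:]]+ is greedy, the suffix always lands after the last digit. Note that the expression is not idempotent: running it a second time over already converted text appends another "_1".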

Try the following shell script to rename all files in a directory by appending _1 to their names:

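#!/bin/bash

# Suffix to append to every file name in ./temp/
NEW_EXT=_1
for X in ./temp/*
do
    # Quote the names so files containing spaces are handled correctly
    mv -- "$X" "$X$NEW_EXT"
done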

This would also work well with find:
for X in `find ./ -name 'denmark.handa*'`
to alter just the files whose names start with denmark.handa.

The OP wants to replace text in a file, not rename files.

Oops, replace in the file, not the file names.

The sed command above is good. I am assuming you just want to alter the strings starting with the denmark.handa prefix; if so, just replace the alpha portions with the literal text.

If you need to do this in place (without a file redirect), I would run it once to make sure the result looks good on screen and then use the -i switch.

first/test run
sed 's/ \(denmark\.handa\.[[:digit:]]-[[:digit:]][[:digit:]]*\) / \1_1 /' file

final run
sed -i 's/ \(denmark\.handa\.[[:digit:]]-[[:digit:]][[:digit:]]*\) / \1_1 /' file
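The test-run-then-commit approach could be wrapped up roughly as follows. This is a sketch under a few assumptions: the expression is a variant that does not require surrounding spaces, -i.bak is a non-POSIX option (supported by GNU and BSD sed), and the names SED_EXPR and preview_then_apply are invented for the example:

```shell
#!/bin/sh
# Assumed expression: tag every denmark.handa.<digit>-<digits> token with "_1".
SED_EXPR='s/denmark\.handa\.[[:digit:]]-[[:digit:]][[:digit:]]*/&_1/g'

# Hypothetical helper: show the pending changes, then edit the file in place.
# $1 is the file to edit.
preview_then_apply() {
    file=$1
    # Preview: diff exits non-zero when the files differ, so ignore its status.
    sed "$SED_EXPR" "$file" | diff "$file" - || true
    # Apply: -i.bak edits in place and keeps the original as "$file.bak".
    sed -i.bak "$SED_EXPR" "$file"
}
```

Calling preview_then_apply file prints the differences first and leaves file.bak behind, so a bad pattern can be undone by copying the backup over the edited file.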