LinuxQuestions.org
02-13-2008, 05:00 PM   #1
horacioemilio
Member
 
Registered: Dec 2007
Posts: 61

Rep: Reputation: 15
Complicated string substitution


Hi,

I have a file with a lot of occurrences like the following:

denmark.handa.1-10
denmark.handa.1-12344
denmark.handa.1-4
denmark.handa.1-56

...

distributed randomly in a file. I need to convert each of these occurrences to:


denmark.handa.1-10_1
denmark.handa.1-12344_1
denmark.handa.1-4_1
denmark.handa.1-56_1

So basically I need to add "_1" at the end of each occurrence.

I thought about using sed, but as each "root" is different I have no clue how to go about this.

Any suggestions?

Thanks in advance.
 
02-13-2008, 05:13 PM   #2
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,147

Rep: Reputation: 330
By "root" do you mean the "denmark.handa." part of the string? That part is all the same in your examples, so the
Quote:
I thought about using sed, but as each "root" is different I have no clue how to go through this.
part leaves me puzzled.

Assuming, however, that you want to replace \.([[:digit:]]+-[[:digit:]]+)$, ignoring the "root", sed should have no problem doing so.
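A minimal sketch of that idea, assuming each occurrence sits at the end of its line and that your sed accepts -E for extended regular expressions (GNU and BSD sed both do). The function name append_suffix_eol is only illustrative and is not from the thread:

```shell
# Hypothetical helper: append "_1" to a ".<digits>-<digits>" token
# that ends a line. Reads the named files, or stdin if none are given.
append_suffix_eol() {
  sed -E 's/(\.[[:digit:]]+-[[:digit:]]+)$/\1_1/' "$@"
}

# Try it on two of the sample lines from the first post.
printf 'denmark.handa.1-10\ndenmark.handa.1-12344\n' | append_suffix_eol
```

Lines that do not end in such a token pass through unchanged, so it should be safe to preview the result on screen before redirecting it to a new file.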
 
02-13-2008, 05:38 PM   #3
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 654
Which part is the root you mentioned: "denmark" or "denmark.handa"?
Could you supply more "real" examples? For example, is the ".[[:digit:]]-" part always ".1-"?
You need to be as precise as possible in defining an input pattern and its position in the line, to prevent false positives and to avoid missing a matching pattern.
Also, what is the encoding scheme? You may need to use [[:alpha:]] instead of [a-z], for example, to include accented characters.
Code:
sed 's/ \([[:alpha:]][[:alpha:]]*\.[[:alpha:]][[:alpha:]]*\.[[:digit:]]-[[:digit:]][[:digit:]]*\) / \1_1 /g' file
This example assumes that the initial number is always a single digit (anything from 0 to 9), and that there will always be a space before and after the pattern. Sometimes you will need more than one sed command to handle different input patterns. One thing to look out for is whether the pattern might be split across two lines. This complicates things greatly, because you then need to save some lines in the buffer and check for matching patterns depending on where the split is. You also need to decide whether to add the "_1" at the end and leave the line split in place, or to move the line split.
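If the surrounding spaces cannot be relied on, one possible variation is to match the token wherever it appears and check only the character that follows it. This is a sketch under a few assumptions: sed -E is available, the token is word.word.digits-digits, and a token already followed by "_" should be left alone so the command can be run twice safely. The function name append_suffix_anywhere is made up for the example:

```shell
# Hypothetical helper: append "_1" to every word.word.N-M token in the
# input, wherever it appears on the line. The second group captures the
# character after the token (or end of line) and puts it back, which
# also skips tokens that are already followed by "_" or another digit.
append_suffix_anywhere() {
  sed -E 's/([[:alpha:]]+\.[[:alpha:]]+\.[[:digit:]]+-[[:digit:]]+)([^_[:digit:]]|$)/\1_1\2/g' "$@"
}

printf 'see denmark.handa.1-12344 and denmark.handa.1-4.\n' | append_suffix_anywhere
```

This does not address a token that is split across two lines; that case still needs the buffer handling described above.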
 
02-13-2008, 06:15 PM   #4
cybersekkin
LQ Newbie
 
Registered: Jan 2005
Location: Japan
Distribution: suse, debian, libranet, slack
Posts: 12

Rep: Reputation: 0
Try the following shell script to rename all files in a directory by appending _1:

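#!/bin/bash

# Append _1 to the name of every file in ./temp.
# A glob is used instead of parsing ls, so names with spaces survive.
NEW_EXT=_1
for X in ./temp/*
do
    mv -- "$X" "$X$NEW_EXT"
done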

This would also work well with find:
for X in `find ./ -name 'denmark.handa*'`
to alter just the files whose names start with denmark.handa.
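For the find variant, a sketch that avoids looping over backtick output (which breaks on file names containing spaces) could look like this. The function name rename_with_find is hypothetical, and skipping names that already end in _1 is an added assumption so a second run does nothing:

```shell
# Hypothetical helper: append _1 to regular files under the given
# directory whose names start with denmark.handa, skipping any that
# already end in _1.
rename_with_find() {
  find "$1" -type f -name 'denmark.handa*' ! -name '*_1' \
    -exec sh -c 'for f do mv -- "$f" "${f}_1"; done' sh {} +
}

# Usage: rename_with_find ./temp
```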
 
02-13-2008, 06:21 PM   #5
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 654
The OP wants to replace text in a file, not rename files.
 
02-13-2008, 06:40 PM   #6
cybersekkin
LQ Newbie
 
Registered: Jan 2005
Location: Japan
Distribution: suse, debian, libranet, slack
Posts: 12

Rep: Reputation: 0
Oops, replace in the file, not the file names.

The above is good. I am assuming you just want to alter the lines starting with the denmark.handa prefix; if so, just replace the alpha portions.

If you need to do this in place (without a file redirect), I would say run it once to make sure the result looks good on screen, and then use the -i switch.

First/test run:
sed 's/ \(denmark\.handa\.[[:digit:]]-[[:digit:]][[:digit:]]*\) / \1_1 /g' file

Final run:
sed -i 's/ \(denmark\.handa\.[[:digit:]]-[[:digit:]][[:digit:]]*\) / \1_1 /g' file
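One way to make the test-then-apply routine a bit safer is to look at a diff first and keep a backup when editing in place. The sketch below assumes a sed that accepts -E and -i with an attached suffix (GNU and BSD sed both do); EXPR, preview_change and apply_change are names invented for this example:

```shell
# Illustrative expression: append _1 to denmark.handa.N-M tokens that
# are not already followed by "_" or another digit.
EXPR='s/(denmark\.handa\.[[:digit:]]+-[[:digit:]]+)([^_[:digit:]]|$)/\1_1\2/g'

# Show what would change without touching the file.
preview_change() {
  # diff exits non-zero when the files differ, so ignore its status
  sed -E "$EXPR" "$1" | diff "$1" - || true
}

# Edit the file in place, saving the original as FILE.bak.
apply_change() {
  sed -E -i.bak "$EXPR" "$1"
}

# Usage: preview_change file && apply_change file
```

If the result turns out wrong, the untouched original is still there as file.bak.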
 
  


All times are GMT -5. The time now is 12:34 PM.
