LinuxQuestions.org
Old 05-27-2010, 04:56 PM   #1
billywayne
LQ Newbie
 
Registered: May 2009
Posts: 15

Rep: Reputation: 0
sed : substitute without displacing columns


Hello.

I am looking to use `sed' to substitute one string for another within a file.

My issue is that the new string is not always the same length as the old string. When this is the case, the other characters on the line are displaced.

For example, I have the following line.

Code:
  9 H      1    1    8    Y.YY       7  109.416000   6   65.783000        0
My goal is to replace the "Y.YY" (4 characters) with "10.01" (5 characters).

But if I simply use

Code:
s/Y.YY/10.01/
then all of the characters following the "10.01" will be moved to the right, which will cause an input error in the program into which I feed this input (the program is coded in FORTRAN and is inflexible as to where the input parameters are positioned).


How can I replace the Y.YY with 10.01 without causing the rest of the characters to be shifted?


Thanks!

BW
 
Old 05-27-2010, 05:10 PM   #2
GazL
LQ Veteran
 
Registered: May 2008
Posts: 6,895

Rep: Reputation: 5015
How about:
Code:
s/Y.YY /10.01/ 

s/ Y.YY/10.01/
depending on exactly how you want it to line up.
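
A minimal sketch of the two variants, applied to the sample line from the first post. The variable names are only for illustration; the point is that each command replaces five characters (the placeholder plus one neighbouring space) with five characters, so the line length stays the same.

```shell
#!/bin/bash
# Sample line from the first post.
line='  9 H      1    1    8    Y.YY       7  109.416000   6   65.783000        0'

# Variant 1: consume one space to the right of the placeholder.
left=$(printf '%s\n' "$line" | sed 's/Y\.YY /10.01/')

# Variant 2: consume one space to the left of the placeholder.
right=$(printf '%s\n' "$line" | sed 's/ Y\.YY/10.01/')

# Print all three lines so the columns can be compared by eye.
printf '%s\n%s\n%s\n' "$line" "$left" "$right"
echo "lengths: ${#line} ${#left} ${#right}"
```

This only works while the replacement is exactly one character longer than the placeholder; a value such as 1.11 would need no extra space at all.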
 
Old 05-27-2010, 05:18 PM   #3
billywayne
LQ Newbie
 
Registered: May 2009
Posts: 15

Original Poster
Rep: Reputation: 0
Well, manually changing the command would work, and I'm thankful for the suggestion.

However, this sed command is part of a larger shell script which will operate on an array of files for a sequence of values, say 1.00 to 10.50, or so. So sometimes Y.YY is being replaced by a value with the same number of characters, like 1.11, but sometimes with more characters, like 10.01.

I'm looking to develop a robust way of substituting the pattern no matter how many characters need to be substituted in. I perform this operation several dozens of times, so being able to make the substitution independent of character string length is highly valuable to me.
 
Old 05-27-2010, 05:32 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,119

Rep: Reputation: 4120
Fix your Fortran parsing.
Else I'd reckon you're up for something like perl or awk to add some logic to figure the length of the substitute and resolve the correct offset to replace.
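
One way to read the awk suggestion, as a minimal sketch: locate the placeholder with index(), then rebuild the line from the text before the field, the value padded to the field width, and the text after the field. The function name put_field and the 11-character width used in the example call are assumptions made for illustration.

```shell
#!/bin/bash
# put_field PLACEHOLDER VALUE WIDTH   (hypothetical helper)
# Reads lines on stdin. On each line containing PLACEHOLDER, the WIDTH
# characters starting at the placeholder are overwritten with VALUE,
# left-aligned and padded with spaces. Other lines pass through unchanged.
# Note: anything else inside those WIDTH characters is overwritten too.
put_field() {
    awk -v ph="$1" -v val="$2" -v w="$3" '
    {
        pos = index($0, ph)      # 1-based offset of the placeholder, 0 if absent
        if (pos > 0)
            $0 = substr($0, 1, pos - 1) sprintf("%-" w "s", val) substr($0, pos + w)
        print
    }'
}

# Example call on the sample line from the first post.
line='  9 H      1    1    8    Y.YY       7  109.416000   6   65.783000        0'
printf '%s\n' "$line" | put_field Y.YY 10.01 11
```

Because index() does a plain string search, the dot in Y.YY needs no escaping here.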
 
1 members found this post helpful.
Old 05-27-2010, 05:35 PM   #5
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
Normally, you would use tabs to accomplish something like this. If that's not possible, then you can first count the number of characters in the target string, and then use that to adjust the replacement.

So, maybe like this? (a sketch, not tested against your real files):
Code:
value=10.01     # the new string
maxcount=11     # field width: the target plus the spaces after it**
while IFS= read -r line; do
    # length of the target string on this line (0 if the line has none)
    match=$(printf '%s\n' "$line" | grep -o 'Y\.Y*' | head -n 1)
    count=${#match}
    # subtract count from maxcount to get the number of spaces to remove
    rmspace=$(( maxcount - count ))
    # pad the new string with spaces to the full field width
    fillstring=$(printf '%-*s' "$maxcount" "$value")
    printf '%s\n' "$line" | sed -E "s/Y\.Y* {$rmspace}/$fillstring/"
done <filename >newfilename
**Get the appropriate value for maxcount by counting the characters from the start of the target string to the start of the next column (11 in your sample line). It must be at least as long as the longest new string.

Last edited by pixellany; 05-27-2010 at 05:37 PM.
 
1 members found this post helpful.
Old 05-27-2010, 05:59 PM   #6
GazL
LQ Veteran
 
Registered: May 2008
Posts: 6,895

Rep: Reputation: 5015
Quote:
Originally Posted by billywayne View Post
Well, manually changing the command would work, and I'm thankful for the suggestion.

However, this sed command is part of a larger shell script which will operate on an array of files for a sequence of values, say 1.00 to 10.50, or so. So sometimes Y.YY is being replaced by a value with the same number of characters, like 1.11, but sometimes with more characters, like 10.01.

I'm looking to develop a robust way of substituting the pattern no matter how many characters need to be substituted in. I perform this operation several dozens of times, so being able to make the substitution independent of character string length is highly valuable to me.
The trick is to work on the whole field and pad the value you're substituting appropriately. So for a 5 character right aligned field:
Code:
sed -e "s/ Y\.YY/$(printf "%5s" "$value")/"
and for a left aligned field:
Code:
sed -e "s/Y\.YY /$(printf "%-5s" "$value")/"
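
To see that the padded substitution does not depend on the length of the value, here is a small sketch that runs the right-aligned form for several values against the sample line from the first post. The helper name subst_right is made up for the example, and it assumes a 5-character field with at least one space in front of Y.YY that may be consumed.

```shell
#!/bin/bash
# subst_right VALUE   (hypothetical helper)
# Filters stdin, replacing " Y.YY" (5 characters) with VALUE right-aligned
# in 5 characters.
subst_right() {
    local padded
    padded=$(printf '%5s' "$1")
    sed -e "s/ Y\.YY/$padded/"
}

line='  9 H      1    1    8    Y.YY       7  109.416000   6   65.783000        0'
for value in 1.11 9.99 10.01 10.50; do
    printf '%s\n' "$line" | subst_right "$value"
done
```

Every output line has the same length as the input, whether the value has four or five characters. A value longer than five characters would still shift the rest of the line.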

Last edited by GazL; 05-27-2010 at 06:25 PM. Reason: added alternative alignment.
 
1 members found this post helpful.
Old 05-27-2010, 06:07 PM   #7
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Following previous suggestions, I would end up with something like this:
Code:
#!/bin/bash
number=10.01
# integer part of log10(number): compute at full precision, then truncate
digits=$(echo "x = l($number)/l(10); scale = 0; x/1" | bc -l)
sed -i.bck "s/ \{$digits\}Y\.YY/$number/" file
The integer part of the base-10 logarithm of the number counts the additional digits before the decimal point. This tells you how many spaces to consume in front of (and together with) the Y.YY string.
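
A possible variation on the same idea, sketched under the assumption that the value always has at least four characters (digits, a dot, two decimals): the number of extra characters can also be read off the string length with shell parameter expansion, which avoids the call to bc. The function name extra_digits is made up for the example.

```shell
#!/bin/bash
# extra_digits NUMBER   (hypothetical helper)
# Prints how many characters NUMBER is longer than the 4-character
# placeholder Y.YY. Assumes NUMBER has at least 4 characters.
extra_digits() {
    local number=$1
    echo $(( ${#number} - 4 ))
}

number=10.01
digits=$(extra_digits "$number")
# Consume that many leading spaces together with the placeholder.
printf '%s\n' '    8    Y.YY       7' | sed "s/ \{$digits\}Y\.YY/$number/"
```

Counting characters rather than taking a logarithm also copes with values that carry a sign or more decimals, as long as enough spaces precede the placeholder.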
 
1 members found this post helpful.
Old 05-27-2010, 06:38 PM   #8
billywayne
LQ Newbie
 
Registered: May 2009
Posts: 15

Original Poster
Rep: Reputation: 0
Very good input, you guys.

I've taken your ideas, mixed them up, and thrown in one of my own. Here's the result.


Your posts made me realize that what I really want to do isn't to replace one string of a given length with another string of, perhaps, a different length.

Actually, both strings need to be 11 characters in order for everything to be just right. Here's my first approximation:

Code:
#!/bin/bash

VALUE="10.01"     # the value I want to replace Y.YY
PATTERN="$( printf "%-11s" Y.YY )"    # Y.YY expressed as an 11 character string, the `-' left justifies it.
REPLACE="$( printf "%-11s" ${VALUE} )"  # the value of $VALUE expressed as a left justified 11 character string
YLINE="$( grep -n Y.YY *.z | cut -d ":" -f 1 )"  # the line on which Y.YY may be found

sed "${YLINE}s/${PATTERN}/${REPLACE}/" *.z
This works like a charm. I'm totally open to suggestions for improving it though.

Thanks again for all the input.
 
Old 05-27-2010, 06:54 PM   #9
GazL
LQ Veteran
 
Registered: May 2008
Posts: 6,895

Rep: Reputation: 5015
Obviously, I don't have full view of exactly what you're doing but the grep and $YLINE look unnecessary.
sed won't replace if it doesn't match and dropping the grep will save you an extra pass through the file.
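
As a sketch of what the script looks like without the grep pass: the same 11-character padding, wrapped in a filter so that lines without the placeholder simply pass through. The function name fill_field and the optional width argument are assumptions for illustration.

```shell
#!/bin/bash
# fill_field VALUE [WIDTH]   (hypothetical helper)
# Filters stdin, replacing Y.YY plus its trailing spaces (WIDTH characters
# in total, default 11) with VALUE left-justified in the same width.
# Lines that do not contain the padded placeholder are left untouched.
fill_field() {
    local value=$1 width=${2:-11}
    local pattern replace
    pattern=$(printf '%-*s' "$width" 'Y.YY')
    replace=$(printf '%-*s' "$width" "$value")
    # Escape the dot so that it only matches a literal "."
    pattern="${pattern//./\\.}"
    sed "s/${pattern}/${replace}/"
}

line='  9 H      1    1    8    Y.YY       7  109.416000   6   65.783000        0'
printf '%s\n' 'a line without the placeholder' "$line" | fill_field 10.01
```

No line number is needed for this step, because sed only changes lines on which the pattern matches.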
 
Old 05-27-2010, 07:00 PM   #10
billywayne
LQ Newbie
 
Registered: May 2009
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by GazL View Post
Obviously, I don't have full view of exactly what you're doing but the grep and $YLINE look unnecessary.
sed won't replace if it doesn't match and dropping the grep will save you an extra pass through the file.
Very true. It's something I started doing and now I can't even remember why.
 
Old 05-27-2010, 07:12 PM   #11
billywayne
LQ Newbie
 
Registered: May 2009
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by billywayne View Post
Very true. It's something I started doing and now I can't even remember why.
Oh yeah. Now I remember.

Once I've replaced the Y.YY with the value (VALUE_1), I feed the input file into the main program.

The main program then produces a file exactly like the one I gave it, with certain other values updated, but not VALUE_1.

I then have to replace VALUE_1 with another value (VALUE_2).

It's likely that VALUE_1 may appear somewhere else in the input file, so a global sed substitution may produce results I don't want.

I keep track of where Y.YY was so that when it comes time to replace VALUE_1 with VALUE_2, I know exactly where to look for it.
 
Old 05-27-2010, 07:33 PM   #12
GazL
LQ Veteran
 
Registered: May 2008
Posts: 6,895

Rep: Reputation: 5015
Yikes. That sounds messy. Best of luck.
 
Old 05-27-2010, 07:38 PM   #13
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
Not to be a party pooper (or maybe I just misunderstand), but the following doesn't make sense:
Quote:
It's likely that VALUE_1 may appear somewhere else in the input file, so a global sed substitution may produce results I don't want.
I keep track of where Y.YY was so that when it comes time to replace VALUE_1 with VALUE_2, I know exactly where to look for it.
Reason being is that grep would return all lines with VALUE_1 (would it not??) and so there would be multiple line numbers in YLINE.
Or did I miss something?
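
One way to guard against that, as a sketch: refuse to continue unless exactly one line matches. find_line is a made-up helper name; grep -F is used so that the dot in Y.YY is taken literally.

```shell
#!/bin/bash
# find_line STRING FILE   (hypothetical helper)
# Prints the line number of the only line in FILE that contains the fixed
# string STRING. Fails if there is no such line or more than one.
find_line() {
    local hits
    hits=$(grep -n -F -- "$1" "$2" | cut -d: -f1)
    # Count the non-empty lines in $hits: that is the number of matches.
    if [ "$(printf '%s\n' "$hits" | grep -c .)" -ne 1 ]; then
        echo "expected exactly one line containing '$1' in $2" >&2
        return 1
    fi
    printf '%s\n' "$hits"
}
```

It could be called as YLINE=$(find_line Y.YY file.z) || exit 1, so that the script stops instead of handing sed a list of line numbers.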
 
Old 05-27-2010, 08:33 PM   #14
billywayne
LQ Newbie
 
Registered: May 2009
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by grail View Post
Not to be a party pooper (or maybe I just misunderstand), but the following doesn't make sense:

Reason being is that grep would return all lines with VALUE_1 (would it not??) and so there would be multiple line numbers in YLINE.
Or did I miss something?
I think GazL's suggestion was that the entire YLINE procedure could be omitted. Not to remove the grep from the YLINE, but to get rid of YLINE altogether.

I could indeed remove the YLINE stuff if my script were only replacing Y.YY with a value. I could tell sed to perform a global substitution and be done with it. But I need to know where Y.YY was originally in order to perform subsequent substitutions.

Something like this
Code:
YLINE=$( grep -n Y.YY file.z | cut -d ":" -f 1 )
SEQUENCE=$( seq -w 1.00 0.01 10.00 )

for VALUE in ${SEQUENCE} ; do
    sed -i.bak "${YLINE}s/Y.YY/${VALUE}/" file.z
    submit_to_program  # produces an output.z file
    sed -i.bak "${YLINE}s/${VALUE}/Y.YY/" output.z
    cp output.z file.z
done
Not exactly but close.

I have some steps (initializing and incrementing a counter and cp output.z output${COUNTER}.z) in order to create temporary files so as not to clobber intermediate output.z files, but that's kind of the essence of what I'm doing. And I do this more times than I'd like to think of, so I'd like for the script to be as general as possible, handling three-digit or four-digit numbers without having to think about the length of the variable to be substituted. Going in and replacing X.XX with XX.XX in all of its occurrences every time I needed to submit an input file would be repetitive and tedious. And isn't that what computers were made for, doing the repetitive tedious stuff so I don't have to? I guess I could just s/X.XX/XX.XX/g on the script every time, but what's the fun in that? I want to store the script in my local bin directory and call it whenever I need it without having to worry about it.

$VALUE may appear somewhere else in the file. Having $YLINE assures me that everything is happening to the correct line. Granted, I don't need it for the initial sed substitution, but I kind of like the feeling that sed isn't looking through the entire file when I can tell it exactly which line it's on.

See what I'm saying?
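
A sketch of one way to make both substitutions independent of the value's length and of other occurrences of the value: address the field by line and column instead of by content. set_field is a hypothetical helper, and a value longer than the given width would still push the rest of the line to the right.

```shell
#!/bin/bash
# set_field LINE COLUMN WIDTH VALUE   (hypothetical helper)
# Filters stdin. On line number LINE, the WIDTH characters starting at
# COLUMN (1-based) are overwritten with VALUE, left-justified and padded
# with spaces. All other lines are printed unchanged.
set_field() {
    awk -v n="$1" -v col="$2" -v w="$3" -v val="$4" '
    NR == n {
        $0 = substr($0, 1, col - 1) sprintf("%-" w "s", val) substr($0, col + w)
    }
    { print }'
}

# Small demonstration: overwrite 4 characters at column 3 of the second line.
printf 'first\nabcdefghij\nlast\n' | set_field 2 3 4 XY
```

Inside the loop this could be used as set_field "$YLINE" "$YCOL" 11 "$VALUE" < output.z > file.z, where YCOL is an assumed variable holding the column at which Y.YY starts. The placeholder then never has to be restored, because each pass overwrites the same 11 characters regardless of what they currently contain.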
 
  

