sed substitution of variable with a variable

ngyz86 · 12-27-2010, 04:13 AM

Dear all,

I am trying to alter the character position of residue numbers above 999 in a pdb file.

The following script is an attempt to:
1) Get all unique pdb residue numbers (in column 5) using awk and assign them to a variable i.

2) Loop through all the values in $i and if it is greater than 999, shift that number one character to the right using sed.

However, the script only manages to alter the final residue number.

Could anyone please advise how I can loop through all values in $i and shift each one a character to the right?

Many thanks for your help.

#!/bin/bash
# Script to alter position of residue number in pdb file for resid above 999

i=$(awk '{print $5}' wt-test.pdb | uniq)

for i in $i;
do
if [ "$i" -gt "999" ]; then
sed 's/ $i / $i/g' wt-test.pdb > wt-test-out.pdb ;
fi
done

Trumpen · 12-27-2010, 04:44 AM

Awk alone should be enough:

Code:

awk '$5>999{$5=" "$5} 1' wt-test.pdb >wt-test-out.pdb

where I used a blank character to shift the fifth field one character to the right.

By the way, your sed command doesn't work mainly because you are using single quotes
which don't allow $i to be expanded to the value of shell variable $i. Try with double quotes.
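To illustrate the quoting difference, here is a minimal sketch. The sample line and the value 1000 are made up for the demonstration, and the two-space replacement is an assumption about the intended one-character shift.

```shell
#!/bin/bash
# Hypothetical sample value and line (not a complete PDB record).
i=1000
line='ATOM   7001  CA  ALA 1000      11.000'

# Single quotes: sed receives the literal text $i, so nothing matches.
single=$(printf '%s\n' "$line" | sed 's/ $i /  $i/')

# Double quotes: the shell expands $i to 1000 before sed sees the expression.
double=$(printf '%s\n' "$line" | sed "s/ $i /  $i/")

printf '%s\n' "$single" "$double"
```

The first output line should come back unchanged, while the second should have the residue number moved one character to the right with the overall line length kept the same.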

grail · 12-27-2010, 07:10 AM

Or you could put it all together with the uniqueness too:

Code:

awk '!_[$5]++{if($5 > 999)$5=" "$5;print $5}' wt-test.pdb

MTK358 · 12-27-2010, 08:14 AM

Quote:

Originally Posted by ngyz86

#!/bin/bash
# Script to alter position of residue number in pdb file for resid above 999

i=$(awk '{print $5}' wt-test.pdb | uniq)

for i in $i;
do
if [ "$i" -gt "999" ]; then
sed 's/ $i / $i/g' wt-test.pdb > wt-test-out.pdb ;
fi
done

Do you just want the "$i" to be substituted for the variable's value in the argument for sed? If so, use this:

Code:

#!/bin/bash
# Script to alter position of residue number in pdb file for resid above 999

i=$(awk '{print $5}' wt-test.pdb | uniq)

for i in $i;
do
        if [ "$i" -gt "999" ]; then
        sed 's/ '"$i"' /  '"$i"'/g'  wt-test.pdb > wt-test-out.pdb ;
        fi
done

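Note the quoting around each $i: the single-quoted string is closed, "$i" is expanded inside double quotes, and the single-quoted string is then reopened.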

ngyz86 · 01-05-2011, 04:03 AM

Dear all,

Many thanks for all your replies.

I tried awk but after substitution, the specific character spacing between columns is no longer preserved. You can specify your output to be comma or tab or space separated etc but you cannot retain specific character spacing between different columns. I believe that is a drawback of awk.

As for sed, it can retain column spacing after substitution, but if I place it in a loop over all the values I want replaced, it performs each substitution on the same original file, so only the last value ends up substituted in the output. One way around this is to create a temporary file for each substitution, with the next substitution working on the previous temporary output.
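As an alternative to chaining temporary files, the substitutions could be collected in the loop and applied in one sed pass. The following is only a sketch: the three sample lines stand in for wt-test.pdb, the temporary file names come from mktemp, and the two-space replacement is an assumption about the intended shift.

```shell
#!/bin/bash
# Made-up sample data standing in for wt-test.pdb.
in=$(mktemp)
out=$(mktemp)
cat > "$in" <<'EOF'
ATOM   6990  CA  GLY  999      10.000
ATOM   7001  CA  ALA 1000      11.000
ATOM   7010  CA  SER 1001      12.000
EOF

# Collect one substitution per residue number above 999.
args=()
for i in $(awk '{print $5}' "$in" | uniq); do
    if [ "$i" -gt 999 ]; then
        args+=(-e "s/ $i /  $i/")
    fi
done

# Apply all substitutions in a single pass, so the output is written once.
if [ "${#args[@]}" -gt 0 ]; then
    sed "${args[@]}" "$in" > "$out"
else
    cp "$in" "$out"   # nothing above 999: copy the input unchanged
fi

result=$(cat "$out")
rm -f "$in" "$out"
printf '%s\n' "$result"
```

Because the output file is written only once, no earlier substitution gets overwritten by a later one. Note that the pattern matches the number anywhere in the line, so a different column holding the same value would also be changed.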

I have a perl script that can help users adjust pdb column spacings. The script works by splitting each column in the original file on spaces, then reassigning columns to have specific character widths, and values in the column to be left or right aligned. Please take a look at the standard pdb format (from PDB website) before modifying the script. The script must be modified to work on different pdb files, especially if chain numbers are present or absent etc. The script can also renumber atom numbers. I hope it will be useful as a starting point for people who need to modify pdb files or others where column adjustments and alignments are needed.

grail · 01-05-2011, 04:17 AM

Quote:

I tried awk but after substitution, the specific character spacing between columns is no longer preserved. You can specify your output to be comma or tab or space separated etc but you cannot retain specific character spacing between different columns. I believe that is a drawback of awk.

Maybe you could demonstrate what you mean? The limitation is mainly in what has been written so far, not in awk itself.
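As a minimal sketch of that point (the sample line is made up and is not a complete PDB record): assigning to a field makes awk rebuild the record with the output field separator, whereas editing $0 with sub() leaves the remaining spacing alone.

```shell
#!/bin/bash
# Made-up, fixed-width sample line.
line='ATOM   7001  CA  ALA 1000      11.000'

# Assigning to $5 makes awk rebuild $0, joining fields with OFS (one space).
rebuilt=$(printf '%s\n' "$line" | awk '$5 > 999 { $5 = " " $5 } 1')

# Editing $0 directly with sub() keeps the rest of the spacing untouched.
kept=$(printf '%s\n' "$line" | awk '$5 > 999 { sub(" " $5 " ", "  " $5) } 1')

printf '%s\n' "$rebuilt" "$kept"
```

One caveat with this sketch: sub() replaces the first match in the line, so it assumes no earlier column contains the same number surrounded by spaces.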

larryhaja · 01-05-2011, 07:44 AM

Quote:

Originally Posted by ngyz86

As for sed, it can retain column spacing after substitution, but if I place it in a loop over all the values I want replaced, it performs each substitution on the same original file, so only the last value ends up substituted in the output. One way around this is to create a temporary file for each substitution, with the next substitution working on the previous temporary output.

I'm not sure if this is the cause of your problem, but it looks like the file keeps getting overwritten because of the '>' redirection. Try replacing it with '>>'. So I would change this.

Code:

#!/bin/bash
# Script to alter position of residue number in pdb file for resid above 999

i=$(awk '{print $5}' wt-test.pdb | uniq)

for i in $i;
do
if [ "$i" -gt "999" ]; then
sed 's/ $i / $i/g' wt-test.pdb > wt-test-out.pdb ;
fi
done

to

Code:

#!/bin/bash
# Script to alter position of residue number in pdb file for resid above 999

i=$(awk '{print $5}' wt-test.pdb | uniq)

for i in $i;
do
if [ "$i" -gt "999" ]; then
sed 's/ $i / $i/g' wt-test.pdb >> wt-test-out.pdb ;
fi
done
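One thing worth checking before relying on '>>': each sed call reads the whole input file, so every pass of the loop appends a complete copy of it to the output. A small sketch with made-up data (two sample lines, two hypothetical residue numbers, double quotes so that $i expands) shows the effect:

```shell
#!/bin/bash
# Made-up sample data; mktemp creates an empty output file to append to.
in=$(mktemp)
out=$(mktemp)
printf '%s\n' 'ATOM   7001  CA  ALA 1000      11.000' \
              'ATOM   7010  CA  SER 1001      12.000' > "$in"

for i in 1000 1001; do
    # Each pass reads the whole input and appends a whole copy to the output.
    sed "s/ $i /  $i/" "$in" >> "$out"
done

# grep -c '' counts lines without the padding some wc versions add.
in_lines=$(grep -c '' "$in")
out_lines=$(grep -c '' "$out")
rm -f "$in" "$out"
echo "input: $in_lines lines, output: $out_lines lines"
```

With two passes over a two-line file the output ends up with four lines, each copy carrying only one of the substitutions, so the result would still need further work.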