shell script for processing file:

RudraB · 06-29-2010, 07:09 AM

Dear Friends,
I have files which look like:

Quote:

J_2 exchange interactions (mRy)
IQ JQ R direct ln det
Au -Au
1 1 0.500 0.500 0.000 ( 0.70711) 0.00171 0.00171
1 1 1.000 0.000 0.000 ( 1.00000) -0.00015 -0.00015
1 1 1.000 0.500 0.500 ( 1.22474) 0.00006 0.00006
1 1 1.000 1.000 0.000 ( 1.41421) 0.00007 0.00007
1 1 1.500 0.500 0.000 ( 1.58114) 0.00000 0.00000
1 1 1.000 1.000 1.000 ( 1.73205) -0.00001 -0.00001
1 1 1.500 0.500 1.000 ( 1.87083) 0.00000 0.00000
1 1 2.000 0.000 0.000 ( 2.00000) 0.00000 0.00000
1 1 2.000 0.500 0.500 ( 2.12132) 0.00000 0.00000
1 1 1.500 1.500 0.000 ( 2.12132) 0.00000 0.00000
1 1 2.000 1.000 0.000 ( 2.23607) 0.00001 0.00001
Au -Cr
1 1 0.500 0.500 0.000 ( 0.70711) 0.00587 0.00588
1 1 1.000 0.000 0.000 ( 1.00000) 0.00013 0.00013
1 1 1.000 0.500 0.500 ( 1.22474) -0.00048 -0.00048
1 1 1.000 1.000 0.000 ( 1.41421) -0.00116 -0.00116
1 1 1.500 0.500 0.000 ( 1.58114) 0.00014 0.00014
1 1 1.000 1.000 1.000 ( 1.73205) 0.00054 0.00054
1 1 1.500 0.500 1.000 ( 1.87083) -0.00009 -0.00009
1 1 2.000 0.000 0.000 ( 2.00000) -0.00030 -0.00030
1 1 2.000 0.500 0.500 ( 2.12132) -0.00007 -0.00007
1 1 1.500 1.500 0.000 ( 2.12132) 0.00013 0.00013

And so on.
I have a large number of such files. What I am trying to do is extract the 6th and 7th columns (which I need), while keeping track, within each file, of whether the rows belong to Au-Au or Au-Cr.
So far my awk one-liner prints $6 and $7 for every line. How can I write output that also records the Au-Au or Au-Cr label?
One thing is not evident from the file as pasted: the component name (Au-Au, Au-Cr, etc.) starts in the first column of its line, whereas each data line, such as
1 1 1.000 0.500 0.500 ( 1.22474) -0.00048 -0.00048
has a blank space at the beginning. That may help, but I am clueless. Please help.

pixellany · 06-29-2010, 07:28 AM

Do you want to print the 6th and 7th columns for all lines, or only for those that follow "Au -Au" or "Au -Cr"?

To simply print the lines beginning with Au, you can do this:

awk '/^Au/{print}'

RudraB · 06-29-2010, 08:06 AM

No, actually the label may be any of Au-Cr, Au-Au, Cr-Au, Cr-Cr.
I want all the lines below one such label, up to the line where the next label appears, and then the same again until the end of the file.

pixellany · 06-29-2010, 08:23 AM

Pseudo-code:

while read line; do
    if line begins with <Regex1>; then
        echo $line
    elif line begins with <Regex2>; then
        echo $line | awk '{print $6, $7}'
    fi
done < filename > newfilename

Regex1 matches Au-Cr, etc. (This might be simply "^[a-zA-Z]" (all lines starting with a letter) OR "^[AC]" (all lines starting with "A" or "C").)

Regex2 matches the lines where you want columns 6 and 7, e.g. "^1 1" (all lines beginning with "1 1").

The whole thing can be written in AWK, if desired, or in BASH + calling AWK for the columns
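
For what it's worth, here is one way the pseudo-code might look as a single awk program. It is only a sketch under a few assumptions: pair headers look like "Au -Au", data rows split into 9 whitespace fields because the "(" stands alone, and the wanted values are fields 7 and 8. The file names sample.prn and labelled.txt are made up for the example.

```shell
# sample.prn is a hypothetical test file in the format shown earlier
cat > sample.prn <<'EOF'
Au -Au
 1 1 0.500 0.500 0.000 ( 0.70711) 0.00171 0.00171
 1 1 1.000 0.000 0.000 ( 1.00000) -0.00015 -0.00015
Au -Cr
 1 1 0.500 0.500 0.000 ( 0.70711) 0.00587 0.00588
EOF

awk '
  # pair header such as "Au -Au": remember it as the current label
  /^[A-Za-z]+ *-[A-Za-z]+ *$/ { label = $1 $2; next }
  # data row: strip the ")" from field 7, then print label and two values
  label != "" && NF == 9 { sub(/\)/, "", $7); print label, $7, $8 }
' sample.prn > labelled.txt

cat labelled.txt
```

Each output row carries its label (written without the blank, e.g. Au-Au), so the result can be filtered afterwards with grep or a second awk pass. If other fields are wanted, only the two field numbers in the print statement need to change.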

Andrew Benton · 06-29-2010, 09:21 AM

Something like this may work. Save it as a bash script and, when you run it, pass it the name of the data file you want to process as the first argument. E.g., save it as data.bash, then run it with
bash data.bash /path/to/data.file

Code:

#!/bin/bash
set -e

count=0
while read -r myline
do
  # turn myline into an array so we can work on different fields
  line=($myline)
  if [[ ${line[0]} = Au ]] && [[ ${line[1]} = -Au || ${line[1]} = -Cr ]]
  # thing is a temporary variable to hold "Au -Au" or "Au -Cr"
  then thing="${line[0]} ${line[1]}"
  fi
  echo $thing ${line[6]/)/} ${line[7]}
  ((++count))
done < "$1"

RudraB · 06-29-2010, 03:42 PM

I am sorry, but you are not getting my point.
What you have defined as $thing should be the name of a file, and ${line[6]} and ${line[7]} are the data to be written into it.
I tried:

Code:

#!/bin/bash
set -e

count=0
while read myline
do
# turn myline into an array so we can work on different fields
  line=($myline)
  if [[ ${line[0]} = Au ]] && [[ ${line[1]} = -Au || ${line[1]} = -Cr ]]
# thing is a temporary variable to hold "Au -Au" or "Au -Cr"
  then thing="${line[0]} ${line[1]}"
#  echo $thing
  fi
#  echo $thing
  print  ${line[6]/)/} ${line[7]} >$thing
  ((++count))
done < $1

which gives this error:

Code:

$ ./proc.sh test_prn 
./proc.sh: line 15: $thing: ambiguous redirect
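
The error most likely comes from the unquoted $thing in the redirect: once it holds "Au -Au", bash splits the target into two words and refuses it as ambiguous. Two further points in the script above: print is not a bash builtin (echo or printf is), and > truncates the file on every pass where >> would append. A small sketch of the difference, using made-up values:

```shell
#!/bin/bash
thing="Au -Au"

# Unquoted: in bash, $thing splits into two words, so the redirect is
# rejected ("ambiguous redirect") and echo is never run.
if ! ( echo "0.70711 0.00171" > $thing ) 2>/dev/null; then
  echo "unquoted redirect failed"
fi

# Quoted: the target is a single file named "Au -Au"; >> appends to it.
echo "0.70711 0.00171" >> "$thing"
echo "1.00000 -0.00015" >> "$thing"
cat "$thing"
```

Quoting keeps the blank inside the file name. Building the name without a blank in the first place (e.g. "${line[0]}${line[1]}") would avoid the problem altogether.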

Here is test_prn as well:

Quote:

Au -Au
1 1 0.500 0.500 0.000 ( 0.70711) 0.00171 0.00171
1 1 1.000 0.000 0.000 ( 1.00000) -0.00015 -0.00015
1 1 1.000 0.500 0.500 ( 1.22474) 0.00006 0.00006
1 1 1.000 1.000 0.000 ( 1.41421) 0.00007 0.00007
1 1 1.500 0.500 0.000 ( 1.58114) 0.00000 0.00000
1 1 1.000 1.000 1.000 ( 1.73205) -0.00001 -0.00001
1 1 1.500 0.500 1.000 ( 1.87083) 0.00000 0.00000
1 1 2.000 0.000 0.000 ( 2.00000) 0.00000 0.00000
1 1 2.000 0.500 0.500 ( 2.12132) 0.00000 0.00000
1 1 1.500 1.500 0.000 ( 2.12132) 0.00000 0.00000
1 1 2.000 1.000 0.000 ( 2.23607) 0.00001 0.00001
1 1 1.500 1.500 1.000 ( 2.34521) 0.00000 0.00000
1 1 2.000 1.000 1.000 ( 2.44949) 0.00000 0.00000
1 1 2.500 0.500 0.000 ( 2.54951) 0.00000 0.00000
1 1 2.000 1.500 0.500 ( 2.54951) 0.00000 0.00000
1 1 2.500 1.000 0.500 ( 2.73861) 0.00000 0.00000
1 1 2.000 2.000 0.000 ( 2.82843) 0.00000 0.00000
1 1 2.000 1.500 1.500 ( 2.91548) 0.00000 0.00000
1 1 2.500 1.500 0.000 ( 2.91548) 0.00000 0.00000
1 1 3.000 0.000 0.000 ( 3.00000) 0.00001 0.00001
1 1 2.000 2.000 1.000 ( 3.00000) 0.00000 0.00000
1 1 2.500 1.500 1.000 ( 3.08221) 0.00000 0.00000
1 1 3.000 0.500 0.500 ( 3.08221) 0.00000 0.00000
1 1 3.000 1.000 0.000 ( 3.16228) 0.00000 0.00000
1 1 2.500 2.000 0.500 ( 3.24037) 0.00000 0.00000
Au -Cr
1 1 0.500 0.500 0.000 ( 0.70711) 0.00587 0.00588
1 1 1.000 0.000 0.000 ( 1.00000) 0.00013 0.00013
1 1 1.000 0.500 0.500 ( 1.22474) -0.00048 -0.00048
1 1 1.000 1.000 0.000 ( 1.41421) -0.00116 -0.00116
1 1 1.500 0.500 0.000 ( 1.58114) 0.00014 0.00014
1 1 1.000 1.000 1.000 ( 1.73205) 0.00054 0.00054
1 1 1.500 0.500 1.000 ( 1.87083) -0.00009 -0.00009
1 1 2.000 0.000 0.000 ( 2.00000) -0.00030 -0.00030
1 1 2.000 0.500 0.500 ( 2.12132) -0.00007 -0.00007
1 1 1.500 1.500 0.000 ( 2.12132) 0.00013 0.00013
1 1 2.000 1.000 0.000 ( 2.23607) 0.00011 0.00011
1 1 1.500 1.500 1.000 ( 2.34521) 0.00001 0.00001
1 1 2.000 1.000 1.000 ( 2.44949) -0.00002 -0.00002
1 1 2.500 0.500 0.000 ( 2.54951) 0.00005 0.00005
1 1 2.000 1.500 0.500 ( 2.54951) 0.00002 0.00002
1 1 2.500 1.000 0.500 ( 2.73861) 0.00001 0.00001
1 1 2.000 2.000 0.000 ( 2.82843) -0.00001 -0.00001
1 1 2.000 1.500 1.500 ( 2.91548) 0.00000 0.00000
1 1 2.500 1.500 0.000 ( 2.91548) -0.00007 -0.00007
1 1 3.000 0.000 0.000 ( 3.00000) -0.00001 -0.00001
1 1 2.000 2.000 1.000 ( 3.00000) -0.00002 -0.00002
1 1 2.500 1.500 1.000 ( 3.08221) 0.00001 0.00001
1 1 3.000 0.500 0.500 ( 3.08221) -0.00001 -0.00001
1 1 3.000 1.000 0.000 ( 3.16228) 0.00000 0.00000
1 1 2.500 2.000 0.500 ( 3.24037) -0.00002 -0.00002
Cr -Au
1 1 0.500 0.500 0.000 ( 0.70711) 0.00588 0.00589
1 1 1.000 0.000 0.000 ( 1.00000) 0.00013 0.00013
1 1 1.000 0.500 0.500 ( 1.22474) -0.00048 -0.00048
1 1 1.000 1.000 0.000 ( 1.41421) -0.00116 -0.00116
1 1 1.500 0.500 0.000 ( 1.58114) 0.00014 0.00014
1 1 1.000 1.000 1.000 ( 1.73205) 0.00054 0.00054
1 1 1.500 0.500 1.000 ( 1.87083) -0.00009 -0.00009
1 1 2.000 0.000 0.000 ( 2.00000) -0.00030 -0.00030
1 1 2.000 0.500 0.500 ( 2.12132) -0.00007 -0.00007
1 1 1.500 1.500 0.000 ( 2.12132) 0.00013 0.00013
1 1 2.000 1.000 0.000 ( 2.23607) 0.00011 0.00011
1 1 1.500 1.500 1.000 ( 2.34521) 0.00001 0.00001
1 1 2.000 1.000 1.000 ( 2.44949) -0.00002 -0.00002
1 1 2.500 0.500 0.000 ( 2.54951) 0.00005 0.00005
1 1 2.000 1.500 0.500 ( 2.54951) 0.00002 0.00002
1 1 2.500 1.000 0.500 ( 2.73861) 0.00001 0.00001
1 1 2.000 2.000 0.000 ( 2.82843) -0.00001 -0.00001
1 1 2.000 1.500 1.500 ( 2.91548) 0.00000 0.00000
1 1 2.500 1.500 0.000 ( 2.91548) -0.00007 -0.00007
1 1 3.000 0.000 0.000 ( 3.00000) -0.00001 -0.00001
1 1 2.000 2.000 1.000 ( 3.00000) -0.00002 -0.00002
1 1 2.500 1.500 1.000 ( 3.08221) 0.00001 0.00001
1 1 3.000 0.500 0.500 ( 3.08221) -0.00001 -0.00001
1 1 3.000 1.000 0.000 ( 3.16228) 0.00000 0.00000
1 1 2.500 2.000 0.500 ( 3.24037) -0.00002 -0.00002
Cr -Cr
1 1 0.500 0.500 0.000 ( 0.70711) -0.60706 -0.52455
1 1 1.000 0.000 0.000 ( 1.00000) 0.08964 0.06506
1 1 1.000 0.500 0.500 ( 1.22474) -0.11047 -0.11055
1 1 1.000 1.000 0.000 ( 1.41421) -0.04464 -0.04376
1 1 1.500 0.500 0.000 ( 1.58114) -0.00813 -0.00792
1 1 1.000 1.000 1.000 ( 1.73205) -0.03877 -0.03904
1 1 1.500 0.500 1.000 ( 1.87083) 0.00928 0.00922
1 1 2.000 0.000 0.000 ( 2.00000) -0.02249 -0.02250
1 1 2.000 0.500 0.500 ( 2.12132) 0.01496 0.01494
1 1 1.500 1.500 0.000 ( 2.12132) -0.00654 -0.00647
1 1 2.000 1.000 0.000 ( 2.23607) 0.00013 0.00015
1 1 1.500 1.500 1.000 ( 2.34521) -0.00134 -0.00134
1 1 2.000 1.000 1.000 ( 2.44949) 0.00431 0.00430
1 1 2.500 0.500 0.000 ( 2.54951) 0.00693 0.00692
1 1 2.000 1.500 0.500 ( 2.54951) 0.00136 0.00136
1 1 2.500 1.000 0.500 ( 2.73861) -0.00175 -0.00175
1 1 2.000 2.000 0.000 ( 2.82843) 0.00396 0.00396
1 1 2.000 1.500 1.500 ( 2.91548) -0.00149 -0.00150
1 1 2.500 1.500 0.000 ( 2.91548) -0.00053 -0.00053
1 1 3.000 0.000 0.000 ( 3.00000) 0.00023 0.00023
1 1 2.000 2.000 1.000 ( 3.00000) 0.00205 0.00205
1 1 2.500 1.500 1.000 ( 3.08221) -0.00104 -0.00104
1 1 3.000 0.500 0.500 ( 3.08221) -0.00031 -0.00031
1 1 3.000 1.000 0.000 ( 3.16228) -0.00216 -0.00216
1 1 2.500 2.000 0.500 ( 3.24037) -0.00082

Please help.

Andrew Benton · 06-29-2010, 05:48 PM

Maybe this?

Code:

#!/bin/bash
set -e
thing=/dev/null
# start with empty output files
: > "Au -Au"
: > "Au -Cr"
: > "Cr -Au"
: > "Cr -Cr"

count=0
while read -r myline
do
# turn myline into an array so we can work on different fields
  line=($myline)
  if [[ ${line[0]} = Au || ${line[0]} = Cr ]] &&
    [[ ${line[1]} = -Au || ${line[1]} = -Cr ]]
#
# thing is a temporary variable to hold
# "Au -Au", "Au -Cr", "Cr -Au" or "Cr -Cr"
#
  then
    thing="${line[0]} ${line[1]}"
    # the header line itself carries no data, so skip it
    continue
  fi
  echo ${line[6]/)/} ${line[7]} >> "$thing"
  ((++count))
done < "$1"

grail · 06-29-2010, 06:38 PM

How about this, based on the last lot of input:

Code:

awk '{if(NF == 2)print;else print $6,$7}' input_file
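
One caveat, assuming the blank after the opening parenthesis is really present in the files as it is in the listing: awk splits on whitespace, so "(" becomes a field of its own and the numbers after it shift by one. A quick way to check the numbering on a single row before choosing which fields to print:

```shell
# one data row copied from the listing earlier in the thread
row=' 1 1 0.500 0.500 0.000 ( 0.70711) 0.00171 0.00171'

# print every field together with its index
fields=$(printf '%s\n' "$row" | awk '{ for (i = 1; i <= NF; i++) print i ": " $i }')
printf '%s\n' "$fields"
```

With the row as shown, field 6 is the bare "(" and field 7 is the distance with its trailing ")", so the values after the parenthesis are fields 8 and 9.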

RudraB · 06-30-2010, 01:13 AM

Andrew,
Thanks a lot. But I do not understand how it reads the file names.
Actually, I have these directories:

Quote:

$ ls -d */
Au10-Cr90/ Au15-Cr85/ Au20-Cr80/ Au25-Cr75/ Au5-Cr95/

each of which contains its own working file. Hence I tried:

Code:

#!/bin/bash
set -e
thing=/dev/null
echo > "Au -Au"
echo > "Au -Cr"
echo > "Cr -Au"
echo > "Cr -Cr"

c_min=$1
c_max=$2
c_gap=$3
for (( i = $c_min; i <= $c_max; i = i +$c_gap))
do
ca=$i
cb=`echo "100-$i"|bc`
tail -103 Au${ca}-Cr${cb}/jij/AuCr01.prn >./tmp
#awk '{print  $7," " $8}' tmp >jij
count=0
while read myline
do
# turn myline into an array so we can work on different fields
  line=($myline)
  if [[ ${line[0]} = Au || ${line[0]} = Cr ]] &&
    [[ ${line[1]} = -Au || ${line[1]} = -Cr ]]
#
# thing is a temporary variable to hold
# "Au -Au", "Au -Cr", "Cr -Au" or "Cr -Cr"
# 
  then thing="${line[0]} ${line[1]}"
  fi
  echo ${line[6]/)/} ${line[7]} >> "jij_Au$ca-$thing"
  ((++count))
done < $1
#awk '{sub (/)/,"");print}' jij>jij_Au$ca
done

and ran this code as:

Code:

$ ./new_jij.sh 5 25 5
./new_jij.sh: line 35: 5: No such file or directory
./new_jij.sh: line 35: 5: No such file or directory
./new_jij.sh: line 35: 5: No such file or directory
./new_jij.sh: line 35: 5: No such file or directory
./new_jij.sh: line 35: 5: No such file or directory

Will you please help me a bit more?
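
The "No such file or directory" message seems to come from the inner loop's `done < $1`: the first argument is now the number 5 (c_min), so the loop tries to read a file literally named 5 instead of the data. Below is a sketch of one possible fix that reads each directory's data file directly. Several things in it are assumptions rather than part of the original script: the demo input is made up, the output names (jij_Au5_Au-Au and so on) are written without a blank, and read with named fields f1..f8 stands in for the array so that the loop runs in any POSIX shell.

```shell
# Demo input (hypothetical content); the real files already exist
mkdir -p Au5-Cr95/jij
cat > Au5-Cr95/jij/AuCr01.prn <<'EOF'
Au -Au
 1 1 0.500 0.500 0.000 ( 0.70711) 0.00171 0.00171
 1 1 1.000 0.000 0.000 ( 1.00000) -0.00015 -0.00015
Cr -Cr
 1 1 0.500 0.500 0.000 ( 0.70711) -0.60706 -0.52455
EOF

c_min=5; c_max=5; c_gap=5          # in the real script: "$1" "$2" "$3"
i=$c_min
while [ "$i" -le "$c_max" ]
do
  ca=$i
  cb=$((100 - i))
  infile="Au${ca}-Cr${cb}/jij/AuCr01.prn"
  out=""
  # f1..f8 are the first eight whitespace-separated fields of each line
  while read -r f1 f2 f3 f4 f5 f6 f7 f8 rest
  do
    case "$f1 $f2" in
      "Au -Au"|"Au -Cr"|"Cr -Au"|"Cr -Cr")
        out="jij_Au${ca}_${f1}${f2}"   # e.g. jij_Au5_Au-Au
        : > "$out"                     # start this output file empty
        continue ;;
    esac
    [ -n "$out" ] || continue          # nothing to write before the first header
    echo "${f7%\)} $f8" >> "$out"      # drop the trailing ")" from field 7
  done < "$infile"                     # the data file, not "$1"
  i=$((i + c_gap))
done
```

If only the last 103 lines of each file are wanted, the tail step can stay as it was, with the loop ending in `done < ./tmp` instead. The essential change is that the inner loop no longer reads from "$1".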