LinuxQuestions.org
Old 09-04-2011, 08:58 PM   #1
vjramana
selecting digits from regular expression


I shall restate the question to make it clearer. I have a data set as below:
Quote:
HBOND SUMMARY
output to file HB_lowLyo_D_lipid_A_water_001_064.tbl,
data was sorted, intra-residue interactions are NOT included,
Distance cutoff is 4.00 angstroms, angle cutoff is 120.00 degrees
Hydrogen bond information dumped for occupancies > 0.00

DONOR ACCEPTORH ACCEPTOR
atom# res@atom atom# res@atom atom# res@atom %occupied distance angle
| 4645 58@O12 | 23489 1174@H1 23488 1174@O | 22.79 2.945 ( 0.28) 26.79 (14.41)
| 4645 58@O12 | 23490 1174@H2 23488 1174@O | 22.49 2.965 ( 0.31) 28.01 (14.47)
| 2701 34@O12 | 23333 1122@H1 23332 1122@O | 20.60 2.965 ( 0.23) 30.07 (14.18)
| 2701 34@O12 | 23334 1122@H2 23332 1122@O | 19.74 2.963 ( 0.23) 31.43 (13.88)
| 271 4@O12 | 23334 1122@H2 23332 1122@O | 19.70 2.825 ( 0.19) 21.92 (12.15)
| 271 4@O12 | 23333 1122@H1 23332 1122@O | 19.55 2.826 ( 0.19) 22.22 (12.71)
| 4655 58@O16 | 21156 396@H2 21154 396@O | 19.43 2.933 ( 0.22) 31.95 (15.18)
| 4658 58@O15 | 21156 396@H2 21154 396@O | 18.96 3.163 ( 0.27) 37.03 (14.63)
| 4310 54@O26 | 23202 1078@H2 23200 1078@O | 18.73 2.821 ( 0.24) 25.87 (13.92)
| 4655 58@O16 | 21155 396@H1 21154 396@O | 18.63 2.917 ( 0.22) 31.91 (15.00)
| 1820 23@O16 | 21167 400@H1 21166 400@O | 18.14 2.910 ( 0.22) 27.20 (13.87)
| 1820 23@O16 | 21168 400@H2 21166 400@O | 17.96 2.907 ( 0.21) 26.69 (13.86)
| 3845 48@O16 | 23454 1162@H2 23452 1162@O | 17.68 2.991 ( 0.31) 28.45 (14.88)
| 4658 58@O15 | 21155 396@H1 21154 396@O | 17.31 3.177 ( 0.27) 38.82 (14.69)
| 3845 48@O16 | 23453 1162@H1 23452 1162@O | 17.29 3.016 ( 0.32) 28.84 (14.57)
| 1489 19@O13 | 23201 1078@H1 23200 1078@O | 16.66 2.884 ( 0.23) 31.39 (15.56)
| 3824 48@O26 | 21099 377@H2 21097 377@O | 15.44 2.992 ( 0.30) 30.78 (15.01)
| 4253 53@O15 | 23454 1162@H2 23452 1162@O | 14.98 2.961 ( 0.27) 33.71 (15.09)
| 1459 19@O22 | 23201 1078@H1 23200 1078@O | 14.84 3.012 ( 0.33) 35.08 (16.12)
| 1081 14@O12 | 21173 402@H1 21172 402@O | 14.76 2.937 ( 0.24) 27.54 (14.26)
| 4253 53@O15 | 23453 1162@H1 23452 1162@O | 14.63 2.955 ( 0.25) 33.68 (15.11)
| 1081 14@O12 | 21174 402@H2 21172 402@O | 14.41 2.944 ( 0.25) 28.34 (14.35)
| 3824 48@O26 | 21098 377@H1 21097 377@O | 13.70 3.002 ( 0.30) 31.00 (15.21)
| 3845 48@O16 | 21156 396@H2 21154 396@O | 13.06 2.934 ( 0.26) 27.71 (14.05)
.
.
.
few thousand lines
The first field, $1, is the "|" character.
The 3rd field ($3) and the 6th field ($6) in my data file hold the molecule number (the part before the "@"), and the molecules are arranged as below:


Quote:
1 2 3 4 5 6 7 8

9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56

57 58 59 60 61 62 63 64
Any pair made from the numbers above corresponds to the pair in the 3rd and 6th fields of a line in the data file.

What I want is to select the pairs from the data file that are made only of numbers on the outermost rows and columns of the arrangement above.

In short, ANY PAIR made only of the numbers

Quote:
(1 2 3 4 5 6 7 8 57 58 59 60 61 62 63 64 9 17 25 33 41 49 57 8 16 24 32 40 48 56 64)

in other words

1 , 2
1 , 3
1 , 4
.
.
1 , 57
1 , 58
1 , 59
.
.
.
2, 1
2, 3
2, 4
2, 5
.
.
.
2, 57
2, 58
2, 59
.
.
.
need to be deleted from the data file.

To achieve this, I tried to write the awk script below as a test that prints the lines I intend to delete. But even at this stage it fails to select those lines.

Quote:
#!/usr/bin/awk -f

BEGIN {
    i = 0
    for (n = 1; n <= 8; n++) set[i++] = n;
    for (n = 57; n <= 64; n++) set[i++] = n;
    for (n = 9; n <= 49; n += 8) { set[i++] = n; set[i++] = n + 7 };
}

($1 == "|") {
    split($3, res1, "@"); split($6, res2, "@"); #print res1[1], res2[1]

    if ( (res1[1] in set) == (res2[1] in set) );

    {
        print;
    }
}
Can I get any help resolving this?

Thanks in advance

Last edited by vjramana; 09-08-2011 at 07:44 PM. Reason: to make my question more clear
 
Old 09-05-2011, 02:57 AM   #2
grail
Based on the input you have shown, I would expect zero output, as nothing in the sixth column is <= 64.
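One way to check this observation on the full file is to report the smallest and largest residue number found in a given field. The sketch below is only an illustration: `residue_range` is a hypothetical helper name, and it assumes the data rows start with "|" and that the residue number is the part before the "@", as in the sample in post #1.

```sh
# Hypothetical helper: print "min max" of the residue numbers in one field.
# Usage: residue_range FIELD_INDEX < datafile
residue_range() {
    awk -v f="$1" '
    $1 == "|" {                      # only data rows, skip headers
        split($f, part, "@")         # "1174@H1" -> part[1] = "1174"
        r = part[1] + 0              # force numeric comparison
        if (!seen || r < min) min = r
        if (!seen || r > max) max = r
        seen = 1
    }
    END { if (seen) print min, max }
    '
}
```

For example, `residue_range 6 < HB_lowLyo_D_lipid_A_water_001_064.tbl` prints the range of the sixth field. If the minimum is above 64, no line can pair two of the 64 numbered molecules.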
 
Old 09-05-2011, 06:02 AM   #3
Proud
There seems to be an assumption that the | characters are to be ignored, but I think most things default to using spaces/whitespace as field separators.
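A quick way to see how awk's default whitespace splitting numbers the fields is to print each one with its index. This is a minimal sketch; `show_fields` is a hypothetical helper name, not something from the thread.

```sh
# Hypothetical helper: list every field of each input line with its awk index.
show_fields() {
    awk '{ for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }'
}
```

Feeding it one data row, e.g. `sed -n '/^|/{p;q;}' datafile | show_fields`, shows that each standalone "|" counts as a field of its own under the default separator.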

Even ignoring the |s, i.e. numbering the fields by the first line of column headers, your 3rd field holds atom#s, when it seems more intuitive that you'd mean the res part of a res@atom field, i.e. fields 4 and 6.
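For reference, three things in the script from post #1 appear to prevent it from selecting anything useful. It stores the border numbers as array values, whereas awk's `in` operator tests array subscripts. The `;` right after the `if (...)` ends the statement, so the following block prints every line. And `==` is also true when neither residue is in the set. A minimal sketch of what the script seems to be aiming for follows, assuming default whitespace splitting with the residue numbers in $3 and $6 as the original poster describes. The names `edge_pairs`, `edge` and `invert` are made up for illustration.

```sh
# Hypothetical helper.
#   edge_pairs [file...]            print rows whose two residues are BOTH on the border
#   edge_pairs invert=1 [file...]   print everything else (headers included)
edge_pairs() {
    awk '
    BEGIN {
        # Use the residue number as the subscript so that "in" can test it.
        for (n = 1;  n <= 8;  n++)    edge[n] = 1                        # top row
        for (n = 57; n <= 64; n++)    edge[n] = 1                        # bottom row
        for (n = 9;  n <= 49; n += 8) { edge[n] = 1; edge[n + 7] = 1 }   # left and right columns
    }
    {
        both = 0
        if ($1 == "|") {                       # data rows only
            split($3, donor, "@")              # "58@O12"  -> donor[1] = "58"
            split($6, acceptor, "@")           # "1174@H1" -> acceptor[1] = "1174"
            both = ((donor[1] in edge) && (acceptor[1] in edge))
        }
        if (invert ? !both : both) print
    }
    ' "$@"
}
```

Running `edge_pairs datafile` lists the candidate lines for deletion, and `edge_pairs invert=1 datafile > filtered.tbl` writes the file without them. On the sample shown in post #1 the first form prints nothing, because every sixth-field residue number is above 64.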
 
  

