LinuxQuestions.org
11-27-2012, 10:55 PM   #1
vjramana
Member
 
Registered: Sep 2009
Posts: 87

Rep: Reputation: 0
read pdb file and store x,y,z data into array


I have a PDB (Protein Data Bank) file for my bilayer structure. The data in the file looks like this:

Code:
REMARK   1                     PDB file generated by ptraj (set  1000)
ATOM      1  O22 DDM     1       2.800   4.419  20.868  0.00  0.00
ATOM      2  H22 DDM     1       3.427   4.096  20.216  0.00  0.00
ATOM      3  C22 DDM     1       3.351   5.588  21.698  0.00  0.00
ATOM      4  H42 DDM     1       3.456   5.274  22.736  0.00  0.00
ATOM      5  C23 DDM     1       2.530   6.846  21.639  0.00  0.00
ATOM      6  H43 DDM     1       2.347   7.159  20.611  0.00  0.00
ATOM      7  O23 DDM     1       1.313   6.498  22.334  0.00  0.00
ATOM      8  H23 DDM     1       0.903   5.837  21.771  0.00  0.00
ATOM      9  C24 DDM     1       3.073   8.109  22.266  0.00  0.00
ATOM     10  H44 DDM     1       3.139   7.837  23.319  0.00  0.00
ATOM     11  O24 DDM     1       2.218   9.278  22.007  0.00  0.00
ATOM     12  H24 DDM     1       1.278   9.184  22.179  0.00  0.00
ATOM     13  C25 DDM     1       4.494   8.317  21.764  0.00  0.00
ATOM     14  H45 DDM     1       4.391   8.452  20.687  0.00  0.00
ATOM     15  C26 DDM     1       5.150   9.522  22.451  0.00  0.00
ATOM     16  H46 DDM     1       5.281   9.338  23.517  0.00  0.00
ATOM     17  O26 DDM     1       6.463   9.793  21.813  0.00  0.00
ATOM     18  H26 DDM     1       6.132   9.832  20.913  0.00  0.00
ATOM     19  H47 DDM     1       4.556  10.422  22.293  0.00  0.00
ATOM     20  O25 DDM     1       5.310   7.134  22.038  0.00  0.00
ATOM     21  C21 DDM     1       4.803   5.890  21.312  0.00  0.00
ATOM     22  H41 DDM     1       5.411   5.039  21.619  0.00  0.00
ATOM     23  O14 DDM     1       4.862   6.018  19.861  0.00  0.00
ATOM     24  C14 DDM     1       5.929   5.361  19.124  0.00  0.00
ATOM     25  C13 DDM     1       5.287   4.475  18.028  0.00  0.00
ATOM     26  C12 DDM     1       6.158   3.960  16.892  0.00  0.00
ATOM     27  H32 DDM     1       6.656   3.058  17.247  0.00  0.00
ATOM     28  O12 DDM     1       5.297   3.529  15.813  0.00  0.00
ATOM     29  H12 DDM     1       5.801   2.924  15.263  0.00  0.00
ATOM     30  H33 DDM     1       4.519   5.032  17.491  0.00  0.00
ATOM     31  O13 DDM     1       4.504   3.420  18.590  0.00  0.00
ATOM     32  H13 DDM     1       3.745   3.387  18.002  0.00  0.00
ATOM     33  H34 DDM     1       6.534   4.701  19.745  0.00  0.00
ATOM     34  C15 DDM     1       6.877   6.364  18.569  0.00  0.00
ATOM     35  H35 DDM     1       6.346   7.058  17.919  0.00  0.00
ATOM     36  C16 DDM     1       7.803   7.026  19.651  0.00  0.00
ATOM     37  H36 DDM     1       8.506   6.293  20.046  0.00  0.00
ATOM     38  O16 DDM     1       8.542   8.137  19.068  0.00  0.00
ATOM     39  H16 DDM     1       9.284   7.739  18.608  0.00  0.00
ATOM     40  H37 DDM     1       7.190   7.442  20.451  0.00  0.00
ATOM     41  O15 DDM     1       7.779   5.737  17.573  0.00  0.00
ATOM     42  C11 DDM     1       7.212   4.973  16.463  0.00  0.00
ATOM     43  H31 DDM     1       6.772   5.664  15.744  0.00  0.00
ATOM     44  O11 DDM     1       8.252   4.232  15.669  0.00  0.00
ATOM     45  C71 DDM     1       9.333   5.077  15.113  0.00  0.00
ATOM     46  H71 DDM     1       8.967   6.046  14.774  0.00  0.00
ATOM     47  H72 DDM     1       9.939   5.217  16.008  0.00  0.00
ATOM     48  C72 DDM     1      10.045   4.280  14.061  0.00  0.00
ATOM     49  H73 DDM     1      10.905   3.799  14.528  0.00  0.00
ATOM     50  H74 DDM     1       9.392   3.499  13.672  0.00  0.00
ATOM     51  C73 DDM     1      10.626   5.095  12.881  0.00  0.00
ATOM     52  H75 DDM     1       9.754   5.292  12.256  0.00  0.00
ATOM     53  H76 DDM     1      11.064   6.067  13.108  0.00  0.00
ATOM     54  C74 DDM     1      11.588   4.237  11.991  0.00  0.00
ATOM     55  H77 DDM     1      11.191   3.259  11.720  0.00  0.00
ATOM     56  H78 DDM     1      11.682   4.611  10.971  0.00  0.00
ATOM     57  C75 DDM     1      12.935   4.121  12.765  0.00  0.00
ATOM     58  H79 DDM     1      13.382   5.114  12.712  0.00  0.00
ATOM     59  H80 DDM     1      12.738   4.024  13.833  0.00  0.00
ATOM     60  C76 DDM     1      13.868   3.158  12.105  0.00  0.00
ATOM     61  H81 DDM     1      13.366   2.211  11.905  0.00  0.00
ATOM     62  H82 DDM     1      14.128   3.459  11.090  0.00  0.00
ATOM     63  C77 DDM     1      15.160   2.954  12.878  0.00  0.00
ATOM     64  H83 DDM     1      15.878   2.491  12.202  0.00  0.00
ATOM     65  H84 DDM     1      15.521   3.960  13.090  0.00  0.00
ATOM     66  C78 DDM     1      14.983   2.175  14.214  0.00  0.00
ATOM     67  H85 DDM     1      15.831   2.297  14.888  0.00  0.00
ATOM     68  H86 DDM     1      14.207   2.662  14.804  0.00  0.00
ATOM     69  C79 DDM     1      14.599   0.716  13.949  0.00  0.00
ATOM     70  H87 DDM     1      13.649   0.765  13.417  0.00  0.00
ATOM     71  H88 DDM     1      15.367   0.337  13.276  0.00  0.00
ATOM     72  C80 DDM     1      14.470  -0.193  15.194  0.00  0.00
ATOM     73  H89 DDM     1      15.352  -0.161  15.834  0.00  0.00
ATOM     74  H90 DDM     1      13.690   0.301  15.773  0.00  0.00
ATOM     75  C81 DDM     1      14.189  -1.652  14.698  0.00  0.00
ATOM     76  H91 DDM     1      13.282  -1.613  14.094  0.00  0.00
ATOM     77  H92 DDM     1      14.991  -1.879  13.995  0.00  0.00
ATOM     78  C82 DDM     1      13.956  -2.610  15.877  0.00  0.00
ATOM     79  H93 DDM     1      14.909  -2.750  16.387  0.00  0.00
ATOM     80  H94 DDM     1      13.113  -2.520  16.562  0.00  0.00
ATOM     81  H95 DDM     1      13.766  -3.622  15.519  0.00  0.00
TER       0              0
ATOM     83  O22 DDM     2       1.292   2.253  10.624  0.00  0.00
ATOM     84  H22 DDM     2       2.169   2.339  10.244  0.00  0.00
ATOM     85  C22 DDM     2       1.453   2.811  11.976  0.00  0.00
ATOM     86  H42 DDM     2       1.031   1.990  12.556  0.00  0.00
.
.
.
Columns 6, 7 and 8 hold the x, y and z coordinates, respectively.

I am new to Perl, but I tried to write some simple code to read the PDB file.

Code:
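#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;

# Open the file read-only with a lexical filehandle; $! says why it failed.
open(my $pdb, '<', 'malto.dat') or die "Could not open malto.dat: $!\n";

# Read every line into an array and print them back unchanged.
my @lines = <$pdb>;
print @lines;
close($pdb);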
This code only reads the file and prints it back out.

What I need is code that reads the data, stores columns 6, 7 and 8 in x, y and z variables respectively, and ignores the first line of the file.

I would appreciate any guidance.
Thanks
 
11-28-2012, 07:44 AM   #2
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 2,860

Rep: Reputation: 697
Well, I don't know diddly about perl but I do know about AWK:
Code:
# Print x, y, z (fields 6, 7, 8) for ATOM records only;
# REMARK, TER and any other lines are skipped.
$1 == "ATOM" {
        printf("%f %f %f\n", $6, $7, $8)
}
Save the above as data.awk. With your data in a file named data, run:
Code:
awk -f data.awk data
This produces (first lines shown):
Code:
2.800000 4.419000 20.868000
3.427000 4.096000 20.216000
3.351000 5.588000 21.698000
3.456000 5.274000 22.736000
2.530000 6.846000 21.639000
2.347000 7.159000 20.611000
1.313000 6.498000 22.334000
0.903000 5.837000 21.771000
3.073000 8.109000 22.266000
3.139000 7.837000 23.319000
2.218000 9.278000 22.007000
That can be redirected into a file and read by whatever method or program you are using to analyze the information (C, FORTRAN, whatever). It could also be redirected into an application that expects three numeric values as input.

Perhaps a little more information about what you're doing (and how) would be useful? As in, are you trying to write a program that does the analysis? Is the "TER" tag the termination of a data set and it's time to calculate some stuff? Personally, I'd do this in C (nothing real complicated here) but that may not be to your taste so maybe a little more explanation might make it easier to help out, eh?
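
If "TER" does mark the end of one molecule, as it appears to in the sample (the residue number changes from 1 to 2 right after it), the coordinates could be collected per molecule instead of in one flat list. Here is a rough sketch in Perl, since that is the language the question asks about. The name group_by_ter and the in-memory sample are made up for illustration, and the whitespace-based split is an assumption about the file's layout:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# group_by_ter is a hypothetical helper: it returns a list of molecules,
# each one a reference to a list of [x, y, z] triples.
sub group_by_ter {
    my ($fh) = @_;
    my (@molecules, @current);
    while (my $line = <$fh>) {
        if ($line =~ /^TER\b/) {
            # TER closes the current molecule, if it holds any atoms.
            push @molecules, [@current] if @current;
            @current = ();
        }
        elsif ($line =~ /^ATOM\b/) {
            my @f = split ' ', $line;
            # Fields 6, 7, 8 (1-based) are indexes 5, 6, 7.
            push @current, [ @f[5 .. 7] ] if @f >= 8;
        }
    }
    # Keep a trailing molecule that was not followed by TER.
    push @molecules, [@current] if @current;
    return @molecules;
}

# Demo on lines copied from the question; for the real file use
#   open(my $fh, '<', 'malto.dat') or die "Could not open malto.dat: $!\n";
my $sample = <<'END';
ATOM     80  H94 DDM     1      13.113  -2.520  16.562  0.00  0.00
ATOM     81  H95 DDM     1      13.766  -3.622  15.519  0.00  0.00
TER       0              0
ATOM     83  O22 DDM     2       1.292   2.253  10.624  0.00  0.00
END
open(my $fh, '<', \$sample) or die "Could not open sample: $!\n";
my @molecules = group_by_ter($fh);
close($fh);

my $n = 0;
for my $mol (@molecules) {
    $n++;
    printf "molecule %d: %d atom(s)\n", $n, scalar @$mol;
}
```

Each molecule ends up as its own list of triples, so per-molecule calculations can loop over one list at a time.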

Hope this helps some.
 
11-28-2012, 09:03 AM   #3
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,388
Blog Entries: 2

Rep: Reputation: 900
A problem such as this one virtually screams for a solution in AWK, and tronayne has given pretty much the definitive solution as far as I can tell. However, this Perl one-liner seems to do the trick.
Code:
perl -ne 'next unless /^ATOM\b/; my @columns = split; print "$columns[5], $columns[6], $columns[7]\n";' malto.dat
Note the use of zero-based column numbers.
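
To get the coordinates into arrays rather than printing them straight away, which is what the original question asks for, the same idea can be expanded into a short script. Treat this as a minimal sketch rather than a definitive implementation: read_xyz is a made-up name, the sample is embedded in the script only so that it runs on its own, and it assumes the fields are separated by whitespace as in the data posted above. Anything that is not an ATOM record is skipped, so the REMARK and TER lines are ignored.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read_xyz is a hypothetical helper: it takes an open filehandle and
# returns references to three parallel arrays of x, y and z values.
sub read_xyz {
    my ($fh) = @_;
    my (@x, @y, @z);
    while (my $line = <$fh>) {
        # Keep only ATOM records; REMARK, TER and filler lines are skipped.
        next unless $line =~ /^ATOM\b/;
        # split ' ' splits on runs of whitespace and ignores leading blanks.
        my @columns = split ' ', $line;
        next unless @columns >= 8;
        # Columns 6, 7, 8 in 1-based counting are indexes 5, 6, 7.
        push @x, $columns[5];
        push @y, $columns[6];
        push @z, $columns[7];
    }
    return (\@x, \@y, \@z);
}

# Demo on lines copied from the question; for the real file use
#   open(my $fh, '<', 'malto.dat') or die "Could not open malto.dat: $!\n";
my $sample = <<'END';
REMARK   1                     PDB file generated by ptraj (set  1000)
ATOM      1  O22 DDM     1       2.800   4.419  20.868  0.00  0.00
ATOM      2  H22 DDM     1       3.427   4.096  20.216  0.00  0.00
TER       0              0
END
open(my $fh, '<', \$sample) or die "Could not open sample: $!\n";
my ($x, $y, $z) = read_xyz($fh);
close($fh);

# The three arrays stay index-aligned: atom $i is ($x->[$i], $y->[$i], $z->[$i]).
for my $i (0 .. $#{$x}) {
    print "$x->[$i] $y->[$i] $z->[$i]\n";
}
```

On the embedded sample this prints the two coordinate triples, one per line. One caveat: PDB ATOM records are a fixed-width format, so adjacent fields can run together when the values are wide enough. If that happens in your file, splitting on whitespace will miscount the columns and the fields would have to be cut out by character position instead.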

--- rod.
 
  

