Chemistry problem: file matching and sorting
Dear Programmers
I have a file called ranking.txt that lists 4 chemical compounds in *.sdf file format, named ligands_m1, ligands_m2, ligands_m3 and ligands_m4. Each compound is assigned a score, along with its file location.

------------------------------------------------------------------------
Score  Directory                              Name
37.36  ~/chemscore/ligands_m1/ligands_m1.sdf  ligands_m1
19.35  ~/chemscore/ligands_m2/ligands_m2.sdf  ligands_m2
28.35  ~/chemscore/ligands_m3/ligands_m3.sdf  ligands_m3
30.31  ~/chemscore/ligands_m4/ligands_m4.sdf  ligands_m4
------------------------------------------------------------------------

In the same directories, I have another set of files, called cluster files, also in *.sdf format. Their layout is shown below:

~/chemscore/ligands_m1/
    cluster_ligands_m1_1.sdf
    cluster_ligands_m1_2.sdf
    cluster_ligands_m1_3.sdf
~/chemscore/ligands_m2/
    cluster_ligands_m2_1.sdf
    cluster_ligands_m2_2.sdf
    cluster_ligands_m2_3.sdf
    cluster_ligands_m2_4.sdf
    cluster_ligands_m2_5.sdf
~/chemscore/ligands_m3/
    cluster_ligands_m3_1.sdf
~/chemscore/ligands_m4/
    cluster_ligands_m4_1.sdf
    cluster_ligands_m4_2.sdf
    cluster_ligands_m4_3.sdf
    cluster_ligands_m4_4.sdf
------------------------------------------------------------------------

I need a script that does the following: write a compound to the output ONLY if its score is above 28 and its number of cluster files is less than or equal to 3. For the data above, the output would be:

Score  Directory                              Name        Clusters
37.36  ~/chemscore/ligands_m1/ligands_m1.sdf  ligands_m1  3
28.35  ~/chemscore/ligands_m3/ligands_m3.sdf  ligands_m3  1
------------------------------------------------------------------------

Could anybody please help me solve this problem? Thank you in advance.

Robert. |
try this:
Usage: ligands.pl ranking.txt
Code:
#!/usr/bin/env perl |
Dear bigearsbilly
The script works fine, thank you very much. But it didn't write the number of clusters in the last column. Could you please modify the program to add no_of_clusters as the last column? Also, it would be great if the output were written to a separate file, "output.txt". Thank you very much for your time and consideration.
Regards,
Robert |
Here is a slightly different take:
Code:
find -name 'cluster*' | awk -F_ '{_[$(NF-1)]++}END{while((getline < "ranking.txt") > 0)if($1 > 28 && _[$NF] >=3)print $0" "_[$NF]}' |
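For anyone following along, here is a minimal shell sketch of the same idea, spelled out step by step. Note that the original request asks for at most 3 cluster files (`<= 3`), whereas the one-liner above tests `>= 3`; the sketch follows the request. The function name `filter_ranking`, its arguments and its defaults are my own assumptions, not something from the thread, and it assumes the cluster files sit in BASE_DIR/NAME/ and are named cluster_*.sdf, as in the listing in the first post.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are illustrative, not from the thread):
#   filter_ranking RANKING_FILE BASE_DIR [MIN_SCORE] [MAX_CLUSTERS]
filter_ranking() {
    ranking=$1
    base=$2
    min_score=${3:-28}      # keep rows whose score is strictly above this
    max_clusters=${4:-3}    # keep rows with at most this many cluster files

    echo "Score Directory Name Clusters"
    while read -r score path name; do
        # Skip the header, separator lines and blank lines: anything whose
        # first field contains a character other than a digit or a dot.
        case $score in
            ''|*[!0-9.]*) continue ;;
        esac

        # Count cluster_*.sdf files in BASE_DIR/NAME (stays 0 if none match).
        n=0
        for f in "$base/$name"/cluster_*.sdf; do
            if [ -e "$f" ]; then
                n=$((n + 1))
            fi
        done

        # awk does the floating-point comparison; the shell does the integer one.
        if awk -v s="$score" -v m="$min_score" 'BEGIN { exit !(s + 0 > m + 0) }' \
            && [ "$n" -le "$max_clusters" ]; then
            echo "$score $path $name $n"
        fi
    done < "$ranking"
}
```

Called as `filter_ranking ranking.txt ~/chemscore > output.txt`, this should write the header plus the ligands_m1 and ligands_m3 rows to output.txt for the data in the first post. A directory with no cluster files counts as 0 and therefore passes the cluster test; change that check if such entries should be dropped.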
Dear Grail
Thank you for the script. Your awk script runs without any error but did not produce any output. |
And ???.
It seems you want someone else to do all the work for you. Personally, I don't mind giving people a nudge in the right direction. I think you have certainly received that, and more. |
I am with syg00 on this one ... we have given a fairly good hand on this. I will simply add that the following is as it ran on my machine:
Code:
grail@wetworks:~$ find -name 'cluster*' | awk -F_ '{_[$(NF-1)]++}END{while((getline < "ranking.txt") > 0)if($1 > 28 && _[$NF] >=3)print $0" "_[$NF]}' |
Ditto the last 2 posts.
If you must, maybe change...
print $_, scalar @L;
Not tested, no warranty. |
Dear Billy
Perfect. It worked great. Thank you for the wonderful script.
Regards,
Robert |
Please mark as SOLVED once you have your answer
|