LinuxQuestions.org
10-07-2015, 01:29 AM   #1
kmkocot
Member
 
Registered: Dec 2007
Location: Queensland, Australia
Posts: 122

Rep: Reputation: 15
Need help with Perl script to read in values from a file


Hi all,

I'm trying to write a script that queries a database with the scientific names of organisms to get their taxonomic IDs, so that I can automatically bin the DNA sequences in a dataset as animal, bacterial, etc. I have a text file containing many scientific names that I would like to run through this script. I'm having a hard time editing the script to iterate through the text file, pulling in the names one after the other, and none of the examples I've found on the web match my situation. I realize that I'll need to change the lines "my @scientific_name = ("Neosartorya fischeri NRRL 181");" and "foreach my $scientific_name (@scientific_name) {", but otherwise I'm really stuck.


Here's the script I'm trying to edit:
Code:
use strict;
use warnings;
use Bio::DB::Taxonomy;
use Bio::Tree::Tree;

my @scientific_name = ("Neosartorya fischeri NRRL 181");
my @lineages = ();
my $db = Bio::DB::Taxonomy->new(-source => 'entrez');

foreach my $scientific_name (@scientific_name) {
    my $taxon = $db->get_taxon(-name => @scientific_name);
    my $tree = Bio::Tree::Tree->new(-node => $taxon);
    my @taxa = $tree->get_nodes;
    my @tids = ();
    foreach my $t (@taxa) {
        unshift(@tids, $t->id());
    }
    push(@lineages, @scientific_name . "\t|\t" . $taxon->ancestor() . "\t|\t" . "@tids")
}

foreach my $lineage (@lineages) {
    print "$lineage\n";
}
Here's the first few lines from species_names.txt, which contains the terms I'd like to feed into my @scientific_name:
Code:
Caldicellulosiruptor owensensis OL
Homo sapiens;Homo sapiens;synthetic construct;Homo sapiens
Teredinibacter turnerae T7901
Arcobacter nitrofigilis DSM 7299
Neosartorya fischeri NRRL 181
Homo sapiens;synthetic construct
Ruegeria pomeroyi DSS-3
Planctomyces limnophilus DSM 3776
Planctomyces limnophilus DSM 3776
Flavobacteria bacterium BBFL7
Thank you!
Kevin
 
10-07-2015, 03:40 AM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
I'm not entirely clear, but if you are asking how to read a file line by line:
Code:
open(my $kfile, "<", "kfile.txt") or die "Can't open kfile: $!\n";
while ( defined(my $krec = <$kfile>) )
{
    chomp($krec);

    # Here you do stuff with the rec;
    # Note that your recs seem to sometimes have space separated fields, sometimes ';' separators
}

close($kfile) or die "Can't close kfile: $!\n";
Obviously you rename the vars etc, but you get the idea.

Also, I'd avoid giving scalars and arrays effectively the same name, even if they are in separate name-spaces; it invites hard-to-find typos as the code gets longer.
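
To illustrate the naming point: in the first script, `@scientific_name` appears in places where `$scientific_name` was presumably intended. The following is a minimal sketch of what Perl does in that case. The names `@scientific_names` and `$name` are illustrative renamings, not taken from the thread, and the BioPerl lookup is left out so the snippet runs on its own.

```perl
use strict;
use warnings;

# Illustrative data; the plural array name and $name are assumptions
# chosen to keep the array and the loop variable visually distinct.
my @scientific_names = (
    "Neosartorya fischeri NRRL 181",
    "Ruegeria pomeroyi DSS-3",
);

my @wrong;
my @right;
foreach my $name (@scientific_names) {
    # Concatenation puts the array in scalar context, which yields the
    # number of elements rather than any of the names.
    push @wrong, @scientific_names . "\t|\t";

    # The loop variable holds the current name, which is what was wanted.
    push @right, $name . "\t|\t";
}

print "wrong: $_\n" for @wrong;
print "right: $_\n" for @right;
```

Neither `use strict` nor `use warnings` flags the mistaken line, which is why distinct names help: a typo between `$name` and `@scientific_names` is much easier to spot than one between `$scientific_name` and `@scientific_name`.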

To get separate "fields" from your recs (if you need to), use http://perldoc.perl.org/functions/split.html
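
As a rough sketch of how the line-by-line read and `split` might fit together for a file like the sample in post #1: the helper `read_names` below is hypothetical, and splitting on ';' and dropping repeated names are assumptions about what is wanted, not something stated in the thread. An in-memory string stands in for the names file so the snippet runs without any external file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read one record per line from
# an already-open filehandle, split each record on ';', trim whitespace,
# and keep each distinct name once, in first-seen order.
sub read_names {
    my ($fh) = @_;
    my @names;
    my %seen;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        foreach my $field (split /;/, $line) {
            $field =~ s/^\s+|\s+$//g;      # trim leading/trailing whitespace
            next if $field eq '';          # skip empty fields and blank lines
            push @names, $field unless $seen{$field}++;
        }
    }
    return @names;
}

# In-memory stand-in for the names file, using lines from the sample above.
my $sample = join "\n",
    "Caldicellulosiruptor owensensis OL",
    "Homo sapiens;Homo sapiens;synthetic construct;Homo sapiens",
    "Planctomyces limnophilus DSM 3776",
    "Planctomyces limnophilus DSM 3776",
    "";

open(my $sample_fh, "<", \$sample) or die "Can't open in-memory sample: $!\n";
my @names = read_names($sample_fh);
close($sample_fh) or die "Can't close in-memory sample: $!\n";

print "$_\n" for @names;
```

For a real file, the same helper would take a handle opened with `open(my $fh, "<", $path)`. Whether entries such as "synthetic construct" should be looked up at all is a separate decision for the caller.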


HTH - come back if you need more
 
10-08-2015, 07:47 PM   #3
kmkocot
Member
 
Registered: Dec 2007
Location: Queensland, Australia
Posts: 122

Original Poster
Rep: Reputation: 15
Thanks for the help. I'm a novice with Perl, but between that and some more web searching, I figured out something that works:

Code:
use strict;
use warnings;
use Bio::DB::Taxonomy;
use Bio::Tree::Tree;

my $all_sci_names = `cat scientific_names.txt`;

my @scientific_name = (split/\n/,$all_sci_names);
my @lineages = ();

my $db = Bio::DB::Taxonomy->new(-source => 'entrez');

foreach my $scientific_name (@scientific_name) {
    my $taxon = $db->get_taxon(-name => $scientific_name);
    my $tree = Bio::Tree::Tree->new(-node => $taxon);
    my @taxa = $tree->get_nodes;
    my @tids = ();
    foreach my $t (@taxa) {
        unshift(@tids, $t->id());
    }
    push(@lineages, $scientific_name . "\t|\t" . "@tids")
}

foreach my $lineage (@lineages) {
    print "$lineage\n";
}

#Notes
#http://doc.bioperl.org/bioperl-live/Bio/DB/Taxonomy.html
#http://doc.bioperl.org/bioperl-live/Bio/Tree/Tree.html
#cat mgm4664974.3_organism_GenBank.tab | awk -F "\t" '{print $13}' | sed '/semicolon.\+/d' > scientific_names.txt
#perl get_ancestor_taxonomy_from_MG-RAST_output.pl scientific_names.txt
 
10-08-2015, 08:50 PM   #4
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
Please use the proper Perl way of reading files as per my example.
I'd also point out that reading from a filehandle splits on newlines by default (you can change that for funky files by setting the input record separator, $/).

I also reiterate my advice about not using the 'same' names for scalars/arrays (& indeed hashes).
You'll thank me later...
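
A small sketch of the record-separator point, assuming a made-up ';'-separated string as input (the data and variable names are illustrative only). It reads the same data twice from an in-memory filehandle: once with the default separator and once with `$/` set to ';'.

```perl
use strict;
use warnings;

# Illustrative data: one physical line holding three ';'-separated records.
my $data = "Homo sapiens;synthetic construct;Ruegeria pomeroyi DSS-3";

# Default behaviour: $/ is "\n", so list-context readline returns one
# element per line.
open(my $by_line, "<", \$data) or die "Can't open in-memory data: $!\n";
my @lines = <$by_line>;
chomp @lines;
close($by_line) or die "Can't close in-memory data: $!\n";

# Changing $/ with local limits the change to this block.
my @records;
{
    local $/ = ";";
    open(my $by_semi, "<", \$data) or die "Can't open in-memory data: $!\n";
    @records = <$by_semi>;
    chomp @records;    # chomp strips a trailing $/, here the ';'
    close($by_semi) or die "Can't close in-memory data: $!\n";
}

print scalar(@lines), " line(s), ", scalar(@records), " record(s)\n";
print "$_\n" for @records;
```

Applied to the script in post #3, the same idea would replace the backtick `cat` and the `split/\n/` with an `open` on scientific_names.txt followed by `chomp(my @names = <$fh>);`, which avoids spawning an external command and reports a missing file through `$!`.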
 
10-09-2015, 03:30 AM   #5
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian i686 (solaris)
Posts: 8,104

Rep: Reputation: 2267
http://stackoverflow.com/questions/7...rray-with-perl
 
  


All times are GMT -5. The time now is 06:36 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration