LinuxQuestions.org
12-07-2012, 10:05 AM   #1
ShiGua
LQ Newbie
 
Registered: Nov 2012
Posts: 5

Need help searching for values in file then adding to line


Hello!

I'm currently trying to organize data for some bio research, but I'm not sure how to compare a value against the values in a file. I have two arrays: one contains NM numbers and can be referenced as NM[#]; the other contains symbols, SYM[#]. I also have a file with an NM number on every other line and, between the NM lines, other information that is irrelevant to the matching (but still needs to stay in the file). What I need to do is match every NM[#] in my array to the NM number in the file and add :SYM[#] to the end of that line. The catch is that each NM line in the file starts with a > symbol, which needs to stay there. For example, I have an array NM that looks like:

{NM_23948375 NM_03948274 NM_39482746 NM_20475839} #except there are about 2 thousand values

and SYM:

{fj48g9sk 2idjf8a0s ajsie9rt skdjie8t} #same amount of values as NM

and the file looks like:

>NM_########
AUGCGCUAGCUGAUGCUGAGCACGAUCGAUCGAAA
>NM_########
AUGUCGUAGCUAGCGUAGCUGUAUCGUGAC
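
A minimal sketch of how the two arrays described above might be set up in bash, using the sample values from this post. The length check is an assumption about what would be useful here, since the matching relies on both arrays lining up by index.

```shell
#!/bin/bash
# Sample values copied from the post; the real arrays hold about 2000 entries each.
NM=(NM_23948375 NM_03948274 NM_39482746 NM_20475839)
SYM=(fj48g9sk 2idjf8a0s ajsie9rt skdjie8t)

# The pairing is by position, so the arrays must be the same length.
if (( ${#NM[@]} != ${#SYM[@]} )); then
  echo "NM and SYM differ in length" >&2
  exit 1
fi

# Print each pair to confirm the ordering looks right.
for i in "${!NM[@]}"; do
  printf '%s -> %s\n' "${NM[i]}" "${SYM[i]}"
done
```

Checking the pairs by eye before touching the file is cheap, and a mismatch in length would otherwise shift every symbol after the gap.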

I need to take the first NM number in my NM array and compare it to every other line in the file, ignoring the > in front. When the matching line is found, I need to append :SYM to it, where SYM is the symbol at the same index as the NM number in the array. So: take the first NM number, find its line, add the first symbol; then take the second NM number, match it, add the second symbol; and so on, for a final product that looks like:

>NM_########:SYM
AUGCAGUCGAUCGAUGCUAGUCUACAGCUAUCGGAAA
>NM_########:SYM
AUGCCGUAGCUAGCUACGUACGUGUAGCUGAC
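
To make the target concrete, here is a small sketch that tags a single header with sed. The file name demo.fa, the short sequences, and the variables nm and sym are made up for illustration. In the replacement, & stands for the matched text, and the $ anchor keeps a shorter NM number from matching a longer one that starts the same way.

```shell
#!/bin/bash
# Hypothetical two-record file, written to a temp directory so nothing real is touched.
demo=$(mktemp -d)/demo.fa
printf '%s\n' '>NM_23948375' 'AUGCGCUAGC' '>NM_03948274' 'AUGUCGUAGC' > "$demo"

nm=NM_23948375   # one entry from the NM array
sym=fj48g9sk     # the symbol at the same index in SYM

# Match the whole header line and append :sym; & is the matched text.
result=$(sed "s/^>${nm}\$/&:${sym}/" "$demo")
printf '%s\n' "$result"
```

Only the first header gains the :fj48g9sk suffix; the sequence lines and the other header pass through unchanged.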

I feel like the process should be relatively simple; I'm just completely new at this and am looking for any help. I'm not even sure how to start.

Here's what I have so far (forgive any syntax errors; everything I want to do is in there, I just need help translating it into code). The file to be edited is called file.fa; I can also take it as an argument and refer to it as $1 if that's easier:

Code:
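#!/bin/bash
# Assumes the NM and SYM arrays are already populated and file.fa is in the current directory.
mapfile -t lines < file.fa   # read every line of the file into an array (bash 4+)

for ((i = 0; i < ${#NM[@]}; i++)); do
  for ((j = 0; j < ${#lines[@]}; j += 2)); do
    # header lines sit at even array indexes; strip the leading > before comparing
    if [[ ${lines[j]#>} == "${NM[i]}" ]]; then
      sed -i "$((j + 1))s/.*/>${NM[i]}:${SYM[i]}/" file.fa
    fi
  done
done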
I also have access to perl if that makes things easier. Also, if this is all possible by just using the command line, that'd be simpler for me.

Sorry for the long post and any help is appreciated!
 
12-07-2012, 11:44 AM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,331
Blog Entries: 55

Rep: Reputation: 3531
Couldn't you just step through both arrays with the same index, grep each NM header together with the line after it, and drop element 0 (the old header) from the result?
Code:
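for ((n=0; n<${#NM[@]}; n++)); do
 # header line plus the line after it; the trailing $ stops a shorter NM number matching a longer one
 SEQ=($(grep -m1 -A1 "^>${NM[$n]}\$" file.fa)); unset 'SEQ[0]'   # drop the old header
 printf '>%s:%s\n%s\n' "${NM[$n]}" "${SYM[$n]}" "${SEQ[*]}"
done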
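
An alternative sketch, in case the output should keep the file's original order and leave unmatched records untouched: load the pairs into a bash associative array (bash 4 or later), then read the file once. The function name tag_headers, the output name tagged.fa, and the demo data are made up for illustration; on real data NM and SYM are assumed to be populated already.

```shell
#!/bin/bash
# tag_headers IN OUT: copy IN to OUT, appending :SYM to every >NM header
# that has an entry in the NM array. NM and SYM must already be populated.
tag_headers() {
  local in=$1 out=$2 line id i
  local -A sym_for            # lookup table: NM number -> symbol
  for i in "${!NM[@]}"; do
    sym_for[${NM[i]}]=${SYM[i]}
  done

  while IFS= read -r line || [[ -n $line ]]; do
    id=${line#>}              # header text without the leading >
    if [[ $line == '>'* && -n $id && -n ${sym_for[$id]+set} ]]; then
      printf '>%s:%s\n' "$id" "${sym_for[$id]}"
    else
      printf '%s\n' "$line"   # other lines and unknown headers pass through
    fi
  done < "$in" > "$out"
}

# Small demo on made-up data in a temp directory.
NM=(NM_23948375 NM_03948274)
SYM=(fj48g9sk 2idjf8a0s)
tmp=$(mktemp -d)
printf '%s\n' '>NM_03948274' 'AUGUCGUAGC' '>NM_99999999' 'AUGAAA' > "$tmp/file.fa"
tag_headers "$tmp/file.fa" "$tmp/tagged.fa"
cat "$tmp/tagged.fa"
```

This reads the file once instead of running grep once per NM number, and it writes to a separate file so the original stays intact until the result has been checked.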
 
  


All times are GMT -5.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration