LinuxQuestions.org

kswapnadevi 11-05-2010 11:26 PM

shell scripting for extracting input from different directories
 
I have two input data files, ‘input1’ and ‘input2’, that have the same names (but different values) in several directories: dir1, dir2, dir3, dir4. I have to take these two input files from one directory at a time (i.e., first the inputs from dir1; then, after the final output, the same process for dir2; etc.) and run the shell scripts ‘t1prog’ and ‘t2prog’ on them. The outputs of these two scripts, ‘t1out’ and ‘t2out’, are the input to a Java program named ‘RNA’. I placed these three statements in a shell script called ‘RNA final’. When I run this script, the final output comes from the Java program.

I NEED A SHELL SCRIPT THAT PUTS THESE THREE STATEMENTS IN A LOOP: ACCEPT THE INPUTS (input1 and input2) FROM EACH DIRECTORY ONE AFTER ANOTHER, execute the two scripts ‘t1prog’ and ‘t2prog’, redirect their outputs to ‘t1out’ and ‘t2out’, run the Java program, and redirect the final output to a new file for each directory's inputs, i.e., ‘RNAout’ is the output file for the input from dir1, ‘RNAout1’ is the output file for the input from dir2, and so on.

Code:

sh t1prog > t1out   # at present the script ends with 'done < input1',
                    # because I run it in dir1 only
sh t2prog > t2out   # similarly, 'done < input2'
java RNA            # t1out and t2out are the inputs to this program

(A NEW SCRIPT LINE THAT REDIRECTS THE OUTPUT TO A FILE ‘RNAout’, as a
 fourth line of the script after the Java program has run, is also
 needed.)

THANKS IN ADVANCE.

joec@home 11-05-2010 11:46 PM

While it was a different kind of coding, I was working on something similar: once I hit very large data sets, I had to change my code to optimize the searches. It is not quite what you are looking for, but it may still be useful.

http://forums.theplanet.com/index.ph...dpost&p=605109

catkin 11-06-2010 04:46 AM

Quote:

Originally Posted by kswapnadevi (Post 4150773)
THANKS IN ADVANCE.

I'd like to help but you haven't asked a question!

It might help if you
  1. posted the input data files (or a few lines, if they are big)
  2. the directory+file tree
  3. what happens when you run your script (copying and pasting from the terminal into the post is good), with an explanation of how it differs from what you want.
This would give a clear understanding of what you are doing and what your question is.

kswapnadevi 11-06-2010 06:13 PM

Shell scripting
 
I have written two shell scripts named 't1prog' and 't2prog' and run them with the input given inside the script (done < input1 for t1prog and done < input2 for t2prog). I redirect the output of t1prog to 't1out' and of t2prog to 't2out'. Third, I have written a Java program which takes 't1out' and 't2out' as input (the names are inside the program) and gives the final output. I do this entire process, from the beginning, in a directory 'dir1' where my data files 'input1' and 'input2' are placed.

I NEED A SHELL SCRIPT TO PUT THESE THREE ACTIVITIES IN A LOOP LIKE

1. Accept 'input1' from dir1 and have 't1prog' read it automatically
2. Run the shell script 't1prog' and redirect the output to 't1out'
3. Accept 'input2' from dir1 and have 't2prog' read it automatically
4. Run the shell script 't2prog' and redirect the output to 't2out'
5. Run the Java program 'java RNA' and redirect its output to 'RNA1out'

After executing the above 5 steps, the script has to take the inputs from the 'dir2' directory (the file names and format are the same; the values are different), do the 5 steps, and redirect the output of the 5th step to 'RNA2out'. In this way it has to accept the input from each directory, repeat the 5 steps above, and redirect the output to a separate output file, numbered in the same way. The directories are all named 'dir1', 'dir2', 'dir3', and so on, and are all placed in a single directory 'RNA'.

PLS WRITE A SHELL SCRIPT FOR THIS, WHICH IS ESSENTIAL FOR MY BIOINFORMATICS RESEARCH. THANKS IN ADVANCE.
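
One way to get the numbered output names described above is to derive the number from the directory name. The following is only a minimal sketch: 'output_name' is a hypothetical helper, and the 'RNAout<N>' pattern is an assumption based on the file names mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: map a directory name such as "dir2" to an output
# file name such as "RNAout2" by stripping the leading "dir".
output_name() {
    dir_name=${1##*/}                     # drop any leading path: mfold/dir2 -> dir2
    printf 'RNAout%s\n' "${dir_name#dir}" # dir2 -> 2 -> RNAout2
}

for d in dir1 dir2 dir3; do
    echo "$d -> $(output_name "$d")"
done
```

Inside a loop over the directories, the result could be used as the target of the final redirection, for example 'java RNA > "$(output_name "$d")"'.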

kswapnadevi 11-07-2010 01:05 AM

I am posting all the details herewith. Kindly help me.

The two shell scripts (t1prog and t2prog) are given below; they are working fine. The input for the first script is 't1.det' and for the second is 't1.rnaml'. These two input files are in the 'dir1' folder. I run the scripts as 'sh t1prog > t1out' and 'sh t2prog > t2out' from this directory only. Then I run the Java program with 'java RNA'; t1out and t2out are the input files used in the program, and I get the final output on screen.

The input files 't1.det' and 't1.rnaml' are in different folders, with the same names and different values. Each folder holds the input files for one gene sequence.

In the 'mfold' directory there are 5 directories, and each directory contains these input files (t1.det and t1.rnaml), as shown below:
cd mfold
dir1 dir2 dir3 dir4 dir5
cd dir1
t1.det t1.rnaml
cd ..
cd dir2
t1.det t1.rnaml (like this in all directories)

I request you to please help me with a shell script to automate this process: 't1prog' and 't2prog' first take the inputs from the dir1 folder, run, and redirect their output to 't1out' and 't2out'; then, after the Java program is executed with the statement 'java RNA', its output is redirected to a file like 'RNAout1'. Then the inputs are taken from the dir2 folder, the same process runs again, and the output is redirected to 'RNAout2', and so on for the input values from all 5 folders. Please modify the two shell scripts given below to accept the inputs in that way.
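
Before automating the run, it may help to confirm that every directory really has the layout described above. This is a minimal sketch under that assumption: 'check_inputs' is a hypothetical function name, and the file names 't1.det' and 't1.rnaml' are taken from this post.

```shell
#!/bin/sh
# check_inputs BASE: succeed only if every BASE/dir* directory holds
# both input files named in the post (t1.det and t1.rnaml).
check_inputs() {
    base=$1
    status=0
    for d in "$base"/dir*/; do
        # If the pattern matched nothing, $d is the literal pattern.
        [ -d "$d" ] || { echo "no dir* directories under $base" >&2; return 1; }
        for f in t1.det t1.rnaml; do
            if [ ! -r "$d$f" ]; then
                echo "missing: $d$f" >&2
                status=1
            fi
        done
    done
    return "$status"
}

# Demo on a throwaway tree (mktemp -d is widely available but not POSIX).
demo=$(mktemp -d)
mkdir "$demo/dir1" "$demo/dir2"
touch "$demo/dir1/t1.det" "$demo/dir1/t1.rnaml" "$demo/dir2/t1.det"
check_inputs "$demo" || echo "some inputs are missing"
rm -rf "$demo"
```

On the real data the call would be something like 'check_inputs /path/to/mfold', where the path is a placeholder for wherever the mfold directory lives.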

shell 1: t1prog

#!/bin/bash
PrintVal() { C=${C#*(}; C=${C%)*}; echo $C | sed 's/).*( /-/'; }
while IFS=':=' read A B C
do
    case $(echo $A) in
        Initial*)
            ((NotFirst)) && { echo; echo; } || NotFirst=1
            echo ${B#* }; echo
            Hairpins=0
            ;;

        Hairpin*)
            ((Hairpins==0)) && echo "$A:"
            PrintVal
            ((Hairpins++))
            ;;
        Multi-loop)
            echo "$A:"
            PrintVal
            ;;
    esac
done <t1.det

shell 2: t2prog

#!/bin/bash
# bash, not plain sh: the pvalue[...] array used below is a bash feature
i=1
while read line
do
    freecount=`echo "${line}" | grep -i "free" | wc -l`
    poscount=`echo "${line}" | grep -i "position" | wc -l`
    if [ $freecount -eq 1 ]
    then
        rangeoutput=0
    else
        rangeoutput=1
    fi
    if [ $rangeoutput -eq 0 ]
    then
        value=`echo "${line}" | sed -n 's/<.*>\(.*\)<.*>/\1/p'`
        echo $value
    else
        if [ $poscount -eq 1 ]
        then
            pvalue[$i]=`echo "${line}" | sed -n 's/<.*>\(.*\)<.*>/\1/p'`
            if [ $i -eq 2 ]
            then
                echo "${pvalue[1]} - ${pvalue[2]}"
                i=0
            fi
            i=`expr $i + 1`
        fi
    fi
done <t1.rnaml

-----------------------------
For one input set in 'dir1', I am executing the two scripts above like this:

sh t1prog > t1out
sh t2prog > t2out
java RNA

I need to do this for the inputs in the different directories and redirect the final output, after executing the 'java RNA' statement, to a file.
PLS. KINDLY HELP ME WHICH IS VERY ESSENTAIL FOR ME. THANKING YOU

catkin 11-07-2010 01:38 AM

Everyone at LQ is a volunteer, free to help you or not as they choose.

CAPITAL LETTERS usually denote heavy emphasis or shouting on bulletin boards, thus your "PLS. KINDLY HELP ME WHICH IS VERY ESSENTAIL FOR ME. THANKING YOU" could be taken as demanding -- which is more likely to put people off answering than encourage them.

It also helps us understand if you use code tags; you may prefer to use "Advanced Edit", which has a # button to insert code tags.

Another very useful (and easy) technique for explaining what you are doing is to copy and paste from a terminal emulator, into code tags to show us what you have done. Your
Quote:

cd mfold
dir1 dir2 dir3 dir4 dir5
cd dir1
t1.det t1.rnaml
cd ..
cd dir2
t1.det t1.rnaml (like this in all directories)
would have been clearer as
Code:

c@CW8:/tmp$ cd mfold
c@CW8:/tmp/mfold$ cd dir1
c@CW8:/tmp/mfold/dir1$ ls
t1.det    t1.rnaml
c@CW8:/tmp/mfold/dir1$ cd ..
c@CW8:/tmp/mfold$ cd dir2
c@CW8:/tmp/mfold/dir2$ ls
t1.det    t1.rnaml

Let's start with a clear definition of the requirement, to avoid a misunderstanding and wasting time on a solution to some other requirement.

As I understand it the directory+file hierarchy relative to the mfold directory is (this is output from the tree command):
Code:

.
|-- dir1
|  |-- t1.det
|  `-- t1.rnaml
|-- dir2
|  |-- t1.det
|  `-- t1.rnaml
|-- dir3
|  |-- t1.det
|  `-- t1.rnaml
|-- dir4
|  |-- t1.det
|  `-- t1.rnaml
`-- dir5
    |-- t1.det
    `-- t1.rnaml

Mmm ... maybe this does what you want (not tested). It exits if any of the programs set a non-zero return code, conventionally used to indicate failure. That may not be what you want but it usually is.
Code:

#!/bin/bash

# Ensure system will only find the usual executables; the Java
# bin directory may need adjustment.  Compare with the output
# of echo $PATH on your system
export PATH=/usr/local/bin:/usr/bin:/bin:/usr/lib64/java/bin

# Trap any reference to unset variables
set -o nounset

# Remove all aliases so they do not mask external commands, for example java
unalias -a 

# Don't rely on $PATH to find the scripts
t1prog=/home/kswapnadevi/bin/t1prog  # Adjust as required
t2prog=/home/kswapnadevi/bin/t2prog  # Adjust as required

# Go, baby, go!
for dir in dir1 dir2 dir3 dir4 dir5
do
    cd mfold/$dir || exit 1  # Safer to use a path beginning with / for mfold
    echo "Processing in the $PWD directory"
    sh $t1prog > t1out || exit 1  # sh not required if t1prog is executable
    sh $t2prog > t2out || exit 1  # sh not required if t2prog is executable
    java RNA || exit 1
    cd ../.. || exit 1  # Back to the starting directory so the next relative cd works
done
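
A relative 'cd mfold/$dir' inside a loop only works if each pass starts from the same directory. One common way to guarantee that is to run each pass in a subshell, so the directory change is undone automatically. The sketch below is an illustration of that technique, not a tested replacement for the script above; 'run_in_each' is a hypothetical name.

```shell
#!/bin/sh
# run_in_each BASE COMMAND [ARGS...]
# Runs COMMAND once inside every BASE/dir* directory. The parentheses
# start a subshell, so the cd does not affect the next pass or the caller.
# Stops at the first failure, like the "|| exit 1" lines above.
run_in_each() {
    base=$1
    shift
    for d in "$base"/dir*/; do
        [ -d "$d" ] || continue
        ( cd "$d" && "$@" ) || { echo "failed in $d" >&2; return 1; }
    done
}

# Demo on a throwaway tree: print the name of each directory visited.
demo=$(mktemp -d)
mkdir "$demo/dir1" "$demo/dir2"
run_in_each "$demo" sh -c 'echo "processing $(basename "$PWD")"'
rm -rf "$demo"

# Real use might look like this (all paths are placeholders):
# run_in_each /path/to/mfold sh -c 'sh /path/to/t1prog > t1out && sh /path/to/t2prog > t2out'
```

Note that the shell expands 'dir*' in sorted order, so with ten or more directories 'dir10' would be visited before 'dir2'.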


kswapnadevi 11-08-2010 06:52 PM

Shell scripting
 
When I run the script I get the following errors. The script processes the input in 'dir1' and generates the two output files t1out and t2out. It does not process the inputs in 'dir2'. For each folder's output, i.e., t1out and t2out, the Java program has to be executed and its output has to be redirected to a folder. That is also not working. I am submitting my Java program herewith as well. When I run a single data set, 't1out' and 't2out' (using the following code), the Java program runs successfully and gives output. Please help me. I am confused.

sh t1prog > t1out
sh t2prog > t2out
java RNA

--------------------------------------------------------------------------
errors:
Processing in the /users/rsankar/rna/mfold/dir1 directory
Exception in thread "main" java.lang.NoClassDefFoundError: RNA
Caused by: java.lang.ClassNotFoundException: RNA
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
-------------------------------------------------------------------------
My Java Code:
import java.io.*;
import java.util.*;

public class RNA
{
    public static final String inputFile1 = "t1out";
    public static final String inputFile2 = "t2out";

    private Hashtable rangeTable = new Hashtable();
    private ArrayList multiloopList = new ArrayList();
    private ArrayList hairPinList = new ArrayList();
    private int STARTRANGE = -1;
    private int DEFAULTSTARTRANGE = 40;

    private static final boolean DEBUG = false;

    public RNA()
    {
    }

    public RNA(int STARTRANGE)
    {
        this.STARTRANGE = STARTRANGE;
        if(STARTRANGE == -1) this.STARTRANGE = DEFAULTSTARTRANGE;
    }


    private void init() throws IOException
    {
        FileReader fileReaderObj = new FileReader(inputFile2);
        BufferedReader bufferReaderObj = new BufferedReader(fileReaderObj);

        String line = null;
        line = bufferReaderObj.readLine();
        String saveLine = null;
        boolean advanced = false;
        while(line != null)
        {
            advanced = false;

            if (line.trim().startsWith("-"))
            {
                saveLine = line.trim();
                String multiloop = new String();
                if(DEBUG) System.out.println(saveLine);

                while (((line = bufferReaderObj.readLine()) != null) && !line.trim().startsWith("Hairpin loop:"))
                {
                    if (line.trim().equals("Multi-loop:"))
                    {
                        multiloop = bufferReaderObj.readLine();
                        multiloopList = new ArrayList();
                        multiloopList.add("Multi-loop:");
                        multiloopList.add(multiloop);
                        if(DEBUG) System.out.println("multiloop--->"+multiloop);
                    }
                }

                if(line.trim().equals("Hairpin loop:"))
                {
                    multiloopList.add("Hairpin loop:");
                    while (((line = bufferReaderObj.readLine()) != null) && !line.trim().startsWith("-"))
                    {
                        advanced = true;
                        if(line.trim().length() > 0)multiloopList.add(line);
                        if(DEBUG) System.out.println("Hairpin loop--->"+line);
                    }
                    if(DEBUG) System.out.println("Hairpin loop size--->"+multiloopList.size());
                }
            }
            rangeTable.put(saveLine,multiloopList);
            if(!advanced )
            {
                line = bufferReaderObj.readLine();
                //System.out.println("advance-->"+line);
            }
        }
    }


    public void compare() throws IOException
    {
        FileReader fileReaderObj = new FileReader(inputFile1);
        BufferedReader bufferReaderObj = new BufferedReader(fileReaderObj);
        boolean advanced = false;
        String line = null;
        line = bufferReaderObj.readLine();

        while(line != null)
        {
            advanced = false;
            System.out.println(line);
            if(line.trim().startsWith("-"))
            {
                List list = (ArrayList)rangeTable.get(line.trim());
                while(((line = bufferReaderObj.readLine()) != null) && !line.trim().startsWith("-"))
                {
                    String []result = line.trim().split("-");
                    int left = Integer.parseInt(result[0].trim());
                    int right = Integer.parseInt(result[1].trim());
                    if(left > STARTRANGE && (right - left) > 20)
                    {
                        String multiLoopRange = (String)list.get(1);
                        String []result1 = multiLoopRange.trim().split("-");
                        int multiLoopLeft = Integer.parseInt(result1[0].trim());
                        int multiLoopRight = Integer.parseInt(result1[1].trim());

                        if(left > multiLoopLeft)
                        {
                            for(int i = 3; i < list.size(); i++)
                            {
                                String hairPinValue = (String)list.get(i);
                                String []hairPinresult = hairPinValue.trim().split("-");
                                int hairPinleft = Integer.parseInt(hairPinresult[0].trim());
                                int hairPinRight = Integer.parseInt(hairPinresult[1].trim());

                                if(left < hairPinleft && right > hairPinRight)
                                {
                                    int microRNAleft = left + 21;

                                    if (microRNAleft <= hairPinleft)
                                    {
                                        System.out.println("MIRNA--"+ left +":"+microRNAleft);
                                    }
                                    int microRNAright = hairPinRight + 1 + 21;

                                    if(microRNAright <= right)
                                    {
                                        System.out.println("MIRNA--"+ (hairPinRight+1)+":"+microRNAright);
                                    }
                                }
                            }
                        }
                    }
                    advanced = true;
                }
            }

            if(!advanced) line = bufferReaderObj.readLine();
        }
    }

    public static void main(String []r) throws Exception
    {
        System.out.println("Enter the Starting Micro RNA range to be computed...");
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String sInputRange = in.readLine();
        int inputRange = -1;
        try
        {
            inputRange = Integer.parseInt(sInputRange);
        }
        catch(Exception e)
        {
            System.out.println("Invalid input...computing with DEFAULT Micro RNA Range 40.");
            inputRange = -1;
        }
        RNA rna = new RNA(inputRange);
        rna.init();
        rna.compare();
    }

}
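
A NoClassDefFoundError for RNA usually means that 'java' could not find RNA.class on its class path, which by default is the current directory. One likely cause here is that the script changes into mfold/dir1 while the compiled class lives elsewhere. The program above also opens 't1out' and 't2out' relative to the current directory and reads one line (the range) from standard input. The sketch below addresses both points; 'run_rna' is a hypothetical helper and the class directory is an assumption that has to be adjusted.

```shell
#!/bin/sh
# run_rna CLASS_DIR RANGE OUTFILE
# CLASS_DIR: directory holding the compiled RNA.class (an assumption;
#            point it at wherever the class was actually compiled).
# RANGE:     answer to the "Enter the Starting Micro RNA range" prompt,
#            piped in so the run does not wait for keyboard input.
# OUTFILE:   file that receives everything the program prints.
# Call it from the directory that holds t1out and t2out, because the
# program opens those files relative to the current directory.
run_rna() {
    class_dir=$1
    range=$2
    outfile=$3
    printf '%s\n' "$range" | java -cp "$class_dir" RNA > "$outfile"
}

# Example call (placeholder paths, shown as a comment on purpose):
# cd /path/to/mfold/dir1 && run_rna /path/to/classes 40 RNAout1
```

Because the prompt is printed to standard output, it ends up as the first line of the output file.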


catkin 11-09-2010 07:10 AM

Quote:

Originally Posted by kswapnadevi (Post 4153082)
When I run the script I get the following errors. The script processes the input in 'dir1' and generates the two output files t1out and t2out. It does not process the inputs in 'dir2'. For each folder's output, i.e., t1out and t2out, the Java program has to be executed and its output has to be redirected to a folder. That is also not working. I am submitting my Java program herewith as well. When I run a single data set, 't1out' and 't2out' (using the following code), the Java program runs successfully and gives output. Please help me. I am confused.

Please show us what you did and the resulting output by copying and pasting the terminal session into code tags in a reply to this thread, both for the manual case that works and for the scripted case that does not work. Your narrative descriptions confuse me.
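
If copying from the terminal window is awkward, the output of a run can also be saved to a file while it is displayed, and the file pasted into the reply. This is a minimal sketch using 'tee'; 'log_run' is a hypothetical helper name.

```shell
#!/bin/sh
# log_run LOGFILE COMMAND [ARGS...]
# Shows COMMAND's stdout and stderr on the terminal and saves a copy
# in LOGFILE. The exit status is that of tee, not of COMMAND.
log_run() {
    logfile=$1
    shift
    "$@" 2>&1 | tee "$logfile"
}

# Demo: sh -x prints each command before running it, which helps
# show exactly what a script did.
log=$(mktemp)
log_run "$log" sh -x -c 'echo hello'
rm -f "$log"
```

For the two cases asked about, the manual commands and the scripted run could each be wrapped this way, writing to two different log files.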


All times are GMT -5.