LinuxQuestions.org
12-02-2010, 08:07 PM   #1
kswapnadevi
LQ Newbie

Registered: Oct 2010
Posts: 16

Rep: Reputation: 0
shell scripting testing


I have an input file of 10,000 lines named 'out' in the format shown below. Each pair of lines represents one structure. I give this file as input to an awk script, 'process.awk' (given below), which should create 5,000 directories, one for each structure, and run the mfold tool in each of them with the command 'mfold SEQ=input'. I run it with the command 'awk -f process.awk out'. The script ran and created 1020 directories, and after that it failed with this error:
awk: process.awk:6: (FILENAME=out FNR=1022) fatal: can't redirect to `dir1021/input' (No such file or directory)
I have not been able to correct this. Any help would be highly appreciated. Thanks in advance.

Quote:
Input file 'out':
>Chr5:26236034-26236054
ACCGCCGCCGCCTGCCGCGTA
>Chr25:2622217-2622237
TGATTCTCGCTTTGGGTGCGA
>Chr10:23813143-23813163
AGTTAGTCTTTGTTTTTTGTT
>Chr23:24400416-24400436
AAACACTCAGCTCCCGATCTG
>Chr14:68746745-68746765
TCACATTCTAAGATTTTGCTG
>Chr29:3473120-3473140
CAAATACCATGGTTTCTACAG
>ChrX:62081589-62081609
ACGGGGGGGCGCCGGGGGCCT
>Chr18:31139220-31139240
AAGGGATTGGGAGAGTAGGAT
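process.awk:

Quote:
# Records are separated by ">", so each record is one header line plus its sequence.
BEGIN {
    FS = ">"; RS = ">"; ORS = ""
}
# Skip the empty record that precedes the first ">".
$NF {
    d++
    system("mkdir dir" d)
    print ">" $0 > ("dir" d "/input")
    system("cd dir" d "; mfold SEQ=input")
    system("cd dir" d "; /home/rsankar/bin/mfold SEQ=input")
}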
 
12-02-2010, 08:44 PM   #2
Tinkster
Moderator

Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
And did you check whether dir1021/input exists?

Maybe the script creating the dirs had a partial failure?
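
A quick way to check for such gaps is to loop over the directories and report any that lack an input file. This is a minimal sketch, assuming the dirN/input layout from post #1; the scratch directory and the two sample directories are invented purely so the snippet runs on its own.

```shell
#!/bin/bash
# Demo setup (invented for illustration): dir1 has its input file, dir2 does not.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dir1 dir2
echo '>Chr5:26236034-26236054' > dir1/input

# Report every dirN that lacks an input file.
missing=$(for dir in dir*/; do
    [ -f "${dir}input" ] || echo "missing: ${dir}input"
done)
echo "$missing"
```

Running just the loop in the real working directory would list every dirN that has no input file.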



Cheers,
Tink
 
12-02-2010, 09:12 PM   #3
GrapefruiTgirl
LQ Guru

Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
I'm curious if awk is running out of file descriptors (too many open files) since there are no close() calls in the script. We don't know what OS this is running on (do we?); GNU gawk apparently has no limits (within reason?) but other awk implementations may have limits.

This page may be helpful if lack of file descriptors is a problem:
http://www.gnu.org/manual/gawk/html_...And-Pipes.html
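
To see what limit applies on a given machine, the shell can report it. This is a small sketch, assuming a Linux system: `ulimit -n` is a shell builtin, the /proc lookup is Linux-specific and therefore guarded, and the variable names are chosen here for illustration.

```shell
#!/bin/bash
# Soft limit on open file descriptors for this shell and the processes it starts
# (a number, or "unlimited").
fd_limit=$(ulimit -n)
echo "open-file limit: $fd_limit"

# Descriptors currently open in this shell (Linux only: relies on /proc).
if [ -d "/proc/$$/fd" ]; then
    open_now=$(ls "/proc/$$/fd" | wc -l)
    echo "currently open: $open_now"
fi
```

A limit of 1024 would fit the point at which the script in post #1 stopped, though that alone would not prove the cause.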
 
12-02-2010, 09:38 PM   #4
kswapnadevi
LQ Newbie

Registered: Oct 2010
Posts: 16

Original Poster
Rep: Reputation: 0
shell script testing

I am working on Linux. How should I modify the above awk program? Please help me.

 
12-03-2010, 12:06 PM   #5
GrapefruiTgirl
LQ Guru

Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Have you made progress, or done further investigation yet? If so, what did you find?

I am not sure whether what I suggested above is correct, but the way I figure it: your script is trying to create fd #1021, which would be the 1022nd fd created from the input data; the `mfold` system command would account for the 1023rd; the input file would account for the 1024th; and the awk script itself *might* account for a 1025th, or maybe each of your `mfold` commands accounts for its own fd. Either way, that makes a total of 1024 open fds with an attempt being made to open another one, which fails. The failure hints at a limit of 1024 open fds for your awk version.

If this is all correct (again, I do not know if it is), then the solution would be to call close() once for every file that gets opened, after the script has finished using it. close() takes the same file name string that was used in the redirection, so if it were me, I would add close("file name here") after every `print` statement that writes to a file.
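
For illustration, this is roughly where the close() call would go. It is a sketch only, not a confirmed fix. It wraps the awk program in a small shell script, uses two made-up sample records and a scratch directory so it runs on its own, stores the file name in a variable (`file`, a name chosen here) so the same string reaches both print and close(), and leaves the mfold call as a comment.

```shell
#!/bin/bash
# Sketch only: the sample data and the scratch directory are invented for the demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two records in the same two-line format as the 'out' file from post #1.
printf '%s\n' \
    '>Chr5:26236034-26236054' 'ACCGCCGCCGCCTGCCGCGTA' \
    '>Chr25:2622217-2622237' 'TGATTCTCGCTTTGGGTGCGA' > out

awk '
BEGIN { FS = ">"; RS = ">"; ORS = "" }
$NF {
    d++
    system("mkdir dir" d)
    file = "dir" d "/input"
    print ">" $0 > file
    close(file)    # flush and release this file before moving on
    # The mfold call from the original script would follow here, e.g.
    # system("cd dir" d "; mfold SEQ=input")
}' out

ls -d dir*
```

With close() in place, awk holds at most one output file open at a time, however many records the input has.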

Keep us posted!
 
12-04-2010, 01:35 AM   #6
kswapnadevi
LQ Newbie

Registered: Oct 2010
Posts: 16

Original Poster
Rep: Reputation: 0
Shell scripting testing

I am still investigating round the clock. Please modify the given script by adding the close() statements, madam. I will try that as well.
Thanks in advance.

 
12-04-2010, 08:22 AM   #7
GrapefruiTgirl
LQ Guru

Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Quote:
Originally Posted by kswapnadevi View Post
I am still investigating round the clock. Please modify the given script by adding the close() statements, madam. I will try that as well.
Thanks in advance.

With the information I have given so far, plus the link given above, and based on the snippets of code you have posted in the past, I am reasonably confident that you are capable of adding a single close() statement, with the right stuff inside the brackets, after the print statement. I have used numerous close() statements in my awk code over here:
http://www.linuxquestions.org/questi...2/#post4126234
Have a look and see what I did. Modify your code. Test it. Show us the results if it fails (copy + paste the errors of execution of your program) and show us the code again, with the close() statements added.

Please allow me to remind you again though: I do not know if this will address the issue, or if the number of fd's is even the problem, but if it were me, I would be doing exactly what I'm suggesting you try: close() statements.

Good luck!
 
12-04-2010, 09:30 AM   #8
grail
LQ Guru

Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
As I have said in other threads by the same OP on the same topic: why not just use bash, seeing all the calls to system()?
A simple while loop could easily read from the file and then issue all the commands that you currently have inside the system() calls.

Something along the lines of:
Code:
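#!/bin/bash

# Open the input file for reading on file descriptor 3.
exec 3<f2

d=1

# Each record is two lines: read the first here and the second inside the loop.
while read -u 3 -r fline
do
    read -u 3 -r sline

    DIR="dir$((d++))"
    mkdir "$DIR"
    echo "$fline" > "$DIR/input"
    echo "$sline" >> "$DIR/input"
    ...
done

# Close file descriptor 3.
exec 3>&-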
Seems pretty simple if you ask me.
 
12-04-2010, 09:41 AM   #9
catkin
LQ 5k Club

Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
Even simpler with (untested but I saw it on LQ the other day!):
Code:
#!/bin/bash

d=1

while read -u 3 -r fline
do
    ...
done 3<f2
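
Filled out into something runnable, the loop might look like the sketch below. The sample data and scratch directory are invented for the demo, and the file name `out` is taken from post #1 rather than the `f2` used above. `read -r ... <&3` is used as a portable spelling of `read -u 3 -r`, and mfold is only called if it is found on the PATH.

```shell
#!/bin/bash
# Sketch only: the sample data and the scratch directory are invented for the demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two records in the same two-line format as the 'out' file from post #1.
printf '%s\n' \
    '>Chr5:26236034-26236054' 'ACCGCCGCCGCCTGCCGCGTA' \
    '>Chr25:2622217-2622237' 'TGATTCTCGCTTTGGGTGCGA' > out

d=1
# Read the header line, then the sequence line, from file descriptor 3.
while read -r fline <&3 && read -r sline <&3
do
    dir="dir$d"
    d=$((d + 1))
    mkdir "$dir"
    printf '%s\n%s\n' "$fline" "$sline" > "$dir/input"
    # Run mfold only if it is installed; the invocation is the one from post #1.
    if command -v mfold >/dev/null 2>&1; then
        (cd "$dir" && mfold SEQ=input)
    fi
done 3<out
```

Because each `>` redirection opens and closes its file within a single command, no descriptors pile up as the loop runs.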
 