LinuxQuestions.org
Old 06-02-2013, 09:40 PM   #1
Jeff9
Member
 
Registered: Jun 2013
Posts: 36

Rep: Reputation: Disabled
Need help running program in background in BASH


Somehow, I managed to create a program - in C, if that matters - which solves a differential equation (don't worry if you don't understand that). We need the program to run for a very long time (multiple weeks!) on a cluster. I need to figure out how to run it in the background, since my fine ISP doesn't seem to want to give me a consistent internet connection (and I want to disconnect and watch the new Arrested Development during the next few weeks). I am trying to do so using the "&" and "screen" features of BASH, but having very little luck. I am connecting to the cluster using SSH from a Cygwin terminal on my fine Windows Vista PC.

First off, how do you get the "&" feature (I don't know what it's called) to work right? I'm told the "&" after a command should execute that command in the background. But when I try to use that, the background job stops the moment I try to do something else. Then I have to call the job back into the foreground (using "fg"), but then it's no longer running in the background and I have to wait for it to finish. For example:

Code:
prompt>mpirun 4 MYPROG 6 0 &
[1] 11571
prompt>PID 20705 executing on node3.
PID 22374 executing on node4.
PID 14730 executing on node5.
PID 14216 executing on node2.

 Simulating 64x64 oscillators for 9999 timesteps

 Number of PEs =   4

t = 00000. Overwriting files: /raid/data/N=064x064_Beta=0.00_t=0000000_T.csv,        /raid/data/N=064x064_Beta=0.00_t=0000000_U.csv
ls
total 1292
drwxr-xr-x 3 schwarz schwarz   4096 Jun  2 20:55 ./
...

[1]+  Stopped                 mpirun -machinefile $HOME/utils/Host_file -np 4 FPU 6 0
prompt>jobs
[1]+  Stopped                 mpirun -machinefile $HOME/utils/Host_file -np 4 FPU 6 0
prompt>fg 1
mpirun -machinefile $HOME/utils/Host_file -np 4 MYPROG 6 0
Notice where I typed "ls" (I tried to bold it) and got a file listing, and was then informed that job 1 was stopped. I was then forced to bring the job to the foreground using "fg 1".

Also, what happens if a background program has some output, or requires input? I can find no documentation on "&" and could use a good, clear explanation.

Second (is this post too long?), how do I use the "screen" command? I have searched the web far and wide and I see that you create a new screen with "screen -S SCNNAME", which works fine. I checked that it works by adding $STY to my prompt. But then everything I read says that to switch screens you type "Ctrl-a c" or "Ctrl-a n", and none of those Ctrl-a sequences are working for me. Am I using screen in an environment that doesn't support it? Am I reading the right instructions? Am I doing something else wrong? Please tell me how to use "screen" and where I can get good documentation for it (not the "screen --help" or "man screen" crypto-help).

So how can I get my long running program to run so that I can disconnect and come back later, and still see the relevant output? TIA.
 
Old 06-02-2013, 09:48 PM   #2
linosaurusroot
Member
 
Registered: Oct 2012
Distribution: OpenSuSE,RHEL,Fedora,OpenBSD
Posts: 979
Blog Entries: 2

Rep: Reputation: 235
man page:
Screen does not understand the prefix "C-" to mean control, although this
notation is used in this manual for readability. Please use the caret
notation ("^A" instead of "C-a") as arguments to e.g. the escape command
or the -e option. Screen will also print out control characters in caret
notation.
 
Old 06-02-2013, 10:06 PM   #3
Jeff9
Member
 
Registered: Jun 2013
Posts: 36

Original Poster
Rep: Reputation: Disabled
I'm not sure what you're trying to tell me. Can you clarify please?
 
Old 06-02-2013, 10:24 PM   #4
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
If you're going to run a program in the background via '&', you should redirect any output not already directed to a file (e.g. stdout and stderr) to a file.
It should NOT require any input, or it will likely hang waiting for that input. You'd have to bring it back to the foreground to supply the input.
If you want it to continue after you have logged out, you need to prefix it with 'nohup', e.g.
Code:
nohup ./myprog >myprog.log 2>&1 &
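To illustrate the advice above: one common reason a job started with '&' shows up as "Stopped" is that it tried to read from the terminal while in the background. A minimal sketch, assuming that is what happened here, is to redirect stdin from /dev/null as well as redirecting the output. The sh -c '...' command below is only a hypothetical stand-in for the real program; myprog.log is the log name used in the example above.

```shell
# Hypothetical stand-in for a long-running program: prints, waits, prints again.
#   </dev/null        the job can never block waiting for terminal input
#   >myprog.log 2>&1  stdout and stderr both go to the log file
nohup sh -c 'echo started; sleep 1; echo finished' >myprog.log 2>&1 </dev/null &
pid=$!                       # PID of the background job, handy for later checks
echo "background job running as PID $pid"

# Only for this demo: wait for the job so the log can be shown.
# In real use you would log out here and inspect the log later,
# for example with: tail -f myprog.log
wait "$pid"
cat myprog.log
```

With all three standard streams detached from the terminal, the job has nothing left that ties it to the login session.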
 
1 members found this post helpful.
Old 06-04-2013, 10:27 AM   #5
Jeff9
Member
 
Registered: Jun 2013
Posts: 36

Original Poster
Rep: Reputation: Disabled
I am using "screen" in place of "nohup". I understand where you put ">myprog.log" at the end of the execution statement - that is to send output to the file myprog.log. What does "2>&1" do? What do I search on to read about that?
 
Old 06-04-2013, 10:44 AM   #6
AnanthaP
Member
 
Registered: Jul 2004
Location: Chennai, India
Distribution: UBUNTU 5.10 since Jul-18,2006 on Intel 820 DC
Posts: 804

Rep: Reputation: 186
File handle numbers 0, 1 and 2 are, respectively, standard input, standard output and standard error. These are created by default and are available to every process.

By your command line, the output (file handle 1) stands redirected to myprog.log

2>&1 redirects the standard error messages to the standard output which already stands redirected to myprog.log

So myprog.log contains BOTH the standard output and standard errors.

Thus all output is redirected, and the program won't wait indefinitely for a non-existent output device.
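A small sketch of the points above, using a hypothetical helper function emit that writes one line to each output stream (the function and file names are made up for illustration):

```shell
# Hypothetical helper: one line on stdout (handle 1), one on stderr (handle 2).
emit() { echo "to stdout"; echo "to stderr" >&2; }

# Handles 1 and 2 sent to separate files.
emit >out_only.log 2>err_only.log

# Handle 1 sent to a file, then handle 2 pointed at the same place as handle 1.
emit >both.log 2>&1

echo "--- out_only.log"; cat out_only.log
echo "--- err_only.log"; cat err_only.log
echo "--- both.log";     cat both.log
```

The first call shows that the two streams really are separate; the second shows both ending up in one file, as with myprog.log.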

Questions to answer.
(1) What's the role of & in 2>&1?
(2) Would 2>myprog.log work as well?

OK
 
1 members found this post helpful.
Old 06-04-2013, 11:09 AM   #7
Jeff9
Member
 
Registered: Jun 2013
Posts: 36

Original Poster
Rep: Reputation: Disabled
Thanks, Anantha.

The output is not working quite right. I'm running the program as
Code:
mpirun -machinefile $HOME/utils/Host_file -np 30 MYPROG 6 4 >MYPROG.log 2>&1
But it doesn't seem to be capturing everything. I can look at the output directory and see that it is creating output files. The file MYPROG.log has a message for the creation of the first output file, but none of the ones after that. Any idea why MYPROG.log did not contain all the standard output from my program?

To answer your questions:
(1) The role of "&" is obviously some type of delimiter which tells linux not to interpret "1" as a file name.
(2) My first impression is that would work, but I'm guessing this is a trick question and there would be some file conflict issues. By redirecting standard error to standard output, it would combine the two outputs.

Last edited by Jeff9; 06-05-2013 at 10:10 AM.
 
Old 06-04-2013, 03:53 PM   #8
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1957
In file descriptors/redirections, '&' represents a 'file duplicator'. In natural language terms it could be translated as 'the same place as'.

>MYPROG.log 2>&1 means "send stdout to MYPROG.log, and also send stderr to the same place as stdout's current setting" (i.e. also to MYPROG.log).

File descriptors are defined for the main command process launched on the line, mpirun in this case.
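Because 2>&1 copies stdout's setting at the moment it is processed, the order of the redirections matters. A minimal sketch, using a hypothetical emit function and made-up file names, with a command substitution standing in for the terminal:

```shell
# Hypothetical helper: one line on stdout, one on stderr.
emit() { echo "out"; echo "err" >&2; }

# Usual order: stdout is pointed at the file first, then stderr is made a
# duplicate of it, so nothing is left over for the command substitution.
captured_ok=$(emit >order_ok.log 2>&1)

# Reversed order: stderr is duplicated from stdout while stdout still points
# at the command substitution; only afterwards is stdout sent to the file.
captured_rev=$(emit 2>&1 >order_rev.log)

echo "usual order, left outside the file:    '$captured_ok'"
echo "reversed order, left outside the file: '$captured_rev'"
```

On the earlier question of whether 2>MYPROG.log would work as well: writing >MYPROG.log 2>MYPROG.log opens the file twice, and as far as I know the two handles then keep independent write positions and can overwrite each other's output, which is why the 2>&1 form is the usual choice.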

redirections and file descriptors explained

Since you've already mentioned screen though, I think you should forget about backgrounding anything and just use a dedicated screen session for it. Set it running, detach it, and open a new terminal for general use. You can re-attach to it at any time for control and monitoring. It's exactly the kind of thing screen was designed for.

(Don't ask me how to do it though, I don't have much experience with screen either. )
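For readers in the same position, here is a minimal sketch of the detach and re-attach workflow, assuming GNU screen is installed. The session name "sim" and the sleep command are placeholders for the real job. Interactively, the same thing is done with "screen -S sim", starting the job, then pressing Ctrl-a followed by d to detach. Note that Ctrl-a c and Ctrl-a n create and switch windows inside a session; neither one detaches.

```shell
session=sim   # placeholder session name

# Skip quietly if screen is not available on this machine.
if command -v screen >/dev/null 2>&1; then
    # -d -m starts the session already detached; -S gives it a name.
    # 'sleep 5' is a stand-in for the real mpirun command line.
    screen -dmS "$session" sh -c 'sleep 5'

    # List sessions; the new one should be shown as Detached.
    # (Some screen versions return a non-zero exit status from -ls.)
    screen -ls || true

    # To re-attach later, possibly after logging in again:
    #   screen -r sim
    # and press Ctrl-a then d to detach once more.

    # End the demo session by sending it the "quit" command.
    screen -S "$session" -X quit || true
fi
```

The job keeps running inside the session whether or not the SSH connection that started it is still alive.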
 
1 members found this post helpful.
Old 06-05-2013, 06:48 AM   #9
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,600

Rep: Reputation: 1241
If you are using a cluster, then normally the cluster configuration includes a batch process that will do this for you (though this depends on the cluster - a cheap thrown together cluster might not... but then it also isn't really a cluster as it is more just a bunch of nodes on a net).

You can also read the man pages on "batch" and "cron" (batch relies on the at daemon, atd, rather than cron, to implement a simple batch queuing system).
 
1 members found this post helpful.
Old 06-05-2013, 10:19 AM   #10
Jeff9
Member
 
Registered: Jun 2013
Posts: 36

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by David the H.
In file descriptors/redirections, '&' represents a 'file duplicator'. In natural language terms it could be translated as 'the same place as'.

>MYPROG.log 2>&1 means "send stdout to MYPROG.log, and also send stderr to the same place as stdout's current setting" (i.e. also to MYPROG.log).

File descriptors are defined for the main command process launched on the line, mpirun in this case.

redirections and file descriptors explained

Since you've already mentioned screen though, I think you should forget about backgrounding anything and just use a dedicated screen session for it. Set it running, detach it, and open a new terminal for general use. You can re-attach to it at any time for control and monitoring. It's exactly the kind of thing screen was designed for.

(Don't ask me how to do it though, I don't have much experience with screen either. )
Hi David. Thanks for the link. I'll read that page this afternoon.

In the meantime, do you have any idea why MYPROG.log is not receiving output? It seems like the first few lines are sent to that file, but no further output. It seems like the OS might be buffering what it writes to MYPROG.log. But it's now 20 hours since I submitted the job - it is clearly still running because it is creating new output files - but there are no more messages being sent to MYPROG.log. If the file is being buffered, then it will be complete when the job finishes (which will be about 30 hours from now). But if the job hangs - like it did last time - I get no further output.

Any idea what I can do about that? I hope I'm describing it well.

As to using screen (whoever is reading this thread), if I submit the job in the foreground how do I get out of the current screen? Using Ctrl-a c isn't working for me, and I don't understand what linosaurusroot, in the second post, was trying to tell me.
 
Old 06-05-2013, 10:28 AM   #11
Jeff9
Member
 
Registered: Jun 2013
Posts: 36

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by jpollard
If you are using a cluster, then normally the cluster configuration includes a batch process that will do this for you (though this depends on the cluster - a cheap thrown together cluster might not... but then it also isn't really a cluster as it is more just a bunch of nodes on a net).

You can also read the man pages on "batch" and "cron" (batch relies on the at daemon, atd, rather than cron, to implement a simple batch queuing system).
Hi jp. I'll read about batch and cron this afternoon. Thanks for the pointer to them.

Yes, MPI (the Message Passing Interface) submits the job to all of the cluster's processors. But the command line that issues the mpi command
Code:
>mpirun -machinefile $HOME/utils/Host_file -np 30 MYPROG 6 4 >MYPROG.log 2>&1
is in the foreground of the node that I'm on and is waiting for the program to finish - approximately 50 hours - and in the meantime it is displaying the standard output and error output.

The file MYPROG.log has received the first few lines of the standard output, but nothing more. Also, if the job hangs - which it did after 30 hours the last time I ran it - I never get the rest of the output. Is it buffering the output before writing to MYPROG.log? Did it stop sending output to MYPROG.log when I changed the job to the background (using Ctrl-z)?

Thanks.
 
Old 06-05-2013, 10:06 PM   #12
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,600

Rep: Reputation: 1241
Most clusters will be running a batch system, something like PBS or LSF. There are others:

http://en.wikipedia.org/wiki/Job_scheduler

specifically the section on queuing for HPC clusters.

....

If you just use Ctrl-z you didn't put the process in the background - you just suspended it. If you want it to continue running you need to use the "bg" command to resume it.
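The difference between suspended and running can be sketched in a script. Interactively the sequence is Ctrl-z, then "bg %1", then "jobs" to confirm the job is listed as Running. The script below imitates that with signals: Ctrl-z sends SIGTSTP, but SIGSTOP is used here because it has the same suspending effect and works reliably outside an interactive terminal. The ticking loop and the file name ticks.log are hypothetical stand-ins for the real job and its output.

```shell
# Hypothetical stand-in job: writes one "tick" line roughly every 0.1 s.
( i=0; while [ "$i" -lt 100 ]; do echo "tick $i"; i=$((i + 1)); sleep 0.1; done ) >ticks.log &
pid=$!

sleep 0.5
kill -STOP "$pid"                 # suspend the job, much as Ctrl-z would
sleep 0.3
before=$(grep -c . ticks.log)     # lines written so far
sleep 0.5
during=$(grep -c . ticks.log)     # same count: nothing is written while suspended

kill -CONT "$pid"                 # resume it, which is what "bg" does for a stopped job
sleep 0.5
after=$(grep -c . ticks.log)      # the count grows again once the job is resumed

kill "$pid"                       # clean up the demo job
echo "before=$before during=$during after=$after"
```

A suspended job makes no progress at all until it receives the continue signal, which is why Ctrl-z on its own is not a way to background a job.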

Last edited by jpollard; 06-05-2013 at 10:08 PM.
 
1 members found this post helpful.
Old 06-06-2013, 09:03 AM   #13
Jeff9
Member
 
Registered: Jun 2013
Posts: 36

Original Poster
Rep: Reputation: Disabled
Well, it seems to keep running after I used Ctrl-z. Using redirection and "&" didn't supply me with adequate output (Ctrl-z may not, either).
 
Old 06-06-2013, 09:30 AM   #14
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,600

Rep: Reputation: 1241
The only process suspended is the one attached to the terminal. Any processes spawned by mpirun before the suspension will still be running. But the status data they send back to the mpirun control process will go unprocessed (and, if sent over UDP, could be dropped).

It is for the logging that you want one of the cluster-based batch systems: errors from remote processes may get lost otherwise.
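As a rough sketch of what that could look like, assuming the cluster runs PBS/Torque (nothing in this thread confirms which scheduler, if any, is installed): the snippet below writes a job script and shows how it would be submitted. The job name, node counts, walltime and file name myprog.pbs are all placeholders; the mpirun arguments are taken from the command line quoted earlier in the thread.

```shell
# Write a PBS job script; the quoted 'EOF' keeps the $VARIABLES unexpanded
# so that they are evaluated when the job runs, not now.
cat >myprog.pbs <<'EOF'
#!/bin/bash
#PBS -N myprog
#PBS -l nodes=8:ppn=4
#PBS -l walltime=60:00:00
#PBS -j oe
#PBS -o MYPROG.log

# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# The scheduler supplies the list of allocated nodes in $PBS_NODEFILE.
mpirun -machinefile "$PBS_NODEFILE" -np 30 MYPROG 6 4
EOF

# On a PBS cluster the job would then be submitted and monitored with:
#   qsub myprog.pbs
#   qstat
cat myprog.pbs
```

Here "-j oe" asks the scheduler to merge stderr into stdout and "-o" names the file that receives them, so the log is collected by the batch system instead of by a terminal that may be suspended or disconnected.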
 
1 members found this post helpful.
Old 06-06-2013, 09:35 AM   #15
Jeff9
Member
 
Registered: Jun 2013
Posts: 36

Original Poster
Rep: Reputation: Disabled
I think I get the first paragraph - basically, my terminal is suspended, but not running in the background. The cluster's processors are running the "mpispawn" jobs in the background and sending output back to the suspended terminal. The terminal stores the standard output and standard error so that when I call it to the foreground, I can see it. But if the job dies before I return it to the foreground, I will miss the important why-it-died information.

Can you clarify your second paragraph?
 
  


All times are GMT -5.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration