LinuxQuestions.org


zQUEz 08-29-2012 06:53 AM

child script doesn't return to complete parent script once complete.
 
Hi, I have a situation where "something" changed on a server a few months ago, but I don't know what. Consequently, I have an issue with one server that I am not seeing on other servers, and I am stuck on how to troubleshoot further.

I have a bash script that is launched from cron (as root).
That script sets some exports and then launches a ksh script as a second user via su (su - user -c "command.sh param1").
The ksh script completes successfully but never returns to complete commands on the parent bash script.

If I launch the same bash script manually (i.e. eliminating cron), it works as expected and the parent bash script runs its remaining commands after the child script completes.
I have other servers that essentially run the same without issues.
This server "used" to work, but I don't know "what changed" the day that it stopped (months ago).

I have put a simple "logger" command in the bash script right after the ksh command to see if it makes it that far, and it doesn't.
There is no error checking or "exit" statements in the bash script that would cause it to stop prematurely.
There are no errors in cron logs, though it only happens when run from cron.

It appears as though the child job runs but never returns to complete the parent job.
Is this possible?
How could I further troubleshoot to identify the issue?
Strace maybe? Except it only does it when run from cron.
I didn't think it was possible to divorce a child process from its parent. I don't know that that is what is happening, but that is the apparent symptom.

thanks for any pointers.

Update: I just ran a ps listing in tree format, and the child process is clearly still a child of crond --> bash script --> ksh script.

unSpawn 08-29-2012 07:25 AM

Quote:

Originally Posted by zQUEz (Post 4767289)
Hi, I have a situation where "something" has changed on a server a few months ago, but I don't know what.

Personally I like to use a central log to jot down changes, but that requires discipline, so I also keep configuration under version control and can always revert if necessary. If you have regular backups, you could diff them to look for changes?


Quote:

Originally Posted by zQUEz (Post 4767289)
I have other servers that essentially run the same without issues. This server "used" to work, but I don't know "what changed" the day that it stopped (months ago).

Diff relevant components and dependencies between two servers?


Quote:

Originally Posted by zQUEz (Post 4767289)
The ksh script completes successfully but never returns to complete commands on the parent bash script. (..) I have put a simple "logger" command in the bash script right after the ksh command to see if it makes it that far and it doesn't. (..) How could I further troubleshoot to identify the issue? Strace maybe?

You could add some echo statements to the ksh script, launch it from the cron job as '/path/to/ksh -vx /path/to/script' (-v echoes each line as it is read, -x traces each command as it runs), and check the output?
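
A minimal sketch of that idea, not tested against the setup in this thread: a small wrapper that captures everything the traced child prints and then records its exit status, so the log shows whether control ever came back to the parent. The function name run_traced and the log path are made up for illustration; command.sh and param1 are the placeholders from the original post.

```shell
# run_traced LOGFILE COMMAND [ARGS...]
# Appends the command's stdout and stderr to LOGFILE (ksh -vx writes its
# trace to stderr), then appends a line recording the exit status.
run_traced() {
    rt_log=$1
    shift
    "$@" >>"$rt_log" 2>&1
    rt_status=$?
    echo "child exited with status $rt_status" >>"$rt_log"
    return "$rt_status"
}

# Hypothetical use in the parent bash script (paths are placeholders):
# run_traced /tmp/child-trace.log su - user -c "/path/to/ksh -vx /path/to/command.sh param1"
```

If the "child exited" line never shows up in the log when the job runs from cron, the parent is still waiting on the su call rather than failing somewhere after it.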

zQUEz 08-29-2012 07:59 AM

thanks for the feedback.
A diff of the scripts between the servers doesn't reveal any differences that shouldn't be there.
The ksh script does output to a log and that comes out as expected, including the final exit 0 status.
At this stage, I will just delete the scripts and recreate them from templates as they appear to be working everywhere else.

The note about change tracking is heard. We do have that in place and nothing changed according to that; however, clearly "something" did (these things don't just break themselves). I think I need to look into some sort of external script versioning to track and report changes when they occur.

unSpawn 08-29-2012 08:59 AM

Quote:

Originally Posted by zQUEz (Post 4767334)
The ksh script does output to a log and that comes out as expected, including the final exit 0 status.

List open files for the process? As in 'lsof -Pwlnp $PID'?
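
A rough sketch of how that could be run while the cron job is hanging, under a couple of assumptions: the helper name list_open_files is invented here, and the fallback to /proc/PID/fd is an addition for Linux hosts where lsof is not installed. The lsof flags are the ones quoted above.

```shell
# list_open_files PID
# Shows the open files of one process: lsof with the flags suggested above,
# or, as an assumed fallback on Linux, the process's /proc fd directory.
list_open_files() {
    lof_pid=$1
    if command -v lsof >/dev/null 2>&1; then
        lsof -Pwlnp "$lof_pid"
    else
        ls -l "/proc/$lof_pid/fd"
    fi
}

# Hypothetical use, run as root while the job is stuck (command.sh is the
# placeholder script name from the original post):
# for pid in $(pgrep -f command.sh); do list_open_files "$pid"; done
```

Pipes or files that are still open after the ksh script has logged its final exit status would be the first thing to look at in the output.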


All times are GMT -5.