[SOLVED] KSH and BASH return different exit codes from same program for SIGSEGV

tom_mee · 12-06-2016, 11:05 AM

Here is the C program:
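#include <signal.h>     /* kill, SIGSEGV */
#include <stdlib.h>     /* exit */
#include <sys/types.h>  /* pid_t */
#include <unistd.h>     /* getpid */

int main(int argc, char **argv)
{
    pid_t pid;

    pid = getpid();
    kill(pid, SIGSEGV);  /* simulate a crash: send SIGSEGV to ourselves */
    exit(0);             /* normally not reached: SIGSEGV terminates the process */
}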

Here is the script and the output:
#!/bin/ksh
test.xec
ecode=$?
echo "trapped exit code $ecode"
exit $ecode

test.sh: line 2: 22557: Memory fault(coredump)
trapped exit code 267
Memory fault(coredump)
This resulted in KSH crashing.

If I change the first line to #!/bin/bash and run it again, I get:
test.sh: line 3: 24296 Segmentation fault (core dumped) test.xec
trapped exit code 139

BASH appears to behave as expected. I am running on CentOS 6.5. /bin/ksh points to /etc/alternatives/ksh. I tried using the /bin/ksh93, but it behaved the same as /bin/ksh - and resulted in ksh93 crashing.

Is this a BUG in ksh?

jlliagre · 12-06-2016, 03:45 PM

What ksh behavior do you suspect being a bug and why?

tom_mee · 12-06-2016, 09:50 PM

Quote:

Originally Posted by jlliagre

What ksh behavior do you suspect being a bug and why?

1) All searches I've seen report that a SIGSEGV should exit with code 139 - and certainly less than 256
2) A crash report for KSH is created after a command exits with exit code 267 - the shell shouldn't (in my opinion) crash due to the exit code returned by the command it was running.

szboardstretcher · 12-06-2016, 10:08 PM

For extra info on this, I did this:

kshtest.sh

Code:

#!/bin/ksh
exit 0

Then ran this:

Code:

for i in {259..300}; do sed -i '/exit/d' kshtest.sh; echo "exit $i" >> kshtest.sh; ./kshtest.sh; done

And got this:

Code:

Quit
Illegal instruction
Trace/breakpoint trap
Aborted
Bus error
Floating point exception
Killed
User defined signal 1
Segmentation fault
User defined signal 2
Alarm clock
Terminated
Stack fault

Which leads me here: http://www.cse.psu.edu/~deh25/cmpsc3.../sigs.output.1

Which leads me to believe you are exiting while sending a 'segmentation fault' signal.

"Out of range exit values can result in unexpected exit codes. An exit value greater than 255 returns an exit code modulo 256. For example, exit 3809 gives an exit code of 225 (3809 % 256 = 225)."

So 267 % 256 is ... you guessed it ... 11, which is the Segmentation Fault signal.
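The arithmetic above can be checked directly in the shell. This is only a sketch of the modulo step; the helper name low_byte is made up for illustration and is not part of any shell.

```shell
# low_byte VALUE
# Hypothetical helper: reduce an out-of-range exit value to its low 8 bits.
low_byte() {
    echo $(( $1 % 256 ))
}

low_byte 3809   # prints 225
low_byte 267    # prints 11
```

On Linux, signal 11 is SIGSEGV, which matches the "Segmentation fault" line in the output above.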

jlliagre · 12-07-2016, 02:48 AM

Quote:

Originally Posted by tom_mee

1) All searches I've seen report that a SIGSEGV should exit with code 139 - and certainly less than 256

You didn't search in the right places. Standards and manual pages are the first places to look when you suspect a bug.

The shell standard (POSIX) says

Quote:

The exit status of a command that terminated because it received a signal shall be reported as greater than 128.

bash reports 139 and ksh reports 267. Both values are above 128 so both shells are compliant here.

The bash manual page states:

Quote:

The return value of a simple command is its exit status, or 128+n if the command is terminated by signal n.

139 - 128 = 11, and 11 is precisely the "Segmentation Fault" signal number on Linux (and most if not all other Unix and Unix-like OSes).

bash behaves as documented.

The ksh93 manual page states:

Quote:

The value of a simple-command is its exit status; 0-255 if it terminates normally; 256+signum if it terminates abnormally

267 - 256 = 11, the same signal number as with bash, so ksh is behaving as documented. The rationale for ksh using 256 instead of 128 is that exit statuses between 128 and 255 can then not be confused with signals.

No bugs on this side, both shells are POSIX compliant and follow their documented behavior.
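To make the two documented conventions concrete, here is a minimal sketch that decodes a status under either one. The function name describe_status is an assumption for illustration. Note that under the 128+n convention a status of 129 to 255 may also be an ordinary exit, so the "signal" answer in that range is only a guess.

```shell
# describe_status STATUS
# Prints "signal N" if STATUS looks like a signal death, else "exit STATUS".
describe_status() {
    status=$1
    if [ "$status" -gt 256 ]; then
        echo "signal $(( status - 256 ))"   # ksh93 convention: 256 + signum
    elif [ "$status" -gt 128 ]; then
        echo "signal $(( status - 128 ))"   # bash convention: 128 + n (ambiguous)
    else
        echo "exit $status"
    fi
}

describe_status 139   # prints: signal 11
describe_status 267   # prints: signal 11
describe_status 3     # prints: exit 3
```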

Quote:

2) A crash report for KSH is created after a command exits with exit code 267 - the shell shouldn't (in my opinion) crash due to the exit code returned by the command it was running.

Opinions are not authoritative. Here again, you must demonstrate that the shell is misbehaving to establish a bug.

If you look again at the POSIX standard, under the exit builtin, you can read:

Quote:

The exit status shall be n, if specified, except that the behavior is unspecified if n is not an unsigned decimal integer or is greater than 255.

and later:

Quote:

The behavior of exit when given an invalid argument or unknown option is unspecified, because of differing practices in the various historical implementations.

So your script is non-portable and has unspecified behavior as far as the standard is concerned. You fail to check whether the value you pass to exit is acceptable. If there is a bug, that bug is in your script.
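One way to check the value before handing it to exit is sketched below. It assumes ksh93's documented 256+signum encoding and maps it onto the 128+n form; the function name portable_status is hypothetical.

```shell
# portable_status STATUS
# Map a ksh93-style 256+n signal status onto the 128+n form used by bash,
# so the value passed to exit always fits in 0..255.
portable_status() {
    if [ "$1" -gt 256 ]; then
        echo $(( $1 - 256 + 128 ))
    else
        echo "$1"
    fi
}

# Possible use in a script (test.xec is the program from the first post):
#   test.xec
#   exit "$(portable_status $?)"
```

With this, 267 becomes 139 while values already in range pass through unchanged.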

Finally, you suspect ksh shouldn't crash even when one of its builtins is given an out-of-range argument. This is a reasonable assumption, but it is not shared by the ksh implementors, who decided to allow propagating the command exception to the shell itself. They were free to do so, as it doesn't break the POSIX standard and gives ksh the same feature you used in your C program, a feature that bash is AFAIK missing.

I guess you'll agree that your C program, even while its return status shows it got a signal, is not buggy and is behaving as designed. There is no reason not to allow a shell to support such a feature.

tom_mee · 12-12-2016, 07:59 AM

Quote:

Originally Posted by szboardstretcher

So 267 % 256 is ... you guessed it ... 11, which is the Segmentation Fault signal.

I know it translates to the same thing - that is not the problem. The problem is WHY does ksh CRASH?

tom_mee · 12-12-2016, 08:03 AM

Quote:

Originally Posted by jlliagre

Finally, you suspect ksh shouldn't crash even when one of its builtins is given an out-of-range argument. This is a reasonable assumption, but it is not shared by the ksh implementors, who decided to allow propagating the command exception to the shell itself. They were free to do so, as it doesn't break the POSIX standard and gives ksh the same feature you used in your C program, a feature that bash is AFAIK missing.

I guess you'll agree that your C program, even while its return status shows it got a signal, is not buggy and is behaving as designed. There is no reason not to allow a shell to support such a feature.

I would have no problem handling either exit code - that is not really the issue I consider a bug. What I don't understand is WHY ksh crashes after the program exits with SIGSEGV? KSH is the program that creates a crash report.

jlliagre · 12-12-2016, 09:55 AM

It does crash by design or, put differently, it doesn't really crash.

It simply behaves as it would if it had really crashed.

No actual segmentation violation happened in your C program, yet it reports one; the very same situation happens with ksh.

You told ksh to return the exit status of a command (exit $ecode), but this command had no exit status. A segmentation violation was reported to the shell instead. The shell is propagating this information to its caller, a useful ksh feature that bash is, to the best of my knowledge, unable to achieve.
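For comparison, a script can approximate this propagation by hand by sending the decoded signal to itself with kill. The following is only a sketch: the function name reraise is made up, and it guesses that any status above 128 is a signal death, which is not always true under the 128+n convention.

```shell
# reraise STATUS [PID]
# If STATUS encodes a signal death (ksh93: 256+n, bash: 128+n), send that
# signal to PID (default: this shell) and return kill's status.
# Return 1 if STATUS is an ordinary exit status.
reraise() {
    status=$1
    target=${2:-$$}
    if [ "$status" -gt 256 ]; then
        sig=$(( status - 256 ))
    elif [ "$status" -gt 128 ]; then
        sig=$(( status - 128 ))
    else
        return 1                  # ordinary exit status, nothing to re-raise
    fi
    # kill -l N prints the name of signal number N, e.g. SEGV for 11 on Linux
    kill -s "$(kill -l "$sig")" "$target"
}

# Possible use in a script (test.xec is the program from the first post):
#   test.xec
#   ecode=$?
#   reraise "$ecode"    # the shell is normally terminated here by the signal
#   exit "$ecode"       # reached only for ordinary exit statuses
```

The caller of such a script would then see a signal death rather than a plain exit status, which is the effect described above for ksh.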

tom_mee · 12-12-2016, 10:44 AM

Quote:

Originally Posted by jlliagre

It does crash by design or, put differently, it doesn't really crash.

It simply behaves as it would if it had really crashed.

No actual segmentation violation happened in your C program, yet it reports one; the very same situation happens with ksh.

You told ksh to return the exit status of a command (exit $ecode), but this command had no exit status. A segmentation violation was reported to the shell instead. The shell is propagating this information to its caller, a useful ksh feature that bash is, to the best of my knowledge, unable to achieve.

If I run the following:
#!/bin/ksh
test.xec
ecode=$?
echo "trapped exit code $ecode"
exit $ecode

I get the output:
test.sh: line 2: 31748: Memory fault(coredump)
trapped exit code 267
Memory fault(coredump)

AND the ABRT kicks in and generates a crash report for KSH.

However, if I run:
#!/bin/ksh
test.xec
ecode=$?
echo "trapped exit code $ecode"
if [[ $ecode -gt 255 ]]; then
    echo changing ecode
    let "ecode = $ecode - 128"
    echo "ecode now $ecode"
fi
exit $ecode

I get the output:
test1.sh: line 2: 1015: Memory fault(coredump)
trapped exit code 267
changing ecode
ecode now 139

THIS is what I think is NORMAL behaviour - there is no ABRT crash report generated, there is only ONE Memory fault message (which I assume comes from the executable that created the SIGSEGV), AND ksh exits with the expected exit code. Since the program is the same, KSH cannot seem to successfully exit with the same exit code which it generated as a result of the SIGSEGV (i.e. 267), and this causes it to crash.

I ran another test by wrapping the test1.sh script with yet another script. You are correct - KSH does NOT CRASH - however it does create a crash report indicating that it crashed - which is what confused me. So now my question is how come it behaves differently when exit codes 267 and 139 are both SIGSEGV, why does one (267) create a crash report, and the other (139) does not?

Semi-solved

jlliagre · 12-12-2016, 11:20 AM

Quote:

Originally Posted by tom_mee

THIS is what I think is NORMAL behaviour

This is bash behavior. There is no normal behavior to expect as far as standard-compliant shell interpreters are concerned.

Quote:

there is no ABRT crash report generated, there is only ONE Memory fault message (which I assume comes from the executable that created the SIGSEGV), AND ksh exits with the expected exit code.

There is no real exit code to expect from a program that crashed: the exit(0) in your C code was never called. The C program crashed (you simulated a crash). The exception that was reported to the shell is a different field in a C structure; it is not the exit status. Bash folds both into a single value by reporting signals as 128+n, so only exit statuses from 0 to 128 are unambiguous there, while ksh reports signals as 256+n and keeps the full range of exit status values (0 to 255) distinct. That is a `bash` limitation (or convention if you prefer).
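The difference between an exit status and a terminating signal can be observed from a script. The sketch below uses SIGTERM instead of SIGSEGV to avoid core dumps; the helper status_of is a made-up name, and `sh` is assumed to be any POSIX shell on the PATH.

```shell
# status_of CMD [ARGS...]
# Run a command and print the status the calling shell reports for it.
status_of() {
    if "$@"; then echo 0; else echo $?; fi
}

# Child exits normally with status 143.
status_of sh -c 'exit 143'

# Child kills itself with SIGTERM (signal 15), so it never calls exit.
status_of sh -c 'kill -s TERM $$'
```

When run under bash, both lines are expected to print 143, so the two cases cannot be told apart from $? alone. Under ksh93, going by the manual page quoted earlier in the thread, the second one would be reported as 256+15 instead.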

Quote:

Since the program is the same, KSH cannot seem to successfully exit with the same exit code which it generated as a result of the SIGSEGV (i.e. 267), and this causes it to crash.

This is expected too: an exit code is an unsigned 8-bit integer, and 267 is (purposely) out of range.