[SOLVED] program segfaults 80% of the time on OpenBSD 6.2, but never on Linux
I'm maintaining a program called rmw. I'm normally not a BSD user, but I sometimes test it on OpenBSD 6.2. Last night when I tested, I found it segfaults about 80% of the time when it's run (no arguments are needed to reproduce the segfault).
gdb reports the fault at L107. I don't see any problems with the code though, and as I mentioned, it never segfaults on Linux.
Any feedback would be appreciated.
EDIT: I found what may be the cause and am working on it. I'll post here with the results...
UPDATE: Nope. I've refactored a bit of related code, added a few commits, but still haven't solved the problem.
Last edited by Andy Alt; 11-15-2018 at 10:46 PM.
Reason: fix typo
OpenBSD 6.2 is no longer supported. It was end-of-life with the release of 6.4 on October 18. You may want to run your tests on this newest release, or, for bug reporting purposes, on a recent development snapshot. It may help to know that the OpenBSD Project publishes two releases a year, and only supports the most recent two.
Your gdb backtrace may provide an indication of what is occurring within isspace(3), and perhaps point to a root cause.
While there are a few OpenBSD users like me here, this forum is not an official support channel for the OS, and our expertise is limited.
Last edited by jggimi; 11-14-2018 at 11:29 AM.
Reason: clarity
Quote:
OpenBSD 6.2 is no longer supported. It was end-of-life with the release of 6.4 on October 18. You may want to run your tests on this newest release, or, for bug reporting purposes, on a recent development snapshot. It may help to know that the OpenBSD Project publishes two releases a year, and only supports the most recent two.
Ok, I upgraded to 6.4 last night. Same results.
Quote:
Your gdb backtrace may provide an indication of what is occurring within isspace(3), and perhaps point to a root cause.
I'm not very proficient yet with debugging tools but I'll keep practicing.
Quote:
While there are a few OpenBSD users like me here, this forum is not an official support channel for the OS, and our expertise is limited.
Is there a good forum you'd recommend for a problem like this?
This is the most current version of the function causing difficulties...
Code:
/*
 *
 * trim_white_space: remove trailing blanks, tabs, newlines, carriage returns
 *
 */
void
trim_white_space (char *str)
{
  /* Advance pointer until NULL terminator is found */
  while (*str != '\0')
    str++;

  /* set pointer to segment preceding NULL terminator */
  str--;

  while (isspace ((unsigned int)*str))
    {
      *str = '\0';
      str--;
    }

  return;
}
Last edited by Andy Alt; 11-15-2018 at 10:44 PM.
Reason: adding code snippet
The problem occurs within the isspace(3) library function. I assume that str contains an invalid address. A similar value may be a valid address in a Linux process address space, which could explain why you don't see the error appear on Linux systems.
The OpenBSD Project provides support by email, specifically through its mailing lists. The bugs@ list is for bug reporting, but at this point it is not clear that there is a bug in the OpenBSD C library to report.
If this were my problem, my assumption would be it was my application at fault, and probably my application's handling of str. I'd initiate an informal query on the misc@ mailing list, asking for debugging assistance of my application, to obtain help discovering the root cause. If it turns out later that it's a problem with the C library, then I'd have enough information at that time to make a formal bug report.
Last edited by jggimi; 11-16-2018 at 07:03 AM.
Reason: typo
Your code will underflow the buffer if the buffer is empty, or contains only isspace() characters, which could potentially cause a segfault and is something you should address.
Other than that, most likely, you're passing the function a bad pointer from elsewhere in your code. Checking for str == NULL and either writing out an error, or just returning from the function without doing anything would probably also be a good idea.
Lastly, I don't believe you need the (unsigned int) cast but that's a minor point.
NULL strings weren't the problem in this case (I use quite a bit of error-checking in my program). The address was going out of bounds during my subtraction, going past &str[0] in the wrong direction.
This was happening while the rmw config file was being read. The config files on my Linux and BSD systems had some subtle differences, which may be why it didn't reproduce on Linux. However, my Linux rmw config file does contain some lines with only whitespace, so I'm pretty sure Linux should have segfaulted too. My guess is that something at the system or compiler level prevented the address from "underflowing" even without the proper code. (I forgot to mention that the segfault doesn't happen on OSX either.) But I think this is a good change, and I surely do appreciate talking through the problem with both of you.
Code:
void
trim_white_space (char *str)
{
  if (str == NULL)
    {
      MSG_ERROR;
      fprintf (stderr, _("String passed to %s is NULL.\n\
Please report this bug to the rmw development team. Exiting...\n"), __func__);
      exit (EXIT_FAILURE);
    }

  char *pos_0 = str;

  /* Advance pointer until NULL terminator is found */
  while (*str != '\0')
    str++;

  /* set pointer to segment preceding NULL terminator */
  if (str != pos_0)
    str--;
  else
    return;

  while (isspace ((unsigned int)*str))
    {
      *str = '\0';
      if (str != pos_0)
        str--;
      else
        break;
    }

  return;
}
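For comparison, the same bounds check can also be written with an index instead of a saved start pointer. This is only a sketch of an alternative, not the code rmw uses. The name trim_trailing_space is hypothetical, and it uses assert for the NULL case where the function above prints an error and exits.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Sketch: remove trailing whitespace without ever looking at a byte
 * before str[0]. */
static void
trim_trailing_space (char *str)
{
  assert (str != NULL);

  size_t len = strlen (str);

  /* len > 0 is tested first, so str[len - 1] is only read while it is
   * still inside the string. */
  while (len > 0 && isspace ((unsigned char) str[len - 1]))
    str[--len] = '\0';
}
```

An empty string or a string made up only of whitespace simply ends up empty, because the loop stops as soon as len reaches zero.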
Quote:
Lastly, I don't believe you need the (unsigned int) cast but that's a minor point.
This is mentioned in the isspace() man page:
Quote:
NOTES
The standards require that the argument c for these functions is either
EOF or a value that is representable in the type unsigned char. If the
argument c is of type char, it must be cast to unsigned char, as in the
following example:
It's always worked fine for me without any casting at all, and I only recently made that change. But it seems I should change the cast from unsigned int to unsigned char if I want to keep things compliant. I'm not sure. As you say, it's a minor point, but worth mentioning for the people who are learning.
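For anyone following along, here is a small sketch of the cast the man page is asking for. The function name count_spaces is invented for this example. On platforms where plain char is signed, a byte above 0x7F has a negative value, and casting it to unsigned int produces a very large number rather than something in the 0 to UCHAR_MAX range. Casting to unsigned char keeps the argument in the range the isspace() family accepts.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Sketch: count whitespace characters in a string, casting each char
 * to unsigned char before passing it to isspace(). */
static size_t
count_spaces (const char *s)
{
  assert (s != NULL);

  size_t n = 0;

  for (; *s != '\0'; s++)
    if (isspace ((unsigned char) *s))
      n++;

  return n;
}
```

With this cast, bytes with the high bit set (for example from UTF-8 or Latin-1 text in a config file) are passed as values between 128 and 255 instead of negative numbers.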
Quote:
It's always worked fine for me without any casting at all and I only recently made that change. But it seems I should change it from int to char if I want to keep things compliant. I'm not sure.. as you say, it's a minor point but worth mentioning for the people who are learning.
You're quite right, I was in error there. I'm very much still learning myself, so thanks for the correction.
Best of luck with your project.
P.S.
Trying to improve my understanding of this issue, I found this helpful: