Tracking down "Segmentation fault" in blassic
I have blassic installed on my 64-bit PC and 32-bit netbook.
On the PC it runs fine (using multilib), but on the netbook every attempt to dimension a string array causes a segmentation fault and the terminal cursor disappears. To find out what is going wrong, I edited the blassic.SlackBuild script so that it no longer strips the symbols, then rebuilt the package and upgraded to it. When I ran gdb, this is what I got:
Code:
Reading symbols from /usr/bin/blassic...done. |
Code:
bash-4.2$ gdb ./blassic |
Looking at the backtrace may be useful.
exchange_and_add sounds like a threading synchronization primitive to me. |
Quote:
This is not the only package that gives inconsistent results like this and I want to get to the bottom of it. If I can figure out what the problem is for one package then I might be able to figure it out for the others. Blassic got the short straw so I am starting off with that package. @ntubski, I will see what backtrace does and post the results here. |
Quote:
Rename /usr/lib to /usr/lib~ and rebuild for 64-bit. If you are building for compat32, that is another road to look at. The reason it runs on multilib is that when you built it, it picked up some 32-bit stuff and linked against it. Remember to switch the directory name back after building. |
Quote:
When I created a symbol version of blassic for the PC, it too started seg-faulting. When I rebuilt the symbol-free version, it was also seg-faulting. However, when I upgrade to the original package, it runs fine. @ntubski This is the backtrace I got on the netbook:
Code:
(gdb) run
Code:
(gdb) run
The segfault in the PC version occurs at delref() in "anonymous namespace" (the linux-gate.so.1 bit) but in the netbook version, delref() manages to call __exchange_and_add_dispatch() before the segfault happens. Now I'm stuck. |
you may try ldd to check which libraries are in use....
|
ldd on netbook:
Code:
$ ldd /usr/bin/blassic
Code:
$ ldd /usr/bin/blassic-symb |
Quote:
This is because the gdb output mentions linux-gate.so.1, which implies that it was compiled as an x86 32-bit program, not a 64-bit one. See the vDSO names listed in the man page. So if you are rebuilding to get symbols on the 64-bit machine, you have to make sure you really are building in a 32-bit environment on that machine. Is that the case? |
Are those libraries really identical? (Use file and/or md5sum to check.)
|
Quote:
I just compiled it as a 64-bit executable again but I still get seg-faults when I try to dimension a string array. Quote:
|
Given a segfault in a C++ part of the blassic code and gcc as the compiler, I would start to suspect that the blassic code itself has "issues" with more recent versions of gcc. The compiler is the key thing that has changed since you built it in 2012, and there have been a lot of changes on the C++ front since then.
That something compiles on a newer compiler doesn't necessarily mean that it actually works as expected. |
`blassic-0.11.0' : Segmentation fault.
`blassic-0.10.2' : OK with ./configure --disable-svgalib → Code:
$ blassic - |
Quote:
The website https://build.opensuse.org/package/v...c/blassic.spec included some "sed"s which I tried too but to no avail. Thanks for the suggestion though. |
Quote:
So the next thing to try is to run under valgrind, but compile with debug info so you can get line numbers. |
BTW, I am running slackware 14.1 32-bit. So it's likely how you are compiling or other things influencing the segmentation fault. I'm not running a stock kernel here, so I'm not sure if there is some kind of kernel protection in play either. I do think the program is at fault here, and haven't taken the steps to run valgrind or other kinds of memory debugging to see if there is a corruption going on.
I also tried compiling 0.11, and I got syntax errors. Not a good sign. |
Tell the program devs about it, if you don't know the program internals it's hard for you to fix the bug.
|
Quote:
This is a session running blassic alone.
Code:
Blassic 0.10.2
Code:
$ valgrind --leak-check=yes blassic |
you can try to set a breakpoint and go into it - or try to link statically...
|
Quote:
Even if static linking masked the problem for blassic, it doesn't help me in the long run. I have a lot of packages with segfaulting problems (even the esteemed Alien's vlc). In fact, it's almost a coin toss whether a package I install will actually run or scream "segfault" and die. There can't be that many poorly put together packages. There must be a problem with my Slackware setup which is why I must continue along this path. That said, I find it amazing that a simple basic interpreter would need over 40 source files each thousands of lines long and during execution would have to link up with over 15 dynamic libraries. And in spite of the code verbosity, it can't handle a simple heap violation. |
Quote:
If you have lots of segfaults, there are probably several different causes. There certainly are a *lot* of poorly put together packages. In the case of vlc, it isn't fair for me to imply it might be poorly put together: no, it's a major achievement that it works at all, considering that it has to rely on so many poorly put together dependencies and rotten APIs. But whilst the in-depth approach you've taken with blassic is good for finding problems in blassic, it's not really the best way of finding and solving a more general problem. For one thing, don't start with blassic: C is much simpler to debug than C++. For another, you only need to find out whether the stack dumps of the various failing programs have a common theme (what were they trying to do when they crashed). And most of all, it's just easier to manage your Slackware installation properly. No little experiments outside a VM, no mix-and-match between -stable and -current, the absolute minimum of multilib, and, dare I say, no rebuilds of standard packages with debugging symbols outside a VM. I apologise if you haven't done anything like that. But it will be more effective to eliminate this potential problem by performing an audit of your system against a Slackware install DVD. The reason for this is that incompatibilities in dynamically linked libraries will often cause segfaults. You can make this happen by using a different .so file from the one that the program was built against. This is why the often-repeated advice to "just create the missing symlink" is *TERRIBLE* advice. As a different example, the gdal library can optionally be built with a patched internal version of libjpeg to support 12-bit jpegs. This mostly works, but it makes qgis (which depends on gdal) segfault in one or two very specific places. Or it might be hardware... |
Quote:
Quote:
Code:
...
So apparently arrayvarstring["a$"].value is uninitialized. It looks like all Array<C> constructors do initialize it, so I'm not sure how that could be. Is it possible that some versions of the C++ runtime library don't like the zero-length array (new C [0])? It is legal C++ though.
Quote:
Quote:
|
Quote:
I can't guarantee that some obscure package I installed hasn't slipped in a rogue substitute library somewhere but md5ing all of my libraries will take some time. Of course, this is rendered unlikely since both 32-bit and 64-bit compilations of blassic have the same segfault problem - meaning that both 64-bit library and its equivalent 32-bit library would have had to be changed the same way. If it helps, some of the segfaulting packages (like vlc) still run under gdb. In fact, the run instruction for my vlc is echo run | gdb vlc. |
Quote:
If you experience segfaulting packages that run fine for others, you should check that your hardware is fine by running memtest86+ for an extended period. This is simply one of the first things to do if strange segfaults and/or compilation errors pop up on a system. If your system doesn't run this test without errors, you are sure to have a hardware problem. The Slackware install DVD has the memtest option available when you boot from it. BTW, keep in mind that completing the memtest run without errors doesn't actually guarantee that the hardware is fine. I've seen memory errors pop up only when a system is pushed harder, i.e. during compilation, when the voltage lines might drop. |
Quote:
It is taking me longer than expected to learn how to set a break point at line 401 in var.cpp (either in gdb, valgrind or both) so if it takes me a while before I can provide more info don't think I have given up on this task. |
I'm thinking blassic is no longer maintained. Their main site is down and distros ship with patched versions in order for it to even build. How about FreeBasic or some other FLOSS BASIC compiler or interpreter?
|
I only keep blassic around for those rare occasions when I get a little maudlin since it is closest to the basics I grew up with (like Commodore 64 basic or GWbasic). For any remotely serious programming, I would of course use Java, C or C++ (or even fpc).
As explained before, my main reason for trying to fix this is to gain some experience at fixing other rogue programs (although admittedly, vlc with all its dependencies might be beyond my abilities). Since blassic used to work before the latest Slackware upgrade, it is possible that one or more libraries no longer behave the way blassic expects. If I give up now, I will never know. |
Quote:
|
Quote:
Slackware 14.1 x86 and x86_64 and can't reproduce either. Before we go around calling blassic and vlc rogue, let's make sure the problem isn't something particular to your configuration. You're not being very precise in your reports. What Slackware version are you using? When you say "the latest Slackware upgrade", which do you mean? --mancha |
Quote:
Although my Slackware installation is pretty much vanilla, I suspect a problem with how it is set up, but so far I have no way of knowing whether the problem is with my Slackware setup or with blassic itself. All I know is that when I built the package under Slackware 13.37 there were apparently no problems (even when running the package under 14.0), but after compiling under Slackware 14.0 I get segfaults. |
Update
My SO also has a laptop with Slackware 14.0 installed on it. So I built and ran blassic on that laptop and got the same segmentation fault.
I then remembered an old laptop that still has Slackware 13.37 (32-bit) on it. Compiling blassic on that laptop produced a program that ran without errors on the laptop, the netbook and the PC. It seems that the problem lies with Slackware 14.0, probably gcc. When I installed multilib, I had to upgrade the gcc packages to their multilib versions as per the multilib instructions. This is the list of gcc packages currently installed:
Code:
~$ ls /var/log/packages | grep gcc |