LinuxQuestions.org


selfprogrammed 01-28-2024 06:00 PM

GCC update to another version
 
I have got to the point where I am questioning the bug-iness of the gcc 11.2 distributed with Slackware 15.0.

There is this very large program that is very buggy. It should not be that buggy.
I have been working on it for months trying to diagnose the problem.

I have made changes like:
Code:

// constructor
obj2::obj2(): ptr1(NULL), bval(false), ptr2(NULL) { }

// I changed that to:

obj2::obj2() {
 ptr1 = NULL;
 ptr2 = NULL;
 bval = false;
 }

This has cured that particular bug.

I can find no reasoning for why that should make a difference, but it definitely has worked.
The program previously would crash every couple of hours, taking work with it, and has stopped doing that entirely. I have made numerous other patches, instrumentation, and other changes, that have not affected these problems at all. It is not a matter of just recompiling.

In another place I have made another change like this, with it having notable effect upon that program bug.
Did not fix it entirely, but it is like a different bug now.

Still have a mass of other bug-iness that also defies understanding.
That style of constructor is all over the program, and it is large.

// ----

I am compiling for target i686 (am running a Quad Athlon).
The program uses CMake.

I ran a memory tester for an entire night, and got no errors.

// ----

This is gcc 11.2.
A year ago, they released gcc 11.4, a bug fix release.
Is there any chance that gcc 11.4 could be released as a patch upgrade for Slackware 15.0?

This is a system used for business purposes, developing programs, and I cannot be testing unstable compilers. I need a stable release version of gcc. I am hoping that such an upgrade just might fix some of this bug-iness.

It is also possible that the program is just that bad, or that this style of constructor has that kind of problem. But I cannot find any evidence of that either.

Have compiled a massive amount of slackbuilds using this compiler, and have not noticed anything like this with other programs. So I am confused. Getting desperate to try something, and upgrading GCC is a way to test if that makes a difference.

Trying to put another GCC on the side, and getting this slackbuild to compile with that does not look like an easy task.

jailbait 01-28-2024 06:16 PM

You should report the bugs you find to the gcc project:

https://gcc.gnu.org/bugs/

Lockywolf 01-28-2024 08:21 PM

Quote:

Originally Posted by selfprogrammed (Post 6479908)
I have got to the point where I am questioning the bug-iness of the gcc 11.2 distributed with Slackware 15.0.

There is this very large program that is very buggy. It should not be that buggy.
I have been working on it for months trying to diagnose the problem.

I have made changes like:
Code:

// constructor
obj2::obj2(): ptr1(NULL), bval(false), ptr2(NULL) { }

// I changed that to:

obj2::obj2() {
 ptr1 = NULL;
 ptr2 = NULL;
 bval = false;
 }

>I can find no reasoning for why that should make a difference, but it definitely has worked.

I am afraid you would have to look at the disassembly to find out where the problem lies. I am not that well versed in C++ standardese, but I don't think that those constructors are identical. I think there is some difference when they are called for base/derived classes.

In any case, if your code is fixed by 11.4, you can compile it yourself from scratch, following Slackware's gcc SlackBuild; just change the prefix to /opt/gcc-11.4/, or something like that. You can also have a look at the gcc-5.SlackBuild on SlackBuilds.Org. That script installs the older gcc into /opt/.

volkerdi 01-28-2024 10:00 PM

Quote:

Originally Posted by selfprogrammed (Post 6479908)
I have got to the point where I am questioning the bug-iness of the gcc 11.2 distributed with Slackware 15.0.

There is this very large program that is very buggy. It should not be that buggy.
I have been working on it for months trying to diagnose the problem.

In several cases my fix for issues like this has been to use clang.

BrunoLafleur 01-29-2024 03:59 AM

Quote:

Originally Posted by selfprogrammed (Post 6479908)
I have got to the point where I am questioning the bug-iness of the gcc 11.2 distributed with Slackware 15.0.

There is this very large program that is very buggy. It should not be that buggy.
I have been working on it for months trying to diagnose the problem.

I have made changes like:
Code:

// constructor
obj2::obj2(): ptr1(NULL), bval(false), ptr2(NULL) { }

// I changed that to:

obj2::obj2() {
 ptr1 = NULL;
 ptr2 = NULL;
 bval = false;
 }

For that sort of bug, how is the class defined? In what order are the member variables declared, and are they all initialized?

The example is too incomplete to give us any ideas.

pan64 01-29-2024 04:41 AM

You told us nothing about your code; how do you know it is buggy?
You need to compile it with -Wall and check all the warnings and errors.
As already mentioned, we don't know how that class was declared. Also, what kind of bug was fixed by that modification?
If you want to make a bug report, you need to show [us] or describe exactly how we can reproduce the bug (that means a working example).

drumz 01-29-2024 08:25 AM

As said above, install the version you want in /opt. I have a few for various reasons:

Code:

# ls /opt | grep gcc
gcc-10.2.0/
gcc-13.2.0/
gcc-8.4.0/

Here's my do_build.sh script for 13.2.0. Yes, I'm a bad boy and don't create a package. But everything is contained in /opt/gcc-13.2.0, so I don't feel too bad. Obviously build script is inspired by Slackware's build script.

Code:

#!/bin/sh

set -e

srcdir=../gcc-13.2.0
destdir=/opt/gcc-13.2.0

SLKCFLAGS="-O2 -fPIC"
LIBDIRSUFFIX="64"
LIB_ARCH=amd64

TARGET=x86_64-slackware-linux

NUMJOBS=" -j 8 "

#tar xvf gcc-13.2.0.tar.xz
#cd gcc-13.2.0
#zcat ../patches/gcc-no_fixincludes.diff.gz | patch -p1
#cd ..

mkdir build
cd build

GCC_ARCHOPTS="--disable-multilib"

CFLAGS="$SLKCFLAGS" \
CXXFLAGS="$SLKCFLAGS" \
"$srcdir/configure" \
  --prefix=$destdir \
  --libdir=$destdir/lib$LIBDIRSUFFIX \
  --enable-shared \
  --enable-bootstrap \
  --enable-languages=ada,c,c++,d,fortran,go,lto,m2,objc,obj-c++ \
  --enable-threads=posix \
  --enable-checking=release \
  --enable-objc-gc \
  --with-system-zlib \
  --enable-libstdcxx-dual-abi \
  --with-default-libstdcxx-abi=new \
  --disable-libstdcxx-pch \
  --disable-libunwind-exceptions \
  --enable-__cxa_atexit \
  --disable-libssp \
  --enable-gnu-unique-object \
  --enable-plugin \
  --enable-lto \
  --disable-install-libiberty \
  --disable-werror \
  --with-gnu-ld \
  --with-isl \
  --verbose \
  --with-arch-directory=$LIB_ARCH \
  --disable-gtktest \
  --enable-clocale=gnu \
  $GCC_ARCHOPTS \
  --target=${TARGET} \
  --build=${TARGET} \
  --host=${TARGET} || exit 1

make $NUMJOBS bootstrap || exit 1
( cd gcc || exit
  make $NUMJOBS gnatlib GNATLIBCFLAGS="$SLKCFLAGS" || exit 1

  CFLAGS="$SLKCFLAGS" \
          CXXFLAGS="$SLKCFLAGS" \
          make $NUMJOBS gnattools || exit 1
)
make info || exit 1

#make $NUMJOBS check  || exit 1

make install || exit 1


selfprogrammed 01-30-2024 01:01 AM

Well, it was possible that users had seen this before and knew of such a bug in GCC, but I guess that is not going to be the case.

My first thought was to compile it with clang. Some of my users use clang (and FreeBSD, and NetBSD, etc), and I have had to augment a program to support CLANG too.
But that was why I mentioned that it uses CMAKE.

Do you have an easy and simple way to convince the SlackBuild and CMake to accept another version of the GCC compiler (or clang), without my having to rewrite and/or debug that effort too?
I expect it is just another option to pass to CMAKE, but I have not used such before. I have tried to configure CMAKE before, and it did not go well. I expect I will have to dive into the CMAKE docs again.
If I try to change the Makefile, CMAKE will just recreate it. The slackbuilds erasing and recreating everything does not help much either.
The chance is that I will just add to my workload. From my previous work, it might do the same exact thing, as clang does mostly what GCC does.

I have been using Slackware since the 90's, and have not seen the GCC updated very often. I just want to put in a word that there is a bug fix for this version, and I would like to get that in an official Slackware package, if possible. Not needing to go to 12 or 13, as they probably have other enhancements, and other possible bugs. But I would like the official Slackware GCC to be the last to have all of the bug fixes for that major version.

As to the actual program. I purposely did not give much detail. That class is moderate, but is not understandable without seeing a dozen more structs and classes.
(see slackbuilds Voxelands, mapblock_mesh.cpp, MeshMakeData constructor)

It will crash often, and with explicit error messages. I have it instrumented by now and have been trying to identify exactly how the data gets screwed, but cannot find anything.
I also have canaries in place, and they detect nothing.
That my "fix" actually cured that particular problem makes me uneasy, as it should not have done so.
It is also possible and likely that the original constructor was at fault, and I just cannot see it. A strange warning message did go away too, but I was making many other changes too.

I have found some other problems in the code, such as their use of an Exception when a particular function would fail.
They had an identical function that did not throw the Exception but returned NULL instead.
After disabling the Exception throwing version, and making everything use the explicit NULL test, the Exception error messages have stopped.
I did find two places where they called the exception throwing version, but did not have the try/catch, and I fixed those.
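To illustrate the two calling styles described above, here is a minimal sketch. It is not the Voxelands code: `registry`, `lookup_or_throw`, `lookup_or_null`, `value_or_default`, `throws_for_missing` and `self_check` are all hypothetical names invented for this example. The point is only that the throwing variant needs a try/catch at every call site, while the null-returning variant needs an explicit test.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical registry standing in for the program's lookup tables.
static std::map<std::string, int> registry = {{"stone", 1}};

// Throwing variant: a caller without try/catch lets the exception escape.
int& lookup_or_throw(const std::string& name) {
    auto it = registry.find(name);
    if (it == registry.end()) throw std::out_of_range("no such entry: " + name);
    return it->second;
}

// Non-throwing variant: failure is reported as a null pointer.
int* lookup_or_null(const std::string& name) {
    auto it = registry.find(name);
    return it == registry.end() ? nullptr : &it->second;
}

// Caller using the explicit null test instead of try/catch.
int value_or_default(const std::string& name, int fallback) {
    if (int* p = lookup_or_null(name)) return *p;
    return fallback;
}

// Caller that does wrap the throwing variant; reports whether it threw.
bool throws_for_missing(const std::string& name) {
    try {
        lookup_or_throw(name);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Small usage check for the sketch above.
void self_check() {
    assert(value_or_default("stone", 0) == 1);
    assert(value_or_default("dirt", -1) == -1);
    assert(throws_for_missing("dirt"));
    assert(!throws_for_missing("stone"));
}
```

If a call to the throwing variant has no enclosing try/catch anywhere up the call stack, the program ends through std::terminate, which would look like a crash with an error message.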

Had one bug where a destructor would segfault, due to a C++ iterator on a vector generating bad ptrs. It was putting out a ptr value of 4, and such.
I stopped that one by guarding the iterator with a test on the vector being empty. There are too many obscure details on these std:: operators that are gotchas.
I did not write this code, just trying to rescue it.
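The original destructor loop is not shown in this thread, so the following is only a guess at its general shape, with hypothetical names (`Node`, `free_all`, `self_check`). It shows one form of iterator loop over a vector of owned pointers that is well defined even when the vector is empty: with a `begin()`/`end()` comparison the body simply never runs. Forms that touch an element before testing for the end (for example `front()`, `back()` or `end() - 1`) are undefined on an empty vector and can produce garbage pointer values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node { int id; };  // hypothetical owned object

// Deletes every element and returns how many were freed.
// Safe for an empty vector: begin() == end(), so the loop body is skipped.
std::size_t free_all(std::vector<Node*>& nodes) {
    std::size_t freed = 0;
    for (std::vector<Node*>::iterator it = nodes.begin(); it != nodes.end(); ++it) {
        delete *it;   // deleting a null pointer is a no-op
        ++freed;
    }
    nodes.clear();    // drop the now-dangling pointers
    return freed;
}

// Small usage check for the sketch above.
void self_check() {
    std::vector<Node*> none;
    assert(free_all(none) == 0);

    std::vector<Node*> two;
    two.push_back(new Node{1});
    two.push_back(new Node{2});
    assert(free_all(two) == 2);
    assert(two.empty());
}
```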

It still crashes, often, but not the same errors now.

I will eventually submit patches to the actual maintainers of this program. From past experience, it is not likely the maintainer will accept them.
(One project was an exception to that. They accepted my patches. And that was how I ended up as the main programmer for a free software project. It's a trap I tell you.)

Thank you for your attention.

pan64 01-30-2024 01:18 AM

Crashes and segfaults are usually memory problems; you can use valgrind (for example) to catch them.
You can simply specify the compiler for CMake, but without details we can only guess how.
https://stackoverflow.com/questions/...piler-in-cmake
Additionally, you can use static code analyzers, which can find a lot of problems too.

BrunoLafleur 01-30-2024 04:38 AM

For the specific, incomplete example you gave at the beginning: the class members in the constructor's initializer list should appear in the same order as in the class declaration. Otherwise it could segfault. But with the -Wreorder option or -Wall, the compiler should emit a warning.

If the class members are initialized in the body of the constructor, the order doesn't matter (you do what you want in the body).

Some old code also doesn't always initialize every member of every class. It relies on defaults from the compiler, which may have changed with the latest versions of the standards.
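As a sketch of what "not initialized" means in practice (hypothetical names `Plain`, `Safe` and `self_check`, not taken from Voxelands): a struct with no constructor and no default member initializers only gets defined member values when it is value-initialized, while default member initializers (C++11) make every construction path safe.

```cpp
#include <cassert>

// Hypothetical struct: no constructor, no default member initializers.
struct Plain {
    int* ptr;
    bool flag;
};

// Same members, each with a default member initializer (C++11).
struct Safe {
    int* ptr  = nullptr;
    bool flag = false;
};

// Value-initialization with {} zeroes the members of Plain.
bool value_initialized_is_zero() {
    Plain p{};
    return p.ptr == nullptr && p.flag == false;
}

// Plain default-initialization of Safe still applies its member initializers.
bool defaults_apply() {
    Safe s;
    return s.ptr == nullptr && s.flag == false;
}

// Note: "Plain q;" at block scope would leave q.ptr and q.flag
// indeterminate, and reading them would be undefined behavior,
// so that case is deliberately not executed here.

// Small usage check for the sketch above.
void self_check() {
    assert(value_initialized_is_zero());
    assert(defaults_apply());
}
```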

In code like yours that you didn't write from scratch, it may be not the compiler that is buggy but the code itself. So, as said above, you could try tools like valgrind to find faulty areas. I use valgrind a lot, even on code that has run for months without ever segfaulting.

For valgrind it is better to compile with the -g option, so that it can report the source lines where problems are detected.

GazL 01-30-2024 09:25 AM

Quote:

Originally Posted by selfprogrammed (Post 6479908)
I have made changes like:
Code:

// constructor
obj2::obj2(): ptr1(NULL), bval(false), ptr2(NULL) { }

// I changed that to:

obj2::obj2() {
 ptr1 = NULL;
 ptr2 = NULL;
 bval = false;
 }

This has cured that particular bug.

I can find no reasoning for why that should make a difference, but it definitely has worked.

I'm only a novice at C++ (I much prefer C), but as I understand it the difference is as follows...

The first approach calls the constructors that take an argument for each of the members in the initialisation list. The second approach calls the default constructor of each member in the class definition and then assigns values afterwards when the containing class's constructor is run.

It likely won't matter for fundamental types, but the constructors for nested class objects could potentially end up doing different things.
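The difference can be made visible with a small sketch. All names here (`Member`, `WithInitList`, `WithBodyAssign`, `self_check`) are hypothetical and only for illustration. The member type records which of its functions ran, so the two constructor styles leave different traces. For raw pointers and bool, as in the original example, both styles end with the same stored values.

```cpp
#include <cassert>
#include <string>

// Hypothetical member type that records how it was set up.
struct Member {
    std::string history;
    Member() : history("default-ctor") {}
    explicit Member(int) : history("int-ctor") {}
    Member& operator=(int) { history += "+assign"; return *this; }
};

struct WithInitList {
    Member m;
    WithInitList() : m(0) {}      // m is constructed directly from 0
};

struct WithBodyAssign {
    Member m;
    WithBodyAssign() { m = 0; }   // m is default-constructed, then assigned
};

std::string init_list_history()   { return WithInitList().m.history; }
std::string body_assign_history() { return WithBodyAssign().m.history; }

// Small usage check for the sketch above.
void self_check() {
    assert(init_list_history() == "int-ctor");
    assert(body_assign_history() == "default-ctor+assign");
}
```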

the3dfxdude 01-30-2024 09:30 AM

Quote:

Originally Posted by BrunoLafleur (Post 6480217)
For the specific, incomplete example you gave at the beginning: the class members in the constructor's initializer list should appear in the same order as in the class declaration. Otherwise it could segfault. But with the -Wreorder option or -Wall, the compiler should emit a warning.

I saw that earlier in the example, but...

Code:

      -Wreorder (C++ and Objective-C++ only)
          Warn when the order of member initializers given in the code does
          not match the order in which they must be executed.  For instance:

                  struct A {
                    int i;
                    int j;
                    A(): j (0), i (1) { }
                  };

          The compiler rearranges the member initializers for "i" and "j" to
          match the declaration order of the members, emitting a warning to
          that effect.  This warning is enabled by -Wall.

Please tell me that gcc cannot still introduce a bug, if they can detect it, and give a warning if you want. I don't see why they would allow buggy code, unless there is more to it than just this.

GazL 01-30-2024 09:44 AM

Quote:

Originally Posted by the3dfxdude (Post 6480273)
Please tell me that gcc cannot still introduce a bug, if they can detect it, and give a warning if you want. I don't see why they would allow buggy code, unless there is more to it than just this.

It sounds more along the lines of -Wparentheses or -Wmisleading-indentation, where it's just a "Hey, are you sure you got this right?" type of thing.

BrunoLafleur 01-30-2024 10:17 AM

Quote:

Originally Posted by the3dfxdude (Post 6480273)
I saw that earlier in the example, but...

Please tell me that gcc cannot still introduce a bug, if they can detect it, and give a warning if you want. I don't see why they would allow buggy code, unless there is more to it than just this.

There can be more to it, because initialisation can be dynamic and not just a set of constants. One member can depend on other members, as defined by the programmer. If the written order is not the declaration order and the compiler changes it, some initializers can end up reading values that are not yet initialized, for example.
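A minimal sketch of that ordering rule, using hypothetical names (`Tracer`, `Widget`, `construction_order`, `self_check`): members are always constructed in declaration order, whatever order the initializer list is written in. The tracer only logs the order, so nothing uninitialized is ever read here. The -Wreorder warning is silenced on purpose for this demonstration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which members are actually constructed.
static std::vector<std::string> construction_log;

struct Tracer {
    explicit Tracer(const std::string& name) { construction_log.push_back(name); }
};

// The mismatch below is deliberate, so silence -Wreorder for this struct.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wreorder"
struct Widget {
    Tracer first;    // declared first, so constructed first
    Tracer second;   // declared second, so constructed second
    // Written in the opposite order; the compiler still uses declaration order.
    Widget() : second("second"), first("first") {}
};
#pragma GCC diagnostic pop

// Builds one Widget and returns the order its members were constructed in.
std::vector<std::string> construction_order() {
    construction_log.clear();
    Widget w;
    (void)w;
    return construction_log;
}

// Small usage check for the sketch above.
void self_check() {
    std::vector<std::string> order = construction_order();
    assert(order.size() == 2);
    assert(order[0] == "first");
    assert(order[1] == "second");
}
```

If the initializer of `first` had used the value of `second`, it would have read a member that was not yet constructed, which is the situation -Wreorder warns about.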

Also some defaults in the C++ standard have changed and can lead also to uninitialized values for some or all members of a class or subclasses.

Valgrind, for example, can detect many of those errors at run time.

the3dfxdude 01-31-2024 02:31 PM

Quote:

Originally Posted by BrunoLafleur (Post 6480286)
It can be more because initialisation can be dynamic and not only be some constants. One member can depend on other members as defined by the programmer. If the order is not respected and the compiler changes it, some dependencies can rely on not yet initialized values for example.

Also some defaults in the C++ standard have changed and can lead also to uninitialized values for some or all members of a class or subclasses.

Valgrind, for example, can detect many of those errors at run time.

Maybe I should not call it a bug here, so as not to confuse it with the case where the real issue is a programmer who won't heed the warning. Yes, the programmer should know when to use initializer lists, when not to, and when to change the ordering in the definition. But I was looking for the bug as it was reported. I can't see one in the Voxelands MeshMakeData constructor initializer list, as it appears to be correctly ordered for what needs to be initialized. Maybe the real code has something else going on that is closer to the generic example at the beginning?

The only other thing I can think of, with the constructor and methods being in the header, is an optimizing-compiler issue. I think gcc won't trigger the reorder warning here, so how would the programmer know the constructor is buggy for that reason? So I would try compiling with the affected gcc using "-O0 -Wall", and maybe with "-Wall" under llvm, to look for problems if -Wall was not used before, and to determine whether badly optimized code is causing the problem.


All times are GMT -5.