Hello.
Not a request per se, but does the team have any thoughts on LTO builds (maybe when gcc 6.X lands in -current)?
I have had LTO enabled for several months now and everything seems to work fine (at least KDE 5, the utilities from coreutils and util-linux, git, vim, and in general everything you use for everyday desktop work and programming), but I do not know if the benefits justify the extra load on Pat's build server and the extra burden of testing.
An inaccurate, layman's description of LTO: instead of producing an ordinary object file, the compiler saves its internal representation of the code in the .o files. At link time it reads these files back and compiles the whole program as one unit, so it can optimize far better than when it sees one file at a time. The result is faster code, with the side benefit of less disk usage (although with today's disks I don't think anyone cares about that).
The cons of LTO are:
* Longer compile times (from 20% longer to 2-3 times longer, and in extreme cases 5-6 times longer).
* Significantly more memory needed for the link stage (firefox needs about 8G of memory).
* Some programs cannot be compiled or are miscompiled.
* Broken debugging (work is being done on this, but at this point debugging info cannot be used reliably together with LTO).
I did a sloppy/naive test: I compiled the Slackware tree without and with LTO and measured the differences. Unfortunately I don't know how to reliably measure peak memory usage, so I only measured compile times and disk space. Apart from a very few programs that didn't compile even without LTO, and KDE, which I left out because it would take a lot of time, I compiled the rest of the tree. I noticed the following things:
For the majority of programs the compile-time difference is negligible because they are small and not complex. In a table of percentages we would read large numbers like 50% or 100% more time, but in absolute terms the difference is a few seconds. For example, hdparm needed 91% more time, but in practice it went from 1sec to 1.91sec. Many other libs and system apps went from 15-17secs to 23-25secs, so unless we are talking about a full Slackware rebuild, where these differences add up, the difference in compile time isn't much.
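To make the percentage-versus-seconds point concrete, here is a small sketch that prints both numbers side by side. `slowdown` is a hypothetical helper name, and the second call uses 16 and 24 seconds as stand-ins for the "15-17secs to 23-25secs" range.

```shell
# slowdown BEFORE AFTER: print the absolute and the relative change
# between two build times given in seconds (hypothetical helper).
slowdown() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "+%.2fs (%.0f%%)\n", after - before, (after - before) / before * 100
    }'
}

slowdown 1 1.91    # hdparm: +0.91s (91%)
slowdown 16 24     # a typical small lib: +8.00s (50%)
```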
Some examples of notable programs are:
(the times are not very accurate because they cover the whole SlackBuild run, so they also include the untar and makepkg steps)
Code:
a/coreutils 1min:08sec -> 1:50
a/util-linux 1:13 -> 1:34
d/binutils 3:15 -> 4:05
d/git 3:46 -> 3:56
l/glib2 1:10 -> 1:56
l/glibc 12:02 -> 12:33
l/gtk+3 6:07 -> 8:39
l/ncurses 1:46 -> 2:43
l/qt 1h:26min:07sec -> 1h:27min:28sec (I would expect far more than about 1min of difference here; I don't know why it is so small)
Large time differences and weird results:
Code:
d/cmake 3:33 -> 8:27
d/ruby 5:03 -> 3:57
l/akonadi 2:37 -> 07:05
l/virtuoso-ose 2:11 -> 10:21
n/NetwManager 2:54 -> 6:48
n/samba 12:12 -> 28:44
x/mesa 10:08 -> 21:44
Disk space:
Code:
/usr/bin 742588K -> 703932K
/usr/lib64 2380320K -> 2272004K
Most significant space differences
libgtkmm-3something 5424K -> 3976K
libaudiofile.a 4456K -> 2372K
cmake 8436K -> 3748K
libavcodec.a 151913K -> 84666K
libavformat.a 49143K -> 34147K
The larger a project's libraries/binaries are, the larger the difference will be. Apart from firefox and libreoffice, which take an enormous amount of time to build, the difference is best shown by ffmpeg: the libavcodec library went from about 152M to about 85M.
Every GCC release contains an enormous amount of work on LTO; the optimizer gets better with each release and needs less and less memory. As you can read here, libreoffice needed 18GB of memory with gcc 4.9, 12GB with gcc 5.0 and 10GB with gcc 6.0 (all the numbers are highly inflated by the large parallelism of 16 jobs; the memory needed would be smaller with e.g. a -j4 build).
Programs that cannot be compiled with LTO using Slackware's gcc 5.4.0:
Code:
a/btrfs-progs (can be compiled but segfaults)
a/elilo (cannot be linked when gnuefi is built with lto)
a/pciutils
a/xfsprogs (*)
ap/cdrdao
ap/cdrtools (when i compiled cdrtools-3.02a06, mkisofs would segfault but the version provided with slackware seems to work correctly in my test)
ap/flac
ap/mariadb
d/distcc
d/perl (*)
d/ruby (*)
l/alsa-lib
l/elfutils
l/glibc (**)
l/js185
l/libgphoto2
l/libvpx
l/phonon-gstreamer
l/pulseaudio
n/autofs
n/dhcp
n/libgcrypt
n/php
n/procmail
n/ytalk
x/scim-anthy
x/scim-m17n
x/xorg-server (the problem is in the test directory. everything else builds fine)
x/xf86-video-intel
* xfsprogs, perl and ruby are on my list of programs that cannot be built with LTO, but this time they built correctly, which pleasantly surprised me. I didn't test ruby at all, but perl worked fine and xfs seems to work fine (I think it was xfs_db that used to segfault, but I don't remember).
** glibc supposedly shouldn't be built with LTO because it causes problems with many programs (particularly with libpthread). However, I didn't have any problems with it.
Essentially, the most significant burden of LTO for the team is the additional testing needed for programs that appear to compile fine but then segfault, like the btrfs and mkisofs cases I mentioned.
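Part of that testing could perhaps be automated with a crude smoke test. The sketch below is only an idea, not something that was used for the results above: `smoke_test` is a hypothetical helper, and it relies on the shell convention that a process killed by a signal usually yields an exit status of 128 plus the signal number (139 for SIGSEGV).

```shell
# smoke_test "CMD1" "CMD2" ...: run each command line and report the ones
# that appear to have been killed by a signal (hypothetical helper).
smoke_test() {
    local cmd status
    for cmd in "$@"; do
        sh -c "$cmd" >/dev/null 2>&1
        status=$?
        # 127 (command not found) and ordinary failures are ignored here.
        if [ "$status" -gt 128 ]; then
            echo "CRASH (signal $((status - 128))): $cmd"
        fi
    done
}

# Example: trivial invocations of the two programs mentioned above.
smoke_test "mkisofs -version" "btrfs --version"
```

A version check only proves that the binary starts; a real test would have to exercise actual functionality (for example, building a small ISO image).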
How to enable LTO:
With gcc-5 you don't need to pass anything to the linker or mess with -fno-fat-lto-objects and the other things mentioned in many guides. You just add -flto to CFLAGS/CXXFLAGS. For programs that produce static libraries with the ar utility (and the related nm and ranlib), there is an additional step: these tools need gcc's LTO plugin to understand the LTO object files.
One way to do this is to set some variables (for example "export AR=gcc-ar", "export NM=gcc-nm" and "export RANLIB=gcc-ranlib", the wrappers that ship with gcc) to tell the build system which binaries to use, but not all build systems honour them. A better way is to symlink gcc's LTO plugin into the binutils plugin directory, so that the utilities load the plugin automatically. For Slackware the plugin directory is /usr/lib/bfd-plugins (I use slackware64, but binutils still wanted the path under /usr/lib).
Code:
mkdir -p /usr/lib/bfd-plugins
cd /usr/lib/bfd-plugins
ln -snf ../../libexec/gcc/x86_64-slackware-linux/5.4.0/liblto_plugin.so .
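For build systems that do honour the environment, the variable-based approach could look roughly like this. The optimization flags are an assumption modelled on typical x86_64 SlackBuild flags, not taken from any official script; gcc-ar, gcc-nm and gcc-ranlib are the wrappers installed by gcc itself.

```shell
# Assumed flags: the usual -O2 -fPIC plus -flto.
export CFLAGS="-O2 -fPIC -flto"
export CXXFLAGS="$CFLAGS"

# Use gcc's wrappers so that static libraries are built and indexed
# with the LTO plugin loaded.
export AR=gcc-ar
export NM=gcc-nm
export RANLIB=gcc-ranlib
```

With the symlink above in place these three exports should not be necessary, since plain ar, nm and ranlib then find the plugin on their own.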
If my post is too long or too off-topic for the requests thread, please move it to a new thread so that it does not pollute this one. Thank you for your time.