LinuxQuestions.org
Old 02-15-2017, 07:22 AM   #1
Cág
LQ Newbie
 
Registered: Oct 2016
Posts: 18

Rep: Reputation: 0
Segfaults in multiple programmes


I'm running NetBSD but had the same problems on Alpine Linux. I have already posted to multiple lists but received no answer.

All my GTK+2 apps segfault on keyboard input. In lxappearance, for example, you can start pressing keys while looking for a theme and it will search. But in my case it dumps core with /usr/lib/libpthread.so.1, /usr/lib/libc.so.12 and /usr/pkg/lib/libXcursor.so.1. The same thing happens when typing something into leafpad (a GTK+2 text editor), or when looking for something in the Ctrl+O window in firefox, gimp or any other programme. gimp can't even run inside gdb because of:
Code:
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007f7fea49f6aa in ___lwp_park60 () from /usr/lib/libc.so.12
(gdb) bt
#0  0x00007f7fea49f6aa in ___lwp_park60 () from /usr/lib/libc.so.12
#1  0x00007f7fec808f2b in pthread_cond_timedwait () from /usr/lib/libpthread.so.1
#2  0x00007f7feb880b80 in g_cond_wait () from /usr/pkg/lib/libglib-2.0.so.0
#3  0x00007f7feb81d7cd in g_async_queue_pop_intern_unlocked () from /usr/pkg/lib/libglib-2.0.so.0
#4  0x00007f7feb86742f in g_thread_pool_thread_proxy () from /usr/pkg/lib/libglib-2.0.so.0
#5  0x00007f7feb866a7d in g_thread_proxy () from /usr/pkg/lib/libglib-2.0.so.0
#6  0x00007f7fec80a9cc in ?? () from /usr/lib/libpthread.so.1
#7  0x00007f7fea483de0 in ?? () from /usr/lib/libc.so.12
#8  0x0000000000000000 in ?? ()
Firefox also has problems in libc.so.12 and libpthread.so.1 but doesn't mention ___lwp_park60. It also can't run inside gdb.

lxappearance also dumps core when clicking Apply after changing something (themes, cursor or icon themes, fonts etc.) with another output:
Code:
#0  0x00007f7fefcb27ba in ?? () from /usr/lib/libc.so.12
#1  0x00007f7fefcb2bc7 in malloc () from /usr/lib/libc.so.12
#2  0x00007f7ff1849782 in g_malloc () from /usr/pkg/lib/libglib-2.0.so.0
#3  0x00007f7ff185ef1c in g_memdup () from /usr/pkg/lib/libglib-2.0.so.0
#4  0x00007f7ff18356b8 in g_hash_table_insert_node () from /usr/pkg/lib/libglib-2.0.so.0
#5  0x00007f7ff1835823 in g_hash_table_insert_internal () from /usr/pkg/lib/libglib-2.0.so.0
#6  0x00007f7ff183ccb1 in g_key_file_flush_parse_buffer () from /usr/pkg/lib/libglib-2.0.so.0
#7  0x00007f7ff183cf62 in g_key_file_parse_data () from /usr/pkg/lib/libglib-2.0.so.0
#8  0x00007f7ff183d0e1 in g_key_file_load_from_fd () from /usr/pkg/lib/libglib-2.0.so.0
#9  0x00007f7ff183d99e in g_key_file_load_from_file () from /usr/pkg/lib/libglib-2.0.so.0
#10 0x0000000000405532 in _start ()
Apart from these programmes, mplayer receives SIGILL when I try to play videos. The backtrace doesn't show anything useful.

sxiv, an image viewer, segfaults with this:
Code:
#0  0x00007f7ff64b209f in ?? () from /usr/lib/libc.so.12
#1  0x00007f7ff64b3983 in free () from /usr/lib/libc.so.12
#2  0x000000000040729c in remove_file ()
#3  0x0000000000409a92 in main ()
Previously it worked if built from the local pkgsrc tree, but now it has stopped working altogether.

mpg321 dumps core and says "Memory fault" with this backtrace:
Code:
#0  0x00007f7ff78068b1 in sem_post () from /usr/lib/libpthread.so.1
#1  0x000000000040afe0 in ?? ()
#2  0x0000000000403695 in ?? ()
#3  0x00007f7ff7ffa000 in ?? ()
#4  0x0000000000000002 in ?? ()
#5  0x00007f7ffffffdb0 in ?? ()
#6  0x00007f7ffffffdb7 in ?? ()
#7  0x0000000000000000 in ?? ()
I did memtests, once for four hours (two passes) and once for eight hours (eight passes). I also ran Dell's ePSA tests (a diagnostic utility accessed from the BIOS), which has its own memtest; all of them returned no errors. I rebuilt gtk2 with debug symbols but it changed nothing.
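
To see at a glance which libraries recur across backtraces like the ones above, the frames can be tallied mechanically. This is only a rough sketch: the function name `summarise_backtrace` and the "(no library)" label are made up for illustration, and the regular expression assumes the plain gdb frame layout shown in this post.

```python
import re
from collections import Counter

# Matches gdb frame lines such as:
#   #1  0x00007f7fefcb2bc7 in malloc () from /usr/lib/libc.so.12
# The "from <library>" part is optional (frames in the program itself lack it).
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+0x[0-9a-fA-F]+ in (?P<func>\S+) \(\)"
    r"(?: from (?P<lib>\S+))?"
)

def summarise_backtrace(text):
    """Return (Counter of frames per library, count of unresolved '??' frames)."""
    libs = Counter()
    unresolved = 0
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a frame line
        libs[m.group("lib") or "(no library)"] += 1
        if m.group("func") == "??":
            unresolved += 1
    return libs, unresolved

# The sxiv backtrace from this post, copied verbatim.
SXIV_BT = """\
#0  0x00007f7ff64b209f in ?? () from /usr/lib/libc.so.12
#1  0x00007f7ff64b3983 in free () from /usr/lib/libc.so.12
#2  0x000000000040729c in remove_file ()
#3  0x0000000000409a92 in main ()
"""

libs, unresolved = summarise_backtrace(SXIV_BT)
for lib, count in libs.most_common():
    print(f"{count:2d}  {lib}")
print("unresolved frames:", unresolved)
```

A tally like this only shows where the crash surfaced (here malloc/free in libc), not which code corrupted the heap in the first place.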

Thanks everyone for any kind of help.

Last edited by Cág; 02-18-2017 at 03:35 AM. Reason: typo
 
Old 02-15-2017, 08:21 AM   #2
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,252

Rep: Reputation: 2321
A real segmentation fault is a memory addressing fault. On the PC the idea goes back to the 8086, which had 20 address lines but only 16-bit registers. How do you put data in the top 4 address bits? The answer was segment registers, whose value is shifted left by 4 bits and added to the offset. Once protected mode arrived with the 80286, addressing memory outside a valid segment got you a segmentation error. Nowadays, the term covers any invalid memory access.

On 2 separate Operating Systems, we can eliminate software. You're left with (in rough order):
1. Memory errors.
2. Disk errors.
3. Some weird motherboard error. The big ASICs you get today can cause errors that are next to impossible to trace. Heat can also bring them on.
4. Power supply problems.


If you're overclocking, stop. Check all cooling. Run overnight on memtest86. Check the disks with smartmontools as well as the filesystem utilities. Borrow someone's power supply to check that. That eliminates all except the motherboard.
 
Old 02-15-2017, 11:26 AM   #3
Cág
LQ Newbie
 
Registered: Oct 2016
Posts: 18

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by business_kid
On 2 separate Operating Systems, we can eliminate software.
I've just tried Ubuntu with glibc and Void with musl, both from live USB. On Ubuntu all those things work fine. On Void I tried only sxiv and it worked.

Dell's ePSA includes all kinds of tests: keyboard, hard drive, memory, fans, CPU. It didn't return any errors. I may try longer memtests, but I am not yet convinced that these are hardware problems.

Last edited by Cág; 02-15-2017 at 11:28 AM.
 
Old 02-15-2017, 12:28 PM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,792

Rep: Reputation: 7306
If it's not hardware-related, then you probably have incompatible libraries.
 
Old 02-16-2017, 05:57 AM   #5
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,252

Rep: Reputation: 2321
Ubuntu with glibc and Void with musl seem to eliminate memory, and most of the motherboard. It still leaves disks, and PSU errors become more remote.

Check the disks. Install something properly. You should not have segfaults.
 
Old 02-16-2017, 08:47 AM   #6
cynwulf
Senior Member
 
Registered: Apr 2005
Posts: 2,727

Rep: Reputation: 2367
It looks like your ports tree is out of sync with the base system (hence the libc-related core dumps). As you've provided next to no info on which release of NetBSD you're running, it's hard to say for sure.

As you've installed binary packages and have also been building via pkgsrc that could also be part of the issue. What repository do you have defined in: /usr/pkg/etc/pkgin/repositories.conf ?
 
Old 02-16-2017, 09:37 AM   #7
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,640
Blog Entries: 4

Rep: Reputation: 3933
Does BSD have anything comparable to Linux's /sbin/ldconfig command? A loader name-cache that must be updated (by re-running this privileged command ...) whenever libraries change?

Basically, it does sound like "incompatible libraries." If one library attempts to call another, and it doesn't actually know what the parameter-list should be, how parameters are to be passed and so-forth, then basically "all hell breaks loose real quick."

For instance, a parameter got added to a function in version 3.x of the library. The function-call pushes three items onto the stack: the called function pops-off four. Not only is its "fourth parameter" garbage, but, "say goodbye to your stack!" You're headed for a hard fall, and probably a totally-useless stack trace.
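
The push-three/pop-four scenario can be shown with a toy model. This is a deliberately simplified sketch, not how NetBSD/amd64 really passes arguments (there the first arguments travel in registers and the caller cleans up the stack); the function `call` and the "<caller data>" marker are invented purely for illustration.

```python
def call(callee_arity, args):
    """Toy model of a callee-pops calling convention.

    The caller pushes the arguments it knows about (right to left); the
    callee pops as many values as *it* was compiled to expect. Returns
    (values the callee received, what is left on the stack afterwards).
    """
    stack = ["<caller data>"]          # something the caller still needs
    for arg in reversed(args):         # right-to-left push
        stack.append(arg)
    received = [stack.pop() for _ in range(callee_arity)]
    return received, stack

# Caller and callee agree on three parameters: everything balances.
print(call(3, [1, 2, 3]))

# Caller built against the old three-parameter prototype, callee built
# against the new four-parameter one: the fourth "parameter" is garbage
# and the caller's own data has been eaten off the stack.
print(call(4, [1, 2, 3]))
```

In the mismatched case the callee's fourth value is whatever happened to sit beneath the arguments, and the caller returns to a stack that no longer holds what it left there, which is why the resulting traceback tends to be meaningless.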

"Basically, 'the stack got munged.'" And in this case, the information found in a traceback probably is neither meaningful nor correct – because the content and thus the expected structure of the stack was hosed.

Last edited by sundialsvcs; 02-16-2017 at 09:42 AM.
 
Old 02-16-2017, 10:48 AM   #8
cynwulf
Senior Member
 
Registered: Apr 2005
Posts: 2,727

Rep: Reputation: 2367
Yes, ldconfig(8) has been around in various BSD's since the early days.

However, NetBSD in particular seems to be moving away from it: https://www.netbsd.org/docs/elf.html
 
Old 02-16-2017, 12:54 PM   #9
jggimi
Member
 
Registered: Jan 2016
Distribution: None. Just OpenBSD.
Posts: 289

Rep: Reputation: 169
Cág did post a dmesg earlier today at daemonforums, looking for NetBSD-specific help. It's NetBSD 7.0.2, amd64, on an Ivy Bridge CPU.
 
Old 02-16-2017, 02:25 PM   #10
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,640
Blog Entries: 4

Rep: Reputation: 3933
Quote:
Originally Posted by cynwulf
Yes, ldconfig(8) has been around in various BSD's since the early days.

However, NetBSD in particular seems to be moving away from it: https://www.netbsd.org/docs/elf.html
Thought so.

However, my "gut take" on this particular situation is that it is probably something much more basic, such as quite-literal "library incompatibility," or something that didn't get re-compiled, and so on.

It's "just not nice™" when pieces of computer software can't play well together . . .
 
Old 02-18-2017, 03:33 AM   #11
Cág
LQ Newbie
 
Registered: Oct 2016
Posts: 18

Original Poster
Rep: Reputation: 0
I ran testdisk and two things caught my attention:
Code:
Warning: Bad starting head (CHS and LBA don't match)
Warning: the current number of heads per cylinder is 16 but the correct value may be 255
Errors from the system log:
Code:
PCH transcoder FIFO underrun /* it has always been on this machine */
ACPI Error: [\_SB_.PCI0.GFX0.DD02._BCL] Namespace lookup failure, AE_NOT_FOUND (20131218/psargs-393)
ACPI Error: Method parse/execution failed [\_SB_.PCI0.PEG0.PEGP.DD02._BCL] (Node 0xfffffe81dd1d2408), AE_NOT_FOUND (20131218/psparse-553)
acpiout1: failed to evaluate \_SB_.PCI0.PEG0.PEGP.DD02._BCL: AE_NOT_FOUND
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20131218/hwxface-646)
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20131218/hwxface-646)
i915drmkms0: interrupting at ioapic0 pin 16 (i915)
drm: GMBUS [i915 gmbus vga] timed out, falling back to bit banging on pin 2
intelfb0 at i915drmkms0
i915drmkms0: info: registered panic notifier
DRM error in radeon_get_bios: Unable to locate a BIOS ROM
error: Fatal error during GPU init
radeon0: unable to attach drm: 22
Graphics work fine, but acpi(4) and apm(8) commands don't exist and I suppose I need a custom kernel (which I will certainly build after solving these issues).

A note about mplayer: as I said on DF, it receives SIGILL when run on its own and SIGSEGV when run inside gdb, and in different spots:
Code:
Program terminated with signal SIGILL, Illegal instruction.
#0  0x00007f7fee102d28 in ?? ()
(gdb) bt
#0  0x00007f7fee102d28 in ?? ()
#1  0x0000000000000000 in ?? ()
and
Code:
Program received signal SIGSEGV, Segmentation fault.
0x00007f7ff2407c9e in ?? ()
(gdb) bt
#0  0x00007f7ff2407c9e in ?? ()
#1  0x0000000000000000 in ?? ()
To make it clear: I am running NetBSD 7.0.2, with the latest stable pkgsrc tree that is 2016Q4; in both /usr/pkg/etc/pkgin/repositories.conf and PKG_PATH I have
Code:
http://cdn.netbsd.org/pub/pkgsrc/packages/NetBSD/$arch/7.0/All
GLib, GTK+2, Firefox, MPlayer, GIMP and almost all of their dependencies are built locally from the tree.

ldconfig(8) is disabled as advised.
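
As a quick sanity check on the repository line quoted above, the release directory in the URL can be compared with the running release. This is a minimal sketch: the helper names `repo_release` and `same_branch` are hypothetical, the URL layout is assumed to be `.../packages/NetBSD/<arch>/<release>/All` as in this post, and comparing only major.minor is an assumption about which binary packages are meant for which release.

```python
import re

def repo_release(url):
    """Extract the release directory from a pkgsrc binary-package URL.

    Assumes the layout .../packages/NetBSD/<arch>/<release>/All.
    Returns None if the URL does not look like that.
    """
    m = re.search(r"/packages/NetBSD/[^/]+/([^/]+)/All/?$", url)
    return m.group(1) if m else None

def same_branch(os_release, repo_rel):
    """Compare major.minor only, so 7.0.2 and 7.0 count as the same branch."""
    def major_minor(version):
        # Drop any suffix such as "_STABLE", keep the first two components.
        return tuple(version.split("_")[0].split(".")[:2])
    return major_minor(os_release) == major_minor(repo_rel)

REPO = "http://cdn.netbsd.org/pub/pkgsrc/packages/NetBSD/$arch/7.0/All"

rel = repo_release(REPO)
print("repository release:", rel)
# On the machine itself, platform.release() could supply the first argument.
print("matches 7.0.2:", same_branch("7.0.2", rel))
```

This says nothing about whether the locally built packages and the binary ones come from the same pkgsrc branch; mixing those is a separate thing to rule out.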
 
Old 02-19-2017, 04:29 AM   #12
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,252

Rep: Reputation: 2321
Quote:
Warning: Bad starting head (CHS and LBA don't match)
Warning: the current number of heads per cylinder is 16 but the correct value may be 255
On this: Back in history, MS-DOS was coded to deal with hard disks back when a big clunky disk had a tiny capacity; 10MB was good back then. They were to have no more than 16 heads, 1024 tracks, and some equally ridiculous sectors per track. Since then systems have been lying. There were various maximum limits for disk sizes (512MB, 2 Gig, etc.). Sectors/track and total tracks have expanded beyond all expectations, and various sets of lies were told at various stages to correct this, CHS (Cylinders, Heads, Sectors) & LBA (Logical Block Addressing) being 2 of them.

Nearly certainly you should be set on 255 heads. There are only 2, but it was about the one number with room in it. However, altering the heads setting may break up data on the partition.
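
To illustrate why the heads value matters when CHS and LBA are compared, here is the standard CHS-to-LBA conversion as a small sketch. The function name `chs_to_lba` and the default of 63 sectors per track are assumptions for the example; they are not taken from testdisk's output.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track=63):
    """Standard CHS -> LBA mapping.

    LBA = (C * heads_per_cylinder + H) * sectors_per_track + (S - 1)
    Cylinders and heads count from 0, sectors count from 1.
    """
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range for this geometry")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range for this geometry")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)

# The same CHS triple (cylinder 1, head 0, sector 1) under the two geometries:
print(chs_to_lba(1, 0, 1, heads_per_cylinder=16))    # 16 heads  -> 1008
print(chs_to_lba(1, 0, 1, heads_per_cylinder=255))   # 255 heads -> 16065
```

So a CHS value stored in a partition table under one assumed geometry points at a different block when read back under another, which is the kind of mismatch a "CHS and LBA don't match" warning describes.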

Graphics is another area where software lies have affected hardware designs, as nobody thought about vga when the pc was designed. The competition was from CP/M mini computers on 80x25 consoles, and mainframes on something similar. Consoles were for printing ascii, and incapable of graphics, although that, like everything else changed.

My suggestion: Ignore the graphics errors(my system says most of that); Back up, change to 255 heads and see what breaks; there's usually an autodetect in the bios these days; You have an acpi problem. I'd delete and reinstall that. Then see what shows.
 
Old 02-20-2017, 04:59 AM   #13
Cág
LQ Newbie
 
Registered: Oct 2016
Posts: 18

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by business_kid
My suggestion: Ignore the graphics errors(my system says most of that); Back up, change to 255 heads and see what breaks; there's usually an autodetect in the bios these days.
After setting 255 heads in testdisk and writing the MBR, it switches back to 16 after reboot. The docs explain this: it happens because this is the only operating system on the disk.
Quote:
You have an acpi problem. I'd delete and reinstall that. Then see what shows.
Reinstall what? I tried disabling ACPI but it doesn't change anything.
 
Old 02-20-2017, 05:49 AM   #14
cynwulf
Senior Member
 
Registered: Apr 2005
Posts: 2,727

Rep: Reputation: 2367
I'm not familiar with "testdisk" so not sure if that error referring to the drive geometry is relevant. Can you post your fdisk and disklabel outputs? For disklabel you will need to specify the device node.
 
Old 02-20-2017, 08:26 AM   #15
jggimi
Member
 
Registered: Jan 2016
Distribution: None. Just OpenBSD.
Posts: 289

Rep: Reputation: 169
Drive geometry should have nothing to do with segfaults. Unless I completely misunderstand.
 
  

