LinuxQuestions.org
08-06-2009, 10:18 PM   #1
whaler1
LQ Newbie
 
Registered: Aug 2009
Posts: 3

Rep: Reputation: 0
Segmentation fault during program execution?


I am running large matrix solvers and have hit a problem: I cannot use matrices larger than ~80000^2 indices without getting a segmentation fault. I am using a Dell XPS with an i7 processor and 8GB of RAM, running Ubuntu 8.10. On my laptop, with 2GB of RAM and Windows XP, I can solve much larger matrices using the same Fortran code without any trouble, so I suspect a configuration issue on the Linux system.

When I try to boot into memtest86+ I get the message: Error 28: Selected item cannot fit into memory

and when I run displaymem from grub I get the following:

grub> displaymem
EISA Memory BIOS Interface is present
Address Map BIOS Interface is present
Lower memory: 602K, Upper memory (to first chipset hole): 3136064K
[Address Range Descriptor entries immediately follow (values are 64-bit)]
Usable RAM: Base Address: 0x0 X 4GB + 0x0,
Length: 0x0 X 4GB + 0x96800 bytes
Reserved: Base Address: 0x0 X 4GB + 0x96800,
Length: 0x0 X 4GB + 0x9800 bytes
Reserved: Base Address: 0x0 X 4GB + 0xe0000,
Length: 0x0 X 4GB + 0x20000 bytes
Usable RAM: Base Address: 0x0 X 4GB + 0x100000,
Length: 0x0 X 4GB + 0xbf690000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf790000,
Length: 0x0 X 4GB + 0xe000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf79e000,
Length: 0x0 X 4GB + 0x32000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf7d0000,
Length: 0x0 X 4GB + 0x10000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf7ec000,
Length: 0x0 X 4GB + 0x14000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf800000,
Length: 0x0 X 4GB + 0x800000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xfee00000,
Length: 0x0 X 4GB + 0x1000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xffb00000,
Length: 0x0 X 4GB + 0x500000 bytes
Usable RAM: Base Address: 0x1 X 4GB + 0x0,
Length: 0x1 X 4GB + 0x40000000 bytes

I was wondering if anyone can interpret the above and point me in the right direction to find a solution.

Many thanks for any help
 
08-06-2009, 11:54 PM   #2
clsgis
LQ Newbie
 
Registered: Nov 2007
Posts: 18

Rep: Reputation: 1
half of RAM not recognized, large processes failing

Your displaymem shows only half your RAM was recognized. Verify that with the shell command
cat /proc/meminfo

You didn't tell us whether you installed an "i386" or "amd64" system. The i386 kernel will use all 8 GB with the big RAM extension (PAE), but your userland will be happier in amd64. If you decide to stay in 32 bits, do this:
sudo apt-get update
sudo apt-get install linux-headers-server linux-image-server linux-server

then reboot and choose the new kernel to boot from.
If you're already using the 64-bit kernel (amd64), and your 8GB isn't recognized, something is very wrong in hardware or BIOS. Type
uname -a
to see which you have.
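
For convenience, the two checks above can be rolled into one quick report of what the running kernel sees. This is only a minimal sketch, assuming a Linux system with /proc mounted; the function name report_memory is made up for illustration.

```shell
#!/bin/sh
# report_memory is a hypothetical helper name, not a standard command.
report_memory() {
    # Machine architecture as seen by the running kernel (e.g. x86_64 or i686).
    echo "arch: $(uname -m)"
    # The kernel reports MemTotal and SwapTotal in kB.
    awk '/^MemTotal:/  {print "ram_kB: "  $2}
         /^SwapTotal:/ {print "swap_kB: " $2}' /proc/meminfo
}

report_memory
```

If ram_kB comes out near 4 million on an 8GB machine, the kernel really is seeing only half the RAM.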

Run your program with a small enough data set that it doesn't crash. Watch it run with the top program. Now get another terminal window and run
/sbin/swapon -s
to see how much swap space is being used. If no areas are listed, that's your problem. Create a swap file or partition. A file is easy to create:
sudo dd bs=1024k count=900 if=/dev/zero of=/var/cache/swapfile.swp
sudo chmod 600 /var/cache/swapfile.swp
sudo mkswap /var/cache/swapfile.swp
then activate it:
sudo swapon /var/cache/swapfile.swp

If that fixes the problem, look in System Administration for a graphical way to make it permanent.
 
08-07-2009, 07:57 AM   #3
whaler1
LQ Newbie
 
Registered: Aug 2009
Posts: 3

Original Poster
Rep: Reputation: 0
Thank you for the reply, here is some more info that will likely shed some light on the problem.

modeler@ubuntu:~$ uname -a
Linux ubuntu 2.6.27-7-generic #1 SMP Fri Oct 24 06:40:41 UTC 2008 x86_64 GNU/Linux

modeler@ubuntu:~$ cat /proc/meminfo
MemTotal: 8175312 kB
MemFree: 7524728 kB
Buffers: 16588 kB
Cached: 191652 kB
SwapCached: 0 kB
Active: 305152 kB
Inactive: 142116 kB
SwapTotal: 23952872 kB
SwapFree: 23952872 kB
Dirty: 60 kB
Writeback: 0 kB
AnonPages: 239068 kB
Mapped: 61644 kB
Slab: 49204 kB
SReclaimable: 24256 kB
SUnreclaim: 24948 kB
PageTables: 15132 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 28040528 kB
Committed_AS: 538612 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 317320 kB
VmallocChunk: 34359420815 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 42560 kB
DirectMap2M: 8337408 kB

modeler@ubuntu:~$ /sbin/swapon -s
Filename Type Size Used Priority
/dev/mapper/isw_cbghejgbg_ARRAY5 partition 23952872 0 -1
 
08-07-2009, 08:24 AM   #4
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,139

Rep: Reputation: 1127
Quote:
Originally Posted by whaler1
Code:
grub> displaymem
EISA Memory BIOS Interface is present
Address Map BIOS Interface is present
Lower memory: 602K, Upper memory (to first chipset hole): 3136064K
I've never seen the displaymem program, so I don't know what it ought to be telling you. The phrase "to first chipset hole" is significant: it seems to mean the "Upper memory" figure is not the total you might expect if you ignored that phrase.

You have lots of RAM, lots of swap space, and a 64-bit kernel. I don't think there is any memory problem in the hardware or the Linux kernel.
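
As a rough cross-check, the three "Usable RAM" lengths in the displaymem listing can be added up with shell arithmetic. This is a sketch under one assumption: that "0x1 X 4GB + 0x40000000" means 1 * 4 GiB plus 0x40000000 bytes. The variable names are just illustrative.

```shell
#!/bin/sh
# Lengths of the three "Usable RAM" ranges from the displaymem output, in bytes.
# Assumption: "0xN X 4GB + 0xM" is read as N * 4 GiB + M.
low=$((0x96800))
mid=$((0xbf690000))
high=$((1 * 4294967296 + 0x40000000))

total_bytes=$((low + mid + high))
total_kb=$((total_bytes / 1024))

echo "usable RAM: ${total_kb} kB"
```

Under that reading the total is 8379546 kB, just under 8 GiB and close to the MemTotal of 8175312 kB from /proc/meminfo, which would be consistent with all of the RAM being visible.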

IIUC, you're trying to work with matrices of 6.4G items, but you say those work in Windows XP with 2GB of RAM. Even if that is 64-bit Windows XP and you have lots of swap space, an operation on 6.4G items would run too slowly to be useful.

So either you mean sparse matrices or I entirely misunderstood what you mean. Assuming sparse matrices, we have no way of guessing what the size really is. Since we don't know how sparse they are, 80000^2 tells us nothing.
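
To put a number on the dense case, here is a back-of-envelope calculation. It assumes 8-byte (double precision) entries, which the thread does not actually state, so treat bytes_per_item as a guess.

```shell
#!/bin/sh
# Storage for a dense n x n matrix.
n=80000
bytes_per_item=8   # assumption: double precision; not stated in the thread

items=$((n * n))
bytes=$((items * bytes_per_item))
gib=$((bytes / 1073741824))   # integer division, so this rounds down

echo "items: $items"
echo "dense size: $bytes bytes (about $gib GiB)"
```

That is 6400000000 items and roughly 47 GiB, far beyond what a 2GB machine could hold, which is why a dense 80000^2 matrix does not fit the description.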

It's hard to guess from so little info, but I'll guess anyway. Please try
Code:
ulimit -s
That tells you the current limit (in KB) on the size of a stack. Maybe your program allocates the main data structures on the stack and the default stack limit is too low.

You might try something like
Code:
ulimit -s 2000000
then try to run your program (from the same shell).
IIUC, that command sets the stack limit (only for programs run from the current shell) to ~2GB. I've never tried anything with a stack that large. It is poor programming practice to put very big objects on the stack rather than the heap. But with a 64-bit system, I think this is OK, and if your problem is the stack limit, this ought to cover it.
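
If you would rather not change the limit for the whole shell, one option is to change it inside a subshell that runs just the one program. A minimal sketch: run_with_stack is a made-up helper name, and ./solver in the usage line stands in for your own program.

```shell
#!/bin/sh
# run_with_stack is a hypothetical helper: run_with_stack LIMIT_KB COMMAND [ARGS...]
# The parentheses start a subshell, so the changed limit does not leak
# into the calling shell.
run_with_stack() {
    limit_kb=$1
    shift
    (
        # Change only the soft limit; this fails if it exceeds the hard limit.
        ulimit -S -s "$limit_kb" || exit 1
        "$@"
    )
}

# Show the current soft and hard stack limits (in kB, or "unlimited").
echo "soft stack limit: $(ulimit -S -s)"
echo "hard stack limit: $(ulimit -H -s)"
```

Usage would look like `run_with_stack 2000000 ./solver`. Afterwards `ulimit -s` in the calling shell still reports the old value.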

Quote:
Originally Posted by clsgis
cat /proc/meminfo
...
uname -a
BTW, those were very good suggestions. The answers to those eliminated almost all the likely issues one might have guessed from the first post, and thus made it much easier to look for the real problem.

Last edited by johnsfine; 08-07-2009 at 09:00 AM.
 
08-07-2009, 11:37 AM   #5
whaler1
LQ Newbie
 
Registered: Aug 2009
Posts: 3

Original Poster
Rep: Reputation: 0
Thumbs up

Many thanks folks,

Bumping the stack limit up using the ulimit command solved the problem. Is there any way to make changes to the default stack limits permanent?

I should have mentioned that the big matrices I work with are sparse (with various levels of fill). That fact escaped me because I was so caught up with how the same solver code compiled on the Linux system would not work but on XP it worked fine.
 
08-07-2009, 01:49 PM   #6
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,139

Rep: Reputation: 1127
Quote:
Originally Posted by whaler1
Is there any way to make changes to the default stack limits permanent?
If you want to attract an expert answer to that, you may want to start a new thread for it.

It is common to put ulimit commands in various startup scripts, such as .bashrc

Which one is right for you depends on which programs you want affected and how you run those programs.

For use in a startup script you probably want
ulimit -S -s 2000000

With -S, a subsequent ulimit command from the same shell can raise or lower the limit. Without -S, a subsequent command can only lower it. (There is both the actual stack limit, called the soft limit, and a limit on setting the stack limit, called the hard limit. The command can set either or both, and defaults to both.)
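
A startup script can also check the hard limit first, so that it fails quietly on a system where 2000000 is not allowed. This is only a sketch of one way to write it; raise_soft_stack is an illustrative name, not a standard command.

```shell
# Hypothetical fragment for a startup script such as ~/.bashrc.
# raise_soft_stack is an illustrative name, not a standard command.
raise_soft_stack() {
    want_kb=$1
    hard=$(ulimit -H -s)
    # Only the soft limit is touched, so later commands can still change it.
    if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$want_kb" ]; then
        ulimit -S -s "$want_kb"
    else
        echo "hard stack limit ($hard kB) is below $want_kb kB" >&2
        return 1
    fi
}

raise_soft_stack 2000000 || echo "stack limit left unchanged" >&2
```

Because the hard limit is never lowered here, a later `ulimit -S -s` in the same shell can still move the soft limit up or down.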

Last edited by johnsfine; 08-07-2009 at 02:21 PM.
 
08-07-2009, 02:10 PM   #7
i92guboj
Gentoo support team
 
Registered: May 2008
Location: Lucena, Córdoba (Spain)
Distribution: Gentoo
Posts: 4,049

Rep: Reputation: 378
Mind that, in any case, this is a bug in your software that should be submitted upstream for them to fix it. A program should make the right checks and never segfault just because the data you entered is out of range or too much data.

Besides that, I agree with the other poster: really poor programming if you truly need a 2gb stack.
 
08-07-2009, 02:33 PM   #8
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,139

Rep: Reputation: 1127
Quote:
Originally Posted by i92guboj
A program should make the right checks and never segfault just because the data you entered is out of range or too much data.
Nice in theory, but generally not practical in the real world. If you defend against every possible problem, your program is nothing but defense and no room for function. There is always a tradeoff relative to customer type and market size. If you hope to sell a million copies of the product you can afford to write a lot more defense against odd situations (and maybe you can't afford not to write all that defense). If you are going to use it yourself and just a few times, you want to code for results and not defense. The real world in between is always a judgment call.

Quote:
I agree with the other poster: really poor programming if you truly need a 2gb stack.
I don't think anyone said it needed a 2GB stack.

It needed a bigger stack than the default stack limit that was previously set. But I didn't yet know that when I made the suggestion.

I suggested 2GB because I wanted a value that would definitively answer the question of whether stack size was the issue. If the old stack limit was 10MB and he changed it to 100MB and it failed anyway, what would that mean? Would it mean the problem wasn't stack size or would it mean 100MB wasn't enough? I don't know. The program was usable on a 2GB Windows XP system. With a lot of kludging you could get a program using more than 2GB of stack to run slowly (swapping a lot) on such a system. But that is just barely possible, not in any way likely enough to really consider. So I decided that 2GB was either enough or it was chasing the wrong issue: decidable in one test.
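
A one-test check like that depends on telling a segmentation fault apart from any other failure. Here is a minimal sketch, assuming a POSIX shell on Linux, where a process killed by SIGSEGV (signal 11) is reported as exit status 139. The helper name classify_run is made up, and ./solver in the usage line is a placeholder.

```shell
#!/bin/sh
# classify_run is a hypothetical helper: it runs a command and names the outcome.
classify_run() {
    status=0
    "$@" || status=$?
    # The shell reports a process killed by signal N as status 128 + N;
    # SIGSEGV is signal 11 on Linux, hence 139.
    if [ "$status" -eq 139 ]; then
        echo "segfault"
    elif [ "$status" -eq 0 ]; then
        echo "ok"
    else
        echo "failed with status $status"
    fi
}
```

Running `classify_run ./solver` once with the default limit and once after `ulimit -s 2000000` would show whether the segfault goes away with the larger stack.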

Last edited by johnsfine; 08-07-2009 at 02:35 PM.
 
All times are GMT -5.