[SOLVED] Segmentation fault during program execution?
I am running large matrix solvers and ran into a problem: I cannot use matrices larger than ~80000^2 indices without getting a segmentation fault. The machine is a Dell XPS with an i7 processor and 8GB of RAM, running Ubuntu 8.10. On my laptop, which has only 2GB of RAM and runs XP, I can solve much larger matrices with the same Fortran code without any trouble, so I suspect a configuration issue on the Linux system.
When I try to boot into memtest86+ I get the message: Error 28: Selected item cannot fit into memory
and when I run displaymem from grub I get the following:
grub> displaymem
EISA Memory BIOS Interface is present
Address Map BIOS Interface is present
Lower memory: 602K, Upper memory (to first chipset hole): 3136064K
[Address Range Descriptor entries immediately follow (values are 64-bit)]
Usable RAM: Base Address: 0x0 X 4GB + 0x0,
Length: 0x0 X 4GB + 0x96800 bytes
Reserved: Base Address: 0x0 X 4GB + 0x96800,
Length: 0x0 X 4GB + 0x9800 bytes
Reserved: Base Address: 0x0 X 4GB + 0xe0000,
Length: 0x0 X 4GB + 0x20000 bytes
Usable RAM: Base Address: 0x0 X 4GB + 0x100000,
Length: 0x0 X 4GB + 0xbf690000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf790000,
Length: 0x0 X 4GB + 0xe000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf79e000,
Length: 0x0 X 4GB + 0x32000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf7d0000,
Length: 0x0 X 4GB + 0x10000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf7ec000,
Length: 0x0 X 4GB + 0x14000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xbf800000,
Length: 0x0 X 4GB + 0x800000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xfee00000,
Length: 0x0 X 4GB + 0x1000 bytes
Reserved: Base Address: 0x0 X 4GB + 0xffb00000,
Length: 0x0 X 4GB + 0x500000 bytes
Usable RAM: Base Address: 0x1 X 4GB + 0x0,
Length: 0x1 X 4GB + 0x40000000 bytes
I was wondering if anyone can interpret the above and point me in the right direction to find a solution.
half of RAM not recognized, large processes failing
Your displaymem shows only half your RAM was recognized. Verify that with the shell command
cat /proc/meminfo
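If it helps, here is a minimal sketch for pulling the total out of /proc/meminfo and converting it to GB (the 8GB figure is just this poster's installed RAM):

```shell
# Print total RAM as the kernel sees it; with 8GB installed,
# a value near 4 GB here would confirm half the RAM is missing.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "MemTotal: ${mem_kb} kB (~$((mem_kb / 1024 / 1024)) GB)"
```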
You didn't tell us whether you installed an "i386" or "amd64" system. The i386 kernel will use all 8 GB with the big RAM extension, but your userland will be happier in amd64. If you decide to stay in 32 bits, do this.
sudo apt-get update
sudo apt-get install linux-headers-server linux-image-server linux-server
then reboot and choose the new kernel to boot from.
If you're already using the 64-bit kernel (amd64), and your 8GB isn't recognized, something is very wrong in hardware or BIOS. Type
uname -a
to see which you have.
Run your program with a small enough data set that it doesn't crash, and watch it run with top. Then open another terminal window and run
/sbin/swapon -s
to see how much swap space is being used. If no areas are listed, that's your problem. Create a swap file or partition. A file is easy to create:
sudo dd bs=1024k count=900 if=/dev/zero of=/var/cache/swapfile.swp
sudo chmod 600 /var/cache/swapfile.swp
sudo mkswap /var/cache/swapfile.swp
and activate it with
sudo swapon /var/cache/swapfile.swp
If that fixes the problem, look in System Administration for a graphical way to make it permanent.
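If you'd rather stay on the command line, a sketch of making it permanent (assuming the same swap-file path as above; double-check the fstab line before rebooting):

```shell
# Register the swap file in /etc/fstab so it survives reboots
echo '/var/cache/swapfile.swp none swap sw 0 0' | sudo tee -a /etc/fstab
# Activate everything listed in fstab and confirm it shows up
sudo swapon -a
/sbin/swapon -s
```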
Quote:
grub> displaymem
EISA Memory BIOS Interface is present
Address Map BIOS Interface is present
Lower memory: 602K, Upper memory (to first chipset hole): 3136064K
I've never seen the displaymem program, so I don't know exactly what it ought to be telling you. But the phrase "to first chipset hole" is significant: it suggests displaymem only reports memory up to the first chipset hole, so the totals aren't what you might expect if you ignored that phrase.
You have lots of ram, lots of swap space, and a 64 bit kernel. I don't think there is any memory problem in the hardware or the Linux kernel.
IIUC, you're trying to work with matrices of 6.4G items, but you say those work in Windows XP with 2GB of ram. Even if that is 64 bit Windows XP and you have lots of swap space, an operation on 6.4G items would run too slowly to be useful.
So either you mean sparse matrices or I entirely misunderstood what you mean. Assuming sparse matrices, we have no way of guessing what the size really is. Since we don't know how sparse they are, 80000^2 tells us nothing.
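For scale, the dense storage for an 80000x80000 matrix of 8-byte reals is easy to compute, and it shows why the dense interpretation can't be what's happening on either machine:

```shell
# 80000 * 80000 elements, 8 bytes each (needs 64-bit shell arithmetic)
echo $((80000 * 80000 * 8))                              # 51200000000 bytes
echo "$((80000 * 80000 * 8 / 1024 / 1024 / 1024)) GiB"   # 47 GiB
```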
It's hard to guess from so little info, but I'll guess anyway. Please try
Code:
ulimit -s
That tells you the current limit (in KB) on the size of a stack. Maybe your program allocates the main data structures on the stack and the default stack limit is too low.
You might try something like
Code:
ulimit -s 2000000
then try to run your program (from the same shell).
IIUC, that command sets the stack limit (only for programs run from the current shell) to ~2GB. I've never tried anything with a stack that large. It is poor programming practice to put the very big objects on the stack rather than the heap. But with a 64 bit system, I think this is OK and if your problem is stack limit, this ought to cover it.
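One low-risk way to test this is to raise the limit inside a subshell, so your login shell's limit is untouched (the solver invocation is left as a comment because the program name here is hypothetical):

```shell
# The raised limit applies only inside the ( ... ) subshell;
# replace the echo with your real program, e.g. ./solver
( ulimit -S -s 2000000; echo "stack limit in subshell: $(ulimit -s) kB" )
echo "stack limit in parent shell: $(ulimit -s) kB"   # unchanged
```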
Quote:
Originally Posted by clsgis
cat /proc/meminfo
...
uname -a
BTW, those were very good suggestions. The answers to those eliminated almost all the likely issues one might have guessed from the first post, and thus made it much easier to look for the real problem.
Bumping the stack limit up using the ulimit command solved the problem. Is there any way to make changes to the default stack limits permanent?
I should have mentioned that the big matrices I work with are sparse (with various levels of fill). That fact escaped me because I was so caught up with how the same solver code compiled on the Linux system would not work but on XP it worked fine.
Quote:
Is there any way to make changes to the default stack limits permanent?
If you want to attract an expert answer to that, you may want to start a new thread for it.
It is common to put ulimit commands in various startup scripts, such as .bashrc
Where is right for you depends on which programs you want affected and how you run those programs.
For use in a startup script you probably want
ulimit -S -s 2000000
With the -S, a subsequent ulimit command from the same shell can raise or lower the limit. Without the -S, a subsequent command can only lower it. (There are actually two values: the soft limit, which is the one enforced, and the hard limit, which caps how high the soft limit may be raised. The command can set either or both, and by default sets both.)
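Concretely, a sketch of the two usual places (the 2000000 value simply matches the earlier test; the limits.conf lines use the pam_limits format and are shown as comments since that file must be edited by hand):

```shell
# Per-user: every new interactive shell picks this up
echo 'ulimit -S -s 2000000' >> ~/.bashrc

# System-wide alternative, in /etc/security/limits.conf (needs pam_limits):
#   <username>  soft  stack  2000000
#   <username>  hard  stack  unlimited
```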
Mind that, in any case, this is a bug in your software that should be reported upstream so they can fix it. A program should make the right checks and never segfault just because the data you entered is out of range or too much data.
Besides that, I agree with the other poster: it is really poor programming if you truly need a 2GB stack.
Quote:
A program should make the right checks and never segfault just because the data you entered is out of range or too much data.
Nice in theory, but generally not practical in the real world. If you defend against every possible problem, your program is all defense with no room for function. There is always a tradeoff, relative to customer type and market size. If you hope to sell a million copies of a product, you can afford to write a lot more defense against odd situations (and maybe can't afford not to). If you are going to use the program yourself, and just a few times, you want to code for results, not defense. The real world, in between, is always a judgment call.
Quote:
I agree with the other poster: really poor programming if you truly need a 2gb stack.
I don't think anyone said it needed a 2GB stack.
It needed a bigger stack than the default stack limit that was previously set. But I didn't yet know that when I made the suggestion.
I suggested 2GB because I wanted a value that would definitively answer the question of whether stack size was the issue. If the old stack limit was 10MB and he changed it to 100MB and it failed anyway, what would that mean? Would it mean the problem wasn't stack size or would it mean 100MB wasn't enough? I don't know. The program was usable on a 2GB Windows XP system. With a lot of kludging you could get a program using more than 2GB of stack to run slowly (swapping a lot) on such a system. But that is just barely possible, not in any way likely enough to really consider. So I decided that 2GB was either enough or it was chasing the wrong issue: decidable in one test.