Distribution: Solaris 11.4, Oracle Linux, Mint, Debian/WSL
Posts: 9,789
Rep:
Your program is likely referencing data outside its valid address space.
Common reasons are:
- bogus array index
- NULL or uninitialized pointer
- Too much recursion
I think I remember seeing that error returned on a SPARC, where the software had a bug and did an unaligned memory operation: it tried to read a 32-bit value from an address whose lower 2 bits were not 0.
This should not happen on Intel processors, though. They can do those types of memory access, but take a few extra clock cycles to do the fetch.
What processor type is this error showing up on, if I may ask?
A core file is an image of a process's memory, generated automatically when a program is terminated by certain fatal exceptions, like the memory fault you observe.
Its purpose is to help with post-mortem debugging.
Maybe ucLinux isn't generating core files, or the core file can't be created because the process's current directory isn't writable at crash time.
The fact that ucLinux is dedicated to hardware lacking an MMU may be relevant to this problem.
What is your program?
Did you write and compile it yourself?
Does the same problem happen with this program on other Linux systems?
ucLinux is not generating any core file, and you are right, the system doesn't have an MMU. I am writing video communication software and have a cross compiler for the system. The same problem doesn't occur on Red Hat Linux 9.0 or Fedora Core 2.
What is the target processor of your cross compiler? If it's an unaligned access, sometimes called a "bus fault" or "memory fault", then it will not occur on Intel, but it will on many other processors, such as Motorola and SPARC. MIPS also normally requires aligned access, though Linux on MIPS typically emulates unaligned accesses in the kernel. So what kind of processor is it?
One possible reason for the different behaviour between ucLinux and mainstream Linux is that the former does not grow the process stack dynamically, which may lead to the memory fault you observe.
The stack size can be increased at build time (see the flthdr command or the FLTFLAGS variable).
I'm not familiar with their processors, so compile and run this example program on your processor. If it causes a bus/memory fault, then that is what is going on. This program will fail if your processor is limited to aligned memory reads and writes, which is what I suspect your issue is. It works on Intel because their processors transparently perform the extra fetches needed when an access is not word aligned.
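Code:
#include <stdio.h>
#include <stdint.h>

/* 8 known bytes, so every 32-bit read below stays inside the buffer */
char buf[8] = {0, 1, 2, 3, 4, 5, 6, 7};

int main(void) {
    int j;
    uint32_t *p;
    for (j = 0; j < 5; j++) {
        /* deliberately misaligned for j = 1..3: this is the bug being demonstrated */
        p = (uint32_t *)(buf + j);
        /* the dereference *p is where a strict-alignment CPU faults */
        printf("address=0x%08lX value=0x%08lX\n",
               (unsigned long)(uintptr_t)p, (unsigned long)*p);
    }
    return 0;
}
//Example Intel processor output
//address=0x08049494 value=0x03020100
//address=0x08049495 value=0x04030201
//address=0x08049496 value=0x05040302
//address=0x08049497 value=0x06050403
//address=0x08049498 value=0x07060504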
In pseudo assembly, since I don't know your specific processor,
I'm showing an example of loading a 32-bit quantity from memory
at address 0x12345-0x12348 into the accumulator.
mov ix,0x12345 ;load index reg with constant address pointer
mov a,[ix] ;load 32-bit accum from address pointer ix
On your processor an exception interrupt is generated
when you execute the mov a,[ix] because the least significant
2 bits of ix are not 0. Note the 5 on the end of the address:
it ends in 01 binary. It must end in 00 binary or the processor
throws an exception.
This means you can only read 32-bit values from an address
that is divisible by 4, and 16-bit values from an address
that is divisible by 2 (remainder=0).
This is exactly why C inserts padding bytes when you create
a char variable and a long variable right next to each
other. The 3 bytes after the char and before the long
are wasted space. You can test this yourself by
printing out the addresses of the two variables:
typically the char will start at address X and the
long will start at address X+4, even though it would
have fit at address X+1.
Take a look at the example program I gave you; you will
see I am doing a funky pointer cast from char * to
unsigned long *. Normally this causes no problems
if you are careful, but if the char * points to
an address not divisible by 4 (which is perfectly
valid for a char *, BTW) then the unsigned long *
is not a valid pointer and will cause a memory/bus
fault when you use it. In my example program
the dereference *p at the end of the printf()
is the exact point where the exception is thrown,
because the pointer p is not divisible by 4 the
second time through the j for loop.
I hope you find your problem. Historically this
is a very difficult problem to find if you have
a lot of code (and not written by you) to go through.
When I changed the (unsigned long *) to (char *) the program ran properly. Why is that? It must be the case that for unaligned addresses two memory reads are performed. But who is doing it? The compiler?
I wouldn't say it ran properly after you made that change, because you altered the program logic at the same time. You changed the program so it reads five 8-bit values instead of five 32-bit values from buf[]. Aside from the obvious big/little endian difference, your output will look quite different from the Intel example output shown earlier. So you removed one bug but added a second bug to the code's logic by changing the pointer type. The program doesn't work the same as it did before, even though the memory fault no longer happens.
Like I said before, it is the processor causing the exception. It is a hardware exception, not the compiler's fault or the OS's fault. Try running the same program on Intel and it works just fine. It's because of the differences in the processor hardware.
Somewhere in your code the same sort of flaw in the program logic is lurking. It may not look exactly like the example program I showed you, but the flavor of the problem is the same: some form of pointer casting that changes the size of the referenced type, followed by pointer addition or the like.
I'm afraid it's roll up the sleeves and test the code function by function, trying to half-split and isolate the problem to a specific function, and going into the logic from there.
Russoue,
I'm not sure if it's supported on your architecture, but if you could use remote gdb on your program (gdbserver), that would greatly help you find the parts of your code where the errors occur.
As Randyding pointed out, your target CPU is big-endian while your code works on Intel architecture, which is little-endian. If your code isn't already taking this into account, you may want to fix that at the source level before doing any runtime debugging.