LinuxQuestions.org
Linux - Software This forum is for Software issues.
Old 03-23-2009, 08:22 AM   #1
ljlinde
LQ Newbie
 
Registered: Mar 2009
Posts: 9

Rep: Reputation: 0
"C" realloc () function problem.


calloc() is used to allocate and clear space for a data set defined as a typedef struct. One structure is allocated. The structure contains 10 char xxx[3];
variables.
Data can be loaded into the structure and it all checks out. When space for a second set of data is created using realloc(ptr,2), the second string
in the first structure is corrupted. The first and subsequent strings are correct.
Data entered into the second structure is correct. Space for a third set of data
is created with realloc(ptr,3), and all data in the second structure is correct. The second string
of the first structure is still corrupt.
The first set of data can be entered and more space realloc'ed, and it is all correct.

This code compiles and runs without errors or warnings. It runs fine on several
Unix machines but not on SUSE 11 Linux.

Any clues?
 
Old 03-23-2009, 08:40 AM   #2
vkmgeek
Member
 
Registered: Feb 2006
Location: Ahmedabad
Distribution: rhel5
Posts: 185
Blog Entries: 2

Rep: Reputation: 31
Can you post a piece of the code?
 
Old 03-24-2009, 06:40 PM   #3
JaksoDebr
Member
 
Registered: Mar 2009
Distribution: Fedora, Slackware
Posts: 104

Rep: Reputation: 21
Are the other UNIX machines also Linux systems, or is Suse the only Linux box involved?
Is it the same on another Suse installation?
Is it always the same memory area that is trashed?
Did you run the code via 'ddd' to debug it in detail?
Does it produce the same result when run via 'ddd' as it does on a normal run?
Do all involved systems use the same GCC and GLIBC combination?
Tried to compile it with another compiler (like LCC or Intel)?

The source code is probably not at fault, if it runs correctly on other systems. It seems like something is trashing the memory area of the application. It can even be a problem within glibc.


Last edited by JaksoDebr; 04-02-2009 at 05:19 AM.
 
Old 03-24-2009, 10:17 PM   #4
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by ljlinde View Post
space for a second set of data is created using realloc (ptr,2)
Do you mean that literally? You did a realloc to change the size to be 2 bytes?

If you mean realloc(ptr,2*sizeof(whatever_type)) then you should say so.

Quote:
Originally Posted by vkmgeek View Post
Can you post a piece of the code?
Good suggestion.

Quote:
Originally Posted by JaksoDebr View Post
The source code is probably not at fault, if it runs correctly on other systems.
Nonsense! The source code is almost certainly at fault. It probably runs correctly because of luck. You can clobber quite a bit of heap memory that you incorrectly think you allocated and work anyway (by luck) especially in a short simple program.
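
As a point of reference, here is a minimal sketch of growing an array of structures with a byte count passed to realloc(). The original code was not posted, so the STR layout (ten 3-byte strings, as described in post #1) and the helper name grow_records are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type: ten 3-byte strings, as described in post #1. */
typedef struct {
    char field[10][3];
} STR;

/*
 * Grow an array of STR from old_count to new_count records.
 * realloc() takes a size in BYTES, so the record count is multiplied
 * by sizeof(STR). The new records are zeroed to match what calloc()
 * did for the first one.
 * Returns the (possibly moved) array, or NULL on failure, in which
 * case the original array is untouched and still owned by the caller.
 */
STR *grow_records(STR *records, size_t old_count, size_t new_count)
{
    assert(new_count > old_count);

    if (new_count > SIZE_MAX / sizeof(STR))
        return NULL;                      /* byte count would overflow */

    STR *grown = realloc(records, new_count * sizeof(STR));
    if (grown == NULL)
        return NULL;                      /* records is still valid */

    memset(grown + old_count, 0, (new_count - old_count) * sizeof(STR));
    return grown;
}
```

A caller would keep the result in a temporary pointer, for example tmp = grow_records(ptr, 1, 2), and only assign ptr = tmp when tmp is not NULL, so that the original block is not leaked if the call fails.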

Last edited by johnsfine; 03-24-2009 at 10:20 PM.
 
Old 03-25-2009, 04:49 PM   #5
JaksoDebr
Member
 
Registered: Mar 2009
Distribution: Fedora, Slackware
Posts: 104

Rep: Reputation: 21
Quote:
Nonsense! The source code is almost certainly at fault. It probably runs correctly because of luck.
I admit that 'realloc (ptr,2)' effectively does the trashing itself, so that would be really a problem in the source code. I just wonder, why the same line of code behaves 'correctly' on other UNIXes. This would mean that all other UNIX systems run it correctly "by luck" - effectively meaning that all the other systems are buggy for depending on luck.

Luck is not really a well defined concept in programming. But 'realloc' has a well-defined standard definition, so I still keep up my theory that some system library can be at fault: either in Suse, or in all those other "lucky" UNIX systems (even if their fault is by hiding the error and making some assumption on how to behave).

Is there any standard way to acquire such "coding luck" for programming? A lot of people could use it.
 
Old 03-25-2009, 09:01 PM   #6
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by JaksoDebr View Post
I admit that 'realloc (ptr,2)' effectively does the trashing itself
If he meant that literally, which seems very possible, that is what does the trashing.

Quote:
I just wonder, why the same line of code behaves 'correctly' on other UNIXes.
Obviously, the line of code would not behave correctly. But the program might run to completion with correct output.

Quote:
effectively meaning that all the other systems are buggy for depending on luck.

I don't get that interpretation at all. The other systems don't "depend" on any luck. They assume the program makes correct calls.

In a well designed system a correct program produces correct results.

It would be impossible to design a system in which every incorrect program produces incorrect results. Even after reading the programmer's mind to deduce intent, you would still run into the decidability issues in correctness.

Sometimes an incorrect program will produce "correct" output.

Quote:
Is there any standard way to acquire such "coding luck" for programming? A lot of people could use it.
Hardly! If you're a serious programmer that would be bad luck. It means you tested your program and it worked, but it is still wrong and under other conditions it will crash.

I'm much happier when broken programs fail the simplest test, than when one runs perfectly for six years of serious use by many people, then fails on a totally ordinary example and takes days of debugging (last week) to find a serious coding error that took a massive amount of "luck" to not crash far more often.

If it had failed any of the initial tests I would have spotted the bug without even debugging. If the 64bit Visual Studio debugger were not seriously flawed I would have found the bug in a few minutes of debugging. Unfortunately gdb shares the important bugs that got in my way. Win32 was the one platform I use with a good enough (obsolete Visual Studio) debugger for this bug and the one platform where the bug got "lucky" and never appeared.
 
Old 03-26-2009, 07:50 AM   #7
ljlinde
LQ Newbie
 
Registered: Mar 2009
Posts: 9

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by JaksoDebr View Post
Are the other UNIX machines also Linux systems, or is Suse the only Linux box involved?
The Unix systems are BSD 4.3.

Is it the same on another Suse installation?
I don't have access to another Suse system.

Is it always the same memory area that is trashed?
Yes. The strange part is that you can realloc() more structures and they are
all OK. It is only the second member of the first structure that gets trashed.

Did you run the code via 'ddd' to debug it in detail?
Does it produce the same result when run via 'ddd' as it does on a normal run?
What is "ddd"?

The original code was developed on a Convex BSD 4.3 system in the '92 time frame. It worked
without any changes. Convex is now part of HP. Convex spent a lot of time fixing
minor glitches in their compilers. I was able to run a control simulation
written in "C" coupled to a Fortran 66 engine model with Ada monitoring linked up.

Do all involved systems use the same GCC and GLIBC combination?
No. The other systems were all BSD 4.1, 4.2 or 4.3.

Tried to compile it with another compiler (like LCC or Intel)?
I do not have access to the Intel product.

The source code is probably not at fault, if it runs correctly on other systems. It seems like something is trashing the memory area of the application. It can even be a problem within glibc.
The code is textbook stuff and simple. It compiles without any warnings, and
with gdb you can see what happens.
I moved the program so it was not using the same memory by running another program first and letting it run in an endless loop. When I inspected the pointer, it pointed to
another place in memory. I looked to see if there was trash on the stack, and it
all made sense.
I did a 12 memory test on the target machine to make sure I was not looking at a
hardware problem.
 
Old 03-26-2009, 08:00 AM   #8
ljlinde
LQ Newbie
 
Registered: Mar 2009
Posts: 9

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by johnsfine View Post
Do you mean that literally? You did a realloc to change the size to be 2 bytes?

If you mean realloc(ptr,2*sizeof(whatever_type)) then you should say so.



Good suggestion.



Nonsense! The source code is almost certainly at fault. It probably runs correctly because of luck. You can clobber quite a bit of heap memory that you incorrectly think you allocated and work anyway (by luck) especially in a short simple program.
It was clearly stated that ptr was a pointer to a structure. The space
for it was created at run time by ptr = (STR *)calloc(1, sizeof(STR));
You can see the contents in gdb by using *ptr (note that the calloc args are
reversed from malloc()).
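
For comparison, calloc() takes an element count and an element size, while malloc() takes a single size in bytes and does not clear the memory. The sketch below shows the two calls side by side. It is only an illustration: the STR layout is assumed from the description in post #1, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type: ten 3-byte strings, as described in post #1. */
typedef struct {
    char field[10][3];
} STR;

/* calloc(count, size): count elements of size bytes each, zero-filled. */
STR *alloc_with_calloc(size_t count)
{
    assert(count > 0);
    return calloc(count, sizeof(STR));
}

/* malloc(bytes): one byte count, contents uninitialised, so clear by hand. */
STR *alloc_with_malloc(size_t count)
{
    assert(count > 0);

    if (count > SIZE_MAX / sizeof(STR))
        return NULL;                      /* byte count would overflow */

    STR *p = malloc(count * sizeof(STR));
    if (p != NULL)
        memset(p, 0, count * sizeof(STR));
    return p;
}
```

Both functions return a zeroed block large enough for count records, or NULL on failure.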

It displays all of the contents.
realloc () knows what the size is from the orig calloc.
realloc (ptr,2) means that it will allocate one more structure in memory.
You can see this with gdb p *(ptr+1). This shows you the contents of the second
structure. To really test the problem I filled each structure with data and
every structure's contents were perfect.
Read Std "C".
 
Old 03-26-2009, 08:07 AM   #9
ljlinde
LQ Newbie
 
Registered: Mar 2009
Posts: 9

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by JaksoDebr View Post
I admit that 'realloc (ptr,2)' effectively does the trashing itself, so that would be really a problem in the source code. I just wonder, why the same line of code behaves 'correctly' on other UNIXes. This would mean that all other UNIX systems run it correctly "by luck" - effectively meaning that all the other systems are buggy for depending on luck.

Luck is not really a well defined concept in programming. But 'realloc' has a well-defined standard definition, so I still keep up my theory that some system library can be at fault: either in Suse, or in all those other "lucky" UNIX systems (even if their fault is by hiding the error and making some assumption on how to behave).

Is there any standard way to acquire such "coding luck" for programming? A lot of people could use it.
There is no such function in "C" called luck()

You can see the corrupted first structure immediately after realloc () is run.

The program compiles without warnings or errors.

If you create a temp ptr and set it with tptr = ptr + 1;
you can realloc() for at least a dozen cases without any problems. It is always
the second member of the first structure that is corrupted!
 
Old 03-26-2009, 08:10 AM   #10
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by ljlinde View Post
realloc () knows what the size is from the orig calloc.
realloc (ptr,2) means that it will allocate one more structure in memory.
Is there an online copy anywhere for whatever documentation told you that? It contradicts every example of realloc documentation I can find.

Quote:
You can see this with gdb p *(ptr+1). This shows you the contents of the second
structure.
Memory doesn't usually fail to exist nor even fail to work just because you allocated it incorrectly.

It is not surprising that you can see the contents of the second structure even after the memory it occupied has been released back to the free pool by your incorrect realloc.

Freeing a chunk of memory typically modifies just the first few bytes of the freed memory (exactly fitting your described symptom). (2 bytes is below a minimum chunk size, so the beginning of the freed memory is likely more than 2 bytes past the initial pointer).
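
The point about a literal 2-byte realloc can be illustrated with a small sketch. The C standard only promises that the contents are preserved up to the smaller of the old and new sizes, so after realloc(block, 2) just the first two bytes may be relied on. The block size of 30 bytes (ten 3-byte strings) and the function name are assumptions for illustration, since the original code was not posted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { RECORD_BYTES = 30 };   /* hypothetical: ten strings of 3 bytes each */

/*
 * Allocate a RECORD_BYTES block, fill it, then shrink it to 2 bytes
 * the way a literal realloc(ptr, 2) would.
 * Returns 0 on success, -1 if an allocation fails.
 */
int two_byte_realloc_demo(void)
{
    unsigned char *block = malloc(RECORD_BYTES);
    if (block == NULL)
        return -1;
    memset(block, 'x', RECORD_BYTES);

    unsigned char *small = realloc(block, 2);   /* the size is in bytes */
    if (small == NULL) {
        free(block);                            /* old block is still valid */
        return -1;
    }

    /* Only small[0] and small[1] still belong to the program. */
    assert(small[0] == 'x' && small[1] == 'x');

    /* Reading or writing small[2] onward would be undefined behaviour,
     * even though it may appear to work on some systems. */
    free(small);
    return 0;
}
```

The bytes past the first two are deliberately not touched here, because accessing them is exactly the kind of error that can appear to work on one system and corrupt data on another.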

Last edited by johnsfine; 03-26-2009 at 08:14 AM.
 
Old 03-26-2009, 08:16 AM   #11
fpmurphy
Member
 
Registered: Jan 2009
Location: /dev/ph
Distribution: Fedora, Ubuntu, Redhat, Centos
Posts: 299

Rep: Reputation: 62
Interesting problem. ljlinde, can you provide a subset of your code which displays the problem?
 
  


All times are GMT -5.