LinuxQuestions.org - [SOLVED] SEGMENTATION FAULT using gcc 4.4.4 -O2 , works with gcc 4.1.0 -O2 or gcc 4.4.4 -O1

- Programming (https://www.linuxquestions.org/questions/programming-9/)

- - SEGMENTATION FAULT using gcc 4.4.4 -O2 , works with gcc 4.1.0 -O2 or gcc 4.4.4 -O1 (https://www.linuxquestions.org/questions/programming-9/segmentation-fault-using-gcc-4-4-4-o2-works-with-gcc-4-1-0-o2-or-gcc-4-4-4-o1-822124/)

amir1981

07-25-2010 08:22 PM

SEGMENTATION FAULT using gcc 4.4.4 -O2 , works with gcc 4.1.0 -O2 or gcc 4.4.4 -O1

Hi,
I'm trying to update a relatively old piece of software so that it works on new 64-bit systems and with a newer version of gcc. Because the original software was written for 32-bit systems, I decided to use controlled data types that have the same size on both 64-bit and 32-bit machines, and I changed the code to use these newly defined types. Here is the situation:
1- As I expected, everything works on a 32-bit machine.
2- If I use gcc 4.1.0 on a 64-bit machine, everything works.
3- If I use gcc 4.4.4 on a 64-bit machine with -O2, a segmentation fault occurs!
4- If I use gcc 4.4.4 on a 64-bit machine with -O, everything works!!
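
A minimal sketch of what such controlled types might look like. The typedef names match the ones that appear in the code later in this thread (byte8, int32, uint64 and so on), but building them on top of <cstdint> is an assumption; the thread does not show the project's actual definitions.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical definitions: the names come from the thread, the
// underlying types are assumed for illustration.
typedef std::uint8_t  byte8;
typedef std::int16_t  int16;
typedef std::int32_t  int32;
typedef std::int64_t  int64;
typedef std::uint16_t uint16;
typedef std::uint32_t uint32;
typedef std::uint64_t uint64;

// Fail the build, rather than the program, if a width is not what the
// code relies on.
static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");
static_assert(sizeof(int16) == 2, "int16 must be 2 bytes");
static_assert(sizeof(int32) == 4, "int32 must be 4 bytes");
static_assert(sizeof(int64) == 8, "int64 must be 8 bytes");

// long and pointers are the types that usually change between a
// 32-bit and a 64-bit Linux build, so code that mixes them with the
// fixed-width types above deserves a second look.
inline bool long_is_wider_than_int32() {
  return sizeof(long) > sizeof(int32);
}
```

Fixed-width typedefs only help where they are used consistently; a printf/scanf format string or a cast that still assumes the old width of long is not caught by them.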

Here is the output of Valgrind:
==13412==
==13412==
==13412== Process terminating with default action of signal 11 (SIGSEGV)
==13412== Access not within mapped region at address 0x700000008
==13412== at 0x409F29: SysString::clear(Integral::CMODE) (sstr_03.cc:1623)
==13412== by 0x40B73D: SysString::assign(unsigned char, wchar_t const*) (sstr_03.cc:838)
==13412== by 0x403474: SysString::diagnose(Integral::DEBUG) (sstr_02.cc:221)
==13412== by 0x401E1C: main (in /usr/local/isip/tools/ifc/class/system/SysString/SysString.exe)
==13412== If you believe this happened as a result of a stack
==13412== overflow in your program's main thread (unlikely but
==13412== possible), you can try to increase the size of the
==13412== main thread stack using the --main-stacksize= flag.
==13412== The main thread stack size used in this run was 16777216.
==13412==
==13412== For counts of detected and suppressed errors, rerun with: -v

Why am I getting a segmentation fault in case 3?

tnx

paulsm4

07-25-2010 08:40 PM

Hi -

As I'm sure you know, just because some code happens to run without crashing, doesn't necessarily mean that code is "correct". There could have been a latent bug there since Day One.

On the other hand (as Valgrind is reporting), maybe you're getting a stack overflow. Certainly worth instrumenting and looking for.

It looks like the code in question is trying to emulate Windows MFC functionality (which, itself, is probably fraught with danger ;)).

STRONG SUGGESTION:
1. See if you can reproduce the problem with "-g"

2. If so, see if you can troubleshoot whether your input values are correct and your data structures are uncorrupted, and your stack OK under GDB.

3. You might also be interested in using libsigsegv for your troubleshooting:

http://savannah.gnu.org/projects/libsigsegv/

'Hope that helps .. PSM

Sergei Steshenko

07-25-2010 08:51 PM

Quote:

Originally Posted by amir1981 (Post 4045228)

If not yet, always use

-Wall -Wextra -Wformat=2

during compilation - sometimes warnings produced by the compiler give the clue.

amir1981

07-25-2010 08:55 PM

Quote:

Originally Posted by paulsm4 (Post 4045237)

Hey,
Actually, I have already tried different stack sizes and it does not help.
This code is part of a bigger and relatively complex system
:D

I have compiled with -g and this is the result from gdb:
(gdb) r
Starting program: /usr/local/isip/tools/ifc/class/system/SysString/SysString.exe
diagnosing class SysString:
testing required public methods...
<SysString::str1> value_d = (16 >= 16) "hello my name is"
<SysString::str2> value_d = (4 >= 4) "rjck"
<SysString::str3> value_d = (100 >= 0) ""
<SysString::str4> value_d = (4 >= 4) "rjck"
testing class-specific public methods: extensions to required methods...

Program received signal SIGSEGV, Segmentation fault.
SysString::clear (this=0x7f00000000, cmode_a=Integral::RESET) at sstr_03.cc:1623
1623 if (capacity_d > 0) {
Current language: auto; currently c++
(gdb)

I will work with libsigsegv to see if it can help or not!!!

anyway, thanks for the reply

amir1981

07-25-2010 08:57 PM

Quote:

Originally Posted by Sergei Steshenko (Post 4045240)

If not yet, always use

-Wall -Wextra -Wformat=2

during compilation - sometimes warnings produced by the compiler give the clue.

I have only used -Wall so far, but thanks for the reminder. I will add -Wextra -Wformat=2 too; maybe some useful information will come out.

johnsfine

07-25-2010 09:36 PM

Quote:

Originally Posted by amir1981 (Post 4045228)

2-If I use gcc 4.1.0 on 64-bit Machine everything is working
3-If I use gcc 4.4.4 on 64-bit Machine Segmentation Fault would occur! (Optimization O2)
4-If I use gcc 4.4.4 on 64- bit Machine with -O everything works!!

A significant fraction of problems posted with descriptions like that turn out to be strict-aliasing problems.

To find out whether it is a strict-aliasing problem, replace the -O2 option with
-O2 -fno-strict-aliasing

If that fixes it, the problem was probably strict-aliasing (though that wouldn't be certain). If -fno-strict-aliasing doesn't fix the problem, then the problem definitely wasn't strict-aliasing.

If the problem is strict-aliasing, it is best to find and fix that error in each place where it occurs in your code. But in large old programs that usually isn't practical, so -fno-strict-aliasing becomes a long-term part of your compile command.
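
For readers unfamiliar with the term, here is a minimal sketch of a strict-aliasing violation and the usual fix. The function name is illustrative and nothing here is taken from the poster's code; the expected bit pattern in the comment assumes IEEE 754 floats.

```cpp
#include <cstdint>
#include <cstring>

// The violating form, shown only as a comment:
//
//   std::uint32_t bits = *reinterpret_cast<std::uint32_t*>(&f);
//
// It reads a float object through an lvalue of an unrelated type.
// With -fstrict-aliasing (enabled at -O2) gcc may assume a float and
// a uint32_t never share storage and reorder or drop such accesses.

// Well-defined alternative: copy the bytes. Compilers typically turn
// this memcpy into a single register move.
inline std::uint32_t float_bits(float f) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;  // 0x3f800000 for 1.0f on IEEE 754 platforms
}
```

Code that type-puns through pointer casts like the commented line is what -fno-strict-aliasing papers over; the memcpy form needs no special flag.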

amir1981

07-25-2010 09:44 PM

Quote:

Originally Posted by johnsfine (Post 4045262)

Unfortunately that is not the case; the code is already compiled with -fno-strict-aliasing.

johnsfine

07-25-2010 10:11 PM

Quote:

Originally Posted by amir1981 (Post 4045242)

Program received signal SIGSEGV, Segmentation fault.
SysString::clear (this=0x7f00000000, cmode_a=Integral::RESET) at sstr_03.cc:1623
1623 if (capacity_d > 0) {

Is capacity_d a member variable of SysString? Otherwise what is it?

Seg faults should be pretty easy to understand when you catch them this way in GDB. The this pointer 0x7f00000000 looks a little improbable, but not definitely wrong. GDB commands can be used to examine the *this object and/or the contents of memory at 0x7f00000000 to see whether that pointer is wrong.

I don't know whether your Valgrind results were run with the same addresses used as your GDB results. The faulting address reported by Valgrind, 0x700000008, seems quite unlikely for that line of code (a simple read of capacity_d) and the GDB-reported value of the this pointer.

If you post a bit more of the source of SysString::clear, that might make the problem obvious.

If you know any asm, it is very effective to look at some disassembly and register values in GDB at the point of the seg fault.

The seg fault means some address was bad. You need to figure out what address was bad and what the code was supposed to be doing with that address and why it had a wrong value instead. All that should be pretty easy to find in GDB at the point of the seg fault.

amir1981

07-25-2010 11:51 PM

Quote:

Originally Posted by johnsfine (Post 4045282)

capacity_d is a member variable of SysString.
Here is a snippet of the code:

Code:

// method: clear
//
// arguments:
//  Integral::CMODE cmode: (input) clear mode
//
// return: a bool8 value indicating status
//
// this method clears the string
//
bool8 SysString::clear(Integral::CMODE cmode_a) {

  // for release and free, ensure that memory is actually deleted
  //
  if (cmode_a >= Integral::RELEASE) {

    // delete all memory associated with this string
    //
    freeMem();

    // assign null string to class data
    //
    allocateMem();
  }

  // for reset and retain, just clear value
  //
  else {

    // make the string a zero-length string.
    //
    if (capacity_d > 0) {
      value_d[0] = (unichar)NULL;
    }
  }

  // exit gracefully
  //
  return true;
}

Code:

// --------------------------------------------------------------
// these two methods have to be in the same file so they get the
// static constant NULL_STRING with the exact same address
// --------------------------------------------------------------

// method: allocateMem
//
// arguments: none
//
// return: a boolean value indicating status
//
// this method allocates memory for the string
//

bool8 SysString::allocateMem() {

  // either allocate the memory or assign it to the static null string
  //
  if (capacity_d > 0) {
    if (value_d != (unichar*)NULL) {
      return Error::handle(name(), L"allocateMem", Error::MEM,
                          __FILE__, __LINE__);
    }

    // allocate and initialize the memory
    //
    value_d = new unichar[capacity_d + 1];
    value_d[0] = (unichar)NULL;
  }
  else {
    if (value_d != (unichar*)NULL) {
      return Error::handle(name(), L"allocateMem", Error::MEM,
                          __FILE__, __LINE__);
    }

    // reset the capacity
    //
    capacity_d = 0;

    // assign null string to class data
    //
    value_d = (unichar*)NULL_STRING;
  }

  // exit gracefully
  //
  return true;
}

// method: freeMem
//
// arguments: none
//
// return: a bool8 value indicating status
//
// this method deletes memory for the string
//
bool8 SysString::freeMem() {

  // possibly free memory associated with this string
  //
  if (capacity_d > 0) {
    if (value_d != (unichar*)NULL) {

      // release and initialize the pointer
      //
      delete [] value_d;
      value_d = (unichar*)NULL;
    }
    else {
      return Error::handle(name(), L"freeMem", Error::MEM, __FILE__, __LINE__);
    }
  }

  // if capacity is less than or equal to 0
  //
  else {
    if (value_d == (unichar*)NULL_STRING) {

      // initialize the ptr
      //
      value_d = (unichar*)NULL;
    }
    else {
      return Error::handle(name(), L"freeMem", Error::MEM, __FILE__, __LINE__);
    }

  }

  // reset the capacity
  //
  capacity_d = 0;

  // exit gracefully
  //
  return true;
}

// method: growMem
//
// arguments:
//  int32 new_size: (input) new size of memory required
//
// return: a bool8 value indicating status
//
// allocate more memory for the string, nondestructively
//
bool8 SysString::growMem(int32 new_size_a) {

  // see if we have to do anything
  //
  if (new_size_a <= capacity_d) {
    return true;
  }

  // if there is nothing in the old memory block, we don't have to
  // save any memory, just delete and reallocate
  //
  if (length() == 0) {
    freeMem();
    capacity_d = new_size_a;
    allocateMem();
  }
  else {

    // this is the only nontrivial case, we need to increase the size
    // of the memory without destroying existing contents. we do this
    // by creating a new string of the specified size, assigning this
    // new string to have our current value, and swapping the memory
    // pointers (so that the old buffer is deleted with the new
    // SysString)
    //
    SysString* tmp_str = new SysString(new_size_a);

    // assign tmp_str to have current values
    //
    tmp_str->assign(*this);

    // swap the memory pointers
    //
    swap(*tmp_str);

    delete tmp_str;
  }

  // exit gracefully
  //
  return true;
}

Actually the error tends to move, for example if I comment out some part of the code it would appear somewhere else!

output of gdb and backtrace:
(gdb) r
Starting program: /usr/local/isip/tools/ifc/class/system/SysString/SysString.exe
diagnosing class SysString:
testing required public methods...
<SysString::str1> value_d = (16 >= 16) "hello my name is"
<SysString::str2> value_d = (4 >= 4) "rjck"
<SysString::str3> value_d = (100 >= 0) ""
<SysString::str4> value_d = (4 >= 4) "rjck"
testing class-specific public methods: extensions to required methods...

Program received signal SIGSEGV, Segmentation fault.
SysString::clear (this=0x7f00000000, cmode_a=Integral::RESET) at sstr_03.cc:1623
1623 if (capacity_d > 0) {
Current language: auto; currently c++
(gdb) backtrace
#0 SysString::clear (this=0x7f00000000, cmode_a=Integral::RESET) at sstr_03.cc:1623
#1 0x000000000040b73e in SysString::assign (this=0x7f00000000, arg_a=27 '\033', fmt_a=<value optimized out>) at sstr_03.cc:838
#2 0x0000000000403475 in SysString::diagnose (level_a=<value optimized out>) at sstr_02.cc:221
#3 0x0000000000401e1d in main ()
(gdb)

zirias

07-26-2010 01:49 AM

Quote:

Originally Posted by amir1981 (Post 4045329)

#1 0x000000000040b73e in SysString::assign (this=0x7f00000000, arg_a=27 '\033', fmt_a=<value optimized out>) at sstr_03.cc:838

As john already mentioned, the this pointer looks VERY suspect to me. Although it lies inside the region where a heap object COULD be in x86_64 virtual address space, it would mean your heap was around 500 GB in size, and the lower 32 bits being all zeros is not very probable either ;)
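
The arithmetic behind that suspicion can be checked in a few lines. This is an illustrative sketch; the helper names are made up for the example and are not part of any library.

```cpp
#include <cstdint>

// Size, in GiB, of the address range [0, addr).
inline std::uint64_t gib_below(std::uint64_t addr) {
  return addr >> 30;
}

// True when the upper 32 bits are non-zero but the lower 32 bits are
// all zero, the pattern left behind when something overwrites half of
// a 64-bit pointer with zeros.
inline bool low_half_zeroed(std::uint64_t addr) {
  return (addr >> 32) != 0 && (addr & 0xffffffffULL) == 0;
}
```

For the value in the backtrace, gib_below(0x7f00000000) is 508 and low_half_zeroed(0x7f00000000) is true, while a typical x86_64 stack address such as 0x7fbfffe2a0 does not match the pattern.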

Try setting a breakpoint earlier and see where this pointer is coming from.

johnsfine

07-26-2010 06:07 AM

Quote:

Originally Posted by amir1981 (Post 4045329)

Actually the error tends to move, for example if I comment out some part of the code it would appear somewhere else!

But that isn't what you're showing this time, right? You're showing the error in the same place.

An error that moves like that, usually is a memory clobber bug: The code with the actual bug uses some memory that doesn't belong to it. Then the error appears when the section of code that does own that memory uses it.

A memory clobber bug usually needs to be backtracked in two stages. First you need to follow the bad value (the this pointer in your example) back to the memory location where it was clobbered. Then you need to restart and set a data breakpoint to catch the real bug (In GDB I don't know how, nor even the correct terminology. I'm usually chasing such bugs in Visual Studio).

The info you posted makes it much more likely that the this pointer is bad (otherwise GDB is wrong about the line number, which is possible, but less likely). You also showed that the this pointer came through SysString::assign. So you should be looking in SysString::assign, or more likely the code that called it, for the point where this got clobbered.

amir1981

07-26-2010 11:58 AM

I have found something that might be related to the problem :
If I use gdb and put a breakpoint just before the segmentation fault occurs in sstr_02.cc (at line 220), examine the value of "value_d" (value_d is a pointer to unichar), then step into the assign function and examine "value_d" again, I see this:
(gdb) r
Starting program: /usr/local/isip/tools/ifc/class/system/SysString/SysString.exe
testing class SysString
diagnosing class SysString:
testing required public methods...
<SysString::str1> value_d = (16 >= 16) "hello my name is"
<SysString::str2> value_d = (4 >= 4) "rjck"
<SysString::str3> value_d = (100 >= 0) ""
<SysString::str4> value_d = (4 >= 4) "rjck"
testing class-specific public methods: extensions to required methods...

Breakpoint 1, SysString::diagnose (level_a=<value optimized out>) at sstr_02.cc:221
(gdb) p num.value_d
$3 = (unichar *) 0x61fcb0
(gdb) s
SysString::assign (this=0x7f00000000, arg_a=27 '\033', fmt_a=0x41afd8) at sstr_03.cc:818
(gdb) p value_d
$4 = (unichar *) 0x0
(gdb)

On the 32-bit system it looks like this:
(gdb) r
Starting program: /home/amir/local/isip/tools/system-ifc/class/system/SysString/SysString.exe
testing class SysString
diagnosing class SysString:
testing required public methods...
testing class-specific public methods: extensions to required methods...

Breakpoint 1, SysString::diagnose (level_a=Integral::BRIEF) at sstr_02.cc:221
(gdb) p num.value_d
$3 = (unichar *) 0x81672e8 L"27"
(gdb) s
SysString::assign (this=0xbfffeb84, arg_a=27 '\033', fmt_a=0x8063ac0 L"asdf = %u xyz") at sstr_03.cc:828
(gdb) p value_d
$4 = (unichar *) 0x81672e8 L"27"
(gdb)

As you can see, for some reason "value_d" is NULL in the first case, which is wrong. How could this happen?

johnsfine

07-26-2010 01:17 PM

Quote:

Originally Posted by amir1981 (Post 4045932)

As you can see, for some reason "value_d" is NULL in the first case, which is wrong. How could this happen?

You haven't provided the kind of information that would allow us to determine that.

Meanwhile, there is something strange in what you just provided. Can you explain this:

In your 64 bit version line 221 in SysString::diagnose called a version of SysString::assign at line 818. But in your 32 bit version line 221 in SysString::diagnose called an apparently different version of SysString::assign at line 828.

If you don't have a good explanation for that, post the area around each of those lines (around 221 in sstr_02.cc as well as around 818 through 828 in sstr_03.cc).

amir1981

07-26-2010 01:29 PM

Quote:

Originally Posted by johnsfine (Post 4046007)

Hi thanks for the fast response :D what kind of details should I provide?
Here is the code:

bool8 SysString::assign(byte8 arg_a, const unichar* fmt_a){<---Line 818

// allocate a static buffer for printing
//
static char buf[MAX_LENGTH];
static char fmt[MAX_LENGTH];
static char* fmt_ptr;

// check the arguments
//
if (fmt_a == (unichar*)NULL) { <---- Line 828
return Error::handle(name(), L"assign", Error::ARG, __FILE__, __LINE__);
}

SysString temp(fmt_a);
temp.getBuffer((byte8*)fmt, MAX_LENGTH);
fmt_ptr = fmt;

// clear out the current value
//
clear(Integral::RESET);

// create and possibly assign the string
//
if (sprintf(buf, fmt_ptr, (uint32)arg_a) > 0) {
assign((byte8*)buf);
return true;
}

// exit gracefully
//
return false;
}

I think it is a gdb issue that shows line 828 instead of 818

johnsfine

07-26-2010 02:06 PM

OK, now I see I misunderstood GDB output regarding 818 vs. 828. That is just a difference in the optimizer behavior of the two compiles.

I don't know how much to trust GDB regarding the values of this and value_d when stopped at line 818. Generally I don't trust any implausible variable values reported by GDB. GDB and/or the compiler are not very good at tracking which variables are in which registers and/or stack locations at which lines of the source code.

zirias expressed the opinion (that I mostly share) that 0x7f00000000 is an unreasonable value for this. You told me that value_d is a member of SysString, so at line 818 value_d should be equivalent to this->value_d, which (assuming this is invalid) should have produced "Cannot access memory at address" rather than $4 = (unichar *) 0x0

If I were debugging it, I would poke around a bit more at that point to find out which, if any, of the apparently contradictory pieces of info represent the result of the bug you're looking for, vs. which represent wrong info displayed by GDB.

At 818 and maybe at an s further into that function, I would want to know what is:
this
&value_d
this->value_d
If those don't start to add up to something consistent, I'd look at disassembly of the code at that point and at register values and also try directly looking at memory at address 0x7f00000000

amir1981

07-26-2010 02:51 PM

Quote:

Originally Posted by johnsfine (Post 4046063)

Breakpoint 1, SysString::diagnose (level_a=<value optimized out>) at sstr_02.cc:221
221 num.assign(dbyte, L"asdf = %u xyz");
Current language: auto; currently c++
(gdb) p &num
$10 = (SysString *) 0x7fbfffe2a0

(gdb) p num.value_d
$1 = (unichar *) 0x61fcb0
(gdb) s
SysString::assign (this=0x7f00000000, arg_a=27 '\033', fmt_a=0x41afd8) at sstr_03.cc:818
818 bool8 SysString::assign(byte8 arg_a, const unichar* fmt_a) {
(gdb) p this
$2 = (SysString * const) 0x7f00000000
(gdb) p &value_d
$3 = (unichar **) 0x7f00000000
(gdb) p this->value_d
$4 = (unichar *) 0x0
(gdb) s
828 if (fmt_a == (unichar*)NULL) {
(gdb) p this
$5 = (SysString * const) 0x7f00000000
(gdb) p &value_d
$6 = (unichar **) 0x7f00000000
(gdb) p this->value_d
$7 = (unichar *) 0x0
(gdb)

Now what? Should not &num and this point to the same memory?

johnsfine

07-26-2010 02:58 PM

That's a bit of a surprise. 0x7f00000000 seems to be a valid address.

So that makes the original seg fault less plausible.

Your post #9 makes it look like there was a seg fault at

Code:

SysString::clear (this=0x7f00000000, cmode_a=Integral::RESET) at sstr_03.cc:1623

1623 if (capacity_d > 0) {

Is GDB wrong about the line number for that seg fault, or did 0x7f00000000 stop being a valid address by the time the code reached that point?

The latter should be easy to determine by just setting a breakpoint there and proceeding to it and seeing what this and capacity_d and &capacity_d all are.

Assuming GDB might be wrong about the line number of the seg fault, we'd also like to know what value_d is at that point.

Edit: Sorry, I wasn't thinking clearly. We already know value_d was bad before it reached there, so we can reasonably assume GDB is wrong about the line number of the seg fault and you need to look earlier not later to find out when/why value_d was clobbered.

amir1981

07-26-2010 03:05 PM

Quote:

Originally Posted by johnsfine (Post 4046111)

That's a bit of a surprise. 0x7f00000000 seems to be a valid address.

So that makes the original seg fault less plausible.

Your post #9 makes it look like there was a seg fault at

Code:

SysString::clear (this=0x7f00000000, cmode_a=Integral::RESET) at sstr_03.cc:1623

1623 if (capacity_d > 0) {

I don't know why, but seems suddenly the pointer to "num" object changes from its value to 0x7f00000000, because before calling "assign"
the address of num is 0x7fbfffe2a0 and just after calling it changes
What kind of things could do this?

johnsfine

07-26-2010 03:08 PM

Quote:

Originally Posted by amir1981 (Post 4045932)

Breakpoint 1, SysString::diagnose (level_a=<value optimized out>) at sstr_02.cc:221
(gdb) p num.value_d
$3 = (unichar *) 0x61fcb0
(gdb) s
SysString::assign (this=0x7f00000000, arg_a=27 '\033', fmt_a=0x41afd8) at sstr_03.cc:818
(gdb) p value_d
$4 = (unichar *) 0x0

I discounted the significance of the above earlier, because I didn't trust gdb.

Now, I assume you showed us that because you think num at line sstr_02.cc:221 is the same object as *this at line sstr_03.cc:818. So you should validate that by having gdb give the value of &num at line sstr_02.cc:221

Edit: you answered that while I was asking the question.

johnsfine

07-26-2010 03:12 PM

Quote:

Originally Posted by johnsfine (Post 4046007)

post the area around each of those lines (around 221 in sstr_02.cc

That seems to be the approximate location of the problem and you seem to have ignored my earlier request to post that code.

Quote:

Originally Posted by amir1981 (Post 4046116)

seems suddenly the pointer to "num" object changes from its value to 0x7f00000000, because before calling "assign"
the address of num is 0x7fbfffe2a0 and just after calling it changes
What kind of things could do this?

Notice the low 32 bits of the pointer became zero.

That is exactly the kind of bug typical of an error in porting 32 bit code to 64 bit.

This could easily be caused by the immediately preceding object in memory being stored as 64 bits into a 32 bit allocated space overwriting the next 32 bits with zero.

Note I mean the object preceding the pointer to num, not the object preceding num itself.

I could be a lot more specific if I saw all the code from the declaration of num through line 221.
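
The mechanism described above can be simulated without any undefined behaviour. The sketch below models a piece of stack as a byte array, with a 4-byte slot at offset 4 and an 8-byte saved pointer at offset 8, and then performs either a 4-byte or an 8-byte little-endian store into the 4-byte slot. The layout and the function names are assumptions made for illustration; this is not a claim about where the bug sits in the poster's code.

```cpp
#include <cassert>
#include <cstdint>

// Write the low `width` bytes of `value` to `dst` in little-endian
// order (the byte order of x86_64).
inline void store_le(unsigned char* dst, std::uint64_t value, int width) {
  assert(width >= 1 && width <= 8);
  for (int i = 0; i < width; ++i) {
    dst[i] = static_cast<unsigned char>(value >> (8 * i));
  }
}

// Read `width` little-endian bytes from `src`.
inline std::uint64_t load_le(const unsigned char* src, int width) {
  assert(width >= 1 && width <= 8);
  std::uint64_t v = 0;
  for (int i = 0; i < width; ++i) {
    v |= static_cast<std::uint64_t>(src[i]) << (8 * i);
  }
  return v;
}

// Returns the saved pointer as it reads back after `parsed` has been
// stored into the neighbouring 4-byte slot using `store_width` bytes.
inline std::uint64_t pointer_after_store(std::uint64_t saved_ptr,
                                         std::uint64_t parsed,
                                         int store_width) {
  unsigned char frame[16] = {0};
  store_le(frame + 8, saved_ptr, 8);         // 8-byte pointer at offset 8
  store_le(frame + 4, parsed, store_width);  // store aimed at offset 4
  return load_le(frame + 8, 8);
}
```

With the values from this thread, pointer_after_store(0x7fbfffe2a0, 27, 4) returns the pointer unchanged, while a store width of 8 returns 0x7f00000000: the upper four zero bytes of the 64-bit value 27 land on the low half of the pointer.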

amir1981

07-26-2010 03:14 PM

Quote:

Originally Posted by johnsfine (Post 4046119)

on 32-bit machine:

Breakpoint 1, SysString::diagnose (level_a=Integral::BRIEF) at sstr_02.cc:221
221 num.assign(dbyte, L"asdf = %u xyz");
(gdb) p &num
$3 = (SysString *) 0xbfffebc4
(gdb) s
SysString::assign (this=0xbfffebc4, arg_a=27 '\033',
fmt_a=0x8063ac0 L"asdf = %u xyz") at sstr_03.cc:828
828 if (fmt_a == (unichar*)NULL) {
(gdb) p this
$4 = (SysString * const) 0xbfffebc4
(gdb)

so &num and this are the same

on 64bit:

Breakpoint 1, SysString::diagnose (level_a=<value optimized out>) at sstr_02.cc:221
221 num.assign(dbyte, L"asdf = %u xyz");
(gdb) p &num
$15 = (SysString *) 0x7fbfffe2a0
(gdb) s
SysString::assign (this=0x7f00000000, arg_a=27 '\033', fmt_a=0x41afd8) at sstr_03.cc:818
818 bool8 SysString::assign(byte8 arg_a, const unichar* fmt_a) {
(gdb) p this
$16 = (SysString * const) 0x7f00000000
(gdb)

This shows that the address for num has changed , right?

johnsfine

07-26-2010 03:20 PM

Quote:

Originally Posted by amir1981 (Post 4046123)

This shows that the address for num has changed , right?

Right. I already concluded that in posts above. Reread above to see what you missed while you were constructing that post.

amir1981

07-26-2010 03:21 PM

Quote:

Originally Posted by johnsfine (Post 4045557)

this is that part of the code:

Code:

  // make sure that an empty string fails
  //
  SysString num;
  int32 i;
  if (num.get(i)) {
    return Error::handle(name(), L"get", Error::TEST, __FILE__, __LINE__);
  }

  // make sure that non numeric types fail for get
  //
  num.assign(L"abc");
  if (num.get(i)) {
    return Error::handle(name(), L"get", Error::TEST, __FILE__, __LINE__);
  }

  // make sure that strings greater than length 1 fail for a get on SysChar
  //
  SysChar tmp_char;
  if (num.get(tmp_char)) {
    return Error::handle(name(), L"get", Error::TEST, __FILE__, __LINE__);
  }

  // setup temporary variables
  //
  byte8 dbyte = (int32)27;
  byte8 dbyte_v;

  int16 dshort = (int32)27;
  int16 dshort_v;

  int32 dlong = (int32)277;
  int32 dlong_v;

  int64 dllong = (int64)13020756033LL;
  int64 dllong_v;

  uint16 dushort = (uint16)6907;
  uint16 dushort_v;

  uint32 dulong = (uint32)2777;
  uint32 dulong_v;

  uint64 dullong = (uint64)1302075603332LL;
  uint64 dullong_v;

  float32 dfloat = (float32)27.27e-19;
  float32 dfloat_v;

  float64 ddouble = (float64)272727272727.272727e+20;
  float64 ddouble_v;

  bool8 dboolean = true;
  bool8 dboolean_v;

  void* dvoidp = (void*)27;
  void* dvoidp_v;

  // test the byte conversions
  //
  num.assign(dbyte);
  num.get(dbyte_v);

  if (dbyte != dbyte_v) {
    Error::handle(name(), L"assign(byte8)", Error::TEST, __FILE__, __LINE__);
  }

  num.assign(dbyte, L"asdf = %u xyz");

johnsfine

07-26-2010 03:24 PM

That makes it look like the call to num.get(dbyte_v); did the harm.

Is the source code to get posted yet?

num is a local variable in the current stack frame. GDB deduces its address (at line 221) from the rbp register. That register must not be corrupted or gdb would be totally confused at that point.

The only way to get this symptom is if the optimizer had put the address of num into a callee saved register (because it is used so much) rather than recompute it from rbp each time. That's strange because recomputing it from rbp is nearly free compared to simply copying it into rdi (all that would make sense if you knew x86_64 asm).
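
A sketch of how a string-to-number helper could do that kind of harm, and a width-safe way to write one. The thread does not show what the project's format macros expand to, so the hazard in the comment is hypothetical: it assumes a conversion such as "%lu" paired with a pointer to a 32-bit variable. parse_byte is an illustrative name, not a function from the poster's code.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Hazard, shown only as a comment:
//
//   std::uint32_t val = 0;
//   std::sscanf(text, "%lu", &val);
//
// "%lu" tells sscanf to store an unsigned long. On 32-bit Linux that
// is 4 bytes and the call happens to work; on x86_64 Linux it is 8
// bytes, so the 4 bytes that follow val on the stack are overwritten.

// Width-safe alternative: SCNu32 from <cinttypes> always expands to
// the conversion that matches std::uint32_t on the current platform.
inline bool parse_byte(const char* text, std::uint8_t& out) {
  assert(text != nullptr);
  std::uint32_t tmp = 0;
  if (std::sscanf(text, "%" SCNu32, &tmp) != 1) {
    return false;  // no number at the start of the text
  }
  out = static_cast<std::uint8_t>(tmp);  // deliberate narrowing
  return true;
}
```

Compiling with -Wformat catches this kind of mismatch when the format is a string literal at the call site; a format passed through a cast or a variable is not checked.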

amir1981

07-26-2010 03:33 PM

Quote:

Originally Posted by johnsfine (Post 4046135)

That makes it look like the call to num.get(dbyte_v); did the harm.

Is the source code to get posted yet?

Here is the code for "get"

Code:

// method: get

//

// arguments:

//  void*& val: (output) pointer value

//

// return: a bool8 value indicating status

//

// this method converts the object into an address pointer

//

bool8 SysString::get(void*& val_a) const {



  // declare a null pointer

  //

  val_a = (void*)NULL;



  // check if string is null string

  //

  if (firstStr((unichar*)NULL_PTR) >= 0) {

    return true;

  }

    

  // use the 8-bit character conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_FMT_VOIDP_8BIT, &val_a) != 1) {

    return false;

  }



  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  bool8& val: (output) bool8 value

//

// return: a bool8 value indicating status

//

// this method converts the string into a bool8 value

//

bool8 SysString::get(bool8& val_a) const {



  // initialize the return value

  //

  bool8 status = false;

  

  // return bool8 true or false

  //

  if (eq((unichar*)BOOL_TRUE)) {

    val_a = true;

    status = true;

  }

  else if (eq((unichar*)BOOL_FALSE)) {

    val_a = false;

    status = true;

  }



  // exit gracefully

  //

  return status;

}



// method: get

//

// arguments:

//  byte8& val: (output) byte value

//

// return: a bool8 value indicating status

//

// this method converts the object into a ubyte8 integer

//

bool8 SysString::get(byte8& val_a) const {



  // read in the integer

  //

  uint32 val = 0;



  // use the 8-bit character conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_FMT_LONG_8BIT, &val) != 1) {

    return false;

  }



  // assign it to a byte8

  //

  val_a = (byte8)val;

  

  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  unichar& val: (output) unichar value

//

// return: a bool8 value indicating status

//

// this method converts the string into a unichar value

//

bool8 SysString::get(unichar& val_a) const {



  // if length is 1 then assign zeroth element to unichar

  //

  if (length() == 1) {

    val_a = value_d[0];

    return true;

  }



  // exit ungracefully

  //

  return false;

}



// method: get

//

// arguments:

//  uint16& val: (output) uint16 value

//

// return: a bool8 value indicating status

//

// this method converts the object into a uint16 integer

//

bool8 SysString::get(uint16& val_a) const {



  // declare local variables

  //

  uint32 tmp_val = 0;



  // use the 8-bit character conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_FMT_ULONG_8BIT, &tmp_val) != 1) {

    return false;

  }



  // set the output

  //

  val_a = tmp_val;



  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  uint32& val: (output) uint32 value

//

// return: a bool8 value indicating status

//

// this method converts the object into a uint32 integer

//

bool8 SysString::get(uint32& val_a) const {



  // declare local variables

  //

  val_a = 0;



  // use the 8-bit character conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_FMT_ULONG_8BIT, &val_a) != 1) {

    return false;

  }



  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  uint64& val: (output) uint64 value

//

// return: a bool8 value indicating status

//

// this method converts the object into a uint64 integer

//

bool8 SysString::get(uint64& val_a) const {



  // declare local variables

  //

  val_a = 0;



  // use the 8-bit character conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_FMT_ULLONG_8BIT, &val_a) != 1) {

    return false;

  }



  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  int16& val: (output) int16 value

//

// return: a bool8 value indicating status

//

// this method converts the object into a int16 integer

//

bool8 SysString::get(int16& val_a) const {



  // declare local variable

  //

  int32 tmp_val = 0;



  // use the 8-bit character conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_FMT_LONG_8BIT, &tmp_val) != 1) {

    return false;

  }



  // set the output

  //

  val_a = tmp_val;

  

  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  int32& val: (output) int32 value

//

// return: a bool8 value indicating status

//

// this method converts the object into a int32 integer

//

bool8 SysString::get(int32& val_a) const {



  // declare local variables

  //

  val_a = 0;



  // use the 8-bit character string conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_FMT_LONG_8BIT, &val_a) != 1) {

    return false;

  }



  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  int64& val: (output) int64 value

//

// return: a bool8 value indicating status

//

// this method converts the object into a int64 integer

//

bool8 SysString::get(int64& val_a) const {



  // initialize the output argument

  //

  val_a = 0;



  // use the 8-bit character conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_FMT_LLONG_8BIT, &val_a) != 1) {

    return false;

  }



  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  float32& val: (output) float32 value

//

// return: a bool8 value indicating status

//

// this method converts the object to a single precision floating point number

//

bool8 SysString::get(float32& val_a) const {



  // initialize the output argument

  //

  val_a = (float32)0.0;



  // use the 8-bit character conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_RFMT_FLOAT_8BIT, &val_a) != 1) {

    return false;

  }



  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  float64& val: (output) float64 value

//

// return: a bool8 value indicating status

//

// this method converts the object to a double precision floating point number

//

bool8 SysString::get(float64& val_a) const {



  // initialize the output argument

  //

  val_a = (float64)0.0;



  // use the 8-bit character conversion

  //

  if (sscanf((char*)(byte8*)(*this),

            (char*)DEF_RFMT_DOUBLE_8BIT, &val_a) != 1) {

    return false;

  }



  // exit gracefully

  //

  return true;

}



// method: get

//

// arguments:

//  SysComplex<TIntegral>& arg: (output) complex value

//

// return: a bool8 value indicating status

//

// this method converts the object to a complex number

// 

template <class TIntegral>

bool8 SysString::get(SysComplex<TIntegral>& arg_a) const {



  // declare local variable

  //

  SysString str(*this);

  str.trim();

  int32 imag_pos = str.firstChr(L'j');

  if ((imag_pos > 0) && (imag_pos != str.length() - 1)) {

    return Error::handle(name(), L"get", Error::ARG, __FILE__, __LINE__);

  }



  // if there is no 'j' in the string, convert directly

  //

  if (imag_pos < 0) {

    TIntegral val;

    str.get(val);

    arg_a = SysComplex<TIntegral>(val, 0);



    int32 pos = 0;

    int32 len = str.length();

    SysString num;



    // search for the characters '+' or '-' in the string

    //

    str.tokenize(num, pos, L'+');

    if (pos >= len - 1) {

      pos = 0;

      str.tokenize(num, pos, L'-');

    }



    // if '+' or '-' exists in the string

    //

    if (pos < len - 1 ) {

      str.debug(L"value");

      return Error::handle(name(), L"invalid format - complex numbers should be in the format: a+bj", Error::ARG, __FILE__, __LINE__);

    }

  }



  else {



    // declare local variable

    //

    int32 pos = 0;

    int32 len = str.length();

    bool8 isPositive = true;

    TIntegral val0, val1;

    SysString num;



    // search for the characters '+' or '-' in the string

    //

    str.tokenize(num, pos, L'+');

    if (pos >= len - 1) {

      pos = 0;

      str.tokenize(num, pos, L'-');

      isPositive = false;

    }



    // if '+' or '-' exists in the string

    //

    if (pos < len - 1 ) {



      // get the real part of the complex number

      //

      num.trim();

      num.get(val0);



      // set the appropriate sign for real part if necessary

      //

      if ((str(0) != num(0)) && (str(0) == '-')) {

        val0 = -val0;

      }



      // get the imaginary part of the complex number

      //

      pos++;

      str.tokenize(num, pos, L'j');

      num.trim();

      if (num.length() == 0) {

        val1 = 1;

      }

      else {

        if (!num.get(val1)) {

          str.debug(L"value");

          return Error::handle(name(), L"invalid format - complex numbers should be in the format: a+bj", Error::ARG, __FILE__, __LINE__);

        }

      }



      // set the corresponding sign for the imaginary part

      //

      if (!isPositive) {

        val1 = -val1;

      }



      // copy temporary complex number to output argument

      //

      arg_a = SysComplex<TIntegral>(val0, val1);

    }



    // only the real part or the imaginary part exists in the string

    //

    else {

      

      // delete the letter 'j'

      //

      str.deleteRange(imag_pos, 1);



      // get the imaginary part of the complex number

      //

      TIntegral val;

      str.trim();

      if (str.length() == 0 || str.eq(L"+")) {

        val = 1;

      }

      else if (str.eq(L"-")) {

        val = -1;

      }

      else {

        if (!str.get(val)) {

          str.debug(L"value");

          return Error::handle(name(), L"invalid format - complex numbers should be in the format: a+bj", Error::ARG, __FILE__, __LINE__);

        }

      }

      

      // copy temporary complex number to output argument

      //

      arg_a = SysComplex<TIntegral>(0, val);

    }

  }



  // exit gracefully

  //

  return true;

}

  

// explicit instantiations for complex types

//

template

bool8 SysString::get<float32>(SysComplex<float32>&) const;



template

bool8 SysString::get<float64>(SysComplex<float64>&) const;



template

bool8 SysString::get<int32>(SysComplex<int32>&) const;



// method: get

//

// arguments:

//  SysChar& val: (output) SysChar value

//

// return: a bool8 value indicating status

//

// this method converts the object into a SysChar

//

bool8 SysString::get(SysChar& val_a) const {



  // if length is 1 then assign the element to output SysChar

  //

  if (length() == 1) {

    return val_a.assign(value_d[0]);

  }



  // exit ungracefully

  //

  return false;

}

johnsfine

07-26-2010 03:39 PM

I'm pretty sure this is the bug:

Code:

  uint32 val = 0;

  // use the 8-bit character conversion
  //
  if (sscanf((char*)(byte8*)(*this),
            (char*)DEF_FMT_LONG_8BIT, &val) != 1) {

Does DEF_FMT_LONG_8BIT make its destination long, meaning 64 bits in a 64 bit build?

val is only 32 bits.

If you don't understand, post the definition of DEF_FMT_LONG_8BIT and I can explain more specifically.

In architectures where uint32 is the same size as long, this code works. In architectures where long is bigger, this code clobbers the low half of a register saved by the entry into this function.

It then returns to the caller with the wrong value in a register. Whether/how that matters depends on details of the optimization in that calling function. As I started to explain in post #24, I had deduced that the optimizer placed an extra copy of the address of num in a callee saved register that happens to be the register clobbered by the bug in get.
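To make the clobbering concrete, here is a minimal sketch that imitates it without invoking undefined behaviour. `Slots`, `simulate_lu_write` and `neighbor_clobbered` are hypothetical names, not part of the posted class. The sketch parses into a genuine `unsigned long` and then copies that many bytes over two adjacent 32-bit fields, which is roughly what a `%lu` conversion aimed at a 32-bit variable does to whatever sits next to it on a 64-bit build.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for two adjacent 32-bit slots in memory.
struct Slots {
  std::uint32_t val;       // the variable the caller meant to fill
  std::uint32_t neighbor;  // whatever happens to sit next to it
};

// Parse `text` with "%lu" into a real unsigned long (well defined), then
// copy sizeof(unsigned long) bytes over the slots, imitating what a
// mismatched sscanf("%lu", &some_uint32) would write.
inline Slots simulate_lu_write(const char* text) {
  Slots s = {0u, 0xDEADBEEFu};
  unsigned long wide = 0;
  int matched = std::sscanf(text, "%lu", &wide);
  assert(matched == 1);  // the sketch expects numeric input
  (void)matched;
  std::size_t n = std::min(sizeof(wide), sizeof(s));
  std::memcpy(&s, &wide, n);
  return s;
}

// True when the write spilled past `val` into the neighbouring slot.
inline bool neighbor_clobbered(const Slots& s) {
  return s.neighbor != 0xDEADBEEFu;
}
```

Where `long` is 32 bits wide the neighbour survives; where `long` is 64 bits wide it is overwritten, which matches the "works on 32-bit, breaks on 64-bit" pattern described above.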

amir1981

07-26-2010 03:48 PM

// constants: 8-bit version of the default format strings (for efficiency)
//
const char SysString::DEF_FMT_VOIDP_8BIT[] = "%p";
const char SysString::DEF_FMT_ULONG_8BIT[] = "%lu";

Could this be the problem? Your comment makes sense to me, but I wonder whether this could cause a segmentation fault.

amir1981

07-26-2010 03:51 PM

I think I can correct this part, and then I will report the result here. Thanks, your comment really makes sense now after some thinking ;)

johnsfine

07-26-2010 03:51 PM

Quote:

Originally Posted by amir1981 (Post 4046160)

In the case in question it seems to be using
DEF_FMT_LONG_8BIT, but you posted two other format strings.

Anyway, this is the problem. There is a similar problem in more than one of your overloads of get. So fix it in each place, not just in the one that is causing this specific seg fault.

johnsfine

07-26-2010 04:03 PM

Quote:

Originally Posted by amir1981 (Post 4046164)

I think I can correct this part

If you just need portability across modern 32 bit and 64 bit architectures, you can just drop the l from your 32 bit format strings: Use %u instead of %lu and use %d for signed 32 bit.
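A minimal sketch of that idea, assuming a C++11 compiler: instead of hand-picking `%u` or `%d`, the `SCNu32` and `SCNd32` macros from `<cinttypes>` expand to whatever conversion matches the 32-bit types on the current platform. `parse_u32` and `parse_i32` are hypothetical helper names, and `std::uint32_t` / `std::int32_t` stand in for the thread's `uint32` / `int32` typedefs.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Read an unsigned 32-bit value; the format macro always matches the type.
inline bool parse_u32(const char* text, std::uint32_t& out) {
  assert(text != nullptr);
  std::uint32_t tmp = 0;
  if (std::sscanf(text, "%" SCNu32, &tmp) != 1) {
    return false;
  }
  out = tmp;
  return true;
}

// Read a signed 32-bit value the same way.
inline bool parse_i32(const char* text, std::int32_t& out) {
  assert(text != nullptr);
  std::int32_t tmp = 0;
  if (std::sscanf(text, "%" SCNd32, &tmp) != 1) {
    return false;
  }
  out = tmp;
  return true;
}
```

On the usual 32-bit and 64-bit Linux targets these macros expand to plain `u` and `d`, so the effect is the same as dropping the `l` by hand.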

amir1981

07-26-2010 04:28 PM

Quote:

Originally Posted by johnsfine (Post 4046175)

If you just need portability across modern 32 bit and 64 bit architectures, you can just drop the l from your 32 bit format strings: Use %u instead of %lu and use %d for signed 32 bit.

Thanks a lot. That was the problem. As a workaround I now read from sscanf into a long variable, but I'm looking for a cleaner solution that would also work on, for example, 128-bit systems.
Do you think that if I remove all the 'l's from the format strings, I can be confident this code will not break again for this reason?

Sergei Steshenko

07-26-2010 04:51 PM

Quote:

Originally Posted by amir1981 (Post 4046196)

Thanks a lot. That was the problem. As a workaround I now read from sscanf into a long variable, but I'm looking for a cleaner solution that would also work on, for example, 128-bit systems.
Do you think that if I remove all the 'l's from the format strings, I can be confident this code will not break again for this reason?

I would go the other way round in order to make my code portable. I.e. I am using the following idiom:

Code:

printf("some_int_var=%lld\n", (long long)some_int_var);

- this works regardless of 32/64 bits and regardless of actual integer type of some_int_var - by construction.
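As a hedged illustration of this idiom, the sketch below widens any signed integer to `long long` and always prints it with `%lld`. `format_signed` is a hypothetical helper that writes into a string rather than to stdout so the result can be inspected; unsigned values above `LLONG_MAX` would need `unsigned long long` and `%llu` instead.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Widen any signed integer type to long long and print it with %lld.
// The cast makes the argument type match the format on every platform.
template <typename Int>
std::string format_signed(const char* name, Int value) {
  char buf[96];
  int n = std::snprintf(buf, sizeof(buf), "%s=%lld", name,
                        static_cast<long long>(value));
  assert(n >= 0);  // snprintf reports encoding errors with a negative value
  (void)n;
  return std::string(buf);
}
```

Note that the cast only helps on the output side: scanf-style functions take pointers, and casting a pointer does not change the size of the object it points to.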

amir1981

07-26-2010 04:59 PM

Quote:

Originally Posted by Sergei Steshenko (Post 4046210)

I would go the other way round in order to make my code portable. I.e. I am using the following idiom:

Code:

printf("some_int_var=%lld\n", (long long)some_int_var);

- this works regardless of 32/64 bits and regardless of actual integer type of some_int_var - by construction.

Thanks, but I think the situation here is a bit different. For example, look at this:
bool8 SysString::get(int16& val_a) const {

  // declare local variable
  //
  int32 tmp_val = 0;

  // use the 8-bit character conversion
  //
  if (sscanf((char*)(byte8*)(*this),
            (char*)DEF_FMT_LONG_8BIT, &tmp_val) != 1) {
    return false;
  }

  // set the output
  //
  val_a = tmp_val;

  // exit gracefully
  //
  return true;
}

In general, I want my variables to have a fixed number of bits, so that they are exactly the same on all machines. For the formats we have:
const char SysString::DEF_FMT_LONG_8BIT[] = "%ld";

Now if I change the format to, for example, this:
const char SysString::DEF_FMT_LONG_8BIT[] = "%d";

Can I expect the same result on, for example, 128-bit machines too? I mean, is it likely that the meaning of "%d" will change again?

johnsfine

07-26-2010 05:08 PM

Quote:

Originally Posted by amir1981 (Post 4046214)

thanks but I think here the situation is a bit different

Only because you are enforcing your own rules beyond where they should apply.

In your design, you want val_a to be an explicit size independent of architecture. That makes sense.

But you are being too strict in deciding tmp_val is also an explicit size independent of architecture. tmp_val exists only to interface to sscanf. sscanf does not work with explicit sizes independent of architecture.

So you should change your objective to just make sure tmp_val is at least as big as val_a. There are lots of ways of doing that while declaring tmp_val as some architecture specific size that is compatible with sscanf.
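One way to do that, sketched under assumptions: the temporary is declared as `long`, exactly the type `%ld` expects, and only the final assignment narrows to the fixed-size output. `parse_i16` is a hypothetical free function standing in for the `get(int16&)` overload, and the range check is an addition that the posted code does not have.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <limits>

// Read into a long (the type "%ld" writes), then narrow to 16 bits.
inline bool parse_i16(const char* text, std::int16_t& out) {
  assert(text != nullptr);
  long tmp = 0;  // architecture-specific size, but always correct for %ld
  if (std::sscanf(text, "%ld", &tmp) != 1) {
    return false;
  }
  // Assumption: out-of-range input is treated as a failure instead of
  // being silently truncated.
  if (tmp < std::numeric_limits<std::int16_t>::min() ||
      tmp > std::numeric_limits<std::int16_t>::max()) {
    return false;
  }
  out = static_cast<std::int16_t>(tmp);
  return true;
}
```

The fixed-size guarantee stays on `val_a`, where it matters, while the temporary follows whatever size sscanf needs.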

Mainly your problem is wrapped up in your choice of using sscanf at all. This is C++ code. sscanf is a lame holdover from C. If you were using some kind of stringstream as the text side source and using operator>> instead of sscanf then the operator overloading of streams would fit the format to the destination automatically rather than requiring all this work on your part to do so.
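A minimal sketch of the stream-based alternative, assuming the text is available as a `std::string` (the posted `SysString` class would need its own conversion). `parse_value` is a hypothetical name; `operator>>` is selected from the destination type, so there is no format string to get wrong.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Convert text to any type that has a stream extraction operator.
// The overload of operator>> is chosen by the compiler from T.
template <typename T>
bool parse_value(const std::string& text, T& out) {
  std::istringstream in(text);
  T tmp{};
  if (!(in >> tmp)) {
    return false;  // no conversion was possible
  }
  out = tmp;
  return true;
}
```

One caveat: 8-bit integer types are typically character types, so `operator>>` would read a single character into them rather than a number; those outputs would need a wider temporary.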

I have no idea what a 128 bit architecture would look like. Lots of different things are different sizes in each architecture. But the virtual address size has been the primary driver of the size naming of architectures.

You might think that the exponential growth from 16 bit virtual addresses to 32 bit virtual addresses to 64 bit virtual addresses would logically continue to 128 bit. But it won't:

16 bit virtual addresses were already too small when 16 bit x86 was introduced and were horribly too small by the time 32 bit x86 was introduced.

32 bit virtual addresses were plenty large enough when introduced and were still mostly large enough when 64 bit was introduced.

32 bits were closer to adequate when 64 bits were introduced than 16 bits were when 16 bit x86 itself was introduced. In that sense the available addressing doubled twice while the required addressing only really doubled once.

Then the exponential growth in problem size just needs an exponential growth in memory, which is only a linear growth in address size. So the jump from 32 to 64 was, in another way, twice the jump from 16 to 32. So 64 bit addressing should be plenty for at least four times longer than 32 bit addressing was plenty.

So I think you're trying too hard to guess distant future portability issues.

amir1981

07-26-2010 05:21 PM

Quote:

Originally Posted by johnsfine (Post 4046223)

You're right, but this code was not written by me; I am just trying to update it. Maybe I can replace sscanf too; I think that makes sense.
Anyway, thanks a lot. All of you, especially john, were very helpful.

Sergei Steshenko

07-26-2010 05:31 PM

Quote:

Originally Posted by amir1981 (Post 4046214)

First of all, I agree with http://www.linuxquestions.org/questi...ml#post4046223 .

Secondly, you need fixed sizes only if/when you deal with HW generated data, e.g. if/when you deal with, say, Ethernet packet. I don't see a case when one needs "bool8" (taken from http://www.linuxquestions.org/questi...ml#post4046144 ).

I.e. I would let the compiler choose the width of the 'bool' type.

Anyway, if you want to complicate things, you still can use constructs like

Code:

if(sizeof(my_int_var) == sizeof(int))
  {
  printf("my_int_var=%d\n", (int)my_int_var);
  }
else if(sizeof(my_int_var) == sizeof(long))
  {
  printf("my_int_var=%ld\n", (long)my_int_var);
  }

- in such cases 'sizeof' is known at compile time, so all the unnecessary 'if's will be optimized out at compile time.

This can be scripted (i.e. C++ code can be generated by a script) and can probably be implemented through templates.
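A rough sketch of the compile-time variant, using plain function overloading plus one small template instead of `sizeof` tests. `print_format` and `format_int` are hypothetical names. The overload chosen for a given argument returns the conversion that matches its type after the usual integer promotions.

```cpp
#include <cstdio>
#include <string>

// One overload per standard integer type; the compiler picks the match.
inline const char* print_format(int)                { return "%d"; }
inline const char* print_format(long)               { return "%ld"; }
inline const char* print_format(long long)          { return "%lld"; }
inline const char* print_format(unsigned int)       { return "%u"; }
inline const char* print_format(unsigned long)      { return "%lu"; }
inline const char* print_format(unsigned long long) { return "%llu"; }

// Format any integer with the conversion that belongs to its own type.
// Narrower types (short, char) promote to int both for overload
// resolution and when passed through the variadic call, so "%d" matches.
template <typename Int>
std::string format_int(Int value) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), print_format(value), value);
  return std::string(buf);
}
```

A typedef such as `uint32` resolves to one of these built-in types on every platform, so the same call site keeps working when the typedef's underlying type changes.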

Still, rethink the whole issue of imposed size variables.

acvoight

07-26-2010 06:07 PM

Edit - Misread (can't delete?)
