I'm having yet another big problem in my interpreter project that I'm unable to figure out. In this function, the call to map.find() causes a segfault:
Code:
Table* Table::findParentWithVar(char *name)
{
    if (map.find(name) != map.end())  // <-- segfault here
        return this;
    std::vector<Table*>::iterator i;
    for (i=parents.begin(); i!=parents.end(); i++) {
        Table *temp = (*i)->findParentWithVar(name);
        if (temp)
            return temp;
    }
    return NULL;
}
I don't have the slightest clue why. Items were successfully added to the same map before, and it couldn't be a NULL pointer problem because the map is stored directly in the class, not as a pointer.
You have to check your assumptions. For example, 'map.find(name)' looks like a function call. If so, then check the function address at a place where you are sure it is correct (e.g. just after creating the corresponding class instance) and just before the segfault. The addresses should be the same.
With GDB you should be able to see exactly where the seg fault occurs.
Are you using an optimized build or an ordinary debug build?
You told us where the seg fault occurs, but I don't know how exactly you checked that. Is it on the actual code you indicated or is it in some function called by that code?
In an optimized build, some functions called by that code might be inlined, so bugs such as a bad name pointer would seg fault apparently in that code rather than in called code.
If it is on that actual code, that should indicate that the this pointer is bad. If find were a virtual function (which I expect it isn't) then a vtable pointer might be bad or the find pointer in a vtable might be bad. But even the symptom of a bad vtable pointer is more likely caused by a bad this pointer.
I always look at the asm code around the point of the seg fault. You should learn how to display that asm code in GDB. If you don't know enough asm to learn anything from that asm code, you can post it here for help.
In x86_64, to understand this kind of seg fault, you also need to look at the general registers: rdi, rsi, rax, etc.
In x86 (32 bit) you need to look at the top several values on the stack.
As Sergei mentioned, re-verify your assumptions; for example, you should check the iterator value to ensure that it is not pointing to NULL or perhaps to the address of a Table object that may have been previously deleted.
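The check suggested above could look roughly like the sketch below. It is not the poster's code: this Table is a hypothetical stand-in that maps std::string to int, and addParent() and set() are made-up helpers. Note that the check only catches a NULL entry; a pointer to a Table that was already deleted is not NULL and passes it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the thread's Table class. The real one maps
// char* to lang::LangObject* with a custom comparator.
class Table {
public:
    void set(const std::string& name, int value) { vars[name] = value; }
    void addParent(Table* parent) { parents.push_back(parent); }

    // Returns the first table (this one, then its parents, depth first)
    // that defines `name`, or NULL if none does.
    Table* findParentWithVar(const std::string& name) {
        if (vars.find(name) != vars.end())
            return this;
        for (std::vector<Table*>::iterator i = parents.begin();
             i != parents.end(); ++i) {
            // Debug builds stop here on a NULL parent; release builds
            // (NDEBUG) skip the entry instead of dereferencing it.
            assert(*i != NULL && "NULL parent in Table");
            if (*i == NULL)
                continue;
            Table* found = (*i)->findParentWithVar(name);
            if (found)
                return found;
        }
        return NULL;
    }

private:
    std::map<std::string, int> vars;
    std::vector<Table*> parents;
};
```

A dangling parent pointer cannot be detected this way, so for that case a memory checker or the reference counting discussed later in the thread is the better tool.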
Quote:
With GDB you should be able to see exactly where the seg fault occurs.
Are you using an optimized build or an ordinary debug build?
You told us where the seg fault occurs, but I don't know how exactly you checked that. Is it on the actual code you indicated or is it in some function called by that code?
I found the segfault using GDB and it was compiled with optimization turned off.
Here's something I just found out:
It was crashing when interpreting this AST:
Code:
new CallNode(new MemberNode(new IntegerNode(5) , "print"), new NodeCallParamList())
(MemberNode calls Table::get, which calls Table::getParentWithVar, which calls map::find(), and that's where the segfault happens)
But it doesn't crash with this one, which contains the part that crashes in the previous AST:
Code:
new MemberNode(new IntegerNode(5) , "print")
The problem is that I don't see anything wrong with CallNode that would cause this. All it does is call node->eval(scope) just like main() calls it in the second example.
Quote:
Originally Posted by johnsfine
In an optimized build, some functions called by that code might be inlined, so bugs such as a bad name pointer would seg fault apparently in that code rather than in called code.
If it is on that actual code, that should indicate that the this pointer is bad. If find were a virtual function (which I expect it isn't) then a vtable pointer might be bad or the find pointer in a vtable might be bad. But even the symptom of a bad vtable pointer is more likely caused by a bad this pointer.
I always look at the asm code around the point of the seg fault. You should learn how to display that asm code in GDB. If you don't know enough asm to learn anything from that asm code, you can post it here for help.
In x86_64, to understand this kind of seg fault, you also need to look at the general registers: rdi, rsi, rax, etc.
In x86 (32 bit) you need to look at the top several values on the stack.
I once played around with x86 (not x86_64, but it should be similar enough) assembler, but not very much, and I have forgotten a lot of it by now. And I don't know how to make GDB print it out.
Quote:
The problem is that I don't see anything wrong with CallNode that would cause this. All it does is call node->eval(scope) just like main() calls it in the second example.
Same here... I don't see anything wrong with CallNode, although I must admit my judgement is biased because I have no idea what you have implemented in that class' constructor.
It seems that you are using a variant of smart-pointers, however they don't seem so "smart" if you must call getref() and putref() when you want to increase and decrease the number of references. Is this a home-made smart-pointer class that you are using?
In your function eval(), you never check to see if the pointer to 'l' is valid. It seems like you are biased against developing "safe" code. This lax attitude may very well have placed you into your current predicament of checking for a NULL pointer or a memory corruption error.
Anyhow, consider using Boost's shared pointer; it is a lot easier to use than what you have.
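To illustrate what "easier to use" means here, below is a minimal sketch using std::shared_ptr, the C++11 standard-library counterpart that was modelled on boost::shared_ptr (with GCC 4.5 it needs -std=c++0x; boost::shared_ptr behaves the same way for this purpose). LangValue and useCountTrace() are made-up names for the example, not anything from the poster's project.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical object type; stands in for whatever the interpreter stores.
struct LangValue {
    int number;
    explicit LangValue(int n) : number(n) {}
};

// Records the reference count before, during and after a second owner
// exists. No getref()/putref() calls appear anywhere: copying the
// shared_ptr adds an owner, destroying a copy removes one.
inline std::vector<long> useCountTrace() {
    std::vector<long> trace;
    std::shared_ptr<LangValue> first = std::make_shared<LangValue>(5);
    assert(first->number == 5);
    trace.push_back(first.use_count());      // one owner
    {
        std::shared_ptr<LangValue> second = first;
        trace.push_back(first.use_count());  // two owners
    }                                        // `second` released here
    trace.push_back(first.use_count());      // back to one owner
    return trace;
}
```

The object is deleted automatically when the last shared_ptr to it goes away, so a forgotten release or a double release cannot happen through this interface.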
Quote:
which calls Table::getParentWithVar, which calls map::find(), and that's where the segfault happens)
Is the contradiction to your earlier post that I marked in red a typo, or a change or what?
The part in purple marks where the inherent ambiguity of English (plus my inherent belief that PEBKAC is most likely) leaves me guessing at what you really saw.
When you want us to know what a gdb backtrace showed you, it is usually best to just issue the bt command in gdb and then copy/paste the result into a CODE block in your post.
Quote:
But it doesn't crash with this one, which contains the part that crashes in the previous AST:
That tells us a whole lot less than you might think it should.
Quote:
I once played around with x86 (not x86_64, but it should be similar enough)
Doesn't quite answer which architecture your current C++ code was compiled to. But if I see any disassembly, I'll know anyway.
Quote:
i don't know how to make GDB print it out.
Various forms of the disas command. With no parameters, that shows you disassembly (reconstructed, not original assembly) code for whatever gdb thinks is the current function. From that you should find a block from several instructions before the failure point through a few instructions after it.
The command inf r dumps the basic registers. For x86_64, that will include rax through rip that are interesting for C++ debugging, followed by a bunch of obscure registers only interesting for kernel debugging. For x86, the interesting ones are eax through eip, but most of the interesting stuff is usually on the stack rather than in registers.
The exact point of the failure is in the rip or eip register. It should be possible to match that against the addresses in disas output, but sometimes some further effort is required. I don't use gdb enough myself to know when to expect raw hex addresses (such as in rip or eip) vs. various symbol and offset forms in the bt output or the disas output. Usually I'd like to deal with all three together but one or more are in a different format requiring conversion.
Quote:
It seems that you are using a variant of smart-pointers, however they don't seem so "smart" if you must call getref() and putref() when you want to increase and decrease the number of references. Is this a home-made smart-pointer class that you are using?
In your function eval(), you never check to see if the pointer to 'l' is valid. It seems like you are biased against developing "safe" code. This lax attitude may very well have placed you into your current predicament of checking for a NULL pointer or a memory corruption error.
Anyhow, consider using Boost's shared pointer; it is a lot easier to use than what you have.
I have a RefcountObject class, and all objects that will be stored as variables in the interpreted language (and all members of those classes, unless the member's value won't be shared with other objects) are subclasses of it.
(I never learned it, but it might be a fun thing to try sometime)
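For readers who have not seen this kind of design, a base class like the one described might look roughly like the sketch below. Only the names RefcountObject, getref() and putref() come from the thread; the counter, the refcount() accessor, the Ref<T> wrapper and the Probe demo type are all assumptions made for illustration. The wrapper shows one way to keep the existing base class while avoiding hand-written getref()/putref() calls.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reconstruction of an intrusive reference-counted base class.
class RefcountObject {
public:
    RefcountObject() : refs(0) {}
    virtual ~RefcountObject() {}
    void getref() { ++refs; }
    // Drops one reference and deletes the object when none remain.
    void putref() {
        assert(refs > 0 && "putref() without matching getref()");
        if (--refs == 0)
            delete this;
    }
    long refcount() const { return refs; }
private:
    // Copying would duplicate the counter, so it is disabled (C++03 style).
    RefcountObject(const RefcountObject&);
    RefcountObject& operator=(const RefcountObject&);
    long refs;
};

// Small RAII handle: takes a reference on construction and copy, and
// releases it on destruction, so the calls cannot be forgotten.
template <typename T>
class Ref {
public:
    explicit Ref(T* p = NULL) : ptr(p) { if (ptr) ptr->getref(); }
    Ref(const Ref& other) : ptr(other.ptr) { if (ptr) ptr->getref(); }
    ~Ref() { if (ptr) ptr->putref(); }
    Ref& operator=(const Ref& other) {
        if (other.ptr) other.ptr->getref();  // acquire first: safe on self-assignment
        if (ptr) ptr->putref();
        ptr = other.ptr;
        return *this;
    }
    T* operator->() const { return ptr; }
    T* get() const { return ptr; }
private:
    T* ptr;
};

// Demo type that reports its own destruction through a flag.
struct Probe : RefcountObject {
    explicit Probe(bool* flag) : destroyed(flag) {}
    ~Probe() { *destroyed = true; }
    bool* destroyed;
};
```

With this in place, `Ref<Probe> a(new Probe(&flag));` holds one reference, copying `a` adds another, and the Probe is deleted when the last Ref goes out of scope.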
Quote:
Originally Posted by johnsfine
When you want us to know what a gdb backtrace showed you, it is usually best to just issue the bt command in gdb and then copy/paste the result into a CODE block in your post.
Code:
(gdb) run
Starting program: /home/michael/Projects/lang/build/src/lang
Creating integer 5
setting table entry "__internal_data__" to 0x602250
setting table entry "__op_plus__" to 0x602358
setting table entry "__op_minus__" to 0x602448
setting table entry "__op_times__" to 0x602538
setting table entry "__op_div__" to 0x602628
setting table entry "__comp_ne__" to 0x602718
setting table entry "print" to 0x602808
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bce7d6 in std::_Rb_tree<char*, std::pair<char* const, lang::LangObject*>, std::_Select1st<std::pair<char* const, lang::LangObject*> >, lang::CStringComparisonClass, std::allocator<std::pair<char* const, lang::LangObject*> > >::_M_lower_bound (
this=0x7ffff7bd2908, __x=0x8b48008b4820408b, __y=0x7ffff7bd2910, __k=@0x7fffffffd7c0)
at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_tree.h:1004
1004 if (!_M_impl._M_key_compare(_S_key(__x), __k))
(gdb) bt
#0 0x00007ffff7bce7d6 in std::_Rb_tree<char*, std::pair<char* const, lang::LangObject*>, std::_Select1st<std::pair<char* const, lang::LangObject*> >, lang::CStringComparisonClass, std::allocator<std::pair<char* const, lang::LangObject*> > >::_M_lower_bound (
this=0x7ffff7bd2908, __x=0x8b48008b4820408b, __y=0x7ffff7bd2910, __k=@0x7fffffffd7c0)
at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_tree.h:1004
#1 0x00007ffff7bd06f6 in std::_Rb_tree<char*, std::pair<char* const, lang::LangObject*>, std::_Select1st<std::pair<char* const, lang::LangObject*> >, lang::CStringComparisonClass, std::allocator<std::pair<char* const, lang::LangObject*> > >::find (
this=0x7ffff7bd2908, __k=@0x7fffffffd7c0)
at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_tree.h:1519
#2 0x00007ffff7bd0209 in std::map<char*, lang::LangObject*, lang::CStringComparisonClass, std::allocator<std::pair<char* const, lang::LangObject*> > >::find (this=0x7ffff7bd2908, __x=@0x7fffffffd7c0)
at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_map.h:697
#3 0x00007ffff7bd3036 in lang::Table::findParentWithVar (this=0x7ffff7bd28e8, name=0x602070 "print")
at /home/michael/Projects/lang/lib/table.cc:31
#4 0x00007ffff7bd3117 in lang::Table::get (this=0x7ffff7bd28e8, name=0x602070 "print") at /home/michael/Projects/lang/lib/table.cc:46
#5 0x00007ffff7bd2512 in lang::MemberNode::eval (this=0x602040, scope=0x602110) at /home/michael/Projects/lang/lib/nodes.cc:119
#6 0x00007ffff7bd2935 in lang::CallNode::eval (this=0x6020e0, scope=0x602110) at /home/michael/Projects/lang/lib/nodes.cc:186
#7 0x0000000000400bcc in main () at /home/michael/Projects/lang/src/main.cc:21
(gdb)
Quote:
Originally Posted by johnsfine
Various forms of the disas command. With no parameters, that shows you disassembly (reconstructed, not original assembly) code for whatever gdb thinks is the current function. From that you should find a block from several instructions before the failure point indicated by bt through a few instructions after it.
The command inf r dumps the basic registers. For x86_64, that will include rax through rip that are interesting for C++ debugging, followed by a bunch of obscure registers only interesting for kernel debugging. For x86, the interesting ones are eax through eip, but most of the interesting stuff is usually on the stack rather than in registers.
Quote:
in std::_Rb_tree<...>::_M_lower_bound(
this=0x7ffff7bd2908, __x=0x8b48008b4820408b, __y=0x7ffff7bd2910, __k=@0x7fffffffd7c0)
I don't have the internals of GNU Rb_tree either memorized or handy. If I debug one of these myself, I just look at disassembly and register values to be sure of what I can guess at from just the above.
I assume all four of the above items are pointers. this, __y and __k are valid pointers. __x is very much not a valid pointer.
I expect __x and __y are nodes within the existing map. this is the map itself and __k is the name being looked up.
So __x being bad implies the map was corrupt before you got into the code with the actual crash.
It's always harder to debug something where the crash occurs as an after effect of a previous silent bug.
If it is a memory clobber bug (rather than something that specifically hits the map) that is harder still.
Assuming a map clobber bug (rather than a memory clobber bug), I would suggest writing or finding some map testing function and insert it in a bunch of asserts scattered through the code to help find the point at which the map is corrupted.
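A map testing function of the kind suggested above might be sketched as follows. This is an assumption-laden example, not the project's code: CStringLess is a guess at what lang::CStringComparisonClass does (the backtrace shows only its name), VarMap uses int values instead of lang::LangObject*, and mapLooksSane() is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>

// Assumed comparator: orders C strings by content.
struct CStringLess {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

typedef std::map<const char*, int, CStringLess> VarMap;

// Walks the whole map and checks invariants every healthy map satisfies:
// non-NULL keys, strictly increasing key order, and an element count that
// matches size(). On a map whose nodes were overwritten, the walk itself
// is undefined behaviour and will often crash, but that still moves the
// failure closer to the point of corruption.
inline bool mapLooksSane(const VarMap& m) {
    CStringLess less;
    std::size_t seen = 0;
    const char* previous = NULL;
    for (VarMap::const_iterator it = m.begin(); it != m.end(); ++it) {
        if (it->first == NULL)
            return false;
        if (previous != NULL && !less(previous, it->first))
            return false;          // keys out of order
        previous = it->first;
        if (++seen > m.size())     // guards against a cycle in the tree
            return false;
    }
    return seen == m.size();
}
```

Scattering `assert(mapLooksSane(map));` at the start and end of the functions that touch the table narrows down where the map goes bad. The check also catches one bug specific to char* keys: if the text a key points to is changed or freed after insertion, the stored order no longer matches the comparator.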