LinuxQuestions.org - [SOLVED] C++ map segfaulting

C++ map segfaulting

I'm having yet another big problem in my interpreter project that I'm unable to figure out. In this function, the call to map.find() causes a segfault:

Code:

Table* Table::findParentWithVar(char *name)
{
        if (map.find(name) != map.end())  // <-- segfault here, in map.find()
                return this;

        std::vector<Table*>::iterator i;
        for (i=parents.begin(); i!=parents.end(); i++) {
                Table *temp = (*i)->findParentWithVar(name);
                if (temp)
                        return temp;
        }

        return NULL;
}

I don't have the slightest clue why. Items were successfully added to the same map before, and it couldn't be a NULL pointer problem because the map is stored directly in the class, not as a pointer.

Quote:

Originally Posted by MTK358 (Post 4320451)

I'm having yet another big problem in my interpreter project that I'm unable to figure out. In this function, the call to map.find() causes a segfault:

Code:

Table* Table::findParentWithVar(char *name)
{
        if (map.find(name) != map.end())  // <-- segfault here, in map.find()
                return this;

        std::vector<Table*>::iterator i;
        for (i=parents.begin(); i!=parents.end(); i++) {
                Table *temp = (*i)->findParentWithVar(name);
                if (temp)
                        return temp;
        }

        return NULL;
}

I don't have the slightest clue why. Items were successfully added to the same map before, and it couldn't be a NULL pointer problem because the map is stored directly in the class, not as a pointer.

You have to check your assumptions. For example, 'map.find(name)' looks like a function call. If so, then check the function address at a place where you are sure it is correct (e.g. just after creating the corresponding class instance) and just before the segfault. The addresses should be the same.

Quote:

Originally Posted by MTK358 (Post 4320451)

it couldn't be a NULL pointer problem because the map is stored directly in the class, not as a pointer.

You should learn to use gdb to look at the details of a seg fault.

From what you showed it could easily be a NULL pointer problem: name could be null or this could be null.

It is also possible the map is corrupted by some previous bug.

Quote:

Originally Posted by johnsfine (Post 4320915)

From what you showed it could easily be a NULL pointer problem: name could be null or this could be null.

I did use GDB, and neither this nor name is NULL.

Quote:

Originally Posted by johnsfine (Post 4320915)

It is also possible the map is corrupted by some previous bug.

How would I find that out?

With GDB you should be able to see exactly where the seg fault occurs.

Are you using an optimized build or an ordinary debug build?

You told us where the seg fault occurs, but I don't know how exactly you checked that. Is it on the actual code you indicated or is it in some function called by that code?

In an optimized build, some functions called by that code might be inlined, so bugs such as a bad name pointer would seg fault apparently in that code rather than in called code.

If it is on that actual code, that should indicate that the this pointer is bad. If find were a virtual function (which I expect it isn't) then a vtable pointer might be bad or the find pointer in a vtable might be bad. But even the symptom of a bad vtable pointer is more likely caused by a bad this pointer.

I always look at the asm code around the point of the seg fault. You should learn how to display that asm code in GDB. If you don't know enough asm to learn anything from that asm code, you can post it here for help.

In x86_64, to understand this kind of seg fault, you also need to look at the general registers: rdi, rsi, rax, etc.

In x86 (32 bit) you need to look at the top several values on the stack.

As Sergei mentioned, re-verify your assumptions; for example, you should check the iterator value to ensure that it is not NULL and does not hold the address of a Table object that has already been deleted.

Code:

Table *temp = (*i)->findParentWithVar(name);
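A minimal sketch of that check, assuming a simplified stand-in for Table (the real class is not shown in the thread, so the name SketchTable and the std::map<std::string, int> member are hypothetical). It can only catch NULL entries in parents; a pointer to an already deleted Table still looks like a valid address and has to be tracked down with a debugger or a memory checker.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the thread's Table class; the real key and
// value types are not shown in the posts.
struct SketchTable {
    std::map<std::string, int> vars;
    std::vector<SketchTable*> parents;

    // Returns the nearest table (this one or an ancestor) that defines
    // `name`, or NULL if none does. NULL parents are skipped, not followed.
    SketchTable* findParentWithVar(const char* name) {
        if (name == NULL)
            return NULL;                // std::string(NULL) is undefined
        if (vars.find(name) != vars.end())
            return this;

        for (std::vector<SketchTable*>::iterator i = parents.begin();
             i != parents.end(); ++i) {
            if (*i == NULL)
                continue;               // guard before dereferencing
            SketchTable* found = (*i)->findParentWithVar(name);
            if (found)
                return found;
        }
        return NULL;
    }
};
```

If a guard like this never triggers and the crash persists, that points away from NULL and toward a stale pointer or a corrupted map.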

Quote:

Originally Posted by johnsfine (Post 4320962)

I found the segfault using GDB and it was compiled with optimization turned off.

Here's something I just found out:

It was crashing when interpreting this AST:

Code:

new CallNode(new MemberNode(new IntegerNode(5) , "print"), new NodeCallParamList())

(MemberNode calls Table::get, which calls Table::getParentWithVar, which calls map::find(), and that's where the segfault happens)

But it doesn't crash with this one, which contains the part that crashes in the previous AST:

Code:

new MemberNode(new IntegerNode(5) , "print")

The problem is that I don't see anything wrong with CallNode that would cause this. All it does is call node->eval(scope) just like main() calls it in the second example.

Quote:

Originally Posted by johnsfine (Post 4320962)

In an optimized build, some functions called by that code might be inlined, so bugs such as a bad name pointer would seg fault apparently in that code rather than in called code.

If it is on that actual code, that should indicate that the this pointer is bad. If find were a virtual function (which I expect it isn't) then a vtable pointer might be bad or the find pointer in a vtable might be bad. But even the symptom of a bad vtable pointer is more likely caused by a bad this pointer.

I always look at the asm code around the point of the seg fault. You should learn how to display that asm code in GDB. If you don't know enough asm to learn anything from that asm code, you can post it here for help.

In x86_64, to understand this kind of seg fault, you also need to look at the general registers: rdi, rsi, rax, etc.

In x86 (32 bit) you need to look at the top several values on the stack.

I once played around with x86 (not x86_64, but it should be similar enough) assembler, but not very much, and I have forgotten a lot of it by now. And I don't know how to make GDB print it out.

Quote:

Originally Posted by MTK358 (Post 4320979)

The problem is that I don't see anything wrong with CallNode that would cause this. All it does is call node->eval(scope) just like main() calls it in the second example.

Same here... I don't see anything wrong with CallNode, although I must admit my judgement is biased because I have no idea what you have implemented in that class' constructor.

Code:

class CallNode : public Node
{
public:
        CallNode(Node *funcNode, NodeCallParamList *param);
        ~CallNode();
        LangObject* eval(Table *scope);

private:
        NodeCallParamList *param;
        Node *funcNode;
};

Code:

CallNode::CallNode(Node *funcNode, NodeCallParamList *param) {
        this->funcNode = (Node*) funcNode->getref();
        this->param = param;
}

CallNode::~CallNode() {
        funcNode->putref();
        delete param;
}

LangObject* CallNode::eval(Table *scope) {
        CallParamList *l = param->evaluateParameters(scope);
        Function *f = (Function*) funcNode->eval(scope); //LangObject::discardIfWrongType(funcNode->eval(scope), LangObject::FunctionType);
        if (f) {
                LangObject *result = f->call(l);
                l->putref();
                f->putref();
                return result;
        }
        //TODO throw error
        l->putref();
        f->putref();
        return NULL;
}

It seems that you are using a variant of smart-pointers, however they don't seem so "smart" if you must call getref() and putref() when you want to increase and decrease the number of references. Is this a home-made smart-pointer class that you are using?

In your function eval(), you never check to see if the pointer to 'l' is valid. It seems like you are biased against developing "safe" code. This lax attitude may very well have placed you into your current predicament of checking for a NULL pointer or a memory corruption error.
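To illustrate, here is a rough sketch of the same control flow with both pointers checked before use. Note that in the posted eval(), the path after the if (f) block still calls f->putref() even though f is NULL there. SketchObject, putrefIfSet and callChecked are hypothetical names standing in for the thread's classes, which are not shown in full.

```cpp
#include <cstddef>

// Hypothetical stand-in for the thread's reference-counted classes.
struct SketchObject {
    int refs;
    SketchObject() : refs(1) {}
    void putref() { --refs; }   // the real class presumably frees at zero
};

// Drops a reference only if the pointer is non-NULL.
inline void putrefIfSet(SketchObject* o) {
    if (o)
        o->putref();
}

// Mirrors the shape of CallNode::eval: either input may be NULL, and every
// exit path releases only the references it actually holds.
// Returns true if the call could be made.
inline bool callChecked(SketchObject* params, SketchObject* func) {
    if (params == NULL || func == NULL) {
        putrefIfSet(params);
        putrefIfSet(func);
        return false;           // the original has "TODO throw error" here
    }
    // ... the equivalent of f->call(l) would go here ...
    params->putref();
    func->putref();
    return true;
}
```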

Anyhow, consider using Boost's shared pointer; it is a lot easier to use than what you have.
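A small sketch of what that could look like. It uses std::shared_ptr, the standard-library counterpart of Boost's shared pointer, so that it builds without Boost (this assumes a C++11 compiler). The node types are hypothetical and only illustrate the ownership pattern, not the actual interpreter classes.

```cpp
#include <memory>
#include <utility>

// Hypothetical node types; only the ownership pattern matters here.
struct SketchNode {
    virtual ~SketchNode() {}
    virtual int eval() const = 0;
};

struct SketchInteger : SketchNode {
    int value;
    explicit SketchInteger(int v) : value(v) {}
    int eval() const { return value; }
};

// Holds a shared reference to its child. The count goes up when the
// pointer is copied into the constructor argument and goes down when this
// node is destroyed, with no explicit getref()/putref() calls.
struct SketchCall : SketchNode {
    std::shared_ptr<SketchNode> func;
    explicit SketchCall(std::shared_ptr<SketchNode> f) : func(std::move(f)) {}
    int eval() const { return func ? func->eval() : 0; }
};
```

The reference is released in the destructor automatically, so an early return or an error path cannot forget a putref().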

Quote:

Originally Posted by MTK358 (Post 4320979)

It was crashing when interpreting this AST:

What does "AST" mean?

Quote:

which calls Table::getParentWithVar, which calls map::find(), and that's where the segfault happens)

Is the contradiction to your earlier post that I marked in red a typo, or a change or what?

The part in purple marks where the inherent ambiguity of English (plus my inherent belief that PEBKAC is most likely) leaves me guessing at what you really saw.

When you want us to know what a gdb backtrace showed you, it is usually best to just issue the bt command in gdb and then copy/paste the result into a CODE block in your post.

Quote:

But it doesn't crash with this one, which contains the part that crashes in the previous AST:

That tells us a whole lot less than you might think it should.

Quote:

I once played around with x86 (not x86_64, but it should be similar enough)

Doesn't quite answer which architecture your current C++ code was compiled to. But if I see any disassembly, I'll know anyway.

Quote:

I don't know how to make GDB print it out.

Various forms of the disas command. With no parameters, that shows you disassembly (reconstructed, not original assembly) code for whatever gdb thinks is the current function. From that you should find a block from several instructions before the failure point through a few instructions after it.

The command inf r dumps the basic registers. For x86_64, that will include rax through rip that are interesting for C++ debugging, followed by a bunch of obscure registers only interesting for kernel debugging. For x86, the interesting ones are eax through eip, but most of the interesting stuff is usually on the stack rather than in registers.

The exact point of the failure is in the rip or eip register. It should be possible to match that against the addresses in disas output, but sometimes some further effort is required. I don't use gdb enough myself to know when to expect raw hex addresses (such as in rip or eip) vs. various symbol and offset forms in the bt output or the disas output. Usually I'd like to deal with all three together but one or more are in a different format requiring conversion.

Quote:

Originally Posted by dwhitney67 (Post 4320998)

I have a RefcountObject class, and all objects that will be stored as variables in the interpreted language (and all members of those classes, unless the member's value won't be shared with other objects) are subclasses of it.
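For readers following along, a base class of that kind might look roughly like the sketch below. This is a guess at the general shape only: the real RefcountObject is not shown in the thread, and the names SketchRefcounted and SketchValue, the starting count of one, and the delete-at-zero behaviour are all assumptions.

```cpp
#include <cassert>

// Hypothetical sketch; the thread does not show the real RefcountObject.
class SketchRefcounted {
public:
    SketchRefcounted() : refs_(1) {}    // assumed: creator holds one reference

    // Takes an additional reference and returns the object as a base
    // pointer, which would explain the casts seen in CallNode.
    SketchRefcounted* getref() { ++refs_; return this; }

    // Drops one reference; the object destroys itself when none remain.
    void putref() {
        assert(refs_ > 0);
        if (--refs_ == 0)
            delete this;
    }

    int refcount() const { return refs_; }

protected:
    virtual ~SketchRefcounted() {}      // heap-only: destroyed via putref()

private:
    int refs_;
};

// Example subclass that records its destruction in a caller-owned flag.
class SketchValue : public SketchRefcounted {
public:
    explicit SketchValue(bool* destroyed) : destroyed_(destroyed) {}

protected:
    ~SketchValue() { *destroyed_ = true; }

private:
    bool* destroyed_;
};
```

With a scheme like this, one putref() too many frees an object that other tables still point to, which is one way a later map lookup could crash.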

Quote: