Cons of putting huge chunks of code in the constructor

Aquarius_Girl · 12-24-2010, 06:26 AM

I had to parse an XML file and put the data in the concerned structures.

In order to make it convenient for the user to use it like:

Code:

XmlFileReader objXmlFileReader ("abc.xml");

I made the constructor take a file name as an argument and wrote the parsing code in the constructor body.

Is that considered a bad idea?

eSelix · 12-24-2010, 06:55 AM

Generally not, if your class is inteded to do only that work. But in the constructor should be only code needed to construct your object, and any action on this object should be implemented in methods. For convience you can allow passing arguments to constructor to call this methods. So in my opinion your constructor should have optional argument as file. That way these who want only create object and work with it later, can do that.

Think what will be made during assigning it to the variable or during other operation if you implement it, everything will be parsed once more if you do not create special copying constructor.

dugan · 12-24-2010, 08:39 AM

The pattern in C++ is for the constructor to acquire any resources necessary (including file handles), and for the destructor to release them. It follows that the actual parsing should be done by a third method.

More information is here:

http://en.wikipedia.org/wiki/RAII

BTW, if your XmlFileReader class is also writing data, then you should either rename it, or move the parts that write to another class.

Sergei Steshenko · 12-24-2010, 07:25 PM

Quote:

Originally Posted by Anisha Kaul

I had to parse an XML file and put the data in the concerned structures.

In order to make it convenient for the user to use it like:

Code:

XmlFileReader objXmlFileReader ("abc.xml");

I made the constructor take a file name as an argument and wrote the parsing code in the constructor body.

Is that considered a bad idea?

Why do you need a constructor ? I.e. why do you need an instance ? I.e. why do you need state ?

Aquarius_Girl · 12-30-2010, 01:25 AM

Quote:

Originally Posted by eSelix

Think what will be made during assigning it to the variable or during other operation if you implement it, everything will be parsed once more if you do not create special copying constructor.

Thanks for your input, I have read (not in extreme detail though) previously about copy constructors but haven't yet been able to figure out a situation for their use, I'll read again now.

Quote:

Originally Posted by Sergei Steshenko

Why do you need a constructor ? I.e. why do you need an instance ? I.e. why do you need state ?

Well, actually, the class titled XmlFileReader, also contains some other functions to read and traverse through the data from the filled up structures, that's why I needed to create an object of the class by which I could call those functions.

But it seems there is a point behind your question which I've missed, if so, then be kind enough to explain.

So now I have concluded that if the constructor is supposed to be called multiple number of times then it makes no sense to put code other than initializers in it, but if the constructor is supposed to be called only fixed number of times, then there is no harm to put the code in it?? Does it make sense?

paulsm4 · 12-30-2010, 01:46 AM

Hi -

Bjarne Stroustrup, the author and inventor of C++, strongly advocates the idiom of "RAII":

Quote:

The pattern in C++ is for the constructor to acquire any resources necessary (including file handles), and for the destructor to release them

Personally, I've never been completely comfortable with RAII.

More generally, a constructor should contain "everything needed to bring an object into a 'known good initial state'". No more, no less.

I like this definition from Wikipedia:

Quote:

http://en.wikipedia.org/wiki/Constru...programming%29

[Constructors] have the task of initializing the object's data members and of establishing the invariant of the class, failing if the invariant isn't valid. A properly written constructor will leave the object in a valid state. Immutable objects must be initialized in a constructor.

Back to your original question: Yes, I believe it might be COMPLETELY appropropriate to open, read and parse the entire XML file into memory inside your constructor. Or not

There's not necessarily any "right" or "wrong" answer. The best answer is usually "it depends"

'Hope that helps ... at least a little bit

Sergei Steshenko · 12-30-2010, 10:25 AM

Quote:

Originally Posted by Anisha Kaul

...
Well, actually, the class titled XmlFileReader, also contains some other functions to read and traverse through the data from the filled up structures, that's why I needed to create an object of the class by which I could call those functions.

But it seems there is a point behind your question which I've missed, if so, then be kind enough to explain.
...

There is, say, standard "C" library with hundreds of functions which are called in every program written in "C", yet no constructors are necessary.

A class is also a namespace; you may have as many static methods in it as you like, yet, you do no need a constructor just because you have many functions. You do need a constructor if you want to package state into what they call instance. If you write in C++

Code:

some_instance.some_method(<some_args>);

this means you are eventually (in "C" terms) are calling:

Code:

some_method(&instance_data, <some_args>);

, so you can have a number of instances with different 'instance_data', and the language/compiler relieves of you duty to every time specify '&instance_data' in the method call, they also hide data in case it is not public, i.e. you can't change data not using public interface implemented through methods. But if your class does not have data members, there is no 'instance_data', so all the 'some_method' calls will be the same (except for <some_args> which have nothing to do with instances and instance data).

In you case it appears to be that all the data is hidden inside one function, so the function is stateless, so you do not need state, i.e. you do not need (per) instance data.

Sergei Steshenko · 12-30-2010, 10:35 AM

Quote:

Originally Posted by paulsm4

Hi -

Bjarne Stroustrup, the author and inventor of C++, strongly advocates the idiom of "RIAA":

Quote:

The pattern in C++ is for the constructor to acquire any resources necessary (including file handles), and for the destructor to release them

Personally, I've never been completely comfortable with RIAA.

More generally, a constructor should contain "everything needed to bring an object into a 'known good initial state'". No more, no less.
...

From my personal experience not at all related to C++ I would also prefer to open files at object construction stage, i.e. if I'm writing something that parses a file, I'd rather have file open and file close calls separated from the parsing routine, hence I'll need to have a file handle, hence in terms of OO I'll need a data member representing the file handle, hence in terms of OO I need a constructor.

But having file handle as data member is not the OP's case/choice

.

paulsm4 · 12-30-2010, 02:15 PM

Hi, Sergie -

Quote:

From my personal experience not at all related to C++ I would also prefer to open files at object construction stage, i.e. if I'm writing something that parses a file, I'd rather have file open and file close calls separated from the parsing routine, hence I'll need to have a file handle, hence in terms of OO I'll need a data member representing the file handle, hence in terms of OO I need a constructor.

Actually, I often prefer the OPPOSITE of this approach (and precisely why I don't think RAII is always a good idea).

Specifically:
1. Yes, I have a file handle as a member.
A *private* - or at least protected - member.
2. In the constructor (or an initialization clause PRECEDING the constructor) I initialize the handle to null.
3. I have a separate public "open()" member.
4. I also have a public "close()" member (which, internally, sets the handle to NULL).
5. The destructor also calls "close()" - but if (and only if) the handle is null.

Object creation ("the document") and resource allocation ("open the file for reading" - something that might finish instantly, might take a very long time, or might not ever finish) remain two separate and distinct things.

The client of the class might well want control over when the file is opened, independent of when the resource is allocated.

One might argue the better solution might be to create two classes. Personally, I'd rather just expose "open()"

IMHO

dugan · 12-30-2010, 02:40 PM

Paul, there's a C++-specific reason for preferring RAII to your approach. It's explained in the Wikipedia article I linked to, but in my own words:

With RAII:

Code:

XmlFileReader objXmlFileReader("abc.xml");
throw runtime_error("error");
// Resources are automatically cleaned up

Without RAII:

Code:

XmlFileReader objXmlFileReader;
objXmlFileReader.open("abc.xml");
throw runtime_error("error");
// resource leak!
objXmlFileReader.close();

Your approach would be good for most other languages, but not C++.

Sergei Steshenko · 12-30-2010, 02:49 PM

Quote:

Originally Posted by paulsm4

Hi, Sergie -

Actually, I often prefer the OPPOSITE of this approach (and precisely why I don't think RAII is always a good idea). Specifically:
1. Yes, I have a file handle as a member.
A *private* - or at least protected - member.
2. In the constructor (or an initialization clause PRECEDING the constructor) I initialize the handle to null.
3. I have a separate public "open()" member.
4. I also have a public "close()" member (which, internally, sets the handle to NULL).
5. The constructor also calls "close()" - but if (and only if) the handle is null.

The reason is that the client of the class might well want control over when the file is opened, independent of when the resource is allocated. One might argue the better solution might be to create two classes. Personally, I'd rather just expose "open()"

IMHO

I don't see what is opposite. At the moment you have a data member, you need a constructor. How in our example file handle is initialized (first to null and then set, or set immediately by constructor) is secondary. Furthermore, if we are to talk about architecture, file handle should better be abstracted out by some kind of 'get_next_character' function/class and friends, so one would be able to parse files, memory regions, socket buffers, whatever.

paulsm4 · 12-30-2010, 11:03 PM

Hi, Anisha -

Quote:

Q: I made the constructor take a file name as an argument and wrote the parsing code in the constructor body. Is that considered a bad idea?

A: That's a perfectly acceptable thing to do

Quote:

The pattern in C++ is for the constructor to acquire any resources necessary (including file handles), and for the destructor to release them.

The general term for this idiom is "Resource Acquisition Is Initialization" ("RAII"). It's merely an "idiom" - NOT a "rule". It often results in good, clean code. Used naively, or used slavishly, it can also result in nightmares where simply declaring a variable can hang (or crash) the whole program.

Quote:

It follows that the actual parsing should be done by a third method.

Absolute nonsense. If your class represents information in your XML file, it makes MUCH more sense to a) open the file, b) parse it, and c) close the file (retaining the parsed information) ALL INSIDE YOUR CONSTRUCTOR.

Again, I think the best criteria for "what goes into a constructor" is to ask yourself "what is the invariant of your class"? What is the BARE MINIMUM needed to make your class useful? If your class contains "information" (information that happened to come from an XML file), then your constructor should arguably "get the information". It should NOT leave the file open if it doesn't need to. Nor should any USER of your class need to know - or care - if the information happened to come from XML, from a database or from a web service. Your class just gives the user "information". And should be ready to give that information the moment after it's constructed

Quote:

Q: I have read (not in extreme detail though) previously about copy constructors but haven't yet been able to figure out a situation for their use

A: If you pass a class as a parameter (by value), or if you assign one class instance to another then, by default, it will invoke the constructor. If the constructor does a lot of work ... that can be a HUGE waste.

Copy constructors provide you a way to define exactly what happens when you "copy" a class instance. Among other things, it can help you avoid a lot of unnecessary overhead. But there's more to it than that. Here's a good, brief explanation:

Quote:

http://www.dreamincode.net/forums/to...ctor-tutorial/

The C++ compiler provides a copy constructor for each class that you create and, unless you define your own overloaded copy constructor, is called each time a copy of the object is made.

Why would you want to create your own overloaded copy constructor if there is already one present? I'm glad you asked! Default copy constructors only copy each member variable from the passed object to the member variables of the new object. This is called a Shallow Copy, while this may work fine for many member variables, it could potentially break with pointers, as pointers in both objects end up pointing to the same memory. So, if either object's destructor is called, the memory will be freed, and the other object's pointer will point still be pointing to the old area of memory! One of two things could happen: Your program crashes, or, worse yet, your program goes on happily running, as the compiler placed something else in that memory, giving you unexpected and hard to trace problems.

'Hope that helps!

dugan · 12-31-2010, 01:20 AM

Here's one design that would work, given the information posted. I'm not presenting this as the only way, or even one that's necessarily better than the ones that other people (including me) have suggested, but only as something that would work well.

First, we'll create a class to store data:

Code:

class Data
{
public:
    // setters and getters
};

As paulsm4 suggested, you might very well change the source of data from an XML file to an SQL database in the future. To anticipate this, we declare an interface that classes that fetch that data will implement:

Code:

class DataReader
{
public:
    virtual Data getData() const = 0;
};

We create a concrete class that implements that interface:

Code:

class XmlDataReader: public DataReader
{
public:
    XmlDataReader(string filename); // only stores the filename
    virtual Data getData() const;  // opens the file, parses it, closes it, and returns the data
private:
    string filename;
};

Note that this has the benefit of facilitating code reuse. For example, if you switch to an SQL database, just create an SQLDataReader class that derives from DataReader. You can also create a MockDataReader class that returns fixed values, which is a standard practise in unit testing. As you can see, this design is a lot more maintainable and testable than a single class with the parsing (and writing) code in the constructor body. The unit tests for these two classes are obvious. The unit tests for XmlFileReader in the original post, by contrast, are impossible to design with only the information provided.

And now you can just use the two classes:

Code:

void initialize(const DataReader &reader)
{
    Data data = reader.getData();
    // Do things with data
}

int main()
{
    XmlDataReader reader("abc.xml");
    initialize(reader);

    // Or:
    SqlDataReader sqlReader(connectionString);
    initialize(reader);

}

Aquarius_Girl · 01-04-2011, 12:40 AM

Sergei,

Your post number 7 was enlightening "as always", thanks for the effort.

I read up on "static functions" and understood that when a piece of code doesn't have to give different outputs for different inputs and those different outputs need to be preserved, it can be kept in a static function.

In my case I didn't mention

that I have modified that single piece of XML parsing code in a way that it can handle two "similar structured" (but not same) XML files.

The outputs of the two XML files need to be stored in separate structures for further usage.

I don't think static functions can be used in this case, correct me if I am wrong!

Sergei Steshenko · 01-04-2011, 12:59 AM

Quote:

Originally Posted by Anisha Kaul

Sergei,

Your post number 7 was enlightening "as always", thanks for the effort.

I read up on "static functions" and understood that when a piece of code doesn't have to give different outputs for different inputs and those different outputs need to be preserved, it can be kept in a static function.

In my case I didn't mention

that I have modified that single piece of XML parsing code in a way that it can handle two "similar structured" (but not same) XML files.

The outputs of the two XML files need to be stored in separate structures for further usage.

I don't think static functions can be used in this case, correct me if I am wrong!

I meant static methods: http://en.wikipedia.org/wiki/Method_(computer_science) -> http://en.wikipedia.org/wiki/Method_...Static_methods .

A related to the discussion issue is stateless functions: http://stackoverflow.com/questions/8...ss-programming and elsewhere.

Interestingly enough, in VLSI design from the getgo designers are taught about combinational logic (stateless functions) opposed to sequential logic (latches, flip-flops, memory cells). There is a lot of difference in implementation and in what tools can do - stateless world is much better optimizable/transformable/verifiable.

Anyway, the main IMO lesson of this thread - one shouldn't just blindly learn tricks/concepts of a given paradigm, but has to truly understand when/why/what is happening.

I read an opinion of a guy (just a mere mortal guy like us; the opinion was written in Russian) who wrote code in various languages/paradigms. He most liked his own OO/"OO" code written with a lot of static methods - for him it was the easiest one to maintain and according to his own words the code was quite stable and robust in production.

Sergei Steshenko · 01-04-2011, 12:59 AM

Quote:

Originally Posted by Anisha Kaul

Sergei,

Your post number 7 was enlightening "as always", thanks for the effort.

I read up on "static functions" and understood that when a piece of code doesn't have to give different outputs for different inputs and those different outputs need to be preserved, it can be kept in a static function.

In my case I didn't mention

that I have modified that single piece of XML parsing code in a way that it can handle two "similar structured" (but not same) XML files.

The outputs of the two XML files need to be stored in separate structures for further usage.

I don't think static functions can be used in this case, correct me if I am wrong!

I meant static methods: http://en.wikipedia.org/wiki/Method_(computer_science) -> http://en.wikipedia.org/wiki/Method_...Static_methods .

A related to the discussion issue is stateless functions: http://stackoverflow.com/questions/8...ss-programming and elsewhere.

Interestingly enough, in VLSI design from the getgo designers are taught about combinational logic (stateless functions) opposed to sequential logic (latches, flip-flops, memory cells). There is a lot of difference in implementation and in what tools can do - stateless world is much better optimizable/transformable/verifiable.

Anyway, the main IMO lesson of this thread - one shouldn't just blindly learn tricks/concepts of a given paradigm, but has to truly understand when/why/what is happening.

I read an opinion of a guy (just a mere mortal guy like us; the opinion was written in Russian) who wrote code in various languages/paradigms. He most liked his own OO/"OO" code written with a lot of static methods - for him it was the easiest one to maintain and according to his own words the code was quite stable and robust in production.