[SOLVED] Creating a resume builder - pros and cons of XML vs SQLite storage?

SigTerm · 12-02-2011, 07:50 AM

Quote:

Originally Posted by vharishankar

Thing about MVC is that the view is mostly read-only, right?

Umm, no.
The view can change the data.
Changes in data should be reflected in view (signal/slots in Qt allow that).

If there is a modal dialog (OK/Cancel buttons), the dialog can take a copy of the data and modify it; if OK was pressed, you grab the modified copy from the dialog and assign it back to wherever it was taken from.

If a "view" is not a modal dialog, it can manipulate the single instance of "data" wherever it is stored within your program.
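A minimal sketch of the two cases above, written in plain Python rather than Qt so it runs without a GUI toolkit. The names `ResumeModel` and `FakeModalDialog` are hypothetical, and the listener list is only a rough stand-in for what Qt's signals/slots would do:

```python
import copy


class ResumeModel:
    """Holds the single instance of the data and notifies listeners on change."""

    def __init__(self, data):
        self._data = data
        self._listeners = []

    def subscribe(self, callback):
        # A view registers here so that data changes are reflected in it.
        self._listeners.append(callback)

    def get(self):
        # Hand out a copy so callers cannot change the model by accident.
        return copy.deepcopy(self._data)

    def set(self, new_data):
        self._data = copy.deepcopy(new_data)
        for callback in self._listeners:
            callback(self.get())


class FakeModalDialog:
    """Stand-in for a modal dialog: it edits a private copy of the data."""

    def __init__(self, data):
        self.data = copy.deepcopy(data)

    def edit(self, key, value):
        self.data[key] = value


model = ResumeModel({"first_name": "John", "last_name": "Smith"})
seen = []  # what a subscribed view would have been asked to display
model.subscribe(seen.append)

dialog = FakeModalDialog(model.get())
dialog.edit("first_name", "Jane")
before_ok = model.get()["first_name"]  # model is untouched while the dialog is open

ok_pressed = True  # assumption: the user clicked OK rather than Cancel
if ok_pressed:
    model.set(dialog.data)
```

With Cancel, the copy held by the dialog is simply thrown away and the model never changes.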

Quote:

Originally Posted by vharishankar

Updating the GUI directly is prohibited, so I have to implement an entire framework to update the data in the view?

I'm not sure what the problem is. Separating data from the GUI isn't that difficult, and it will most likely save your code from turning into unmaintainable spaghetti later.

Qt 4 has QTreeView, QAbstractProxyModel and associated tutorials, I believe you should take a look at them (not sure if they're available for python though) - they demonstrate how Qt implemented model/view for their case.

In any case, you should be passing data and pieces of data around as classes (IMO). Still, I'm a C++ programmer, and cannot guarantee that this is a suitable mode of thinking for Python (i.e. breaking everything down into concepts, classes and structures).

Quote:

Originally Posted by vharishankar

What I planned to do is represent the actual document data in an intermediate format (a Python object in this case which is created with a single function call whenever I actually need the data) and use that as the feeder for the non-GUI parts of the code, i.e. Save document, Load document, Export to HTML etc. and other data manipulations. The intermediate format is a Python dictionary object that represents the entire document's contents.

You're talking about the same thing.

Quote:

Originally Posted by vharishankar

If I understood the MVC framework right, I think the GUI part of the application can never be updated directly by the user and it can only be updated through the controller. Doesn't this work better in situations where the user never actually edits any of the data directly in the GUI itself?

This isn't correct. Take a look at the Qt examples dealing with item views and models. They can be found in the "qt demo" application under the "Item Views" section. MVC means "model", "view" and "controller". The "controller" can be merged with the "view", and the relationship between view and model doesn't have to be one-way.

vharishankar · 12-02-2011, 08:16 AM

OK, I think I understand. I was probably using the same metaphor without understanding it entirely. So MVC simply implies that I separate the data from the code which defines the view, and that updates to either the data or the view happen only through that mechanism and not arbitrarily.

From a purely programmatic approach would this mean simply separating the view class from the class that represents the data? And prevent any GUI part of the code from arbitrarily updating the data without going through the controller (which can either be a function or a class)?

Simply put, the update "view-to-data" and "data-to-view" mechanism can be triggered from the view, so long as no other code modifies the data and non-GUI code has no relation to the GUI parts of the code?

sundialsvcs · 12-02-2011, 09:40 PM

"MVC" is merely a "maybe-good idea," not a religion.

Like every other good-idea in the software-design world, it is: "an idea that has proved to be useful enough, to enough people, enough times, to ... attract the publishing attention of Mr. Tim O'Reilly."

It is one of many so-called design patterns, which IMHO is a fancy-pants way of taking the idea of "this worked for me," and trying to make a college-degree program out of it.

(And, to some extent at least, actually succeeding in doing so.)

Take such ideas for what they are, without worshiping any of them into what they really are not. Study the ideas very closely, put them into your toolbox along with everything else you might put there, and ... see if some day they actually do "work for you." They might. They might not.

vharishankar · 12-02-2011, 10:46 PM

If I remember right, I first came across this paradigm several years ago when messing about with Microsoft Visual C++ and the MFC Appwizard which generated code for GUI document-based applications. Maybe it has existed before that also. Not sure.

Yes, I agree with you. That's also been my programming philosophy, especially since I'm just a hobbyist and I needn't be tied down to specific tools or methodologies.

Sergei Steshenko · 12-03-2011, 10:14 AM

Quote:

Originally Posted by vharishankar

Sorry, could you give me examples? What is "Self-sufficient to store its own data" and how does that apply to my current problem?

I genuinely have no idea of what you might be suggesting if it is not object serialization???

Here is an old example of mine in Perl:

Code:

sergei@amdam2:~/junk> cat -n hash_to_be_imported.prl
     1  {
     2  foo => 1,
     3  bar => 2
     4  }
sergei@amdam2:~/junk> cat -n hash_importer.pl
     1  #!/usr/bin/perl -w
     2
     3  use Data::Dumper;
     4
     5  $Data::Dumper::Deepcopy = $Data::Dumper::Deepcopy = 1;
     6  $Data::Dumper::Indent = $Data::Dumper::Indent = 1;
     7  $Data::Dumper::Terse = $Data::Dumper::Terse = 1;
     8  $Data::Dumper::Sortkeys = $Data::Dumper::Sortkeys = 1;
     9
    10  use strict;
    11
    12  my $x = 10;
    13  my $hash_ref = require './hash_to_be_imported.prl';
    14
    15  warn "\$x=$x";
    16
    17  foreach my $key(keys %{$hash_ref})
    18    {
    19    warn "key/value: $key/$hash_ref->{$key}"
    20    }
    21
    22
    23  warn "\$hash_ref=", Dumper($hash_ref), ' ';
sergei@amdam2:~/junk> ./hash_importer.pl
$x=10 at ./hash_importer.pl line 15.
key/value: bar/2 at ./hash_importer.pl line 19.
key/value: foo/1 at ./hash_importer.pl line 19.
$hash_ref={
  'bar' => 2,
  'foo' => 1
}
  at ./hash_importer.pl line 23.
sergei@amdam2:~/junk>

The point is that the Data::Dumper module ( http://search.cpan.org/~smueller/Dat....131/Dumper.pm ) can export data in Perl format, i.e. in exactly the same format as the 'hash_to_be_imported.prl' file.

vharishankar · 12-03-2011, 10:26 PM

Sergei, thanks for the example. I suppose the pickle module in Python does something similar to this, except that it uses a binary format for the objects stored in the pickle.

I suppose what you say is that I should actually store the data in the form of a hash/dictionary in the syntax of Python and read it as Python code directly into my program.

I think this is a neat technique, but it makes the document difficult to edit outside my program for somebody not familiar with the programming language syntax. I feel a more generic format would be preferable. It also will keep the document format separate and allow me or others to build tools in different languages to access the data. For example, I might create a PHP front end to access the resume data and display it as a web page and so on.

Anyway your example was very helpful to know the different methods available to me for storing data within the programming language syntax.

Sergei Steshenko · 12-04-2011, 06:58 AM

Quote:

Originally Posted by vharishankar

...
I think this is a neat technique, but it makes the document difficult to edit outside my program for somebody not familiar with the programming language syntax. ...

XML is no more human-friendly than Perl/Python. Furthermore, storing in Python format will most likely be more concise. I really hate XML tags. For example, AFAIK XML is order-independent, so one needs to use special means in it to express the order of elements (e.g. an 'element_number' field). In Perl/Python there are arrays/lists, so the order is preserved in a natural way. E.g. in Perl it would be something like this:

Code:

[ # array reference
  { # hash reference
  element_type => 'header',
  text => 'RESUME',
  font => 
    {
    type => 'verdana',
    color => 'black'
    }
  },

  {
  element_type => 'applicant_data',
  data =>
    {
    first_name => 'John',
    last_name => 'Smith',
    phone_numbers =>
      {
      mobile => '98-123-4567',
      wired => '89-321-7658'
      }
    }
  },
...
]

- at the top level it's an array (reference) (Perl arrays are elastic), so the order of elements is naturally expressed. Try to write this in XML - there will be more characters because of the ever-present tags, which are longer than square or curly braces in this case.
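For comparison, a rough Python counterpart of the Perl structure above might look like the following sketch. The variable names `document` and `element_types` are made up for illustration; the keys and values are copied from the Perl example, with the elided "..." entries left out:

```python
# A list at the top level keeps the elements in order, like the Perl array.
document = [
    {
        "element_type": "header",
        "text": "RESUME",
        "font": {"type": "verdana", "color": "black"},
    },
    {
        "element_type": "applicant_data",
        "data": {
            "first_name": "John",
            "last_name": "Smith",
            "phone_numbers": {
                "mobile": "98-123-4567",
                "wired": "89-321-7658",
            },
        },
    },
]

# Iterating the list visits the elements in the order they were written.
element_types = [element["element_type"] for element in document]
```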

Regarding binary/text - Perl also has a facility to store in binary format (e.g. http://search.cpan.org/~ams/Storable-2.30/Storable.pm ), but I intentionally chose text format - for user-friendliness and human readability.

Again, to achieve the goal (storing the document) you do not need XML/YAML/JSON/whatever - just the native programming language (Python in your case) is sufficient.

vharishankar · 12-04-2011, 07:36 AM

Sergei, I am beginning to agree with you about the XML format. Manipulating XML, even with an easy-to-use library like ElementTree, is just... like pulling teeth. I think I will go with your advice in this case. I will simply use Python's pickle (http://docs.python.org/library/pickle.html), which is close to what you're suggesting in Perl. I think pickle is the recommended "Python" object format, so it makes sense to use it.

So I think the Python pickle will do nicely for me at the moment. I can later add XML as an export option if I feel the need. However since there is no "generic" XML format, it makes sense either to use XML as a primary datastore or simply avoid it altogether.

Thanks for the help.

I could also use the "pretty printer" module, http://docs.python.org/library/pprint.html, which appears to be similar to Perl's Data::Dumper.

Sergei Steshenko · 12-04-2011, 07:56 AM

Quick web search yields: http://docs.python.org/library/pprint.html :

Quote:

The pprint module provides a capability to “pretty-print” arbitrary Python data structures in a form which can be used as input to the interpreter. If the formatted structures include objects which are not fundamental Python types, the representation may not be loadable. This may be the case if objects such as files, sockets, classes, or instances are included, as well as many other built-in objects which are not representable as Python constants.
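A small sketch of what that could look like for the resume data: dump with `pprint.pformat()` and read the text back. Using `ast.literal_eval()` for the reading side is an assumption on my part (it accepts only Python literals, so it is a more cautious choice than executing the file as code); the `resume` contents are sample data:

```python
import ast
import pprint

resume = {
    "first_name": "John",
    "last_name": "Smith",
    "phone_numbers": {"mobile": "98-123-4567", "wired": "89-321-7658"},
    "sections": ["header", "applicant_data"],
}

# Export: a human-readable text form in Python's own literal syntax.
text = pprint.pformat(resume)

# Import: parse the literal back into a Python object.
restored = ast.literal_eval(text)
```

As the quoted documentation warns, this only round-trips when the structure is built from fundamental types such as dicts, lists, strings and numbers.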

vharishankar · 12-04-2011, 08:38 AM

OK, I've decided to go with JSON.

JSON support is pretty much built into (recent versions of) the Python standard library and is very similar to the above Data::Dumper approach, except that JSON is a more common format than pickle or the Python object format dumped by pprint.

Also, the usage of JSON is pretty much the same as pickle or pprint - simply dump() and load() a built-in Python object such as a dict or list and it works. No XML-like parsing necessary.
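A minimal sketch of that dump()/load() round trip with the standard `json` module. The file name `resume.json` and the sample `resume` dictionary are made up for illustration, and a temporary directory is used so nothing is left on disk:

```python
import json
import os
import tempfile

resume = {
    "first_name": "John",
    "last_name": "Smith",
    "phone_numbers": {"mobile": "98-123-4567", "wired": "89-321-7658"},
    "sections": ["header", "applicant_data"],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "resume.json")  # hypothetical file name

    # Save: write the dictionary as indented, human-readable JSON.
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(resume, handle, indent=2)

    # Load: read it back into ordinary Python dicts and lists.
    with open(path, "r", encoding="utf-8") as handle:
        loaded = json.load(handle)
```

One caveat worth knowing: JSON has fewer types than Python, so tuples come back as lists, dictionary keys come back as strings, and objects such as sets are rejected by `json.dump()` unless converted first.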

Once again, I must thank Sergei for showing patience in helping me understand the issue of data dumping in the language's own format. I must say JSON comes close to the same approach. (Well, after all, JSON is JavaScript Object Notation - so the concept is the same, I think.)

Sergei Steshenko · 12-04-2011, 09:37 AM

Well, though JSON is not native, i.e. it is not Python, it is, at least, concise.

vharishankar · 12-04-2011, 09:55 AM

Quote:

Originally Posted by Sergei Steshenko

Well, though JSON is not native, i.e. it is not Python, it is, at least, concise.

I noticed the JSON syntax is pretty close to Python's syntax for dictionaries and lists, but it is obviously not the same.

JSON might have other drawbacks I am not aware of, but for now, provisionally I choose it.

The advantage of the "dump" and "load" approach is that I can replace JSON with pickle or pprint quite easily by changing a few lines of code.
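One way this swap could be arranged, as a sketch only: keep the format-specific calls behind two small functions, so the rest of the program never names the format. The names `BACKENDS`, `serialize` and `deserialize` are hypothetical and not part of any library:

```python
import json
import pickle


def json_dumps(obj):
    # Encode to bytes so both backends return the same type.
    return json.dumps(obj, indent=2).encode("utf-8")


def json_loads(raw):
    return json.loads(raw.decode("utf-8"))


# Each backend is a (dumps, loads) pair; adding a format means adding one entry.
BACKENDS = {
    "json": (json_dumps, json_loads),
    "pickle": (pickle.dumps, pickle.loads),
}


def serialize(document, fmt="json"):
    dumps, _ = BACKENDS[fmt]
    return dumps(document)


def deserialize(raw, fmt="json"):
    _, loads = BACKENDS[fmt]
    return loads(raw)


document = {"first_name": "John", "sections": ["header", "applicant_data"]}
round_trips = {
    fmt: deserialize(serialize(document, fmt), fmt) for fmt in BACKENDS
}
```

If pickle is ever used for files that come from other people, keep in mind that the Python documentation warns against unpickling data from untrusted sources; JSON does not have that problem.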