Split header from data in file using python
I'm trying to get to grips with python, and thought a genuine application would be more likely to get me going than just fiddling my way through tutorials. With that in mind, I have the following problem.
I need to extract certain parameters from the header (purple) in a number of files. At the moment, the header is not a fixed number of lines, but its format is as follows: Code:
DATA SOURCE CODE=RIKZ
So I'd like to be able to create variables for PERIOD BEGIN, PERIOD END, etc. for use later on in the code, and then work on the data values separately. I think this will most easily be achieved if I can create two arrays: one holding the header information in two columns (separated by =) and one holding the values (which lie between the words VALUES and END). I've tried to read the file in line by line and then split each line on =, but it fails when the line contains only a single column with no = in it. This is what I have so far, and it doesn't work: Code:
#!/usr/bin/env python
I've tried googling, but there's an enormous amount of information, and identifying what's relevant and what's outdated is difficult unless you know what you're looking for! |
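One way to avoid the failure on lines that contain no = is to use str.partition, which always returns three parts. This is only a minimal sketch, assuming the header is a run of KEY=VALUE lines followed by the VALUES ... END block described above. The function name parse_header and every sample value except DATA SOURCE CODE=RIKZ are hypothetical.

```python
def parse_header(lines):
    """Collect KEY=VALUE header lines into a dict, stopping at VALUES."""
    header = {}
    for raw in lines:
        line = raw.strip()
        if line == "VALUES":
            # The data block starts here, so the header is finished.
            break
        key, sep, value = line.partition("=")
        if not sep:
            # No "=" on this line: skip it instead of failing.
            continue
        header[key.strip()] = value.strip()
    return header


# Hypothetical sample; only the first line appears in the original post.
sample = [
    "DATA SOURCE CODE=RIKZ\n",
    "PERIOD BEGIN=19990101\n",   # assumed value format
    "PERIOD END=19991231\n",     # assumed value format
    "A LINE WITH NO EQUALS SIGN\n",
    "VALUES\n",
    "1.0\n",
    "END\n",
]

header = parse_header(sample)
period_begin = header["PERIOD BEGIN"]
```

Keeping the header in a dict means each parameter can be looked up by name later, rather than creating one variable per header line.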
The for line is missing the call to readlines():
for line in openFile.readlines(): |
I think this should do pretty much what you need. It could still be improved by adding more error-checking and raising WaterFileSyntaxError exceptions when such errors are found.
Hope this helps. Code:
#!/usr/bin/env python |
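The extra error checking mentioned in that post could be sketched roughly as follows. The exception name WaterFileSyntaxError comes from the post itself; the helper check_markers and the specific checks it performs are assumptions, since the original code is not preserved here.

```python
class WaterFileSyntaxError(Exception):
    """Raised when a file does not follow the expected layout."""


def check_markers(lines):
    """Return the indexes of the VALUES and END lines, or raise an error."""
    stripped = [line.strip() for line in lines]
    if "VALUES" not in stripped:
        raise WaterFileSyntaxError("no VALUES line found")
    start = stripped.index("VALUES")
    if "END" not in stripped[start + 1:]:
        raise WaterFileSyntaxError("no END line after VALUES")
    # Search for END only after the VALUES marker.
    end = stripped.index("END", start + 1)
    return start, end
```

A caller can then wrap the parsing of each file in try/except WaterFileSyntaxError and report the offending file instead of stopping the whole run.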
A typical task for Perl - built-in regular expressions come in handy.
And I do not see a need for OOP in this case - pure procedural code would suffice because of the simplicity of the problem. |
Possibly, but that shouldn't stop anyone in his/her quest to learn a new language.
I only started on python because I had a specific project in mind - must get back to it sometime ... |
... Regarding the specific project - if you have to contribute to an existing Python project, of course you need Python. If you're starting something from scratch - Python (AFAIK) is not a more capable language than Perl; I have already published a number of small pieces of code which cannot be implemented in Python due to its limitations. |
Quote:
Code:
d={} |
Code:
#!/usr/bin/env python
For reading the values you can either loop over the lines and track when you're in range (between 'VALUES' and 'END'), or, if you have the whole file read into a list of lines, do something like this: Code:
lines[lines.index('VALUES')+1:lines.index('END')] |
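One caveat with the slice above: lines returned by readlines() keep their trailing newline, so 'VALUES' would not match 'VALUES\n' unless the lines are stripped first. A minimal sketch, using made-up data values and an inline list in place of a real file:

```python
# Hypothetical file contents, as readlines() would return them.
raw_lines = [
    "DATA SOURCE CODE=RIKZ\n",
    "VALUES\n",
    "1.0\n",   # made-up data value
    "2.5\n",   # made-up data value
    "END\n",
]

# Strip newlines so the marker words compare equal.
lines = [line.strip() for line in raw_lines]

# Everything strictly between the VALUES and END markers.
values = lines[lines.index("VALUES") + 1:lines.index("END")]

# Assumes each data line holds a single number.
numbers = [float(v) for v in values]
```

Note that list.index raises ValueError if either marker is missing, so a malformed file fails loudly rather than returning a wrong slice.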
Thanks everyone for all the examples.
Since my secondary aim in doing this was to start learning python, the full-blown program method does appeal, although it seems this is a pretty trivial problem to solve! I've copied the code I've ended up using below. As you can see, I've ended up using most of Hko's code, adding a small section at the end to actually calculate the new date and times for each data point. Code:
#!/usr/bin/env python
Incidentally, the original reason I decided to try this in python was that my bash attempt has been running for a few days (on around 200 files), and the python implementation above takes about 20 seconds for the same number of files. Needless to say, that is something of an improvement, and probably says more about my bash implementation than anything else!
As for perl vs. python, the main reason I wanted to use python was it's so easy to read. Although perl's regular expressions are extremely powerful, they're just so hard to read unless you use them every day (I don't). So, python seemed the more obvious choice. That, and a piece of software I do use every day (ArcGIS) has the ability to incorporate custom functions written in python, so it's likely to be more useful to me in the future.
Thanks everyone for the input - hopefully this is the start of a prosperous use of python for me. It's been on my todo list for so long now. |
The idea of anonymous functions/objects is that it is not the anonymous code author but the anonymous code user who decides on names, so the user and the author have no need for prior negotiations regarding names. In a similar manner, inheritance is easily done "inline" through scoping rules - the lack of decent scoping rules is a big drawback of Python, probably the biggest repellent for me. |
May I suggest you open (yet another) dedicated thread to do perl vs python flamewars?