[SOLVED] Methods for extracting data strings from output files
It's you who started using "holy grail" - I was talking about "overall optimization" WRT languages one invests his/her time in.
If your memory is failing you, may I redirect you to post #7? You are the one who started it all by saying it's a waste of time learning an "underlanguage" (which, strangely, is still undefined). Then you mentioned one must go for Perl/Python/Ruby because it's "one language fits all". So isn't that your "holy grail" mentality taking effect? In my posts, I have never once said the OP definitely has to use awk. I just said the OP can use awk as well to solve his problem, and I showed him how.
And then there's my question, which you have consistently avoided. If Perl/Python/Ruby are one day going to be called "underlanguages", are you going to advise people not to learn them? Still no answer from you?
This answer will decide whether you are spouting crap or not.
In your last few posts, you mentioned embedded systems and said that "underlanguages" are only used in those systems. So now I ask you, is learning "underlanguages" that worthless now?
Last edited by ghostdog74; 08-24-2010 at 10:18 AM.
... Then you mentioned one must go for Perl/Python/Ruby because it's "one language fits all". So isn't that your "holy grail" mentality taking effect?
...
No, it isn't one language fits all. One may still need C/C++/OCaml/AnotherFastLanguage.
In the category of tightly coupled text parsing and related data processing Perl/Python/Ruby are clear winners over 'awk'.
In the category of tightly coupled text parsing and related data processing Perl/Python/Ruby are clear winners over 'awk'.
Again, another baseless assumption. Show some proof of those "winners" regarding text parsing and I will believe you. Otherwise, stop spouting your nonsense. Note, I am not an awk advocate. I like Perl as much as you do, and I use Python whenever I need to. I am only refuting your baseless comment that one should not waste time learning awk (or other "underlanguages", as you defined them), because I do believe they are still needed in various other environments, like the embedded systems you mentioned.
Hmm.. Why is this thread marked [SOLVED] - I didn't note any particular solution, and the OP is still providing information and sample files recently. Are the arguing parties still trying to help the OP here? Perhaps the debate should be pruned off to another thread, and assisting the OP can resume (assuming the thread is not actually SOLVED - is it?)
I looked at your data and it looks way too disorganized to me. I.e. my impression is that the data is generated by quite a number of ad-hoc solutions with no clear architecture.
Funny you found that disorganized. This is pretty highly regarded software, and every quantum chemistry package I have used outputs data in this kind of way.
The .dat file (also produced from a calculation) is more condensed, but it is essentially a chunk of the log file. I figured I would sift through the log file by default just in case I want to find something that is not in the dat file.
Anyway, the key here is that there are landmarks in that gibberish. For instance, if I want the total energy of the molecule, I want the number immediately following the phrase "FINAL RHF ENERGY IS". And you will notice that the file is broken up into these chunks of data. Each has its own grammar/syntax (I am mostly self-taught, so forgive me if I use some terms incorrectly) and some unique landmark denoting where it starts, and if you look carefully, there is a "-------" that comes before and after each chunk of data. Looking for things like this would be a typical task in sifting through the results of a quantum chemistry package.
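To make the landmark idea concrete, here is a minimal Python sketch of "give me the number right after this phrase". The helper name value_after, the number pattern, and the sample line are all made up for illustration; the sample is not copied from a real log, so the exact spacing in actual GAMESS output may differ.

```python
import re

# Assumed number format: optional sign, digits, optional fraction, and an
# optional exponent (Fortran-style "D" exponents are accepted as well).
NUMBER = r"[-+]?\d+(?:\.\d*)?(?:[eEdD][-+]?\d+)?"

def value_after(landmark, text):
    """Return the first number following `landmark` in `text`, or None.

    Hypothetical helper, not part of any package.
    """
    pattern = re.escape(landmark) + r"\s*=?\s*(" + NUMBER + ")"
    match = re.search(pattern, text)
    if match is None:
        return None
    # float() does not understand Fortran "D" exponents, so normalise them.
    return float(match.group(1).replace("D", "E").replace("d", "e"))

# Invented sample line, for illustration only.
sample = " FINAL RHF ENERGY IS      -74.9659011183 AFTER  12 ITERATIONS\n"
energy = value_after("FINAL RHF ENERGY IS", sample)
```

On a real log you would pass the whole file contents as text. Only the first occurrence is returned, so a run that prints the landmark several times (a geometry optimization, say) would need re.findall or a loop instead.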
I wanted to keep the scripts general so they would work not just for GAMESS, but most standard quantum chemistry packages. They all do this chunking thing. At this point, it seems if I have scripts that can perform those five tasks (actually, only four of them are needed--the last one is just a generalization of the third one), I should be able to extract just about any portion of any output generated by these packages. There are probably exceptions I have not thought of, but this would be an excellent start.
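Since every package does the chunking thing, the general case could be sketched as: find the landmark line, skip the "-------" rule right after it, and collect lines until the next rule. This is only a sketch under assumptions: it treats any line made up solely of three or more dashes as a rule, the name chunk_after is hypothetical, and the sample data below is invented rather than taken from a real output file.

```python
import re

# Assumption: a "rule" is a line containing nothing but three or more dashes.
RULE = re.compile(r"^\s*-{3,}\s*$")

def chunk_after(landmark, text):
    """Return the lines of the chunk that follows `landmark`.

    Hypothetical helper: collects lines after the landmark up to the next
    dashed rule, ignoring a rule that directly follows the landmark itself.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if landmark in line:
            break
    else:
        return []  # landmark not found
    body = []
    for line in lines[i + 1:]:
        if RULE.match(line):
            if body:
                break      # closing rule: the chunk is complete
            continue       # rule directly under the landmark: skip it
        body.append(line)
    return body

# Invented sample, for illustration only.
sample_log = "\n".join([
    "-------",
    "ATOM CHARGES",
    "-------",
    "O  -0.36",
    "H   0.18",
    "-------",
    "NEXT BLOCK",
])
charges = chunk_after("ATOM CHARGES", sample_log)
```

Each returned line can then be split into fields with str.split(). Packages that close a chunk with a blank line instead of a rule would need a different stop condition.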