[SOLVED] Methods for extracting data strings from output files
First, tell me why you need to do that. And why is it related to text parsing at all?
Because this is how normal programming is done. I.e. input files are parsed, and the result of parsing is a data structure.
Then processing is performed on the data structure.
For modularity/extensibility data structures are exported and imported by next consumers in the data processing chain.
I've dealt with huge amounts of data - be it VLSI design, static timing analysis, VLSI verification, ASIC standard library cells characterization, acoustic modeling, whatever - the approach with data structures always works and is the book approach.
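To make the parse, export, import chain described above concrete, here is a minimal Python sketch. The input format (lines of the form `NAME = number`), the key names, and the use of JSON as the interchange format are all assumptions made up for illustration; they are not taken from GAMESS or any other real program's output, and a real pipeline might well use something like HDF instead.

```python
import json
import re

# Hypothetical output format: lines such as "TOTAL ENERGY = -76.0107".
# Real output files will need their own patterns.
LINE_RE = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)\s*=\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_output(text):
    """Parse 'NAME = number' lines into a dict; all other lines are ignored."""
    data = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            data[match.group(1)] = float(match.group(2))
    return data


def export_data(data, path):
    """Write the parsed data structure to a JSON file for the next consumer."""
    with open(path, "w") as handle:
        json.dump(data, handle, indent=2, sort_keys=True)


def import_data(path):
    """Read a data structure previously written by export_data."""
    with open(path) as handle:
        return json.load(handle)
```

The point of the split is that only `parse_output` knows about the text layout; any later processing step works on the dictionary returned by `import_data` and never touches the raw output file.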
Funny you found that disorganized. This is a pretty highly regarded software ...
Windows95/98 was also once considered highly regarded SW.
For SW to be good one needs competition - as everywhere else. I do not think quantum chemistry SW is widely used, so I do not expect competition in the field.
There are well known and highly regarded data formats/approaches used in scientific calculations, for example, HDF: http://www.hdfgroup.org/ .
Ok, I will rephrase that "easiest to learn" comment
Which language has commands/functions that are most naturally implemented to perform these tasks. For example:
If awk has a find_the_first_word_after_this_string("Insert string here") command, or
If perl has a grab_text_between_these_two_strings("string1", "string2") command,
then it is quite easy to decide which language is best suited for which task. I am ignoring performance because it seems that no consensus is coming any time soon regarding that. In any case, the fact that two senior members cannot reach a consensus about it means to me that awk and perl have only marginal differences in performance. Hence I place my main priority on implementation.
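The two hypothetical functions named above come down to a few lines of regular-expression code in most scripting languages. Here is a minimal sketch in Python, chosen only for illustration (the thread itself is about awk and Perl); the function names and the sample text in the usage note are made up, not part of any library.

```python
import re


def first_word_after(text, marker):
    """Return the first whitespace-delimited word after `marker`, or None."""
    match = re.search(re.escape(marker) + r"\s*(\S+)", text)
    return match.group(1) if match else None


def text_between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    # re.DOTALL lets the captured text span several lines.
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None
```

For example, with the text `"FINAL ENERGY IS   -76.0107 AFTER 12 ITERATIONS"`, calling `first_word_after(text, "FINAL ENERGY IS")` returns `"-76.0107"`. Both helpers use `re.escape` so that markers containing characters such as `(` or `.` are matched literally.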
I am having trouble keeping up. Give me a moment to review all the posts. I missed one directly referring to GAMESS with a link to some kind of cookbook.
Ok, I will rephrase that:
Which language has commands/functions that are most naturally implemented to perform these tasks. For example:
If awk has a find_the_first_word_after_this_string("Insert string here") command, or
If perl has a grab_text_between_these_two_strings("string1", "string2") command,
Both have these functions; I have shown you how it's done with awk.
Quote:
then it is quite easy to decide which language is best suited for which task. I am ignoring performance because it seems that no consensus is coming any time soon regarding that. In any case, the fact that two senior members cannot reach a consensus about it means to me that awk and perl have only marginal differences in performance. Hence I place my main priority on implementation.
Not true; awk parsing can be as fast as, if not faster than, Perl/Python/Ruby. And no, I am not disputing the fact that one can use Perl/Python for the job; what I don't agree with is the comment that the "underlanguage" should not be learned.
Ok, I will rephrase that:
Which language has commands/functions that are most naturally implemented to perform these tasks. For example:
If awk has a find_the_first_word_after_this_string("Insert string here") command, or
If perl has a grab_text_between_these_two_strings("string1", "string2") command,
then it is quite easy to decide which language is best suited for which task. I am ignoring performance because it seems that no consensus is coming any time soon regarding that. In any case, the fact that two senior members cannot reach a consensus about it means to me that awk and perl have only marginal differences in performance. Hence I place my main priority on implementation.
I am reiterating what I've said - you seem to be asking the wrong question.
Though Perl can do anything 'awk' can do.
The correct questions emanate from the understanding of the whole data parsing and processing mission. My whole experience tells me that 'awk' is insufficient for this. Or, in other words, relying on tools of limited capability (like 'awk') perpetuates data mess.
Ok, those perl scripts are indeed the type of thing I am looking for. That is not to say that the previously mentioned awk scripts would not work either. I was going to put these scripts in separate files anyway so the user would be able to invoke them at his/her convenience. Therefore, there is nothing stopping me from writing one command perl_getafterstring, and another awk_getafterstring. I can test both, although I suspect the performance will vary and the average performance of each will be very close. I am guessing this will come down to personal preference and case-by-case problems.
My whole experience tells me that 'awk' is insufficient for this. Or, in other words, relying on tools of limited capability (like 'awk') perpetuates data mess.
You obviously do not have enough experience with awk.
Please don't cloud the newbie's mind with blatant lies. Awk is perfectly sufficient for what he is doing.
Last edited by ghostdog74; 08-24-2010 at 11:19 AM.
If you show me how to export data structures using 'awk' and then to import them back, then 'awk' might be sufficient. Otherwise 'awk' is DOA.
Why don't you show us how you solve his problem in Perl, and I will show you mine with awk. Then let him decide which one is simpler, more readable and works.
Last edited by ghostdog74; 08-24-2010 at 11:28 AM.
Hi Feynman - try to ignore any bickering. May I ask if you are happy to progress on your own now or do you still require help?
I had a look at the file you attached. I am assuming this is only the input data? (I didn't read all of it just skimmed)
If you are still working on a solution that requires help, maybe using the data from this file you could give an example output that satisfies what you are looking for?
why don't you show us how you solve his problem in Perl ...
I have no interest in reconsidering all the circumstances which led to the creation of Perl as a replacement for sh/sed/awk and the later transition from Perl 4 to Perl 5 with the introduction of references and hierarchical data structures.
Because for me the considerations are obvious.
I do not care that 'awk' can in some case be faster than Perl because in the grand scheme of things (WRT data parsing and consequent data processing) it's not an issue.
Thank you grail. Well, with the given awk commands, I have 4 of the 5 tasks covered. My description of the unsolved task was admittedly vague so I rewrote it in an earlier post. I will work on/copy-paste from that cookbook site/get some help with perl programs that do the equivalent. I will try to have both available as separate commands for my program. I assume if I am using more than one cpu and if I have the right software installed that these text searches will be automatically redistributed across my other cpus. I really do not have any experience in parallelization, but I do have access to more than one cpu.
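On the parallelization point: a single awk, Perl, or Python process normally runs on one core, so spreading searches across CPUs usually has to be arranged explicitly, for example by handling separate output files in separate processes. The following is a minimal Python sketch of that idea, assuming one independent search per file; the function names, the marker string, and the file names are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def first_word_after_in_file(path, marker):
    """Return the first word that follows `marker` on the same line, or None."""
    with open(path) as handle:
        for line in handle:
            _, found, rest = line.partition(marker)
            if found:
                words = rest.split()
                if words:
                    return words[0]
    return None


def search_files(paths, marker, workers=None):
    """Search several files in parallel; returns {path: word or None}."""
    paths = list(paths)
    search = partial(first_word_after_in_file, marker=marker)
    # One task per file; the pool spreads the tasks over worker processes.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(search, paths)))
```

A call such as `search_files(["run1.log", "run2.log"], "TOTAL ENERGY =")` would go in a script's main section; on platforms that start worker processes by spawning, that call needs to sit under an `if __name__ == "__main__":` guard. For a handful of small files the process start-up cost can easily outweigh any gain, so this is only worth it when there are many or very large output files.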