LinuxQuestions.org

Programming: Methods for extracting data strings from output files (https://www.linuxquestions.org/questions/programming-9/methods-for-extracting-data-strings-from-output-files-828088/)

Sergei Steshenko 08-23-2010 11:15 PM

Quote:

Originally Posted by ghostdog74 (Post 4075695)
So do you think awk has no capability to handle scientific data? If you can back up this comment with some facts or examples, I will believe what you say. Otherwise, you are just reading too many assumptions into the original question and asserting something without concrete proof (that awk is not suitable for his tasks. Yes, read the key words: his tasks).

I am saying what I've said: 'awk' is an "underlanguage". I haven't said it's not suitable for text parsing. I am saying it is not worth learning in the grand scheme of things.

Feynman 08-23-2010 11:15 PM

WOW! Thank you so much! I cannot blame you for not fully understanding task d. I can give a simple but general example for task a).
The text file reads:

blah blah
blah add this word to the list: 1234.56 blah blah
blah blah
blah now don't forget to add this word to the list: PINAPPLE blah blah
And for bonus points,
it would be nice to know that the script
would be able to add this word to the list: 1!@#$%^&*()[]{};:'",<.>/?asdf blah blah
blah blah

As the file implies, save words that come after "add this word to the list:" to a list.

Thanks again.
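
For reference, task a) as described above can be sketched in a few lines. This is only a minimal illustration in Python, not a solution posted in the thread; the names `MARKER`, `PATTERN` and `extract_words` are made up for this example, and it assumes a "word" is the run of non-whitespace characters that follows the marker on the same line.

```python
import re

# Marker text taken from the example file; everything else here is an assumption.
MARKER = "add this word to the list:"

# After the marker, skip spaces/tabs and capture the next run of non-whitespace.
PATTERN = re.compile(re.escape(MARKER) + r"[ \t]*(\S+)")


def extract_words(text):
    """Return every word that directly follows MARKER, in order of appearance."""
    return PATTERN.findall(text)


sample = """blah blah
blah add this word to the list: 1234.56 blah blah
blah blah
blah now don't forget to add this word to the list: PINAPPLE blah blah
And for bonus points,
it would be nice to know that the script
would be able to add this word to the list: 1!@#$%^&*()[]{};:'",<.>/?asdf blah blah
blah blah
"""

words = extract_words(sample)
for word in words:
    print(word)
```

Because the pattern captures any non-whitespace characters, the "bonus points" line with punctuation is handled the same way as the plain number and the plain word. To run it on a real file, the text could be read with `open(path).read()` and passed to `extract_words`.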

ghostdog74 08-24-2010 12:00 AM

Quote:

Originally Posted by Sergei Steshenko (Post 4075715)
I am saying what I've said: 'awk' is an "underlanguage". I haven't said it's not suitable for text parsing. I am saying it is not worth learning in the grand scheme of things.

In the first reply to my post, you did say that "awk is an underlanguage", but you did not say that it's not suitable for parsing. It gives the impression that you meant awk is not suitable for the task.

So what's the definition of "grand scheme of things"? Are you saying that whenever one has a task to solve, he has to turn to Python/Perl/Ruby for the solution? Or what? I am not disapproving of your notion of turning to these 3 (well known) languages for solving problems, but more often than not, the "grand scheme of things" depends on the environment and on what tools one has at his disposal. Note: I did say that you can do it with awk as well (the key words are "as well"), but I did not say awk is the only thing that can solve the problem.

Sergei Steshenko 08-24-2010 12:15 AM

Quote:

Originally Posted by ghostdog74 (Post 4075742)
...
So what's the definition of "grand scheme of things" ? Are you saying that whenever one has a task to solve, he has to turn to Python/Perl/Ruby for the solution?
...

Essentially yes. The grand scheme of things is that one shouldn't spend his/her time learning underlanguages.

ghostdog74 08-24-2010 12:29 AM

Quote:

Originally Posted by Sergei Steshenko (Post 4075755)
Essentially yes. The grand scheme of things is that one shouldn't spend his/her time learning underlanguages.

By that argument, you are suggesting that people should not learn DOS batch, VBScript, etc., right? So if one day a better language comes along and takes over from Python/Perl/Ruby (remember, languages do evolve), and Perl/Python/Ruby become the "underlanguages", are you going to change your point of view? In that case, should we not spend time learning Perl/Python/Ruby because they are now "underlanguages"?

You should stop imprinting that kind of "holy grail" thinking onto other people who are unaware of what's going on.

By the way, I am curious. How do you actually measure and categorize "underlanguages"? It appears to me like it's a scientific and proven technique.

Sergei Steshenko 08-24-2010 07:44 AM

Quote:

Originally Posted by ghostdog74 (Post 4075766)
By that argument, you are suggesting that people should not learn DOS batch, VBScript, etc., right? So if one day a better language comes along and takes over from Python/Perl/Ruby (remember, languages do evolve), and Perl/Python/Ruby become the "underlanguages", are you going to change your point of view? In that case, should we not spend time learning Perl/Python/Ruby because they are now "underlanguages"?

You should stop imprinting that kind of "holy grail" thinking onto other people who are unaware of what's going on.

By the way, I am curious. How do you actually measure and categorize "underlanguages"? It appears to me like it's a scientific and proven technique.

My technique is purely subjective - if after making an overview of easily available languages on a platform I come to the conclusion that there are both underlanguages and normal languages, I choose the latter ones.

And yes, Perl/Python/Ruby can become underlanguages.

DOS batch language is definitely an underlanguage, and even though many many years ago I knew it somewhat, now I wouldn't consider learning it. For example, for Windows there is portable "Strawberry Perl", so if I need to do massive scripting under Windows, I'll use that Perl instead of DOS batch language.

ghostdog74 08-24-2010 08:33 AM

Quote:

Originally Posted by Sergei Steshenko (Post 4076079)
My technique is purely subjective - if after making an overview of easily available languages on a platform I come to the conclusion that there are both underlanguages and normal languages, I choose the latter ones.

Since it's subjective, it will apply to anyone else as well. Sometimes one doesn't need "normal languages". You have also not defined how you "measure" "underlanguages", whatever that means.

Quote:

And yes, Perl/Python/Ruby can become underlanguages.

So, what's your conclusion? Once Perl/Python/Ruby have become "underlanguages", will you advise people that they should not waste their time learning them?

Quote:

DOS batch language is definitely an underlanguage, and even though many many years ago I knew it somewhat, now I wouldn't consider learning it.

Yes, but that doesn't mean it doesn't have any uses, right? In situations where you can't install anything on a Win32 machine, one has to use what's available.

Coming back to the main point of the argument. You mentioned awk is an "underlanguage", and later you mentioned that you did not say it's not suitable for parsing. I take it that you agree awk can do the job for this task (even though it falls under YOUR definition of "underlanguage"). So we can stop this useless argument already, right?

Sergei Steshenko 08-24-2010 08:54 AM

Quote:

Originally Posted by ghostdog74 (Post 4076115)
Since it's subjective, it will apply to anyone else as well. Sometimes one doesn't need "normal languages". You have also not defined how you "measure" "underlanguages", whatever that means.

So, what's your conclusion? Once Perl/Python/Ruby have become "underlanguages", will you advise people that they should not waste their time learning them?

Yes, but that doesn't mean it doesn't have any uses, right? In situations where you can't install anything on a Win32 machine, one has to use what's available.

Coming back to the main point of the argument. You mentioned awk is an "underlanguage", and later you mentioned that you did not say it's not suitable for parsing. I take it that you agree awk can do the job for this task (even though it falls under YOUR definition of "underlanguage"). So we can stop this useless argument already, right?

The argument is that it is senseless to learn 'awk' in case of massive scientific data on the horizon. And in general it is senseless to learn a sea of underlanguages.

The only place for underlanguages is systems with limited resources, like tiny embedded ones - not the case here.

ghostdog74 08-24-2010 09:17 AM

Quote:

Originally Posted by Sergei Steshenko (Post 4076139)
The argument is that it is senseless to learn 'awk' in case of massive scientific data on the horizon.

Baseless assumptions built on one isolated case. You have not provided data to back up your claim that awk cannot handle massive scientific tasks. Again, you are assuming that the OP has massive scientific data to process. Also, you keep avoiding my question about Perl/Python/Ruby being called "underlanguages" in the future. Is it senseless to learn them if they ever become "underlanguages"? All your comments up till now are crap if you can't answer that truthfully.

Quote:

And in general it is senseless to learn a sea of underlanguages.

So you agree, and would advise people not to use Perl/Python/Ruby if they ever become underlanguages? Correct? You have also not told us what your definition of an "underlanguage" is.

Quote:

The only place for underlanguages is systems with limited resources, like tiny embedded ones - not the case here.

It is also not the case here that "underlanguages" like awk, AS YOU DEFINED IT, cannot do the job the OP asked for. So for this example, do you now think that it's useless to learn "underlanguages"?

grail 08-24-2010 09:26 AM

@ Sergei & ghostdog - guys, I realise that you both believe passionately in what you have to say, but although the discussion is loosely based on this question, you seem to be arguing with each other more than helping the OP. Far be it from me to complain about either of you, as I respect both of you in your given strengths and always read the solutions that both of you post.

Please let us just present the solutions we feel will work and then as with all things on LQ let the OP decide which option they prefer to follow :)
If they are clever, they will give both the due merit as I know this is how I have been learning whilst participating in the forum.

Cheers
Grail

grail 08-24-2010 09:30 AM

Feynman - I know that in the first post you listed the things you would like to achieve, and in post #17 you provided some data. Perhaps you could show, using the data provided, what your output would be for each step, or for all of them?

ghostdog74 08-24-2010 09:35 AM

Quote:

Originally Posted by grail (Post 4076175)
Please let us just present the solutions we feel will work and then as with all things on LQ let the OP decide which option they prefer to follow :)
Cheers
Grail

Please note that I have already presented my solutions. The tasks can be solved using awk as well. His solution is just to look for the "holy grail", which is non-existent.

Sergei Steshenko 08-24-2010 10:02 AM

Quote:

Originally Posted by ghostdog74 (Post 4076185)
Please note that I have already presented my solutions. The tasks can be solved using awk as well. His solution is just to look for the "holy grail", which is non-existent.

It's you who started using "holy grail" - I was talking about "overall optimization" WRT languages one invests his/her time in.

Feynman 08-24-2010 10:12 AM

Well, I do not know how to attach files. Please tell me how. In any case, I do not have very large files at this point. Actually, I was hoping in part that I could use these scripts to feed the output of smaller files into the input of other programs, so each output would contain more information to sift through.

[Bit of background here]
For my purposes, I might calculate the properties of a few small molecules in parallel, have the scripts grab some portion of the data (which would be easily identifiable based on the structure of the output file the chemistry program generates) and concatenate it into a new input file that asks for information about how they would interact. Automating this process would be wonderfully useful. I suspect "professionals" already have these scripts and a strong knowledge of whatever language they were written in at hand, but I am still an undergrad and have much to learn about my computational resources. I was hoping to put the final product on a website for free download and GNU usage. I suspect others like me will find it quite useful.
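
The workflow described above (grab an identifiable portion of each output file, then concatenate the pieces into a new input file) might look roughly like the sketch below. It is written in Python purely for illustration. The marker strings `BEGIN GEOMETRY` and `END GEOMETRY`, the header line, and the function names are all hypothetical; the real markers depend on the chemistry package, which the thread does not specify.

```python
def extract_block(text, start_marker, end_marker):
    """Return the lines between the first start_marker line and the next end_marker line."""
    block = []
    inside = False
    for line in text.splitlines():
        if not inside:
            # Keep scanning until the line that opens the block.
            if start_marker in line:
                inside = True
        elif end_marker in line:
            return block
        else:
            block.append(line)
    # No end marker found: return whatever was collected (possibly nothing).
    return block


def build_input(header, outputs, start_marker, end_marker):
    """Concatenate the extracted block from each output text under one header line."""
    parts = [header]
    for text in outputs:
        parts.extend(extract_block(text, start_marker, end_marker))
    return "\n".join(parts) + "\n"


# Two made-up output files, held as strings so the sketch is self-contained.
out_a = "junk\nBEGIN GEOMETRY\nH 0.0 0.0 0.0\nEND GEOMETRY\nmore junk\n"
out_b = "BEGIN GEOMETRY\nO 0.0 0.0 1.0\nEND GEOMETRY\n"

new_input = build_input(
    "# hypothetical interaction job", [out_a, out_b],
    "BEGIN GEOMETRY", "END GEOMETRY",
)
print(new_input, end="")
```

Keeping the extraction step separate from the assembly step means only the two marker strings need to change when the output format of the chemistry program changes.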

Anyway, I can certainly copy and paste some example input files I was starting out with (these came as tests for one of the chemistry packages I am working with). Give me a second to boot up my virtual Debian. I will post them in the next reply.

Sergei Steshenko 08-24-2010 10:14 AM

Quote:

Originally Posted by Feynman (Post 4076210)
Well, I do not know how to attach files. Please tell me how. ...

When you press the "Quote" button to answer a post, a "Manage Attachments" button should appear in the lower left part of your browser screen.


All times are GMT -5.