sed challenge: data mining
Hello All,
I am working on a project and ran into this problem: I need to extract certain information from a text file. For example:

YES_CHICK
agagagagadagatdgatagatfgatagatagag
agagagagatagatagatagtagatatagtatagta
fgafgatatatatatattgtgatgatgatgatgat
YES_HUMAN
sgsgsgsgasgafafafatsfgsgsfsgsfgsfsg
fgsfstfstsgtsgstsgstsgtsgstsgstsgts
gstsstsgstsgtsgstsgstsgtsgsststgstgs
YES_DEMON
fgsdgddghudghgdgshghdghsghgdhgdsgdhghd
fgdshdghdshgdshgdsgdhghdsghdshgdsgsh
gshgdhsgdhgsdgshghdsghdsghgdhgshgdhgsh

I want to extract records from it. For example, if the user query is YES_HUMAN, I should get all the lines after YES_HUMAN up to, but not including, YES_DEMON. I have worked with sed many times before, but I am having trouble with this. I am sure it is possible. If you think it is not possible, what other options do I have? Any C++ code would also be of great help.

Thanks and regards to all,
FAHAD SAEED
|
Hi.
Have a read of this: http://enterprise.linux.com/article....33253&from=rss
The 'Searching, browsing, and exporting records' bit should be particularly interesting.
Dave
|
It still won't work... :( Any other ideas?
|
just one way
Code:
sed -n "/YES_CHICK/,/YES_HUMAN/{/YES_*/!p}" yourfile
|
The code by Ghostdog doesn't work for me in bash. I admit I don't understand the code either; otherwise I would have tried to fix it.
In these cases, I think awk is your friend. It really pays off to grasp the concept of awk. Once you do, it only takes a few minutes to create a script for this kind of processing. Awk was written for this purpose. :) Writing this post took me longer than writing the script. This is the script:
Code:
BEGIN {
Code:
YES_CHICKEN
it gives this output:
donald_pc:/tmp$ cat yesfile | awk -v flavour=YES_ -f yes.awk
donald_pc:/tmp$ cat yesfile | awk -v flavour=YES_m -f yes.awk
donald_pc:/tmp$ cat yesfile | awk -v flavour=YES_BIRD -f yes.awk
1. bird bird
2. bird bird
3. bird bird
4. bird bird
donald_pc:/tmp$ cat yesfile | awk -v flavour=YES_CHICKEN -f yes.awk
1. chicken chicken
2. chicken chicken
3. chicken chicken
4. chicken chicken
donald_pc:/tmp$
If you want the query string to show up before the data lines, change
if (pflag == 1 && $0 !~ flavour ){
to
if (pflag == 1){
On the command line, "-v flavour" passes a command line parameter to the awk script. I know that there are awk gurus who can do this much more elegantly, and put it all on one line. This script is readable though :D
Let me know if this works for you.
jlinkels
|
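The script itself is cut off in this copy of the thread, so what follows is only a minimal sketch of the same idea, not jlinkels' original. It keeps the `pflag` and `flavour` names from the post; the sample data, the `extract_record` helper name, and the rule that every header line starts with `YES_` are assumptions made for illustration.

```shell
# Hypothetical sample file, modelled on the output shown in the post.
yesfile=$(mktemp)
printf '%s\n' \
  'YES_BIRD' '1. bird bird' '2. bird bird' \
  'YES_CHICKEN' '1. chicken chicken' '2. chicken chicken' > "$yesfile"

# extract_record FLAVOUR FILE (helper name is an assumption):
# print the data lines that follow the header line equal to FLAVOUR,
# stopping at the next line that starts with YES_.
extract_record() {
  awk -v flavour="$1" '
    /^YES_/ { pflag = ($0 == flavour); next }  # header: turn printing on or off
    pflag   { print }                          # data line of the requested record
  ' "$2"
}

extract_record YES_CHICKEN "$yesfile"
```

Because the header has to equal the query exactly, a partial query such as YES_ prints nothing. Removing the `next` would also print the matching header line before its data lines.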
Thank you all :) That was very nice of you.
The code did work for the input file that I gave, BUT there is one more hurdle. The original data file is
Code:
>YES_CHICKEN
and the code that jlinkels gave did not work with this data file, so I tried to modify the code and did this
Code:
BEGIN {
Code:
cat yesfile | awk -v flavour=>YES_BIRD -f yes.awk
|
If you are going to put > before the field, then use quotes in the awk statement. Otherwise the shell treats the unquoted > as output redirection instead of passing it to awk.
Code:
cat file.txt
Code:
awk -v flavour=">YES_BIRD" -f yes.awk file.txt
Code:
sed -n '/>YES_HUMAN/,/>YES_BIRD/{/>YES_BIRD/!p}' file.txt
|
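The sed command above names the next header explicitly. As a variation, the range can end at whichever header comes next, so only the query has to be known. This is a sketch under the assumption that every record starts with a line beginning `>YES_`; the sample data and the `query` variable are made up for illustration.

```shell
# Hypothetical sample data in the ">"-prefixed layout.
datafile=$(mktemp)
printf '%s\n' \
  '>YES_CHICK' 'agag' 'tata' \
  '>YES_HUMAN' 'sgsg' 'fgsf' \
  '>YES_DEMON' 'fgsd' > "$datafile"

query='>YES_HUMAN'   # header to look up (assumed to contain no regex metacharacters)

# Range: from the query header to the next header line.
# Inside the range, print every line that is not itself a header.
# The sed script stays in single quotes; only $query is spliced in.
human=$(sed -n '/^'"$query"'$/,/^>YES_/{/^>YES_/!p;}' "$datafile")
printf '%s\n' "$human"
```

If the query is the last record in the file, no closing header is found and the range simply runs to the end of the file.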
Thank you so much for all of the help :)
The code did work for the modified data file. However, this did not work on my RedHat Linux 3.3:
Code:
sed -n "/YES_CHICK/,/YES_HUMAN/{/YES_*/!p}" yourfile
|
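No error message is quoted here, so the cause is a guess. One thing worth ruling out: in an interactive bash session, a `!` inside double quotes is subject to history expansion, so `!p}` can be rejected with an "event not found" error before sed ever runs. Single quotes avoid that. Note also that in a regular expression `YES_*` means `YES` followed by zero or more underscores; it is not a shell-style wildcard, and plain `YES_` does the job. A sketch with made-up sample data:

```shell
# Hypothetical sample data in the layout of the first post.
datafile=$(mktemp)
printf '%s\n' \
  'YES_CHICK' 'agag' 'tata' \
  'YES_HUMAN' 'sgsg' \
  'YES_DEMON' 'fgsd' > "$datafile"

# Same range as the command above, but single-quoted so that bash
# never sees "!p" as a history reference.
chick=$(sed -n '/YES_CHICK/,/YES_HUMAN/{/YES_/!p;}' "$datafile")
printf '%s\n' "$chick"
```

As written, this range prints the YES_CHICK record. To get the YES_HUMAN record from the original question, the range would be `/YES_HUMAN/,/YES_DEMON/`.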