LinuxQuestions.org


mek 11-12-2009 11:52 AM

translating text with awk or sed
 
Hi all,

I would like to know how I can parse a line like this:

IT©Author«what is CP/M?*an OS

into something like:

Category: IT
Question: what is CP/M?
Answer: an OS
Author: Author

I have to parse a file with 60000 lines like that.
Do you know how I can write an awk command to accomplish this task?

Thanks in advance!

H_TeXMeX_H 11-12-2009 12:52 PM

Quote:

Originally Posted by mek (Post 3754439)
Do you know how I can write an awk command to accomplish this task?

Yes, I do know how to accomplish this task. But why not take a stab at it yourself first? At least try.

Try:
http://www.grymoire.com/Unix/Awk.html

H_TeXMeX_H 11-12-2009 01:29 PM

OK, here is a hint as to how I might do it:

Code:

awk '{ printf("Category: %s\n", substr($0, 1, index($0, "©") - 1)); }' test
Note that some trouble may come from the strange characters that are used.

pixellany 11-12-2009 01:40 PM

Why not first preprocess the file to get rid of all the strange characters? Then awk can just work on fields normally.

ghostdog74 11-12-2009 06:32 PM

Code:

# awk -v FS='\302\251|\302\253|[*]' '{print $1,$2,$3,$4}' file
IT Author what is CP/M? an OS

Do the rest yourself.
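
For reference, here is a minimal sketch of how "the rest" might look, assuming every line has exactly the layout from the first post (category, ©, author, «, question, *, answer). The function name parse_quiz is hypothetical, and the sketch would misplace fields if a question itself contained one of the three separators.

```shell
# parse_quiz is an illustrative name; it reads the quiz lines on stdin.
parse_quiz() {
    # \302\251 and \302\253 are the UTF-8 bytes of © and «;
    # [*] matches a literal asterisk.
    awk -v FS='\302\251|\302\253|[*]' '{
        # Input order is category, author, question, answer;
        # the requested output order puts the author last.
        printf "Category: %s\nQuestion: %s\nAnswer: %s\nAuthor: %s\n", $1, $3, $4, $2
    }'
}

# Try it on the sample line from the first post.
printf 'IT\302\251Author\302\253what is CP/M?*an OS\n' | parse_quiz
```

For the real 60000-line file this would be run as parse_quiz < file, with the output redirected wherever it is needed.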

mek 11-13-2009 07:57 AM

Thanks guys, I sorted out the issue by using sed to replace all those weird characters with commas, and then I used cut to extract the text to different files.

Thanks for your awk tips anyway, I'll play around with that.
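
The actual commands were not posted, so the following is only a sketch of what a sed-plus-cut approach could look like. It assumes the same line layout as the sample in the first post, and it swaps the comma for a tab as the delimiter, since question text may contain commas. The function name split_quiz and all file names are hypothetical.

```shell
# split_quiz INPUT OUTDIR (illustrative helper, not from the thread)
split_quiz() {
    src=$1
    outdir=$2
    copy=$(printf '\302\251')    # UTF-8 bytes of ©
    laquo=$(printf '\302\253')   # UTF-8 bytes of «
    tab=$(printf '\t')
    # Replace the first ©, « and * on each line with a tab.
    sed -e "s/$copy/$tab/" -e "s/$laquo/$tab/" -e "s/[*]/$tab/" "$src" > "$outdir/quiz.tsv"
    # cut splits on tabs by default; write one file per field.
    cut -f1 "$outdir/quiz.tsv" > "$outdir/category.txt"
    cut -f2 "$outdir/quiz.tsv" > "$outdir/author.txt"
    cut -f3 "$outdir/quiz.tsv" > "$outdir/question.txt"
    cut -f4 "$outdir/quiz.tsv" > "$outdir/answer.txt"
}

# Usage with a one-line sample file in a scratch directory.
scratch=$(mktemp -d)
printf 'IT\302\251Author\302\253what is CP/M?*an OS\n' > "$scratch/sample.txt"
split_quiz "$scratch/sample.txt" "$scratch"
cat "$scratch/question.txt"
```

Line N of each output file then belongs to line N of the input, so the files can be recombined later with paste if needed.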

