LinuxQuestions.org

endfx 11-17-2008 08:27 AM

Need 'cut' with multi char delimiter
 
I need to tokenize a very long line of characters. cut would work perfectly, but I need to use more than one character for the delimiter.

Does anybody have any ideas?
I don't have perl, but I do have most of the standard Linux command line tools available: cut, sed, awk ...

Thanks

H_TeXMeX_H 11-17-2008 09:18 AM

Well, I'm probably not familiar with much programming terminology or jargon, but could you provide a more concrete example of what needs to be done? I personally don't get it.

colucix 11-17-2008 09:20 AM

You can try awk, since its Field Separator can be either a single character or a regular expression. For example, if I have
Code:

$ cat testfile
Hellodelimworlddelim!!!

and I want to split this line using "delim" as delimiter:
Code:

$ awk -F"delim" '{print $1; print $2; print $3}' testfile
Hello
world
!!!

How you split the fields and store the result depends on your specific needs.
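
If the number of tokens is not known in advance, one possible sketch is to loop over NF instead of naming $1, $2 and $3. The helper name split_on is made up for illustration and is not part of any tool. Note that awk treats a multi-character separator as a regular expression, so a delimiter containing characters such as . or | would need escaping.

```shell
# Hypothetical helper: print each token of stdin on its own line,
# splitting on the separator passed as $1.
# awk treats a multi-character field separator as a regular expression.
split_on() {
    awk -F "$1" '{ for (i = 1; i <= NF; i++) print $i }'
}

# Same sample line as in testfile above.
printf '%s\n' 'Hellodelimworlddelim!!!' | split_on 'delim'
```

Run on the sample line, this should print Hello, world and !!! on separate lines, the same as the three explicit print statements.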

zer0x333 11-17-2008 09:30 AM

awk
 
Hi,

I just had a quick play with awk; it looks like the field separator option accepts a regular expression, so it can match several alternatives.

e.g. ',' and '=' using OR ..

Code:

awk -F '(,|=)' '{print $2}' inputfile

Please excuse the bad example! xD

HTH, zer0x
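
A slightly more concrete sketch of the same idea, assuming a made-up input line (the name=alice,uid=1000 data is only for illustration). When every alternative is a single character, a bracket expression should split the same way as the alternation.

```shell
# Illustrative input, not from the thread: fields separated by ',' or '='.
line='name=alice,uid=1000'

# A bracket expression matches any one of the listed characters.
printf '%s\n' "$line" | awk -F '[,=]' '{ print $2 }'

# The same split written as an alternation.
printf '%s\n' "$line" | awk -F '(,|=)' '{ print $2 }'
```

Both commands print the second field, alice.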

jan61 11-17-2008 01:42 PM

Moin,

Do you mean that the delimiter is not a multi-character string, but sometimes one character and sometimes a different one? In this case the simplest way would be to redefine IFS (Input Field Separator) and then use a simple loop:
Code:

jan@jack:~/tmp> IFS=',=
> '
jan@jack:~/tmp> echo "a,b=c" | while read f1 f2 f3; do
> echo "f1=$f1 f2=$f2 f3=$f3"
> done
f1=a f2=b f3=c

Jan
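
A possible variation on the IFS approach, as a sketch: putting the assignment directly in front of read should limit the changed IFS to that one command, so the rest of the shell session keeps its default word splitting. The function name split3 is hypothetical.

```shell
# Hypothetical helper: split one line from stdin on ',' or '='.
# IFS is assigned only for the read command, so the shell's own IFS
# is left as it was. The variables f1, f2 and f3 are plain globals.
split3() {
    IFS=',=' read -r f1 f2 f3
    printf 'f1=%s f2=%s f3=%s\n' "$f1" "$f2" "$f3"
}

echo "a,b=c" | split3
```

This prints f1=a f2=b f3=c, as in the loop above.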

virteman 03-04-2009 01:37 AM

cut's delimiter must be a single character. You can first replace your multi-character delimiter with a single special character that does not occur in the data, then use cut.
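
A minimal sketch of that idea, using sed for the replacement. The helper name cut_multi and the choice of '|' as the placeholder are assumptions for illustration: it only works if '|' never appears in the data and the delimiter contains no characters that are special in a sed pattern.

```shell
# Hypothetical helper: print field(s) $2 of each stdin line, where fields
# are separated by the literal string $1.
# Assumptions: the delimiter contains no sed-special characters
# (/ . * [ ] \ ^ $) and the placeholder '|' never occurs in the data.
cut_multi() {
    sed "s/$1/|/g" | cut -d '|' -f "$2"
}

printf '%s\n' 'Hellodelimworlddelim!!!' | cut_multi 'delim' 2
```

For the sample line this prints world. Asking for several fields, for example 1,3, returns them joined by the placeholder character.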


All times are GMT -5.