How to split a string with awk or sed

xiutuo · 02-23-2018, 08:15 PM

I have a string like this, with fields separated by commas:
111,2222,33333,4444,555,6666666,77,

I want to split it into lines of at most 9 characters each. If a 9-character chunk does not end with a comma, print everything up to and including the last comma as one line, and move the characters after that comma to the next line.

The result should look like this:
111,2222,
33333,[the second chunk would be "33333,444", but it does not end with a comma, so print everything up to the last comma and move the remaining "444" to the next line]
4444,555,[this chunk is "4444,555,"; it ends with a comma and is exactly 9 characters]
6666666,[this chunk would be "6666666,7"; same situation as the second line]

So my question is how to do this with awk or sed.
Thank you for your help.

syg00 · 02-23-2018, 08:23 PM

You've been registered almost 10 years - you should know LQ is not a free coding centre.
Make an effort, explain where you are having problems, we'll try to point you in the right direction.

With the logic required, sed would be a waste of time. awk, perl, python, ... pick a language with logic you are comfortable with and use that

ondoho · 02-24-2018, 06:51 AM

i question what the point would be.
apparently the input follows a certain logic (comma-separated values), and it seems you are trying to break that logic. why?
wouldn't it be easier & better to fill an array with all the values, then work with that?

syg00 · 02-24-2018, 07:58 PM

Define "better" - the task can be simply done in awk using substr and index and loop over the input record(s).

Always multiple ways to skin a cat.
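
One way the substr-and-loop idea might look, as a minimal sketch rather than the code syg00 had in mind (none was posted). The function name split_at_commas and the variable w are made up for illustration. It scans backwards with substr instead of calling index, because index returns the first comma and this needs the last comma inside each 9-character chunk. A chunk with no comma at all is printed unchanged, which is an assumption about what the OP would want.

```shell
# Hypothetical helper: cut a comma-terminated string into lines of at most
# w characters, breaking after the last comma that fits in each chunk.
split_at_commas() {
  printf '%s\n' "$1" | awk -v w=9 '{
    s = $0
    while (length(s) > 0) {
      chunk = substr(s, 1, w)          # next candidate line, at most w chars
      cut = 0
      for (i = length(chunk); i > 0; i--)   # look for the last comma in it
        if (substr(chunk, i, 1) == ",") { cut = i; break }
      if (cut == 0) cut = length(chunk)     # assumption: no comma, keep chunk whole
      print substr(s, 1, cut)
      s = substr(s, cut + 1)           # continue with the remainder
    }
  }'
}

split_at_commas '111,2222,33333,4444,555,6666666,77,'
```

For the sample input this prints the four lines from the first post, followed by a fifth line holding the leftover "77,".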

ondoho · 02-25-2018, 12:12 AM

maybe i didn't understand the question.
did op mean to respect the commas as separators, but ensure that a line never gets longer than 9 characters?
that would imply that any one value would never be longer than 8 characters, i think.
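
Under that reading, the array idea could be sketched roughly as below. This is only an illustration: pack_fields and w are invented names, awk's own field splitting stands in for the array, and empty fields (such as the one after the trailing comma) are skipped, which would also collapse doubled commas. A value longer than 8 characters cannot fit within the limit, so the sketch simply prints it on an overlong line of its own.

```shell
# Hypothetical helper: treat commas as separators and pack "value," items
# onto lines that stay within w characters where possible.
pack_fields() {
  printf '%s\n' "$1" | awk -v w=9 -F, '{
    line = ""
    for (i = 1; i <= NF; i++) {
      if ($i == "") continue           # assumption: ignore empty fields
      item = $i ","
      # start a new line when the next item would push past the limit
      if (line != "" && length(line) + length(item) > w) {
        print line
        line = ""
      }
      line = line item
    }
    if (line != "") print line         # flush whatever is left
  }'
}

pack_fields '111,2222,33333,4444,555,6666666,77,'
```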

syg00 · 02-25-2018, 02:02 AM

Quote:

Originally Posted by ondoho

maybe i didn't understand the question.

You and me both maybe.
I think I got misled by the comments - maybe I'll re-look at it sometime.

pan64 · 02-25-2018, 03:13 AM

Looks like it should split at a comma, but only where the text would not otherwise fit into 9 characters (including commas). But who knows?
Yes, it can easily be implemented with awk; it is a bit harder with sed (looks like a challenge).
But first, please show us what you tried, what happened, and what your real problem is (where you got stuck) - and we will gladly help you move forward.

BudiKusasi · 02-25-2018, 07:46 AM

Code:

echo 111,2222,33333,4444,555,6666666,77,|sed -r 's/(.{0,8},)/\1\n/g'

Add the I flag at the end if you need case-insensitive matching of letters.

pan64 · 02-25-2018, 07:49 AM

yes, very nice

syg00 · 02-25-2018, 11:37 PM

Indeed - I was overthinking the issue way too much.

Suggested addition removed as it was an artifact of the way I was testing.

MadeInGermany · 02-26-2018, 03:25 AM

The {0,8} tries to match the maximum number of characters (greediness) - exactly what is required here.
In sed there is normally no need to put the entire search expression in parentheses, because the whole match can be referenced as & in the substitution string:

Code:

echo 111,2222,33333,4444,555,6666666,77,| sed -r 's/.{0,8},/&\n/g'

A traditional Unix sed expects a BRE, and a newline in the replacement is written as a \ followed by a literal newline:

Code:

echo 111,2222,33333,4444,555,6666666,77,| sed 's/.\{0,8\},/&\
/g'
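
One side effect worth knowing: because the input ends with a comma, the substitution also adds a newline after the last field, so the output ends with an empty line. If that matters, a second substitution can drop it. The sketch below assumes GNU sed, and the function name split_no_blank is made up for illustration.

```shell
# Assumes GNU sed: -r and \n in the replacement are GNU extensions.
split_no_blank() {
  # 1st command: break after the longest run of up to 8 chars plus a comma.
  # 2nd command: remove the newline left at the very end of the pattern space.
  printf '%s\n' "$1" | sed -r 's/.{0,8},/&\n/g; s/\n$//'
}

split_no_blank '111,2222,33333,4444,555,6666666,77,'
```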

xiutuo · 09-08-2018, 09:30 AM

Quote:

Originally Posted by BudiKusasi

Code:

echo 111,2222,33333,4444,555,6666666,77,|sed -r 's/(.{0,8},)/\1\n/g'

add the end I option for case-insensitive alphabet

Works fine. Thank you so much.

MadeInGermany · 09-10-2018, 09:27 AM

If the separator were a space (not a comma), then you could use fmt:

Code:

echo 111 2222 33333 4444 555 6666666 77 | fmt -9

fmt can also join lines:

Code:

echo 111 2222 33333 4444 555 6666666 77 | fmt -9 | fmt -20

That was for demonstration. Of course you can simply split at the wider width directly:

Code:

echo 111 2222 33333 4444 555 6666666 77 | fmt -20