[SOLVED] Converting multiple lines of text to a single line of text with commas separating them

cogiz · 10-06-2020, 09:57 AM

I have a file called abc.txt

00:00:00:00
00:02:59:90
00:09:08:50

I would like it to read as follows:

00:00:00:00,00:02:59:90,00:09:08:50

I have tried

Code:

tr "n" " " < abc.txt > xyz.txt

but to no avail.
Any help or assistance is greatly appreciated.
Thank you.

Cogiz

Sefyir · 10-06-2020, 10:23 AM

Code:

tr "n" " " < abc.txt > xyz.txt

Code:

tr "\n" "," < abc.txt > xyz.txt

This will leave a trailing ,
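One possible way to drop that trailing comma is to pipe the result through sed. This is only a sketch: the temporary file stands in for abc.txt, and the variable names are made up for illustration.

```shell
# Temporary stand-in for abc.txt (hypothetical sample data from the first post).
tmp=$(mktemp)
printf '%s\n' 00:00:00:00 00:02:59:90 00:09:08:50 > "$tmp"

# tr turns every newline into a comma, including the final one.
with_trailing=$(tr '\n' ',' < "$tmp")

# sed removes the single comma left at the end of the line.
result=$(tr '\n' ',' < "$tmp" | sed 's/,$//')

printf '%s\n' "$with_trailing" "$result"
rm -f "$tmp"
```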

danielbmartin · 10-06-2020, 10:38 AM

With this InFile ...

Code:

00:00:00:00
00:02:59:90
00:09:08:50

... this paste ...

Code:

paste -sd, <$InFile >$OutFile

... produced this OutFile ...

Code:

00:00:00:00,00:02:59:90,00:09:08:50

No trailing comma.
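For anyone wondering why paste behaves this way: -s joins the lines of a file serially into one line, and -d, sets the delimiter to a comma. Delimiters go only between lines, and the output still ends with one newline. A small sketch, assuming a temporary file in place of $InFile:

```shell
# Temporary stand-in for $InFile (hypothetical sample data).
tmp=$(mktemp)
printf '%s\n' 00:00:00:00 00:02:59:90 00:09:08:50 > "$tmp"

# -s: join all lines of the file serially; -d,: put a comma between them.
joined=$(paste -s -d, "$tmp")

# Count the newlines in the output; the arithmetic expansion strips any padding wc adds.
lines=$(( $(paste -s -d, "$tmp" | wc -l) ))

printf '%s\n' "$joined"
rm -f "$tmp"
```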

Daniel B. Martin


cogiz · 10-06-2020, 12:09 PM

Perhaps I should give you more information regarding my problem.

I have a file named chapters.txt (see below):

Chapter 1 = 00:00:00:00
Chapter 2 = 00:02:59:90
Chapter 3 = 00:09:08:50
Chapter 4 = 00:16:40:23
Chapter 5 = 00:24:22:66
Chapter 6 = 00:26:53:56
Chapter 7 = 00:34:57:30
Chapter 8 = 00:39:11:40
Chapter 9 = 00:43:44:86
Chapter 10 = 00:50:24:33
Chapter 11 = 00:53:23:66
Chapter 12 = 00:58:28:23
Chapter 13 = 01:04:48:50
Chapter 14 = 01:15:07:30
Chapter 15 = 01:24:48:23
Chapter 16 = 01:26:59:73
Chapter 17 = 01:28:53:73
Chapter 18 = 01:32:56:40
Chapter 19 = 01:37:13:90
Chapter 20 = 01:43:03:36

So far I have done this:

Code:

cat chapters.txt | cut -c13- | sed -e 's/^[ \t]*//' > newchapters.txt

which produces this:

00:00:00:00
00:02:59:90
00:09:08:50
00:16:40:23
00:24:22:66
00:26:53:56
00:34:57:30
00:39:11:40
00:43:44:86
00:50:24:33
00:53:23:66
00:58:28:23
01:04:48:50
01:15:07:30
01:24:48:23
01:26:59:73
01:28:53:73
01:32:56:40
01:37:13:90
01:43:03:36

What I need now is, instead of printing each one on a new line, to print them as a single line with a comma between chapters, as follows:

00:00:00:00,00:02:59:90,00:09:08:50,00:16:40:23,00:24:22:66,00:26:53:56,00:34:57:30,00:39:11:40,00:43:44:86,00:50:24:33,00:53:23:66,00:58:28:23,01:04:48:50,01:15:07:30,01:24:48:23,01:26:59:73,01:28:53:73,01:32:56:40,01:37:13:90,01:43:03:36

I have tried

Code:

paste -sd, <$infile >$outfile

also

Code:

tr "\n" "," < infile > outfile

to no avail.

Any further assistance will be greatly appreciated.

Thank you very much for all your replies.

Cogiz

danielbmartin · 10-06-2020, 12:39 PM

With this InFile ...

Code:

Chapter 1 = 00:00:00:00
Chapter 2 = 00:02:59:90
Chapter 3 = 00:09:08:50
Chapter 4 = 00:16:40:23
Chapter 5 = 00:24:22:66
Chapter 6 = 00:26:53:56
Chapter 7 = 00:34:57:30
Chapter 8 = 00:39:11:40
Chapter 9 = 00:43:44:86
Chapter 10 = 00:50:24:33
Chapter 11 = 00:53:23:66
Chapter 12 = 00:58:28:23
Chapter 13 = 01:04:48:50
Chapter 14 = 01:15:07:30
Chapter 15 = 01:24:48:23
Chapter 16 = 01:26:59:73
Chapter 17 = 01:28:53:73
Chapter 18 = 01:32:56:40
Chapter 19 = 01:37:13:90
Chapter 20 = 01:43:03:36

... this code ...

Code:

 cut -c13- <$InFile    \
|sed -e 's/^[ \t]*//'  \
|paste -sd,            \
>$OutFile

... produced this OutFile ...

Code:

00:00:00:00,00:02:59:90,00:09:08:50,00:16:40:23,00:24:22:66,00:26:53:56,00:34:57:30,00:39:11:40,00:43:44:86,00:50:24:33,00:53:23:66,00:58:28:23,01:04:48:50,01:15:07:30,01:24:48:23,01:26:59:73,01:28:53:73,01:32:56:40,01:37:13:90,01:43:03:36

Daniel B. Martin


pan64 · 10-06-2020, 12:56 PM

Code:

awk '{ ORS=","; printf $NF } '

There are still ways to improve this.

boughtonp · 10-06-2020, 01:06 PM

This produces a trailing comma, but is only a single command:

Code:

awk '{printf $4 ","}' chapters.txt

If the comma is a problem, there are a couple of ways to deal with it:

Code:

awk '{printf $4 ","}' chapters.txt | head -c-1
awk '{printf "," $4}' chapters.txt | tail -c+2

(As noted later in the thread, using printf like this will cause errors/incorrect results if the input contains percent signs. That's not the case for the given input, but if it were, using printf "%s," , $4 would solve it.)
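To illustrate the percent-sign point, here is a sketch with a made-up input line (the "50%done" value is hypothetical and does not appear in chapters.txt). With an explicit format string the field is passed as data, so the percent sign is printed literally.

```shell
# Hypothetical input line whose fourth field contains a percent sign.
line='Chapter 1 = 50%done'

# The field is an argument to the "%s," format, never the format itself.
safe=$(printf '%s\n' "$line" | awk '{ printf "%s,", $4 }')

printf '%s\n' "$safe"
```

Using the field itself as the format (printf $4 ",") would make awk try to interpret %d as a conversion with no matching argument.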

Although if we can't avoid multiple commands, we can use awk to extract the last field in the row ($NF) then paste to replace newlines with comma:

Code:

awk '{print $NF}' chapters.txt  | paste -sd,
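One reason to prefer $NF over a fixed column offset: the "Chapter 10" lines are one character wider than the "Chapter 9" lines, so cut -c13- leaves a leading space that then has to be stripped. A sketch comparing the two, assuming a two-line temporary sample rather than the full chapters.txt:

```shell
# Two lines with prefixes of different widths (hypothetical excerpt of chapters.txt).
tmp=$(mktemp)
printf '%s\n' 'Chapter 9 = 00:43:44:86' 'Chapter 10 = 00:50:24:33' > "$tmp"

# Fixed column offset: the second line keeps a leading space.
by_column=$(cut -c13- "$tmp" | paste -s -d, -)

# Last whitespace-separated field: independent of the prefix width.
by_field=$(awk '{ print $NF }' "$tmp" | paste -s -d, -)

printf '%s\n' "$by_column" "$by_field"
rm -f "$tmp"
```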

danielbmartin · 10-06-2020, 07:10 PM

Perhaps I will be accused of torturing sed.

With this InFile ...

Code:

Chapter 1 = 00:00:00:00
Chapter 2 = 00:02:59:90
Chapter 3 = 00:09:08:50
Chapter 4 = 00:16:40:23
Chapter 5 = 00:24:22:66
Chapter 6 = 00:26:53:56
Chapter 7 = 00:34:57:30
Chapter 8 = 00:39:11:40
Chapter 9 = 00:43:44:86
Chapter 10 = 00:50:24:33
Chapter 11 = 00:53:23:66
Chapter 12 = 00:58:28:23
Chapter 13 = 01:04:48:50
Chapter 14 = 01:15:07:30
Chapter 15 = 01:24:48:23
Chapter 16 = 01:26:59:73
Chapter 17 = 01:28:53:73
Chapter 18 = 01:32:56:40
Chapter 19 = 01:37:13:90
Chapter 20 = 01:43:03:36

... this sed ...

Code:

sed -e :a -e '$!N;s/\n/,/;ta s/\(Chapter [0-9]* = \)//g' $InFile >$OutFile

... produced this OutFile ...

Code:

00:00:00:00,00:02:59:90,00:09:08:50,00:16:40:23,00:24:22:66,00:26:53:56,00:34:57:30,00:39:11:40,00:43:44:86,00:50:24:33,00:53:23:66,00:58:28:23,01:04:48:50,01:15:07:30,01:24:48:23,01:26:59:73,01:28:53:73,01:32:56:40,01:37:13:90,01:43:03:36

Daniel B. Martin


astrogeek · 10-06-2020, 11:06 PM

Here is a simple awk which does not add the trailing comma and also ignores empty lines or anything that does not begin with "Chap":

Code:

awk '/^Chap/{if(n++){printf ","} printf "%s", $4}' chapters.txt

And another sed|tr|sed pipeline to do the same thing:

Code:

sed -r 's/^ch.*= ([0-9:]+)/\1,/i' chapters.txt | tr -d '\n' | sed -r 's/,$//'

UPDATE: Added correct printf specifier per post #10

MadeInGermany · 10-07-2020, 01:22 AM

Always printf unknown contents with a format string like "%s,"

With a "delayed" separator:

Code:

awk 'NF>=4 {printf (sep "%s"), $4; sep=","}' chapters.txt

The separator sep is known content, so it is safe to include it in the format string.
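A variation on the delayed separator, sketched under the assumption that a final newline is wanted: keep sep out of the format string entirely and print a newline in an END block only if something was output. The temporary file and its contents are made up for illustration, including one blank line to show that short lines are skipped.

```shell
# Hypothetical sample with a blank line in the middle.
tmp=$(mktemp)
printf '%s\n' 'Chapter 1 = 00:00:00:00' '' 'Chapter 2 = 00:02:59:90' > "$tmp"

# sep is empty for the first match and "," afterwards; END adds the final newline.
result=$(awk 'NF >= 4 { printf "%s%s", sep, $4; sep = "," }
              END     { if (sep != "") print "" }' "$tmp")

# Count newlines in the raw output to confirm exactly one was written.
lines=$(( $(awk 'NF >= 4 { printf "%s%s", sep, $4; sep = "," }
                 END     { if (sep != "") print "" }' "$tmp" | wc -l) ))

printf '%s\n' "$result"
rm -f "$tmp"
```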

pan64 · 10-07-2020, 02:19 AM

I think post #6 solves the separator issue "automatically".

danielbmartin · 10-07-2020, 06:29 AM

Quote:

Originally Posted by astrogeek

Here is a simple awk which does not add the trailing comma and also ignores empty lines or anything that does not begin with "Chap":

Code:

awk '/^Chap/{if(n++){printf ","} printf $4}' chapters.txt

And another sed|tr|sed pipeline to do the same thing:

Code:

sed -r 's/^ch.*= ([0-9:]+)/\1,/i' chapters.txt | tr -d '\n' | sed -r 's/,$//'

Is there something missing from these solutions?

This one ...

Code:

sed -e :a -e '$!N;s/\n/,/;ta s/\(Chapter [0-9]* = \)//g' $InFile >$OutFile
cat $OutFile; echo "EOF"

... produced this ...

Code:

00:00:00:00,00:02:59:90,00:09:08:50,00:16:40:23,00:24:22:66,00:26:53:56,00:34:57:30,00:39:11:40,00:43:44:86,00:50:24:33,00:53:23:66,00:58:28:23,01:04:48:50,01:15:07:30,01:24:48:23,01:26:59:73,01:28:53:73,01:32:56:40,01:37:13:90,01:43:03:36
EOF

... but this one ...

Code:

awk '/^Chap/{if(n++){printf ","} printf $4}' $InFile >$OutFile
cat $OutFile; echo "EOF"

... produced this ...

Code:

00:00:00:00,00:02:59:90,00:09:08:50,00:16:40:23,00:24:22:66,00:26:53:56,00:34:57:30,00:39:11:40,00:43:44:86,00:50:24:33,00:53:23:66,00:58:28:23,01:04:48:50,01:15:07:30,01:24:48:23,01:26:59:73,01:28:53:73,01:32:56:40,01:37:13:90,01:43:03:36EOF

... and this one ...

Code:

sed -r 's/^ch.*= ([0-9:]+)/\1,/i' $InFile | tr -d '\n' | sed -r 's/,$//' >$OutFile
cat $OutFile; echo "EOF"

... did the same thing.
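What is missing in those two cases is the final newline, which is why EOF lands on the same line. A sketch of one way to detect and repair that, assuming a temporary file in place of $OutFile:

```shell
# Temporary stand-in for $OutFile, written without a final newline.
tmp=$(mktemp)
printf '%s' '00:00:00:00,00:02:59:90' > "$tmp"

# tail -c 1 yields the last byte; wc -l reports 0 if it is not a newline.
before=$(( $(tail -c 1 "$tmp" | wc -l) ))

# Append a newline only when it is missing.
if [ "$before" -eq 0 ]; then
    echo >> "$tmp"
fi

after=$(( $(tail -c 1 "$tmp" | wc -l) ))
content=$(cat "$tmp")

printf 'before=%s after=%s\n' "$before" "$after"
rm -f "$tmp"
```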

Daniel B. Martin


boughtonp · 10-07-2020, 08:01 AM

Quote:

Originally Posted by pan64

I think post #6 solves the separator issue "automatically".

What version did you test it with? Here's the output from GNU Awk versions 4.1.3 and 4.2.1:

Code:

$ awk '{ ORS=","; printf $NF } ' chapters.txt
00:00:00:0000:02:59:9000:09:08:5000:16:40:2300:24:22:6600:26:53:5600:34:57:3000:39:11:4000:43:44:8600:50:24:3300:53:23:6600:58:28:2301:04:48:5001:15:07:3001:24:48:2301:26:59:7301:28:53:7301:32:56:4001:37:13:9001:43:03:36

You did avoid the trailing comma, but that's because printf doesn't output field or record separators, and thus there are no commas at all.
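The difference is easy to see side by side. In this sketch the three-line input (a, b, c) is made up for brevity: print appends ORS after every record, while printf writes only what its format says.

```shell
# print appends ORS (a comma here) after each record, including the last.
with_print=$(printf '%s\n' a b c | awk 'BEGIN { ORS = "," } { print $NF }')

# printf ignores ORS completely, so nothing separates the records.
with_printf=$(printf '%s\n' a b c | awk 'BEGIN { ORS = "," } { printf "%s", $NF }')

printf '%s\n' "$with_print" "$with_printf"
```

So the print form does produce commas, but it also leaves a trailing one.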

boughtonp · 10-07-2020, 08:09 AM

Quote:

Originally Posted by MadeInGermany

Always printf unknown contents with a format string like "%s,"

That's good advice for arbitrary input, but surely it's not necessary if the input is known not to contain percent signs?

pan64 · 10-07-2020, 08:14 AM

Quote:

Originally Posted by boughtonp

What version did you test it with? Here's the output from GNU Awk versions 4.1.3 and 4.2.1:

Code:

$ awk '{ ORS=","; printf $NF } ' chapters.txt
00:00:00:0000:02:59:9000:09:08:5000:16:40:2300:24:22:6600:26:53:5600:34:57:3000:39:11:4000:43:44:8600:50:24:3300:53:23:6600:58:28:2301:04:48:5001:15:07:3001:24:48:2301:26:59:7301:28:53:7301:32:56:4001:37:13:9001:43:03:36

You did avoid the trailing comma, but that's because printf doesn't output field or record separators, and thus there are no commas at all.

Obviously that was wrong (the solution was incomplete). It works with print instead of printf:

Code:

$ awk 'BEGIN { ORS="," } { print $NF } ' chapters.txt