Any Linux command output to delimited format. Script should work for any command.

torrelm@hotmail.com · 09-03-2014, 01:25 PM

I would like to develop a script capable of splitting my command results into delimited text. See the examples and the desired output below. It would be wonderful if the script could work for most Linux commands.

command used: $ finger

EXAMPLE 1: The column "Name" has 2 names separated by a blank space
---------
Login Name Tty Idle Login Time Office Office Phone
abcdefg Firstname Secondname *:0 Sep 2 09:58 ABCDE/32th (111)111-1111
abcdefg Firstname Secondname pts/1 1d Sep 2 09:58 (:0)
abcdefg Firstname Secondname *pts/2 Sep 3 15:08 (:0)
abcdefg Firstname Secondname pts/3 6:04 Sep 3 09:03 (:0.0)
abcdefg Firstname Secondname *pts/7 5:08 Sep 3 09:59 (:0)

EXAMPLE 2: The column "Name" now has 3 names separated by 2 blank spaces
---------
Login Name Tty Idle Login Time Office Office Phone
abcdefg Firstname Secondname Thirdname *:0 Sep 2 09:58 ABCDE/32th (111)111-1111
abcdefg Firstname Secondname Thirdname pts/1 1d Sep 2 09:58 (:0)
abcdefg Firstname Secondname Thirdname *pts/2 Sep 3 15:08 (:0)
abcdefg Firstname Secondname Thirdname pts/3 6:04 Sep 3 09:03 (:0.0)
abcdefg Firstname Secondname Thirdname *pts/7 5:08 Sep 3 09:59 (:0)

DESIRED OUTPUT:
--------------
Login|Name|Tty|Idle|Login Time|Office|Office Phone
abcdefg|Firstname Secondname|*:0|Sep 2 09:58|ABCDE/32th|(111)111-1111
abcdefg|Firstname Secondname|pts/1|1d Sep 2 09:58|(:0)|
abcdefg|Firstname Secondname|*pts/2|Sep 3 15:08|(:0)|
abcdefg|Firstname Secondname|pts/3|6:04 Sep 3 09:03|(:0.0)|
abcdefg|Firstname Secondname|*pts/7|5:08 Sep 3 09:59|(:0)|

abcdefg|Firstname Secondname Thirdname|*:0|Sep 2 09:58|ABCDE/32th|(111)111-1111
abcdefg|Firstname Secondname Thirdname|pts/1|1d Sep 2 09:58|(:0)|
abcdefg|Firstname Secondname Thirdname|*pts/2|Sep 3 15:08|(:0)|
abcdefg|Firstname Secondname Thirdname|pts/3|6:04 Sep 3 09:03|(:0.0)|
abcdefg|Firstname Secondname Thirdname|*pts/7|5:08 Sep 3 09:59|(:0)|

No matter how many words each column contains, I would like to split the results by individual column.

I am looking for a general solution that could work everywhere, not one specific to this command (finger).

MensaWater · 09-03-2014, 01:46 PM

Perfect candidate for sed:

finger | sed -e 's/\s\+/|/g'

ls -l | sed -e 's/\s\+/|/g'

ps -ef | sed -e 's/\s\+/|/g'

The s means "substitute". The item between the first forward slash and the second one is the pattern to search for to do substitution on and the item between the second and third is what to substitute with. The final g means do it globally on the line rather than just on the first one it finds.

For the first item the special characters shown:
The \s means "any whitespace" and the \+ means "one or more of the preceding item", so successive whitespace is treated as one (e.g. 3 spaces, 2 tabs, etc.).

Of course this may have limited use for some commands, since a field that itself contains whitespace (e.g. a file name that contains a space) will be split and misreported.
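To illustrate that caveat, here is a small sketch run on a made-up `ls -l` style line. The file name `my report.txt` and the helper name `to_pipes` are hypothetical. It uses `sed -E` with the POSIX class `[[:space:]]`, which should behave like the `\s\+` form on GNU sed and also works with BSD sed.

```shell
# Replace every run of whitespace with a single pipe.
to_pipes() {
    sed -E 's/[[:space:]]+/|/g'
}

# Hypothetical "ls -l" style line whose file name contains a space.
printf '%s\n' '-rw-r--r-- 1 user users 120 Sep  3 09:03 my report.txt' | to_pipes
```

The file name comes out as two fields (`my|report.txt`), and the date is split into three, which is the misreporting described above.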

Didier Spaier · 09-03-2014, 01:58 PM

Hi and welcome to LQ.

Unfortunately there's no general solution, because you have to take into account the semantics (the meaning of the fields) and their formatting. Furthermore, some fields may be empty for some records but not for others, and the number of white spaces within a given field can vary.

Often the command has formatting options that help identify which character string represents which field, for instance "finger -l".

When (but only when) you know what the content of each field is, you can insert separators in each record with the usual text processing commands like sed, awk, grep, or tr, to name a few, and with parameter expansion.
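As a rough sketch of what "knowing the content of each field" can look like for the finger lines in this thread: it assumes the login is always the first word and that the Tty field is the first word shaped like `pts/N`, `ttyN` or `:N`, optionally preceded by `*`. Both the pattern and the function name `split_finger_line` are assumptions made for illustration, not something finger guarantees. Everything after the Tty is left as one field, because Idle, Office and Office Phone may be empty and cannot be told apart by whitespace alone.

```shell
# Split a finger-style line into Login|Name|Tty|rest, using the Tty word
# as the anchor that marks where the Name field ends.
split_finger_line() {
    awk '{
        # Find the first word (after the login) that looks like a terminal.
        tty = 0
        for (i = 2; i <= NF; i++)
            if ($i ~ /^\*?(pts\/[0-9]+|tty[0-9]+|:[0-9]+)$/) { tty = i; break }

        # No terminal found (e.g. the header line): print unchanged.
        if (!tty) { print; next }

        # Words between the login and the terminal form the Name field.
        name = ""
        for (i = 2; i < tty; i++) name = (name == "" ? $i : name " " $i)

        # Everything after the terminal is kept together as one field.
        rest = ""
        for (i = tty + 1; i <= NF; i++) rest = (rest == "" ? $i : rest " " $i)

        print $1 "|" name "|" $tty "|" rest
    }'
}

# Sample line taken from EXAMPLE 2 in the first post.
printf '%s\n' 'abcdefg Firstname Secondname Thirdname pts/1 1d Sep 2 09:58 (:0)' |
    split_finger_line
```

This handles two or three names alike, but it only works because the Tty has a recognizable shape. A command without such an anchor field would need a different rule.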

keefaz · 09-03-2014, 02:10 PM

finger output fields are separated by 2 spaces or more, so better do

Code:

finger | sed -e 's/\s\s\+/|/g'

but I agree with Didier Spaier: if some fields are empty, the output is meaningless

[edit]
Just tested, the date was split (day number is 2 spaces after month) doh

torrelm@hotmail.com · 09-03-2014, 02:21 PM

finger output fields are not always separated by 2 spaces or more. Sometimes it uses only 1, and then the command does not work correctly.

First sed example did not work at all. Check the results of the second.

finger | sed -e 's/\s\s\+/|/g'

Login|Name|Tty|Idle|Login Time|Office|Office Phone
abcdefg|Firstname Secondname|*:0|Sep|2 09:58 ABCDE/32th|(111)111-1111
abcdefg|Firstname Secondname|pts/1|1d|Sep|2 09:58 (:0)
abcdefg|Firstname Secondname|*pts/2|1:06|Sep|3 15:08 (:0)
abcdefg|Firstname Secondname|pts/3|7:11|Sep|3 09:03 (:0.0)
abcdefg|Firstname Secondname|*pts/4|Sep|3 15:24 (:0)
abcdefg|Firstname Secondname|*pts/7|6:15|Sep|3 09:59 (:0)

MensaWater · 09-03-2014, 03:40 PM

Quote:

Originally Posted by torrelm@hotmail.com

First sed example did not work at all. Check the results of the second.

Which "first sed example"? The one I listed certainly works as I tested it before I sent it to you.

As I explained you do NOT need \s\s for multiples because the \s\+ deals with multiples.

suicidaleggroll · 09-03-2014, 03:48 PM

Quote:

Originally Posted by MensaWater

Which "first sed example"? The one I listed certainly works as I tested it before I sent it to you.

Look at his desired output again. sed is replacing ALL spaces with |, which is not what he wants. He wants each FIELD to be separated by a pipe, not each WORD to be separated by a pipe. Since there's no way to know which spaces are part of a field or between fields, there is no easy solution.

Your first sed example turns this:

Code:

abcdefg Firstname Secondname *:0 Sep 2 09:58 ABCDE/32th (111)111-1111

into this:

Code:

abcdefg|Firstname|Secondname|*:0|Sep|2|09:58|ABCDE/32th|(111)111-1111

when he wants this:

Code:

abcdefg|Firstname Secondname|*:0|Sep 2 09:58|ABCDE/32th|(111)111-1111

Note the space between "Firstname Secondname" and in the date.

Didier Spaier · 09-03-2014, 04:00 PM

Again, you need to have a reliable way to find field boundaries to do what you want, and the way output is formatted can vary depending on the options used, so there can't be a general answer.

Usually "man <command>" gives information about output formatting and its options.

torrelm@hotmail.com · 09-04-2014, 07:12 AM

Thank you all for the quick reply. I will take your feedback and will try to build a solution.

grail · 09-04-2014, 08:54 AM

I do not see a one-size-fits-all solution, given the changing format of your data. However, if we compare the first and second sets of data provided and say that the only difference is the number of names in the second field (Name), you could try looking at something like Perl or Ruby and use unpack to delimit fixed-width data, and then you would have 2 options for the two examples presented.
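The same fixed-width idea can be sketched with awk and substr instead of Perl or Ruby unpack, to stay with the shell tools used earlier in the thread. The column widths below (9, 22 and 8) and the function name `split_fixed` are made up for the demonstration. For real output you would have to measure the widths from the command's actual column layout, and they would differ between the two examples.

```shell
# split_fixed WIDTHS: cut each input line into columns of the given widths
# (a space-separated list). The last column takes the remainder of the line.
split_fixed() {
    awk -v widths="$1" '
    BEGIN { n = split(widths, w, " ") }
    {
        pos = 1; out = ""
        for (i = 1; i <= n; i++) {
            field = substr($0, pos, w[i])
            pos += w[i]
            gsub(/^ +| +$/, "", field)   # strip the column padding
            out = out field "|"
        }
        rest = substr($0, pos)
        gsub(/^ +| +$/, "", rest)
        print out rest
    }'
}

# Build a padded sample line with the same hypothetical widths, then split it.
printf '%-9s%-22s%-8s%s\n' 'abcdefg' 'Firstname Secondname' 'pts/1' '1d Sep 2 09:58' |
    split_fixed '9 22 8'
```

Because the split is by position rather than by whitespace, spaces inside a field survive and an empty column simply yields an empty field between two pipes.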