Any Linux command output to delimited format. Script should work for any command.
I would like to develop a script capable of splitting my command output into delimited text. See the examples and the desired output below. It would be wonderful if the script could work for most Linux commands.
Command used: $ finger

EXAMPLE 1: The column "Name" has 2 names separated by a blank space
---------
Login Name Tty Idle Login Time Office Office Phone
abcdefg Firstname Secondname *:0 Sep 2 09:58 ABCDE/32th (111)111-1111
abcdefg Firstname Secondname pts/1 1d Sep 2 09:58 (:0)
abcdefg Firstname Secondname *pts/2 Sep 3 15:08 (:0)
abcdefg Firstname Secondname pts/3 6:04 Sep 3 09:03 (:0.0)
abcdefg Firstname Secondname *pts/7 5:08 Sep 3 09:59 (:0)

EXAMPLE 2: The column "Name" now has 3 names separated by 2 blank spaces
---------
Login Name Tty Idle Login Time Office Office Phone
abcdefg Firstname Secondname Thirdname *:0 Sep 2 09:58 ABCDE/32th (111)111-1111
abcdefg Firstname Secondname Thirdname pts/1 1d Sep 2 09:58 (:0)
abcdefg Firstname Secondname Thirdname *pts/2 Sep 3 15:08 (:0)
abcdefg Firstname Secondname Thirdname pts/3 6:04 Sep 3 09:03 (:0.0)
abcdefg Firstname Secondname Thirdname *pts/7 5:08 Sep 3 09:59 (:0)

DESIRED OUTPUT:
--------------
Login|Name|Tty|Idle|Login Time|Office|Office Phone
abcdefg|Firstname Secondname|*:0|Sep 2 09:58|ABCDE/32th|(111)111-1111
abcdefg|Firstname Secondname|pts/1|1d Sep 2 09:58|(:0)|
abcdefg|Firstname Secondname|*pts/2|Sep 3 15:08|(:0)|
abcdefg|Firstname Secondname|pts/3|6:04 Sep 3 09:03|(:0.0)|
abcdefg|Firstname Secondname|*pts/7|5:08 Sep 3 09:59|(:0)|
abcdefg|Firstname Secondname Thirdname|*:0|Sep 2 09:58|ABCDE/32th|(111)111-1111
abcdefg|Firstname Secondname Thirdname|pts/1|1d Sep 2 09:58|(:0)|
abcdefg|Firstname Secondname Thirdname|*pts/2|Sep 3 15:08|(:0)|
abcdefg|Firstname Secondname Thirdname|pts/3|6:04 Sep 3 09:03|(:0.0)|
abcdefg|Firstname Secondname Thirdname|*pts/7|5:08 Sep 3 09:59|(:0)|

No matter how many words each column contains, I would like to split the results by individual column. I am looking for a general solution, not one specific to this command (finger), so that it could work everywhere. |
Perfect candidate for sed:
finger | sed -e 's/\s\+/|/g'
ls -l | sed -e 's/\s\+/|/g'
ps -ef | sed -e 's/\s\+/|/g'

The s means "substitute". The item between the first and second forward slashes is the pattern to search for, and the item between the second and third is the replacement. The final g means apply the substitution globally on the line rather than only to the first match.
In the pattern, \s means "any whitespace character" and \+ means "one or more of the preceding item", so a run of whitespace (e.g. 3 spaces, 2 tabs, etc.) is treated as a single separator.
Of course this may be of limited use for some commands: when a field itself contains whitespace (e.g. a file name that contains a space), it will be split too and the output will not be what you really want. |
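A quick way to see the limitation mentioned above is to run the same substitution on a line where one field contains a space. This is only a sketch: the helper name spaces_to_pipes and the sample line are made up for illustration, and since \s and \+ are GNU sed extensions, GNU sed is assumed.

```shell
# Hypothetical helper: collapse every run of whitespace into one "|".
# Assumes GNU sed, because \s and \+ are GNU extensions.
spaces_to_pipes() {
  sed -e 's/\s\+/|/g'
}

# Made-up "ls -l" style line whose file name contains a space.
printf '%s\n' '-rw-r--r-- 1 user group 0 Sep  3 09:59 my file.txt' | spaces_to_pipes
# The date and the file name are both broken up: ...|Sep|3|09:59|my|file.txt
```

The substitution cannot tell a separator from a space that belongs to a value, which is the problem the rest of the thread runs into.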
Hi and welcome to LQ.
Unfortunately there's no general solution, because you have to take into account the semantics (the meaning of the fields) and their formatting. Furthermore, some fields may be empty for some records but not for others, and the number of white spaces within a given field can vary. Often the command has formatting options that help identify which character string represents which field, for instance "finger -l". When (but only when) you know the content of each field, you can insert separators into each record with the usual text processing commands like sed, awk, grep, or tr, to name a few, and with parameter expansion. |
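As one illustration of a command's own formatting options, GNU find can print chosen fields with an explicit separator, so no field boundaries have to be guessed afterwards. A minimal sketch, assuming GNU find (for -printf); the function name list_files_delimited and the demo file are hypothetical.

```shell
# Hypothetical helper: print "size|name" for each regular file directly
# inside the directory given as $1. Requires GNU find for -printf.
list_files_delimited() {
  find "$1" -maxdepth 1 -type f -printf '%s|%f\n'
}

# Demo on a throwaway directory holding one empty file with a space in its name.
demo_dir=$(mktemp -d)
: > "$demo_dir/my file.txt"
list_files_delimited "$demo_dir"
rm -rf "$demo_dir"
```

Because the separator is written by the command itself, the space inside the file name stays part of the last field.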
finger output fields are separated by 2 spaces or more, so it is better to use:
Code:
finger | sed -e 's/\s\s\+/|/g'
[edit] Just tested: the date was split (the day number is 2 spaces after the month). Doh. |
finger output fields are not always separated by 2 spaces or more. Sometimes only 1 is used, so this does not work correctly.
The first sed example did not work at all. Check the results of the second:
finger | sed -e 's/\s\s\+/|/g'
Login|Name|Tty|Idle|Login Time|Office|Office Phone
abcdefg|Firstname Secondname|*:0|Sep|2 09:58 ABCDE/32th|(111)111-1111
abcdefg|Firstname Secondname|pts/1|1d|Sep|2 09:58 (:0)
abcdefg|Firstname Secondname|*pts/2|1:06|Sep|3 15:08 (:0)
abcdefg|Firstname Secondname|pts/3|7:11|Sep|3 09:03 (:0.0)
abcdefg|Firstname Secondname|*pts/4|Sep|3 15:24 (:0)
abcdefg|Firstname Secondname|*pts/7|6:15|Sep|3 09:59 (:0) |
Quote:
As I explained, you do NOT need \s\s for multiples because the \s\+ already deals with multiples. |
Quote:
Your first sed example turns this:
Code:
abcdefg Firstname Secondname *:0 Sep 2 09:58 ABCDE/32th (111)111-1111
into this:
Code:
abcdefg|Firstname|Secondname|*:0|Sep|2|09:58|ABCDE/32th|(111)111-1111
whereas the desired output is:
Code:
abcdefg|Firstname Secondname|*:0|Sep 2 09:58|ABCDE/32th|(111)111-1111 |
Again, you need a reliable way to find the field boundaries to do what you want, and the way the output is formatted can vary depending on the options used, so there can't be a general answer.
Usually "man <command>" gives information about output formatting and its options. |
Thank you all for the quick reply. I will take your feedback and will try to build a solution.
|
I do not see a one-size-fits-all solution, given the changing format of your data. However, if we compare the first and second sets of data provided and say that the only difference is the number of names in the second field (Name), you could look at something like Perl or Ruby and use unpack to split fixed-width data; you would then have 2 options for the two examples presented. |
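The fixed-width idea can also be sketched with awk instead of Perl's or Ruby's unpack: take the column start offsets from the header line and cut every line at those offsets. This is a rough sketch under assumptions, not a general solution. It assumes that column titles are separated by 2 or more spaces, that every value starts at the same offset as its title, and that no value is wider than its column. The function name fixed_to_pipes and the spacing of the sample data are made up, since the real alignment of the finger output is not preserved in this thread.

```shell
# Hypothetical helper: fixed-width text on stdin, pipe-delimited text on stdout.
# Assumptions: titles in the first line are separated by 2+ spaces, and each
# value is left-aligned under its title and fits inside the column.
fixed_to_pipes() {
  awk '
    NR == 1 {
      # Record the 1-based start offset of each title; a title may
      # contain single spaces, as in "Login Time".
      line = $0; pos = 1; n = 0
      while (match(line, /[^ ]+( [^ ]+)*/)) {
        start[++n] = pos + RSTART - 1
        pos += RSTART + RLENGTH - 1
        line = substr(line, RSTART + RLENGTH)
      }
    }
    {
      # Cut the line at the recorded offsets and trim the padding.
      out = ""
      for (i = 1; i <= n; i++) {
        width = (i < n) ? start[i + 1] - start[i] : length($0)
        field = substr($0, start[i], width)
        gsub(/^ +| +$/, "", field)
        out = out (i > 1 ? "|" : "") field
      }
      print out
    }
  '
}

# Made-up sample: printf pads every field to a fixed width; Idle is empty.
printf '%-9s%-22s%-8s%-6s%-14s%s\n' \
  Login Name Tty Idle 'Login Time' Office \
  abcdefg 'Firstname Secondname' '*:0' '' 'Sep 2 09:58' 'ABCDE/32th' |
  fixed_to_pipes
```

For the made-up sample this prints Login|Name|Tty|Idle|Login Time|Office followed by abcdefg|Firstname Secondname|*:0||Sep 2 09:58|ABCDE/32th, keeping the empty Idle field. Right-aligned columns, or a name that overflows its column, would still be cut in the wrong place, so the offsets have to be checked against the real output of each command.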