LinuxQuestions.org - Best way to parse this line in shell

luftwaffe

06-22-2017 11:26 AM

Code:

line='Owner: C=US, O="Some Content, Inc.", OU=ABCDE12345, CN="Some Distribution: Some Content, Inc. (ABCDE12345)", UID=ABCDE12345'

Single quotes are not part of the string. The closest I could get is:

Code:

IFS="," read -ra splitted <<< "$line"

The problem is that IFS splitting ignores double quotes, so the quoted value gets split like:
${splitted[2]} == "O=Some Content"
${splitted[3]} == "Inc."

Ideally I would like to get a result like:
[0] Owner
[1] C=US
[2] O=Some Content, Inc.
[3] OU=ABCDE12345
[4] CN=Some Distribution: Some Content, Inc. (ABCDE12345)
[5] UID=ABCDE12345

I looked at awk and sed, and both cut through the quoted fields. Is there a way to do this with just standard shell tools? I have perl and python on this box too, but I am very unsure about the syntax.

Thanks a lot :)

Turbocapitalist

06-22-2017 01:09 PM

Will you have a lot of such lines to parse from a file?

What can you say about the format or structure? There is probably already a perl module at CPAN to parse it, if you can name it. It looks similar to LDAP.

syg00

06-22-2017 06:55 PM

gawk will do it with patsplit() - there is a good explanation with code in the gawk manual under "Defining Fields by Content". The regex needs adjusting for what you want, but it is achievable.
Your situation is a little more complex than a CSV with double quotes, as you appear to also want the colon as a field separator. As I said, a little regex fu and you are done.

Or go play in CPAN as Turbocapitalist suggested.
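
For anyone who would rather try the "fields by content" idea in Python (which the original poster also has available), here is a minimal sketch. It is not the awk code referred to above. The regex and the names FIELD and split_fields are assumptions made for illustration, and it assumes quotes are always balanced and never escaped.

```python
import re

# Sample line from the original post.
line = ('Owner: C=US, O="Some Content, Inc.", OU=ABCDE12345, '
        'CN="Some Distribution: Some Content, Inc. (ABCDE12345)", UID=ABCDE12345')

# Describe what a field looks like instead of what separates fields:
# a run of quoted strings and/or characters that are not a comma, colon or quote.
FIELD = re.compile(r'(?:"[^"]*"|[^",:])+')

def split_fields(text):
    """Return the fields of text, keeping commas and colons inside quotes."""
    fields = []
    for match in FIELD.finditer(text):
        field = match.group(0).strip()
        if field:  # skip runs that were only whitespace
            fields.append(field)
    return fields

fields = split_fields(line)
for index, field in enumerate(fields):
    print('[{}] {}'.format(index, field))
```

The double quotes are still part of the O= and CN= fields at this point, so removing them is a separate step.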

syg00

06-22-2017 07:40 PM

Some simple updates to the supplied code generated this - I note (now) that you don't want the double quotes. Probably easiest to post-process them.

Code:

[me@laptop awktst]$ echo $line | awk -f simpl.csv.awk
NF =  6
$1 = <Owner>
$2 = <C=US>
$3 = <O="Some Content, Inc.">
$4 = <OU=ABCDE12345>
$5 = <CN="Some Distribution: Some Content, Inc. (ABCDE12345)">
$6 = <UID=ABCDE12345>
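
The post-processing step for the double quotes could look something like the following Python sketch. The field list is copied by hand from the output above, and strip_quotes is a hypothetical helper name. It assumes quotes only ever wrap the whole value after the first "=".

```python
# Fields as printed by the awk run, with the double quotes still present.
fields = [
    'Owner',
    'C=US',
    'O="Some Content, Inc."',
    'OU=ABCDE12345',
    'CN="Some Distribution: Some Content, Inc. (ABCDE12345)"',
    'UID=ABCDE12345',
]

def strip_quotes(field):
    """Drop a pair of double quotes wrapping the value of KEY="value"."""
    key, sep, value = field.partition('=')
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    # For a field without '=' (such as 'Owner'), sep and value are empty.
    return key + sep + value

cleaned = [strip_quotes(field) for field in fields]
for index, field in enumerate(cleaned):
    print('[{}] {}'.format(index, field))
```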

sweepnine

06-25-2017 05:16 PM

Unless you are an experienced sed/awk scripter, you will be better off with an existing parser.

Code:

apt-get install python-openssl

Code:

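#!/usr/bin/python

import OpenSSL

path = 'your_cert.crt'

# Read the PEM-encoded certificate from disk
with open(path) as f:
    file_content = f.read()

cert = OpenSSL.crypto.load_certificate(OpenSSL.crypto.FILETYPE_PEM, file_content)

# get_components() returns (name, value) pairs for the subject, e.g. CN, O, OU
subject_dict = dict(cert.get_subject().get_components())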
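
To get from those components to the NAME=value list asked for in the first post, something like the sketch below might do. The sample list is hypothetical data standing in for cert.get_subject().get_components(), and format_components is a made-up helper name. It assumes the pairs may arrive as bytes (as under Python 3) or as plain strings.

```python
def format_components(components):
    """Turn (name, value) pairs into 'NAME=value' strings."""
    def as_text(item):
        # Decode bytes, pass ordinary strings through unchanged.
        return item.decode('utf-8') if isinstance(item, bytes) else item
    return ['{}={}'.format(as_text(name), as_text(value))
            for name, value in components]

# Hypothetical sample, NOT read from a real certificate.
sample = [(b'C', b'US'), (b'O', b'Some Content, Inc.'), (b'OU', b'ABCDE12345')]

for entry in format_components(sample):
    print(entry)
```

Since the values never pass through a comma-separated string this way, the quoting problem does not come up at all.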
