LinuxQuestions.org

nishilmistry 11-26-2012 11:31 AM

Clever person to write clever parsing bash script
 
Hi clever people! I need a bash script that will parse the following line:

{"status":"OKAY","results":[{"name":"solr_ping","status":"OKAY"},{"name":"google_ping","status":"OKAY"},{"name":"hotel_search"," status":"OKAY"},{"name":"landmark_search","status":"OKAY"},{"name":"neighborhood_search","status":"O KAY"},{"name":"city_search","status":"OKAY"}]}

into something that looks like:

solr_ping OKAY
google_ping OKAY
hotel_search OKAY
landmark_search OKAY
neighborhood_search OKAY
city_search OKAY

Any help, much appreciated. Thank you.

TobiSGD 11-26-2012 11:40 AM

If your thread title means that you are looking for someone else to write that script for you, then you have misunderstood how LQ works.
Post what you have done already and where you are having problems with your script, and we will try to help you, but we will not do your work for you.

markush 11-26-2012 11:47 AM

Hello nishilmistry, welcome to LQ,

what have you achieved so far? Note that, per our forum rules, we don't do your homework. Please post some lines of code that show what you have tried in order to solve your problem. Then we can help if it doesn't run as you expected.

You should take a look at sed (the stream editor) or maybe the cut command. Since all the strings you want to extract sit between double quotes in the original text, you should consider whether you can extract the text between double quotes.
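
A minimal sketch of the double-quote idea, assuming the fields always arrive in the same order and never contain escaped quotes (an assumption; the data does not guarantee it). The sample record and the helper name `quoted_field` are made up for illustration. Because `cut` treats each `"` as a delimiter, the quoted strings land in the even-numbered fields.

```shell
#!/bin/sh
# Hypothetical single record, shaped like one entry of the posted line.
record='{"name":"solr_ping","status":"OKAY"}'

# Print the Nth field of stdin when each line is split on double quotes.
# For the record above: 2 = name (key), 4 = solr_ping, 6 = status (key), 8 = OKAY.
quoted_field() {
    cut -d '"' -f "$1"
}

name=$(printf '%s\n' "$record" | quoted_field 4)
status=$(printf '%s\n' "$record" | quoted_field 8)
printf '%s %s\n' "$name" "$status"
```

This only works while the layout never changes; a reordered key or a quote inside a value would shift the field numbers.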

Here is a very valuable guide to bash programming: http://tldp.org/LDP/abs/html/

Markus

dugan 11-26-2012 12:09 PM

The line is JSON, so have you googled for "parsing JSON with BASH" (no quotes) or something similar?

Sergei Steshenko 11-27-2012 06:09 PM

Quote:

Originally Posted by nishilmistry (Post 4837351)
Hi clever people! I need a bash script that will parse the following line:

{"status":"OKAY","results":[{"name":"solr_ping","status":"OKAY"},{"name":"google_ping","status":"OKAY"},{"name":"hotel_search"," status":"OKAY"},{"name":"landmark_search","status":"OKAY"},{"name":"neighborhood_search","status":"O KAY"},{"name":"city_search","status":"OKAY"}]}

into something that looks like:

solr_ping OKAY
google_ping OKAY
hotel_search OKAY
landmark_search OKAY
neighborhood_search OKAY
city_search OKAY

Any help, much appreciated. Thank you.

If this is for work, and if you share part of your salary, and if you do not insist on 'bash', I can do this part of your job for you.

Still, undermining my monopolistic powers, I suggest looking for JSON parsers in other languages, e.g. in Perl: http://search.cpan.org/search?query=JSON&mode=all -> http://search.cpan.org/~makamaka/JSON-2.53/lib/JSON.pm , http://search.cpan.org/~mlehmann/JSON-XS-2.33/XS.pm , etc.

Since I like Perl and know it pretty well, this is what I would use myself.

If it's a college assignment, and thus needs to be done in 'bash' and nothing else, then you need to know that JSON is not a line-oriented format, so you at least need regular expressions supporting multiline strings (I think 'bash' has such an engine, but I am not sure).

Also, since JSON nesting depth is unlimited, pure regular expressions are not sufficient; you need recursion. I am not a 'bash' programmer; it looks like 'bash' supports recursion: http://guidespratiques.traduc.org/gu.../localvar.html .

David the H. 11-28-2012 10:26 AM

Quote:

Originally Posted by Sergei Steshenko (Post 4838343)
...so you at least need regular expressions supporting multiline strings (I think 'bash' has such an engine, but I am not sure).

bash regex isn't explicitly multi-line, but it can handle ASCII newline (NL) characters just like any other character if they are written explicitly into the regex. This is generally easiest to do with the $'...' (ANSI-C) quoting form.

Code:

$ text=$'foo\nbar\nbaz\nbum'
$ echo "$text"
foo
bar
baz
bum

$ re=$'bar\nbaz'
$ [[ $text =~ $re ]] && echo "${BASH_REMATCH[0]}"
bar
baz

Quote:

..it looks like 'bash' supports recursion...

Functions at least are recursive, yes. Incidentally, the maximum level of recursion can be controlled (since v4.2, I believe) with the FUNCNEST shell variable.

Of course this still doesn't mean that parsing formats like this would be easy to do safely. It's much better to use a dedicated parser of some kind, as stated.

Finally, to nishilmistry:

If the input formatting is reliably regular, you can probably process it the way you want with awk.

Here are a few useful awk references:
http://www.grymoire.com/Unix/Awk.html
http://www.gnu.org/software/gawk/man...ode/index.html
http://www.pement.org/awk/awk1line.txt
http://www.catonmat.net/blog/awk-one...ined-part-one/
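
As a rough illustration of the awk route, here is a sketch that only holds if every entry is laid out exactly as `{"name":"...","status":"..."}`. The sample is the posted line with its two stray spaces removed (assumed to be paste artifacts), and the function name `parse_results` is invented for this example. It is not a JSON parser: escaped quotes, reordered keys, or braces inside values would break it.

```shell
#!/bin/sh
# The posted line with the two stray spaces removed (assumed to be paste artifacts).
json='{"status":"OKAY","results":[{"name":"solr_ping","status":"OKAY"},{"name":"google_ping","status":"OKAY"},{"name":"hotel_search","status":"OKAY"},{"name":"landmark_search","status":"OKAY"},{"name":"neighborhood_search","status":"OKAY"},{"name":"city_search","status":"OKAY"}]}'

# Read JSON of exactly this shape on stdin and print "name status" pairs.
# RS='}' ends a record at each closing brace; -F'"' splits on double quotes,
# so every quoted string becomes a field of its own.
parse_results() {
    awk -F'"' -v RS='}' '{
        for (i = 1; i + 6 <= NF; i++)
            if ($i == "name" && $(i + 4) == "status")
                print $(i + 2), $(i + 6)
    }'
}

printf '%s\n' "$json" | parse_results
```

The loop looks for the field `name` and expects the field `status` four positions later, which is why the top-level `"status":"OKAY"` at the start of the line is skipped.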


BTW, please use [code][/code] tags around your code and data, to preserve the original formatting and to improve readability. Do not use quote tags, bolding, colors, "start/end" lines, or other creative techniques.

sundialsvcs 11-28-2012 10:54 AM

I suggest that you're approaching this problem wrongly ...

Who needs "bash" for this? If you want to drive a nail, don't use a wrench to do it.

What you've got here is a JSON string. And on your system I'm sure that you have at least two of three languages ... Perl, Python, and PHP ... already installed and ready to go, any one of which has or can have thorough support for JSON. And they are full-featured programming languages with all of the other support that you require, designed expressly for the purpose.

Any language can be used to implement a script, thanks to the #! (shebang) feature: the first line of the script is checked for #!, and if it is found, the language processor named there is invoked to handle the script.

Therefore, you should (even if you engage a "clever person" to do it) use "the right tool for the job." You have your choice of tools, but in my humble opinion, bash scripting is not (nor was it ever intended to be) one of them.

bigearsbilly 12-02-2012 07:02 AM

A clever person would not use shell scripting for this.
(I won't even use bash for shell scripting).

A clever person would use the correct tool. Here is a silly example,
not production-ready, of course.
(This is okay for trivial stuff; it is not a proper JSON parser, of course.)

Code:

#!/usr/bin/env perl

use strict;
use warnings;
use Data::Dumper;

my $line = q#{"status":"OKAY","results":[{"name":"solr_ping","status":"OKAY"},
{"name":"google_ping","status":"OKAY"},
{"name":"hotel_search"," status":"OKAY"},
{"name":"landmark_search","status":"OKAY"},
{"name":"neighborhood_search","status":"O KAY"},
{"name":"city_search","status":"OKAY"}]}#;

print "$line\n";

# Change each ':' to '=>' so the text reads as a Perl hash literal,
# which we can then simply evaluate.
# Caution: eval runs the string as code, so only use this on trusted input.
$line =~ s/:/=>/g;
my $ref = eval $line;

# $ref is already a hash reference, so pass it to Dumper directly.
print Dumper($ref);

Code:

$ ./1.pl
{"status":"OKAY","results":[{"name":"solr_ping","status":"OKAY"},{"name":"google_ping","status":"OKAY"},{"name":"hotel_search"," status":"OKAY"},{"name":"landmark_search","status":"OKAY"},{"name":"neighborhood_search","status":"O KAY"},{"name":"city_search","status":"OKAY"}]}
$VAR1 = {
          'status' => 'OKAY',
          'results' => [
                        {
                          'status' => 'OKAY',
                          'name' => 'solr_ping'
                        },
                        {
                          'status' => 'OKAY',
                          'name' => 'google_ping'
                        },
                        {
                          'name' => 'hotel_search',
                          ' status' => 'OKAY'
                        },
                        {
                          'status' => 'OKAY',
                          'name' => 'landmark_search'
                        },
                        {
                          'status' => 'O KAY',
                          'name' => 'neighborhood_search'
                        },
                        {
                          'status' => 'OKAY',
                          'name' => 'city_search'
                        }
                      ]
        };
$


Sergei Steshenko 12-02-2012 07:43 AM

Quote:

Originally Posted by bigearsbilly (Post 4841164)
A clever person would not use shell scripting for this.
(I won't even use bash for shell scripting).

A clever person would use the correct tool. Here is a silly example,
not production ready of course.
(this is okay for trivial stuff not a proper JSON parser of course)

Code:

...
$line =~ s/:/=>/g;
...


Nah!

Suppose a key or a value contains ':'. The substitution would mangle it.

There are full-fledged JSON parsers. Period.
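
To make the objection concrete, here is a small sketch that applies the same blanket substitution with sed instead of Perl. The sample record with a URL value and the function name `naive_convert` are invented for illustration.

```shell
#!/bin/sh
# Hypothetical record whose value contains colons.
record='{"name":"ping","url":"http://example.com:8080"}'

# Same idea as the Perl s/:/=>/g: rewrite every colon, wherever it appears.
naive_convert() {
    sed 's/:/=>/g'
}

# The colons inside the URL are rewritten too, so the value is corrupted.
printf '%s\n' "$record" | naive_convert
```

The result contains `"http=>//example.com=>8080"`, which is no longer the original value.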

ntubski 12-02-2012 10:42 AM

Quote:

Originally Posted by bigearsbilly (Post 4841164)
A clever person would not use shell scripting for this.
(I won't even use bash for shell scripting).

A clever person would use the correct tool.

You could choose a correct tool for shell scripting; here's a solution with jq:
Code:

~/tmp$ ./jq -r '.results | .[] | .name, .status' < results.json
solr_ping
OKAY
google_ping
OKAY
hotel_search
null
landmark_search
OKAY
neighborhood_search
O KAY
city_search
OKAY

The null is caused by an extra space in the data (the key is " status" rather than "status"), which is probably a copy-paste error. Of course, you'll want to get rid of the extra newlines by piping to awk or similar.
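
One possible way to do that last step, sketched with awk. The function name `join_pairs` is made up, and the `printf` line is only a stand-in for the jq output so that the example runs without jq installed.

```shell
#!/bin/sh
# Join alternating name/status lines (as printed by the jq command above) into pairs.
# Odd-numbered lines are remembered as the name; even-numbered lines complete the pair.
join_pairs() {
    awk 'NR % 2 { name = $0; next } { print name, $0 }'
}

# Stand-in for part of the jq output shown above.
printf '%s\n' solr_ping OKAY hotel_search null neighborhood_search 'O KAY' | join_pairs
```

With jq available, the same function should sit at the end of the pipeline: `./jq -r '.results | .[] | .name, .status' < results.json | join_pairs`. jq's string interpolation may also be able to build each line directly, for example `./jq -r '.results[] | "\(.name) \(.status)"'`, though that is untested here.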

