LinuxQuestions.org - simple parsing question

simple parsing question

I have a string as follows:

Errors xx, Warnings yy (zz)

where xx, yy, and zz are numbers; everything else is literal text.

How can I parse this string to extract the xx, yy, and zz numbers? awk, sed, perl, python, etc. are fair game as long as I can easily use them in a shell script.

Thanks,

Can you post a sample of the lines that need to be parsed?

Hmm, perhaps I wasn't being perfectly clear, but here it goes. Suppose I have a bunch of the following lines:

Errors 1, Warnings 2 (3)
Errors 4, Warnings 5 (6)
Errors 7, Warnings 8 (9)
Errors 10, Warnings 11 (12)
Etc...

I want to extract the individual numbers in the above lines...so that I can sum them, for example:

Total errors: 22
Total warnings: 26 (30)

I hope that is clear enough. What I want is a fairly efficient way to parse the example lines above from my shell script, but not an overly complicated one: understandability and maintainability matter too. Awk, sed, perl, python, etc. are acceptable as long as the tool is fairly well known and comes preinstalled on most Linux distros.

Try this:

Code:

$ echo "Errors 10, Warnings 11 (12)" | sed 's/\(Errors \)\(.*\), \(Warnings\) \(.*\) (\(.*\))/\2 \4 \5/'

10 11 12

This leaves the three numbers separated by spaces. You can then read them into variables, process the file in a loop, and sum the numbers.

Hope this helps.
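The loop-and-sum step described above could be sketched roughly as follows. This is only a minimal sketch, assuming every line of interest has exactly the `Errors xx, Warnings yy (zz)` layout and arrives on standard input; the function name `sum_counts` and the file name `build.log` are made up for illustration. The sed pattern is tightened to `[0-9][0-9]*` and run with `-n ... p` so that lines in any other format (such as `Etc...`) are skipped instead of being passed through.

```shell
#!/bin/sh
# sum_counts (hypothetical helper): read report lines on stdin, print totals.
sum_counts() {
    errors=0 warnings=0 extra=0
    # sed reduces each matching line to "xx yy zz" and drops everything else.
    sed -n 's/^Errors \([0-9][0-9]*\), Warnings \([0-9][0-9]*\) (\([0-9][0-9]*\)).*$/\1 \2 \3/p' |
    {
        # read splits the three fields; the totals are printed inside the
        # same braces because a pipeline stage may run in a subshell.
        while read -r e w x; do
            errors=$((errors + e))
            warnings=$((warnings + w))
            extra=$((extra + x))
        done
        printf 'Total errors: %d\n' "$errors"
        printf 'Total warnings: %d (%d)\n' "$warnings" "$extra"
    }
}

# Example call (build.log is an assumed file name):
#   sum_counts < build.log
```

One caveat: shell arithmetic treats numbers with a leading zero as octal, so this sketch assumes the counts are written without zero padding.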

Quote:

Originally Posted by thanhvn

Hmm, perhaps you were. I didn't read the first example as literally being the string.
Here is something that should work in Perl:

Code:

$line = "Errors 5, Warnings 4 (3)";

# Capture the first three runs of digits, whatever text surrounds them.
$line =~ /^.*?(\d+).*?(\d+).*?(\d+)/;

print "$1 $2 $3\n";

or simply

Code:

$line = "Errors 5, Warnings 4 (3)";

# Match the literal text explicitly, so lines in any other format do not match.
$line =~ /^Errors (\d+), Warnings (\d+) \((\d+)\)/;

print "$1 $2 $3\n";