LinuxQuestions.org - [SOLVED] decimal pattern matching

I have a big file whose lines follow the pattern below:

MDQ[11:15],IO,MDQ[10:14],,,,MDQ[12:16],TPR_AAWD[11:15]

I want to modify the file so that it looks like this:

MDQ[11],IO,MDQ[10],,,,MDQ[12],TPR_AAWD[11]
MDQ[12],IO,MDQ[11],,,,MDQ[13],TPR_AAWD[12]
MDQ[13],IO,MDQ[12],,,,MDQ[14],TPR_AAWD[13]
MDQ[14],IO,MDQ[13],,,,MDQ[15],TPR_AAWD[14]

How can I implement this in sed/awk/perl/csh/vim?
Please help.

I see integers, not decimals.
To clarify, you want to strip the part between : and ], i.e. turn [nn:nn] into [nn]?

It looks more to me like he wants to take the bracketed parts as a sequence of integers, and expand them into individual entries based on that sequence. [11:15] becomes five single lines with [11], [12], [13], [14], and [15]. But the whole problem is still not entirely clear.

You say it's a "big file", but you only showed us one line. Does every line have sequences like that, or are there other lines interspersed? Are the sequences all single digit increments? Are the patterns regular or irregular in any way? Do you want the output to be a single file, with only the expanded lines, or with the expanded lines following the unexpanded lines, or what?

Please clarify.

And please use [code][/code] tags around your code (including input and output text), to preserve formatting and to improve readability.

Quote:

Originally Posted by Proud (Post 4459634)

I see integers, not decimals.
To clarify, you want to strip the part between : and ], i.e. turn [nn:nn] into [nn]?

Yes, they're integers.

First we need to calculate the size of the range in MDQ[11:15], which is 5 (11 through 15 inclusive).
Then the line needs to be split into 5 lines, as given below:
MDQ[11]
MDQ[12]
MDQ[13]
MDQ[14]
MDQ[15]
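
The arithmetic described here can be sketched in Ruby, the language used further down the thread. This is only an illustration: `expand_range` is a made-up helper name, and it assumes a token holds at most one trailing [lo:hi] range with lo <= hi.

```ruby
# Hypothetical helper: expand one token such as "MDQ[11:15]"
# into one token per index in the range.
def expand_range(token)
  m = token.match(/\A(.*)\[(\d+):(\d+)\]\z/)
  return [token] unless m              # no range: leave the token alone
  name = m[1]
  lo = m[2].to_i
  hi = m[3].to_i
  (lo..hi).map { |i| "#{name}[#{i}]" } # hi - lo + 1 entries
end

tokens = expand_range("MDQ[11:15]")
puts tokens.size   # size of the range, i.e. 15 - 11 + 1
puts tokens
```

Tokens without a range, such as IO, come back unchanged in a one-element array.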

Quote:

Originally Posted by David the H. (Post 4459647)

Yes. It's a big file; I showed only one line here, and every other line has the same kind of sequence.

The input file is given below:

Code:

MDQ[10:15],IO,MDQ[10:15],,,,MDQ[10:15],TPR_AAWD[10:15],,,DATA[11:16],DATA[11:16],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[16],IO,MDQ[16],,,,MDQ[16],TPR_CLK_SYNC,,,DATA[16],DATA[16],IO,,1,,GVDD (1.5V/1.35V) SSTL15

The output file becomes:

Code:

MDQ[10],IO,MDQ[10],,,,MDQ[10],TPR_AAWD[10],,,DATA[11],DATA[11],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[11],IO,MDQ[11],,,,MDQ[11],TPR_AAWD[11],,,DATA[12],DATA[12],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[12],IO,MDQ[12],,,,MDQ[12],TPR_AAWD[12],,,DATA[13],DATA[13],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[13],IO,MDQ[13],,,,MDQ[13],TPR_AAWD[13],,,DATA[14],DATA[14],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[14],IO,MDQ[14],,,,MDQ[14],TPR_AAWD[14],,,DATA[15],DATA[15],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[15],IO,MDQ[15],,,,MDQ[15],TPR_AAWD[15],,,DATA[16],DATA[16],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[16],IO,MDQ[16],,,,MDQ[16],TPR_CLK_SYNC,,,DATA[16],DATA[16],IO,,1,,GVDD (1.5V/1.35V) SSTL15

So on each line, the size of the range will always be the same for every item?
E.g. every range on the first example line (MDQ[11:15], MDQ[10:14], ...) covers 5 values.

Yes, should the first range dictate the number of rows? Are the other ranges redundant (so that only their first number needs to be read), do they dictate just the pattern/stepping for values in those rows and columns, or can they cause more rows to be added?
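
One possible reading of these questions, sketched in Ruby. The assumptions are mine and not confirmed by the original poster at this point in the thread: the first range on a line sets the number of rows, and every range then counts up by one per row from its own start. `expand_line` is a hypothetical name.

```ruby
# Sketch under the stated assumptions: the FIRST [lo:hi] on the line
# decides how many rows are produced; each range steps from its own lo.
def expand_line(line)
  range = /\[(\d+):(\d+)\]/
  first = line.match(range)
  return [line] unless first           # no ranges: pass the line through
  rows = first[2].to_i - first[1].to_i + 1
  (0...rows).map do |offset|
    # replace every range with its own start value plus the row offset
    line.gsub(range) { "[#{$1.to_i + offset}]" }
  end
end

puts expand_line("MDQ[11:15],IO,MDQ[10:14],,,,MDQ[12:16],TPR_AAWD[11:15]")
```

Under this reading the upper bound of every range after the first is ignored, so ranges of unequal size on one line would pass silently rather than raise an error.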

Quote:

Originally Posted by Proud (Post 4459634)

I see integers, not decimals.
To clarify, you want to strip the part between : and ], i.e. turn [nn:nn] into [nn]?

Maybe "decimal" (vs. "hexadecimal") integers? Technically not incorrect, but definitely misleading.
Kevin Barry

Well, going on the assumption that Proud and I have made:

Code:

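#!/usr/bin/awk -f

# Split each line on "[" or "]" so the bracket contents become their own fields.
BEGIN{ FS = "[][]" }

# Lines containing a range: print one output line per value in the range.
/:/{
    n = 1
    j = 1
    diff = 0
    for(i = 1; i <= NF; i++){
        if($i ~ /:/){
            split($i, a, ":")
            start[j++] = a[1]
            # the first range on the line sets the number of output lines
            if( ! diff )
                diff = a[2] - a[1] + 1
        }
        else
            pieces[n++] = $i
    }
    for(x = 1; x <= diff; x++){
        line = ""
        # rebuild the line, advancing each range's start value on every pass
        for(y = 1; y < (n-1);y++)
            line = sprintf("%s%s[%d]", line, pieces[y], start[y]++)

        line = sprintf("%s%s", line, pieces[n-1])

        print line
    }
}

# Lines without a range are printed unchanged.
!/:/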

I am sure a perl guru out there will have something to offer :)

Ruby(1.9+)

Code:

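#!/usr/bin/env ruby

range=[]
File.open("file").each do |line|
    # only lines that start with an MDQ range are expanded
    if line[/^MDQ\[(\d+):(\d+)\]/]
        # DATA starts at its own index, so it is tracked separately
        num=line.scan(/DATA\[(\d+):/)[0][0]
        line.scan(/^MDQ\[(\d+):(\d+)\]/){|x,y| range=(x..y).to_a }
        range.each do |i|
            line.gsub!(/(MDQ|TPR_AAWD)\[.[^\]]*?\]/,"\\1[#{i}]")
            # gsub! returns nil when nothing matches, so print the line itself
            line.gsub!(/DATA\[.[^\]]*?\]/,"DATA[#{num}]")
            puts line
            num.succ!
        end
    else
        puts line
    end
end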

That is sweet kurumi ... haven't worked it all out but very nice :)

Hi Grail and Kurumi,

Your solutions are working fine for me. Thanks a lot for your help.
Grail, can you please explain the FS = "[][]" in the first line of the awk script?

Regards,
Kadhan.

FS = "[][]"
I believe this sets the field separator to the regular expression [][], a character set that matches either ] or [. The outer [] delimits the character set, and I think the double quotes might be superfluous.
FS = [\]\[] might be clearer; you can test whether it works.

The quotes are required: FS is assigned a string, which awk then treats as a regex (a computed regex).
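
To see what splitting on "either bracket" produces, here is a rough Ruby analogue. It is not the awk code itself: `split_on_brackets` is a hypothetical name, and Ruby wants the brackets escaped inside the character class.

```ruby
# Rough analogue of awk's FS = "[][]": split on "[" or "]".
def split_on_brackets(text)
  text.split(/[\[\]]/)
end

fields = split_on_brackets("MDQ[10:15],IO,MDQ[10:15]")
p fields
# The bracket contents land in every second field (index 1, 3, ...),
# which is what lets the awk script pick the ranges out with a test for ":".
```

Note that Ruby's String#split drops trailing empty strings, so the field count is not guaranteed to match awk's NF for a line that ends in a bracket.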

I got your point... Thanks

Regards,
Kadhan