LinuxQuestions.org
Old 09-02-2011, 07:34 AM   #1
kadhan
Member
 
Registered: Dec 2004
Posts: 40

Rep: Reputation: 15
Question decimal pattern matching


I have a big file whose lines follow the pattern below:

MDQ[11:15],IO,MDQ[10:14],,,,MDQ[12:16],TPR_AAWD[11:15]

I want to modify this file so that it looks like this:

MDQ[11],IO,MDQ[10],,,,MDQ[12],TPR_AAWD[11]
MDQ[12],IO,MDQ[11],,,,MDQ[13],TPR_AAWD[12]
MDQ[13],IO,MDQ[12],,,,MDQ[14],TPR_AAWD[13]
MDQ[14],IO,MDQ[13],,,,MDQ[15],TPR_AAWD[14]


How can I implement this in sed/awk/perl/csh/vim?
Please help.
 
Old 09-02-2011, 08:22 AM   #2
Proud
Senior Member
 
Registered: Dec 2002
Location: England
Distribution: Used to use Mandrake/Mandriva
Posts: 2,794

Rep: Reputation: 116
I see integers not decimals.
To clarify, you want to strip the bits between : and ], i.e. turn [nn:nn] into [nn]?
 
Old 09-02-2011, 08:33 AM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1946
It looks more to me like he wants to take the bracketed parts as a sequence of integers, and expand them into individual entries based on that sequence. [11:15] becomes five single lines with [11], [12], [13], [14], and [15]. But the whole problem is still not entirely clear.

You say it's a "big file", but you only showed us one line. Does every line have sequences like that, or are there other lines interspersed? Are the sequences all single digit increments? Are the patterns regular or irregular in any way? Do you want the output to be a single file, with only the expanded lines, or with the expanded lines following the unexpanded lines, or what?

Please clarify.

And please use [code][/code] tags around your code (including input and output text), to preserve formatting and to improve readability.
 
Old 09-02-2011, 08:35 AM   #4
kadhan
Member
 
Registered: Dec 2004
Posts: 40

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by Proud View Post
I see integers not decimals.
To clarify, you want to strip the bits between : and ], i.e. turn [nn:nn] into [nn]?
Yes, they are integers.

First we need to calculate the size of the range in MDQ[11:15], which is 5,
and then split the line into 5 lines as given below:
MDQ[11]
MDQ[12]
MDQ[13]
MDQ[14]
MDQ[15]
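
That calculation can be sketched in Ruby as follows. This is only an illustration, assuming the range is inclusive at both ends; `expand_field` is a hypothetical helper name, not something from the thread.

```ruby
# Hypothetical helper: expand "NAME[a:b]" into ["NAME[a]", ..., "NAME[b]"].
# Assumes an inclusive integer range with a <= b.
def expand_field(field)
  m = field.match(/\A(.*)\[(\d+):(\d+)\]\z/)
  return [field] unless m            # no range: leave the field untouched
  name, first, last = m[1], m[2].to_i, m[3].to_i
  (first..last).map { |i| "#{name}[#{i}]" }
end

fields = expand_field("MDQ[11:15]")
puts fields.size   # range size: 15 - 11 + 1
puts fields
```

The range size is `last - first + 1`, so MDQ[11:15] yields five entries.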
 
Old 09-02-2011, 08:45 AM   #5
kadhan
Member
 
Registered: Dec 2004
Posts: 40

Original Poster
Rep: Reputation: 15
Yes. It's a big file; I showed only one line here, and every other line has the same kind of sequence.

The input file is given below:

Code:
 
MDQ[10:15],IO,MDQ[10:15],,,,MDQ[10:15],TPR_AAWD[10:15],,,DATA[11:16],DATA[11:16],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[16],IO,MDQ[16],,,,MDQ[16],TPR_CLK_SYNC,,,DATA[16],DATA[16],IO,,1,,GVDD (1.5V/1.35V) SSTL15
The output file should become:

Code:
MDQ[10],IO,MDQ[10],,,,MDQ[10],TPR_AAWD[10],,,DATA[11],DATA[11],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[11],IO,MDQ[11],,,,MDQ[11],TPR_AAWD[11],,,DATA[12],DATA[12],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[12],IO,MDQ[12],,,,MDQ[12],TPR_AAWD[12],,,DATA[13],DATA[13],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[13],IO,MDQ[13],,,,MDQ[13],TPR_AAWD[13],,,DATA[14],DATA[14],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[14],IO,MDQ[14],,,,MDQ[14],TPR_AAWD[14],,,DATA[15],DATA[15],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[15],IO,MDQ[15],,,,MDQ[15],TPR_AAWD[15],,,DATA[16],DATA[16],IO,,16,,GVDD (1.5V/1.35V) SSTL15
MDQ[16],IO,MDQ[16],,,,MDQ[16],TPR_CLK_SYNC,,,DATA[16],DATA[16],IO,,1,,GVDD (1.5V/1.35V) SSTL15
 
Old 09-02-2011, 08:46 AM   #6
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,185

Rep: Reputation: 1782
So on each line, the size of the range will always be the same for every item?
I.e. each MDQ on this line has a range size of 5.
 
Old 09-02-2011, 08:50 AM   #7
Proud
Senior Member
 
Registered: Dec 2002
Location: England
Distribution: Used to use Mandrake/Mandriva
Posts: 2,794

Rep: Reputation: 116
Yes, should the first range dictate the number of rows? Are the other ranges redundant (only their first number needs to be read), or do they dictate just the pattern/stepping for values in those rows and columns, or can they cause more rows to be added, etc.?
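
One way to answer that question for a given file is to check whether every range on a line spans the same number of entries. A rough Ruby sketch, where `range_sizes` is a hypothetical helper and the sample line is a shortened version of the input posted above:

```ruby
# Hypothetical check: do all [a:b] ranges on a line span the same count?
def range_sizes(line)
  line.scan(/\[(\d+):(\d+)\]/).map { |a, b| b.to_i - a.to_i + 1 }
end

line = "MDQ[10:15],IO,MDQ[10:15],,,,MDQ[10:15],TPR_AAWD[10:15],,,DATA[11:16],DATA[11:16],IO"
sizes = range_sizes(line)
if sizes.uniq.size <= 1
  puts "consistent: #{sizes.first.inspect} rows"
else
  warn "mixed range sizes: #{sizes.inspect}"
end
```

If the sizes differ on any line, the "first range dictates the number of rows" assumption would need revisiting.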

Last edited by Proud; 09-02-2011 at 08:52 AM.
 
Old 09-02-2011, 08:51 AM   #8
ta0kira
Senior Member
 
Registered: Sep 2004
Distribution: FreeBSD 9.1, Kubuntu 12.10
Posts: 3,078

Rep: Reputation: Disabled
Quote:
Originally Posted by Proud View Post
I see integers not decimals.
To clarify, you want to strip the bits between : and ], i.e. turn [nn:nn] into [nn]?
Maybe "decimal" (vs. "hexadecimal") integers? Technically not incorrect, but definitely misleading.
Kevin Barry
 
Old 09-02-2011, 10:11 AM   #9
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,185

Rep: Reputation: 1782
Well, going on the assumption that Proud and I have made:
Code:
#!/usr/bin/awk -f

# Split on either bracket: range fields ("10:15") and the text between them
BEGIN{ FS = "[][]" }

# Lines containing a range get expanded
/:/{
    n = 1
    j = 1
    diff = 0
    for(i = 1; i <= NF; i++){
        if($i ~ /:/){
            split($i, a, ":")
            start[j++] = a[1]           # first index of this range
            if( ! diff )                # first range sets the row count
                diff = a[2] - a[1] + 1
        }
        else
            pieces[n++] = $i            # literal text around the ranges
    }
    for(x = 1; x <= diff; x++){
        line = ""
        for(y = 1; y < (n-1);y++)
            line = sprintf("%s%s[%d]", line, pieces[y], start[y]++)

        line = sprintf("%s%s", line, pieces[n-1])

        print line
    }
}

# Everything else is printed unchanged
!/:/
I am sure a Perl guru out there will have something to offer.
 
Old 09-02-2011, 10:21 PM   #10
kurumi
Member
 
Registered: Apr 2010
Posts: 223

Rep: Reputation: 45
Ruby (1.9+)

Code:
#!/usr/bin/env ruby
# Expands lines that start with MDQ[a:b] into one line per index.
# The field names MDQ, TPR_AAWD and DATA are hard-coded.
File.foreach("file") do |line|
    if line =~ /^MDQ\[(\d+):(\d+)\]/
        first, last = $1.to_i, $2.to_i
        # DATA has its own starting index, so track it separately
        num = line[/DATA\[(\d+):/, 1].to_i
        (first..last).each do |i|
            out = line.gsub(/(MDQ|TPR_AAWD)\[[^\]]+\]/, "\\1[#{i}]")
            puts out.gsub(/DATA\[[^\]]+\]/, "DATA[#{num}]")
            num += 1
        end
    else
        puts line
    end
end
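
For files where the field names vary, a more general variant might look like the sketch below. It is not from the thread: `expand_line` is a hypothetical helper, and it relies on the same assumption as the awk script earlier in the thread, namely that every range on a line has the same size and the first range sets the number of rows.

```ruby
# Sketch: expand every [a:b] on a line in lockstep, without hard-coding names.
# Assumption: all ranges on a line have the same size; the first one sets
# the number of output rows.
def expand_line(line)
  range = /\[(\d+):(\d+)\]/
  first = line.match(range)
  return [line] unless first                  # nothing to expand
  rows = first[2].to_i - first[1].to_i + 1
  (0...rows).map do |offset|
    # Replace each range with its own start index plus the row offset
    line.gsub(range) { "[#{$1.to_i + offset}]" }
  end
end

sample = "MDQ[10:15],IO,MDQ[10:15],,,,MDQ[10:15],TPR_AAWD[10:15],,,DATA[11:16],DATA[11:16],IO,,16,,GVDD (1.5V/1.35V) SSTL15"
puts expand_line(sample)
```

To process a whole file, each line could be passed through it with something like `File.foreach("file") { |l| puts expand_line(l.chomp) }`.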
 
Old 09-03-2011, 06:46 AM   #11
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,185

Rep: Reputation: 1782
That is sweet kurumi ... haven't worked it all out but very nice
 
Old 09-05-2011, 01:51 AM   #12
kadhan
Member
 
Registered: Dec 2004
Posts: 40

Original Poster
Rep: Reputation: 15
Hi Grail and Kurumi,

Your solutions are working fine for me. Thanks a lot for your help.
Grail, can you please explain the FS = "[][]" in the first line of the awk script?

Regards,
Kadhan.
 
Old 09-05-2011, 02:46 AM   #13
Proud
Senior Member
 
Registered: Dec 2002
Location: England
Distribution: Used to use Mandrake/Mandriva
Posts: 2,794

Rep: Reputation: 116
FS = "[][]"
I believe this sets the field separator to the regular expression [][], which is the character set containing ] or [. The outer [] defines the character set, and I think the double quotes might be superfluous.
FS = [\]\[] might be clearer; you can test whether it works.

Last edited by Proud; 09-05-2011 at 02:50 AM.
 
Old 09-05-2011, 02:52 AM   #14
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,185

Rep: Reputation: 1782
The quotes are required as it is a computed regex.
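
To see what that separator produces, here is a small Ruby illustration (not part of the awk script; `bracket_fields` is a hypothetical helper). In the POSIX form used by awk, a ] placed first inside the brackets is taken literally, so [][] matches either bracket. The Ruby regex below spells the same set with escapes.

```ruby
# Ruby equivalent of awk's FS = "[][]": split on either bracket character.
def bracket_fields(line)
  line.split(/[\[\]]/)
end

p bracket_fields("MDQ[10:15],IO,MDQ[10:15],,,,X")
# prints ["MDQ", "10:15", ",IO,MDQ", "10:15", ",,,,X"]
```

The fields containing ":" are the ranges, and the rest are the literal text between them, which is how the awk script tells start[] apart from pieces[].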
 
Old 09-05-2011, 03:12 AM   #15
kadhan
Member
 
Registered: Dec 2004
Posts: 40

Original Poster
Rep: Reputation: 15
I got your point. Thanks.

Regards,
Kadhan
 
  



Tags
awk, csh, perl, sed, vim

