LinuxQuestions.org

kswapnadevi 06-21-2011 01:52 AM

Shell scripting
 
I have a data file in the format shown below. It contains three structures, each starting with 'Contig', and each structure has substructures starting with '-', such as -150.90, -150.70 and -149.70. Some substructures contain a list of ranges and some contain none. In -150.70 there are three ranges, of which 672:693 and 679:700 are continuous. I need a shell program that combines such ranges (giving 672:700 here) and removes the substructures that have no ranges. The required output is shown below.

INPUT

Contig1430
-150.90:
-150.70:
672:693
711:732
679:700
-149.70:

Contig1439
-134.80
20:41
42:63
55:76

Contig1454
-178.40:
536:557
648:669
546:567
551:572
554:575
561:582
567:588
572:593
579:600
583:604
591:612
601:622
607:628
614:635
617:638


OUTPUT
Contig1430
-150.70:
672:700
711:732

Contig1439
-134.80
20:76

Contig1454
-178.40:
536:638

grail 06-21-2011 02:20 AM

Are you saying that two ranges are continuous when the last field of one range is greater than or equal to the first field of another range?
Code:

-150.70:
672:693
711:732
679:700
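
To make that rule concrete, here is a minimal Python sketch (an illustration only; the function name `continues` is made up for this example). Testing only "end >= start" would also match 711:732 against 672:693, since 732 >= 672, so the sketch additionally requires the second range to start at or after the start of the first one.

```python
def continues(first, second):
    """Return True if `second` starts inside `first` (both ends inclusive)."""
    first_start, first_end = first
    second_start, _ = second
    return first_start <= second_start <= first_end


# The three ranges listed under -150.70: in the thread.
ranges = [(672, 693), (711, 732), (679, 700)]
print(continues(ranges[0], ranges[2]))  # 672:693 against 679:700
print(continues(ranges[0], ranges[1]))  # 672:693 against 711:732
```

Under this reading, 20:41 and 42:63 do not count as continuous, because 42 lies just past 41.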


kswapnadevi 06-22-2011 06:45 AM

Quote:

Originally Posted by grail (Post 4391457)
Are you saying that two ranges are continuous when the last field of one range is greater than or equal to the first field of another range?
Code:

-150.70:
672:693
711:732
679:700


Yes sir.

TB0ne 06-22-2011 07:59 AM

Quote:

Originally Posted by kswapnadevi (Post 4392644)
Yes sir.

OK... as with almost ALL of your other threads, you are asking us to write shell scripts for you. Again, can you POST WHAT YOU'VE WRITTEN? What have you done or tried to make this work? Post what you've done and where you're stuck, and we will be glad to assist.

Comparing two numerical values is trivial. Since you've (as you've said) been working on shell scripts "round the clock" (http://www.linuxquestions.org/questi...esting-848051/) since last year, this should be child's play for you.

grail 06-22-2011 08:21 AM

Well, looking at your data, I believe your suggested output is actually incorrect. The following seems to work (and can probably be shortened if you look through it):
Code:

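#!/usr/bin/awk -f

# Ranges look like "start:end", so split every record on the colon.
BEGIN{ FS = ":" }

# Substructure header: keep it in line[0]; it is printed only if ranges follow.
/^-/{
        i = 0
        line[i++] = $0
        next
}

# First range of a substructure: read and merge the rest of the block here.
/^[0-9]/{
        line[i] = $0

        while(!/^-/ && !/^$/ ){
                is_new = 1
                for(j = 1;j <= i;j++){
                        split(line[j],f)
                        # The start falls inside a stored range: extend it, never shrink it.
                        if( $1 >= f[1] && $1 <= f[2] ){
                                line[j] = f[1]":"($2 > f[2] ? $2 : f[2])
                                is_new = 0
                        }
                }

                if( is_new )
                        line[++i] = $0

                if( getline <= 0)
                        break
        }

        for(x = 0; x <= i; x++)
                print line[x]

        # getline may have consumed the next header; start that substructure now.
        if( /^-/ ){
                i = 0
                line[i++] = $0
        }
}

# Contig names and blank separator lines pass through unchanged.
/^C/ || /^$/{
        print
        next
}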

So the output I get is:
Code:

Contig1430
-150.70:
672:700
711:732

Contig1439
-134.80
20:41
42:76

Contig1454
-178.40:
536:638
648:669
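
The awk pass above checks each new range only against the ranges already stored, in input order, so a different line order can leave overlapping ranges unmerged. A common alternative is to sort the ranges first and then sweep through them once. Below is a rough Python sketch of that idea, not a translation of the awk script; `merge_ranges` and its `max_gap` parameter are names invented for this example.

```python
def merge_ranges(ranges, max_gap=0):
    """Merge (start, end) pairs that start at most max_gap past the running end."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + max_gap:
            # Overlaps (or touches, when max_gap=1) the last merged range.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(pair) for pair in merged]


print(merge_ranges([(672, 693), (711, 732), (679, 700)]))
print(merge_ranges([(20, 41), (42, 63), (55, 76)]))
print(merge_ranges([(20, 41), (42, 63), (55, 76)], max_gap=1))
```

With `max_gap=1`, touching ranges such as 20:41 and 42:63 are joined as well, which gives the 20:76 from the first post. 648:669 stays separate under either setting, because it starts 10 past 638.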


sundialsvcs 06-22-2011 08:26 AM

I'll also say this: don't send a shell script (for any shell) to do a Camel's job.

Shell scripts (with the slight exception of the Korn shell, which for some reason does embed a full programming language) were not really designed for "heavy duty programming." But all of them do support the #! ("shebang") construct, which allows the script to be written in any language. And there are many of them at your fingertips: Perl, PHP, Ruby, Python, and many more.
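
As a rough illustration of the shebang point, here is a small Python sketch that handles only the bookkeeping part of the job: grouping the lines by Contig and substructure, and dropping the substructures with no ranges. The names `parse`, `drop_empty` and `SAMPLE` are made up for this example, the sample text is copied from the first post, and well-formed input is assumed (every range appears below a '-' header, which appears below a Contig line). A real script would read `sys.stdin` or a file instead of the embedded sample.

```python
#!/usr/bin/env python3
"""Group a Contig file into substructures and drop the empty ones (sketch)."""

SAMPLE = """\
Contig1430
-150.90:
-150.70:
672:693
711:732
679:700
-149.70:
"""


def parse(lines):
    """Return [(contig, [(header, [range_line, ...]), ...]), ...]."""
    contigs = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines only separate the structures
        if line.startswith("Contig"):
            contigs.append((line, []))
        elif line.startswith("-"):
            contigs[-1][1].append((line, []))
        else:
            contigs[-1][1][-1][1].append(line)
    return contigs


def drop_empty(contigs):
    """Keep only the substructures that list at least one range."""
    return [(name, [(h, r) for h, r in subs if r]) for name, subs in contigs]


if __name__ == "__main__":
    for name, subs in drop_empty(parse(SAMPLE.splitlines())):
        print(name)
        for header, ranges in subs:
            print(header)
            print("\n".join(ranges))
        print()
```

Saved to a file with execute permission, this runs directly from the shell like any other script. Merging the ranges is left out on purpose.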

The task that you describe, while not entirely trivial, is the work of a simple Perl program that would be no more than a hundred lines or so, if that. (No, I'm not going to write it for you.) :tisk:

