LinuxQuestions.org - Adding a line

Hi

I have a file whose contents are as follows:

Code:

sorce1      LEN  assumption  695    3570    0.770047        -      .      ID=f000001.1;source_id=A.off_LEN_10008424;

sorce1      LEN  descriptive    3334    3570    .      -      0      Parent=f000001.1;



sorce1      LEN  assumption    8859    11328  0.628724        +      .      ID=f000002.1;source_id=A.off_LEN_10008425;

sorce1      LEN  descriptive    8859    9032    .      +      0      Parent=f000002.1;



sorce1      LEN  assumption    354569    361011  0.628724        +      .      ID=f000012.1;source_id=A.off_LEN_10008425;

sorce1      LEN  descriptive        354600    360111    .      +      0      Parent=f000012.1;



sorce1      LEN  assumption    350567    354686    0.628724        +      .      ID=f000012.2;source_id=A.off_LEN_10008425;

sorce1      LEN  descriptive    350567    353321    .      +      0                      Parent=f000012.2;

I want it to look like this:

Code:



sorce1      LEN  predictive    695    3570    0.770047        -      .      ID=f000001;source_id=A.off_LEN_10008424;

sorce1      LEN  assumption  695    3570    0.770047        -      .      ID=f000001.1;source_id=A.off_LEN_10008424;

sorce1      LEN  descriptive    3334    3570    .      -      0      Parent=f000001.1;



sorce1      LEN  predictive    8859    11328  0.628724        +      .      ID=f000002;source_id=A.off_LEN_10008425;

sorce1      LEN  assumption    8859    11328  0.628724        +      .      ID=f000002.1;source_id=A.off_LEN_10008425;

sorce1      LEN  descriptive    8859    9032    .      +      0      Parent=f000002.1;



sorce1      LEN  predictive    350567    361011    0.628724        +      .      ID=f000012;source_id=A.off_LEN_10008425;

sorce1      LEN  assumption    354569    361011  0.628724        +      .      ID=f000012.1;source_id=A.off_LEN_10008425;

sorce1      LEN  descriptive        354600    360111    .      +      0      Parent=f000012.1;



sorce1      LEN  assumption    350567    354686    0.628724        +      .      ID=f000012.2;source_id=A.off_LEN_10008425;

sorce1      LEN  descriptive    350567    353321    .      +      0                      Parent=f000012.2;

Basically, I want to add a line whose third column is predictive and whose ID holds only the id name, without anything after the dot.
So for every assumption line, I need to add a predictive line.

So I used this command:
sed 's/\(.*\)assumption\(.*\)\(ID=[^.]*\)[^;]*\(;.*\)/\1predictive\2\3\4\n&/' file

However, in my file there are some cases where an id name has variants. For example, one variant of an id is f000012.1 and the other is f000012.2.
The command above worked perfectly well for ids with no variants. But for ids with variants, I am getting multiple predictive lines for the same id.
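To see which ids are affected, something like the following could list every id name that has more than one variant. This is only a sketch: it assumes the columns are separated by whitespace and that the ID= attribute comes first in column 9, and list_variant_ids is just a name made up for this example.

```shell
# Hypothetical helper: print each base id that has more than one
# assumption line, followed by the number of variants found.
list_variant_ids() {
    awk '
    $3 == "assumption" {
        id = $9
        sub(/^ID=/, "", id)       # drop the leading "ID="
        sub(/[.;].*$/, "", id)    # drop the ".N" variant and the remaining attributes
        count[id]++
    }
    END {
        for (id in count)
            if (count[id] > 1)
                print id, count[id]
    }' "$1"
}
```

It would be called as list_variant_ids file.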

Result of the command:

Code:

sorce1      LEN  predictive    695    3570    0.770047        -      .      ID=f000001;source_id=A.off_LEN_10008424;

sorce1      LEN  assumption  695    3570    0.770047        -      .      ID=f000001.1;source_id=A.off_LEN_10008424;

sorce1      LEN  descriptive    3334    3570    .      -      0      Parent=f000001.1;



sorce1      LEN  predictive    8859    11328  0.628724        +      .      ID=f000002;source_id=A.off_LEN_10008425;

sorce1      LEN  assumption    8859    11328  0.628724        +      .      ID=f000002.1;source_id=A.off_LEN_10008425;

sorce1      LEN  descriptive    8859    9032    .      +      0      Parent=f000002.1;



sorce1      LEN  predictive  354569    361011  0.628724        +      .      ID=f000012.1;source_id=A.off_LEN_10008425;

sorce1      LEN  assumption    354569    361011  0.628724        +      .      ID=f000012.1;source_id=A.off_LEN_10008425;

sorce1      LEN  descriptive        354600    360111    .      +      0      Parent=f000012.1;



sorce1      LEN  predictive    350567    354686    0.628724        +      .      ID=f000012.2;source_id=A.off_LEN_10008425;

sorce1      LEN  assumption    350567    354686    0.628724        +      .      ID=f000012.2;source_id=A.off_LEN_10008425;

sorce1      LEN  descriptive    350567    353321    .      +      0                      Parent=f000012.2;

whereas what I need for f000012 is a single line like this:
sorce1 LEN predictive 350567 361011 0.628724 + . ID=f000012;source_id=A.off_LEN_10008425;

Is there a way to add only a single predictive line per id, using the earliest start point and the farthest end point among its variants, as in the line above? The ID name shouldn't have variants.
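One possible approach is sketched below with awk instead of sed, since sed looks at each line on its own and cannot compare variants. The idea is to read the file twice: the first pass records the smallest start and largest end for each id name, and the second pass prints one predictive line before the first assumption line of that id. The assumptions here are that columns are separated by whitespace, that column 9 has no spaces and starts with ID=, and that the score, strand and source_id can be copied from the first variant. The name add_predictive is made up for the example, and the new line is written tab-separated.

```shell
# Hypothetical helper: add one predictive line per base id.
add_predictive() {
    awk -v OFS='\t' '
    # Return the id from column 9 without "ID=" and without the ".N" variant.
    function base_id(attr,    id) {
        id = attr
        sub(/^ID=/, "", id)
        sub(/[.;].*$/, "", id)
        return id
    }
    # Pass 1: remember the earliest start and farthest end per base id.
    NR == FNR {
        if ($3 == "assumption") {
            id = base_id($9)
            if (!(id in lo) || $4 + 0 < lo[id] + 0) lo[id] = $4
            if (!(id in hi) || $5 + 0 > hi[id] + 0) hi[id] = $5
        }
        next
    }
    # Pass 2: before the first assumption line of each base id,
    # print a single predictive line with the merged coordinates.
    $3 == "assumption" {
        id = base_id($9)
        if (!(id in done)) {
            done[id] = 1
            attr = $9
            sub(/^ID=[^;]*/, "ID=" id, attr)   # strip the variant from the ID
            print $1, $2, "predictive", lo[id], hi[id], $6, $7, $8, attr
        }
    }
    # Every original line is printed unchanged.
    { print }
    ' "$1" "$1"
}
```

It would be called as add_predictive file > file.new. Because the file is read twice, it has to be a regular file and not a pipe.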

Thanks in advance.