LinuxQuestions.org
Old 11-30-2010, 08:34 PM   #1
kswapnadevi
LQ Newbie
 
Registered: Oct 2010
Posts: 16

Rep: Reputation: 0
shell scripting


In the output below, -50.00, -49.20, -49.20 and -47.60 are four structures: three in Chr8 and one (-47.60) in Chr9. I have a big file with many chromosomes, each holding several structures, and each structure contains a number of values. In this output, -50.00 contains no values, while -49.20 contains 40:61 and 70:91 repeated many times; those duplicates need to be removed. Similarly, under -47.60, 65:86 is repeated many times. The required output is shown below. A shell script for this would be highly appreciated. Thanks in advance.



Quote:
output generated:

Chr8:86884850-86884997

-50.00

-49.20

40:61

70:91

40:61

70:91

40:61

70:91

40:61

70:91

40:61

70:91

40:61

70:91

40:61

70:91

-49.20

Chr9:86884850-86884997

-47.60

65:86

65:86

65:86

65:86

65:86

65:86

65:86

Quote:
output required

Chr8:86884850-86884997

-50.00

-49.20

40:61

70:91

-49.20

Chr9:86884850-86884997

-47.60

65:86
 
Old 11-30-2010, 09:20 PM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 910
I think this awk snippet is close to what you need :}
Code:
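{
  if( $1 ~ /^Chr/ ){
    # new "Chr..." section: print the header and forget the values seen so far
    print $1
    delete a
  } else {
    if( a[$1]++){
      # already printed in this section
      next
    } else {
      # first occurrence in this section: print it followed by an empty line
      print $1"\n"
    }
  }
}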
 
Old 11-30-2010, 09:30 PM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,565

Rep: Reputation: 2901
I might have missed something, but wouldn't the normal exclude duplicates work?
Code:
awk '!a[$0]++' file
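
For reference, a small sketch of what that whole-file idiom does with a cut-down sample shaped like the OP's data. The here-doc input (blank separator lines left out) and the function name `dedupe_global` are made up for illustration. `seen[$0]++` is 0 the first time a line turns up, so `!seen[$0]++` is true only then; that also means the second -49.20 is dropped along with the repeated pairs:

```shell
# Whole-file dedupe: print a line only the first time it is seen.
# dedupe_global is a hypothetical helper name, not something from the thread.
dedupe_global() {
  awk '!seen[$0]++' "$@"
}

# Hypothetical cut-down sample (blank separator lines left out).
dedupe_global <<'EOF'
Chr8:86884850-86884997
-50.00
-49.20
40:61
70:91
40:61
70:91
-49.20
EOF
```

On the real file every blank line after the first would presumably be swallowed as well, since an empty line is just another repeated `$0`.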
 
Old 11-30-2010, 09:45 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 15,999

Rep: Reputation: 2219
I suspect the OP needs to define the requirements better: the required output shows -49.20 twice (beginning and end, maybe?).
Perhaps all the negative values are to be kept?

Guessing ...
 
Old 12-01-2010, 11:42 AM   #5
Kenhelm
Member
 
Registered: Mar 2008
Location: N. W. England
Distribution: Mandriva
Posts: 341

Rep: Reputation: 143
It looks as though a negative number marks the start of a record and duplicates are to be removed from each record.
Code:
awk '/^-/{delete a}; $0{if (!a[$0]++) print $0"\n"}'

Last edited by Kenhelm; 12-01-2010 at 11:46 AM.
 
Old 12-01-2010, 01:24 PM   #6
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 910
Quote:
Originally Posted by grail View Post
I might have missed something, but wouldn't the normal exclude duplicates work?
Code:
awk '!a[$0]++' file
My understanding was that the "ChrX:number-mumbles"
designate sections, and the "uniqueness" of the purely
numeric values was per section; hence my convoluted
script ... maybe the OP will come back to elaborate ;}



Cheers,
Tink
 
Old 12-01-2010, 01:51 PM   #7
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
OK, I'll toss in a few cents; I understand it as follows (and at least two contributors to the thread appear to be extremely close if not right on with their code):

The input contains records, and sub-records (the sub-records may contain fields). A record begins with "Chr..." (print those), and sub-records within that record begin with "-nn.nn" (print those too). The goal is to remove duplicate "nn:nn" fields from within the sub-records (so print the "nn:nn" items without duplication.)
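
A rough sketch of that reading, in case it helps. It is only one way to write it, and it assumes that every sub-record line starts with a minus sign followed by a digit and that blank lines are just separators. The function name `dedupe_fields` is made up:

```shell
# Sketch of the record / sub-record reading described above.
# dedupe_fields is a hypothetical name; pass the data file as an argument or on stdin.
dedupe_fields() {
  awk '
    NF == 0     { next }                      # skip blank separator lines
    /^Chr/      { delete seen; print; next }  # record header: always print, reset
    /^-[0-9]/   { delete seen; print; next }  # sub-record header: always print, reset
    !seen[$0]++ { print }                     # nn:nn field: first occurrence only
  ' "$@"
}
```

Usage would be something like `dedupe_fields file`. The output is single-spaced; changing each `print` to `print $0 "\n"` should bring back the empty line after every entry if that spacing is wanted.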
 
Old 12-01-2010, 06:30 PM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,565

Rep: Reputation: 2901
So maybe something more like:
Code:
awk '/^-/ || !a[$0]++' file
Edit: hhmmm ... just realised I have not allowed for the extra empty lines in the formatting
Code:
awk '/^-/ || !a[$0]++ && !/^$/{print $0"\n"}' file

Last edited by grail; 12-01-2010 at 06:36 PM.
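
One caveat, assuming uniqueness is meant per structure as Kenhelm and GrapefruiTgirl read it: the array `a` is never reset in the one-liner above, so a pair already printed under one structure is suppressed under every later one. A sketch with made-up data, where 40:61 belongs to both -49.20 and -47.60. The names `global_keep`, `per_record` and `sample` are hypothetical, and `per_record` just adds the `delete` idea from post #5:

```shell
# Global version from the post above (single-spaced here for brevity).
global_keep() { awk '/^-/ || (!a[$0]++ && !/^$/)' "$@"; }

# Same idea, but forget earlier values whenever a new structure starts.
per_record() { awk '/^-/ { delete a; print; next } !/^$/ && !a[$0]++' "$@"; }

# Made-up input: the pair 40:61 occurs under two different structures.
sample() { printf '%s\n' 'Chr8:1-100' '-49.20' '40:61' '40:61' '-47.60' '40:61'; }

sample | global_keep   # 40:61 is printed once in total
echo ---
sample | per_record    # 40:61 is printed once under each structure
```

On the OP's posted sample both give the same lines, so the difference only shows up if a value can recur under a later structure in the big file.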
 
  


All times are GMT -5.