LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.

Old 12-18-2011, 11:34 PM   #1
verse123
LQ Newbie
 
Registered: Oct 2011
Posts: 19

printing rows with repeated strings


Hi guys,

I am trying to print the rows of a file that contain repeated strings (in this example, the word DOG followed by a number). For example:

col1 col2 col3
DOG1 233 1
DOG1 231 1
DOG4 230 5


I can do something like
Code:
awk '{if($3>=1){print}}' | sort -k1,1
but I do not know what to use to print every row that contains a repeated string. Any ideas?
 
Old 12-19-2011, 04:32 PM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Could you explain your goal in more detail? For example:

Do you need to print out lines that match a specific string, or any repeating string in the file?
Are the matching lines sequential, or can they be scattered through the file?
Are the matched words all in the same column, or can they be in different ones?
Does the order of the output matter, and if so, how should it appear?

You might want to give us a larger example of the input, and perhaps tell us the context of what you're trying to do.
 
Old 12-19-2011, 08:40 PM   #3
verse123
LQ Newbie
 
Registered: Oct 2011
Posts: 19

Original Poster
Hi,

I need to print every line that contains a repeated string; the matching lines are scattered throughout the file. The matched words are all in the same column, and the order of the output does not matter as long as every line with a repeated string ends up in the output file. In the example below, the word "DOG1" should appear in the output file 4 times, each time with the corresponding values from the other columns. The word "DOG3" should appear twice, again with its other columns (for example, one of its lines should look like this: DOG3 0.04 2).


col1 col2 col3
DOG1 233 1
DOG1 231 1
DOG4 230 5
DOG1 0.5 3
DOG3 0.04 2
DOG0 4 23
DOG1 5 0.1
DOG3 63 5
 
Old 12-19-2011, 10:47 PM   #4
Telengard
Member
 
Registered: Apr 2007
Location: USA
Distribution: Kubuntu 8.04
Posts: 579
Blog Entries: 8

Hi, verse123. Here's how I read your program specification.

Quote:
Originally Posted by verse123
. . . print out lines that match . . . The matched words are also in the same column . . . the order of the output does not matter . . . "DOG1" should appear in the output file 4 times . . . "DOG3" should appear twice in the output
Hope I understand you. Here's the input file I copied from your post.

Code:
$ cat dogs.txt
DOG1 233 1
DOG1 231 1
DOG4 230 5
DOG1 0.5 3
DOG3 0.04 2
DOG0 4 23
DOG1 5 0.1
DOG3 63 5
Here's the program I use to parse the file.

Code:
$ cat dog-finder.awk
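#! /usr/bin/awk -f

BEGIN {
    # Default whitespace field splitting is sufficient here.
    # FS=" "
    # OFS=" "
}

{
    key = $1
    count[key]++
    if (count[key] == 1) {
        # First time this key is seen: hold the line until a repeat shows up.
        first[key] = $0
    } else {
        # A repeat: release the held first line once, then print this line.
        # Counting keeps an odd number of repeats (e.g. three) complete.
        if (count[key] == 2) {
            print first[key]
        }
        print $0
    }
}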
Here's the output my program produced when I fed it the input file. I believe this program meets your specifications as I understood them.

Code:
$ ./dog-finder.awk dogs.txt
DOG1 233 1
DOG1 231 1
DOG1 0.5 3
DOG1 5 0.1
DOG3 0.04 2
DOG3 63 5
HTH
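
For comparison, here is a minimal two-pass sketch of the same task. It assumes the input is a regular file that awk can read twice (it will not work on a pipe), and that the repeated string is always in column 1, as described in post #3. The first pass counts each column-1 value; the second prints every line whose value was counted more than once, keeping the input order. The temporary file and the variable names are only for illustration.

```shell
# Recreate the sample input from the thread (header line included).
dogs=$(mktemp)
cat > "$dogs" <<'EOF'
col1 col2 col3
DOG1 233 1
DOG1 231 1
DOG4 230 5
DOG1 0.5 3
DOG3 0.04 2
DOG0 4 23
DOG1 5 0.1
DOG3 63 5
EOF

# Pass 1 (NR == FNR is true only while reading the first copy): count column 1.
# Pass 2: print the lines whose column-1 value occurred more than once.
result=$(awk 'NR == FNR { count[$1]++; next } count[$1] > 1' "$dogs" "$dogs")
printf '%s\n' "$result"

rm -f "$dogs"
```

Naming the file twice on the command line is what makes awk read it twice. The header line drops out on its own because "col1" occurs only once.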
 
Old 12-20-2011, 05:56 AM   #5
Cedrik
Senior Member
 
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140

A Perl version:

Code:
perl -ane 'push @{$s{$F[0]}},$_;END{for(keys %s){print @{$s{$_}} if @{$s{$_}}>1;}}' file.txt
Edit: actually I prefer:
Code:
perl -ane '$s{$F[0]}.=$_;END{print for(grep {/\n./}values %s)}' file.txt
Edit 2: it works as well without for() and values:
Code:
perl -ane '$s{$F[0]}.=$_;END{print grep{/\n./}%s}' file.txt
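
The last two one-liners rely on string concatenation: each line is appended to a hash entry keyed by column 1, so an entry that collected more than one line contains a newline followed by more text, which is what /\n./ tests for. A rough awk equivalent of that idea, as a sketch only (the array and variable names are made up here, and the for-in loop visits keys in an unspecified order, so the groups may come out in any order):

```shell
# Sample input from the thread, header line included.
input='col1 col2 col3
DOG1 233 1
DOG1 231 1
DOG4 230 5
DOG1 0.5 3
DOG3 0.04 2
DOG0 4 23
DOG1 5 0.1
DOG3 63 5'

grouped=$(printf '%s\n' "$input" | awk '
    {
        # Append every line to the group for its column-1 value.
        lines[$1] = lines[$1] $0 "\n"
        seen[$1]++
    }
    END {
        # Print only the groups that collected more than one line.
        for (key in lines)
            if (seen[key] > 1)
                printf "%s", lines[key]
    }')
printf '%s\n' "$grouped"
```

Unlike the Perl version, this keeps an explicit counter instead of testing for an embedded newline; both select the same groups.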

Last edited by Cedrik; 12-20-2011 at 12:50 PM.
 
  


Tags
awk, perl


All times are GMT -5.
