LinuxQuestions.org
03-16-2006, 05:00 PM   #1
eluzi
LQ Newbie
 
Registered: Apr 2005
Location: BRAZIL !!!
Distribution: Fedora4 :D
Posts: 17

Rep: Reputation: 0
Shell Script - filter list


Lads, I'd like to develop a shell script that takes a list of words in a file and tells me which words are in it and how many times each one is used... Gotta think...

Example:

world
window
linux
keyboard
linux
world
linux

Output:

world = 2
window = 1
keyboard = 1
linux = 3
 
03-16-2006, 06:04 PM   #2
Matir
LQ Guru
 
Registered: Nov 2004
Location: San Jose, CA
Distribution: Debian, Arch
Posts: 8,507

Rep: Reputation: 128
Something like this should work:
Code:
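#!/bin/bash

# Usage: script <FILENAME>

# List each distinct line once, then count its exact occurrences in the file.
sort -u "$1" | while IFS= read -r WORD; do
    # -c: count matching lines, -F: treat WORD literally, -x: match whole lines only
    printf '%s = %s\n' "$WORD" "$(grep -cFx -e "$WORD" "$1")"
done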
It's not terribly efficient (it rescans the file once for every distinct word), but it should work.
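If the rescanning becomes a problem on a long list, a single pass with sort and uniq -c is a common alternative. The sketch below assumes one word per line with no spaces inside a word; count_words is just a name made up for this example. Note that the output comes out in sorted order, not in the order the words first appear.

```shell
#!/bin/bash
# count_words is a hypothetical helper name, not something from the thread.
# Assumes one word per line and no whitespace inside a word.
count_words() {
    # sort groups identical lines together; uniq -c prefixes each group
    # with its count; awk turns "COUNT WORD" into "WORD = COUNT".
    sort -- "$1" | uniq -c | awk '{ print $2 " = " $1 }'
}
```

Called as count_words words.txt on the example list from the first post, this should print keyboard = 1, linux = 3, window = 1 and world = 2, one per line.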
 
03-17-2006, 02:26 PM   #3
eluzi
LQ Newbie
 
Registered: Apr 2005
Location: BRAZIL !!!
Distribution: Fedora4 :D
Posts: 17

Original Poster
Rep: Reputation: 0
Great! Just learning about the uniq command helped me a lot. The only thing left to do now is to pick out a field counting in reverse order. Let me explain with an example:

www.visavale.com.br
www.vivario.org.br
www.vivifernandes.theblog.com.br

I want to take the domain part just before .br: for the first it's .COM, for the second it's .ORG, and for the third it's .COM. I'm trying to use the cut command with the dot as separator, but as you can see I'd have to count fields from the end towards the beginning and take the second one from the end. So I'd have .COM for the first, and so on...
If I use cut from the beginning and choose the 3rd field I get .COM and .ORG, but for the third line it would be 'theblog', which is not what I want...
Is it possible?
 
03-17-2006, 07:06 PM   #4
Matir
LQ Guru
 
Registered: Nov 2004
Location: San Jose, CA
Distribution: Debian, Arch
Posts: 8,507

Rep: Reputation: 128
Code:
sed 's/.*\(\.[a-z]\{3\}\)\.br$/\1/'
Should do it. Regular expressions are not my strength, so it may take some tweaking.
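Another way to get the "count from the end" behaviour asked about above is awk, which knows how many fields each line has. This is only a sketch: it assumes every line is a hostname whose last label is the country code (as in the .com.br examples), and second_to_last_label is a name made up for this example.

```shell
#!/bin/bash
# second_to_last_label is a hypothetical helper name, not something from the thread.
# Reads hostnames from the files given as arguments, or from stdin if none.
second_to_last_label() {
    # -F. splits each line on dots; NF is the number of fields on the line,
    # so $(NF-1) is the field just before the last one. Lines with fewer
    # than two fields are skipped.
    awk -F. 'NF >= 2 { print "." $(NF-1) }' "$@"
}
```

Fed the three example hostnames, this should give .com, .org and .com. Piping the result through sort | uniq -c would then count how often each one occurs.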
 
  


All times are GMT -5.
