LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

11-04-2008, 11:11 AM   #1
Acidg3rm5
LQ Newbie
 
Registered: Nov 2008
Posts: 5

Rep: Reputation: 0
Taking in a text file and analyzing it


Hi. I would like to write a shell program using bash that takes a given text file and analyzes it to produce the frequency of each character in the file. I also want to report the frequency of 1-letter words, 2-letter words, and so on up to 4-letter words.
I'm very new to Linux, and this is my first program. Can anyone please help me with it? I would love to write out something here so that someone could correct me, but I just have no idea how to start.
An example of the output would be something like this:
Character used          Number of occurrences
a                       2
b                       4
------------------------------------------

Length of words used    Number of occurrences
1 letter word           1
2 letter word           3
3 letter word           2
4 letter word           8
Many thanks.

Last edited by Acidg3rm5; 11-04-2008 at 11:12 AM.
 
11-04-2008, 11:34 AM   #2
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 19,007

Rep: Reputation: 4341
This sounds very much like homework......

These bash scripting guides should get you started.
http://tldp.org/LDP/abs/html/
http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html
 
11-04-2008, 02:03 PM   #3
nishamathew1980
Member
 
Registered: Oct 2008
Posts: 37

Rep: Reputation: 16
A simple Google search will even get you the actual script you need. Have fun "googling"



Last edited by nishamathew1980; 11-09-2008 at 04:57 AM.
 
11-05-2008, 12:10 PM   #4
Acidg3rm5
LQ Newbie
 
Registered: Nov 2008
Posts: 5

Original Poster
Rep: Reputation: 0
I was able to come up with something after some research. However, I don't know how to report the frequency of 1-letter words up to 4-letter words in the code, since I have set the field separator to "".

Another problem I'm hitting is that the program also counts and outputs the whitespace it encounters. Is there a way to make the program ignore whitespace, or at least not print it? Here's an example of my output now.

echo i am testing | bash words.sh
Character used Number of Occurrence
  2 <<< Can I get rid of this? Can I tell the program not to print the count for whitespace?
a 1
e 1
g 1
i 2
m 1
n 1
s 1
t 2



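Quote:
nawk '
BEGIN {
    # Empty field separator: each character becomes its own field
    # (works in common awk implementations; POSIX leaves it unspecified)
    FS = ""
    print "Character used\t Number of Occurrence"
}

{
    # Count every character on the line, whitespace included
    for (i = 1; i <= NF; i++)
        count[$i]++
}

END {
    # Print each character with its count (iteration order is unspecified)
    for (i in count)
        print i "\t\t", count[i]
}'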
 
11-05-2008, 04:22 PM   #5
openSauce
Member
 
Registered: Oct 2007
Distribution: Fedora, openSUSE
Posts: 252

Rep: Reputation: 39
Quote:
Originally Posted by Acidg3rm5
I was able to come up with something after some research. However, I don't know how to report the frequency of 1-letter words up to 4-letter words in the code, since I have set the field separator to "".
If you have a look at the man page for awk, you should be able to find a way to determine the length of a word. Once you've got that you can adapt the script you've got so far.
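As a rough illustration of that hint, here is a minimal sketch that uses awk's built-in length() function with the default field separator, so that each field is one word. The function name word_lengths and the sample sentence are made up for this example and do not come from the thread. It uses plain awk rather than nawk, assuming either is installed, and it treats punctuation attached to a word as part of that word.

```shell
# word_lengths: hypothetical helper name, not from the thread.
# Reads text on stdin and counts the words that are 1 to 4 characters long.
word_lengths() {
    awk '
    {
        # With the default field separator, runs of blanks split the
        # line into words, so each field $i is one word.
        for (i = 1; i <= NF; i++) {
            n = length($i)
            if (n >= 1 && n <= 4)
                count[n]++
        }
    }
    END {
        print "Length of words used\tNumber of Occurrence"
        # Lengths that never occurred print as 0.
        for (n = 1; n <= 4; n++)
            printf "%d letter word\t\t%d\n", n, count[n]
    }'
}

# Sample run on a made-up sentence
echo "i am testing it so far" | word_lengths
```

Because this version leaves the field separator alone, it cannot be merged directly into a script that sets FS to ""; one option is to run the two counts as separate awk commands over the same file.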
 
11-05-2008, 05:00 PM   #6
nukoso
LQ Newbie
 
Registered: Jan 2008
Posts: 19

Rep: Reputation: 0
You should do your homework!

And take a closer look at
$ man awk
and
$ man cut
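On the whitespace question from post #4, one possible approach (a sketch, not necessarily what the man pages would lead you to) is to test each character before counting it. The function name char_freq is invented for this example. It walks each line with substr() instead of setting FS to "", which is an assumption made here for portability, and it pipes the result through sort because awk's "for (c in count)" loop returns keys in no particular order.

```shell
# char_freq: hypothetical helper name, not from the thread.
# Reads text on stdin and prints how often each non-blank character occurs.
char_freq() {
    awk '
    {
        # Walk the line one character at a time with substr(),
        # which avoids relying on an empty field separator.
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            # Skip spaces and tabs so they are never counted or printed.
            if (c != " " && c != "\t")
                count[c]++
        }
    }
    END {
        for (c in count)
            print c "\t" count[c]
    }' | sort
}

# Same input as in post #4
echo "i am testing" | char_freq
```

For the input from post #4 this should list the same eight letters and counts as before, without the line for the space character.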
 
  


All times are GMT -5.
