LinuxQuestions.org
01-10-2010, 05:40 AM   #1
SeraphimNL
LQ Newbie
 
Registered: Dec 2009
Location: Leeuwarden
Distribution: Fedora 11
Posts: 5

Rep: Reputation: 0
If-statement in Shell Script


Hello LQ'er,

For a school assignment I have to write some sort of e-mail classification program. It's kinda fake, since there are no mail servers involved, only a collection of old e-mails. The commands I mostly need to work with are grep, sed, paste and other basic commands like that. Now I have to write a shell script that does the following: extract an e-mail subject, make a word list of that subject, and check if it contains words that are classified as possible spam. This is what I've written so far:

Code:
      #!/bin/bash

      # spam words used for matching
      spamwords=output/only-spam.subject.all.lemm_stop.txt

      # files to examine
      FILES="data/lemm_stop/part*/*"

      #name of output file
      OUT="output/subspam-list.txt"
      rm -f $OUT  # remove output file (to prevent appending to an existing file) 
                  # -f: script will not produce an error message if file to remove does not exist

      for i in $FILES do
            egrep -wh '^Subject' | sed '/Subject: /d' | sed -r 's/[[:space:]]/\n/g' | sed -r 's/^[^[:alpha:]]*|[^[:alpha:]]*$//g' | egrep -wi '^[[:alpha:]]{3,}' | sed '/^$/d' > $i.temp
            grep -wf only-spam.subject.all.lemm_stop.txt $i.temp
            if filename == spam, echo $filename.subspam >> subspam.list.txt
      else echo $filename.subham >> subspam.list.txt
      rm $filename.temp
I am totally stuck on the part where I have to use an if-statement.

Code:
            grep -wf only-spam.subject.all.lemm_stop.txt $i.temp
            if filename == spam, echo $filename.subspam >> subspam.list.txt
      else echo $filename.subham >> subspam.list.txt
      rm $filename.temp
First I have to compare the created wordlist of the subject of the mail to the wordlist with classified spam words: only-spam.subject.all.lemm_stop.txt. How can I use the if-statement to make this work? How can I check for a match? I hope you can give me some advice!

Thanks in advance,

Seraphim
 
01-10-2010, 07:35 AM   #2
kofucii
Member
 
Registered: May 2007
Location: Bulgaria
Distribution: Slackware, SCO Unix
Posts: 62

Rep: Reputation: 20
Code:
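#!/bin/bash

# spam words used for matching
SPAMWORDS=`cat output/only-spam.subject.all.lemm_stop.txt`

# files to examine
FILES=`ls data/lemm_stop/part*/*`

# name of output file
OUT=output/subspam-list.txt
rm -f $OUT  # remove output file (to prevent appending to an existing file)
            # -f: script will not produce an error message if file to remove does not exist

for i in $FILES
do
    # build a word list from the subject line of file $i
    # (the first sed strips the "Subject: " prefix instead of deleting the whole line)
    grep -Ewh '^Subject' $i | sed 's/^Subject: //' | sed -r 's/[[:space:]]/\n/g' | sed -r 's/^[^[:alpha:]]*|[^[:alpha:]]*$//g' | grep -Ewi '^[[:alpha:]]{3,}' | sed '/^$/d' > $i.temp

    for s in $SPAMWORDS
    do
        # $s is a single word here, not a file name, so plain -w is used instead of -wf
        SPAM=`grep -w "$s" $i.temp`
        if [ -n "$SPAM" ]; then
            echo $i >> $OUT
            break  # one match is enough, and it keeps the file from being listed twice
        fi
    done

    rm -f $i.temp  # clean up only after all spam words have been checked
done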
Try this out. Keep in mind that there are two nested "for" loops, so the running time grows as (number of files) x (number of spam words), i.e. the Big-O is roughly O(N^2).
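
A possible variation on the above, offered as a sketch rather than a tested drop-in: grep's exit status already says whether anything matched, so it can be used directly as the if condition, and -f lets grep read the whole spam word list in one go, which removes the inner loop. The function name classify_subject is made up for illustration, the labels subspam/subham are borrowed from the first post, and the spam list is assumed to hold one word per line with no blank lines.

```shell
#!/bin/bash

# classify_subject WORDLIST SPAMLIST   (hypothetical helper, not from the thread)
# Prints "subspam" if any word from SPAMLIST occurs as a whole word in
# WORDLIST, otherwise "subham".
# Assumes SPAMLIST has one word per line and no blank lines.
classify_subject() {
    local wordlist=$1 spamlist=$2
    # -q: print nothing, only set the exit status (0 = at least one match)
    # -w: match whole words only
    # -i: ignore case
    # -F: treat the spam words as fixed strings, not regular expressions
    # -f: read the patterns from a file, one per line
    if grep -qwiF -f "$spamlist" "$wordlist"; then
        echo subspam
    else
        echo subham
    fi
}
```

Inside the outer loop this could be called as classify_subject "$i.temp" output/only-spam.subject.all.lemm_stop.txt, with the printed label appended to the output file.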

Last edited by kofucii; 01-10-2010 at 07:36 AM.
 
01-10-2010, 07:49 AM   #3
ozanbaba
Member
 
Registered: May 2003
Location: Tengiz
Distribution: Slackware64 14.1
Posts: 672

Rep: Reputation: 94
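The for loop is wrong: "do" has to come after a semicolon or on a new line, and the loop has to be closed with "done". Also remember that a variable is defined without $ and read with $, so the loop variable is written as i in the for line and as $i inside the loop.

for i in `ls $FILES`; do # this will pass the ls output to the loop body one item at a time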

And using backticks (`) you can easily assign the output of a program to a variable.

example:

ls -d /home/*/ | grep -i ozan will output /home/ozan/ on my system.
If I write OZANHOME=`ls -d /home/*/ | grep -i ozan`, OZANHOME will be /home/ozan/ (that trailing slash can create some problems).
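
To make the idea concrete, here is a small sketch. It uses the $(...) form, which does the same job as backticks but nests more easily, and ${var%/} to drop the trailing slash. The helper find_home and the directory names in the demonstration are invented for this example.

```shell
#!/bin/bash

# find_home BASE NAME   (hypothetical helper)
# Prints the first directory directly under BASE whose name contains NAME
# (case-insensitive), without a trailing slash. Returns 1 if none matches.
find_home() {
    local base=$1 name=$2 match
    # $(...) captures the output of the commands, just like `...`;
    # the cd happens in a subshell, so the caller's directory is unchanged
    match=$(cd "$base" && ls -d -- */ 2>/dev/null | grep -i -- "$name" | head -n 1)
    [ -n "$match" ] || return 1
    # ${match%/} removes one trailing slash, if there is one
    echo "$base/${match%/}"
}

# Demonstration in a scratch directory (made-up names)
demo=$(mktemp -d)
mkdir "$demo/ozan" "$demo/guest"
OZANHOME=$(find_home "$demo" ozan)
echo "$OZANHOME"    # a path ending in /ozan, with no trailing slash
rm -rf "$demo"
```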


if statements are like this:

if [ statement ]; then
do something
else
do something else otherwise
fi

For the statement part, where you do the tests, look at this; IBM wrote a very nice document.
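
As a rough illustration of what can go in the statement part, here are a few common tests. Note that [ is a command, so the spaces around the brackets and operators are required. The function check_wordlist and its messages are hypothetical, chosen only to show -f, -s and -lt in context.

```shell
#!/bin/bash

# check_wordlist FILE MIN   (hypothetical helper)
# Prints "ok" if FILE is a non-empty regular file with at least MIN lines,
# otherwise prints the reason and returns 1.
check_wordlist() {
    local file=$1 min=$2 lines
    if [ ! -f "$file" ]; then          # -f: exists and is a regular file
        echo "missing"
        return 1
    fi
    if [ ! -s "$file" ]; then          # -s: exists and is larger than zero bytes
        echo "empty"
        return 1
    fi
    lines=$(wc -l < "$file")
    if [ "$lines" -lt "$min" ]; then   # -lt: numeric "less than"
        echo "too short"
        return 1
    fi
    echo "ok"
}
```

Always quote the variables inside [ ... ]; an unquoted empty variable makes the test see fewer arguments than intended.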
 
01-11-2010, 01:24 AM   #4
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
There are good tutorials here:
http://rute.2038bug.com/index.html.gz
http://tldp.org/LDP/Bash-Beginners-G...tml/index.html
 
01-11-2010, 01:30 AM   #5
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 551
Plus, one is not limited to simply IF-THEN-ELSE:

Code:
if <something>; then
  <do stuff A>
elif <something else>; then
 <do different stuff B>
elif <try a third test>; then
 <do plan C>
else
 # none of the above tests were acted upon
 <do pretty much nothing, or plan D>
fi
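
Applied to the spam question in this thread, such a chain might look like the sketch below. The function name rate_subject and the thresholds (0 hits, 1 to 2 hits, 3 or more) are arbitrary choices for illustration, not part of the assignment, and both files are assumed to exist.

```shell
#!/bin/bash

# rate_subject WORDLIST SPAMLIST   (hypothetical helper)
# Counts how many lines of WORDLIST contain a spam word and prints a rating.
# The thresholds are arbitrary; both files are assumed to exist.
rate_subject() {
    local hits
    # -c prints the number of matching lines instead of the lines themselves
    hits=$(grep -cwiF -f "$2" "$1")
    if [ "$hits" -eq 0 ]; then
        echo "ham"
    elif [ "$hits" -lt 3 ]; then
        echo "suspicious"
    else
        echo "spam"
    fi
}
```

Only the first branch whose test succeeds is run, so the order of the tests matters.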
 
  


All times are GMT -5.