LinuxQuestions.org
Old 04-06-2016, 10:23 AM   #1
sobey
LQ Newbie
 
Registered: Apr 2016
Posts: 16

Rep: Reputation: Disabled
Seeking advice with shell script (homework)


My situation is the same as stated in the forum link below:

http://www.linuxquestions.org/questi...rk-4175485733/

Below is what I have put together so far from my research. I need to make this all one line and have it work, but so far I cannot get any of it to work. What am I doing wrong?

Also, I ask for kindness. Yes, this is homework, and I am willing to do the work; I just need guidance on where to go with it, so please do not be rude.


Code:
#!/bin/bash
# Bash TestScript

#a. Remove punctuation
sed -e 's|^[[:punct:]]*||; s|[[:punct:]]*$||;' -i gasoline

#b. Make all characters lowercase
tr '[:upper:]' '[:lower:]' < gasoline > ScriptResults

#c. Put each word on a line by itself
tr ' ' '\n' < gasoline

#d. Remove blank lines
$ sed '/^$/d' gasoline > ScriptResults

#e. Sort the text to pull all lines containing the same word on adjacent lines
sort ScriptTest | uniq -c | sort -rn | head -n 12 | sed -E 's/^ *[0-9]+ //g'

#f. Remove duplicate words from the text
sed -ri s/(.* )1/1/g ScriptResults

#g. List most used words in the file first
cat ScriptResults | tr ' '  '\012' | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | grep -v '[^a-z]' | sort | uniq -c | sort -rn | head

exit 0
 
Old 04-06-2016, 10:39 AM   #2
BW-userx
Senior Member
 
Registered: Sep 2013
Location: MID-SOUTH USA
Distribution: Void Linux / Slackware 14.2
Posts: 2,191

Rep: Reputation: Disabled
You have just about the same thing, but you still have NOT posted your input data, only a blank file that goes by the name of 'gasoline'. What is it you're actually working with here?
 
Old 04-06-2016, 10:40 AM   #3
Habitual
LQ Addict
 
Registered: Jan 2011
Location: Youngstown, Ohio
Distribution: LM17.1/Xfce4.11.8
Posts: 7,195
Blog Entries: 10

Rep: Reputation: 1985
Quote:
Originally Posted by sobey
I can not get any of it to work... What am I doing wrong?
First, describe better what "I can not get any of it to work" means exactly?
In detail, step-by-painful-step.
Any output? Error code? Permission messages? Does not exist messages?
I/O errors? Something besides "I can not get any of it to work"

Welcome to LQ!

Last edited by Habitual; 04-06-2016 at 10:41 AM.
 
Old 04-06-2016, 10:41 AM   #4
sobey
LQ Newbie
 
Registered: Apr 2016
Posts: 16

Original Poster
Rep: Reputation: Disabled
2. Create and save a new text file called Gasoline that consists of the following content:

Gas prices rose only half a penny a gallon in the past two weeks, continuing an unusual 20-week trend of mostly steady prices.

3. Create a script file called TestScript that completes the following tasks for the Gasoline file. Hint: Add one command at a time, save the TestScript file, run it, and debug it, before adding the next command.

a. Remove punctuation
b. Make all characters lowercase
c. Put each word on a line by itself
d. Remove blank lines
e. Sort the text to pull all lines containing the same word on adjacent lines
f. Remove duplicate words from the text
g. List most used words in the file first h. Send the output of this script to a file named ScriptResults
4. Give the TestScript file execute permission and run it. Important: When you are done, leave the ScriptResult file in your home directory for grading.

Hints:
 This script can be written as one continuous line of several commands, where the output of one command is piped into the next command.
 Gasoline should only appear once, as input to the first command.
 ScriptResults should only appear once, as output of the last command.
 
Old 04-06-2016, 10:45 AM   #5
sobey
LQ Newbie
 
Registered: Apr 2016
Posts: 16

Original Poster
Rep: Reputation: Disabled
I get an invalid command message. The last script I wrote gave me the same thing, and it had to do with the placement of the switches and words, which is why I am asking for guidance. Am I on the right track? Is it too in-depth or complicated?

Last edited by sobey; 04-06-2016 at 10:47 AM. Reason: added more detail
 
Old 04-06-2016, 11:32 AM   #6
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,256

Rep: Reputation: 2686
Based on the information you have provided, I would think the 'Hints' are actually requirements and therefore no hints at all.

a. Remove punctuation - You need to re-look at this one. Your current sed is trying to remove punctuation from either the start or end of the line (denoted by the ^ and $), so the comma in your example will not be removed

b, c and d look ok

e. Sort the text to pull all lines containing the same word on adjacent lines - I think you incorporated part of step 'f' into this one. On reading the statement a few times, it reads that you are only sorting the data here to be prepared to remove adjacent words, not actually do the removing here

f. Remove duplicate words from the text - the action here seems quite clear, but your sed does not appear to make any sense (could it be that some backslashes are missing for the capture group and back-reference?)

g. List most used words in the file first h - firstly, not sure if the 'h' at the end is a typo? (if not then we are missing part of this statement) I think I would need more information on this one.
If you are to in fact use a single piped together set of commands, the only way to know the most occurring words, which you removed in the previous step, is to count them, but any count used in a prior command to the pipe could be potentially lost. You could add a count to your uniq, but this then requires additional removal of the numbers at the end. Unless maybe the teacher wants to see the count (might be implied but it has not been said)

Quote:
I get an invalid command message
Please show the full error
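
To illustrate point (a) above, here is a minimal sketch of the difference between the anchored patterns and an unanchored one. The sample line is a shortened, hypothetical stand-in for the Gasoline text, and the variable names are only for illustration. Note that removing all punctuation also removes the hyphen in 20-week, which may or may not be what the assignment expects.

```shell
# Hypothetical sample line; the real input lives in the Gasoline file.
line='two weeks, continuing an unusual 20-week trend of mostly steady prices.'

# Anchored patterns (^ and $) only touch punctuation at the very start or end of the line,
# so the comma and the hyphen in the middle survive.
anchored=$(printf '%s\n' "$line" | sed -e 's|^[[:punct:]]*||; s|[[:punct:]]*$||')

# An unanchored pattern with the g flag removes punctuation wherever it appears.
global=$(printf '%s\n' "$line" | sed 's/[[:punct:]]//g')

printf '%s\n' "$anchored" "$global"
```

tr -d '[:punct:]' should give the same result as the second sed command.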
 
Old 04-06-2016, 12:06 PM   #7
sobey
LQ Newbie
 
Registered: Apr 2016
Posts: 16

Original Poster
Rep: Reputation: Disabled
This information has been extremely helpful, give me some time to work on this and I will post my findings. Again thank you.
 
Old 04-06-2016, 10:10 PM   #8
sobey
LQ Newbie
 
Registered: Apr 2016
Posts: 16

Original Poster
Rep: Reputation: Disabled
I worked on this and got output from every command except the very first one. Results below:


Code:
-bash-3.2$ sed -e 's|^[[:punct:]]*||; s|[[:punct:]]*$||;' -i Gasoline
-bash-3.2$ tr '[:upper:]' '[:lower:]' < Gasoline
gas prices rose only half a penny a gallon in the past two weeks, continuing an unusual 20-week trend of mostly steady prices
-bash-3.2$ tr ' ' '\n' < Gasoline
Gas
prices
rose
only
half
a
penny
a
gallon
in
the
past
two
weeks,
continuing
an
unusual
20-week
trend
of
mostly
steady
prices
-bash-3.2$ sed '/^$/d' Gasoline
Gas prices rose only half a penny a gallon in the past two weeks, continuing an unusual 20-week trend of mostly steady prices
-bash-3.2$ sort Gasoline | uniq -c | sort -rn | head -n 12 | sed -e 's/^ *[0-9]+ //g'
      1 Gas prices rose only half a penny a gallon in the past two weeks, continuing an unusual 20-week trend of mostly steady prices
-bash-3.2$ sed -ri .s/(.* )1/1/g. Gasoline
-bash: syntax error near unexpected token `('
-bash-3.2$ cat Gasoline | tr ' '  '\012' | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | grep -v '[^a-z]' | sort | uniq -c | sort -rn | head
      2 prices
      2 a
      1 weeks
      1 unusual
      1 two
      1 trend
      1 the
      1 steady
      1 rose
      1 penny
-bash-3.2$

Last edited by sobey; 04-06-2016 at 10:12 PM. Reason: Added code
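
One thing the output above shows: each command run on its own reads the original Gasoline file again, so sort and uniq -c see a single long line and report a count of 1. A rough sketch of the difference piping makes, using a shortened sample sentence in a scratch directory (the file contents and variable names here are assumptions for illustration only):

```shell
# Work in a scratch directory so the real Gasoline file is untouched.
tmpdir=$(mktemp -d)
printf 'Gas prices rose a penny a gallon\n' > "$tmpdir/Gasoline"   # shortened sample text

# Run on the file directly, sort sees one line, so uniq -c produces one counted line.
separate=$(sort "$tmpdir/Gasoline" | uniq -c | wc -l)

# Piped, sort receives one word per line from tr, so uniq -c counts words instead.
piped=$(tr ' ' '\n' < "$tmpdir/Gasoline" | sort | uniq -c | wc -l)

echo "lines of output when run separately: $separate, when piped: $piped"
rm -r "$tmpdir"
```

The output of one step only reaches the next step if the two are joined with a pipe (or the result is saved to a file that the next command reads).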
 
Old 04-06-2016, 10:22 PM   #9
sobey
LQ Newbie
 
Registered: Apr 2016
Posts: 16

Original Poster
Rep: Reputation: Disabled
The first command did not give me any results but the below command sequence works.

tr -d '[:punct:]' < Gasoline

combined with the other results I now need to combine all commands in one line.

Question, can I use the tr command to remove blank lines and remove duplicate words from text?
 
Old 04-07-2016, 01:00 AM   #10
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,256

Rep: Reputation: 2686
I would probably use sed to remove the blank lines and uniq for the duplicate words. You could use sort for the duplicate words, but part of the requirement seemed to be an indicator to show which words had been repeated the most, i.e. a count, so uniq can do this.
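
For reference, the sed and uniq steps suggested here could be chained with the earlier steps roughly as follows. This is only a sketch under some assumptions: that the counts are wanted in the output, that removing the hyphen from 20-week is acceptable, and that it is run in a scratch directory on a copy of the sample sentence rather than on the real file.

```shell
# Scratch directory with a copy of the sample sentence from the assignment.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' 'Gas prices rose only half a penny a gallon in the past two weeks, continuing an unusual 20-week trend of mostly steady prices.' > Gasoline

# a. remove punctuation        b. lowercase
# c. one word per line         d. drop blank lines
# e. sort                      f. collapse duplicates (with a count)
# g. most frequent first       h. write the result to ScriptResults
tr -d '[:punct:]' < Gasoline |
  tr '[:upper:]' '[:lower:]' |
  tr ' ' '\n' |
  sed '/^$/d' |
  sort |
  uniq -c |
  sort -rn > ScriptResults

head -n 3 ScriptResults
```

Gasoline appears once as input to the first command and ScriptResults once as output of the last, which matches the hints in the assignment. If the counts are not wanted, they would have to be stripped in one more step.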
 
Old 04-07-2016, 08:03 AM   #11
BW-userx
Senior Member
 
Registered: Sep 2013
Location: MID-SOUTH USA
Distribution: Void Linux / Slackware 14.2
Posts: 2,191

Rep: Reputation: Disabled
Quote:
Originally Posted by sobey
The first command did not give me any results but the below command sequence works.

tr -d '[:punct:]' < Gasoline

combined with the other results I now need to combine all commands in one line.

Question, can I use the tr command to remove blank lines and remove duplicate words from text?
here is a little how to on using tr

The tr Command
 
Old 04-07-2016, 10:23 AM   #12
sobey
LQ Newbie
 
Registered: Apr 2016
Posts: 16

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by grail
I would probably use sed to remove the blank lines and uniq for the duplicate words. You could use sort for the duplicate words, but part of the requirement seemed to be an indicator to show which words had been repeated the most, i.e. a count, so uniq can do this.
I like this thought, especially since the sed command for the duplicate words gives an error. How would I combine them? Something like the command string below, perhaps?

sed '/^$/d' | uniq -u

I know the -u switch removes duplicate lines... Are lines and words the same thing?
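
On the lines-versus-words question: uniq compares whole adjacent lines, so it only works on words once each word sits on its own line and the lines are sorted. Also, -u does not keep one copy of each duplicate; it prints only the lines that were never repeated. A small sketch with a hypothetical, already sorted word list:

```shell
# Hypothetical word list, already one word per line and sorted.
words=$(printf '%s\n' a a gallon penny prices prices)

dedup=$(printf '%s\n' "$words" | uniq)         # one copy of every line
only_once=$(printf '%s\n' "$words" | uniq -u)  # only lines that were never repeated
counted=$(printf '%s\n' "$words" | uniq -c)    # one copy of every line, prefixed with its count

printf '%s\n' "uniq:" "$dedup" "uniq -u:" "$only_once" "uniq -c:" "$counted"
```

With this input, plain uniq keeps a, gallon, penny and prices, while uniq -u keeps only gallon and penny.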
 
Old 04-07-2016, 10:28 AM   #13
sobey
LQ Newbie
 
Registered: Apr 2016
Posts: 16

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by BW-userx
here is a little how to on using tr

The tr Command
This link will be helpful as I research this. I was trying to use the -s switch and a few others without much success. I think I may be getting twisted up in the terminology. For example, I am tasked with removing duplicate words, and a command that removes duplicate characters could be seen as doing the same thing, but I am not piecing that together. If I am overthinking or overcomplicating this, please help me simplify it.
 
Old 04-07-2016, 10:51 AM   #14
BW-userx
Senior Member
 
Registered: Sep 2013
Location: MID-SOUTH USA
Distribution: Void Linux / Slackware 14.2
Posts: 2,191

Rep: Reputation: Disabled
Quote:
Originally Posted by sobey
This link will be helpful as I research this. I was trying to use the -s switch and a few others without much success. I think I may be getting twisted up in the terminology. For example, I am tasked with removing duplicate words, and a command that removes duplicate characters could be seen as doing the same thing, but I am not piecing that together. If I am overthinking or overcomplicating this, please help me simplify it.
I know string comparison works, but it seems your teacher has limited you to just sed and tr. Is that correct?

Code:
 The -d option is used to delete every instance of the string (i.e.,
 sequence of characters) specified in set1. Thus, for example, the following would
 remove every instance of the word soft from a copy of the text in a file named 
file11 and write the modified text to a file named file12:


    cat file11 | tr -d 'soft' > file12


The quotation marks are necessary for tr to treat the argument as a string. If they
 are not used, everything in the argument is instead treated as individual
 characters. Thus, if the above example were rewritten without the quotation marks,
 it would remove every instance of the letters s, o, f and t. Interestingly, the
 quotation marks cannot be used to treat arguments as strings when not using the -d
 option.


Among the few remaining options is -c, which causes tr to work on the complement of
 the specified characters, that is, on the characters that are not in the given set.


tr contains much of most basic functionality of the command line program sed, which
 is used to perform basic editing on streams of text supplied by a pipe. However, 
it often advantageous to use tr instead of sed because the former is simpler and 
requires less typing and because it is easier to incorporate into scripts.
sed : is what you should use instead.
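
A caution on the quoted text: tr works on sets of individual characters, not on strings, and quoting the argument does not change that. The sketch below (the sample text and variable names are made up for illustration) shows tr -d 'soft' deleting every s, o, f and t, while sed deletes only the literal string:

```shell
text='soft frost often softens'   # hypothetical sample text

# tr -d deletes every s, o, f and t wherever they occur, quoted or not.
by_tr=$(printf '%s\n' "$text" | tr -d 'soft')

# sed deletes only the literal string "soft".
by_sed=$(printf '%s\n' "$text" | sed 's/soft//g')

printf '[%s]\n' "$by_tr" "$by_sed"
```

This is why sed (or uniq, once the words are on separate lines) is the better fit for anything that has to treat a whole word as a unit.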

Last edited by BW-userx; 04-07-2016 at 10:58 AM.
 
Old 04-07-2016, 11:18 AM   #15
sobey
LQ Newbie
 
Registered: Apr 2016
Posts: 16

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by BW-userx
I know string comparison works, but it seems your teacher has limited you to just sed and tr. Is that correct?

The teacher has not given much help with any of this; her response so far is "Google it." This is the second time this term that we have been asked to write a script. She has given us access to a command-line Linux server called pyrite, where we do the work, but the whole class has been told to watch the videos and Google it, hence my dilemma. I am willing to do the work (I produced everything above); I just need more guidance and understanding. The sed command that is supposed to remove duplicate words gave an error. How do I fix that?

Code:
 -bash-3.2$ sed -ri .s/(.* )1/1/g. Gasoline
-bash: syntax error near unexpected token `('
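
The message comes from bash, not from sed: an unquoted ( is shell syntax, so the command never reaches sed. A minimal sketch of a quoted version follows, assuming the intent was a capture group with a \1 back-reference as suggested earlier in the thread. The sample input is hypothetical, and -i is left out so that no file is modified.

```shell
# Single quotes keep bash from interpreting ( ) and the backslash in \1.
# -E selects extended regular expressions (GNU sed also accepts -r for this).
fixed=$(printf '%s\n' 'a a gallon' | sed -E 's/(.* )\1/\1/g')
echo "$fixed"
```

This collapses a run of text that is immediately repeated on the same line. It does not find duplicates that are far apart, which is where putting one word per line and using sort and uniq comes in.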
All times are GMT -5. The time now is 08:49 PM.