LinuxQuestions.org


ChrisJames82 11-06-2016 07:27 AM

Bash Shell Script Help!
 
Hello everyone,

This is my first time joining a forum and asking for help, so forgive me if I did something incorrectly. I'm in school for networking and am taking my first Ubuntu class, and I am stuck on the assignment to create our first bash shell script. For the assignment we downloaded a PDF containing a bunch of team names, and the instructions say:

Create a bash shell script, named uniq-teams.sh that uses pdftotext, sed, and perl -p -i -e (note: you may not need the -i) to create a sorted list of non-duplicated team names from the selected pages of the PDF file.

It must take the following parameters:

input pdf filename
start page
end page
If any of the parameters are missing, your program should print usage instructions.

Your script must remove the following:

Blank/empty lines
Page breaks
Duplicates
Non-team lines, e.g. Area and Ballroom

Could anyone guide me? I'm a complete noob. I have Ubuntu running in PuTTY. So far I have run vim uniq-teams.sh to bring up the editor and typed #!/bin/bash, and that's where I'm at. Please point me in the right direction. Thank you so much.

unSpawn 11-06-2016 07:35 AM

Welcome to LQ, hope you like it here. Odd that your assignment wasn't accompanied by a lecture, introduction or assignment notes, BTW. What you want to do first is read some stuff.

BASH intros:
http://www.gnu.org/software/bash/man...ode/index.html
http://www.tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html
http://www.tldp.org/LDP/Bash-Beginne...tml/index.html

BASH scripting guides:
http://mywiki.wooledge.org/BashGuide
http://mywiki.wooledge.org/BashFAQ
# these are must reads so do read them! (I'll leave out the rest of mywiki.wooledge.org and the ABS 'cuz it's too early for that IMHO.)

Also see:
http://www.linuxquestions.org/questi...l-links-35334/
http://www.linuxquestions.org/questi...nd-line-32893/
http://www.linuxquestions.org/questi...ks-4175433508/

keefaz 11-06-2016 07:36 AM

A script is a way to automate actions, e.g. instead of typing each command by hand you just run the script.

Here, the pdftotext command is the main action (other commands will filter its output).
Try to familiarize yourself with pdftotext, e.g. run some tests on PDF files and see how it works, then go ahead and filter the output it produces.

Turbocapitalist 11-06-2016 08:15 AM

One way to work out the script is to figure out each step (line) manually, one step at a time.

When you've figured out the options and got that step working, enter it into your script's file. If you set some debugging options in the second line of your script, it can help:

Code:

set -x -v

For some background information on each program, look at the manual pages for each one. They can be overwhelming so focus on the options that the instructor has provided, or look for options that provide the function asked for.

Code:

man pdftotext
man sed
man perlrun
man perlre

I wrote a short post on "sed" which has a few useful links though is aimed more at people that have already been using "sed" a bit.

"perl" is a more powerful scripting language than "bash" but it looks like the assignment is to use it for one-liners if the -p and -e options are being recommended. That's ok too. The -p wraps a loop around what you have in -e. The biggest advantage of perl is its pattern matching. You can get the full reference with man perlre. However, a lot of guides and books are available. Some of those books may be in your school library.

pan64 11-06-2016 08:24 AM

Also, if you have already created a script, just post it and we will discuss it with you and help you improve it, but we won't write it for you.
Additionally, you may try this site to check the script you wrote: www.shellcheck.net

Tonus 11-06-2016 01:47 PM

The result of each command can be assigned to a variable and then processed by the next command, much as a pipe would do.

Search for pipe, variable and stdin to understand this better...
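
A small sketch of the two approaches side by side, assuming nothing beyond bash and sort. The sample function and its data are made up to stand in for real command output.

```shell
#!/bin/bash
# Made-up sample data standing in for real command output.
sample() {
    printf 'b\na\nb\n'
}

# Variant 1: capture the output in a variable, then feed it to the next command.
captured=$(sample)
via_variable=$(printf '%s\n' "$captured" | sort -u)

# Variant 2: connect the two commands directly with a pipe.
via_pipe=$(sample | sort -u)

printf '%s\n' "$via_variable"
printf '%s\n' "$via_pipe"
```

Both variants produce the same sorted, de-duplicated lines; the pipe simply skips the intermediate variable.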

BW-userx 11-06-2016 03:58 PM

http://www.tutorialspoint.com/unix_c.../pdftotext.htm pdftotext

What day are you on in this class?

Advanced Bash-Scripting Guide

ChrisJames82 11-06-2016 06:08 PM

Still confused
 
This class is all online, with an online textbook. I'm considering dropping it and taking it in person. I am very confused and am having no luck at all with what seems to be an easy assignment. I don't know, could anyone drop their email so I can email them for further help, as my teacher takes a while to answer back? Appreciate the help, everyone.

BW-userx 11-07-2016 10:17 AM

When you're programming or scripting, beginner or not, take it and do it in steps.

pdftotext, sed, and perl -p -i -e
First get your PDF file and use only pdftotext on it. Working on a copy of the PDF, experiment on the command line until it does what you need it to do, then take those commands and add them to your script file.

Next, learn sed.
Figure out how to get sed to extract what you need, via the command line, from the text file you created using pdftotext. When you get the output you need, take those commands and put them into your script file after the pdftotext commands. Now on to perl: I have no idea about perl and what it does, but I am thinking it is for formatting your output into the final result your teacher wants.

Just take the output of sed and | pipe it into perl, then let perl do what it is supposed to do to give you the end result.
Hints on what to use with perl were already given: -p -i -e

Once you've completed that, I am sure that if you post your work here there are others who can help you figure out how to put it all together within the script.

But if you can figure each step out on the command line before moving on to the next, then that is all you need; put that into your script file.

Piping is a useful tool on the command line: it sends the output of one program into the input of another so that it can be manipulated further.

Done.

It is a process, one step at a time.

You're still going to have to do all of the steps regardless of whether the class is online or not.

Code:

userx@voided1.what~/Documents/Linux-how-tos/The Hacker's Manual 2015>> pdftotext -f 70 -l 71 'The Hacker'\''s Manual 2015.pdf'

userx@voided1.what~/Documents/Linux-how-tos/The Hacker's Manual 2015>> ls
'The Hacker'\''s Manual 2015.pdf'  'The Hacker'\''s Manual 2015.txt'


TB0ne 11-07-2016 10:47 AM

Quote:

Originally Posted by ChrisJames82 (Post 5627776)
This class is all online, with an online textbook. I'm considering dropping it and taking it in person. I am very confused and am having no luck at all with what seems to be an easy assignment. I don't know, could anyone drop their email so I can email them for further help, as my teacher takes a while to answer back? Appreciate the help, everyone.

Sorry, but do you realize how incredibly rude this is???

We are happy to help you, happy to explain things, or assist if you're stuck...but you've flat-out posted a homework question, shown us NO effort of your own (not even the beginnings of a script), and have been given links to many bash scripting tutorials, man pages, and other things to help get you going. And you then want us to give you our personal email addresses so you can mail us questions DIRECTLY, because you don't want to WAIT for your teacher to answer? That is a bit beyond the scope here. We will help you in the forums...but asking us to be your personal, FREE, one-on-one tutors is a bit much.

Please...can you show us what YOU have written/done/tried on your own so far, and tell us where you're stuck? BW-userx gave some solid advice...so start to THINK about things one step at a time:
  • Your assignment tells you that you need three things to function: an input file name, start page, and end page. That tells you that your script needs to take three command line arguments. Look up how to read command line arguments from the many resources you've been given.
  • It tells you that if they do NOT provide three, to give instructions to the user. Look up how to check if a variable is empty or not.
  • Read the instructions on pdftotext that tell you how to remove page breaks
  • Do a search on how to remove duplicates...you can do it with sed or perl. A simple perl example:
    Code:

    perl -ne '$H{$_}++ or print $_'
  • Look up how to handle arrays in a bash script...you know you have a list of team names; that is one array. So read everything else in, and if it's NOT in that array, don't print it.
We can help...but we will not do this for you. It is time for you to show us some effort of your own.
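
For the first two bullets, a minimal sketch of one way to check the parameters, assuming the three values arrive as positional arguments in the order the assignment lists them. The helper names have_all_args and usage, and the wording of the usage text, are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when three non-empty positional
# parameters (pdf file, start page, end page) were supplied.
have_all_args() {
    [ "$#" -eq 3 ] && [ -n "$1" ] && [ -n "$2" ] && [ -n "$3" ]
}

usage() {
    echo "Usage: uniq-teams.sh <input.pdf> <start-page> <end-page>"
}

# In a real script this would be: have_all_args "$@" || { usage; exit 1; }
# Demonstration with made-up arguments (the end page is missing):
have_all_args teams.pdf 2 || usage
```

Exiting with a non-zero status after printing the usage text lets whoever calls the script tell that it did not run.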

Turbocapitalist 11-07-2016 11:31 AM

Yes, to summarize the comments: take it one step at a time and show us what you have as a start.

And to add to the cacophony, I'd suggest starting with using the "pdftotext" utility manually to figure out how to use it to extract a range of pages.

szboardstretcher 11-07-2016 12:51 PM

Well.. I don't use perl, but here is my quick untested idea:

Code:

#!/bin/bash

# print instructions function
function usage(){
        echo "USAGE: uniq-teams.sh [options]"
        echo "  -f <int>        : first page to convert"
        echo "  -l <int>        : last page to convert"
        echo "  -p <filename>  : pdf filename to convert"
        echo "  -t <filename>  : text filename to output"
}

# four options, each with a value, make 8 arguments;
# with fewer than that, print instructions and exit with an error status
if [[ $# -lt 8 ]]; then usage; exit 1; fi

# grab option arguments into variables
while getopts ":f:l:p:t:" opt; do
case $opt in
        f) first="$OPTARG"
        ;;
        l) last="$OPTARG"
        ;;
        p) pdf="$OPTARG"
        ;;
        t) text="$OPTARG"
        ;;
        \?) usage; exit 1
        ;;
esac
done

# convert pdf to text
pdftotext -f "$first" -l "$last" "$pdf" "$text"

# remove page breaks ^L (form feed) first, so a line holding only ^L becomes empty
sed -i 's/\f//g' "$text"

# remove blank lines (empty or whitespace only)
sed -i '/^[[:space:]]*$/d' "$text"

# remove Area and Ballroom lines
sed -i '/Area/d' "$text"
sed -i '/Ballroom/d' "$text"

# sort and remove duplicates
sort -u -o "$text.sorted" "$text"

I keep a 'bash scripting skeleton' for writing bash scripts which has some error handling, functions, temp files and that kind of thing. Has some decent ideas in it, if you'd like to take a look for reference.

https://github.com/boardstretcher/bash-script-skeleton


All times are GMT -5.