LinuxQuestions.org
Visit Jeremy's Blog.
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 10-23-2022, 06:50 AM   #1
ychaouche
Member
 
Registered: Mar 2017
Distribution: Mint, Debian, Q4OS, Mageia, KDE Neon
Posts: 261
Blog Entries: 1

Rep: Reputation: 22
Exclamation [bash] code review request


Hello all,

I have just written my first big function in bash and I'm not happy with it b/c of the use of getopt.
I chose GNU getopt overs bash getopts builtin because it can understand and parse --long-options,
however, it dosn't know how to parse arguments that include spaces, even when you quote them.

The second thing I'm not happy with is the function definition of _usage inside the main function.
I learned that function definitions are global,
so it's no use to define a function inside a function with the intention of hiding it to others.
I don't know how to do that in bash,
or if I should even bother with it.

Here's the code, if anyone wants to discuss it.

Code:
function notes.search {
    # examples
    # notes.search --log --newer-than 30 -E "(streaming.*TDA|TDA.*streaming)"
    # notes.search --txt -F --headings "nextcloud"
    # notes.search --all -F "streaming"
    # notes.search --all --newer-than 30 -F "streaming"
    # notes.search --all --recent "streaming"

    function _usage {
	echo "usage : notes.search.new <options> <pattern>"
	echo "options :"
	echo "                  -E : regex search"
	echo "                  -F : fast search"
	echo "                  -r : reverse order (newer first)"
	echo "               --all : search all notes"
	echo "               --txt : search text notes only (default)"
	echo "               --log : search logs only "
	echo " --newer-than <days> : useful with --log; search only in files that were modified in the last <days> days"
	echo "            --recent : useful with --log; search only in files that were modified in the last 30 days"
	echo "          --headings : search only in headings"
    }

    local search_dirs="$NOTES_TXT"
    local search_type
    local find_recent
    local headings headings_prefix
    local pattern
    local grep_opts grep_opts_debug
    local ls_opts="-t"

    # echo "function args $@"
    TEMP=$(getopt -uo EFr -l all,log,txt,LOG,TXT,newer-than:,headings,recent,all -- $@)
    # echo "TEMP $TEMP"
    set -- $TEMP
    while (($# > 0))
    do
	case "$1" in
	    --all)
		search_dirs="$NOTES_TXT $NOTES_LOG"
		;;

	    --log|--LOG)
		search_dirs="$NOTES_LOG"
		;;

	    --txt|--TXT)
	        search_dirs="$NOTES_TXT"
		;;

	    --newer-than)
		shift
		find_recent="-mtime -$1"
		;;
	    
	    --heading?)
		headings_prefix="^\*.* "
		;;
	    
	    --recent)
		find_recent="-mtime 30"
		;;

	    -E)
		grep_opts="-E"
		;;

	    -F)
		grep_opts="-F"
		;;
	    -r)
		ls_opts="-rt"
		;;
	    --)
		shift
		pattern="$*"
		;;
	esac
	shift 
    done
    
    
    [[ -z $pattern ]] && (echo "no pattern specified"; _usage; return 1)
    [[ -n $grep_opts ]] && grep_opts_debug="with the $grep_opts option"
    > /tmp/notes.search.results.colors
    find $search_dirs -not -name "*~" -type f $find_recent | xargs ls "$ls_opts" | while read file
    do
	grep --color=always -H -n -i $grep_opts "$headings_prefix$pattern" "$file"  | tee -a /tmp/notes.search.results.colors
    done    

    pretty.colors.remove /tmp/notes.search.results.colors > /tmp/notes.search.results
}

Last edited by ychaouche; 10-23-2022 at 06:54 AM.
 
Old 10-23-2022, 09:46 AM   #2
GazL
LQ Veteran
 
Registered: May 2008
Distribution: CRUX 3.7
Posts: 6,568

Rep: Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668
Unless you want to support POSIX short option combining then I find it just as easy to parse options manually rather than use getopt or getopts.

getopt is typically used with an 'eval' on the set -- "$VAR" line, which solves some of the whitespace issues. Use of the 'eval' is one of the reasons I don't like to use getopt, along with the fact it's not standard and there are incompatible versions out there.

In bash scripts I prefer to stick with getopts and POSIX short-options where possible, or just handle options manually: if that is not possible,or where I don't care about option combining.


For your usage function scoping issue... the best you can likely do is use a good naming convention to keep things tidy: notes.search() and notes.search.usage() perhaps?

Last edited by GazL; 10-23-2022 at 09:55 AM.
 
1 members found this post helpful.
Old 10-23-2022, 09:56 AM   #3
ychaouche
Member
 
Registered: Mar 2017
Distribution: Mint, Debian, Q4OS, Mageia, KDE Neon
Posts: 261

Original Poster
Blog Entries: 1

Rep: Reputation: 22
Thanks for the nice feedback GazL!
I like your naming suggestion very much!
Thanks also for explaining about the eval.
I saw that in some bash scripts and didn't understand why it was used,
or why would anyone use eval at all.
I learned in my early days of computing that eval was another name for evil.
I will consider rewriting the options parsing without getopt or getopts.
It's interesting that bash doesn't seem to have a universal solution for that.
Maybe writing it by hand isn't that hard after all.
Appended to my todo-list.
 
1 members found this post helpful.
Old 10-23-2022, 10:33 AM   #4
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 2,983

Rep: Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120

Put the code into ShellCheck and it highlights several issues to be resolved. (Also, there is a command-line version of ShellCheck which can be used offline.)

Actually, I'm surprising that ShellCheck didn't warn about not to use getopt. As per that link: "Every Unix system which still ships getopt includes warnings in its man page telling you not to use it."

Here is how to correctly parse command line options with Bash.

 
Old 10-23-2022, 10:39 AM   #5
ychaouche
Member
 
Registered: Mar 2017
Distribution: Mint, Debian, Q4OS, Mageia, KDE Neon
Posts: 261

Original Poster
Blog Entries: 1

Rep: Reputation: 22
Quote:
Actually, I'm surprising that ShellCheck didn't warn about not to use getopt.
That's because I use GNU getopt, which solves some of the problems of the traditional unix getopt.
I should consider rewriting the options parsing loop by hand.
Thanks for the link!
 
Old 10-23-2022, 11:22 AM   #6
GazL
LQ Veteran
 
Registered: May 2008
Distribution: CRUX 3.7
Posts: 6,568

Rep: Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668
Quote:
Originally Posted by boughtonp View Post
I don't know about "correctly". The case statement in that first example starts checking options before it checks for '--' which is a mistake. '--' or the first non-option argument should stop option processing according to POSIX rules.

I typically do it like this:
Code:
#!/bin/sh

# Example of parsing command-line options/arguments.
#
# This supports all the common option style variants,
# but not POSIX short-option combining.
#  e.g.
#    ./scriptname -llogfile.txt
#    ./scriptname -l logfile.txt
#    ./scriptname --logfile=logfile.txt
#    ./scriptname --logfile logfile.txt

########################################################################
# Parse Option Args:

unset opt_logfile opt_verbose

while [ $# -gt 0 ]
do
  case "$1" in
	--)            shift ; break ;;
	-|""|[!-]*)    break ;;
	--logfile)     opt_logfile=${2:?logfile: missing option} ; shift ;;
	--logfile=?*)  opt_logfile=${1#*=} ;;
	--verbose)     opt_verbose=1 ;;
	--?*)          echo "Unknown long-option:$1" >&2 ; exit 1 ;;
	-l?*)          opt_logfile=${1#-l} ;;
	-l)            opt_logfile=${2:?logfile: missing option} ; shift ;;
	-v)            opt_verbose=1 ;;
	-?*)           echo "Unknown option:$1" >&2 ; exit 1 ;;
  esac
  shift
done

[ -n "$opt_logfile" ]  && echo "logfile is: \"$opt_logfile\""

if [ $# -gt 0 ] && [ ${opt_verbose:-0} -ne 0 ]; then
  echo remaining args are:
  for arg in "$@"
  do
	echo $arg
  done
fi

exit 0

################################################################# End. #
Though it's allowed by the POSIX standard, I don't like splitting option arguments from the option with whitespace as it can lead to ambiguity, for example: "--logfile --verbose"; did the user intend to use a filename '--verbose' or just forgot to supply the argument?
One can add additional checks for this sort of stuff, but, personally I favour the use of -l<filename> or --logfile=<filename> formats to avoid the situation.
 
Old 10-23-2022, 12:21 PM   #7
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 2,983

Rep: Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120
Quote:
Originally Posted by GazL View Post
I don't know about "correctly". The case statement in that first example starts checking options before it checks for '--' which is a mistake. '--' or the first non-option argument should stop option processing according to POSIX rules.
The "--" case is correct when it comes prior to any ambiguous cases (i.e. "-?*" and "*"), which it does. (If you disagree, prove it with a specific example of where it fails.)

I do tend to put "--" as the first case too, but Wooledge's version still works.

 
1 members found this post helpful.
Old 10-23-2022, 01:10 PM   #8
GazL
LQ Veteran
 
Registered: May 2008
Distribution: CRUX 3.7
Posts: 6,568

Rep: Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668
The problem will come should you start using wildcards in any of the case matches that preceed '--' For example, you could run into trouble if you had a nested case, like this
Code:
-?)  # process short option:
     case ${1#-} in ...
So, while what is there is technically correct given the limited options specified, as example boilerplate I'd argue that these two should always come first.
Code:
--)            shift ; break ;;
-|""|[!-]*)    break ;;
followed by long-option matches, followed by short option matches.
 
Old 10-23-2022, 05:31 PM   #9
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 2,983

Rep: Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120Reputation: 2120

Given the example already includes short options, I'm unconvinced that's a likely scenario.

It is an easy change though, so I guess you could hop on IRC and see if GreyCat agrees.

No reason to split short/long options though, that's just unnecessary duplication.

 
Old 10-25-2022, 11:11 AM   #10
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,958

Rep: Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173
So if you are going to go the Wooledge Wiki, there is plenty to read

So here are my personal thoughts (so please take what you like and abaandon the rest )

1. I prefer to use cat and a heredoc when creating a usage message:
Code:
_usage()
{
  cat<<-USAGE
    usage : notes.search.new <options> <pattern>
    options :
                      -E : regex search
                      -F : fast search
                      -r : reverse order (newer first)
                   --all : search all notes
                   --txt : search text notes only (default)
                   --log : search logs only
     --newer-than <days> : useful with --log; search only in files that were modified in the last <days> days
                --recent : useful with --log; search only in files that were modified in the last 30 days
              --headings : search only in headings
  USAGE
}
2. As you have found the nesting of functions does not obscure the function to be local only, but an alternative would be to create it as an array, which you can declare as local:
Code:
declare -a usage

usage=('usage : notes.search.new <options> <pattern>'
		'options :'
		'                  -E : regex search'
		'                  -F : fast search'
		'                  -r : reverse order (newer first)'
		'               --all : search all notes'
		'               --txt : search text notes only (default)'
		'               --log : search logs only '
		' --newer-than <days> : useful with --log; search only in files that were modified in the last <days> days'
		'            --recent : useful with --log; search only in files that were modified in the last 30 days'
		'          --headings : search only in headings'
	)

IFS=$'\n'
echo "${usage[*]}"
unset IFS
It is a little more cumbersome and has the nasty IFS setting/unsetting, but it does solve it being only local

3. I agree with the others about just write your own parameter parsing

4. I don't see the need to assign to a variable and then call set to get you back where you started, or use a while loop:
Code:
for param in $(getopt -uo EFr -l all,log,txt,LOG,TXT,newer-than:,headings,recent,all -- $@)
do
  case ...
done
5. Don't clutter a system with tmp files (> /tmp/notes.search.results.colors) look up mktemp command, saves on name clashes

6. Check out Wooledge page on why not to process ls output

7. Also on Wooledge is issues (probably not affecting you here) around "command | while" and subshells and variable issues

8. Lastly, you invite discussion which some may look at requiring a test run with known factors, however there are several variables and commands which only exist on your system. So please remember
to try and include a working set of options



Also, welcome on your bash journey
 
3 members found this post helpful.
Old 10-25-2022, 01:24 PM   #11
GazL
LQ Veteran
 
Registered: May 2008
Distribution: CRUX 3.7
Posts: 6,568

Rep: Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668Reputation: 4668
Interesting idea. You can avoid the ugly IFS stuff if you do it something like this:
Code:
func1()
{
  local -a usage=(
    'func1:'
    '  usage line one'
    '  usage line two'
  )

  case "$1" in
    -h)  printf '%s\n' "${usage[@]}" ;;
    *)   printf '%s\n' "func1() called with args: " "$@" ;;
  esac
}
Using an array makes it incompatible with the basic POSIX shell of course, but if that is not a concern then it looks like a tidy way to do it without having a heap of usage text cluttering up your case statement logic.
 
Old 10-26-2022, 05:36 AM   #12
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,958

Rep: Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173Reputation: 3173
Thumbs up

Quote:
Originally Posted by GazL View Post
Interesting idea. You can avoid the ugly IFS stuff if you do it something like this:
Code:
func1()
{
  local -a usage=(
    'func1:'
    '  usage line one'
    '  usage line two'
  )

  case "$1" in
    -h)  printf '%s\n' "${usage[@]}" ;;
    *)   printf '%s\n' "func1() called with args: " "$@" ;;
  esac
}
Using an array makes it incompatible with the basic POSIX shell of course, but if that is not a concern then it looks like a tidy way to do it without having a heap of usage text cluttering up your case statement logic.
I so often forget that one GazL, much nicer option
 
Old 10-27-2022, 03:28 PM   #13
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,139

Rep: Reputation: 953Reputation: 953Reputation: 953Reputation: 953Reputation: 953Reputation: 953Reputation: 953Reputation: 953
Quote:
Code:
[[ -z $pattern ]] && (echo "no pattern specified"; _usage; return 1)
Please avoid the fork/exec of a ( subshell ).
Use a command group
Code:
[[ -z $pattern ]] && {
  echo "no pattern specified"; _usage; return 1
}
Or the "speaking"
Code:
if [[ -z $pattern ]]; then
  echo "no pattern specified"; _usage; return 1
fi
A here document needs a temporary file.
Less overhead: a multi-line echo
Code:
_usage()
{
  echo "\
    usage : notes.search.new <options> <pattern>
    options :
                      -E : regex search
                      -F : fast search
                      -r : reverse order (newer first)
                   --all : search all notes
                   --txt : search text notes only (default)
                   --log : search logs only
     --newer-than <days> : useful with --log; search only in files that were modified in the last <days> days
                --recent : useful with --log; search only in files that were modified in the last 30 days
              --headings : search only in headings"
}
The only problem is that it is in quotes - embedded quotes must be escaped.
 
  


Reply

Tags
bash, code review


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
[SOLVED] sudo hangs up if mount code contains a request for a return value (Bash script) pizzipie Linux - Desktop 9 03-27-2021 06:05 PM
[SOLVED] [C] function to replace part of a string - request for code review Andy Alt Programming 11 07-31-2019 03:35 PM
[SOLVED] request C source code review: rmw Andy Alt Programming 9 05-09-2013 07:16 AM
Bad Request Your browser sent a request that this server could not understand. vishnukumar Linux - Server 2 08-13-2009 01:56 AM
webserver doesn't reply to external request but it reply's to local request ziba Linux - Server 4 05-11-2009 06:27 PM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 06:59 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration