Programming: This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
I have just written my first big function in bash and I'm not happy with it because of the use of getopt.
I chose GNU getopt over the bash getopts builtin because it can understand and parse --long-options;
however, it doesn't know how to parse arguments that include spaces, even when you quote them.
The second thing I'm not happy with is the function definition of _usage inside the main function.
I learned that function definitions are global,
so it's no use to define a function inside a function with the intention of hiding it to others.
I don't know how to do that in bash,
or if I should even bother with it.
Here's the code, if anyone wants to discuss it.
Code:
function notes.search {
    # examples
    #   notes.search --log --newer-than 30 -E "(streaming.*TDA|TDA.*streaming)"
    #   notes.search --txt -F --headings "nextcloud"
    #   notes.search --all -F "streaming"
    #   notes.search --all --newer-than 30 -F "streaming"
    #   notes.search --all --recent "streaming"
    function _usage {
        echo "usage : notes.search.new <options> <pattern>"
        echo "options :"
        echo "  -E : regex search"
        echo "  -F : fast search"
        echo "  -r : reverse order (newer first)"
        echo "  --all : search all notes"
        echo "  --txt : search text notes only (default)"
        echo "  --log : search logs only"
        echo "  --newer-than <days> : useful with --log; search only in files that were modified in the last <days> days"
        echo "  --recent : useful with --log; search only in files that were modified in the last 30 days"
        echo "  --headings : search only in headings"
    }
    local search_dirs="$NOTES_TXT"
    local search_type
    local find_recent
    local headings headings_prefix
    local pattern
    local grep_opts grep_opts_debug
    local ls_opts="-t"
    # echo "function args $@"
    TEMP=$(getopt -uo EFr -l all,log,txt,LOG,TXT,newer-than:,headings,recent -- $@)
    # echo "TEMP $TEMP"
    set -- $TEMP
    while (($# > 0))
    do
        case "$1" in
            --all)
                search_dirs="$NOTES_TXT $NOTES_LOG"
                ;;
            --log|--LOG)
                search_dirs="$NOTES_LOG"
                ;;
            --txt|--TXT)
                search_dirs="$NOTES_TXT"
                ;;
            --newer-than)
                shift
                find_recent="-mtime -$1"
                ;;
            --headings)
                headings_prefix="^\*.* "
                ;;
            --recent)
                find_recent="-mtime -30"
                ;;
            -E)
                grep_opts="-E"
                ;;
            -F)
                grep_opts="-F"
                ;;
            -r)
                ls_opts="-rt"
                ;;
            --)
                shift
                pattern="$*"
                ;;
        esac
        shift
    done
    [[ -z $pattern ]] && (echo "no pattern specified"; _usage; return 1)
    [[ -n $grep_opts ]] && grep_opts_debug="with the $grep_opts option"
    > /tmp/notes.search.results.colors
    find $search_dirs -not -name "*~" -type f $find_recent | xargs ls "$ls_opts" | while read file
    do
        grep --color=always -H -n -i $grep_opts "$headings_prefix$pattern" "$file" | tee -a /tmp/notes.search.results.colors
    done
    pretty.colors.remove /tmp/notes.search.results.colors > /tmp/notes.search.results
}
Unless you want to support POSIX short-option combining, I find it just as easy to parse options manually rather than use getopt or getopts.
getopt is typically used with an 'eval' on the set -- "$VAR" line, which solves some of the whitespace issues. Use of the 'eval' is one of the reasons I don't like to use getopt, along with the fact it's not standard and there are incompatible versions out there.
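The whitespace issue can be demonstrated without calling getopt at all. This sketch hard-codes the kind of already-quoted string getopt emits (so it doesn't depend on GNU getopt being installed) and compares plain `set -- $TEMP` against `eval set -- "$TEMP"`:

```shell
# Simulated getopt output for:  --logfile "my file.txt" -- "extra arg"
# (hard-coded so the demo doesn't require GNU getopt)
TEMP="--logfile 'my file.txt' -- 'extra arg'"

set -- $TEMP          # naive splitting: the quotes are just characters,
naive_count=$#        # so 'my file.txt' splits into two words (6 total)

eval set -- "$TEMP"   # eval re-parses the string, honouring the quotes
eval_count=$#         # 4 words, with "my file.txt" kept intact as $2
```

This is why omitting the `eval` silently breaks any option argument containing spaces, even though getopt quoted it correctly.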
In bash scripts I prefer to stick with getopts and POSIX short-options where possible, or just handle options manually where that is not possible or where I don't care about option combining.
For your usage function scoping issue... the best you can likely do is use a good naming convention to keep things tidy: notes.search() and notes.search.usage() perhaps?
Thanks for the nice feedback GazL!
I like your naming suggestion very much!
Thanks also for explaining about the eval.
I saw that in some bash scripts and didn't understand why it was used,
or why anyone would use eval at all.
I learned in my early days of computing that eval was another name for evil.
I will consider rewriting the options parsing without getopt or getopts.
It's interesting that bash doesn't seem to have a universal solution for that.
Maybe writing it by hand isn't that hard after all.
Appended to my todo-list.
Put the code into ShellCheck and it highlights several issues to be resolved. (Also, there is a command-line version of ShellCheck which can be used offline.)
Actually, I'm surprised that ShellCheck didn't warn against using getopt. As per that link: "Every Unix system which still ships getopt includes warnings in its man page telling you not to use it."
Actually, I'm surprised that ShellCheck didn't warn against using getopt.
That's because I use GNU getopt, which solves some of the problems of the traditional unix getopt.
I should consider rewriting the options parsing loop by hand.
Thanks for the link!
I don't know about "correctly". The case statement in that first example starts checking options before it checks for '--' which is a mistake. '--' or the first non-option argument should stop option processing according to POSIX rules.
I typically do it like this:
Code:
#!/bin/sh
# Example of parsing command-line options/arguments.
#
# This supports all the common option style variants,
# but not POSIX short-option combining.
# e.g.
#   ./scriptname -llogfile.txt
#   ./scriptname -l logfile.txt
#   ./scriptname --logfile=logfile.txt
#   ./scriptname --logfile logfile.txt
########################################################################
# Parse Option Args:
unset opt_logfile opt_verbose
while [ $# -gt 0 ]
do
    case "$1" in
        --) shift ; break ;;
        -|""|[!-]*) break ;;
        --logfile) opt_logfile=${2:?logfile: missing option} ; shift ;;
        --logfile=?*) opt_logfile=${1#*=} ;;
        --verbose) opt_verbose=1 ;;
        --?*) echo "Unknown long-option:$1" >&2 ; exit 1 ;;
        -l?*) opt_logfile=${1#-l} ;;
        -l) opt_logfile=${2:?logfile: missing option} ; shift ;;
        -v) opt_verbose=1 ;;
        -?*) echo "Unknown option:$1" >&2 ; exit 1 ;;
    esac
    shift
done

[ -n "$opt_logfile" ] && echo "logfile is: \"$opt_logfile\""

if [ $# -gt 0 ] && [ ${opt_verbose:-0} -ne 0 ]; then
    echo remaining args are:
    for arg in "$@"
    do
        echo "$arg"
    done
fi
exit 0
################################################################# End. #
Though it's allowed by the POSIX standard, I don't like splitting option arguments from the option with whitespace as it can lead to ambiguity. For example, with "--logfile --verbose": did the user intend to use a filename '--verbose', or did they just forget to supply the argument?
One can add additional checks for this sort of stuff, but, personally I favour the use of -l<filename> or --logfile=<filename> formats to avoid the situation.
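One such additional check can be sketched in a few lines. This is a hypothetical parser fragment (only --logfile and --verbose are handled) that rejects an option argument which itself looks like an option instead of silently consuming it:

```shell
# Hypothetical parser: refuse to treat something option-shaped as the
# argument to --logfile.
parse() {
    opt_logfile= opt_verbose=0
    while [ $# -gt 0 ]; do
        case $1 in
            --logfile)
                case ${2:-} in
                    ""|-*) echo "logfile: missing option argument" >&2; return 1 ;;
                    *)     opt_logfile=$2; shift ;;
                esac ;;
            --verbose) opt_verbose=1 ;;
        esac
        shift
    done
}

parse --logfile --verbose 2>/dev/null; rc=$?   # rejected: rc=1
parse --logfile out.txt                        # accepted: opt_logfile=out.txt
```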
I don't know about "correctly". The case statement in that first example starts checking options before it checks for '--' which is a mistake. '--' or the first non-option argument should stop option processing according to POSIX rules.
The "--" case is correct when it comes prior to any ambiguous cases (i.e. "-?*" and "*"), which it does. (If you disagree, prove it with a specific example of where it fails.)
I do tend to put "--" as the first case too, but Wooledge's version still works.
The problem will come should you start using wildcards in any of the case matches that precede '--'. For example, you could run into trouble if you had a nested case, like this:
Code:
-?) # process short option:
case ${1#-} in ...
So, while what is there is technically correct given the limited options specified, as example boilerplate I'd argue that these two should always come first.
Code:
--) shift ; break ;;
-|""|[!-]*) break ;;
followed by long-option matches, followed by short option matches.
So if you are going to go to the Wooledge wiki, there is plenty to read.
So here are my personal thoughts (so please take what you like and abandon the rest):
1. I prefer to use cat and a heredoc when creating a usage message:
Code:
_usage()
{
cat<<-USAGE
usage : notes.search.new <options> <pattern>
options :
-E : regex search
-F : fast search
-r : reverse order (newer first)
--all : search all notes
--txt : search text notes only (default)
--log : search logs only
--newer-than <days> : useful with --log; search only in files that were modified in the last <days> days
--recent : useful with --log; search only in files that were modified in the last 30 days
--headings : search only in headings
USAGE
}
2. As you have found, nesting a function does not make it local to the enclosing function, but an alternative would be to create the usage text as an array, which you can declare as local:
Code:
declare -a usage
usage=('usage : notes.search.new <options> <pattern>'
'options :'
' -E : regex search'
' -F : fast search'
' -r : reverse order (newer first)'
' --all : search all notes'
' --txt : search text notes only (default)'
' --log : search logs only '
' --newer-than <days> : useful with --log; search only in files that were modified in the last <days> days'
' --recent : useful with --log; search only in files that were modified in the last 30 days'
' --headings : search only in headings'
)
IFS=$'\n'
echo "${usage[*]}"
unset IFS
It is a little more cumbersome and has the nasty IFS setting/unsetting, but it does keep the usage text local.
3. I agree with the others: just write your own parameter parsing.
4. I don't see the need to assign to a variable and then call set to get back where you started, or to use a while loop:
Code:
for param in $(getopt -uo EFr -l all,log,txt,LOG,TXT,newer-than:,headings,recent -- $@)
do
case ...
done
5. Don't clutter the system with tmp files (> /tmp/notes.search.results.colors); look up the mktemp command, it saves on name clashes.
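A minimal sketch of the mktemp pattern, with a trap so the file is cleaned up however the script exits (the generated filename is whatever mktemp chooses):

```shell
# mktemp creates a uniquely-named file, so concurrent runs can't clash
results=$(mktemp) || exit 1
trap 'rm -f "$results"' EXIT    # remove the file when the script exits

echo "search output" >"$results"
```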
6. Check out the Wooledge page on why not to parse ls output.
7. Also on Wooledge are the issues (probably not affecting you here) around "command | while", subshells, and variables.
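The "command | while" issue can be demonstrated in a few lines. In bash, the loop on the right-hand side of a pipe runs in a subshell, so its variable updates are lost; feeding the loop through a redirection keeps it in the current shell:

```shell
# Pipeline form: the while-loop runs in a subshell, so 'count' updates
# are thrown away when the pipeline ends.
count=0
printf '%s\n' a b c | while read -r line; do
    count=$((count+1))
done
pipe_count=$count      # still 0

# Redirection form: the loop runs in the current shell.
count=0
while read -r line; do
    count=$((count+1))
done <<EOF
a
b
c
EOF
redirect_count=$count  # 3
```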
8. Lastly, you invite discussion, which some may feel requires a test run with known factors; however, several variables and commands here only exist on your system. So please remember to try and include a working set of options.
Interesting idea. You can avoid the ugly IFS stuff if you do it something like this:
Code:
func1()
{
local -a usage=(
'func1:'
' usage line one'
' usage line two'
)
case "$1" in
-h) printf '%s\n' "${usage[@]}" ;;
*) printf '%s\n' "func1() called with args: " "$@" ;;
esac
}
Using an array makes it incompatible with the basic POSIX shell of course, but if that is not a concern then it looks like a tidy way to do it without having a heap of usage text cluttering up your case statement logic.
Interesting idea. You can avoid the ugly IFS stuff if you do it something like this: ...
I so often forget that one GazL, much nicer option
The (...) group runs in a subshell, so the return 1 only exits the subshell, not your function. Use an if statement instead:
Code:
if [[ -z $pattern ]]; then
    echo "no pattern specified"; _usage; return 1
fi
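The difference can be shown with two hypothetical toy functions, one using a `(...)` subshell group and one using an if-group:

```shell
# The 'return 1' inside (...) only exits the subshell, so the
# function carries on running afterwards.
with_subshell() {
    ( echo "no pattern"; return 1 )
    echo "still running"    # reached: the return did not leave the function
}

# An if-group runs in the current shell, so return works as intended.
with_if() {
    if true; then
        echo "no pattern"; return 1
    fi
    echo "never reached"
}

out1=$(with_subshell)    # contains "still running"
out2=$(with_if); rc=$?   # rc=1, and "never reached" is absent
```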
A here document needs a temporary file. A multi-line echo has less overhead:
Code:
_usage()
{
echo "\
usage : notes.search.new <options> <pattern>
options :
-E : regex search
-F : fast search
-r : reverse order (newer first)
--all : search all notes
--txt : search text notes only (default)
--log : search logs only
--newer-than <days> : useful with --log; search only in files that were modified in the last <days> days
--recent : useful with --log; search only in files that were modified in the last 30 days
--headings : search only in headings"
}
The only problem is that it is inside quotes, so embedded quotes must be escaped.
Here's my take on args, using keyword=value:
keywords become variables, quoted whitespace works, and "$@" is used normally.
Code:
# designed with korn, works with bash and most shells
# automagic shell keyword arguments, like:
#
# name='mr smith' address='the matrix' job=agent
#
for V in "$@"
do
    case "$V" in
        *=*) eval ${V%=*}='${V#*=}';;
        *) args[i++]="$V";;
    esac
done
[ "$i" ] && set -- "${args[@]}" || shift $#
unset i args V
# EXAMPLE usage
#
# # optional and default arguments
# address='no fixed abode'
# job=
#
# source keywords
#
# # mandatory arguments
# set -o nounset
# : $name
# : $address
#
# echo "hello [$name] of [$address]"
# [ "$job" ] && echo how is the $job business\?|| echo get a job\!
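As a quick check of the loop above, here it is applied to a fixed argument list instead of the real "$@" (bash or ksh is assumed, for the array):

```shell
# Demo of the keyword=value loop on a hard-coded argument list.
set -- name='mr smith' job=agent 'positional one'

for V in "$@"
do
    case "$V" in
        *=*) eval ${V%=*}='${V#*=}';;   # keyword becomes a variable
        *) args[i++]="$V";;             # collect positional arguments
    esac
done
[ "$i" ] && set -- "${args[@]}" || shift $#
unset i args V

# now: name='mr smith', job='agent', and $1 is 'positional one'
```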