Applying a script to all the files of the find command
Extract.sh extracts the text between the two tags START and END.
awk '{sub(/^[ /*]+/, ""); print}' removes the leading spaces, slashes and asterisks.
Format.sh then formats the output into the required format.
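For anyone following along, the extract-and-strip step described above can be sketched as a single awk program. Extract.sh itself is not shown in the thread, so extract_blocks is a hypothetical stand-in, and it assumes START and END each sit on their own comment line and are matched as plain substrings.

```shell
#!/bin/bash
# Hypothetical stand-in for Extract.sh plus the prefix-stripping awk call.
# Prints the lines strictly between a START line and an END line, with
# leading spaces, slashes and asterisks removed. Handles several blocks
# per file. Reads the named files, or stdin when no file is given.
extract_blocks() {
    awk '
        /END/   { inside = 0 }                      # closing tag: stop printing
        inside  { sub(/^[ \/*]+/, ""); print }      # strip comment prefix
        /START/ { inside = 1 }                      # opening tag: start after it
    ' "$@"
}
```

The slash inside the bracket expression is escaped so that awk variants which are stricter about regex literals should still accept it. Leading tabs are not stripped, matching the original expression.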
if [ ! "$1" ];
then usage;
fi
until [ -z "$1" ];
do
case "$1" in
"-s") shift; SOURCEDIR="$1";;
"-e") shift; EXTENTIONS=( `echo "$1" |sed 's/,/ /g'` );;
*) usage;;
esac
shift
done
if [ ! "$SOURCEDIR" ] || [ ! -e "$SOURCEDIR" ]; then abort "source dir"; fi
if [ "${#EXTENTIONS[@]}" == "0" ]; then abort "extention(s)"; fi
echo -e "Input Directory: $SOURCEDIR\n"
for ITEM in "${EXTENTIONS[@]}"; do
while read file; do
echo "File name: " $file>>$SAMPLE_FILE
awk -f Extract.sh $SOURCEDIR/$file|awk '{sub(/^[ /*]+/, ""); print}'>>$SAMPLE_FILE
done < <(find "$SOURCEDIR" -type f -name "*$ITEM" -printf "%P\n")
done
cat $SAMPLE_FILE|awk -f Format1.sh>$OUTPUT_FILE
if [ $? -eq 0 ];then
echo "Ending script successfully............"
fi
exit $?
The script needs the input directory and the file extension.
To me this implies that both source and extension are optional.
Code:
if [ ! "$1" ];
then usage;
fi
Clearly not the case. Also on this test, I am guessing from the following line:
Code:
if [ ! "$SOURCEDIR" ] || [ ! -e "$SOURCEDIR" ]; then abort "source dir"; fi
if [ "${#EXTENTIONS[@]}" == "0" ]; then abort "extention(s)"; fi
That both items are compulsory. This would mean that every time you run the script it needs a minimum of 4 arguments (a minimum, because you can have multiple extensions).
So investigate '$#'
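A minimal sketch of such a test on '$#', assuming the script is called as -s <dir> -e <ext> [<ext> ...]. The function name check_arg_count and the usage text are made up for illustration; the real script would call its own usage function.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when at least 4 arguments are given
# (-s, the source directory, -e, and one or more extensions).
check_arg_count() {
    if [ "$#" -lt 4 ]; then
        echo "usage: script -s <source dir> -e <ext> [<ext> ...]" >&2
        return 1
    fi
    return 0
}
```

It would be called as check_arg_count "$@" || exit 1 near the top of the script. The plain [ ] test is used here so the check also works in a POSIX sh.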
Once you test '$#' like that, you could replace the until loop with something simpler, and in the else branch use something like:
Code:
SOURCEDIR="$2"
shift 3
EXTENTIONS=("$@")
There is no #! line at the top of the script, so I am assuming you are using bash. If you are, I would suggest the following:
1. Change [ ] to [[ ]]. The second is safer in that it doesn't have as many little gotchas as [ ].
2. When doing arithmetic tests, change [ ] to (( )). Apart from making it more obvious what you are testing, it also lets you write tests that look like normal math.
e.g.
Code:
if [ "${#EXTENTIONS[@]}" == "0" ]; then abort "extention(s)"; fi
# becomes
if (( ${#EXTENTIONS[@]} == 0 )); then abort "extention(s)"; fi
# even cleaner could be
(( ${#EXTENTIONS[@]} == 0 )) && abort "extention(s)"
Although if you make the change for testing arguments, this test becomes moot.
Code:
if [ ! "$SOURCEDIR" ] || [ ! -e "$SOURCEDIR" ]; then abort "source dir"; fi
Again, the first test is not required if you make the changes above. As for the second test, check what '-e' means.
Code:
for ITEM in "${EXTENTIONS[@]}"; do
while read file; do
echo "File name: " $file>>$SAMPLE_FILE
awk -f Extract.sh $SOURCEDIR/$file|awk '{sub(/^[ /*]+/, ""); print}'>>$SAMPLE_FILE
done < <(find "$SOURCEDIR" -type f -name "*$ITEM" -printf "%P\n")
done
The loop inside a loop can be done away with by a small change to how the extensions are assigned:
Code:
EXTENTIONS=$(echo "$@" | tr ' ' '|')
while read -r found_file
do
<your stuff here>
done< <(find "$SOURCEDIR" -type f -regextype posix-extended -regex ".*\.($EXTENTIONS)")
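The same idea can be sketched as a small function. find_by_extensions is a hypothetical name, -regextype is specific to GNU find, and the extensions are assumed to contain no regex metacharacters. The -regextype option is placed before the tests here because older GNU find versions warn when it follows them.

```shell
#!/bin/bash
# Hypothetical helper: list regular files under a directory whose names
# end in any of the given extensions, using a single find call.
# Usage: find_by_extensions <dir> <ext> [<ext> ...]
find_by_extensions() {
    local dir=$1
    shift
    local IFS='|'          # "$*" joins the remaining arguments with '|'
    local pattern="$*"     # e.g. java|cs
    find "$dir" -regextype posix-extended -type f -regex ".*\.($pattern)"
}
```

Joining with IFS avoids the echo and tr pipeline, and it keeps working if the extensions arrive as separate arguments rather than one space-separated string.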
Lastly, the following if is always true:
Code:
cat $SAMPLE_FILE|awk -f Format1.sh>$OUTPUT_FILE
if [ $? -eq 0 ];then
echo "Ending script successfully............"
fi
First thing here is that cat is not required; just pass $SAMPLE_FILE to awk as an argument, before the redirection.
The reason the test will always succeed is that, unless Format1.sh has errors in it, awk is the last command run, and even if nothing is output it will exit successfully.
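One way to make that final check meaningful, sketched under assumptions: run_format is a hypothetical name, and the inline awk program is only a stand-in for awk -f Format1.sh. The status is saved straight after awk runs, and the output file is also tested for content, since awk succeeds even when it prints nothing. Note too that an exit $? placed after the fi reports the status of the if statement, not of awk.

```shell
#!/bin/bash
# Hypothetical helper: format a sample file into an output file and
# report failure when awk fails or when nothing was written.
run_format() {
    local sample_file=$1 output_file=$2 status
    # Stand-in program; the thread's script would use: awk -f Format1.sh
    awk '{ print }' "$sample_file" > "$output_file"
    status=$?                       # capture awk's status before anything else runs
    if [ "$status" -eq 0 ] && [ -s "$output_file" ]; then
        echo "Ending script successfully"
    elif [ "$status" -eq 0 ]; then
        echo "awk succeeded but produced no output" >&2
        status=1
    fi
    return "$status"
}
```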
Grail - FYI you're a genius. This isn't my script, but that's bloody clever (the code changes, the original code and your improved code). Grail has obviously done a lot of work here to get his, I mean your, code working here, flamingo.
Quote:
To me this implies that both source and extension are optional.
Code:
if [ ! "$1" ];
then usage;
fi
No, it is checking whether the argument given to the script is not null.
Quote:
Clearly not the case. Also on this test, I am guessing from the following line:
Code:
if [ ! "$SOURCEDIR" ] || [ ! -e "$SOURCEDIR" ]; then abort "source dir"; fi
if [ "${#EXTENTIONS[@]}" == "0" ]; then abort "extention(s)"; fi
That both items are compulsory. This would mean that every time you run the script it needs a minimum of 4 arguments (a minimum, because you can have multiple extensions).
So investigate '$#'
Yes, both the source directory and extensions are compulsory and -e is checking for the existence of the directory.
Quote:
There is no #! line at the top of the script, so I am assuming you are using bash. If you are, I would suggest the following:
I am not using the bash shell. Since I am using Cygwin, the default shell would be the Bourne shell (sh).
Quote:
This also covers the Extract.sh script, so is the input above valid? If so I can look at putting code to do:
Yes, the given input is valid, and it would appear in a java or cs file as comments.
Also, there can be multiple blocks like that in a single file.
In some cases, instead of // the lines may start with /* or * or " *", which is why I am removing only the leading spaces, asterisks and slashes.
The script you gave looks simpler, but it is not working for me when the lines contain a slash, space or asterisk.
Also, if a line such as Function Name or Changes is missing, it still works, printing nothing under that column name.
That is what I have been trying to get working these days.
Last edited by flamingo_l; 10-22-2010 at 04:36 AM.
Quote:
To me this implies that both source and extension are optional.
This line was referring to the code box above it, meaning that your usage() function uses square brackets for the -s and -e switches, which in a lot of Linux apps would mean they are optional.
Quote:
No, it is checking whether the argument given to the script is not null.
Yes, your first if is checking that you have passed at least one parameter to the script, but as I said previously, judging from the later code it is an incorrect test, as the minimum number of arguments is 4.
Quote:
-e is checking for the existence of the directory.
Actually, no it is not. If you look under man test (equivalent to [ or [[) you will see:
Code:
-e FILE
FILE exists
-d FILE
FILE exists and is a directory
So -e is not the correct test here as the user could enter 'file.txt' for the source directory and your test will pass.
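To illustrate the difference, here is a small sketch using -d instead of -e. The function name is_source_dir is made up for the example and is not from the original script.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when the argument is non-empty
# and names an existing directory. A regular file such as file.txt
# passes -e but fails -d.
is_source_dir() {
    [ -n "$1" ] && [ -d "$1" ]
}
```

In the script, the source-directory check could then read: is_source_dir "$SOURCEDIR" || abort "source dir".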
Quote:
I am not using the bash shell. Since I am using Cygwin, the default shell would be the Bourne shell (sh).
My bad ... means some of my nice tricks might not work
Quote:
Yes, the given input is valid, and it would appear in a java or cs file as comments.
Also, there can be multiple blocks like that in a single file.
In some cases, instead of // the lines may start with /* or * or " *", which is why I am removing only the leading spaces, asterisks and slashes.
So am I understanding correctly then that we are stripping information from source code and the START,END range is inside each of the comments along with the required fields?
Something like:
Now I have got it. I solved it by renaming flamingo_l.tar.bz2.txt to flamingo_l.tar.bz2.
I used WinZip to extract the contents of the folder.
Will check and let you know.
Meanwhile, I am attaching a sample file.
Last edited by flamingo_l; 10-25-2010 at 04:20 AM.
I got an error when I executed the script in Cygwin on my system...
Code:
$ ./flamingo_l1.sh -s . -e java
Starting script................
Input Directory: .
find: warning: you have specified the -regextype option after a non-option argument -type, but options are not positional (-regextype affects tests specified before it as well as those specified after it). Please specify options before other arguments.
./flamingo_l1.sh: line 48: column: command not found
No data in output file