Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place!
11-25-2016, 03:20 AM
|
#1
|
LQ Newbie
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29
Rep:
|
printf in nested for/while loop not working
Given two sets of keywords (say set1 is {keyA1, keyA2} and set2 is {keyB1, keyB2}), my objective is to create a list of file pairs fileA (files which have all the keyA words in their path) and fileB (files which have all the keyB words in their path) by searching recursively.
I wrote following code but it is not printing out the filename as expected. The if condition is there to avoid repeated pairs and pairing a file with itself; however, I have commented it out.
Can someone point out the mistake or suggest an improvement?
Code:
#!/bin/bash
keyA1="AC"
keyA2="LM"
keyB1="SR"
keyB2="LT"
find . -name "*.txt" -path "*$keyA*" -path "*$keyA2*" | while read -d $'\0' fileA
do
    # printf '%s %s \n' "fileA: " "$fileA"
    find . -name "*.txt" -path "*$keyB1*" -path "*$keyB2*" | while read -d $'\0' fileB
    do
        # if [ $fileA \< $fileB ] ; then
        printf '%s %s \n' "$fileA" "$fileB"
        # fi
    done
done
Earlier I observed that if I placed the printf statement right after the first while loop (commented out in the above code), it made the innermost printf functional on some machines (Cygwin). This makes me wonder if this is similar or related to fflush() in C.
|
|
|
11-25-2016, 03:52 AM
|
#2
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,933
|
[ ] needs variable references in quotes, for example "$fileA"
Better use < in [[ ]] (and you do not need to quote)
Code:
if [[ $fileA < $fileB ]]; then
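To make the difference between the two test forms concrete, here is a minimal sketch. The two paths are made up for illustration only; they are not from the original poster's tree.

```bash
#!/bin/bash
# Two hypothetical paths used only for illustration.
fileA="./AC/LM/one.txt"
fileB="./SR/LT/two.txt"

# Inside [ ], the < must be escaped (otherwise the shell treats it as an
# input redirection) and the variables should be quoted.
if [ "$fileA" \< "$fileB" ]; then
    printf '[ ]:   %s sorts before %s\n' "$fileA" "$fileB"
fi

# Inside [[ ]], < is a string comparison operator and no word splitting
# is done on the variables, so the quotes are optional.
if [[ $fileA < $fileB ]]; then
    printf '[[ ]]: %s sorts before %s\n' "$fileA" "$fileB"
fi
```

Both tests succeed for these two strings; [[ ]] is bash-specific, so the script needs to run under bash rather than plain sh.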
Further, you certainly want
Code:
find . -name "*.txt" -path "*$keyA1*" -path "*$keyA2*"
Still, this is quite fuzzy because $keyA1 or $keyA2 can be anywhere in a ./dir/dir/filename
Perhaps you can restrict this to the directory part
Code:
find . -name "*.txt" -path "*$keyA1*/*" -path "*$keyA2*/*"
or filename
Code:
find . -name "*.txt" -name "*$keyA1*" -name "*$keyA2*"
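The effect of the two restrictions can be tried on a throwaway tree. This is only a sketch: the directory and file names below are invented for the demonstration, and the keys AC and LM are written out literally instead of using the variables.

```bash
#!/bin/bash
# Build a small temporary tree (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/AC/LM" "$demo/other"
touch "$demo/AC/LM/plain.txt" "$demo/other/AC_LM.txt"

# Keys must appear in a directory component (the pattern ends in /*).
in_dirs=$(cd "$demo" && find . -name "*.txt" -path "*AC*/*" -path "*LM*/*")

# Keys must appear in the file name itself.
in_names=$(cd "$demo" && find . -name "*.txt" -name "*AC*" -name "*LM*")

printf 'directory match: %s\n' "$in_dirs"
printf 'file name match: %s\n' "$in_names"
rm -rf "$demo"
```

The -path form should pick only ./AC/LM/plain.txt and the -name form only ./other/AC_LM.txt, because -name is matched against the last path component alone.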
Then, what is the -d $'\0' for?
The usual "raw read" is read -r, as in "while read -r fileA".
|
|
|
11-25-2016, 04:14 AM
|
#3
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,800
|
and how are the two finds related to each other? I could not understand it. You repeat the same command again and again (the second find).
I would recommend another solution.
I think read -d $'\0' would require -print0 option to find.
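For reference, this is roughly how the NUL-delimited form is usually paired up. It is a sketch on a temporary directory with invented file names, not the original poster's data; as far as I know, -d '' and -d $'\0' end up meaning the same thing in bash.

```bash
#!/bin/bash
# Temporary directory with hypothetical file names, one containing spaces.
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/name with space.txt"

count=0
# -print0 ends every name with a NUL byte; read -d '' splits on NUL.
# IFS= keeps leading/trailing whitespace, -r keeps backslashes as they are.
# Process substitution is used so that count is still set after the loop.
while IFS= read -r -d '' file; do
    printf 'found: %s\n' "$file"
    count=$((count + 1))
done < <(find "$demo" -name "*.txt" -print0)

printf 'files seen: %d\n' "$count"
rm -rf "$demo"
```

The advantage of this form is that it survives file names with spaces or even newlines; if the names are known to be simple, plain read -r on newline-separated output is enough.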
|
|
|
11-25-2016, 05:15 AM
|
#4
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,021
|
Quote:
I wrote following code but it is not printing out the filename as expected.
|
So what, if anything, is being printed out?
|
|
|
11-25-2016, 09:15 AM
|
#5
|
LQ Newbie
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29
Original Poster
Rep:
|
Thanks very much, MadeInGermany. All the points you mentioned were helpful. *$keyA* was in fact a typo. I edited my code with the points you mentioned and it is working fine now.
I guess what made the difference was switching to "while read -r fileA" in the code.
Here is the working code for the reference of others:
Code:
#!/bin/bash
keyA1="AC"
keyA2="LM"
keyB1="SR"
keyB2="LT"
find . -name "*.txt" -path "*$keyA1*/*" -path "*$keyA2*/*" | while read -r fileA
do
    # printf '%s %s \n' "fileA: " "$fileA"
    find . -name "*.txt" -path "*$keyB1*/*" -path "*$keyB2*/*" | while read -r fileB
    do
        if [[ "$fileA" < "$fileB" ]] ; then
            printf '%s %s \n' "$fileA" "$fileB"
        fi
    done
done
@grail and @pan, I appreciate your attention to my query. I blindly copied the "while" part of the command from some online source, so I did not know exactly what the $'\0' syntax does. Maybe I still don't know what a raw read is!
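On the "raw read" question: the -r option tells read not to treat backslashes as escape characters. A minimal sketch, using a made-up string that contains a backslash:

```bash
#!/bin/bash
# Without -r, read treats the backslash as an escape and drops it.
cooked=$(printf '%s\n' 'dir\name.txt' | { read line; printf '%s' "$line"; })

# With -r ("raw"), the line is stored exactly as it was written.
raw=$(printf '%s\n' 'dir\name.txt' | { read -r line; printf '%s' "$line"; })

printf 'read:    %s\n' "$cooked"
printf 'read -r: %s\n' "$raw"
```

The first line comes out as dirname.txt and the second as dir\name.txt, which is why -r is the usual choice when reading file names.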
Quote:
and how are the two finds related to each other? I could not understand it. You repeat the same command again and again (the second find).
|
Those two are the same command but with different search terms, so that each one picks one of the two file sets used for creating the pairs.
Quote:
So what, if anything, is being printed out?
|
No, it did not print anything at all.
Thank you very much everyone!
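One likely explanation for the empty output of the first script (an assumption, since it was not confirmed in the thread): find without -print0 writes newline-separated names, so a read that waits for a NUL byte hits end of input without ever seeing its delimiter, returns failure, and the loop body never runs. The sketch below imitates that with a fixed two-line string instead of real find output.

```bash
#!/bin/bash
# Newline-separated input, as plain find (without -print0) produces it.
input=$'./AC/LM/a.txt\n./AC/LM/b.txt\n'

nul_iterations=0
# read -d '' waits for a NUL byte that never comes, so it fails at EOF.
while read -r -d '' line; do
    nul_iterations=$((nul_iterations + 1))
done < <(printf '%s' "$input")

nl_iterations=0
# The default newline delimiter matches the input format.
while read -r line; do
    nl_iterations=$((nl_iterations + 1))
done < <(printf '%s' "$input")

printf 'NUL-delimited read ran the loop body %d times\n' "$nul_iterations"
printf 'newline-delimited read ran it %d times\n' "$nl_iterations"
```

Here the NUL-delimited loop runs zero times and the newline-delimited loop runs twice.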
|
|
|
11-25-2016, 11:24 AM
|
#6
|
LQ Newbie
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29
Original Poster
Rep:
|
addendum here ...
I noticed that it works only for the Ubuntu machine where I tested it.
However, in the Cygwin environment it did not work.
Individually the following command works when launched from the command prompt e.g.
Code:
find . -path "*AC/*" -path "*RM/*" -name "*.bmp" | while read -r fileA; do echo $fileA; done
However, the script, when launched, appears to run for a while (maybe there are many files in the file structure) and then returns to the command prompt without printing anything.
|
|
|
11-25-2016, 12:36 PM
|
#7
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,933
|
The biggest delay in the output can be caused by an oversized pipe buffer.
What means "did not work"? Error message? No output?
find is resource intensive.
It is more efficient to redirect each find output to a temp file:
Code:
find . -name "*.txt" -path "*$keyA1*/*" -path "*$keyA2*/*" > tmpfileA
find . -name "*.txt" -path "*$keyB1*/*" -path "*$keyB2*/*" > tmpfileB
Reading a file multiple times has less overhead than running find multiple times.
It also avoids pipe buffers:
Code:
while read -r fileA
do
    # printf '%s %s \n' "fileA: " "$fileA"
    while read -r fileB
    do
        if [[ "$fileA" < "$fileB" ]] ; then
            printf '%s %s \n' "$fileA" "$fileB"
        fi
    done < tmpfileB
done < tmpfileA
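A variation on the same idea, swapping the temp files for bash arrays (this is a different technique from the one above, and needs bash 4 or later for mapfile): each find still runs only once. The tree and file names below are invented for the sketch.

```bash
#!/bin/bash
# Temporary tree with hypothetical names matching the two key sets.
demo=$(mktemp -d)
mkdir -p "$demo/AC/LM" "$demo/SR/LT"
touch "$demo/AC/LM/a.txt" "$demo/SR/LT/b.txt" "$demo/SR/LT/c.txt"

# Run each find once and keep the sorted results in arrays.
mapfile -t filesA < <(cd "$demo" && find . -name "*.txt" -path "*AC*/*" -path "*LM*/*" | sort)
mapfile -t filesB < <(cd "$demo" && find . -name "*.txt" -path "*SR*/*" -path "*LT*/*" | sort)

pairs=()
for fileA in "${filesA[@]}"; do
    for fileB in "${filesB[@]}"; do
        if [[ "$fileA" < "$fileB" ]]; then
            pairs+=("$fileA $fileB")
        fi
    done
done

printf '%s\n' "${pairs[@]}"
rm -rf "$demo"
```

With this tree the script lists two pairs, ./AC/LM/a.txt with each of the two SR/LT files. Arrays avoid leftover temp files, but they keep every path in memory, so for very large trees the temp-file version may still be preferable.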
|
|
|
11-25-2016, 01:58 PM
|
#8
|
LQ Newbie
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29
Original Poster
Rep:
|
Quote:
What means "did not work"? Error message? No output?
|
I did not see any output on the command prompt screen. However, I was not able to type anything, so it appears that the process was running. After a few seconds the prompt became receptive again and I could type as usual.
I was curious to know what the "done < tmpfileB" redirection in your code means. Is it that, for each pass of the loop, the system is told to fetch the next line (while read -r) from tmpfileB as the source?
|
|
|
11-25-2016, 02:56 PM
|
#9
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,933
|
The while-do-done is a code block. The shell reads the whole block into memory, sees the redirection at the end, opens the file for reading (associating a file descriptor with it), and sets the code block's stdin to that file descriptor.
Then it runs the code block. Every "read" by default reads from stdin, which is now the file.
Braces can also form a code block, for example
Code:
{
    read line1
    read line2
} < /etc/group
{
    echo "$line1"
    echo "$line2"
} > output
cat output
Of course the following is more compact. You can redirect both stdin and stdout, just like you can do it with a simple command.
Code:
{
    read line1
    read line2
    echo "$line1"
    echo "$line2"
} < /etc/group > output
cat output
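Because the redirection applies to the whole block, any command inside the block that reads stdin would also consume lines from the file. One common way around that, shown here only as a sketch with a made-up three-line list, is to attach the file to a separate file descriptor and let read use that one.

```bash
#!/bin/bash
# Hypothetical input list written to a temporary file.
list=$(mktemp)
printf '%s\n' one two three > "$list"

seen=()
# 3< attaches the file to descriptor 3 for the whole while-do-done block;
# read -u 3 reads from it, leaving stdin free for commands in the body.
while read -r -u 3 item; do
    seen+=("$item")
done 3< "$list"

printf 'read %d lines: %s\n' "${#seen[@]}" "${seen[*]}"
rm -f "$list"
```

This prints that three lines were read: one two three. The -u option is a bash extension; in plain sh the equivalent would be read -r item <&3.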
|
|
|
11-25-2016, 03:04 PM
|
#10
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,021
|
The redirect is feeding the loop.
I was curious though, your 'if' test, how is it relevant to test if a path to a file is less than another path to a file? Ultimately this compares the two strings character by character and decides on the first position where they differ, based on the ASCII values of the characters in the respective paths. I fail to see how this is a valuable test (or of course I am missing the point altogether).
|
|
|
11-25-2016, 03:49 PM
|
#11
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,933
|
I think the idea is to exclude
A A
(equality)
and allow
A B
but not vice versa (commutation)
Last edited by MadeInGermany; 11-25-2016 at 03:55 PM.
|
|
|
11-26-2016, 05:00 AM
|
#12
|
LQ Newbie
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29
Original Poster
Rep:
|
Thanks MadeInGermany.
@grail,
Quote:
I was curious though, your 'if' test, how is it relevant to test if a path to a file is less than another path to a file?
|
As MadeInGermany said above, this is to avoid the self-pair and the commutation.
Consider a matrix of size m x n, where m is the number of keyA files and n is the number of keyB files. Imagine the path strings for keyword set keyA arranged vertically at the left as row titles, and those for keyword set keyB arranged in a row at the top as column titles. Each file pair corresponds to one cell of this matrix. We want only the upper triangle (to avoid repetitions), without the diagonal entries (self-pairs).
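The matrix picture is easiest to see in the case where both keyword sets happen to select the same files. A small sketch with three placeholder names (A, B, C stand in for real paths):

```bash
#!/bin/bash
# Hypothetical case: rows and columns are the same three "files".
files=(A B C)
kept=()
for row in "${files[@]}"; do
    for col in "${files[@]}"; do
        # Strictly "less than" drops the diagonal (A A) and the mirrored
        # pairs (B A), leaving only the upper triangle.
        if [[ "$row" < "$col" ]]; then
            kept+=("$row $col")
        fi
    done
done
printf '%s\n' "${kept[@]}"
```

Of the nine possible cells only A B, A C and B C are kept. When the two sets do not overlap, the same test simply keeps whichever ordering of each pair sorts first.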
Last edited by samasat; 11-26-2016 at 05:02 AM.
|
|
|
11-26-2016, 05:08 AM
|
#13
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,800
|
I would create a list of files/paths and then write a sort-like function in perl/python where I can easily reorder/count/select/whatever I want. Also using data structures may help to implement it better.
Does keyA1 occur before keyA2 (and does keyB1 occur before keyB2) in filenames, or only one of them can be found in the path (of files)?
|
|
|
11-26-2016, 05:14 AM
|
#14
|
LQ Newbie
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29
Original Poster
Rep:
|
Quote:
Does keyA1 occur before keyA2 (and does keyB1 occur before keyB2) in filenames, or only one of them can be found in the path (of files)?
|
Yes. We have *1 coming before *2. That's how we have organized the file structure. However, a generalization for any other possibility in the future would be good.
Quote:
I would create a list of files/paths and then write a sort-like function in perl/python where I can easily reorder/count/select/whatever I want.
|
So far I am Python/Perl illiterate! Hence, I wanted to do it within bash. However, I guess I will be learning (will have to!) Python soon.
|
|
|
11-26-2016, 05:34 AM
|
#15
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,800
|
Another question:
find <dir> -name "*.txt" | grep -E 'keyA1|keyA2|keyB1|keyB2' | sort > result
will give you an ordered list, something like this:
Code:
fileA.1
fileA.2
fileB.1
fileA.3
fileB.2
fileB.3
fileA.4
fileA.5
...
You only need to remove the files from the list where both keyA and keyB can be found.
Is this what you are looking for?
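A runnable sketch of the filter-and-sort idea, fed from a fixed string instead of a real find so it can be tried anywhere. The paths are invented and keyA1, keyA2, keyB1, keyB2 are used as literal placeholder names.

```bash
#!/bin/bash
# Hypothetical find output; the key names are placeholders.
listing=$'./x/keyB1/keyB2/f2.txt\n./x/other/f3.txt\n./x/keyA1/keyA2/f1.txt'

# -E enables alternation; the quotes stop the shell from treating each |
# as a pipe to another command.
result=$(printf '%s\n' "$listing" | grep -E 'keyA1|keyA2|keyB1|keyB2' | sort)
printf '%s\n' "$result"
```

The line without any key is dropped and the remaining two come out sorted, keyA path first. Note that the alternation matches a path containing any one of the keys, not all of them, so it is looser than the chained -path tests used earlier in the thread.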
|
|
|
All times are GMT -5.