LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 11-25-2016, 03:20 AM   #1
samasat
LQ Newbie
 
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29

Rep: Reputation: Disabled
printf in nested for/while loop not working


Given two sets of keywords (say set1 is {keyA1, keyA2} and set2 is {keyB1, keyB2}), my objective is to create a list of file pairs fileA (files that have all the keyA words in their path) and fileB (files that have all the keyB words in their path) by searching recursively.


I wrote the following code, but it does not print the filenames as expected. The if condition is there to avoid repetitions and pairing a file with itself; however, I have commented it out.
Can someone point out the mistake or suggest an improvement?

Code:
#!/bin/bash

keyA1="AC"
keyA2="LM"
keyB1="SR"
keyB2="LT"

find . -name "*.txt" -path "*$keyA*" -path "*$keyA2*" | while read -d  $'\0' fileA
do
#    printf '%s %s \n' "fileA: " "$fileA"
    find . -name "*.txt" -path "*$keyB1*" -path "*$keyB2*" | while read -d  $'\0' fileB
do     
   #   if [ $fileA \< $fileB ] ; then
        printf '%s %s \n' "$fileA" "$fileB"
   #   fi
done
done
Earlier I observed that if I placed a printf statement right after the first while loop (commented out in the code above), it made the innermost printf work on some machines (Cygwin). This makes me wonder whether this is similar or related to fflush() in C.
 
Old 11-25-2016, 03:52 AM   #2
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,933

Rep: Reputation: 1251
Inside [ ], variable references need to be in quotes, for example "$fileA"
Better to use < inside [[ ]] (where you do not need to quote)
Code:
if [[ $fileA < $fileB ]]; then
Further, you certainly want
Code:
find . -name "*.txt" -path "*$keyA1*" -path "*$keyA2*"
Still, this is quite fuzzy because $keyA1 or $keyA2 can be anywhere in a ./dir/dir/filename
Perhaps you can restrict this to the directory part
Code:
find . -name "*.txt" -path "*$keyA1*/*" -path "*$keyA2*/*"
or filename
Code:
find . -name "*.txt" -name "*$keyA1*" -name "*$keyA2*"
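To see how the three patterns above differ, here is a small sketch run against a throwaway tree. The directory and file names (AC_dir, LM_sub, plain.txt, AC_LM.txt) are made up for illustration; only the find tests themselves come from the post.

```shell
#!/bin/bash
# Build a throwaway tree (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/AC_dir/LM_sub" "$demo/other"
touch "$demo/AC_dir/LM_sub/plain.txt"   # keys only in directory names
touch "$demo/other/AC_LM.txt"           # keys only in the file name
cd "$demo" || exit 1

keyA1="AC"
keyA2="LM"

# Keys anywhere in the path: matches both files.
anywhere=$(find . -name "*.txt" -path "*$keyA1*" -path "*$keyA2*" | sort)
# Keys followed by a slash, i.e. inside a directory component: first file only.
in_dir=$(find . -name "*.txt" -path "*$keyA1*/*" -path "*$keyA2*/*" | sort)
# Keys in the file name itself: second file only.
in_name=$(find . -name "*.txt" -name "*$keyA1*" -name "*$keyA2*" | sort)

printf 'anywhere:\n%s\nin_dir:\n%s\nin_name:\n%s\n' "$anywhere" "$in_dir" "$in_name"
```

Running find from inside the tree with "." keeps the temporary directory's own name out of the matched paths.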
Then, what is the -d $'\0' for?
The usual "raw read" is
Code:
read -r variable
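For readers wondering what "raw" means here: without -r, read treats a backslash as an escape character and drops it; with -r, the line is stored as it was read. A minimal sketch (the sample string is made up):

```shell
#!/bin/bash
# A sample path containing a backslash (hypothetical, Windows-style).
sample='dir\name.txt'

# Without -r the backslash is consumed as an escape character.
cooked=$(printf '%s\n' "$sample" | { read line; printf '%s' "$line"; })
# With -r the line is stored unchanged.
raw=$(printf '%s\n' "$sample" | { read -r line; printf '%s' "$line"; })

printf 'without -r: %s\n' "$cooked"   # dirname.txt
printf 'with -r:    %s\n' "$raw"      # dir\name.txt
```

Writing IFS= in front of read additionally preserves leading and trailing whitespace in the line.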
 
1 members found this post helpful.
Old 11-25-2016, 04:14 AM   #3
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,800

Rep: Reputation: 7568
and how are the two finds related to each other? I could not understand it. You repeat the same command again and again (the second find).
I would recommend another solution.
I think read -d $'\0' would require the -print0 option to find.
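That suspicion can be checked with a small sketch. In bash, $'\0' expands to an empty string, so read -d $'\0' behaves like read -d '' and waits for a NUL byte. Plain find output ends each name with a newline, so read reaches end of input without ever seeing its delimiter, returns non-zero, and the loop body never runs. The temporary directory and file names below are made up.

```shell
#!/bin/bash
demo=$(mktemp -d)                     # throwaway directory (hypothetical names)
touch "$demo/one.txt" "$demo/two words.txt"

# Newline-terminated names, NUL delimiter expected: the body never runs.
broken_count=$(find "$demo" -name "*.txt" |
    while read -r -d '' f; do echo x; done | grep -c x)

# NUL-terminated names from -print0 match the NUL delimiter.
fixed_count=$(find "$demo" -name "*.txt" -print0 |
    while IFS= read -r -d '' f; do echo x; done | grep -c x)

printf 'iterations without -print0: %s\n' "$broken_count"
printf 'iterations with -print0:    %s\n' "$fixed_count"
```

The -print0 form also copes with spaces and newlines in file names, which the newline-based read -r loop handles only partly.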
 
Old 11-25-2016, 05:15 AM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,021

Rep: Reputation: 3199
Quote:
I wrote following code but it is not printing out the filename as expected.
So what, if anything, is being printed out?
 
Old 11-25-2016, 09:15 AM   #5
samasat
LQ Newbie
 
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29

Original Poster
Rep: Reputation: Disabled

Thanks very much, MadeInGermany. All the points you mentioned were helpful. *$keyA* was in fact a typo. I edited my code with the points you mentioned, and it is working fine now.
I guess what made the difference was switching to "while read -r fileA" in the code.

Here is the working code for the reference of others:

Code:
#!/bin/bash

keyA1="AC"
keyA2="LM"
keyB1="SR"
keyB2="LT"

find . -name "*.txt" -path "*$keyA1*/*" -path "*$keyA2*/*" | while read -r fileA
do
#    printf '%s %s \n' "fileA: " "$fileA"
     find . -name "*.txt" -path "*$keyB1*/*" -path "*$keyB2*/*" | while read -r fileB
     do     
         if [[ "$fileA" < "$fileB" ]] ; then
            printf '%s %s \n' "$fileA" "$fileB"
         fi
     done
done
@grail and @pan, I appreciate your attention to my query. I blindly copied the "while" part of the command from some online source, so I did not know what the $'\0' syntax does. Maybe I still don't know what a raw read is!


Quote:
and how are the two finds related to each other? I could not understand it. You repeat the same command again and again (the second find).
Those two are same command however with different search terms to pick respective two different file sets to be used for creating pairs.

Quote:
So what, if anything, is being printed out?
no it did not print anything at all.


Thank you very much everyone!
 
Old 11-25-2016, 11:24 AM   #6
samasat
LQ Newbie
 
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29

Original Poster
Rep: Reputation: Disabled

addendum here ...
I noticed that it works only for the Ubuntu machine where I tested it.
However, in the Cygwin environment it did not work.

Individually the following command works when launched from the command prompt e.g.
Code:
find . -path "*AC/*" -path "*RM/*" -name "*.bmp" | while read -r fileA; do echo $fileA; done
However, the script, when launched, appears to run in the background for a while (maybe there are many files in the file structure) and then returns to the command prompt without printing anything.
 
Old 11-25-2016, 12:36 PM   #7
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,933

Rep: Reputation: 1251
The biggest delay in the output can be caused by an oversized pipe buffer.

What does "did not work" mean? Error message? No output?

find is resource intensive.
It is more efficient to redirect each find output to a temp file:
Code:
find . -name "*.txt" -path "*$keyA1*/*" -path "*$keyA2*/*" > tmpfileA
find . -name "*.txt" -path "*$keyB1*/*" -path "*$keyB2*/*" > tmpfileB
Reading a file multiple times has less overhead than running find multiple times.
It also avoids pipe buffers:
Code:
while read -r fileA
do
#    printf '%s %s \n' "fileA: " "$fileA"
     while read -r fileB
     do     
         if [[ "$fileA" < "$fileB" ]] ; then
            printf '%s %s \n' "$fileA" "$fileB"
         fi
     done < tmpfileB
done < tmpfileA
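An alternative sketch in the same spirit, not from the thread: if bash 4 or newer is available, mapfile can load each find result into an array, so find still runs only once per set and no temp files are left behind. The function name list_pairs and the array names are made up; like the loops above, it assumes file names contain no newlines.

```shell
#!/bin/bash
# list_pairs DIR KEYA1 KEYA2 KEYB1 KEYB2   (hypothetical helper, bash 4+)
list_pairs() {
    local dir=$1 fileA fileB
    local -a filesA filesB
    # Run each find once and keep the result in an array.
    mapfile -t filesA < <(find "$dir" -name "*.txt" -path "*$2*/*" -path "*$3*/*" | sort)
    mapfile -t filesB < <(find "$dir" -name "*.txt" -path "*$4*/*" -path "*$5*/*" | sort)
    for fileA in "${filesA[@]}"; do
        for fileB in "${filesB[@]}"; do
            # Same test as in the thread: skip self-pairs and reversed pairs.
            if [[ "$fileA" < "$fileB" ]]; then
                printf '%s %s\n' "$fileA" "$fileB"
            fi
        done
    done
}

# Demo on a throwaway tree with made-up names.
demo=$(mktemp -d)
mkdir -p "$demo/AC/LM" "$demo/SR/LT"
touch "$demo/AC/LM/a.txt" "$demo/SR/LT/b.txt"
cd "$demo" || exit 1
pairs=$(list_pairs . AC LM SR LT)
printf '%s\n' "$pairs"
```

Whether this is faster than the temp-file version depends on the number of files; it mainly saves the cleanup step.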
 
1 members found this post helpful.
Old 11-25-2016, 01:58 PM   #8
samasat
LQ Newbie
 
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29

Original Poster
Rep: Reputation: Disabled

Quote:
What does "did not work" mean? Error message? No output?
I did not see any output on the command prompt screen, and I was not able to type anything. It appears that the process is running. After a few seconds the prompt becomes responsive again and I can type as usual.

I was curious to know what the following redirection in your code means. Does it tell the system that, for each iteration of the loop, the next line (while read -r) should be fetched from tmpfileB as the source?
Code:
done < tmpfileB
 
Old 11-25-2016, 02:56 PM   #9
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,933

Rep: Reputation: 1251
The while-do-done is a code block. The shell reads the block into memory, sees the redirection at the end, opens the file for reading (associating a file descriptor with it), and sets the code block's stdin to that file descriptor.
Then it runs the code block. Every "read" by default reads from stdin, which is now the file.

Also braces can form a code block, for example
Code:
{
read line1
read line2
} < /etc/group
{
echo "$line1"
echo "$line2"
} > output
cat output
Of course the following is more compact. You can redirect both stdin and stdout, just like you can do it with a simple command.
Code:
{
read line1
read line2
echo "$line1"
echo "$line2"
} < /etc/group > output
cat output
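Applied to the nested while loops from the earlier post, the same rule means the inner file is opened afresh each time the inner loop starts, so every line of tmpfileA is paired with every line of tmpfileB. A minimal sketch, using two made-up temp files in place of tmpfileA and tmpfileB:

```shell
#!/bin/bash
listA=$(mktemp)    # stands in for tmpfileA (hypothetical content)
listB=$(mktemp)    # stands in for tmpfileB
printf '%s\n' a1 a2 > "$listA"
printf '%s\n' b1 b2 b3 > "$listB"

pairs=$(
    while read -r fileA; do
        # The redirection on the inner "done" reopens the file each time
        # this loop starts, so the inner read begins at line one again.
        while read -r fileB; do
            printf '%s %s\n' "$fileA" "$fileB"
        done < "$listB"
    done < "$listA"
)
printf '%s\n' "$pairs"
rm -f "$listA" "$listB"
```

With two lines in the first file and three in the second, the loops print six pairs.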
 
1 members found this post helpful.
Old 11-25-2016, 03:04 PM   #10
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,021

Rep: Reputation: 3199
The redirect is feeding the loop.

I was curious though, your 'if' test, how is it relevant to test if a path to a file is less than another path to a file? Ultimately this will check the strings against each other to find the first value which is lower based on the ASCII values of each character in the respective paths. I fail to see how this is a valuable test? (Or of course I am missing the point altogether.)
 
Old 11-25-2016, 03:49 PM   #11
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,933

Rep: Reputation: 1251
I think the idea is to exclude
A A
(equality)
and allow
A B
but not vice versa (commutation)

Last edited by MadeInGermany; 11-25-2016 at 03:55 PM.
 
Old 11-26-2016, 05:00 AM   #12
samasat
LQ Newbie
 
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29

Original Poster
Rep: Reputation: Disabled
Thanks MadeInGermany.

@grail,
Quote:
I was curious though, your 'if' test, how is it relevant to test if a path to a file is less than another path to a file?
As MadeInGermany said above, this is to avoid self-pairs and commutation.
Consider a matrix of size m x n, where m is the number of keyA files and n is the number of keyB files. Imagine the file path strings for keyword set keyA arranged vertically on the left as row titles, and those for keyword set keyB arranged in a row vector at the top as column titles. Each file pair corresponds to one location in this matrix. We want only the upper triangular part (to avoid repetitions), without the diagonal entries (self-pairs).
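The upper-triangle idea can be sketched on a single list, where it is easiest to see. The item names are made up; [[ x < y ]] compares strings using the current locale's ordering, so LC_ALL=C is set here to make the order predictable.

```shell
#!/bin/bash
export LC_ALL=C            # byte-wise string order for [[ < ]]
items=(a.txt b.txt c.txt)  # hypothetical file names

pairs=$(
    for x in "${items[@]}"; do
        for y in "${items[@]}"; do
            # Keep only x < y: this drops the diagonal (x == y) and the
            # lower triangle (the "y x" mirror of an already listed "x y").
            if [[ "$x" < "$y" ]]; then
                printf '%s %s\n' "$x" "$y"
            fi
        done
    done
)
printf '%s\n' "$pairs"
```

One thing to watch: if the row and column lists share no files, a pair whose fileA sorts after its fileB is skipped and its mirror image never comes up, so the test may drop pairs rather than just duplicates.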

Last edited by samasat; 11-26-2016 at 05:02 AM.
 
Old 11-26-2016, 05:08 AM   #13
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,800

Rep: Reputation: 7568
I would create a list of files/paths and then write a sort-like function in perl/python where I can easily reorder/count/select/whatever I want. Also using data structures may help to implement it better.
Does keyA1 occur before keyA2 (and does keyB1 occur before keyB2) in filenames, or only one of them can be found in the path (of files)?
 
1 members found this post helpful.
Old 11-26-2016, 05:14 AM   #14
samasat
LQ Newbie
 
Registered: Nov 2011
Location: Bangalore
Distribution: Ubuntu/Cygwin
Posts: 29

Original Poster
Rep: Reputation: Disabled
Quote:
Does keyA1 occur before keyA2 (and does keyB1 occur before keyB2) in filenames, or only one of them can be found in the path (of files)?
Yes. We have *1 coming before *2. That's how we have organized the file structure. However, a generalization that covers other possibilities in the future would be good.

Quote:
I would create a list of files/paths and then write a sort-like function in perl/python where I can easily reorder/count/select/whatever I want.
So far I am Python/Perl illiterate! Hence I wanted to do it within bash. However, I guess I will be learning (will have to!) Python soon.
 
Old 11-26-2016, 05:34 AM   #15
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,800

Rep: Reputation: 7568
Another question:
find <dir> -name "*.txt" | grep -E 'keyA1|keyA2|keyB1|keyB2' | sort > result
will give you an ordered list, something like this:
Code:
fileA.1
fileA.2
fileB.1
fileA.3
fileB.2
fileB.3
fileA.4
fileA.5
...
You only need to remove the files from the list where both keyA and keyB can be found.

Is this what you are looking for?
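A sketch of the pipeline above with the alternation quoted, so that the shell does not treat | as a pipe. The directory layout and file names are made up for the example, and grep -E is assumed for the extended-regex alternation.

```shell
#!/bin/bash
demo=$(mktemp -d)                       # throwaway tree (hypothetical names)
mkdir -p "$demo/AC/LM" "$demo/SR/LT" "$demo/misc"
touch "$demo/AC/LM/a.txt" "$demo/SR/LT/b.txt" "$demo/misc/c.txt"
cd "$demo" || exit 1

keyA1="AC"; keyA2="LM"; keyB1="SR"; keyB2="LT"

# Keep any path containing at least one of the four keys, then sort.
result=$(find . -name "*.txt" | grep -E "$keyA1|$keyA2|$keyB1|$keyB2" | sort)
printf '%s\n' "$result"
```

Note that this keeps paths containing any one key, whereas the find tests earlier in the thread require both keys of a set.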
 
  

