LinuxQuestions.org

Bash, LS, For loops, and whitespaces in directories (https://www.linuxquestions.org/questions/programming-9/bash-ls-for-loops-and-whitespaces-in-directories-126388/)

jhrbek 12-16-2003 02:45 PM

Bash, LS, For loops, and whitespaces in directories
 
Hi everyone. I have a problem that I have been unable to solve. I need to write a bash script that will recurse through a directory structure and write whatever directory names it finds to a file. Below is an example of how I am doing it. I have a nested for loop structure that recurses to the depth that I need. This example is just a simple version of what I wrote.

What I know:

I know that "for" uses whitespace as a delimiter and automatically splits up directory names that have spaces in them. This is bad. I also know that spaces in directory names are bad, but I can't change that. They are part of an IMAP mail spool, and spaces in the directory names were deemed "necessary" by my boss. :( I have been told that I should be able to accomplish what I need with a well-constructed use of the "find" command and a while statement, but I have been unable to find/create a working model.

The code below works for directories that do not have a whitespace, but will not include those that do.

If I had a directory like:

trash/
sent items/
save/

it would print the results as such:

I found trash and it is a directory
I found save and it is a directory

If I "echo" all of the results it will print:

trash/
sent
items/
save/


Any help would be appreciated, thanks!


Code:

#!/bin/bash

# Build our directory list

for i in `ls`
 do
 if [ -d $i ]
  then
  echo "I found $i and it is a directory"
 fi
done

please help :confused:

Bebo 12-16-2003 02:59 PM

Hi,

find <dir> -type d

will give you a list of the directories in <dir>, recursively. Then you can just redirect the output to a file.

HTH
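
A minimal sketch of that suggestion, assuming the directory to scan is passed as the first argument and the output file as the second. The function name list_dirs and both paths are placeholders, not something from the thread. Because find prints one path per line, directory names containing spaces arrive in the file intact.

```shell
#!/bin/bash
# list_dirs is a hypothetical helper name: write every directory under $1
# (recursively, one path per line) to the file named in $2.
list_dirs() {
    find "$1" -type d > "$2"
}
```

It could be called as list_dirs /path/to/spool /tmp/dirlist.txt (both paths are placeholders).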

jhrbek 12-16-2003 03:10 PM

Thanks for the reply. Here is what I have so far, the complicated one. If I change directory to one of the sub directories, and issue:

find . -type d -name '* *' -o -type d -name '*'

It works great, returning both normal directories and directories with spaces. However, because the for loop chops things up based on white space, I'm getting hosed there. I don't know how else to do this without a for loop and still preserve the usernames that are associated with the directory structure (I obtain the usernames from the directory names).

Here is an example of the output from my script:

message folder found, saving: ./j/user/jjacobsen^foo^com/Web
message folder found, saving: Site
message folder found, saving: Stuff
message folder found, saving: ./j/user/jkelch^foo^com/Mail
message folder found, saving: ./j/user/jkeller^foo^com/News
message folder found, saving: ./j/user/jlancaster^foo^com/News

Note that the "jjacobsen^foo^com/Web" should really be "jjacobsen^foo^com/Web Site Stuff"

Code:

for i in `ls`
 do
 if [ -d $i/user ]
  then
  for j in `ls $i/user`
    do
    echo -e "user.$j\tdefault\t${j//^/.}\tlrswipcda\t$acluser\tlrswipcda" >>/tmp/newmboxlist.txt
      : $[users++]
      for k in `find . -type d -name '* *' -o -type d -name '*'`
      do
          echo "message folder found, saving: $k"
          echo -e "user.$j.$k\tdefault\t${j//^/.}\tlrswipcda\t$acluser\tlrswipcda" >>/tmp/newmboxlist.txt
      done
    done
 fi
done


Bebo 12-16-2003 03:20 PM

Aha... What if you could somehow replace the spaces in the dirnames with escaped spaces (backslash-space), so that the parsing comes out right? Use something like tr ' ' '\ ' or maybe even tr '\ ' '\\\ '?

Bebo 12-16-2003 03:22 PM

Or... Since you know that the proper dirname should start with ./ then you can just append the strings that don't start with that to the former dirname?

Edit: a bit ugly, but... ;)


jhrbek 12-16-2003 03:35 PM

Well, getting a list and writing it to a file isn't a problem. But I need to be able to iterate through each result and insert that result into a larger string:

Here is what the final output must be:

user.bhartline^foo^com default bhartline.foo.com lrswipcda cyrus lrswipcda
user.bhartline^foo^com.Drafts default bhartline.foo.com lrswipcda cyrus lrswipcda
user.bhartline^foo^com.Sent Items default bhartline.foo.com lrswipcda cyrus lrswipcda

I would just normally rename the directories and be done with it but because it is for email, I can't do that w/o making people mad at me. :D

The echo string that creates the above output is:

echo -e "user.$j.$k\tdefault\t${j//^/.}\tlrswipcda\t$acluser\tlrswipcda" >>/tmp/newmboxlist.txt

where:

$i is an iteration of a letter of the alphabet, a, b, c, etc. It gets its info from the output of an LS command.

\t is a special character I need to delimit the data.

$j is the name of the directory holding the user's mail data

${j//^/.} is a pattern substitution that rewrites user^foo^com to user.foo.com; $j itself is an iteration of the output from another LS command

$k is any subdirectories found in $j

Hope that makes sense. :D

Will your regexp pattern prevent the for loop from cutting up the spaces in the directory names?

Would it be easier to write this info to a file, then open it again and use a while loop, eg while not EOF? I think this would be complicated because I would have to do this for every letter of the alphabet and every username under that letter. Gerr.

By the way, the ./ you mentioned is a product of my find command (I think). I'll try to implement your suggestion and see if it helps. :)
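
On the "write it to a file, then read it back" idea: a while read loop consumes its input one line at a time rather than one word at a time, so it can be fed from such a file (or directly from find). A minimal sketch, assuming the list file holds one directory name per line. The function name print_entries is hypothetical and the record is shortened to two fields for illustration.

```shell
#!/bin/bash
# print_entries (hypothetical name): read a list file line by line and
# emit one tab-separated record per directory name.
print_entries() {
    local listfile=$1 line
    # IFS= keeps leading/trailing blanks; -r keeps backslashes literal.
    while IFS= read -r line; do
        printf 'user.%s\tdefault\n' "$line"
    done < "$listfile"
}
```

A name such as "Sent Items" stays in one piece because the loop never word-splits the line.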

jhrbek 12-16-2003 03:39 PM

Quote:

Or... Since you know that the proper dirname should start with ./ then you can just append the strings that don't start with that to the former dirname?
This actually might work. I would have to be able to access the for loop's previous iteration though, and I don't know how to do that. Hmm. I'll have to see if I can store them in an array or something. I'm new to BASH so I'll have to check into the array structure.

Bebo 12-16-2003 03:44 PM

Yeah, that's some horrible echo! :D

The ./ is, as you say, the result of the find command. BTW, you don't have to use that very long find: find . -type d -name '* *' -o -type d -name '*'. It should be enough with find . -type d, but then again, it might just be a result of the contents in the dir you're looking in :)

I was just trying to get some use of my tr commands, but to no avail. sed should be better: find . -type d | sed s/' '/'\\ '/g

jhrbek 12-16-2003 03:51 PM

Yeah, the echo statement is evil, but necessary.

Even if I changed my find command, how would I iterate through it though? The iteration is splitting the directories, not the find. I've only used sed once, and it was an example to learn bash, so I didn't really understand what I was doing. :) I'll keep hacking at it though.

Bebo 12-16-2003 04:18 PM

Yeah, you're right that the iteration splits the directories. I see that now - sorry :) And my sed pipe didn't work either. Nothing I do today works... bwaaah! :cry: ;)

One last try to help you before I go and stand in the corner... To implement the ./ check, you can append the dirnames to one another until the next dirname starts with ./. Well, that's not helping very much, I guess...

(Well, what do you know, this is my 200th post! :))
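
One likely reason the tr and sed attempts cannot work: the shell splits the output of a command substitution on the characters in IFS (space, tab and newline by default) and does not treat backslashes in that output as escapes. A sketch of an alternative that was not tried in the thread, restricting IFS to a newline so that the for loop splits on lines only. The function name each_dir is hypothetical, and the approach still breaks on names that contain newlines.

```shell
#!/bin/bash
# each_dir (hypothetical name): loop over find's output line by line
# while still using a for loop. The ( ) body runs in a subshell, so the
# IFS and globbing changes do not leak into the caller.
each_dir() (
    IFS=$'\n'     # split command substitution results on newlines only
    set -f        # no globbing, so a name like "a*" is not expanded
    for d in $(find "$1" -type d); do
        echo "message folder found, saving: $d"
    done
)
```

Called as each_dir . it should report a directory such as "Web Site Stuff" on a single line.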

jhrbek 12-16-2003 04:22 PM

Bebo, I appreciate your help!

Do you know anything about arrays with bash? As I read the bash documentation I find it lacking, especially in the area of examples.

I tried:

Code:

declare -a folders
read -a folders | find . -type d -name '* *' -o -type d -name '*'
#find . -type d -name '* *' -o -type d -name '*' | read -a folders # doesn't work either
echo ${folders[@]} #prints array

and that didn't work. It only seems to work if I manually type something in. The docs say read separates array values by whitespace, so if I could get this to work, I would be able to do the append operation with relative ease. Regardless, you would think that I could get at least 1 array element with all of the returned data.

this works, but does not address my need:

Code:

declare -a folders
read -a folders
echo ${folders[@]} #prints array

Ideas?
:scratch:
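
Some background on why both attempts fail: read -a folders | find ... sends read's (empty) output into find, so read never sees the directory list. The reversed form fails because bash runs the parts of a pipeline in subshells, so the array is filled there and lost when the pipeline ends; read -a also reads only a single line. A minimal sketch that stores one array element per line instead, using process substitution so that the loop runs in the current shell. The function name load_folders is hypothetical.

```shell
#!/bin/bash
declare -a folders

# load_folders (hypothetical name): store each directory under $1 as one
# array element, spaces included.
load_folders() {
    local line
    folders=()
    while IFS= read -r line; do
        folders[${#folders[@]}]=$line    # append at the next free index
    done < <(find "$1" -type d)
}
```

After load_folders . the elements can be walked with for f in "${folders[@]}"; the double quotes keep each name whole.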

jhrbek 12-16-2003 05:41 PM

Almost there! All I need to do now is append the offending values and I've got it. Thanks for your help!

-j

Code:

declare -a folders

folders=(`find . -type d -name '* *' -o -type d -name '*'`)

arrSize=${#folders[@]} #array size of "folders"

for i in `seq 0 $((arrSize - 1))`   # indices run from 0 to size-1
 do
  if [ ! -d ${folders[$i]} ]
  then
    echo ${folders[$i]}
  fi
done


Bebo 12-16-2003 05:43 PM

I don't know anything about arrays, I'm afraid.

For the last hour I've tried to solve the splitting problem with the continue statement, but I'm apparently too stupid today, 'cause I can't even get the logic tests to work.

Bebo 12-16-2003 05:52 PM

Aha, great! I didn't see your last post before I wrote my previous one :) Now I can use your solution too :D

Bebo 12-16-2003 06:19 PM

OK, now I have a solution that can be used if one uses ls instead of find:

Code:

#!/bin/bash

unset name

for i in `ls -1F | grep \/$` ; do
    if test ! $name ; then
        name=$i
    else
        name=`echo $name $i`
    fi

    test `echo $name | rev | cut -c1` != \/ && continue

    echo "Here is your directory: $name"

    unset name
done


Bebo 12-17-2003 07:28 AM

Hello again,

I couldn't let it go (I like scripting :D) so here is something that might work for you:

Code:

#!/bin/bash

declare -a names

names=(`find . -type d -name '*'`)
namesize=${#names[@]}

unset completename
unset previousname

for i in `seq 0 $namesize` ; do
    currentname=${names[$i]}

    if test $previousname ; then
        completename=`echo $completename $previousname`
       
        if test ! $currentname || test `echo $currentname | cut -c1-2` = './' ; then
            echo "Here is the correct dirname for use: $completename"

            # DO SOMETHING HERE

            unset completename
        fi
    fi

    previousname=$currentname
done


Cheers!

dolmen 12-17-2003 09:08 AM

Re: Bash, LS, For loops, and whitespaces in directories
 
Quote:

Originally posted by jhrbek

The code below works for directories that do not have a whitespace, but will not include those that do.

Any help would be appreciated, thanks!


Code:

#!/bin/bash

# Build our directory list

for i in `ls`
 do
 if [ -d $i ]
  then
  echo "I found $i and it is a directory"
 fi
done


Of course, the simplest solution would be to use find as suggested by the first user who replied.

But I won't miss this occasion to teach you a few things.
Your original code has three flaws:

1. You are using ls instead of "ls".
2. You do not put quotes around variables.
3. You are using the ls result on the for line.

Let's explain in depth:

1. ls may be an alias, which may not behave as you expected when you wrote the shell script. In your case, ls seems to be "ls -CF" (as I see the '/' after the directories). Putting quotes (") around ls calls "bare" ls, bypassing any defined alias.
before: for i in `ls`
after: for i in `"ls"`

2. It is very important to put quotes around variable expansions, because if the variable value is two or more words separated by spaces, they may be expanded as two separate words.
before: if [ -d $i ]
after: if [ -d "$i" ]

3. The number of arguments of some commands (maybe the for statement) may be limited, so you can expect problems if your directory contains many files. You also have problems with filenames containing spaces, because each part of the filename is used in a different iteration of the loop. The solution is to read the result of "ls" line by line into a variable.
before: for i in `ls`
after: "ls" | while read i

So the new code is :

Code:

#!/bin/bash

# Build our directory list

"ls" | while read i
do
  [ -d "$i" ] && echo "I found $i and it is a directory"
done
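
A related sketch that drops ls altogether. This is a different technique from the one in this post, not dolmen's code: a glob ending in / expands to directories only, and each match stays a single word whatever characters it contains. The function name report_dirs is hypothetical. Like plain ls, the glob skips hidden directories, and it also matches symlinks that point to directories.

```shell
#!/bin/bash
# report_dirs (hypothetical name): report each subdirectory of $1.
# The ( ) body runs in a subshell, so the cd does not affect the caller.
report_dirs() (
    cd "$1" || exit 1
    for i in */; do
        [ -d "$i" ] || continue      # the pattern matched nothing
        i=${i%/}                     # drop the trailing slash
        echo "I found $i and it is a directory"
    done
)
```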


dolmen 12-17-2003 09:17 AM

Bebo: don't try to transpose algorithms from other programming languages into shell programming, because it is very different. To be good at shell, you have to think in shell.
Using arrays in shell programming is usually not necessary.

Bebo 12-17-2003 09:41 AM

Hello dolmen,

I'm not sure if I understand what you mean. Where did I transpose algorithms from other programming languages? You mean the arrays? Well, with the inspiration of jhrbek, that was the only way I could think of to solve the problem with the iteration splitting the directory names. But, now with your post, I would have solved it differently using the double quotes. Thanks!

Edit: But I can't get the double quotes to work around the find command. This works just as well as my previous 25-line script:

Code:

find -type d -name '*' | while read name ; do
  echo $name
done

Wohoo! :D


unSpawn 12-17-2003 10:29 AM

Using arrays in shell programming is usually not necessary.
FWIW, I think arrays in shell programming are a deity gift. They can for instance help you cut down using external binaries. Here's a lame example. Say you got files all named "DDMM YYYY some_remark.ext" you have to rename to YYYYMMDD_some_remark.ext:
find . -type f -iname \*.ext | while read fn; do fn=( ${fn#./} )
D=${fn[0]:0:2}; M=${fn[0]:2:2}; Y=${fn[1]};
mv "${fn[*]}" "${Y}${M}${D}_${fn[2]}"; done
In combination with IFS usage you could split on about any char ![0-9A-Za-z].

dolmen 12-17-2003 06:15 PM

Quote:

Originally posted by Bebo
Hello dolmen,
I'm not sure if I understand what you mean. Where did I transpose algorithms from other programming languages? You mean the arrays?

Yes, I mean the arrays. But maybe I'm too used to programming in shell without them, because in ksh they are limited to 512 entries.

Quote:


Code:

find -type d -name '*' | while read name ; do
  echo $name
done


This code doesn't work as expected if a filename contains two (or more) consecutive spaces. Try running it after mkdir "a b" (with two spaces between 'a' and 'b').
So you have to put double quotes around the use of the variable.
And "-name '*'" doesn't add anything because "find" will already match all files (or did I miss something?).
This code is better:
Code:

find . -type d | while read name ; do
  echo "$name"
done

which in this particular case of the loop content can be reduced to "find . -type d -print"
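
A slightly more defensive sketch of the same loop: IFS= stops read from trimming leading and trailing blanks, and -r stops it from interpreting backslashes. The variant below additionally separates names with NUL bytes so that even newlines inside names survive. It assumes a find that supports -print0 (GNU and BSD find do) and bash's read -d. The function name walk_dirs is hypothetical.

```shell
#!/bin/bash
# walk_dirs (hypothetical name): print every directory under $1,
# one per line, exactly as find reported it.
walk_dirs() {
    local name
    find "$1" -type d -print0 |
    while IFS= read -r -d '' name; do
        printf '%s\n' "$name"
    done
}
```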

dolmen 12-17-2003 06:42 PM

Quote:

Originally posted by unSpawn
Here's a lame example. Say you got files all named "DDMM YYYY some_remark.ext" you have to rename to YYYYMMDD_some_remark.ext:
Code:

find . -type f -iname \*.ext | while read fn; do fn=( ${fn#./} )
D=${fn[0]:0:2}; M=${fn[0]:2:2}; Y=${fn[1]};
mv "${fn[*]}" "${Y}${M}${D}_${fn[2]}"; done

In combination with IFS usage you could split on about any char ![0-9A-Za-z].

Here is my version (not tested, no Unix or GNU system around) :
Code:

find . -type f -iname \*.ext | sed s/\'/\\\\\''/g;/s/^\(..\)\(..\) \(....\) \(.*\).ext$/mv '\''\&'\'' '\''\3\2\1_\4.ext'\'/ | $SHELL
(This forum doesn't render double backslashes as I want (a backslash before a quote disappears), and I won't fight against it. If you want my code, select the "quote" button.)

Bebo 12-17-2003 07:07 PM

In reply to dolmen's post 21...

Well, yeah, I overlooked the double quotes again :)

About the -name '*'. Actually, I didn't think very much about it in the beginning, as you can see in post 8. I realized the difference when I was writing my too-long script. The -name '*' will filter away the single period (.) which is the first element in find's output. I found that useful, somehow :)

It seems jhrbek has gotten his problem solved real good now, don't you think? ;)
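
Whether -name '*' hides the leading . seems to depend on the find version, so it may be safer not to rely on it. A sketch that excludes the starting directory explicitly; the function name subdirs_only is hypothetical.

```shell
#!/bin/bash
# subdirs_only (hypothetical name): list directories below the current
# one, leaving out the starting point "." itself.
subdirs_only() {
    find . -type d ! -name .
}
```

Unlike a filter based on -name '*', this keeps the other directories regardless of how the pattern treats leading dots.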

dolmen 12-17-2003 08:44 PM

Quote:

Originally posted by Bebo
It seems jhrbek has gotten his problem solved real good now, don't you think?
I hope so. ;)

jhrbek 12-18-2003 03:04 PM

Wow, well thanks for all the help everyone! I haven't been able to work on this problem since Tuesday because of final exams at university. I'm done now (just finished) so I'll be sure to implement the suggestions.

Dolmen,

I've tried quoting the LS command in the way you suggested but the for loop still split the directory names. Maybe it's something to do with Red Hat 8, I don't know. I don't know a whole lot about Linux, but I'm trying to learn. :) Consequently, because of the mysterious for loop behavior, I decided to use find as it seems to be a bit more powerful than LS, at least for what I need to do.

Regardless, I'll definitely find a solution from all of these great suggestions. I'll be sure to post my final code at the end of the day today.

Thanks!

-j

jhrbek 12-18-2003 05:12 PM

Finally! :)

Here is my final product, it does everything I need it to do. Thanks for all of your help! :)

Code:

for i in `ls`
 do
 if [ -d $i/user ]
  then
    find $spool/$i/user/ -type d | while read name ; do
      # We need to remove the spool path and get just the username
      name=${name//$escSpool\/$i\/user\//}
      if [ ! -z "$name" ]
      then
        echo "$name"
        echo "user/${name//^/.}" >>/tmp/userlist.txt
        echo -e "user.$name\tdefault\t${name//^/.}\tlrswipcda\t$acluser\tlrswipcda" >>/tmp/newmboxlist.txt
      fi
    done
 fi
done
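
For comparison, a sketch of the same loop with every expansion quoted and with a glob in place of ls. It assumes the layout $spool/<letter>/user/<mailbox directories> implied by the script above, and it prints the records to standard output instead of appending to /tmp/newmboxlist.txt. The function name build_mboxlist and its two parameters are hypothetical; the record format is copied from the echo line above.

```shell
#!/bin/bash
# build_mboxlist (hypothetical name): $1 is the spool root, $2 the ACL user.
# Assumed layout: $spool/<letter>/user/<mailbox directories>.
build_mboxlist() {
    local spool=$1 acluser=$2 letterdir base name
    for letterdir in "$spool"/*/; do
        base=${letterdir%/}/user
        [ -d "$base" ] || continue
        find "$base" -type d | while IFS= read -r name; do
            name=${name#"$base"}         # strip the spool prefix
            name=${name#/}
            [ -n "$name" ] || continue   # skip the user directory itself
            printf 'user.%s\tdefault\t%s\tlrswipcda\t%s\tlrswipcda\n' \
                "$name" "${name//^/.}" "$acluser"
        done
    done
}
```

It could be run as build_mboxlist "$spool" "$acluser" >>/tmp/newmboxlist.txt to keep the original output file.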


Bebo 12-19-2003 05:56 AM

Great! It was a lot of fun! :)

Bax1989 09-22-2010 06:17 AM

Thanks
 
I also had this question.
Bebo, your script also works fine without the "grep \/$" filter after the ls statement.


All times are GMT -5.