LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   make a folder for each file in a directory then move the file into it (https://www.linuxquestions.org/questions/programming-9/make-a-folder-for-each-file-in-a-directory-then-move-the-file-into-it-847274/)

genderbender 11-29-2010 12:39 PM

make a folder for each file in a directory then move the file into it
 
Initially I thought - piece of piss, use a for loop with ls in it:

Code:

for file in `ls /shares/ | egrep -v '(.srt|.ass|.sub)'`; do mkdir $file && mv $file /shares/$file/; done
However this causes lots of problems (folders have extensions, I have duplicate folders, the names with spaces create a folder for each element of the name).

The contents of the folder are basically movies (some with subtitles). Some of the names have things like (original) or CD1 CD2 in them.

Any ideas?

ntubski 11-29-2010 01:04 PM

Quote:

use a for loop with ls in it:
Never do that, use globbing, or pipe find into while read.

Code:

shopt -s extglob
for file in /shares/!(*.srt|*.ass|*.sub) ; do
    dir=$(mktemp -d "$file.XXXXXXXXX")  # unique temp dir beside the file
    mv "$file" "$dir"                   # move the file into the temp dir
    mv "$dir" "$file"                   # rename the temp dir to the file's old name
done


genderbender 11-29-2010 05:39 PM

Quote:

Originally Posted by ntubski (Post 4174990)
Never do that, use globbing, or pipe find into while read.

Code:

shopt -s extglob
for file in /shares/!(*.srt|*.ass|*.sub) ; do
    dir=$(mktemp -d "$file.XXXXXXXXX")
    mv "$file" "$dir"
    mv "$dir" "$file"
done


~ # shopt
-sh: shopt: not found
~ # which shopt
~ #

I'm using a very minimal kernel... Find might do it though :)
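For what it's worth, ntubski's approach can be written in plain POSIX sh, with a case statement doing the filtering instead of extglob, so it should run under a minimal busybox ash. This is only a sketch; the function name `movedirs` is made up, and it assumes `mktemp -d` with a template is available:

```shell
#!/bin/sh
# POSIX-only sketch of the same idea: no shopt/extglob needed.
# Gives each non-subtitle file in $1 a directory of its own name,
# using a temp dir so the directory name never collides with the file.
movedirs() {
    for file in "$1"/*; do
        [ -f "$file" ] || continue            # skip directories and non-matches
        case "$file" in
            *.srt|*.ass|*.sub) continue ;;    # leave subtitle files alone
        esac
        dir=$(mktemp -d "$file.XXXXXX") || continue
        mv "$file" "$dir"/                    # file into the temp dir
        mv "$dir" "$file"                     # temp dir takes the file's old name
    done
}
```

Then `movedirs /shares` would give every movie a folder of its own name.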

grail 11-30-2010 01:49 AM

The part about duplicate folders makes me think that there are already folders in the folder / directory you are looking into?

Hence the for solution is incomplete unless you check whether each item found is actually a file.
The other issue is that in Linux all items are files at their base and just have an identifier to say they are something else, like a directory (simplistic, but you get the
point).
So if, for example, you have a file called test.txt, you can no longer create anything else with the same name, of any type, in the same directory.
Try this simple test:
Code:

$ touch test.txt
$ mkdir test.txt

You should receive the following error:
Quote:

mkdir: cannot create directory `test.txt': File exists
So along with the first point, this is why you would need to do something like ntubski and create temp storage first.

Hope that gives you some ideas.

genderbender 11-30-2010 04:19 AM

I didn't actually mean 'duplicate' folders. I end up with folders such as "/shares/allaboutlinuxCD1.avi/" "/shares/allaboutlinuxCD2.avi/".

The idea is that for every file with a similar name it needs to reside in a folder named after the bits of their name that are similar (minus the extension). What I REALLY want is something with the following structure:

Code:

/shares/allaboutlinux
/shares/allaboutlinux/allaboutlinuxCD1.avi
/shares/allaboutlinux/allaboutlinuxCD2.avi

But I'm happy having a base folder named allaboutlinuxCD (I can rename that by hand as there are only a few occurrences). Now in reality what I get is:

Code:

ls -l
/shares/allaboutlinux/all about linux CD1.avi
/shares/allaboutlinux/all about linux CD2.avi
for i in `ls -l`; do mkdir $i; done;
Something something; attempt at making "all" dir failed, attempt at making "about" failed, attempt at making "linux" failed...

Pseudo code above :P

grail 11-30-2010 04:36 AM

I am not sure I have said this to you before, although I have said it many times on this site: as a general rule of thumb, ls is most definitely not to be used in scripts (see here for why not).
One of the main reasons, which definitely affects you in this scenario, is that the for loop performs word splitting on the output (which you have already run into with titles that have
spaces in their names).

ntubski's use of globbing does overcome this but you still face the issue of testing for directories and so on.

You have now added a further layer of complexity which I believe you would need to handle first which is the option of having multiple files needing to go into a new single directory.
To this end I would probably generate a temp file with the output of all files and then use your favourite tool on the file to create the list of names for the new directories.

Another possible solution which just popped into my head. Use your initial idea to create all the directories and instead of using the full name, create a directory with the extension removed.
This solves the issue of creating temp directories or files and then you only need to create a small number of moves and copies to tidy up the extras you want moved.
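That second idea might look something like this in plain sh (a sketch only; the function name `extdirs` is made up and the subtitle extensions are the ones from this thread):

```shell
#!/bin/sh
# Sketch of the "strip the extension" idea: the directory name is the file
# name minus its last .extension, so no temp directory is needed, because
# "allaboutlinuxCD1" never clashes with "allaboutlinuxCD1.avi".
extdirs() {
    for file in "$1"/*; do
        [ -f "$file" ] || continue
        case "$file" in
            *.srt|*.ass|*.sub) continue ;;
        esac
        dir=${file%.*}                      # chop the extension
        [ "$dir" != "$file" ] || continue   # skip names with no extension
        mkdir -p "$dir"                     # -p: fine if it already exists
        mv "$file" "$dir"/
    done
}
```

After `extdirs /shares`, allaboutlinuxCD1.avi and allaboutlinuxCD2.avi end up in allaboutlinuxCD1/ and allaboutlinuxCD2/ respectively, leaving only a few manual renames to merge the CD folders.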

What do you think?

I think you could easily manage the second one :)

genderbender 11-30-2010 08:36 AM

I'll have a toy with it; pity there's not an alternative to using ls for a task like this (using tr to replace spaces outside the loop and then using tr to replace *with* spaces inside the loop). Thanks for all your ideas :)

grail 11-30-2010 09:25 AM

Quote:

pity there's not an alternative to using ls for a task like this
I am not sure you understood me correctly. There are dozens of ways (better ways) to do this than using ls ... that was my point.
If you follow the link you will see that, in the general scheme of things, unless you are simply listing items to be read by a human, ls is generally
the one tool not to use.

genderbender 11-30-2010 09:27 AM

Sorry. Didn't read the post properly.

ntubski 11-30-2010 09:47 AM

Quote:

Originally Posted by genderbender (Post 4175590)
I didn't actually mean 'duplicate' folders. I end up with folders such as "/shares/allaboutlinuxCD1.avi/" "/shares/allaboutlinuxCD2.avi/".

Solving that is a bit trickier, but doable:
Code:


#!/bin/sh

find /shares -maxdepth 1 ! -regex '.*\.\(srt\|ass\|sub\)$' -type f | \
    while read file ; do

    dir=$(echo "$file" | sed 's/[A-Z0-9]*\..*$//;s/ //g')

    if ! mkdir "$dir" 2>/dev/null && ! [ -d "$dir" ] ; then
        dirtemp=$(mktemp -d "$dir.XXXXX")
        mv "$dir" "$dirtemp" # $dir is really a file
        mv "$dirtemp" "$dir"
    fi

    mv "$file" "$dir"
done


grail 11-30-2010 10:35 AM

hmmm ... not so sure about this one ntubski. Normally I follow but you lost me on:
Code:

dir=$(echo "$file" | sed 's/[A-Z0-9]*\..*$//;s/ //g')
If we assume that all the OP's files are at the single level under '/shares', then will not dir be equal to - /shares/ - every time?

I would also like to point out that the regex in the find did not work for me. According to the man page, regex for find requires the entire path to be matched,
so you would need a further .* at the beginning.
Code:

find /shares -maxdepth 1 ! -regex '.*\.\(srt\|ass\|sub\)$' -type f

ntubski 11-30-2010 10:56 AM

Quote:

Originally Posted by grail (Post 4175931)
hmmm ... not so sure about this one ntubski. Normally I follow but you lost me on:
Code:

dir=$(echo "$file" | sed 's/[A-Z0-9]*\..*$//;s/ //g')
If we assume that all the OP files are at the single level under '/shares', then will not dir be equal to - /shares/ - everytime?

Nope, it only chops uppercase and digits off the end:
Code:

$ echo "/shares/all about linux CD1.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
/shares/allaboutlinux

Quote:

I would also like to point out that the regex in the find did not work for me. According to the man page, regex for find requires the entire path to be matched,
so you would need a further .* at the beginning.
Code:

find /shares -maxdepth 1 ! -regex '.*\.\(srt\|ass\|sub\)$' -type f

Oops, I forgot to put a .srt file in my test dir. :o

grail 11-30-2010 06:36 PM

Quote:

Nope, it only chops uppercase and digits off the end:
See I knew you wouldn't have made a mistake :) It was actually that I did not provide the correct input.
I would, however, caution the OP that if you have dot-separated names, this will currently remove everything after the first dot.

Also, you may have issues if your filenames capitalise the first letter of each word:
Code:

/share/All About Linux CD1.avi
Obviously there are plenty of gotcha scenarios and I am not trying to say any one solution is better or worse :)
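For instance, the dot caveat, with a made-up filename (the title is hypothetical): the substitution matches at the leftmost dot, so everything from the first dot onward is chopped.

```shell
# Hypothetical dot-separated title run through ntubski's sed:
echo "/shares/the.matrix.1999.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
# prints: /shares/the
```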

genderbender 12-02-2010 04:32 AM

Thanks for all your help, I'll run some tests and see what is best suited to my media (my media does have a capital for each first letter most of the time, and special characters...). The kernel is incredibly basic, so I'm not entirely sure all of these commands will work, but I'll give it a shot. It's basically one of those NAS boxes; I've already noticed find to be lacking most of the common arguments.

If none of these solutions work I may mount this share point on a different linux box and run the above code. Failing that I'll probably try using some free Windows tool (my /shares folder is exported as a CIFS share anyway).

ntubski 12-02-2010 10:59 AM

Quote:

Originally Posted by genderbender (Post 4177994)
Thanks for all your help, I'll run some tests and see what is best suited to my media (my media does have a capital for each first letter most of the time and special characters...)

The sed command I posted can handle capital first letters; you'll only have problems if upper case letters or digits you want to keep sit at the end of the name, not separated from the part you want to chop off.


Code:

~$ echo "/shares/All About Linux CD1.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
/shares/AllAboutLinux
~$ echo "/shares/All About XML CD.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
/shares/AllAboutXML
~$ echo "/shares/All About XMLDVD.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
/shares/AllAbout # oops

Quote:

I've already noticed find to be lacking most of the common arguements.
GNU find has lots of extras; eg this reference is much shorter, in particular it doesn't have -regex or -maxdepth. It can probably be faked with a grep.

