[SOLVED] make a folder for each file in a directory then move the file into it
Initially I thought - piece of piss, use a for loop with ls in it:
Code:
for file in `ls /shares/ | egrep -v '(.srt|.ass|.sub)'`; do mkdir "$file" && mv "$file" "/shares/$file/"; done
However, this causes lots of problems (the folders end up with extensions, I get duplicate folders, and names with spaces create a folder for each word in the name).
The contents of the folder are basically movies (some with subtitles). Some of the names have things like (original) or CD1 CD2 in them.
The part about duplicate folders makes me think that there are already folders in the directory you are looking into?
Hence the for solution is incomplete unless you check each item that is found as to whether or not it is a file.
The other issue you have as well is that in Linux all items are considered files at their base and just have an identifier to say they are something else, like a directory (simplistic, but you get the point).
So if, for example, you have a file called test.txt, you can no longer make another entry, of any type, with the same name in the same directory.
Try this simple test:
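Presumably something like the following (a sketch, run in a scratch directory so nothing real gets touched):

```shell
cd "$(mktemp -d)"            # work somewhere harmless
touch test.txt               # create an ordinary file
if ! mkdir test.txt 2>/dev/null; then
    echo "mkdir failed: the name test.txt is already taken by the file"
fi
```

mkdir refuses because the directory entry test.txt already exists, regardless of what type of file it is.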
I didn't actually mean 'duplicate' folders. I end up with folders such as "/shares/allaboutlinuxCD1.avi/" and "/shares/allaboutlinuxCD2.avi/".
The idea is that every file with a similar name needs to reside in a folder named after the parts of the name that are similar (minus the extension). What I REALLY want is something with the following structure:
But I'm happy having a base folder named allaboutlinuxCD (I can rename that by hand as there are only a few occurrences). Now in reality what I get is:
Code:
ls -l
/shares/allaboutlinux/all about linux CD1.avi
/shares/allaboutlinux/all about linux CD2.avi
for i in `ls -l`; do mkdir $i; done;
Something like: the attempt at making 'all' fails, the attempt at making 'about' fails, the attempt at making 'linux' fails...
So, I am not sure I have said this to you before, although I have said it many times on this site: as a general rule of thumb, ls is most definitely not to be used for this (see here for why not).
One of the main reasons, which would definitely affect you in this scenario, is that the for loop will perform word splitting on the output (which you have already run into with titles that have spaces in the names).
ntubski's use of globbing does overcome this but you still face the issue of testing for directories and so on.
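ntubski's post isn't quoted here, but a glob-based loop with the directory test bolted on might look roughly like this (the path and the subtitle filter are assumptions):

```shell
base=/shares
for file in "$base"/*; do            # globbing: no word splitting on names with spaces
    [ -f "$file" ] || continue       # skip directories and anything that isn't a plain file
    case "$file" in
        *.srt|*.ass|*.sub) continue ;;   # leave subtitle files alone
    esac
    echo "would process: $file"
done
```

The echo is just a placeholder for whatever the loop body ends up doing.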
You have now added a further layer of complexity, which I believe you would need to handle first: the option of having multiple files that need to go into a single new directory.
To this end I would probably generate a temp file listing all the files, and then use your favourite tool on that file to create the list of names for the new directories.
Another possible solution just popped into my head: use your initial idea to create all the directories, but instead of using the full name, create each directory with the extension removed.
This avoids the need for temp directories or files, and then you only need a small number of moves and copies to tidy up the extras you want moved.
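That suggestion could be sketched like this (path and extension assumed; the `${file%.*}` expansion chops the last dot-suffix):

```shell
base=/shares
for file in "$base"/*.avi; do
    [ -f "$file" ] || continue
    dir=${file%.*}              # "/shares/foo CD1.avi" -> "/shares/foo CD1"
    mkdir -p "$dir" && mv "$file" "$dir"/
done
```

The directory names still contain spaces and the CD1/CD2 suffixes; that is the small tidy-up left to do by hand.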
I'll have a toy with it, pity there's not an alternative to using ls for a task like this (using tr to replace the spaces outside the loop and then using tr to put the spaces back inside the loop). Thanks for all your ideas.
Quote:
pity there's not an alternative to using ls for a task like this
I am not sure you understood me correctly. There are dozens of ways (better ways) to do this than using ls ... that was my point.
If you follow the link you will see that, in the general scheme of things, unless you are simply listing items to be read by a human, ls is generally the one tool not to use.
hmmm ... not so sure about this one ntubski. Normally I follow but you lost me on:
Code:
dir=$(echo "$file" | sed 's/[A-Z0-9]*\..*$//;s/ //g')
If we assume that all the OP's files are at a single level under '/shares', then will dir not be equal to '/shares/' every time?
I would also like to point out that the regex in the find did not work for me. According to the man page, regex for find requires the entire path to be matched,
so you would need a further .* at the beginning.
Code:
find /shares -maxdepth 1 ! -regex '.*\.\(srt\|ass\|sub\)$' -type f
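For what it's worth, combining that corrected find with the dir= line under discussion might give a loop like this (a guess at the full solution, not something from the thread; it breaks on filenames containing newlines, which is unlikely for media files):

```shell
find /shares -maxdepth 1 -type f ! -regex '.*\.\(srt\|ass\|sub\)$' |
while IFS= read -r file; do
    # chop trailing capitals/digits plus the extension, then drop spaces
    dir=$(echo "$file" | sed 's/[A-Z0-9]*\..*$//;s/ //g')
    mkdir -p "$dir" && mv "$file" "$dir"/
done
```

mkdir -p is what lets several similarly named files land in the same directory without errors.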
Quote:
hmmm ... not so sure about this one ntubski. Normally I follow but you lost me on:
Code:
dir=$(echo "$file" | sed 's/[A-Z0-9]*\..*$//;s/ //g')
If we assume that all the OP's files are at a single level under '/shares', then will dir not be equal to '/shares/' every time?
Nope, it only chops uppercase and digits off the end:
Code:
$ echo "/shares/all about linux CD1.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
/shares/allaboutlinux
Quote:
I would also like to point out that the regex in the find did not work for me. According to the man page, regex for find requires the entire path to be matched,
so you would need a further .* at the beginning.
Code:
find /shares -maxdepth 1 ! -regex '.*\.\(srt\|ass\|sub\)$' -type f
Quote:
Nope, it only chops uppercase and digits off the end:
See, I knew you wouldn't have made a mistake. It was actually that I did not provide the correct input.
I would, however, caution the OP that if you have dot-separated names, this will currently remove everything after the first dot.
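For example, running the quoted sed over a hypothetical dotted name:

```shell
echo "/shares/all.about.linux.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
# prints: /shares/all
```

The `[A-Z0-9]*` happily matches nothing, so the `\.` anchors on the first dot in the name and `.*$` eats the rest.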
Also, you may have issues if your filenames capitalise the first letter of each word:
Code:
/share/All About Linux CD1.avi
Obviously there are plenty of gotcha scenarios, and I am not trying to say any one solution is better or worse.
Thanks for all your help, I'll run some tests and see what is best suited to my media (my media does have a capital for each first letter most of the time, and special characters...). The kernel is incredibly basic, so I'm not entirely sure all of these commands will work, but I'll give it a shot. It's basically one of those NAS boxes; I've already noticed find to be lacking most of the common arguments.
If none of these solutions work I may mount this share point on a different Linux box and run the above code. Failing that, I'll probably try using some free Windows tool (my /shares folder is exported as a CIFS share anyway).
Quote:
Thanks for all your help, I'll run some tests and see what is best suited to my media (my media does have a capital for each first letter most of the time and special characters...)
The sed command I posted can handle capital first letters; you'll only have problems if there are capital letters (or digits) you want to keep at the end of the name that are not separated from the part you want to chop off.
Code:
~$ echo "/shares/All About Linux CD1.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
/shares/AllAboutLinux
~$ echo "/shares/All About XML CD.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
/shares/AllAboutXML
~$ echo "/shares/All About XMLDVD.avi" | sed 's/[A-Z0-9]*\..*$//;s/ //g'
/shares/AllAbout # oops
Quote:
I've already noticed find to be lacking most of the common arguments.
GNU find has lots of extras; the find in this reference, for example, is much shorter, and in particular it doesn't have -regex or -maxdepth. Those can probably be faked with a grep.
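A hypothetical way to fake both missing options with grep, assuming the stripped-down find at least supports -type and prints full paths:

```shell
# fake -maxdepth 1: drop any path with a further slash after the /shares/ prefix
# fake ! -regex: drop the subtitle extensions with a second grep
find /shares -type f | grep -v '^/shares/.*/' | grep -v -E '\.(srt|ass|sub)$'
```

This only lists the candidate files; the mkdir/mv loop would then consume its output.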