LinuxQuestions.org


sahil.jammu 05-15-2011 12:16 AM

Traverse the file system and Rename (xargs or sed?)
 
Hello Everyone,

I need your input on performing the following operations:

a. Traverse from the top-level directory and find all the directories.
b. Rename all these directories to <original name>.dir
c. Once the renaming is done, search from the top level and retain only those directories which have .txt files in them.
d. Delete all the rest.

Can I use xargs here to perform operations a and b, or would sed be useful?

Kindly provide your input.

EricTRA 05-15-2011 12:46 AM

Hi,

You could use find with -exec, or pipe the output of find to xargs, for example. You could even loop over the directories and process the filenames with sed. Try them all out and see if they work for you. If you try and encounter errors, please post the command you used and the errors you get.

Kind regards,

Eric

MTK358 05-15-2011 07:56 AM

This will add the ".dir" to directories. I'm not sure how to tell if a directory has a file with a certain name, though.

Code:

find . -type d -exec mv '{}' '{}'.dir ';'
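One caveat with the command above: without -depth, find renames a parent directory before descending into it, so it can fail to reach the subdirectories, and it also tries to rename the starting directory "." itself. A minimal sketch that avoids both, assuming GNU find (for -mindepth); the function name add_dir_suffix is hypothetical and not from the thread:

```shell
# Hypothetical helper (not from the thread): append ".dir" to every
# directory below the given top-level directory (default: current directory).
# Assumes GNU find.
add_dir_suffix() {
    # -mindepth 1 skips the starting directory itself.
    # -depth visits the contents of a directory before the directory,
    # so a parent is renamed only after everything inside it.
    find "${1:-.}" -mindepth 1 -depth -type d -exec mv {} {}.dir \;
}
```

Calling add_dir_suffix /some/top on a scratch copy first is a cheap way to confirm the result before touching real data.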

sahil.jammu 05-15-2011 10:23 AM

Thanks Eric and MTK358. After running find . -type d -exec mv '{}' '{}'.dir ';' all my directories are renamed to <original name>.dir

With this new naming structure, I would like to run find recursively from the top level, look into all the directories and sub-directories for files with the .txt extension, retain the directories which have .txt files in them, and delete all the rest.

How should I go about it?

MTK358 05-15-2011 10:36 AM

This might work (UNTESTED)

Code:

find . -type d -exec mv '{}' '{}'.dir ';'

function contains_txt_files
{
    for file in "$1"/*
    do
        echo "$file" | grep '\.txt$' &> /dev/null
        if [ $? '=' 0 ]
        then
            return 0
        fi
    done
    return 1
}

find . -type d | while read d
do
    contains_txt_files "$d"
    if [ $? '!=' 0 ]
    then
        rm -r "$d"
    fi
done
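The script above checks only the entries directly inside each directory, so a directory whose .txt files sit in a subdirectory would be removed along with them. A sketch of an alternative, under the assumption that "contains .txt" means "has a .txt file anywhere below it", and assuming GNU find (for -mindepth and -quit). The function name prune_dirs_without_txt is hypothetical:

```shell
# Hypothetical helper (not from the thread): remove every directory below
# the given top-level directory that has no .txt file anywhere beneath it.
# Assumes GNU find.
prune_dirs_without_txt() {
    # -depth hands over children before their parents, so a directory is
    # examined only after its subdirectories have been dealt with.
    find "${1:-.}" -mindepth 1 -depth -type d -exec sh -c '
        for d in "$@"; do
            # -print -quit stops at the first .txt file found; empty
            # output means there is none at or below this directory.
            if [ -z "$(find "$d" -type f -name "*.txt" -print -quit)" ]; then
                rm -r -- "$d"
            fi
        done
    ' sh {} +
}
```

Replacing rm -r with echo in a first run shows which directories would be removed without deleting anything.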


EricTRA 05-15-2011 10:46 AM

Hi,

If you only want to retain the files with .txt and delete all the others, you might do it with this:
Code:

find /yourdir -type f -not -name '*.txt' -exec rm {} \;

That will find all the files that don't end in .txt and delete them, leaving you with only the .txt files. Before executing the actual rm, test with a harmless command such as ls in its place. If you also want to delete the directories not containing .txt files, then you'll have to use a solution such as the one pointed out by MTK358.

Kind regards,

Eric
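The "test before deleting" advice can be sketched as a pair of functions that share the same find tests, one that only lists and one that deletes. The function names are hypothetical, and ! is used as the portable spelling of -not:

```shell
# Hypothetical helpers (not from the thread). Both take the top-level
# directory as their first argument (default: current directory).

# Dry run: only print the files that are not *.txt.
list_non_txt_files() {
    find "${1:-.}" -type f ! -name '*.txt' -print
}

# Real run: delete the same set of files.
delete_non_txt_files() {
    find "${1:-.}" -type f ! -name '*.txt' -exec rm -- {} +
}
```

Running list_non_txt_files /yourdir first and reading the output gives a chance to catch a wrong path or pattern before delete_non_txt_files removes anything.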

sahil.jammu 05-15-2011 11:01 AM

Thanks to both of you. It worked.

I was on a different track: once the directories were renamed to <original_name>.dir, I was trying to use xargs, something like this:

find . -iname '*dir' | xargs find . -iname '*.txt' | <3rd action Point>

But it wasn't working for me.

Please correct me on the use of xargs (I can use find twice in the same pipeline, right?). This command wasn't working for me:

find . -iname '*dir' | xargs find . -iname '*.txt'

Error:
find: paths must precede expression
Usage: find [path...] [expression]

The solution you provided solved the problem, but if you can provide some info related to my error, it will be good learning.

Thanks

MTK358 05-15-2011 11:08 AM

Quote:

Originally Posted by sahil.jammu (Post 4356948)
*dir

Big mistake.

It should be '*.dir', not '*dir'. Just '*dir' (without the dot) will match things like "gsfgjdfkgldir", which obviously isn't what you want.
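The difference between the two patterns can be seen in a scratch directory. This is only an illustration: the directory name photos.dir is made up, and -mindepth assumes GNU find.

```shell
# Throwaway demo directory with one properly suffixed name and one
# name that merely ends in the letters "dir".
demo=$(mktemp -d)
mkdir "$demo/photos.dir" "$demo/gsfgjdfkgldir"

# Count the directories each pattern matches (-mindepth 1 excludes $demo).
loose=$(find "$demo" -mindepth 1 -type d -name '*dir' | wc -l)
strict=$(find "$demo" -mindepth 1 -type d -name '*.dir' | wc -l)

echo "matched by '*dir': $loose, matched by '*.dir': $strict"
rm -r "$demo"
```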

MTK358 05-15-2011 11:12 AM

Quote:

Originally Posted by sahil.jammu (Post 4356948)
find . -iname '*dir' | xargs find . -iname '*.txt' | <3rd action Point>

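See the "." after "find"? That specifies the directory to search; "." means the current directory. Also, "xargs" appends all the directory names as arguments at the end of the command, after the "-iname '*.txt'" expression, but "find" requires its paths to come before the expression. That is why it reports "paths must precede expression".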
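One way to get the intended effect is to have xargs put each directory name in the path position of the second find, before any tests. A minimal sketch, assuming GNU find and xargs (for -print0, -0, -r and -maxdepth); the function name txt_in_dir_dirs is hypothetical:

```shell
# Hypothetical helper (not from the thread): list the .txt files that sit
# directly inside each directory named *.dir below the given top-level
# directory. Assumes GNU find and xargs.
txt_in_dir_dirs() {
    # -print0 / -0 pass names NUL-separated, so spaces in names are safe.
    # -r skips the second find entirely when nothing was found.
    # -I{} substitutes each name where {} appears, i.e. as the search path
    # placed before the expression.
    find "${1:-.}" -type d -name '*.dir' -print0 |
        xargs -0 -r -I{} find {} -maxdepth 1 -type f -name '*.txt'
}
```

With -I{}, xargs runs one find per directory, and -maxdepth 1 keeps nested *.dir directories from reporting the same file twice.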

sahil.jammu 05-15-2011 12:25 PM

Hello MTK358,

Regarding this script:

---
find . -type d -exec mv '{}' '{}'.dir ';'

function contains_txt_files
{
    for file in "$1"/*
    do
        echo "$file" | grep '\.txt$' &> /dev/null
        if [ $? '=' 0 ]
        then
            return 0
        fi
    done
    return 1
}

find . -type d | while read d
do
    contains_txt_files "$d"
    if [ $? '!=' 0 ]
    then
        rm -r "$d"
    fi
done
---

There is one thing: every time I run this, it adds .dir as an extension to the directory name, so if we execute it twice the name becomes dir_name.dir.dir. How can we make sure the name gets only a single .dir?

MTK358 05-15-2011 12:32 PM

I can't find a find option that matches only if a regex or wildcard does not match. I'm not sure how to do this.

EricTRA 05-15-2011 12:35 PM

Quote:

Originally Posted by MTK358 (Post 4357049)
I can't find a find option that matches only if a regex or wildcard does not match. I'm not sure how to do this.

Hi,

Have a look at post #6 where I use the -not to find all files that don't match the -name. If I'm not mistaken you can also use it in combination with -regex since it 'reverses' what you're looking for.

Kind regards,

Eric

MTK358 05-15-2011 12:50 PM

I didn't know about "-not", it was very far down in the man page in a place I didn't look. I thought that if there was such a thing, it would be in the "TESTS" section. So:

Replace this line:

Code:

find . -type d -exec mv '{}' '{}'.dir ';'

with this:

Code:

find . -type d -not -name '*.dir' -exec mv '{}' '{}'.dir ';'
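To check that the guarded command can be run more than once without stacking suffixes, here is a small sketch assuming GNU find. The function name rename_dirs_once is hypothetical, and -depth and -mindepth 1 are additions not in the thread, so that subdirectories are renamed before their parents and the starting directory is left alone:

```shell
# Hypothetical helper (not from the thread): add ".dir" only to directories
# that do not already end in ".dir". Assumes GNU find.
rename_dirs_once() {
    # "!" is the portable spelling of -not.
    find "${1:-.}" -mindepth 1 -depth -type d ! -name '*.dir' \
        -exec mv {} {}.dir \;
}
```

On a second call, every directory already matches '*.dir' and is skipped, so no name should end up as dir_name.dir.dir.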

sahil.jammu 05-15-2011 12:55 PM

Thanks Eric and MTK358.

Last few posts were quite informative and useful.


Cheers
Sahil

EricTRA 05-15-2011 12:56 PM

Hi,

The man page for find is indeed pretty big. Your command would only leave OP with one problem, the one he stated in #10:

Quote:

There is one thing: every time I run this, it adds .dir as an extension to the directory name, so if we execute it twice the name becomes dir_name.dir.dir. How can we make sure the name gets only a single .dir?

meaning that if you run the command a second time, a second .dir gets added, which is not what he wants. I'm trying to figure out how to overcome that but haven't found a solution yet. Any ideas?

Kind regards,

Eric


All times are GMT -5.