Mass file manipulation

Drack · 02-20-2006, 10:52 AM

I have a few directories full of thousands of similar files, most of them redundant, taking up a huge amount of space. There are a few procedures that I'd like to run these files through to reclaim much of this space both by file compression and trimming the redundant files. I think that shell scripts would be the most effective tool for this but I need to know which programs I should be reading about to accomplish it.

Anyway, here are some of the batch procedures I'd like to do to organize this mess:

-Zip all files into individual zips. filename.ext becomes filename.zip with best compression (I hear 7zip makes the most compressed .zips).

-Delete all files whose filenames contain a substring
This has already been answered to my satisfaction - It was handed to me! This alone has reclaimed about a dozen gigs of space! I'd link to the post but I can't include links until I have at least 5 posts so just search under my username if you want the solution.

-Delete all files with filenames that contain substring1 but do not contain substring2

Here's where it starts getting tricky. The filenames are often in the format "Title (additional info)"
-Delete all files with one "additional info" string where another file with the same title with preferable additional info exists. For instance, delete any "Same Title (Revision A).zip" files if "Same Title (Revision B).zip" is found in the directory.

What programs can do this under linux? I think I have some manpages to read.

pljvaldez · 02-20-2006, 11:18 AM

Here's a mass file zipping script (run from the directory with the files):

for i in *; do zip `basename $i .wmv`.zip $i; done

Change the ".wmv" to whatever file extension you're zipping. If you leave out the basename part, you'll get filenames like "file.wmv.zip" instead of "file.zip". Also, note those are backquotes (on the ~ key).

For the deleting, can't you just run rm *\(Revision\ A\)*.zip? The "\" escapes special characters like parentheses and spaces so the shell treats them literally.
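A plain rm with a wildcard removes every matching file, whether or not a better revision exists. For the "delete Revision A only if Revision B is there" case, something like the following might work. It is only a sketch: the function name prune_superseded is made up for illustration, and it assumes the names really follow the "Title (additional info).zip" pattern from the first post.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a standard tool): remove "<title> (<old>).zip"
# only when "<title> (<new>).zip" exists in the same directory.
prune_superseded() {
    local dir=$1 old=$2 new=$3 f title
    for f in "$dir"/*" ($old).zip"; do
        [ -e "$f" ] || continue            # the glob matched nothing
        title=${f%" ($old).zip"}           # strip the trailing " (old).zip"
        if [ -e "$title ($new).zip" ]; then
            rm -- "$f"                     # a preferred copy exists, drop this one
        fi
    done
}

# Example call (left commented out so nothing is deleted by accident):
# prune_superseded . "Revision A" "Revision B"
```

Swapping rm for echo on a first run is a cheap way to see what would be deleted before committing to it.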

Drack · 02-21-2006, 09:41 AM

Thanks.

nx5000 · 02-21-2006, 09:45 AM

It's always good to understand find, for, and so on, but for mass renaming, moving, or linking you can try mmv:
http://linux.maruhn.com/sec/mmv.html

Most distributions should have a package for it; Debian certainly does.
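As for find itself, here is a rough sketch of how it could handle the "contains substring1 but not substring2" deletion from the first post. The function name is invented for this example, and -maxdepth and -delete are assumed to be available, as they are in GNU find on Linux.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete regular files directly inside DIR whose names
# contain WANT but do not contain SKIP. Each deleted path is printed.
# Note: WANT and SKIP are used inside glob patterns, so characters such as
# * ? [ in them would be treated as wildcards.
delete_with_but_without() {
    local dir=$1 want=$2 skip=$3
    find "$dir" -maxdepth 1 -type f \
        -name "*$want*" ! -name "*$skip*" \
        -print -delete
}

# Example call (commented out so nothing is deleted by accident):
# delete_with_but_without . "substring1" "substring2"
```

Dropping the -delete first gives a preview of what would go.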

Drack · 02-25-2006, 03:37 PM

I tried the for command, but it didn't work on filenames with spaces in them.

The spaces split the value stored in i into separate words.

nx5000 · 02-27-2006, 06:40 AM

Code:

for i in *.wmv; do zip "`basename "$i" .wmv`".zip "$i"; done

Quoting $i stops bash from splitting the filename at its spaces.
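A tiny demonstration of the splitting, in case it helps. The count_args function and the filename are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

i="my video.wmv"

count_args $i      # unquoted: the shell splits at the space, prints 2
count_args "$i"    # quoted: the name stays one argument, prints 1
```

The zip command sees its arguments the same way, which is why the unquoted loop breaks on names with spaces.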

Here's a more general one that uses bash's built-in pattern substitution:

Code:

for i in *.wmv; do zip "${i/%.wmv/.zip}" "$i"; done
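For anyone comparing the bash substitution forms, here is a small sketch of how they behave on a name that happens to contain "wmv" twice. The filename is invented for the example.

```shell
#!/usr/bin/env bash
# Made-up filename that contains "wmv" both at the start and as the extension.
i="wmv clips.wmv"

echo "${i/wmv/zip}"      # first match anywhere  -> zip clips.wmv
echo "${i/%.wmv/.zip}"   # match only at the end -> wmv clips.zip
echo "${i%.wmv}.zip"     # strip suffix, append  -> wmv clips.zip
echo "${i%.*}.zip"       # strip any last extension -> wmv clips.zip
```

The last form does not care what the extension is, so it should drop into a loop over mixed file types. For the "best compression" part of the original question, zip accepts -9 to request its strongest compression level.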