LinuxQuestions.org
Old 01-09-2013, 05:07 PM   #1
Dweeb2010
Member
 
Registered: May 2010
Posts: 33

Rep: Reputation: 1
looking for a method of stripping special characters from filenames


Hi. So I've run into the age-old issue of special characters in filenames. Specifically, my music collection includes several files that I ripped on older Windows computers, which have things like quotes and question marks in the filenames (the files were auto-named by whatever ripping software I was using at the time, based on the CDDB lookups). This problem wreaks havoc when I attempt to copy albums or songs over to a portable player or to other disks, or even when I just open the files in GNU/Linux players, because the files or directories containing the special characters are skipped with an error. It can be extremely annoying when I try to load an entire album and then find out that one song in the middle wasn't added to the playlist because of this.

Since I have a fairly large music collection, I'd like to find a semi-automated way to just navigate through an entire directory and remove all special characters. I would write a script to do this, but last time I tried, I wound up making a mistake and renamed several files into nothing (based on a one-liner a friend gave me), losing the files. Is there a fairly uncomplicated way to do this? I tried searching around a bit, but didn't find anything specifically describing this. Thanks.
 
Old 01-09-2013, 05:31 PM   #2
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,150

Rep: Reputation: 330
Look at the tr command. (man tr for details.)

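Something like this find ./ -type f -exec sh -c 'mv -- "$1" "$(dirname "$1")/$(basename "$1" | tr <args>)"' sh '{}' ';' (where "<args>" is the tr specification you want) might work.

Note: the find command I suggest is from memory, and not checked. The mv target has to be built inside a shell (the sh -c part) so that tr actually runs on the file name; find substitutes '{}' but does not run pipelines itself.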
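
A minimal sketch of the tr idea as a dry run, in case it helps. The names STRIP_CHARS, clean_name and preview_tree are made up for illustration, and the character set is only an assumption about what counts as "special" here. It only prints what would be renamed and assumes the argument is a directory.

```shell
#!/bin/sh
# Characters to delete from file names. This particular set (double quote,
# single quote, ? * < > | :) is an assumption; edit it to taste.
STRIP_CHARS="\"'?*<>|:"

# clean_name NAME: print NAME with every character in STRIP_CHARS deleted.
clean_name() {
    printf '%s' "$1" | tr -d "$STRIP_CHARS"
}

# preview_tree DIR: for each regular file under DIR whose name would change,
# print "old -> new". Nothing is renamed.
preview_tree() {
    STRIP_CHARS=$STRIP_CHARS find "$1" -type f -exec sh -c '
        for path; do
            dir=${path%/*}      # everything before the last slash
            old=${path##*/}     # the file name itself
            new=$(printf "%s" "$old" | tr -d "$STRIP_CHARS")
            # Skip unchanged names and names that would become empty.
            if [ -n "$new" ] && [ "$new" != "$old" ]; then
                printf "%s -> %s\n" "$path" "$dir/$new"
            fi
        done' sh {} +
}
```

Running preview_tree on the music directory should list the planned renames; only the file name is passed through tr, so slashes and parent directories are left alone.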
 
Old 01-09-2013, 05:55 PM   #3
suicidaleggroll
Senior Member
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 2,849

Rep: Reputation: 1007
There's probably a better way to do it, but I would just go through each special character one by one and write a short script to find any files/dirs with that character and remove it. Something like:

Code:
find dir -depth -name "* *" -print0 | while IFS= read -r -d '' file; do name=${file##*/}; echo mv "$file" "${file%/*}/${name// /_}"; done
find dir -depth -name "*'*" -print0 | while IFS= read -r -d '' file; do name=${file##*/}; echo mv "$file" "${file%/*}/${name//\'/}"; done
find dir -depth -name '*"*' -print0 | while IFS= read -r -d '' file; do name=${file##*/}; echo mv "$file" "${file%/*}/${name//\"/}"; done
find dir -depth -name "*\?*" -print0 | while IFS= read -r -d '' file; do name=${file##*/}; echo mv "$file" "${file%/*}/${name//\?/}"; done
Note that I left "echo"s in front of the mv commands so that you can verify the mv is doing what you think it's doing before actually running it. If you verify that everything looks like it will be renamed properly, then remove the echo in front of the mv and run it again to actually perform the renaming. An example on my system:

Code:
$ ls -l dir/
total 0
-rw-rw-r-- 1 user user 0 Jan  9 16:14 evil"file1a
-rw-rw-r-- 1 user user 0 Jan  9 16:17 evil"file1b
-rw-rw-r-- 1 user user 0 Jan  9 16:14 evil'file2a
-rw-rw-r-- 1 user user 0 Jan  9 16:17 evil'file2b
-rw-rw-r-- 1 user user 0 Jan  9 16:14 evilfile3a?
-rw-rw-r-- 1 user user 0 Jan  9 16:17 evilfile3b?
-rw-rw-r-- 1 user user 0 Jan  9 16:14 evil file 4a
-rw-rw-r-- 1 user user 0 Jan  9 16:17 evil file 4b
$
$ cat fix
#!/bin/bash

find dir -depth -name "* *" -print0 | while IFS= read -r -d '' file; do name=${file##*/}; echo mv "$file" "${file%/*}/${name// /_}"; done
find dir -depth -name "*'*" -print0 | while IFS= read -r -d '' file; do name=${file##*/}; echo mv "$file" "${file%/*}/${name//\'/}"; done
find dir -depth -name '*"*' -print0 | while IFS= read -r -d '' file; do name=${file##*/}; echo mv "$file" "${file%/*}/${name//\"/}"; done
find dir -depth -name "*\?*" -print0 | while IFS= read -r -d '' file; do name=${file##*/}; echo mv "$file" "${file%/*}/${name//\?/}"; done
$
$  ./fix 
mv dir/evil file 4a dir/evil_file_4a
mv dir/evil file 4b dir/evil_file_4b
mv dir/evil'file2b dir/evilfile2b
mv dir/evil'file2a dir/evilfile2a
mv dir/evil"file1a dir/evilfile1a
mv dir/evil"file1b dir/evilfile1b
mv dir/evilfile3b? dir/evilfile3b
mv dir/evilfile3a? dir/evilfile3a
Then after removing the echos and running it again:
Code:
$ ls -l dir/
total 0
-rw-rw-r-- 1 user user 0 Jan  9 16:14 evilfile1a
-rw-rw-r-- 1 user user 0 Jan  9 16:17 evilfile1b
-rw-rw-r-- 1 user user 0 Jan  9 16:14 evilfile2a
-rw-rw-r-- 1 user user 0 Jan  9 16:17 evilfile2b
-rw-rw-r-- 1 user user 0 Jan  9 16:14 evilfile3a
-rw-rw-r-- 1 user user 0 Jan  9 16:17 evilfile3b
-rw-rw-r-- 1 user user 0 Jan  9 16:14 evil_file_4a
-rw-rw-r-- 1 user user 0 Jan  9 16:17 evil_file_4b
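
The four find passes above can also be folded into a single pass. This is only a sketch, assuming bash and GNU find; the function names sanitize and fix_tree are made up for illustration. It edits only the last path component, walks depth-first so a directory is renamed after its contents, and refuses to overwrite an existing file or to produce an empty name.

```shell
#!/bin/bash
# sanitize NAME: spaces become underscores; single quotes, double quotes and
# question marks are dropped (the same four substitutions as the script above).
sanitize() {
    local name=$1
    name=${name// /_}
    name=${name//\'/}
    name=${name//\"/}
    name=${name//\?/}
    printf '%s' "$name"
}

# fix_tree DIR [run]: preview the renames below DIR; pass "run" as the second
# argument to actually perform them.
fix_tree() {
    local root=$1 mode=${2:-preview} path dir old new
    # -depth lists a directory's contents before the directory itself, so
    # renaming a directory never invalidates paths still waiting in the pipe.
    find "$root" -mindepth 1 -depth -print0 |
    while IFS= read -r -d '' path; do
        dir=${path%/*}
        old=${path##*/}
        new=$(sanitize "$old")
        [[ $new == "$old" ]] && continue
        if [[ -z $new || -e $dir/$new ]]; then
            echo "skipping $path (empty or already existing target)" >&2
            continue
        fi
        if [[ $mode == run ]]; then
            mv -- "$path" "$dir/$new"
        else
            echo mv -- "$path" "$dir/$new"
        fi
    done
}
```

Calling fix_tree dir prints the mv commands, like the echo version above; fix_tree dir run performs them.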

Last edited by suicidaleggroll; 01-09-2013 at 06:02 PM.
 
Old 01-10-2013, 05:23 PM   #4
codergeek
Member
 
Registered: Dec 2012
Posts: 52

Rep: Reputation: 7
You can do this with a for loop and sed. I created two filenames with special characters for this example:
Code:
ls -1
a_ weird ^? filename.mp3
This is #$ a _ test file
Code:
for i in *; do  echo mv "$i" "$(printf '%s\n' "$i" | sed 's/[!@#\$%^&*()?_]//g' | tr -s " ")"; done
mv a_ weird ^? filename.mp3 a weird filename.mp3
mv This is #$ a _ test file This is a test file
In each output line, the original filename comes first and the new filename follows it.

Whatever character(s) you want omitted, just insert them between the square brackets in the sed expression.

It is good to use echo to preview the results before committing the actual conversion. If satisfied with the preview, then remove the echo in front of mv.

This is optional. If you want to capitalize the first character of each word, add a second sed expression (\U is a GNU sed extension).

Code:
for i in *; do  echo mv "$i" "$(printf '%s\n' "$i" | sed 's/[!@#\$%^&*()_?]//g;s/^.\| [a-z]/\U&/g' | tr -s " ")"; done
mv a_ weird ^? filename.mp3 A Weird Filename.mp3
mv This is #$ a _ test file This Is A Test File
The new filenames have each word capitalized and weird characters removed.
The new sed expression is the part after the semicolon.

Hope this helps. Remember to keep the echo part to preview the results, then remove it to make the real changes.
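
For what it's worth, the loop above can be wrapped in two small functions so the preview also catches the "renamed into nothing" case from the first post. This is a sketch only: strip_chars and preview_dir are hypothetical names, and trimming the leading and trailing spaces left behind is an extra step that is not part of the one-liner above.

```shell
#!/bin/sh
# strip_chars NAME: the sed + tr pipeline from the loop above, with the input
# quoted, plus trimming of leading and trailing spaces (an addition).
strip_chars() {
    printf '%s\n' "$1" |
        sed -e 's/[!@#\$%^&*()?_]//g' -e 's/^ *//' -e 's/ *$//' |
        tr -s ' '
}

# preview_dir: print the mv commands for the current directory. A file is
# skipped if its new name would be empty or is already taken.
preview_dir() {
    for f in *; do
        [ -e "$f" ] || continue            # nothing matched the glob
        new=$(strip_chars "$f")
        [ "$new" = "$f" ] && continue      # name is already clean
        if [ -z "$new" ] || [ -e "$new" ]; then
            echo "skipping $f (empty or already existing target)" >&2
            continue
        fi
        echo mv -- "$f" "$new"
    done
}
```

As before, the echo in front of mv keeps this a preview; remove it once the output looks right.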

Last edited by codergeek; 01-10-2013 at 05:35 PM.
 
  


All times are GMT -5. The time now is 02:07 PM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration