LinuxQuestions.org
Linux - Software This forum is for Software issues.
04-14-2009, 07:34 PM   #1
aminalshmu
LQ Newbie
 
Registered: Aug 2004
Location: tallahassee, florida, usa
Distribution: arch linux 0.7
Posts: 2

Rep: Reputation: 0
Script to find ntfs-incompatible characters?


I'm trying to move a lot of data from an ext filesystem to NTFS in Windows 7, using the ext IFS driver. Windows refuses to copy some of the files because of characters in their filenames, such as question marks (?)...

The reason I'm not using ntfs-3g to move these files in Gentoo from one drive to the next is that I want to make sure Windows will be able to read and manipulate all of the files. It's a total pain to go through my whole music folder - 1,600+ subfolders, 6,000 total folders, 115 GB of 25,000+ files (and this is after I've been semi-successfully moving chunks of folders to the new drive) - looking for the ones Windows can't read, then booting into Linux to rename them.

I'm wondering if anyone knows, first of all, which specific characters NTFS does not support that ext2/3 does, and secondly whether there is a script/set of commands/program that could go through this massive collection of files and find those characters and replace them with ones that Windows can deal with... eventually I plan on using the MusicBrainz Picard tagger to try to sort it out once I can move all of the files to my new WD Green terabyte drive =)
 
05-17-2009, 03:53 PM   #2
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 234
I usually run tests on one of my own machines, but in this case, most of what follows is guesswork . . .

No, I have no idea which characters are allowable in either filesystem, and you are right that this is an essential element of your project. I think you need to do 4 things:
  1. Determine your source character set.
  2. Determine your destination character set.
  3. Decide a mapping of items in 1) that are not allowed in 2).
  4. Change the affected file names accordingly.

Determine your source character set
It should be possible to script a loop through whatever characters you want to test & see if/how they fail. Here is some concept code for the entire 7-bit ASCII set:
Code:
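mkdir ASCII
cd ASCII || exit 1
# Octal 001-177; 000 (NUL) is skipped because it can never be in a file name.
for X in {0,1}{0..7}{0..7}
do
  [ "$X" = 000 ] && continue
  printf -v y "\\$X"   # bash printf understands \nnn; bash's echo -e only accepts \0nnn
  printf '%s  ' "$X"
  touch -- "./$y" && ls -db -- "./$y"   # quoted so spaces and glob characters survive
done 2>&1 | less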
Warning: It's messy.

Another approach to finding your source character set is to look in the directories you will be working w/ -- here I assume it's /home/<you>:
Code:
ls -R ~ \
| grep -v "^/.*:$" \
| grep -o . \
| LC_ALL=C sort -u \
| less
Not so messy, but not as complete.

Determine your destination character set
Run the test from 1) on the dest. file sys., perhaps w/ a .bat or .cmd file that is generated by a script.
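As a starting point that needs no testing on the Windows side: Microsoft's file-naming rules reserve the characters < > : " / \ | ? * and the control characters 1-31, and names ending in a space or a dot cause trouble as well. Here is a rough sketch that lists every entry under a directory whose name breaks those rules. It assumes bash and a find that supports -mindepth (GNU and BSD find do); find_ntfs_unsafe is just a name made up for this example.

```shell
#!/bin/bash
# Sketch: print every path under a directory whose last component contains
# a character Windows reserves, or ends in a dot or a space.
# find_ntfs_unsafe is a hypothetical helper name, not an existing tool.
find_ntfs_unsafe() {
  local dir=${1:-.}
  # LC_ALL=C makes the match byte-wise, so odd encodings cannot hide a name.
  # [:cntrl:] also flags DEL (0x7F), which Windows technically permits.
  # "/" is not tested because it can never occur inside a single name.
  LC_ALL=C find "$dir" -mindepth 1 \
    \( -name '*[<>:"\\|?*[:cntrl:]]*' -o -name '*[. ]' \) -print
}
```

Running find_ntfs_unsafe ~/music | less gives the list of names that need attention. It does not check for the reserved device names (CON, PRN, AUX, NUL, COM1-9, LPT1-9), which Windows also refuses.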

Decide a mapping of items in 1) that are not allowed in 2)
This one is up to you -- you might map lower case to upper, spaces to underscores, all punctuation to periods, etc., etc., etc.
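To make the idea of a mapping concrete, here is a small sketch that turns one file name into something Windows should accept. The choice of underscore as the replacement is arbitrary, the reserved set (< > : " \ | ? * and control characters 1-31) is taken from Microsoft's naming rules, and ntfs_safe_name is a made-up helper name.

```shell
#!/bin/bash
# Sketch: map a single file name (not a whole path) to an NTFS-friendly form.
# ntfs_safe_name is a hypothetical helper; "_" is an arbitrary replacement.
ntfs_safe_name() {
  local name=$1
  # 1. Replace each reserved or control character with an underscore.
  #    '[_*]' tells tr to repeat "_" for the whole first set.
  # 2. Strip trailing dots and spaces, which Windows also rejects.
  name=$(printf '%s' "$name" \
         | LC_ALL=C tr '<>:"\\|?*\001-\037' '[_*]' \
         | LC_ALL=C sed 's/[. ]*$//')
  # A name made only of dots and spaces would end up empty; fall back to "_".
  printf '%s\n' "${name:-_}"
}
```

For example, ntfs_safe_name 'Who Are You?.mp3' prints Who Are You_.mp3. Two different source names can map to the same result (a?b and a*b both become a_b), so whatever does the renaming should check for collisions.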

Change the affected file names accordingly
Use tr (read the man page); here's a lower-to-upper sample, quoted so that names containing spaces survive:
Code:
for X in *; do mv -- "$X" "$(printf '%s' "$X" | tr a-z A-Z)"; done
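The one-liner above only handles the current directory. For a whole tree, something along these lines might do: it walks the tree deepest-first, so a directory is renamed only after everything inside it, and it replaces the characters Windows reserves with underscores. This is an untested-on-your-data sketch that assumes bash and a find with -print0; rename_tree and the DRY_RUN switch are invented for the example, and it only prints what it would do unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# Sketch: rename every entry under a directory whose name contains a
# character Windows reserves. rename_tree and DRY_RUN are hypothetical names.
rename_tree() {
  local dir=$1 path base new
  # -depth lists a directory's contents before the directory itself;
  # -print0 with read -d '' keeps names with newlines or spaces intact.
  find "$dir" -mindepth 1 -depth -print0 |
  while IFS= read -r -d '' path; do
    base=${path##*/}
    # Illustrative mapping: reserved and control characters become "_".
    new=$(printf '%s' "$base" | LC_ALL=C tr '<>:"\\|?*\001-\037' '[_*]')
    [[ $new == "$base" ]] && continue
    if [ -e "${path%/*}/$new" ]; then
      printf 'skip (target exists): %s\n' "$path" >&2
    elif [ "${DRY_RUN:-1}" = 1 ]; then
      printf 'would rename: %s -> %s\n' "$path" "$new"
    else
      mv -- "$path" "${path%/*}/$new"
    fi
  done
}
```

Run rename_tree ~/music first to review the list, then DRY_RUN=0 rename_tree ~/music to apply it. Entries whose new name already exists are skipped and reported on stderr so nothing gets overwritten.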
 
  


All times are GMT -5.
