LinuxQuestions.org
Old 02-29-2008, 08:30 AM   #1
Clemente
Member
 
Registered: Aug 2003
Distribution: Debian, Ubuntu
Posts: 188

Rep: Reputation: 30
How to find files with bad filenames (in terms of charset encoding)


Hi all,

I am running a Linux server that is used within my intranet (Samba and NFS) as well as from the internet (SFTP/SSH).
Some users run their SFTP clients with the wrong filename encoding, which leads to ugly filenames (and problems) for the intranet users.

I don't see any way to avoid this problem, so I am trying to fix it automatically. That led to the next problem: I can't figure out how to detect files with bad filenames.

I noticed that
Code:
find .
finds all files, including those with bad filenames, while
Code:
find . -iname "*"
only finds files with good filenames. So I could compare these two file lists and hope to catch all bad filenames.
That is not a solution I like. First, I don't really know why find behaves this way, so I don't know whether the procedure is reliable. Second, it seems to generate a lot of overhead.

Does anyone know how to list only the "bad" filenames?

Thanks a lot,
Clemente
 
Old 02-29-2008, 07:21 PM   #2
Poetics
Senior Member
 
Registered: Jun 2003
Location: California
Distribution: Slackware
Posts: 1,181

Rep: Reputation: 49
You could combine your 'find .' with grep's regular expression handling, or use find's own regular expression matching if you're comfortable with it.

If I'm not mistaken, \w will only match valid word characters. Thus, if you search for filenames that contain anything outside [\w\d\-\_\.], you'll have a fairly complete list of the files that don't work. If you have a lot of files with other legitimate characters in the name (#, @, $, et cetera), you can adjust your regular expression accordingly.

Hope that is a step in the right direction for you!
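
One way to sketch the whitelist idea above without depending on a particular regex dialect is a shell glob evaluated in the C locale, where find compares plain bytes. The function name find_odd_names and the allowed set (ASCII letters, digits, dot, underscore, hyphen) are assumptions for illustration, not something from the thread. Note that \w inside a bracket expression is generally only understood by Perl-style engines such as grep -P, which is why this sketch spells the set out.

```shell
#!/bin/sh
# find_odd_names DIR   (hypothetical helper name)
# Print every path under DIR whose final component contains at least one
# byte outside the allowed set: A-Z a-z 0-9 . _ -
# LC_ALL=C makes find compare bytes, so the encoding of the current
# locale does not influence the result.
find_odd_names() {
    LC_ALL=C find "$1" -name '*[!A-Za-z0-9._-]*'
}

# Example call (the path is an assumption):
#   find_odd_names /srv/share
```

This flags spaces and valid non-ASCII names too, so the allowed set would need widening if those are acceptable on the server.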
 
Old 03-01-2008, 07:34 AM   #3
Clemente
Member
 
Registered: Aug 2003
Distribution: Debian, Ubuntu
Posts: 188

Original Poster
Rep: Reputation: 30
Thank you very much, I will try this!
 
Old 03-01-2008, 11:27 PM   #4
JWPurple
Member
 
Registered: Feb 2008
Posts: 67

Rep: Reputation: 17
Have you tried ls -b? This will show "strange" chars in octal notation.
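
As a small illustration of that suggestion: in the C locale every byte above ASCII counts as non-graphic, so ls -b prints it as a backslash escape. The wrapper name show_escaped is hypothetical, and the behaviour assumed here is that of ls from GNU coreutils.

```shell
#!/bin/sh
# show_escaped DIR   (hypothetical helper name)
# List DIR with non-graphic characters written as C-style escapes
# (octal for raw bytes). Forcing LC_ALL=C means any byte above 0x7F
# is escaped, whether or not it is part of valid UTF-8.
show_escaped() {
    LC_ALL=C ls -b "$1"
}

# Example call (the path is an assumption):
#   show_escaped /srv/share
```

Piping the listing through grep '\\[0-3][0-7][0-7]' gives a rough filter for entries that needed an octal escape, although a name containing a literal backslash followed by digits would also match.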
 
Old 04-04-2011, 12:56 AM   #5
trendle
LQ Newbie
 
Registered: Apr 2011
Posts: 1

Rep: Reputation: 0
For those still interested...

You almost answered this yourself.
If...
find .
gives everything and
find . -iname '*'
gives good names .... then by extension trying
find . ! -iname '*'
does as you wish... it returns files with bad names.
But thanks, you showed me something more about the quirks of find that I didn't know.
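
The negated pattern appears to work because find's matcher can refuse a name that is not valid in the current locale's encoding, so the result may depend on the locale and the C library version. A locale-independent cross-check, sketched here under the assumption that good names should be valid UTF-8, is to pass each name through iconv. Both function names are hypothetical.

```shell
#!/bin/sh
# is_valid_utf8 STRING   (hypothetical helper name)
# Succeed if STRING decodes cleanly as UTF-8; iconv exits non-zero
# when it meets an illegal or incomplete byte sequence.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# list_invalid_utf8_names DIR   (hypothetical helper name)
# Print every path under DIR whose final component is not valid UTF-8.
# The check is repeated inline because the child sh started by find
# cannot see functions defined in this script.
list_invalid_utf8_names() {
    find "$1" -exec sh -c '
        for path in "$@"; do
            name=${path##*/}
            printf "%s" "$name" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 ||
                printf "%s\n" "$path"
        done
    ' sh {} +
}

# Example call (the path is an assumption):
#   list_invalid_utf8_names /srv/share
```

Unlike the character whitelist, this leaves correctly encoded non-ASCII names alone and reports only names whose bytes do not form valid UTF-8.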
 
  


All times are GMT -5.
