Old 01-27-2014, 12:56 AM   #1
jojanmpaul
Member
 
Registered: Sep 2012
Location: Bangalore
Posts: 56

Rep: Reputation: Disabled
find and delete .xxx files.


I need to find and delete files with the extensions *.doc, *.docx, *.xls, *.xlsx, and *.fla that have special characters in their file names.
 
Old 01-27-2014, 01:16 AM   #2
AnanthaP
Member
 
Registered: Jul 2004
Location: Chennai, India
Distribution: UBUNTU 5.10 since Jul-18,2006 on Intel 820 DC
Posts: 607

Rep: Reputation: 127Reputation: 127
What to do. Everyone has a need.

What did you try?

In GUI (linux or windoze)
Find the files, select and delete.

If in linux CLI.
The solution lies in words found in your question by finding the files and piping the results to a command that deletes the file.

OK
 
Old 01-27-2014, 02:46 AM   #3
SAbhi
Member
 
Registered: Aug 2009
Location: Bangaluru, India
Distribution: CentOS 6.5, SuSE SLED/ SLES 10.2 SP2 /11.2, Fedora 11/16
Posts: 516

Rep: Reputation: 58
something like :
Code:
find /dir/path/ -type f -iname "*.xxx" -exec command {} \;
would work for you.
I would suggest running rm with -i (interactive mode) so that you decide the action every time, since I suspect you have never come across the find command before.

NOTE: please don't run any command that deletes something unless you are quite sure about what it does.

Last edited by SAbhi; 01-27-2014 at 02:48 AM.
 
Old 01-27-2014, 05:43 AM   #4
jojanmpaul
Member
 
Registered: Sep 2012
Location: Bangalore
Posts: 56

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by AnanthaP View Post
What to do. Everyone has a need.

What did you try?

In GUI (linux or windoze)
Find the files, select and delete.

If in linux CLI.
The solution lies in words found in your question by finding the files and piping the results to a command that deletes the file.

OK
I am using the following command now,

find / -name "*.xls" -exec rm -rf '{}' \;

but I want to script it, and it should stay on the safe side with files that have special characters in their names.
 
Old 01-27-2014, 06:34 AM   #5
GNU/Linux
Member
 
Registered: Sep 2012
Distribution: Slackware-14
Posts: 78

Rep: Reputation: Disabled
There might be a better way. Just to be on the safe side, either leave the remove command out and just see which files 'find' spits out, or, instead of removing the files in one go, 'mv' the matched files into a /tmp/toremove directory. Then just delete /tmp/toremove once you have checked it.
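
A minimal sketch of that two-step approach, assuming GNU find and GNU mv (the -t option). The scratch directories and file names are made up so that it can be tried without touching real data; `holding` stands in for /tmp/toremove.

```shell
# Hypothetical scratch tree for a safe trial run.
src=$(mktemp -d)
holding=$(mktemp -d)          # stands in for /tmp/toremove
touch "$src/report.xls" "$src/notes.txt" "$src/odd name.xls"

# Step 1: dry run. Only list what would be affected.
find "$src" -type f -name '*.xls' -print

# Step 2: move the matches aside instead of deleting them.
# "-exec ... {} +" hands the names over as arguments, so spaces are safe.
find "$src" -type f -name '*.xls' -exec mv -t "$holding" -- {} +

# Step 3: review the holding directory before removing it by hand.
ls -A "$holding"
```

One caveat with this sketch: files with the same name from different directories would collide in a single holding directory, so it only suits a quick trial.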
 
Old 01-27-2014, 08:57 AM   #6
mina86
Member
 
Registered: Aug 2008
Distribution: Slackware
Posts: 354

Rep: Reputation: 148Reputation: 148
Code:
find -name \*.xls -delete
What do you mean by “having special characters in its file name”?
 
Old 01-27-2014, 12:36 PM   #7
rtmistler
Senior Member
 
Registered: Mar 2011
Location: Milford, MA. USA
Distribution: MontaVista, Ubuntu, MINT
Posts: 1,017
Blog Entries: 7

Rep: Reputation: 447Reputation: 447Reputation: 447Reputation: 447Reputation: 447
Quote:
Originally Posted by jojanmpaul View Post
I am using the following command now,

find / -name "*.xls" -exec rm -rf '{}' \;

but I want to script it, and it should stay on the safe side with files that have special characters in their names.
GNU/Linux had a good suggestion: don't remove files right away; move them to a reserved directory so you can test your script over time.

Several Words of Caution!

I'm concerned that you're doing this from the root path, "/". Everything of concern to logged-in users should be within either their /home/<username> directory structure or a common directory structure, if you have several users who collaborate on shared data. That would not be any of /proc, /sys, /usr, /lib, /bin, /etc, or /media. An important note here: most distributions auto-mount media plugged into them and create sub-directories under /media. So you could have, say, a multi-terabyte USB backup disk plugged in, and your find -exec command would find and delete XLS files on that extra drive, be it a huge backup drive or a thumb drive. Basically, any media which is mounted read-write.

If you ran this command as a non-root user, you'd get a very long list of errors, which you'd then be tempted to filter out, and that is never the best choice. Plus there are trees such as /proc and /sys which do not hold real files; they exist to expose kernel interfaces. Not to say that the system directories would have XLS files, but why descend into those paths, taking a ton of time, raising the risk of a user-caused oops, and using CPU cycles to search where it shouldn't be searching?

Years ago, a common piece of nasty advice intended to ruin Unix systems was to tell the person to log in as root, cd / and then type rm -rf * (Don't EVER do that or even try it!), which would erase the entire directory structure. And of course that would still work today, thus ruining your Linux system.

Further, what's to say that there isn't some form of spreadsheet or documentation in an XLS format which other software put there for when you look at their documentation?

What's to stop you if you decide for yourself, "Well ... this works great in cleaning up my XLS files, I'll now do that with DOC files, TXT files ..."?

What if you make a script mistake and write it where it will remove all files accidentally, like my nasty advice example above?

A final thought: XLS files aren't huge, not like images or movies. Consider leaving them be.
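
To make the scope concerns above concrete, here is a small sketch that searches one directory you own rather than "/", stays on one filesystem, and only lists regular files. The scratch tree is hypothetical; -xdev and -type f are standard find options.

```shell
# Hypothetical tree; replace with the one directory you actually own.
root=$(mktemp -d)
mkdir -p "$root/docs" "$root/old.xls"      # note: a DIRECTORY named old.xls
touch "$root/docs/a.xls" "$root/old.xls/keep.txt"

# -xdev   : stay on the starting filesystem (skips other mounts in the tree)
# -type f : regular files only, so the directory old.xls is never matched
# -print  : list only; nothing is removed here
matches=$(find "$root" -xdev -type f -name '*.xls' -print)
printf '%s\n' "$matches"
```

The directory named old.xls is the kind of match that would make rm -rf take out more than intended, which is one more reason to keep -type f in the expression.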

Last edited by rtmistler; 01-27-2014 at 12:40 PM.
 
Old 01-27-2014, 11:11 PM   #8
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,147

Rep: Reputation: 330Reputation: 330Reputation: 330Reputation: 330
And, in any case, you do NOT want to use the recursive option (-r) in your rm command. If find only ever returns files, that might not be a problem, but find will happily return anything that matches your -name specification, directories included.

As to your "special character" need, try find ./ -type f -name '*c*.xxx' -print where "c" is the character.

An example:
Code:
$ ls
666.nii      Calibre Library      Desktop    libpeerconnection.log  Public     Testing
AndroidSDK   Code                 Documents  Me.png                 R          tmp
awkprof.out  conky-dlab.tar.gz    Downloads  Music                  rpmbuild   Videos
bin          conkyrc-brenden.txt  dustbin    orphan.lst             Scilab     x.txt
Books        conkyrc-vert.txt     fp.txt     Pictures               Scripts    yum.error.msg
Calibre      conkyrc-wminfo       FreeMat    PTrenholme.revoke      Templates  yum.log
$ find ./ -maxdepth 1 -type f '(' -name '*-*.txt' -o -name '*.*.msg' ')' -print
./yum.error.msg
./conkyrc-brenden.txt
./conkyrc-vert.txt
Note that I used the -print option instead of -delete for testing. The -o between the two -name tests, inside the parentheses, is an "or" operator. Also, the parentheses must be separated from their contents by blanks.
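
A short sketch of why the parentheses matter: in find, the implicit "and" between tests binds tighter than -o, so without grouping the -type f and -print end up attached to only one side. The file names here are made up for the demonstration.

```shell
dir=$(mktemp -d)
touch "$dir/a-b.txt" "$dir/x.y.msg"
mkdir "$dir/c-d.txt"        # a directory that matches the first pattern

# Without parentheses this parses as:
#   ( -type f -name '*-*.txt' ) -o ( -name '*.*.msg' -print )
# so only the .msg match is printed.
ungrouped=$(find "$dir" -maxdepth 1 -type f -name '*-*.txt' -o -name '*.*.msg' -print | sort)

# With parentheses, -type f and -print apply to both patterns.
grouped=$(find "$dir" -maxdepth 1 -type f '(' -name '*-*.txt' -o -name '*.*.msg' ')' -print | sort)

printf 'ungrouped:\n%s\ngrouped:\n%s\n' "$ungrouped" "$grouped"
```

The same grouping rule applies if -print is later swapped for -delete, where a missing pair of parentheses would be much more costly.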

Last edited by PTrenholme; 01-27-2014 at 11:13 PM.
 
Old 01-28-2014, 10:38 PM   #9
jojanmpaul
Member
 
Registered: Sep 2012
Location: Bangalore
Posts: 56

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by mina86 View Post
Code:
find -name \*.xls -delete
What do you mean by having special characters in its file name?
As I explained, I am planning to find all files which have special characters or non-printable characters in their file names. Those files are uploaded by users and need to be removed to preserve the integrity of my data.
 
Old 01-29-2014, 01:24 AM   #10
jojanmpaul
Member
 
Registered: Sep 2012
Location: Bangalore
Posts: 56

Original Poster
Rep: Reputation: Disabled
I appreciate your kind approach in guiding me in the right direction. Thank you very much.

I need your help: how do I find both the special and the non-printable characters that are present in file names?

Examples:

A file name with special characters:
/var/lib/html/qwerty/no-l~!atin^1.doc

A file name with non-printable characters (non-ASCII characters, or ones that cannot be typed on the keyboard, such as regional language letters):
/var/lib/html/bd/keymaps/dbあ.fla

A file name with both non-printable and special characters:
/var/lib/html/bd/$sdf.xls


I am currently using the command below to find the special (ASCII) characters:
find /var/www/html/content/ -name '*~*' -o -name '*^*'

If there is an alternative way that covers both cases, please let me know.
 
Old 01-29-2014, 02:26 AM   #11
pan64
Senior Member
 
Registered: Mar 2012
Location: Hungary
Distribution: debian i686 (solaris)
Posts: 4,500

Rep: Reputation: 1221Reputation: 1221Reputation: 1221Reputation: 1221Reputation: 1221Reputation: 1221Reputation: 1221Reputation: 1221Reputation: 1221
I would try a scripting language, for example Perl. Run a recursive read (like find does) and you can specify any kind of filter to keep or remove files. You can then collect those files into a directory, check them twice, and finally remove them. But be careful about moving important files; that can make your system unresponsive or unusable.
 
Old 01-29-2014, 07:14 AM   #12
mina86
Member
 
Registered: Aug 2008
Distribution: Slackware
Posts: 354

Rep: Reputation: 148Reputation: 148
Code:
find -type f \( -name '*[^-_.a-zA-Z0-9]*' -o -name '.*' -o -name '*.' \) -delete
This will delete all regular files whose names contain any character outside of “-_.a-zA-Z0-9”, or that begin or end with a dot. Note that there is no extension filter here, so dotfiles match as well.
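
For a trial run of the same idea, here is a hedged variant that prints instead of deleting and only looks at the five extensions from the first post. It assumes "special" means any byte outside “-_.a-zA-Z0-9”, drops the leading/trailing dot tests, and uses made-up sample names modelled on the ones in this thread.

```shell
dir=$(mktemp -d)
touch "$dir/plain_name-1.doc" "$dir"/'no-l~!atin^1.doc' "$dir"/'$sdf.xls'
touch "$dir/$(printf 'db\343\201\202.fla')"   # UTF-8 bytes of a non-ASCII letter
touch "$dir/weird~name.txt"                   # other extension, must be ignored

# LC_ALL=C makes the bracket ranges plain ASCII, so every byte of a
# multi-byte letter falls outside the allowed set.
# [!...] is the POSIX spelling of a negated set; GNU find also accepts [^...].
flagged=$(LC_ALL=C find "$dir" -type f \
    \( -iname '*.doc' -o -iname '*.docx' -o -iname '*.xls' \
       -o -iname '*.xlsx' -o -iname '*.fla' \) \
    -name '*[!-_.a-zA-Z0-9]*' -print | LC_ALL=C sort)
printf '%s\n' "$flagged"
```

Only once the printed list looks right would it make sense to replace -print with a move or delete action.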
 
Old 01-29-2014, 07:44 AM   #13
rtmistler
Senior Member
 
Registered: Mar 2011
Location: Milford, MA. USA
Distribution: MontaVista, Ubuntu, MINT
Posts: 1,017
Blog Entries: 7

Rep: Reputation: 447Reputation: 447Reputation: 447Reputation: 447Reputation: 447
Quote:
Originally Posted by jojanmpaul View Post
A file name with special characters:
/var/lib/html/qwerty/no-l~!atin^1.doc

A file name with non-printable characters (non-ASCII characters, or ones that cannot be typed on the keyboard, such as regional language letters):
/var/lib/html/bd/keymaps/dbあ.fla

A file name with both non-printable and special characters:
/var/lib/html/bd/$sdf.xls


I am currently using the command below to find the special (ASCII) characters:
find /var/www/html/content/ -name '*~*' -o -name '*^*'
I'm very amateurish with pattern matching. Note that find's -name test takes shell-style glob patterns, not regular expressions (those go with -regex). I do know from experience that you can copy regional language characters, as well as the substitutes shown for non-printable characters, encase them in double quotes with "*" wildcards around them, and successfully find what you want. The problem is one of repeatability: if your files can't all be classified under the same search spec, you'll always have that problem.
 
Old 01-30-2014, 11:17 AM   #14
yooden
Member
 
Registered: Dec 2013
Distribution: Debian Wheezy/Jessie # XFCE
Posts: 52

Rep: Reputation: Disabled
Quote:
Originally Posted by jojanmpaul View Post
I need your help: how do I find both the special and the non-printable characters that are present in file names?
Depending on what you want to do with them, I'd just grep them. If you need to restrict names to a certain set, reverse-grep (grep -v).
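
A rough sketch of both variants, assuming GNU grep (for -z) and made-up file names. The allowed character set in the second command is only an example; "/" and "." are in it because find prints relative paths such as ./name.

```shell
dir=$(mktemp -d)
touch "$dir/ok-file.xls" "$dir"/'bad^name.xls'
touch "$dir/$(printf 'caf\303\251.doc')"      # non-ASCII (UTF-8) letter

# grep: keep names containing a byte outside printable ASCII (space..tilde).
# -print0 with -z keeps each name as one record; -a forces text mode.
nonascii=$(cd "$dir" && find . -type f -print0 |
    LC_ALL=C grep -az '[^ -~]' | tr '\0' '\n')

# reverse grep: drop names built only from the allowed set;
# whatever remains breaks the naming rule.
outside=$(cd "$dir" && find . -type f -print0 |
    LC_ALL=C grep -azv '^[-_./a-zA-Z0-9]*$' | tr '\0' '\n' | LC_ALL=C sort)

printf 'non-ASCII:\n%s\noutside allowed set:\n%s\n' "$nonascii" "$outside"
```

The final tr is only there for display; for further processing it would be safer to keep the NUL separators and feed them to xargs -0.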
 
Old 02-06-2014, 09:15 AM   #15
AnanthaP
Member
 
Registered: Jul 2004
Location: Chennai, India
Distribution: UBUNTU 5.10 since Jul-18,2006 on Intel 820 DC
Posts: 607

Rep: Reputation: 127Reputation: 127
What's the problem with *.xls? The pattern matches every file whose name ends in the 4 bytes ".xls" (this being Linux, there is no real extension), but be sure to handle names ending in both ".xls" and ".XLS" by using -iname instead of -name.

OK
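
A tiny illustration of the -name versus -iname difference, using three made-up file names; -iname is available in GNU and BSD find.

```shell
dir=$(mktemp -d)
touch "$dir/lower.xls" "$dir/UPPER.XLS" "$dir/Mixed.Xls"

# -name is case-sensitive; -iname ignores case.
with_name=$(find "$dir" -type f -name '*.xls' | wc -l | tr -d ' ')
with_iname=$(find "$dir" -type f -iname '*.xls' | wc -l | tr -d ' ')
echo "name: $with_name, iname: $with_iname"
```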
 
  

