Renaming ~750,000 files. Thunar is unbearably slow.
I have a folder from my site that I need to pull information from, but I need to transfer it to Windows for editing. The problem is that the folder contains about 750,000 files and I can't transfer it to Windows because of characters that are illegal in Windows file names ("?", etc.). I tried Thunar and it's been trying to load the files into its bulk renamer for about 10 hours now. Is there a better way to rename those files? All I need to do is rename them sequentially to 1.txt, 2.txt, etc.
Thanks
EDIT: Alright, I wrote a little script that I thought would solve my problem. I'm using the mv command in a loop to rename them, but something's wrong. Here's the script if anyone can help me out.
Code:
for i in *.HTML; do let j+=1 ; mv $i file$j.txt ; done
I tried it on a test folder with 10 files and it sees them, counts them correctly, and generates the new names accordingly, but mv fails with "target 'file#.txt' is not a directory". What am I doing wrong? I'm running the command in a terminal from inside the folder containing the files.
Last edited by justiceorjustus; 01-05-2009 at 06:42 PM.
Reason: update
Are you sure you don't have a hidden file or folder in there? Try
Code:
ls -la | grep HTML
and see if any of the files start with a ".", or, more importantly, whether there are any hidden folders in there.
- Jim
Actually, the script I have works for my files; I was testing with a bunch of files with spaces in their names (irrelevant to what I'm actually renaming), so it will do the job. The thing is, it's still incredibly, incredibly slow because of the ridiculous number of files. I ran my script last night and now, about 18 hours later, it's still going. I don't know what's up with that.
Also, I don't have any files that start with a "." or any hidden folders. I think it's just the ridiculous number of them that's killing me.
EDIT: I just ran an ls on the folder and it's only renamed about 5,000 of them so far. Holy crap. Is there a faster way I can do this? The PC has a 2.0 GHz dual-core CPU and 3 GB of RAM. It shouldn't be going this slow.
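EDIT 2: I'm fairly sure the "not a directory" error came from the unquoted variable: with a space in a name, mv sees multiple arguments and treats the last one as a destination directory. Quoting should make the loop space-safe; a minimal sketch of what I assume the fixed version looks like:
Code:
j=0
for i in *.HTML; do
    let j+=1
    mv -- "$i" "file$j.txt"   # quotes keep a name with spaces as one argument
done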
Last edited by justiceorjustus; 01-06-2009 at 06:13 PM.
ok, so you know it successfully completed 5000. Try seeing how far it is again, to see if it's gotten farther than 5000. See if the procedure stalled somewhere. Did you background the task, or do you still have access to the terminal where you ran the script? Is it outputting any errors (possibly from the files with spaces)?
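Something like this gives a quick progress count, assuming the new names all follow the file<N>.txt pattern:
Code:
ls -f | grep -c '^file.*\.txt$'   # -f skips sorting, which matters with 750k entries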
There's a certain amount of overhead because bash forks a separate mv process for every file. Here's a quick/dirty Perl version that renames in-process. Probably faster.
Code:
#!/usr/bin/perl -w
use strict;                     # enforce declarations
use File::Copy;                 # provides move()

my ($dir, $file, $filename, $filext, $num);

$dir = "/home/chris/tmp";
$num = 0;

opendir(DIR, $dir) or die "Can't opendir $dir: $!\n";
chdir($dir)        or die "Can't chdir to $dir: $!\n";

while ( defined ($file = readdir DIR) )
{
    next if $file =~ /^\.\.?$/;          # skip curr, parent dir

    # Get filename components (assumes one dot per name)
    ($filename, $filext) = split(/\./, $file);

    next unless defined $filext;         # skip names with no extension
    next if $filext !~ /html/i;          # skip if not html/HTML

    $num++;
    move($file, "${filename}${num}.txt") or
        die "unable to move file $file: $!\n";
}
closedir(DIR);
Quote:
EDIT: I just ran an ls on the folder and it's only renamed about 5,000 of them so far. Holy crap. Is there a faster way I can do this? The PC has a 2.0 GHz dual-core CPU and 3 GB of RAM. It shouldn't be going this slow.
The task should be I/O-bound, so the speed of the processor really shouldn't have much effect, but 5,000 renames in 18 hours is still woeful. What type of filesystem are the files on, and what type of device is it?
You may want to check your system for performance problems with top, iostat, sar and similar utilities and look for something unusual.
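For example, something like this will show per-device utilization while the script runs (iostat is in the sysstat package):
Code:
iostat -x 5   # extended device stats every 5 seconds; watch the %util and await columns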
BTW, as it's all in one dir and the job is I/O-bound, I'd run multiple copies (of the Perl), one for each letter of the alphabet, i.e. one for a*.HTML, one for b*.HTML, one for c*.HTML, etc.
You've got CPU to burn. You might want to only do, e.g., half the alphabet at once. Just keep adding letters and watch the performance via top.
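A hypothetical launcher along those lines, assuming the Perl above is saved as rename.pl and modified to take a filename prefix as its first argument:
Code:
#!/bin/bash
# Hypothetical: assumes rename.pl reads a prefix from $ARGV[0] and only
# renames files whose names start with it.
for letter in a b c d e f g h i j k l m; do   # first half of the alphabet
    perl rename.pl "$letter" &                # one background worker per letter
done
wait                                          # block until every worker finishes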
Quote:
There's a certain amount of overhead because bash forks a separate mv process for every file. Here's a quick/dirty Perl version that renames in-process. Probably faster.
Nice! You were definitely right about it being quick and dirty. Finished renaming all of the files in about 45 minutes! Thank you!!