LinuxQuestions.org
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Software
Old 01-05-2009, 05:50 PM   #1
justiceorjustus
LQ Newbie
 
Registered: Jan 2009
Posts: 3

Rep: Reputation: 0
Renaming ~750,000 files. Thunar is unbearably slow.


I have a folder from my site that I need to pull information from, but I need to transfer it to Windows for editing. The problem is that the folder contains about 750,000 files, and I can't transfer it to Windows because of illegal characters ("?", etc.) in the file names. I tried Thunar, and it's been trying to load the files into the bulk renamer for about 10 hours now. Is there a better way I can rename those files? All I need to do is rename them sequentially to 1.txt, 2.txt, etc.

Thanks

EDIT: Alright, I wrote a little script that I thought would solve my problem. I'm using the mv command and a loop in order to rename them but something's wrong. Here's the script if anyone can help me out.

Code:
for i in *.HTML; do let j+=1 ; mv $i file$j.txt ; done
I tried it on a test folder with 10 files, and it sees them, sees the correct number of them, and outputs the new names accordingly - but then fails with "mv: target 'file#.txt' is not a directory". What am I doing wrong? I'm running the command in a terminal from within the folder containing the files.
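For reference, that error is the classic symptom of unquoted variable expansion: when $i contains spaces, mv receives several source arguments plus a final argument that isn't a directory. A minimal sketch of the same loop with quoting, run here in a throwaway directory with deliberately awkward names:

```shell
# demo in a throwaway directory; quoting "$i" keeps names containing
# spaces or '?' together as a single argument to mv
mkdir -p /tmp/rename_demo && cd /tmp/rename_demo
touch "a file.HTML" "b?file.HTML" "plain.HTML"

j=0
for i in *.HTML; do
  j=$((j+1))
  mv -- "$i" "$j.txt"    # '--' guards against names starting with a dash
done
ls
```

With the quotes in place, all three files are renamed, leaving 1.txt, 2.txt, and 3.txt.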

Last edited by justiceorjustus; 01-05-2009 at 06:42 PM. Reason: update
 
Old 01-05-2009, 09:23 PM   #2
jimbo1708
Member
 
Registered: Jan 2007
Location: Pennsylvania
Distribution: Ubuntu 8.10 Server/9.04 Desktop, openSUSE 11.1
Posts: 154

Rep: Reputation: 31
Are you sure you don't have a hidden file or folder in there? Try

Code:
ls -la | grep HTML
and see whether any of the files start with a "." or, more importantly, whether you have any hidden folders in there.

- Jim
 
Old 01-06-2009, 05:58 PM   #3
justiceorjustus
LQ Newbie
 
Registered: Jan 2009
Posts: 3

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by jimbo1708 View Post
Are you sure you don't have a hidden file or folder in there? Try

Code:
ls -la | grep HTML
and see whether any of the files start with a "." or, more importantly, whether you have any hidden folders in there.

- Jim
Actually, the script works for my files; I had been testing it with a bunch of files with spaces in their names (irrelevant to what I'm actually renaming), so it will do the job. The thing is, it's still incredibly, incredibly slow because of the ridiculous number of files. I ran the script last night and now, about 18 hours later, it's still going. I don't know what's up with that.

Also, I don't have any files that start with a "." and no hidden folders. I think it's just the ridiculous number of them that's killing me.

EDIT: I just ran an ls on the folder and it's only renamed about 5,000 of them so far. Holy crap. Is there a faster way I can do this? The PC has a 2.0 GHz dual-core CPU and 3 GB of RAM. It shouldn't be going that slow.

Last edited by justiceorjustus; 01-06-2009 at 06:13 PM.
 
Old 01-06-2009, 07:21 PM   #4
jimbo1708
Member
 
Registered: Jan 2007
Location: Pennsylvania
Distribution: Ubuntu 8.10 Server/9.04 Desktop, openSUSE 11.1
Posts: 154

Rep: Reputation: 31
OK, so you know it successfully completed 5,000. Check the progress again to see if it's farther than 5,000, in case the procedure stalled somewhere. Did you background the task, or do you still have access to the terminal where you ran the script? Is it outputting any errors (possibly from the files with spaces)?

- Jim
 
Old 01-06-2009, 07:46 PM   #5
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,388

Rep: Reputation: 2774
There's a certain amount of overhead because bash is interpreted and the loop forks a separate mv process for every file; Perl's File::Copy::move renames in-process. Here's a quick/dirty Perl version. Probably faster.

Code:
#!/usr/bin/perl -w

use File::Copy;         # provides move cmd
use strict;             # Enforce declarations

my (
    $dir, $file, $filename, $filext, $num
    );

$dir = "/home/chris/tmp";
$num = 0;

opendir(DIR, $dir) or die "Can't opendir $dir: $!\n";
chdir($dir) or die "Can't chdir to $dir: $!\n";
while( defined ($file = readdir DIR) )
{
    next if $file =~ /^\.\.?$/;     # skip curr, parent dir

    # Get filename components
    ($filename, $filext) = split(/\./, $file);
    next if !defined($filext) or $filext !~ /html/i;   # skip if not .html/.HTML

    $num++;
    move($file, "${num}.txt") or
                            die "unable to move file $file: $!\n";
}
 
Old 01-07-2009, 05:11 AM   #6
GazL
LQ Veteran
 
Registered: May 2008
Posts: 6,996

Rep: Reputation: 5130
Quote:
Originally Posted by justiceorjustus View Post
EDIT: I just ran an ls on the folder and it's only renamed about 5,000 of them so far. Holy crap. Is there a faster way I can do this? The PC has a 2.0GHz dual core, 3gb ram. It shouldn't be going that slow.
The task should be I/O-bound, so the speed of the processor really shouldn't have much effect, but 5,000 renames in 18 hours is still woeful. What type of filesystem are the files on, and what type of device is it?

You may want to check your system for performance problems with top, iostat, sar and similar utilities and look for something unusual.
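As a rough sketch of that kind of check: top ships with procps on virtually every distro, while iostat and sar come from the sysstat package, so the latter two are shown commented out:

```shell
# one-shot batch snapshot: load average at the top, busiest processes below
top -bn1 | head -15

# if procps' vmstat is available: the 'b' column counts processes
# blocked waiting on I/O, a hint that the disk is the bottleneck
# vmstat 2 3

# with the sysstat package installed:
# iostat -x 5 3    # per-device utilization; %util near 100 => disk saturated
# sar -d 5 3       # disk activity sampled at intervals
```

Persistently high I/O wait with an idle CPU points at the device or filesystem rather than the rename loop itself.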

Last edited by GazL; 01-07-2009 at 05:13 AM.
 
Old 01-07-2009, 06:22 PM   #7
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,388

Rep: Reputation: 2774
BTW, as it's all in one dir and I/O-bound, I'd run multiple copies (of the Perl), one for each letter of the alphabet, i.e. one for a*.HTML, one for b*.HTML, one for c*.HTML, etc.
You've got CPU to burn. You might want to only do, e.g., half the alphabet at once. Just keep adding letters and watch the performance via top.
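The fan-out pattern being described can be sketched as below. A per-letter `rename.pl` taking the letter as an argument is hypothetical (the script posted above hard-codes its pattern), so a stand-in `echo` keeps the sketch runnable:

```shell
# launch one background job per letter, then wait for the whole batch;
# swap the echo for the (hypothetical) per-letter rename script,
# e.g.  ./rename.pl "$letter" &
for letter in a b c d e f g h i j k l m; do
  echo "would process ${letter}*.HTML" &
done
wait    # returns once every background job has finished
echo "first half of the alphabet done"
```

Backgrounding with & and collecting with wait is the simplest fan-out; start with half the alphabet, watch top, and add the rest if the box copes.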
 
Old 01-07-2009, 06:46 PM   #8
justiceorjustus
LQ Newbie
 
Registered: Jan 2009
Posts: 3

Original Poster
Rep: Reputation: 0

Quote:
Originally Posted by chrism01 View Post
There's a certain amount of overhead because bash is interpreted and the loop forks a separate mv process for every file; Perl's File::Copy::move renames in-process. Here's a quick/dirty Perl version. Probably faster.

Code:
#!/usr/bin/perl -w

use File::Copy;         # provides move cmd
use strict;             # Enforce declarations

my (
    $dir, $file, $filename, $filext, $num
    );

$dir = "/home/chris/tmp";
$num = 0;

opendir(DIR, $dir) or die "Can't opendir $dir: $!\n";
chdir($dir) or die "Can't chdir to $dir: $!\n";
while( defined ($file = readdir DIR) )
{
    next if $file =~ /^\.\.?$/;     # skip curr, parent dir

    # Get filename components
    ($filename, $filext) = split(/\./, $file);
    next if !defined($filext) or $filext !~ /html/i;   # skip if not .html/.HTML

    $num++;
    move($file, "${num}.txt") or
                            die "unable to move file $file: $!\n";
}
Nice! You were definitely right about it being quick and dirty. Finished renaming all of the files in about 45 minutes! Thank you!!
 
Old 01-07-2009, 07:58 PM   #9
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,388

Rep: Reputation: 2774
anytime
 
  

