
Simple Mass File Renamer & Deleter [Python]

Posted 09-17-2008 at 01:01 PM by ghostdog74

I was so bored today that I started writing a command-line batch file renamer. It has been tested on Linux with Python 2.4.
Some of the simple things it can do are:
1) Change upper case to lower case and vice versa.
2) Change file names by pattern substitution.
3) Change file names by number sequence.
4) Insert a pattern at the front of file names.
5) Insert a pattern at the end of file names.
6) Simple sort by number.
7) Manual revert of changed files.
8) Deletion of files by pattern match only.
9) Ability to recurse into subdirectories.
10) Inverse pattern match, like grep -v.
11) Case-insensitive search.
12) Remove characters by position, eg 0:10 => remove from the start up to the 10th character.
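
As a quick illustration of item 6, here is a minimal sketch of a numeric-aware sort key. It is written for Python 3 (the script itself targets Python 2.4), and the name natural_key is my own, not something the script defines. The script uses a cookbook recipe for the same purpose; this is just a shorter way to show the idea.

```python
import re

def natural_key(name):
    """Split a name into text and number chunks so that "test10" sorts after "test2"."""
    # re.split with a capturing group alternates: text, digits, text, digits, ...
    parts = re.split(r"(\d+)", name)
    # odd positions hold the captured digit runs; compare those as integers
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]

names = ["test10.txt", "test2.txt", "test1.txt"]
print(sorted(names))                   # plain string sort puts test10 before test2
print(sorted(names, key=natural_key))  # numeric-aware sort
```

Like the -n option, this only gives sensible results when the names share the same structure.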

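Item 3 (number sequences) is probably the least obvious feature, so here is a rough Python 3 sketch of how a spec such as "foo001:100" can be split into prefix, start, end and suffix and then expanded with zero padding. The names SEQ_SPEC and sequence_names are hypothetical and the regular expression is my own assumption, not a copy of what the script does.

```python
import re

# <prefix><start>:<end><suffix>; the non-greedy prefix keeps leading zeros in <start>
SEQ_SPEC = re.compile(r"^(.*?)(\d+):(\d+)(.*)$")

def sequence_names(spec):
    """Yield the replacement strings for a spec such as "foo001:100"."""
    match = SEQ_SPEC.match(spec)
    if match is None:
        raise ValueError("expected <prefix><start>:<end><suffix>, got %r" % spec)
    prefix, start, end, suffix = match.groups()
    width = len(start)  # pad every number to the width of the start number
    first, last = sorted((int(start), int(end)))
    for number in range(first, last + 1):
        yield prefix + str(number).zfill(width) + suffix

print(list(sequence_names("###09:11@@@")))
```

Each matching file would then take the next string from the generator as the replacement for the old pattern.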

Code:
#!/usr/bin/env python
import sys,os,getopt,glob,string,re,operator,fnmatch,time

################################## FUNCTIONS #################################################
def usage(name):    
    print """
    %s
    Mini File Renamer and Deleter v1.0.2 - Copyright 2008
    Author:	ghostdog74 
    
    %s
    usage: %s [-h] [-D dir] [-t f|d ][-[s|p] oldpat] [-e newpat ] [ -i pattern ][ -b pattern ] [-d depth] [-Z|-v|-n|-r|-I] [-l] [files] 
    -D      Directory to start rename/delete. Use quotes for directories with spaces 
            If no directory is specified, current working directory is assumed
            Eg: Use -D c:\\test\\  (for Windows, end with trailing back slash)
    -h      Prints help page                
    -t      Type of files. Either files(f) or directories(d). Default is files, so can omit -t f
    -s      Sequence substitution. Specify pattern to be substituted. Must be specified with -e.
    -p      Pattern substitution. Must be specified with -e (pattern to change to)
            eg. 
                1) To change whole jpeg file name to upper case ==> -p ".*" -e "A-Z" "*.jpeg"
                2) To change the word "TEST" to lower case in jpeg file ==> -p "TEST" -e "a-z" "*.jpeg"
                3) To remove all numbers from directory name ==> -p "[0-9]+" -e '' -t d  -l "*" 
                4) To remove first character from file/directory ==>  -p "^." -e ''  -l "*.jpeg"  
                5) To remove special characters ==>   -p "[\"^']" -e "back" "*"       
                
    -e     * When used with -s, indicates ending sequence pattern. Can include alphanumeric characters. Must use ":" to specify range             
            Eg  
                1) -s "test" -e "01:10" will replace 'test' in files from '01' to '10'. If more than 1 file has 'test',
                    will go by sequence, ie '02' , '03' etc.
                2) -s "test" -e "###01:11@@@" will replace 'test' from '###01@@@' , followed by '###02@@@' and so on to ###11@@@            
           * When used with -p, indicates pattern to change to. 
           * When used with -c, specify -e "[A-Z]" to change to uppercase, -e "[a-z]" to change to lower case.
           
    -i      Insert pattern in front of file name.
    -b      Insert pattern at back of file name.
    -c      Remove characters in file name by position, position index start from 0. Always use -l to verify files to be changed.
            eg 
            1) -c 1 ==> remove 2nd character
            2) -c -1 ===> remove last character.
            3) -c -2: ===> remove from last 2nd character onwards.
            4) -c 3: ==> remove from 4th character onwards
            5) -c 4:10 ==> remove from 5th to 10th character
            6) -c :3 ==> remove from start to 3rd character.
            7) -c 1:3 -e "[A-Z]" ==> change positions specified to uppercase
                                    
    -l      List all files with pattern only. No renaming. Useful for verifying what will be changed.  
    -r      Used alone. Enable restoration of previous commands. eg %s -r            
    -n      Simple Numerical sort. Specify -n to turn numerical sorting on. Only works on files of the same structure.
    -d      # of directories down to do rename (default = 0). ie: Directory depth level
    -I      Case insensitive pattern search. eg  -p "rot" -e "" -I  -l "*.txt". Find rot,ROT,roT,RoT ..
    -v      Wildcard pattern reversal. eg -v "*.bat" : Files that don't end with .bat.
    -Z      Do deletion of files. 
            eg -d 4 -Z ".*01*"  -l -v  "*.txt" ==> delete all files that don't end with .txt and have the pattern "01" in the
                                                   file name, 4 levels deep into current directory
   [files]  List of files to be renamed/deleted. Can have wildcards. eg test*.txt
            To specify all files ==> "*". Will not work if not specified.
    """ % ("=" *100 , "=" * 100 ,name,name )
     

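def pathChecker(path):
    ''' Function to determine whether the path or file exists.
        Returns a tuple (depth, path, status): depth is the number of
        path separators in the path (an indicator of how deep it is),
        status is 0 or 1 when the path exists and -1 when it does not.
    '''    
    if os.path.exists(path):
        pathcount=path.count(os.sep) 
        if pathcount > 1:           
            return int(pathcount) , path  , 0
        # shallow path: still hand back the real path, so the caller's
        # options['-D'] = newpath does not end up as the integer 1
        else:return 1,path,1
    else:
        return 0,0,-1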

def combocheck(s,k):
    '''Function to check the options user keyed in against a set of bad options
        Input : s => bad options
                k => list of user supplied keys...eg -S -s -N..etc
    '''
    v = len(s); t = 0 ##store length of predefined bad options, t=0 to count matched bad ops
    for x in s:        
        if x in k:
            t = t + 1            
    ## if all bad options found            
    if v == t : return True
    else: return False


def doWalk(DIR=None,maxdepth=1,TYPE="f",ACTION="sub",patold=None,patnew=None,DEBUG=1,INVERSE=0,CASE=1,SORT=0,FileNamesArgs=None):
    ''' Traverse the directory specified until depth level,
        looking for files with the correct patterns and rename them
        accordingly
    '''
    GlobbedTypeList=[]
    AllFilesGlobbed=[]
  
    
    # convert into reg expression syntax in order to search. eg "*.txt" to ".*txt$"
    
    regex = fnmatch.translate(FileNamesArgs)
    reobj = re.compile(regex) 
    for ROOT,DIRECTORY,FILES in os.walk(DIR,True): 
        #do for files less or equal to maxdepth
        if ROOT.count(os.sep)  <= int(maxdepth):                        
            if FileNamesArgs is not None:     
                if INVERSE:            
                    for FI in os.listdir( ROOT ):
                        if not reobj.search(FI):
                            AllFilesGlobbed.append( os.path.join(ROOT,FI ))
                else:
                    try:
                        allfiles = glob.glob(os.path.join(ROOT,FileNamesArgs))
                    except Exception,e:
                        pass
                    else:
                        if allfiles:
                            for found in allfiles:    
                                if found not in AllFilesGlobbed:
                                    AllFilesGlobbed.append(found)
                            
    if TYPE=="d":                                         
        for dirname in AllFilesGlobbed:
            if os.path.isdir(dirname):
                if not dirname in GlobbedTypeList: 
                    GlobbedTypeList.append([dirname,dirname.count(os.sep)])
    else:
        for filenames in AllFilesGlobbed:
            if os.path.isfile(filenames):
                if not filenames in GlobbedTypeList:                    
                    GlobbedTypeList.append(filenames) 
        if SORT==1:
            GlobbedTypeList = sorted_copy(GlobbedTypeList)
        else:
            # plain alphabetical sort on the whole path (itemgetter(0) would compare only the first character)
            GlobbedTypeList=sorted(GlobbedTypeList)
    # do various actions     
    doAction(GlobbedTypeList,TYPE,ACTION,patold,patnew,CASE,SORT,DEBUG)
                
        
       
def brake():
    raw_input("Enter")
     
def  clearscreen(rows):
    for i in range(rows): print 
        
def rename(FROM,TO="",DEBUG=1):       
    if FROM == TO:return
    if DEBUG==0  :        
        if TO :            
            try:
                os.rename(FROM,TO)
            except Exception,e:
                print "Error : ",e
            else:
                print FROM , " is renamed to ", TO
                # store to restore file. simple mechanism. Use pickle/shelve ??
                open(restorefile,"a").write("""%s,%s\n""" %( TO,FROM ))
        elif not TO :
            print "Deleting " ,FROM
            if os.path.isdir(FROM):
                try:
                    os.removedirs(FROM) #or use os.rmdir
                except Exception,e:
                    print "Error: ",e
            elif os.path.isfile(FROM):
                try :
                    os.remove(FROM)
                except Exception,e:
                    print "Error: ",e
              
    else:
        
        print "==>>>> ", "[" ,FROM ,"]==>[",TO,"]"

def changecase(patnew,thefile):
    # Helper for -e "[A-Z]" / -e "[a-z]": returns the re-cased name, or None
    # when patnew does not ask for a case change.
    # (doAction currently inlines the same logic instead of calling this.)
    if re.search( "\[a-z\]|a-z" , patnew):
        return thefile.lower()
    elif re.search( "\[A-Z\]|A-Z" , patnew):
        return thefile.upper()
    return None
    
        
def doAction(FILES,TYPE="d", ACTION="sub", patold="" ,patnew="", CASE=1, SORT=0,DEBUG=1):
    '''Function to do sequence substitution
        Input:  FILES => A list of globbed files to be processed.
                TYPE => f = files, d = directories
                patold => The pattern in the file to be substituted
                patnew => The new pattern to replace the old
                DEBUG => 1 : Do a listing only
                            0 : Do substitution, and renaming of files
    '''      
    
    notfound=0 #flag for doing case change.
    if TYPE=="d":
        # if renaming directories, rename from the last(highest) level. So have to sort according to maxdepth
        FILES=sorted(FILES, key=(operator.itemgetter(1)),reverse=True)

    if CASE == 0:
        patold_re = re.compile(patold,re.IGNORECASE)
    elif CASE:
        patold_re = re.compile(patold)
        
    if ACTION=="seq":         
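        # see if in format 001:020 ...this indicates sequence.
        # The prefix group is non-greedy so the whole start number (leading zeros
        # included) lands in group 2; a greedy prefix swallows the zeros and breaks padding.
        patnew_re =re.compile("(.*?)(\d+)[:](\d+)(.*)")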
        seq = patnew_re.findall(patnew)[0]
        startseq=seq[1]; endseq=seq[-2]
        patendseqfront = seq[0]
        patendseqback = seq[-1]
        if endseq and int(endseq) < int(startseq) :
            startseq,endseq = endseq,startseq
        elif endseq and int(startseq) == int(endseq):
            endseq=startseq
        elif endseq =="": endseq=len(FILES)
        length_startseq = len(str(startseq))

    for fn in FILES:
        if TYPE=="d": FN=fn[0]
        elif TYPE=="f": FN=fn
        thepath , thefile = os.path.split(FN)            
        if ACTION=="insert":
            if patnew == "front":
                newname = patold+thefile
            elif patnew == "back":
                newname = thefile+patold
            rename(FN, os.path.join(thepath,newname),DEBUG )     
        elif ACTION=="char":
            # find <digit>:<digit>
            patdigit=re.compile("(-)*(\d*):(-)*(\d*)")
            b= list(thefile)
            if patnew and re.search( "\[a-z\]|a-z" , patnew) :
                caseflag=1
            elif patnew and re.search( "\[A-Z\]|A-Z" , patnew):
                caseflag=2
            else:caseflag=0
            
            # if single digit                
            if ":" not in patold and ( int(patold) < 0 or patold.isdigit() ) :
                if caseflag==1 :
                    b[int(patold)] = b[int(patold)].lower()
                elif caseflag==2:
                    b[int(patold)] = b[int(patold)].upper()
                else:
                    b.pop( int(patold) )
                newname = ''.join(b)               
            else:
                foundit = patdigit.findall(patold)[0]
                if foundit:
                    first = "".join(foundit[0:2])
                    second =  "".join(foundit[2:])
                    if first and not second:    
                    # if range is  eg 1:
                        if caseflag==1:                            
                            newname = thefile[0:int(first)]+thefile[int(first):].lower()
                        elif caseflag==2:
                            newname = thefile[0:int(first)]+thefile[int(first):].upper()
                        else:                      
                            newname = thefile[ : int(first) ]        
                    elif second and not first:  
                    # if range is eg :10
                        if caseflag==1:
                            newname = thefile[:int(second)].lower()+thefile[int(second):]
                        elif caseflag==2:
                            newname = thefile[:int(second)].upper()+thefile[ int(second) :]
                        else:
                            newname = thefile[ int(second) :]                        
                    elif first and second:    
                    # full range eg 4:10
                        if caseflag==1:
                            newname = thefile[0:int(first)]+thefile[int(first):int(second)].lower()+thefile[int(second):]
                        elif caseflag==2:
                            newname = thefile[0:int(first)]+thefile[int(first):int(second)].upper()+thefile[int(second):]
                        else:                          
                            # drop the slice; slicing also copes with negative end positions
                            newname = thefile[:int(first)]+thefile[int(second):]
            rename(FN, os.path.join(thepath,newname) ,DEBUG )
  
  
        elif patold_re.search(thefile)  : 
            if ACTION=="seq":    
                #string to replace.             
                repl= patendseqfront+str(startseq).zfill(length_startseq)+patendseqback
                newname = patold_re.sub(repl,thefile)    # use the compiled pattern so -I is honoured
                rename(FN, os.path.join(thepath,newname),DEBUG )      
                if int(startseq) == int(endseq):
                    print "\n==>Reached the ending sequence number...Exiting.."
                    break          
                #increment sequence
                startseq = int(startseq) + 1     
            elif ACTION=="sub":
                
                #changing case.  changing all filename to upper/lower case.
                try:
                    foundlowercase = re.findall( "\[a-z\]|a-z" , patnew)[0]                    
                except:
                    try:
                        founduppercase = re.findall( "\[A-Z\]|A-Z" , patnew)[0]
                    except: notfound=1
                    else:
                        newname = thefile.replace(thefile,thefile.upper())
                else:
                    newname = thefile.replace(thefile,thefile.lower())
                    
                if notfound:
                    newname = patold_re.sub(patnew,thefile) 
                rename(FN, os.path.join(thepath,newname),DEBUG ) 
            elif ACTION=="delete":
                rename(FN, None ,DEBUG )
                
# Taken from Python recipe                
def sorted_copy(alist):
    # http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52234
    indices = map(_generate_index, alist)
    decorated = zip(indices, alist)
    decorated.sort()
    return [ item for index, item in decorated ]
    
def _generate_index(str):
    """
    Splits a string into alpha and numeric elements, which
    are used as an index for sorting
    """
    #
    # the index is built progressively
    # using the _append function
    #
    index = []
    def _append(fragment, alist=index):
        if fragment.isdigit(): fragment = int(fragment)
        alist.append(fragment)

    # initialize loop
    prev_isdigit = str[0].isdigit()
    current_fragment = ''
    # group a string into digit and non-digit parts
    for char in str:
        curr_isdigit = char.isdigit()
        if curr_isdigit == prev_isdigit:
            current_fragment += char
        else:
            _append(current_fragment)
            current_fragment = char
            prev_isdigit = curr_isdigit
    _append(current_fragment)    
    return tuple(index)                

#--------------------------------------END Functions -------------------------#

## these are options not allowed
bad_options = [ ['-s','-p'],
                ['-Z','-p'],['-Z','-s'],['-Z','-c'],
                ['-c','-b'],['-c','-i'],['-c','-p'],['-c','-s'],
                #can add somemore...
              ]
                
################################## END FUNCTIONS ##################################################
    
if __name__ == '__main__':

    basename = os.path.basename(sys.argv[0])
    
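    # create restore directory if it doesn't exist
    restorepath = ".restore"
    # mode must be an octal literal (0777 in Python 2 syntax); decimal 777 sets the wrong permission bits
    if not os.path.exists(restorepath): os.mkdir(restorepath,0777)    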
    # design the restoration filename
    TIME=list(time.localtime())
    # get month and day into double digits string
    if len( str(TIME[1]) ) < 2: 
        TIME[1] = "0"+str(TIME[1])
    if len( str(TIME[2]) ) < 2:
        TIME[2] = "0"+str(TIME[2])
    TIME='-'.join(map(str,TIME))
    
    #name of the restore file
    restorefile  = os.path.join( restorepath , TIME+"-"+basename.split(".")[0])

    DEBUG=1
    INVERSE=0
    TYPE="f" #by default search files, not directories
    ACTION="sub"
    CASE=1 #case sensitive
    SORT=0 # alphabetical sort.
    
    try:
        # "h" must be listed here, otherwise -h is rejected as an unknown option
        opts, args = getopt.gnu_getopt (sys.argv[1:], "hD:s:e:p:P:d:t:i:b:c:Z:lInvr")
        if args is None or len(args) <= 0: args = None
        else: 
            function =args[0]
            FileNames=args[-1]
    except Exception:
        
        usage(basename)
        sys.exit(1)
    if opts == []:
        usage(basename)
        sys.exit(1)

    options = dict(opts) ### convert to dict, easier to manipulate..i guess    
    keys = options.keys() ##get keys from options dictionary for checking

    if options.has_key('-h'):
        usage(basename)
        sys.exit()
           
    #restoration key       
    if options.has_key('-r'):            
        h={};rh={}
        print "Available restore files: "
        for num,rfile in enumerate(os.listdir(restorepath)):
            num=num+1
            h[str(num)] = rfile
        h[str(num+1)]="Exit" #give the last choice
        while 1:       
            for k in sorted(h.keys()):
                print "%s) %s" % ( k,h[k])         
            choice = raw_input("Enter your choice( eg 1,2 ) to view: ")
            try :
                int (choice)
            except:
                print "Choose again"
            else:
                ofilename = h[choice]
                if ofilename == "Exit": sys.exit()
                print "0o0o0o0 ... opening %s ..... 0o0o0o0o "% ofilename
                for n,olines in enumerate(open( os.path.join(restorepath,ofilename))):
                    From,To=olines.strip().split(",")
                    n=n+1
                    rh[n] = [From,To]
                rh[n+1]=["All","Original"] 
                while 1:
                    for k in sorted(rh.keys()):
                        print "%s) Restore %s to-->>> %s" %( k,rh[k][0],rh[k][1])                         
                    restoreyesno = raw_input( "Continue to restore [y|n]?: ")
                    if restoreyesno in ["n","N"]: break
                    elif restoreyesno in ["y","Y"]: 
                        rchoice = raw_input("Enter choice to restore: ")
                        for n, olines in enumerate(open( os.path.join(restorepath,ofilename))):
                            From,To = olines.strip().split(",")
                            n=n+1
                            if "All" in rh[int(rchoice)] or "Original" in rh[int(rchoice)] :  
                                try:
                                    os.rename(From,To)
                                except Exception,e:
                                    print "Error restoring: ",e
                            
                            elif n==int(rchoice):
                                try:
                                    os.rename(From,To)
                                except Exception,e:
                                    print "Error restoring: ",e
        
    # directory key    
    if options.has_key('-D') and options['-D'] != []:
        depthcnt ,newpath, ret = pathChecker(options['-D'])    #check the root, whether exists   
        if ret == -1:
            print "%s does not exist. " % (options['-D'] )
            sys.exit(2)
        options['-D'] = newpath
    else :        
        # if not -D specified, take current directory
        depthcnt ,newpath, ret = pathChecker(os.getcwd())        
        options['-D'] = os.getcwd()      

    DIR=options['-D']
    

            
    # check bad options...
    for bad_ops in bad_options :        
        if combocheck( bad_ops, keys) :
            usage(basename)
            sys.exit(0)      
    
    # Check for maxdepth
    if options.has_key('-d'): 
        maxdepth = int(options['-d']) + depthcnt
    else: 
        maxdepth = depthcnt
    
    # check for list flag - debug mode , 0 for commit.
    if not options.has_key('-l'): DEBUG = 0 
    
    # check for inverse pattern search,similar to grep's -v
    if options.has_key('-v'): INVERSE = 1
    
    # check numerical sorting
    if options.has_key('-n'): SORT = 1
    
    #check file type, whether search file or directory    
    if options.has_key('-t'): 
        TYPE=options['-t']

    if options.has_key('-i')  :
        patold=options['-i']
        patnew="front"
        ACTION="insert"
        
    #insert at back of file name
    if options.has_key('-b')  :
        patold=options['-b']
        patnew="back"
        ACTION="insert"
                    
    if options.has_key('-c') and options.has_key('-e') :                    
        patold=options['-c']        
        patnew=options['-e']
        ACTION="char"
    elif options.has_key('-c') and not options.has_key('-e'):
        patold=options['-c']        
        patnew=""
        ACTION="char"        
                    
    if options.has_key('-s') and options.has_key('-e') : 
        patold=options['-s']
        patnew=options['-e']        
        if patnew == "" : usage(basename); sys.exit(1)
        ACTION="seq"
    elif options.has_key('-s') and not options.has_key('-e'):usage(basename) ;sys.exit()
        
    
    if options.has_key('-p') and options.has_key('-e') : 
        patold=options['-p'];patnew=options['-e']
    elif options.has_key('-p') and not options.has_key('-e'): usage(basename) ;sys.exit()
    
    if options.has_key('-v'):
        INVERSE=1
    
    # ignore case sensitivity
    if options.has_key('-I'): CASE=0
        
    if options.has_key('-Z'):          
        patold=options['-Z']
        if not patold : usage(basename);sys.exit(1)
        patnew=""
        ACTION="delete"
        
    # do the walking.
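    try:
        doWalk(DIR,maxdepth,TYPE,ACTION,patold,patnew,DEBUG,INVERSE,CASE,SORT,FileNames)
    except Exception,e:
        # report the failure instead of silently falling back to the help text
        print "Error: ",e
        usage(basename)
        sys.exit(1)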
Quote:
Code:
Some examples:
1) Changing cases:
   a) Change case of 2nd to 3rd character of all files starting with "test" to upper case.
      ---> filerenamer.py -c 1:3 -e "[A-Z]" -l "test*"
   b) Change case from the 3rd character up to (but not including) the 4th-last character of all files to lower case      
      ---> filerenamer.py -c 2:-4 -e "[a-z]" -l "*"
   c) Change file name to all upper case
      ---> filerenamer.py -c 0: -e "[A-Z]" -l "*"

2) Change file names by substitution:
   a) Change the word "test" to "foo" for all files starting with "test"
      ---> filerenamer.py -p "test" -e "foo" -l "test*"

3) Change file names to a number sequence
   a) Change the word "test" in files starting with test to number sequence 001 to 100 
      ---> filerenamer.py -s "test" -e "001:100" -l "test*"
   b) Change the word "test" in files starting with test to "foo" and number sequence 001 to 100
      ---> filerenamer.py -s "test" -e "foo001:100" -l "test*"
           * changes testbar.txt to foo001bar.txt 
   c) Change the word "test" in files starting with test to number sequence 001 to 100 followed by "foo"
      ---> filerenamer.py -s "test" -e "001:100foo" -l "test*"
           * changes testbar.txt to 001foobar.txt 

4) Removing characters by position
   a) Remove the first 5 characters for files starting with "test"
      ---> filerenamer.py -c 0:5 -l "test*"
   b) Remove the last 5 characters for files starting with "test"
      ---> filerenamer.py -c -5: -l "test*"
   c) Remove 2nd character from all files
      ---> filerenamer.py -c 1 -l "*"

   NB: uses the python indexing convention.
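
To make the indexing convention concrete, here is a small Python 3 sketch (the script itself is Python 2.4) of how a -c style position spec maps onto ordinary list indexing and slicing. The function name remove_chars is hypothetical and this is a simplified illustration, not the code the script runs.

```python
def remove_chars(name, spec):
    """Remove the characters selected by spec: a single index such as "1" or "-1",
    or a slice such as "0:5", "3:" or "-5:"."""
    chars = list(name)
    if ":" not in spec:
        # single position: drop exactly one character
        chars.pop(int(spec))
        return "".join(chars)
    start_text, end_text = spec.split(":", 1)
    # an empty side of the colon means "open ended", exactly as in a Python slice
    start = int(start_text) if start_text else None
    end = int(end_text) if end_text else None
    del chars[start:end]
    return "".join(chars)

for spec in ("0:5", "-5:", "1", "-1"):
    print(spec, "=>", remove_chars("testfile.txt", spec))
```

Because the end position is exclusive, "0:5" removes positions 0 to 4, which is the first five characters.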