Old 03-01-2011, 12:18 PM   #1
ppostma1
File Replication Routine


I have been scripting inotify (using inotify-tools) to do file replication.
http://github.com/rvoicilas/inotify-tools/wiki

It works very well for casual use, such as editing or uploading files. But I have to make sure it stands up to a "tar bomb" when we install a new package, e.g.: tar -xzf really_big_php_project.tar.gz

Code:
#!/bin/bash
# bash rather than sh: the [ ... == ... ] tests below are a bashism

# get the current path
CURPATH=`pwd`
#not used:
CREATE="CREATE"
ISDIR="ISDIR"
DELETE="DELETE"

inotifywait -mr --timefmt '%d/%m/%y %H:%M' --format '%T %w %f %e' \
 -e close_write,delete,create /root/public_html/ | while read date time dir file evnt; do

if [ "$file" == "4913" ]; then
   continue
fi

   FILECHANGE=${dir}${file}
#if [ -f $FILECHANGE ]; then
#  echo "$FILECHANGE not found"
#  continue
#fi

   # convert absolute path to relative
   FILECHANGEREL=${FILECHANGE#"$CURPATH"/}

   var=$(echo $evnt | awk -F"," '{print $1,$2}')
   set -- $var

   # echo "${evnt} ($1 $2) was done on $FILECHANGEREL - ${file} in ${dir}"

   if [ "$1" == "DELETE" ]; then
       if [ "$2" == "ISDIR" ]; then
           echo "rmdir ${FILECHANGE}"
           ssh root@web2.home "rmdir ${FILECHANGE}"
       else
            echo "rm -f ${FILECHANGE}"
           ssh root@web2.home "rm -f ${FILECHANGE}"
       fi
   else
       if [ "$1" == "CLOSE_WRITE" ]; then
           if [ -f "$FILECHANGE" ]; then
             echo "rsync ${FILECHANGE}"
             rsync --relative -vrae 'ssh -p 22' "$FILECHANGEREL" root@web2.home:/root/

             echo "checking: -d: $date -t: $time -dir: $dir -f: $file -ev: $evnt -ch: $FILECHANGE -rel: $FILECHANGEREL"
           fi
       else
           if [ "$1" == "CREATE" ]; then
               if [ "$2" == "ISDIR" ]; then
                    echo "mkdir ${FILECHANGE}"
                   ssh root@web2.home "mkdir ${FILECHANGE}"
               fi
           fi
       fi
   fi

echo ""

done
Obviously, it has to complete the current file transfer before it goes back to blocking and waiting for the next file event. As a result, out of 200 files only 9 get transferred. (Yes, I know rsync will handle directories and deletes, but the direct command is more efficient.)

So I switched to a threaded approach (one background process per event):
Code:
#!/bin/sh

# get the current path
CURPATH=`pwd`

inotifywait -mr --timefmt '%d/%m/%y %H:%M' --format '%T %w %f %e' \
 -e close_write,delete,create /home/rbcmin/public_html/ | while read date time dir file evnt; do

./handle.bs $CURPATH $date $time $dir $file $evnt &

done
This returns to the blocking wait much more quickly and transferred 120 of the 200 files, but it was considerably hard on the network, slowing our web access for a moment.

I'm not sure how to get a script to catch ALL files, queue them, and transfer them one by one, non-threaded, to be gentle on the network. Something along the lines of:
inotifywait -mr --timefmt '%d/%m/%y %H:%M' --format '%T %w %f %e' \
-e close_write,delete,create /root/public_html/ > file > while read date
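
A minimal sketch of that queueing idea, written in Python rather than sh (a swapped-in language, not the original script). One thread does nothing but read event lines and put them on a queue, so it goes straight back to waiting, while a single worker takes events off the queue and handles them one at a time. The names run_serial and handle, and the sample lines, are hypothetical; in real use the lines would come from the stdout of the inotifywait command above.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the event stream


def run_serial(lines, handle):
    """Queue every line from `lines`, then handle them one at a time, in order.

    `lines` is any iterable of event lines (hypothetically the stdout of
    inotifywait); `handle` is a hypothetical callback that does the transfer.
    """
    events = queue.Queue()  # unbounded, so the reader never waits on a transfer

    def reader():
        for line in lines:
            events.put(line.rstrip("\n"))
        events.put(_DONE)

    t = threading.Thread(target=reader, daemon=True)
    t.start()

    handled = 0
    while True:
        item = events.get()  # blocks until an event is available
        if item is _DONE:
            break
        handle(item)  # only one transfer runs at any time
        handled += 1
    t.join()
    return handled


if __name__ == "__main__":
    sample = ["a.php CLOSE_WRITE,CLOSE\n", "b.php CLOSE_WRITE,CLOSE\n"]
    run_serial(sample, lambda line: print("would sync:", line))
```

Because the queue is unbounded, a burst of events costs memory rather than missed reads, and the network only ever sees one transfer at a time.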
 
Old 03-01-2011, 06:54 PM   #2
jmajor
Your solution might be to write a small C program in which thread 1 listens for changes and builds a list, and thread 2 picks up the list as often as it can and fires off an rsync. This would let you use rsync's bandwidth limit feature. Thread 1 would not miss events, and you would not kill the CPU / RAM / swap with one thread per file.

In the tar bomb example, thread 2 would pick up a few files as it starts; when those were done, it would return to find a big batch waiting and hand rsync a good-sized batch.
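
The "thread 2" half of this can be sketched as follows, in Python rather than C to keep it short. It waits for one queued path, grabs whatever else has piled up in the meantime, and builds a single bandwidth-limited rsync call for the whole batch. The function names, the 500 limit and the max_items cap are assumptions for illustration; --bwlimit and --files-from are standard rsync options, but check the man page of your rsync for their exact units and behaviour.

```python
import queue


def drain_batch(events, max_items=1000):
    """Block for the first queued path, then take whatever else is already waiting."""
    batch = [events.get()]  # wait until there is at least one path
    while len(batch) < max_items:
        try:
            batch.append(events.get_nowait())
        except queue.Empty:
            break  # queue is empty for now; ship what we have
    return batch


def rsync_batch_command(src_root, dest, bwlimit=500):
    """Build one rsync call for a whole batch; the paths are fed on stdin."""
    return [
        "rsync", "-a",
        "--bwlimit=%d" % bwlimit,  # cap the transfer rate (illustrative value)
        "--files-from=-",          # read paths relative to src_root from stdin
        src_root, dest,
    ]


if __name__ == "__main__":
    q = queue.Queue()
    for p in ["public_html/a.php", "public_html/b.php", "public_html/c.php"]:
        q.put(p)
    batch = drain_batch(q)
    print(rsync_batch_command("/root/", "root@web2.home:/root/"))
    print("\n".join(batch))  # this text would be written to rsync's stdin
```

The listening thread only ever calls put() on the queue, so it is never held up by a running rsync.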
 
Old 02-03-2012, 04:00 PM   #3
ppostma1
Python fix:

This is my basic but fully functional prototype:

from __future__ import print_function  # lets the print() calls also run on Python 2

import os
from pyinotify import WatchManager
from pyinotify import Notifier
from pyinotify import ProcessEvent
from pyinotify import EventsCodes

wm = WatchManager()

# Debug helpers used while hunting for the flag values:
# print(EventsCodes.ALL_FLAGS)
# print(getattr(EventsCodes(), '__dict__'))
# print(dir(EventsCodes))
# print(EventsCodes.ALL_FLAGS['IN_CLOSE_WRITE'])

# Watched events, all looked up by name in the ALL_FLAGS dict
mask = (EventsCodes.ALL_FLAGS['IN_CLOSE_WRITE']
        | EventsCodes.ALL_FLAGS['IN_CREATE']
        | EventsCodes.ALL_FLAGS['IN_MOVE_SELF']
        | EventsCodes.ALL_FLAGS['IN_MOVED_FROM']
        | EventsCodes.ALL_FLAGS['IN_MOVED_TO'])


class PTmp(ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        print("CloseWrite: %s" % os.path.join(event.path, event.name))

    def process_IN_CREATE(self, event):
        print("Create: %s" % os.path.join(event.path, event.name))
        print(dir(event))
        print('dir', event.dir)
        print('mask', event.mask)
        print('maskname', event.maskname)
        print('name', event.name)
        print('path', event.path)
        print('pathname', event.pathname)
        print('')

    def process_IN_MOVE_SELF(self, event):
        print("move self: %s" % os.path.join(event.path, event.name))
        print(dir(event))
        print('dir', event.dir)
        print('name', event.name)
        print('path', event.path)
        print('pathname', event.pathname)
        print('')

    def process_IN_MOVED_FROM(self, event):
        print("movefrom: %s" % os.path.join(event.path, event.name))
        print(dir(event))
        print('dir', event.dir)
        print('name', event.name)
        print('path', event.path)
        print('pathname', event.pathname)
        print('')

    def process_IN_MOVED_TO(self, event):
        print("moveto: %s" % os.path.join(event.path, event.name))
        print(dir(event))
        print('dir', event.dir)
        print('name', event.name)
        print('path', event.path)
        print('pathname', event.pathname)
        print('')


notifier = Notifier(wm, PTmp())

# rec=True watches subdirectories; auto_add=True also watches directories created later
wdd = wm.add_watch('/path/x', mask, rec=True, auto_add=True)


while True:  # loop forever
    try:
        # run the handlers for the events read so far
        notifier.process_events()
        if notifier.check_events():
            # read notified events and enqueue them
            notifier.read_events()
        # you can do some tasks here...
    except KeyboardInterrupt:
        # destroy the inotify instance on this interrupt (stop monitoring)
        notifier.stop()
        break


It prints out what needs to be done, as a debug/test.
I was able to pipe the output to a parser that executed the commands.
One caveat is how Linux handles files by default (for live environments). What I saw when giving the command tar -xzf x.tar.gz is that the file big.mp3 gets extracted, but it:
creates file tmp.file
fills file tmp.file
mv tmp.file big.mp3

So a full capture of directory and file events reports a create and a change/write on tmp.file, which no longer exists by the time the parse/execute-sync step checks for it. My handling of the problem was: if the file does not exist, continue.
If multiple changes are made to the same file before the parse/execute-sync step is called, the first change found in the queue causes a sync of the final file, and the subsequent sync calls return "up to date".

If one wants to optimize it, I would say: search ahead in the queue of events and, if that file is changed again later, call continue, so that only the last event on a file is handled.
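
A minimal sketch of that optimization, assuming the queued events have already been parsed into (action, path) tuples. That tuple shape, the "sync" action name and both function names are hypothetical and not part of the prototype above. The first function keeps only the last event per path; the second applies the "if the file does not exist, continue" rule.

```python
import os


def coalesce(events):
    """Keep only the last event for each path, in the order those last events occurred.

    `events` is a list of (action, path) tuples (a hypothetical shape).
    """
    last_index = {}
    for i, (_action, path) in enumerate(events):
        last_index[path] = i  # later events overwrite earlier ones
    return [ev for i, ev in enumerate(events) if last_index[ev[1]] == i]


def drop_missing(events, exists=os.path.exists):
    """Skip sync events whose file has vanished, e.g. a temp file that was renamed."""
    return [(a, p) for a, p in events if a != "sync" or exists(p)]


if __name__ == "__main__":
    queued = [
        ("sync", "/path/x/tmp.file"),
        ("sync", "/path/x/big.mp3"),
        ("sync", "/path/x/big.mp3"),
    ]
    for action, path in drop_missing(coalesce(queued)):
        print(action, path)
```

Coalescing first and checking existence second means each surviving path costs at most one stat call and one sync.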


Also note that, depending on your Python version, the object variables can go wrong. In some versions, setting a class-level variable inside the class sets both the 'external' and the 'internal' variable/value. In later versions only the internal variable is set, so reading the static class variable (this.IN_CLOSE_WRITE, EventsCodes.IN_CLOSE_WRITE) from the same file the class is declared in gives the expected value, but reading the class variable (EventsCodes.IN_CLOSE_WRITE) from another file gives an empty value. I modified the inotify Python code (which has 2 files) so that, where it sets the class variable defined in the class file (which becomes internal), it also sets the public value seen from the other file, making all values act the same:
class EventsCodes:
    FLAG_COLLECTIONS = {'OP_FLAGS': {
        'IN_ACCESS' : 0x00000001,

...

EventsCodes.IN_ACCESS = 0x00000001


(I'm no Python expert, nor do I want to be, but I tried the obvious EventCodes.FLAG_COLLECTION.IN_ACCESS, EventCodes.FLAG_COLLECTIONS{IN_ACCESS} and every permutation recommended to me. Variable dumps from within the class showed everything was set, while dumps from outside the class, on the class or on the running object, reported no such values attached.)
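
For what it's worth, a value stored in a dict inside a dict is read in Python with square brackets and quoted keys, not with dots or braces. The sketch below uses a stand-in class shaped like the snippet above; it does not import the real pyinotify, the IN_CLOSE_WRITE entry is added only for illustration, and whether a given pyinotify release exposes its flags this way should be checked against its source. It also shows how setattr can copy every flag onto the class in one loop, instead of one hand-written assignment per flag.

```python
class EventsCodes:
    # Stand-in with the same shape as the snippet above, NOT the real pyinotify class
    FLAG_COLLECTIONS = {
        "OP_FLAGS": {
            "IN_ACCESS": 0x00000001,
            "IN_CLOSE_WRITE": 0x00000008,
        },
    }


# A dict inside a dict: square brackets and quoted keys at each level
in_access = EventsCodes.FLAG_COLLECTIONS["OP_FLAGS"]["IN_ACCESS"]

# Copy every flag onto the class so that EventsCodes.IN_ACCESS style access works
for flags in EventsCodes.FLAG_COLLECTIONS.values():
    for name, value in flags.items():
        setattr(EventsCodes, name, value)

print(in_access, EventsCodes.IN_ACCESS, EventsCodes.IN_CLOSE_WRITE)
```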
 
  

