How to scp files only after they've been there for X seconds
Hi everybody. I have a program that outputs files to a certain directory, let's say /data/output.
The only problem is that I need to copy those files to another server. What I'm hoping to do is scp them to that other server, then move them out of /data/output to /data/SentToOtherServer after they are scp'd. The catch is that the files are huge, so they take a long time to write into /data/output, and I don't want it grabbing the files before they are fully written. Is there something I can do timestamp-wise so they aren't scp'd unless a timestamp on that file is older than, say, 30 seconds, which would mean it has finished writing? |
How about using inotify instead, watching the files for the "close_write" event?
|
I'm not familiar with inotify; I had never even heard of it until I read your post. There is a man page for it on my server, but I can't say I really understand how to use it.
|
While you have asked no question try 'inotifywait --monitor --recursive --quiet --csv --event close_write /data/output' as an example.
|
OK, looking at the man page for inotify I kinda get what that would do. But how do I tie these commands into a .sh that would scp the file(s) and then move them to another folder? I can write the scp command and the mv command, but I don't know how to tie it all together without grabbing files before they are written. Could anyone provide an example?
|
Or is there something like -cmin that works in seconds, which I could use with find?
|
inotifywait is part of the inotify-tools package, and has its own man page (after you install the package). The inotify man pages describe the API, whereas inotifywait is a shell command. Try running this in the /data/output directory:
Code:
inotifywait -mrq -e close_write --format '%w%f' . | xargs -I COMPLETEDFILE scp COMPLETEDFILE user@remote:path/COMPLETEDFILE
Code:
inotifywait -mrq -e close_write --format '%w%f' . | while read FILE ; do ( scp "$FILE" "user@remote:path/$FILE" & ) ; done
You might wish to take a look at the incron package. |
Thanks for the examples, Nominal. Combining commands always confuses me. A couple of questions, though: if I would be doing this from a .sh file (I'm assuming), how do I force it to run in the /data/output directory all the time? I also need to move the file to another folder once it's been scp'd to that other server, but I don't think something like that is in the example, or is it?
Appreciate the example and help. |
Consider the following Bash script.
Code:
#!/bin/bash
The script will never exit by itself; you need to kill it via e.g.
Code:
kill -HUP $(ps -C inotifywait -o pid=)
It is quite possible to extend the above around some job or script, so that close_write events are only watched while the other job/script runs, and afterwards everything is cleaned up -- including scp'ing and copying any files the monitoring might have missed. That will make the script even more complicated, though.
You should also consider what to do with errors, for example if you run out of disk space. Should you just output the error, or should you send an e-mail message?
Note that the inotify-tools package is not installed by default on most Linux distributions. If you are a Linux cluster user, first contact your cluster admins to ask if inotify-tools is installed, and if/which command-line utility you can use to send mail from compute nodes. E-mail is not always possible from compute nodes, or may only be possible via a specific command-line client, e.g. /bin/sendmail. |
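Only the first line of the script survives in this thread, so what follows is a minimal sketch of the kind of script described, not the original. It assumes inotify-tools is installed; the directory names come from the first post, while the remote target `user@remote:path` and the function names `send_and_archive` and `watch_and_send` are placeholders made up for illustration. Each file is moved to the "sent" directory only if its scp succeeded.

```shell
#!/bin/bash
# Sketch only: paths and the remote target below are assumptions.
SRC_DIR="${SRC_DIR:-/data/output}"
SENT_DIR="${SENT_DIR:-/data/SentToOtherServer}"
REMOTE="${REMOTE:-user@remote:path}"

# Copy one file to the remote host; move it to SENT_DIR only if scp succeeded.
send_and_archive() {
    local file=$1
    # </dev/null keeps scp from reading the event stream on stdin.
    if scp -q "$file" "$REMOTE/" </dev/null; then
        mv -- "$file" "$SENT_DIR/"
    else
        printf 'scp failed, leaving %s in place\n' "$file" >&2
        return 1
    fi
}

# Watch SRC_DIR (not recursively) and pass each file to send_and_archive
# as soon as its writer closes it. Blocks until inotifywait is killed.
watch_and_send() {
    inotifywait -m -q -e close_write --format '%w%f' "$SRC_DIR" |
    while IFS= read -r file; do
        send_and_archive "$file"
    done
}
```

Adding a final line that calls `watch_and_send` starts the watcher. Because the directory is passed to inotifywait explicitly, the script does not depend on being started from inside /data/output. Error messages go to stderr, so they can be redirected to a log file when the script runs unattended.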
Wow, that's pretty intense, and impressive! I never would have figured any of that out, haha. Do those printf's just put stuff up on the screen? I'm guessing I already have inotify-tools installed, because I was able to pull up man pages for the stuff.
So since I would cron this, I really don't need the printf's if they just write to the screen, since nobody would see them as this would run constantly in the background? Guess there's a lot more to think about than what I posted in my original post! |
My original line of thought was to somehow do a find like
find /data/output/* -type f -cmin +1
even though I'd really like to do it right after the file is closed or a few seconds after, kinda like what this inotify stuff does. Then do the scp command, then move the file to /data/sent. I guess that's kinda simplistic and doesn't account for errors, and it's not very glamorous. Plus I have no idea how to combine them all to work right. Just figured I'd give more background. |
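On the find idea: -cmin and -mmin count in minutes, so a threshold like 30 seconds needs a different test. One possible sketch compares each file's modification time with the current time. It assumes GNU stat and date; the function name `files_older_than` is made up for illustration. Keep in mind that age is only a heuristic: a writer that stalls for longer than the limit would look finished.

```shell
#!/bin/bash
# Print the regular files directly inside directory $1 whose last
# modification is at least $2 seconds old.
# Assumes GNU stat (-c %Y prints the mtime as seconds since the epoch).
files_older_than() {
    local dir=$1 min_age=$2 now mtime f
    now=$(date +%s)
    for f in "$dir"/*; do
        [ -f "$f" ] || continue                    # skip directories, empty glob
        mtime=$(stat -c %Y -- "$f") || continue    # file may have vanished
        if [ $(( now - mtime )) -ge "$min_age" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `files_older_than /data/output 30` lists the files that have not been written to for at least 30 seconds, one per line.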
You might add another script to cron, to do the same for files that have not been modified in the last N minutes (say, a few hours), so that you "catch" anything the monitoring missed, or could not transfer for some reason. Basically,
Code:
#!/bin/bash
My personal approach to issues like this is much more careful than most. I tend to assume problems will occur, and try to handle them in a useful manner. Your initial idea might work well for you, without any issues, if you happen to select a large enough age limit. My environments tend to vary too much for a simple age limit to work reliably, so I've had to find more reliable methods. They are obviously a bit more complex, but I think their robustness more than makes up for the added complexity. |
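The catch-up script itself is cut off after its first line in this thread. A rough sketch of what such a cron job could look like follows; it is not the original. The paths, the remote target, the two-hour default, and the function name `catch_up` are all assumptions, and file names containing newlines are not handled.

```shell
#!/bin/bash
# Sketch only: paths, remote target, and age limit are assumptions.
SRC_DIR="${SRC_DIR:-/data/output}"
SENT_DIR="${SENT_DIR:-/data/SentToOtherServer}"
REMOTE="${REMOTE:-user@remote:path}"
MIN_AGE_MINUTES="${MIN_AGE_MINUTES:-120}"

# Send every regular file in SRC_DIR that has not been modified for more
# than MIN_AGE_MINUTES, moving each one to SENT_DIR only after its scp
# succeeded. Files whose transfer fails stay put for the next run.
catch_up() {
    find "$SRC_DIR" -maxdepth 1 -type f -mmin +"$MIN_AGE_MINUTES" -print |
    while IFS= read -r file; do
        # </dev/null keeps scp from consuming the file list on stdin.
        if scp -q "$file" "$REMOTE/" </dev/null; then
            mv -- "$file" "$SENT_DIR/"
        else
            printf 'scp failed for %s\n' "$file" >&2
        fi
    done
}
```

With a final line calling `catch_up`, the script could be run from cron at whatever interval suits the age limit. A generous age limit keeps this sweep from picking up files that the close_write watcher is still going to handle.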
Thanks for replying, Nominal. I agree your approach is probably better suited than my very basic idea from the get-go; I was just posting that to show my thought process.
I'm not sure how to let it run all the time in the background, though. Also, if the server is restarted, would however you set that up automatically restart it as well, so it would start doing the process again? I'm also confused by your "note that it..." as I'm not sure how you use a pid file, and I thought you said not to cron it (even though I don't know how to make it run all the time like you said). Sorry for all these questions, but I appreciate you answering them all. |
I'm just afraid this approach may be too far over my head, and I wouldn't be able to support it. But maybe after you answer those questions I'll understand it better. Thanks again.
|
Here is a simple solution...
Code:
./write_files_to_directory.sh && scp * 192.168.1.1:/tmp/ |