Looking for a file sync program, hopefully git-based, that can:
Add multiple duplicate directories to the repository,
Test and eliminate/delete any 0 KB files in any of the directories loaded,
Find all "like-named" files, including File, File1, File2 as auto-named by Linux,
Do complete code comparisons before merging,
Hold the file with the newest date as the anchor/master file,
Merge all changes from other "like" files into the master file,
Delete all non-master files after merging to the master,
Hold non-mergeable files in a separate repository for review,
Suggest merge candidates for non-mergeable files, based on code comparison,
Provide a cron-based backup system for master and non-merged files,
Offer a graphical or browser interface for working the remaining non-mergeable files.
I've not yet found an app that does this, and I've read a lot of info. I understand git itself can do this, but I'm not git savvy, so I'm not sure how to implement it. I do have git installed!
From what I've read, rsync is only capable of about half of these features, but let me know if you see it differently.
Kinda wondering if I have to write a BASH script that gathers all the directories and files and then starts calling other apps like rsync to work the backend issues. To properly sync with versioning, a compare will have to be executed.
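Something like this rough, untested sketch is what I have in mind; the directory names are placeholders and the rsync flags are just my first guess:
Code:
#!/bin/bash
# Untested sketch: pull several duplicate directory trees into one "master"
# tree with rsync, preferring newer files and keeping a copy of anything
# that gets overwritten, so nothing is lost before the real compare/merge.

MASTER="$HOME/sync/master"
COLLISIONS="$HOME/sync/collisions"
mkdir -p "$MASTER" "$COLLISIONS"

for src in "$HOME/copies/dir1" "$HOME/copies/dir2" "$HOME/copies/dir3"; do
    # -a  archive (recursive, keeps times/perms)
    # -u  skip files that are already newer in the master
    # -c  compare by checksum rather than size + mod-time
    # -b  back up any file rsync would overwrite, into $COLLISIONS
    rsync -aucb -v --backup-dir="$COLLISIONS" "$src"/ "$MASTER"/
done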
Oh! On my list of features I also have to be able to:
Read and extract compressed files,
Sync to my own git repo and then to Dropbox's git repo.
Seriously, because there are so many ways to approach this, and it's all conceptually confusing, I'm thinking about writing a "SyncMasters Bible" to show and explain each method, with examples!
One funny thing about rsync flags is the flags that represent sequences of other flags, like -a (--archive), which is equivalent to -rlptgoD... so after trying heaps of options, I settled on good old -abcv (the -v added for verbosity), lol. My rsync command usually looks something like this:
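Something along these lines, with made-up paths (not my exact command):
Code:
# -a  archive mode (same as -rlptgoD)
# -b  make a backup of any destination file that would be overwritten
# -c  decide what to transfer by checksum instead of size and mod-time
# -v  verbose
rsync -abcv --suffix=".old" /local/photos/ user@nas:/backup/photos/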
Usually one of the paths is on a network, and the -c flag only transfers files whose md5 checksums differ, which really saves bandwidth (which I still pay for by the GB)... Some network locations won't allow the -c flag, as it creates more work for their CPUs, so I have to remove it in those situations.
The --suffix flag (it goes with -b/--backup, the b in -abcv) sets the suffix tacked onto the old copy, so both files are kept when there are duplicates with the same name.
If I don't put the trailing slash on the source path, it makes a directory inside the target path, such that repeat syncs can create a directory inside a directory inside a directory ad infinitum.
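For example (made-up paths again):
Code:
# trailing slash: the contents of photos land directly inside backup/
rsync -abcv /local/photos/ /mnt/nas/backup/

# no trailing slash: rsync creates backup/photos/ and copies into that
rsync -abcv /local/photos  /mnt/nas/backup/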
These don't cover all of your features; however, I'm still an rsync novice, and as pointed out, the man page is lengthy. Perhaps some of your other features are in there too.
OK, my biggest problem is that Dropbox has gone south; it's been total crap now for over 5 years. It errors out, requiring a new install, and it cannot use a currently existing path of /../,,/,,/Dropbox, so it writes a new /Dropbox folder somewhere else on your box. So now I have over 20 copies of the #$$^&^%%$ thing.
So I'm no longer using DBox, and I have to find all the copied files on my present box. To do that I have to:
Set a target folder/dir that I consider to hold the latest copies of the files,
Run a script to find all the files and record them in the DB,
Run another "diff" type script to determine which files are equal to the ones in the target dir,
Delete the extra copies that are exactly "like" or "equal" to the ones in the target dir (see the sketch after this list),
Mark the deleted files in the DB, so they will be overlooked for additional processing,
Run an additional "diff"-type script to record the lines or records that are not equal and place them in the DB,
Merge the files if it's a document type,
Update the DBs if the file is a DB type and records have been changed or added,
Mark the completion dates in the sync DB when the process is complete for the original file and its dupes,
Find all .sql files not existing in the target dir and move/copy them there, since this will be the actual data backup dir.
Back up the entire updated set of DBs to a dated .tar.gz file, so we have the latest backup since completing the sync,
Repeat this sync process for all files and dirs on the network-attached machines (20 of them).
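Here is a rough, untested illustration of the find-and-compare steps above; the directory names are placeholders, and the real version will feed the results into the DB instead of temp files:
Code:
#!/bin/bash
# Untested sketch: checksum every file under the target dir and the search
# area, then list any checksum that shows up more than once. Those lines are
# the exact duplicates that are safe to delete.

TARGET="$HOME/sql_master"      # dir holding the copies I want to keep
SEARCH="$HOME/old_copies"      # where the stray copies are hiding

find "$TARGET" "$SEARCH" -type f -print0 | xargs -0 md5sum > /tmp/all_sums.txt

# md5 hashes are 32 characters, so compare only the first 32 columns;
# -D prints every line that belongs to a duplicated checksum.
sort /tmp/all_sums.txt | uniq -w32 -D > /tmp/dupes.txt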
So, since MySQL and the other DBs are usually the hardest to back up and recover, I started with all the backed-up .sql files. Elsewhere in the post you'll see I was struggling with getting /etc/updatedb.conf and the disk mounts to work correctly so they'd see all the /*.sql files. I finally got that working right, so I have all the .sql files captured in /home/files/sql_dump.txt, which lists over 4,000 files. I wrote the additional db_syncs.sql file to create the DB for recording these:
Code:
-- Database: `db_syncs`
# DROP Database IF EXISTS `db_syncs`;
CREATE Database IF NOT EXISTS `db_syncs`;
USE `db_syncs`;
DROP TABLE IF EXISTS `dh_files`;
CREATE TABLE `dh_files` (
`fil_idx` INT(11) NOT NULL AUTO_INCREMENT COMMENT 'Unique Cat Key',
`fil_sfx` INT(11) NOT NULL COMMENT 'Xref to SameFile',
`fil_pth` VARCHAR(255) NOT NULL COMMENT 'File Path',
`fil_bnm` VARCHAR(125) NOT NULL COMMENT 'File BaseName',
`fil_nam` VARCHAR(125) NOT NULL COMMENT 'File Name',
`fil_ext` VARCHAR(12) NOT NULL COMMENT 'File Ext',
`fil_org` ENUM('Y','N') DEFAULT 'N' COMMENT 'File From Org Dir',
`fil_eql` ENUM('Y','N') DEFAULT 'N' COMMENT 'File Equal to Org File',
`fil_ddt` datetime COMMENT 'Delete Date',
`fil_cdt` datetime COMMENT 'Create Date',
`fil_mdt` datetime COMMENT 'Modify Date',
PRIMARY KEY (`fil_idx`));
DROP TABLE IF EXISTS `dh_same`;
CREATE TABLE `dh_same` (
`sam_idx` INT(11) NOT NULL AUTO_INCREMENT COMMENT 'Unique Match Key',
`sam_fdx` INT(11) NOT NULL COMMENT 'Xref to New File',
`sam_pdx` INT(11) NOT NULL COMMENT 'Xref to Org File',
`sam_nam` VARCHAR(255) NOT NULL COMMENT 'File Name',
`sam_cdt` datetime NOT NULL COMMENT 'Create Date',
`sam_mdt` datetime NOT NULL COMMENT 'Modify Date',
PRIMARY KEY (`sam_idx`));
Sure, I will have to add to this DB as I go along and learn what else I need. The 2nd table is the one where "same file" matches by filename are recorded.
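As a first stab, I'm thinking dh_same could be filled with something like the following; the query is completely hypothetical, it just follows the column names above (and assumes login credentials come from ~/.my.cnf):
Code:
#!/bin/bash
# Hypothetical: pair every stray copy with the copy from the original/target
# dir that shares its basename, and record the pair in dh_same.
mysql db_syncs <<'SQL'
INSERT INTO dh_same (sam_fdx, sam_pdx, sam_nam, sam_cdt, sam_mdt)
SELECT n.fil_idx, o.fil_idx, n.fil_bnm, NOW(), NOW()
FROM dh_files n
JOIN dh_files o
  ON  o.fil_bnm = n.fil_bnm
  AND o.fil_org = 'Y'   -- the copy that lives in the original/target dir
  AND n.fil_org = 'N';  -- the stray copy found elsewhere
SQL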
So now I'm writing a PHP script to run the process. I do over 90% of my coding in PHP, but I will also want to create a BASH version for those of you challenged in PHP.
Running my script I ran into errors, which I posted at:
I finally decided to use "git merge-file" to eliminate the dupes, but try as I might, I cannot get the command-line string in the right format to actually make it work!
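For reference, the shape the man page describes is below (the file names are only placeholders), though so far I can't get it to behave:
Code:
# git merge-file <current> <common-ancestor> <other>
# The merged result is written back into the first file.
git merge-file master_copy.txt oldest_copy.txt newer_copy.txt

# Or, with -p, print the merged result to stdout and leave the files alone:
git merge-file -p master_copy.txt oldest_copy.txt newer_copy.txt > merged.txt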
slac-in-the-box,
Your comment made more sense than most, but being a NEWBIE to the whole sync thing, it's still GREEK to me. I hope you have patience with me and can explain a little more!