I admit to being a loose nut in the operator's chair, but can someone tell me what I'm missing about using GWhere (GWhere Package Page)?
Other than importing from existing data files, I don't see how one loads information into the utility. Is it supposed to scan "drives" or "folder trees" to collect their contents into the file catalog?
I have external, USB-connected drives (both flash and spinning) stacked like cordwood, and I want to catalog all of their contents.
GWhere looks like the ticket to get me started, but I cannot discover how to scan a drive or folder tree.
If one is supposed to obtain these lists of drive or folder contents from some other source, how does one do that?
The following find command approximates the data that I want to store:
prompt$ find /some/path/ -type f -printf "%U:%G; %M; %kK; %h; %f; acc=%AD-%AT; chg=%CD-%CT \n"
1000:1000; -rw-rw-r--; 0K; /some/path; baz; acc=02/06/12-11:48:55.3913852050; chg=02/06/12-11:48:55.3913852050
1000:1000; -rw-rw-r--; 0K; /some/path; bam; acc=02/06/12-11:48:12.1355528140; chg=02/06/12-11:48:12.1355528140
1000:1000; -rw-rw-r--; 0K; /some/path; bar; acc=02/06/12-11:48:12.1355528140; chg=02/06/12-11:48:12.1355528140
1000:1000; -rw-rw-r--; 0K; /some/path; foo; acc=02/06/12-11:48:12.1355528140; chg=02/06/12-11:48:12.1355528140
(NOTE -- I used the tags "acc" and "chg" so my brain keeps things sorted. I won't use them in the catalog database.)
I don't need the fractional seconds, but find's "%AT" format, which I used out of laziness, includes them. I'll also want to add "%F" to record the type of file system involved, along with any available "volume label" once I figure out how to obtain that detail.
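For what it's worth, here is a rough sketch of how that revised listing might look. It assumes GNU find, sed, and findmnt (from util-linux) are installed. The function names catalog_tree and volume_info and the semicolon-separated layout are only my own placeholders, and I have not confirmed that findmnt reports a label for every file system type.

```shell
#!/bin/sh
# catalog_tree DIR: hypothetical helper that lists every regular file under DIR
# with owner, mode, size, file system type, directory, name, and two timestamps.
catalog_tree() {
    # %F is the file system type; %AD-%AT is the access time and
    # %CD-%CT is the status-change time.
    find "$1" -type f \
        -printf '%U:%G; %M; %kK; %F; %h; %f; %AD-%AT; %CD-%CT\n' |
        # Strip the fractional seconds that %AT and %CT emit
        # (11:48:55.3913852050 becomes 11:48:55).
        # Caveat: a file name containing text shaped like HH:MM:SS.digits
        # would be altered too.
        sed -E 's/([0-9]{2}:[0-9]{2}:[0-9]{2})\.[0-9]+/\1/g'
}

# volume_info DIR: print the device, file system type, and label (if any)
# of the file system that holds DIR.
volume_info() {
    findmnt -n -o SOURCE,FSTYPE,LABEL --target "$1"
}
```

Something like `catalog_tree /some/path > listing.txt` would then capture one line per file, and `volume_info /some/path` could be recorded alongside it. File names containing newlines would break the one-line-per-file assumption.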
Ultimately, I'll want to be able to identify duplicates not only by name and size but also by MD5SUM or similar.
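A minimal sketch of the checksum part, assuming GNU coreutils (md5sum, sort, and a uniq that supports -w and --all-repeated). The name find_dupes is hypothetical, and hashing every file on a large drive will be slow, so in practice one might compare sizes first.

```shell
#!/bin/sh
# find_dupes DIR: hypothetical helper that groups files under DIR whose
# contents share the same MD5 checksum.
find_dupes() {
    # md5sum prints "<32 hex digits>  <path>" for each file;
    # sorting brings equal checksums next to each other.
    find "$1" -type f -exec md5sum {} + |
        sort |
        # Compare only the first 32 characters (the checksum) and print every
        # line that belongs to a repeated group, with a blank line between groups.
        uniq -w32 --all-repeated=separate
}
```

Running `find_dupes /some/path` prints nothing when every file is unique. This only finds duplicates within one tree; spotting them across drives would mean saving each drive's checksum list and sorting the combined lists the same way.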
Then, a search by file name would tell me which drive a given piece of data is currently on.
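Absent a proper catalog tool, the lookup could be approximated with plain text files. This sketch assumes one listing per drive, saved as LABEL.txt in a single directory (for example via `find /mount/point -type f > LABEL.txt`). The name where_is and that file layout are my own invention, not anything GWhere provides.

```shell
#!/bin/sh
# where_is NAME CATALOG_DIR: hypothetical helper. Assumes CATALOG_DIR holds
# one text file per drive (named after the drive, ending in .txt), each a
# plain list of file paths, one per line.
where_is() {
    # -F: treat NAME as a literal string rather than a regular expression
    # -i: ignore case
    # -H: prefix each match with the catalog file it came from
    grep -F -i -H -- "$1" "$2"/*.txt
}
```

Each match is printed as `catalog-file:path`, so the catalog file name tells you which drive to plug in.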