Bash command or script to list files
Hello,
I need a command or script that lists all files recursively, without directories, one line per file, and without the extra lines that ls -AR1 prints. It should print the file size and name, e.g.:
12 file.ext
25684 file2.ext
589 file3.ext
...
Is this homework? Do you have any ideas? Have you tried anything?
No, it is not homework :)
I have two volumes with mostly (but not exactly) the same files, but completely different directory structures. I need to know which files exist on only one volume and which are dups. I tried ls with awk, but every variant prints extra lines.
How can you identify a file? Are names unique within each volume (file system)? Or could there be, say, both a /foo/bar and a /goo/bar? If so, then what further characteristics, beyond the name, will be enough to uniquely identify a file: size in bytes, modification time, checksum ... ?
Are you only dealing with "normal" files, or do you have multiply-linked files, symlinks, device files, FIFOs ... ?
How many files, roughly, in total?
Hi,
Files should be identified by name and size. Special files are not needed, and the volumes do not contain any anyway, just normal files. There are about 500k files in 900 GBytes on each volume; I think about 450k files are identical. The directory structure is completely different.
Ouch! That's big! Performance will be significant and bash string manipulation is slow but I can't think how to handle whitespace in file names using awk (I'm not very proficient in awk so that doesn't mean it can't be done -- it almost certainly can). How about this for starters?
Code:
#!/bin/bash
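The script as posted is not reproduced here beyond its first line. As a rough sketch of one way to get the listing asked for, and not necessarily what was posted, something like the following could work. It assumes GNU find (for -printf), and list_files is a made-up helper name.

```bash
#!/bin/bash
# Sketch only, assuming GNU find; list_files is a hypothetical helper.
# Prints one line per regular file as "<size in bytes> <file name>",
# with the directory part of the path dropped.
list_files() {
    local root="${1:-.}"
    # -type f : regular files only (no directories, symlinks, devices, FIFOs)
    # %s      : file size in bytes
    # %f      : file name with all leading directories removed
    find "$root" -type f -printf '%s %f\n'
}
```

Calling list_files /path/to/volume > volume1.txt (the path and output name are placeholders) would write one listing per volume. Spaces in file names are kept as-is, since everything after the first space on a line is the name. A file name containing a newline would still break the one-line-per-file format.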
Thank you very much!
The solution is exactly what I need. Runtime is not a problem; I have given it one core and it can keep running in the background. Sorting is not necessary: the output will be imported into MySQL, and some simple queries should then show the dups and diffs. Thanks again! PS: runtime was about 45 min. and the result is about 8 MB, which is fine.
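For anyone who wants the dups and diffs without a database, the two listings can also be compared directly in the shell. This is only a sketch of an alternative to the MySQL import, assuming each listing holds one "size name" line per file and that no file name contains a newline. compare_listings and its mode names are made up for illustration.

```bash
#!/bin/bash
# Sketch only: compare two "size name" listings with sort and comm.
# A file counts as the same file when its whole line (size and name) matches.
# compare_listings and the mode names are hypothetical.
compare_listings() {
    local mode="$1" first="$2" second="$3"
    local flags tmp
    case "$mode" in
        only-first)  flags=-23 ;;   # lines found only in the first listing
        only-second) flags=-13 ;;   # lines found only in the second listing
        both)        flags=-12 ;;   # lines found in both listings (dups)
        *) echo "usage: compare_listings only-first|only-second|both A B" >&2
           return 2 ;;
    esac
    tmp=$(mktemp -d) || return 1
    # comm needs sorted input; -u also collapses repeats inside one volume.
    LC_ALL=C sort -u "$first"  > "$tmp/first"
    LC_ALL=C sort -u "$second" > "$tmp/second"
    LC_ALL=C comm "$flags" "$tmp/first" "$tmp/second"
    rm -rf "$tmp"
}
```

For example, compare_listings both volume1.txt volume2.txt would print the entries present on both volumes, and the only-first and only-second modes print what exists on just one of them (the listing file names are placeholders).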