LinuxQuestions.org
Old 08-25-2009, 02:26 PM   #1
sepi
LQ Newbie
 
Registered: Aug 2009
Location: Budapest, Hungary
Distribution: ubuntu 9.04 desktop amd64
Posts: 4

Rep: Reputation: 0
bash command or script to list files


Hello,

I need a command or script that:
lists all files recursively, without directories
prints one line per file, with no extra lines like ls -AR1 produces
prints the file size and name
e.g.:
12 file.ext
25684 file2.ext
589 file3.ext
...
 
Old 08-25-2009, 02:28 PM   #2
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
Is this homework? Do you have any ideas? Have you tried anything?
 
Old 08-25-2009, 02:33 PM   #3
sepi
No, it is not homework.
I have two volumes with mostly (but not exactly) the same files, but completely different directory structures. I need to know which files exist on only one volume and which are dups.

I tried ls with awk, but everything I tried prints extra lines.

Last edited by sepi; 08-25-2009 at 02:36 PM.
 
Old 08-25-2009, 02:45 PM   #4
catkin
How can you identify a file? Are names unique within each volume (file system?), or are there files such as /foo/bar and /goo/bar that share a name? If so, what further characteristics, beyond the name, will be enough to uniquely identify a file -- size in bytes, modification time, checksum ...?

Are you only dealing with "normal" files or do you have multiply-linked files, symlinks, device files, fifos ...?
 
Old 08-25-2009, 02:47 PM   #5
catkin
How many files, roughly, in total?
 
Old 08-25-2009, 03:06 PM   #6
sepi
Hi,
files should be identified by name and size.
I don't need special files, and the volumes don't contain any anyway, just normal ones.
There are about 500k files in 900 GBytes on each volume; I think about 450k files are identical.
The directory structure is completely different.

Last edited by sepi; 08-25-2009 at 03:08 PM.
 
Old 08-25-2009, 03:47 PM   #7
catkin
Ouch! That's big! Performance will matter and bash string manipulation is slow, but I can't think how to handle whitespace in file names using awk (I'm not very proficient in awk so that doesn't mean it can't be done -- it almost certainly can). How about this for starters?
Code:
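#!/bin/bash
# Print "<size> <basename>" for every regular file under the current directory.
# The ls -l fields before and after the size are discarded. Every path from
# "find ." contains a "/", so ${name##*/} also drops any date/time text left in name.
find . -type f -exec /bin/ls -l {} \; | while read -r x x x x size x x name
do
	printf '%s %s\n' "$size" "${name##*/}"
done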
It could probably be sped up by using xargs with find. The output will need sorting ...
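
A possible faster variant of the loop above, as a rough sketch rather than a tested replacement: it assumes GNU find (the default on Debian and Ubuntu), whose -printf action can emit the size and the base name directly, so no ls process is started per file and no parsing of ls output is needed. The function name list_sizes_and_names is made up for illustration.

```shell
#!/bin/bash
# Sketch: print "<size in bytes> <basename>" for each regular file under a directory.
# Assumes GNU find; -printf is a GNU extension and is not part of POSIX find.
# list_sizes_and_names is a hypothetical helper name, not something from the thread.
list_sizes_and_names() {
	# %s = file size in bytes, %f = file name with leading directories removed
	find "${1:-.}" -type f -printf '%s %f\n'
}
```

It would be called with the top directory of a volume as its argument and the output redirected to a file. File names containing spaces come through unchanged; names containing newlines would still break the one-line-per-file format.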
 
Old 08-25-2009, 04:15 PM   #8
sepi
Thank you very much!
The solution is exactly what I need.
Runtime is not a problem; I gave it one core and it will keep running in the background.
Sorting is not necessary: the output will be imported into MySQL, and then some simple queries should show the dups and diffs.
thx again!

PS:
runtime was about 45 min.
result is about 8 MB
fine
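
For readers who only need the dups and diffs and would rather skip the database import, here is a minimal sketch of the same comparison using sort and comm. It assumes one "<size> <name>" listing file per volume, in the format requested at the top of the thread; the function name compare_listings and its arguments are hypothetical.

```shell
#!/bin/bash
# Sketch: compare two "<size> <name>" listings, one per volume.
# compare_listings and its two file arguments are hypothetical names for illustration.
compare_listings() {
	local a_sorted b_sorted
	a_sorted=$(mktemp)
	b_sorted=$(mktemp)
	# comm requires both inputs sorted in the same order; LC_ALL=C keeps that consistent.
	# sort -u also collapses entries repeated within one volume (same size and name).
	LC_ALL=C sort -u "$1" > "$a_sorted"
	LC_ALL=C sort -u "$2" > "$b_sorted"
	echo "== only in $1"; LC_ALL=C comm -23 "$a_sorted" "$b_sorted"
	echo "== only in $2"; LC_ALL=C comm -13 "$a_sorted" "$b_sorted"
	echo "== in both";    LC_ALL=C comm -12 "$a_sorted" "$b_sorted"
	rm -f "$a_sorted" "$b_sorted"
}
```

Because a file is identified here by size and name only, two different files that happen to share both would be reported as identical; a checksum would be needed to tell them apart.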

Last edited by sepi; 08-25-2009 at 04:44 PM.
 
  


All times are GMT -5.
