Old 08-26-2010, 11:03 PM   #1
hanker
Need to output md5 or sha1 along with fullpath and filesize


Hi all,

I am trying to output the md5 or sha1 of each file along with its full path/filename and file size, but I can't seem to find a way to do this.

With
Quote:
find . -printf '%s %p'
I can retrieve the size, full path and filename.

However, I am not able to merge that info with the md5 or sha1 of the file.

My aim is to have a file such as this:

6435b607f86b6e6be1e77bb3b1987677d1377275 ./abc/asda/file1.txt 404
6435b607f86b6e6be1e77bb3b987677d13772725 ./abc/asda/file2.txt 1404

Also, performance is an issue for me, since I need to get the info out of 10m files (approx. 6 TB), so commands like find are preferred, and fewer iterations among commands would be great too.

Any ideas?

BTW, I've tried to use something like this:
Quote:
find . -type f -printf '%s %p'| xargs awk '{x=system("md5sum "$2)}END {print x" "$2" "$1}'
but variable x contains the return value of the system command md5sum and not its stdout.

thanks a million
Hanker
 
Old 08-27-2010, 12:46 AM   #2
Tinkster
Hi, welcome to LQ!


Something like
Code:
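#!/bin/bash
# Reads one "size path" pair from stdin and prints md5, path and size (tab-separated)
read -r size file
md5=$(md5sum "$file")
md5=${md5%% *}        # keep only the hash; md5sum also prints the filename
printf '%s\t%s\t%s\n' "$md5" "$file" "$size"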
Save as md5.sh

And then
Code:
find . -type f  -printf '%s %p\n' | xargs md5.sh
Untested. :}



Cheers,
Tink

Last edited by Tinkster; 08-27-2010 at 12:50 AM.
 
Old 08-27-2010, 12:52 AM   #3
David the H.
Well, first of all, you're right about the system function. It returns the exit value of the command, not the command's output.

But I also see a couple of other errors. First of all, take a look at the output of the find command. You need to specify newlines with printf. And the END block runs only once, after all input has been read, so it prints only the final values, not one line per file.

In any case, I don't see any easy one-liner way to accomplish what you want. But a simple loop can do the trick.
Code:
while IFS= read -r x; do

     size=${x%% *}      #everything before the first space
     file=${x#* }       #everything after the first space
     sum="$(md5sum "$file" 2>/dev/null)"
     sum=${sum%% *}     #strip off filename to get sum only*

     echo "$sum $file $size"

done <<<"$(find . -type f -printf "%s %p\n")"
*Note that the output of md5sum shows the sum, then the filename separated by two spaces. As you seem to want only a single space between each part, I've removed the filename, then printed it separately again on the final line. This step could be removed if you don't mind the extra spaces.
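
A variant of this loop that should also cope with spaces or newlines in file names can be sketched with NUL-delimited records. This assumes bash and GNU find; `print_sums` is a hypothetical name, and sha1sum is used only because the original question mentions it as an alternative.

```shell
#!/bin/bash
# Sketch: NUL-delimited version of the size/path/hash loop.
# Assumes bash and GNU find; print_sums is a hypothetical name.
print_sums() {
    local record size file sum
    # find emits "<size> <path>" followed by a NUL; read -d '' takes one such record at a time
    while IFS= read -r -d '' record; do
        size=${record%% *}                 # text before the first space
        file=${record#* }                  # text after the first space
        # hashing stdin avoids any filename escaping in the sha1sum output
        sum=$(sha1sum 2>/dev/null < "$file") || continue
        sum=${sum%% *}                     # keep only the hash
        printf '%s %s %s\n' "$sum" "$file" "$size"
    done < <(find "$1" -type f -printf '%s %p\0')
}
```

Usage would be `print_sums . > sums.txt`; output lines for names that themselves contain newlines still need care when the list is parsed later.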

Edit: Hah...tinkster got here first. But I've now added a link too.

Last edited by David the H.; 08-27-2010 at 12:57 AM.
 
Old 08-27-2010, 01:55 AM   #4
ghostdog74
With bash 4:
Code:
#!/bin/bash
shopt -s globstar
for file in /fullpath/**
do
  [[ -f $file ]] || continue    # ** also matches directories; skip them
  echo "fullpath: $file"
  echo "filename: ${file##*/}"
  md5=$(md5sum "$file")
  sha1=$(sha1sum "$file")
  echo "md5sum : ${md5%% *}"
  echo "sha1 : ${sha1%% *}"
done
 
Old 08-27-2010, 08:45 AM   #5
hanker
Thank you all.

I still have some problems, since I am running the script in Cygwin.
Basically, I'm getting an error.

I've modified the script to understand what is going on.

Here is the md5.sh:

Quote:
#!/bin/bash
read file size
echo "[$file] [$size]"
md5=$(openssl sha1 $file)
printf "[%s] [%s] [%s]\n" $md5 $file $size
The output of
Quote:
find . -type f -printf "%s %p\n"
is
Quote:
124 ./md5.sh
5 ./test.txt
while the error appears in the full command
Quote:
find . -type f -printf "%s %p\n" | xargs md5.sh
Quote:
': not a valid identifierze
[] []
] []9a3ee5e6b4b0d3255bfef95601890afd80709
I'd appreciate some more comments...

BTW Tinkster, your approach is the most scalable/effective, and that's why I'm using it.

Note that the read command seems to read only "file si" and not "file" and "size".

thanks again
Hanker
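
An error message that overwrites itself like `': not a valid identifierze` is typical of a script saved with Windows (CRLF) line endings, where `read` is handed a variable name ending in a carriage return. That diagnosis is an assumption drawn from the output shown; the thread never confirms it. A minimal sketch for checking and cleaning a script, with `has_crlf` and `strip_crlf` as hypothetical helper names:

```shell
#!/bin/bash
# Sketch: detect and strip carriage returns (CRLF line endings) from a script.
# has_crlf and strip_crlf are hypothetical helper names.
has_crlf() {
    grep -q "$(printf '\r')" "$1"      # succeeds if any line contains a carriage return
}

strip_crlf() {
    local tmp
    tmp=$(mktemp) || return 1
    # write the cleaned copy back with cat so the file keeps its permissions
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}
```

For example: `has_crlf md5.sh && strip_crlf md5.sh`. Where it is installed, the dos2unix utility does the same conversion.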

Last edited by hanker; 08-27-2010 at 08:46 AM. Reason: update
 
Old 08-27-2010, 10:01 AM   #6
grail
@Tinkster - I am going to need help with this one too, as xargs errors out for me with:
Code:
xargs: md5.sh: No such file or directory
This after running:
Code:
find -type f -printf '%s %p\n' | xargs md5.sh
I figure I am missing something but am not real familiar with xargs.
 
Old 08-27-2010, 10:08 AM   #7
grail
Quote:
Originally Posted by David the H.
Note that the output of md5sum shows the sum, then the filename separated by two spaces.
I didn't seem to get this effect. I only had one space with the following:
Code:
while read -r s f
do
	echo -n $(md5sum "$f")
	echo " $s"
	echo -n $(sha1sum "$f")
	echo " $s"
done< <(find -type f -printf '%s %p\n')
 
Old 08-27-2010, 11:09 AM   #8
hanker
@grail thanks for looking at these:
1. For the first post, you probably need to create md5.sh following Tinkster's info at the start of the thread.
2. With the script in your second post I got:
Quote:
$ ./script.sh
./script.sh: line 7: syntax error near unexpected token `done'
./script.sh: line 7: `done << (find -type f -printf '%s %p\n')'
BTW, I've tried both "done< <" and "done <<", but no luck...

Note that I'm using Cygwin to run these commands/scripts, and I've read on a few blogs that read in Cygwin bash does not behave as in ksh; that's probably why I'm stuck...

Any other ideas?

Last edited by hanker; 08-27-2010 at 11:12 AM. Reason: correction
 
Old 08-27-2010, 06:09 PM   #9
hanker
I can consider this closed on Ubuntu - the solution adopted is David the H.'s. Thanks a bunch!
 
Old 08-28-2010, 04:27 AM   #10
grail
Quote:
1. For the first post, you probably need to create md5.sh following Tinkster's info at the start of the thread.
Yes I had created the md5.sh file and it was executable, but it still did not work for me.

Yes, there should be a space between the two input redirections: < <(...)
And for me it provides output like:
Quote:
d41d8cd98f00b204e9800998ecf8427e ./file3 0
da39a3ee5e6b4b0d3255bfef95601890afd80709 ./file3 0
50e5682375f97e1fed670905889f04b9 ./logfile 90
c13c5469ac1d7b60fb02a3c42cca18b487128ce2 ./logfile 90
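
For background on the `< <(...)` form: it is an ordinary `<` redirection whose source is a bash process substitution `<(...)`, which is why the space matters (`<<` would start a here-document instead). It needs bash; a plain POSIX sh will typically report a syntax error. The sketch below compares it with a pipe, using made-up counter names:

```shell
#!/bin/bash
# Sketch: pipe versus "< <(cmd)" when feeding a while loop.
count_pipe=0
printf 'a\nb\nc\n' | while read -r line; do
    count_pipe=$((count_pipe + 1))         # runs in a subshell; the change is lost afterwards
done

count_procsub=0
while read -r line; do
    count_procsub=$((count_procsub + 1))   # runs in the current shell; the change is kept
done < <(printf 'a\nb\nc\n')

echo "pipe: $count_pipe, process substitution: $count_procsub"
```

With bash defaults this prints `pipe: 0, process substitution: 3`, since only the second loop runs in the current shell.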
 
  

