generate crc value for file(s) ?
i'm tired of googling... is there any !#&#@&$ way to generate a crc32 value for a file??? maybe something like
$ md5sum file.ext, but as $ crc32 file.ext ??? i don't know if there's a shell command that can do this or not... i can't find one, and i can't seem to find any hashing programs for linux |
I believe the command you're looking for is cksum
|
a crc32 value is 8 hex digits long... cksum gave me two values.. a 10 digit value and a 9 digit value.. neither one came close to the crc32 value of a known file i checked :(
upon cksum --help execution.. i get
or: cksum [OPTION]
Print CRC checksum and byte counts of each FILE.
--help display this help and exit
--version output version information and exit
.. if this is a crc checking option.. why am i not getting an 8 digit number? i'm confused :/ educate this ignorant newbie :) |
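Sorry about that. I thought I remembered that the last byte of a crc was the parity or some other type of property. That's why I suggested it (I was reading the 10-digit value as hex: 10 hex digits = 5 bytes - 1 byte = 4 bytes = 32 bits). That arithmetic doesn't actually hold, though: cksum prints its CRC in decimal, and the second number is the file's byte count.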
I'm also a bit fuzzy about calculating CRC. I believe it's possible to calculate it in a number of different ways. For the files you have, did the website you downloaded them from provide information on how/what was used to calculate them? Or if you got them from another individual, you might be able to ask them?

The only checksum tools I'm aware of are md5sum, cksum, and plain old sum. I checked the man pages for md5sum, and it doesn't seem like there are options to "downshift" it to 32 bits. And I think sum reports a 16-bit checksum.

<edit> When talking about CRC and saying things like crc16, crc32, and crc64, the number represents the width of the resulting checksum in bits (which is also the degree of the generator polynomial), not the size of the chunks the file is split into. In other words, crc16 produces a 16-bit value, crc32 produces a 32-bit value (8 hex digits), and so on. </edit> |
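To make the "32" concrete, here is a minimal sketch of a bit-at-a-time CRC-32 in C. It assumes the parameters that zlib and PKZIP use (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF). The function name crc32_bitwise is made up for illustration, and a real tool would more likely call a table-driven library routine. cksum uses a different CRC variant, so its result would not match this one even for the same file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (hypothetical helper name). Assumes the zlib/PKZIP
 * parameters: reflected polynomial 0xEDB88320, init and final XOR of
 * 0xFFFFFFFF. The input is consumed one byte at a time; the "32" is the
 * width of the crc register, not a chunk size. */
uint32_t crc32_bitwise(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;              /* initial register value */

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                      /* mix the next byte in */
        for (int bit = 0; bit < 8; bit++) {  /* then shift out 8 bits */
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;                /* final inversion */
}
```

Printing the result with a format like "%08lX" (after casting to unsigned long) gives the familiar 8 hex digits. For the ASCII string "123456789" this variant yields CBF43926, the commonly published check value for these parameters.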
I don't know if you've found a solution yet, but I did some more searching.

Here is a website that describes CRC error detection "painlessly": http://www.repairfaq.org/filipg/LINK/F_crc_v3.html

I also did some other checking, and noticed that the library zlib has a crc32 calculator in it. So I wrote a quickie program to use it. You can try it if you like, but I make no guarantees. If you read through the website above, it will explain how calculating a CRC can differ from one implementation to the next (specifically the polynomial chosen). Using the routines in zlib gives a little reassurance that the polynomial wasn't just plucked out of the air.

If you want to give my program a try, type this up (or copy-&-paste it) into a file. Save it as something like simple_crc.c. Actually, you can name it whatever you like, just make sure to use ".c" for the end of the filename.
Code:
#include <stdio.h>
If everything went smoothly, you have a new file simply named "simple_crc" in the current directory. To use it:
./simple_crc filename
If the program spits out a message starting with anything other than "Calculated CRC:", then there was a problem. Since this is a quickie program, all you can really do is try again or verify the filename you gave is correct. In other words, it's not very robust when it comes to handling unexpected conditions. |
thanks for sticking with me this long :) and no, i haven't found a solution to my problem yet...
i've tried the program you gave me however i'm getting different results for all my files... eg:
_____________________________________________________________________________________
[apostasy@localhost hack--DUSK]$ dir
(B-A)Hack_Legend_of_the_Twilight_-_01_(E48F18B2).mkv
(B-A)Hack_Legend_of_the_Twilight_-_02_(5023D2C9).mkv
(B-A)Hack_Legend_of_the_Twilight_-_03_(0875FB91).mkv
(B-A)Hack_Legend_of_the_Twilight_-_04_(CE26F317).mkv
[apostasy@localhost hack--DUSK]$ simple_crc \(B-A\)Hack_Legend_of_the_Twilight_-_01_\(E48F18B2\).mkv
Calculated CRC: 785D3BFE
[apostasy@localhost hack--DUSK]$ simple_crc \(B-A\)Hack_Legend_of_the_Twilight_-_02_\(5023D2C9\).mkv
Calculated CRC: E71E2733
[apostasy@localhost hack--DUSK]$ simple_crc \(B-A\)Hack_Legend_of_the_Twilight_-_03_\(0875FB91\).mkv
Calculated CRC: 87AD85D5
[apostasy@localhost hack--DUSK]$ simple_crc \(B-A\)Hack_Legend_of_the_Twilight_-_04_\(CE26F317\).mkv
Calculated CRC: 41A82D34
[apostasy@localhost hack--DUSK]$
_____________________________________________________________________________________
i have many files like this (with crc32 in their name) and unless they are all corrupted, i am not getting what i'm looking for. the reason i need to get matching crc values is that i have many files which i do not know the value of... and need to know for database purposes. I will ask around to see how these people are getting these values (what tools they are using)
btw simple_crc is awesome.. if this all gets sorted out i will be using it often :) |
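Since the expected value sits in the filename, the comparison itself can be automated. The following is only a sketch of one way to pull that value out in C. The helper name crc_from_filename and the "8 hex digits inside () or []" rule are assumptions based on the filenames shown in this thread, not a documented naming standard.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: look for a group of exactly 8 hex digits wrapped in
 * () or [] and store its value in *out. If several groups match, the last
 * one wins. Returns 1 if a group was found, 0 otherwise. */
int crc_from_filename(const char *name, uint32_t *out)
{
    int found = 0;
    size_t n;

    assert(name != NULL && out != NULL);
    n = strlen(name);
    for (size_t i = 0; i + 9 < n; i++) {
        char open = name[i];
        char close = (open == '(') ? ')' : (open == '[') ? ']' : '\0';
        int all_hex = 1;

        /* need an opening bracket here and its partner 9 chars later */
        if (close == '\0' || name[i + 9] != close)
            continue;
        for (size_t j = 1; j <= 8; j++) {
            if (!isxdigit((unsigned char)name[i + j])) {
                all_hex = 0;
                break;
            }
        }
        if (all_hex) {
            char digits[9];
            memcpy(digits, name + i + 1, 8);
            digits[8] = '\0';
            *out = (uint32_t)strtoul(digits, NULL, 16);
            found = 1;      /* keep scanning so the last match wins */
        }
    }
    return found;
}
```

A checker could then compute the file's CRC-32, call this on the filename, and report a mismatch when the two values differ.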
yea i had the same problems with that code.. didn't seem to work, could be a couple of reasons.. first of all we're CRCing files that are at least a couple hundred megs, no need to load the whole thing into memory :-/..
anyway i'm working on a better version, that might actually work ? not sure |
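The point about not loading the whole file works because a CRC can be updated incrementally. Below is a rough C sketch of that idea, not the code from either poster. The names crc32_update and crc32_stream and the 16 KB buffer size are assumptions. The update function follows the same calling convention as zlib's crc32() (start from 0, pass the previous result back in), but is written out bitwise so the sketch has no library dependency.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical incremental CRC-32 (zlib/PKZIP parameters assumed).
 * Start with crc = 0 and feed each chunk's result into the next call. */
uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc ^= 0xFFFFFFFFu;                  /* undo the previous final XOR */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Read an open stream in fixed-size chunks so memory use stays constant
 * no matter how large the file is. Returns 0 on success, -1 on read error. */
int crc32_stream(FILE *fp, uint32_t *out)
{
    unsigned char buf[16384];            /* chunk size is an arbitrary choice */
    uint32_t crc = 0;
    size_t n;

    assert(fp != NULL && out != NULL);
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        crc = crc32_update(crc, buf, n);
    if (ferror(fp))
        return -1;
    *out = crc;
    return 0;
}
```

Feeding the data in one call or in several smaller calls gives the same result, which is what makes the chunked loop safe.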
crc32.c
here's code that works for me (on a recent gentoo linux):
Code:
gw tmp # gcc -o crc32 -lz crc32.c
Code:
|
Code:
[apostasy@localhost Kiddy Grade]$ crc32 \(B-A\)Kiddy_Grade_-_01_\(892D29A1\).mkv
<EDIT>after thinking about it some i decided to try hashing some .avi files... your program works like it's supposed to... but once i start hashing 'container' formats like .ogm or .mkv, that's where i am running into problems.. for some reason the app mentioned below does not give me the same problem</EDIT>
A friend pointed me to http://calcchecksum.sourceforge.net and i'm using this app for now, yet i'm still searching for a command-line oriented tool. I've emailed the author of calcchecksum and he told me that there are plans to implement command-line functions in it, yet he is currently not actively working on the project.
Code:
[apostasy@localhost Puni Puni Poemy]$ crc32 Puni.Puni.Poemy.01.Poemi.Is.In.A.Bad.Mood.\[AXP.DVDRip\].\[Dual.Audio\].\[7A9F8C2A\].ogm
Code:
[apostasy@localhost DayDream]$ crc32 \[A-Keep\]_DayDream_OVA_01_\[F978C89B\].avi |
weird... not familiar with 'container' formats like .ogm or .mkv, but the source should work for computing correct crc32 values for any file. maybe there's a more obscure bug. according to http://www.afterdawn.com/glossary/terms/container.cfm avi is also a 'container' format. /me looks into this
|
You could try md5sum ... works for me :)
Cheers, Tink |
bug fixed :
found what appears to be a bug, also fixed option parsing.
from now on, you can grab the newest version here: http://kremlor.net/projects/crc32/index.html
here's the latest code as of now:
Code:
/******************************************************************************* |
Quote:
|
Code:
gw crc32 # nice crc32 /arc/pub/anime/00_Completed_Series_2/Lain/* |
Tinkster:
You and I think alike ;) I already checked md5sum for its available options, but couldn't find anything telling it to downshift and calculate a 32-bit CRC. It spits out a full 128-bit hash, which is a bit more than Apostasy needs.

elektron:
Take that code and run with it :) Just a pointer: I noticed you swapped some things around a bit. All those "return X" statements for unexpected error conditions were there to give a non-zero exit result for the program (just to follow traditional *nix guidelines). In your computeCRC32 function, you return 0 on those same error conditions. The only problem I see is that this value gets assigned to the crc value. In theory, 0x00000000 is a valid CRC-32 value. It might be rare (specifically a 1 in about 4 billion chance), but in such a case, you wouldn't be able to distinguish whether it's a legitimate CRC or an error.

Also, you have a slight memory leak... Here's the sequence:
1. The malloc() in generateCRC32() goes off without a hitch
2. Suppose the open() call fails (returns -1)
In that scenario, you return from generateCRC32() without freeing the memory for file_buffer. If the user supplied more than one filename on the command line, the possibility exists that you could chew through memory quickly.

Apostasy:
Did you ever find out what program the people used to calculate their CRC value? I trust the guys that wrote the zlib library, and that means the key portion of elektron's program is solid. If the value it spits out does not match what you have, then
1. There is a different polynomial used between the original CRC calculation and the one in zlib
2. The files actually are corrupt
3. The CRC does not apply to the "container" file
For #3, I'm suggesting something along the lines of a single zipped file. Say you had a gargantuan text file. You could calculate the CRC on the text file, then compress it. The CRC of the compressed file would not match the CRC of the uncompressed file. This would be a very, very odd way of distributing a CRC value, but it is possible.

As a side note, from the perspective of CRC, it should not matter whether a file is a "container", text, movie, or whatever. A CRC calculator simply looks at a file as a long string of bytes; it doesn't know and doesn't care what those bytes are used for. If there is a CRC calculator that changes the calculation based on the type of file, then that's a crappy and completely untrustworthy program.

<edit> One other thing to consider. These container formats you mentioned, are they anything like the relationship between wav files and mp3s? What I'm getting at is this (along the lines of #3 again). Say you downloaded a wav file with the crc embedded in the filename ("Sir_Mix_A_Lot-Baby_Got_Back(12AB34CD).wav"). If you encode that wav file to mp3, usually only the extension changes ("Sir_Mix_A_Lot-Baby_Got_Back(12AB34CD).mp3"). However, the CRC value for the mp3 will not be the same as the CRC for the wav file, meaning the filename is giving misleading information. So, if the container files you have went through some sort of similar process, that can explain why you get different CRC values. </edit> |
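The two pointers for elektron (0 is a valid CRC, and the buffer leaks when the open fails) can both be handled by returning a status code separately from the CRC and by funnelling every exit through one cleanup point. This is a sketch of that pattern only, not elektron's actual code. The names file_crc32 and crc32_update, the use of fopen() instead of open(), and the chunk size are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

enum { CHUNK_SIZE = 16384 };   /* arbitrary read size for this sketch */

/* Bitwise CRC-32 update (zlib/PKZIP parameters assumed), start from 0. */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc ^= 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 0 on success and -1 on any failure. The CRC travels through
 * *out, so a legitimate CRC of 0x00000000 is never mistaken for an
 * error. *out is left untouched on failure. */
int file_crc32(const char *path, uint32_t *out)
{
    int status = -1;
    uint32_t crc = 0;
    size_t n;
    FILE *fp = NULL;
    unsigned char *buf;

    assert(path != NULL && out != NULL);
    buf = malloc(CHUNK_SIZE);
    if (buf == NULL)
        goto done;
    fp = fopen(path, "rb");
    if (fp == NULL)
        goto done;              /* buf is still freed below: no leak */
    while ((n = fread(buf, 1, CHUNK_SIZE, fp)) > 0)
        crc = crc32_update(crc, buf, n);
    if (ferror(fp))
        goto done;
    *out = crc;
    status = 0;
done:
    if (fp != NULL)
        fclose(fp);
    free(buf);                  /* free(NULL) is a harmless no-op */
    return status;
}
```

A caller looping over several filenames can then print an error and carry on when the return value is non-zero, and use a non-zero exit code at the end if any file failed.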