Space Hound 4 - [ About How Files are Compared ]
About How Files are Compared
Versions of Space Hound prior to version 4 used a technique called CRC (cyclic redundancy check) to determine whether two specific files are equal to each other. A CRC check applies a mathematical formula to a file to produce a short checksum. The result depends on the size of the file and the value of each bit within it.
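As a rough illustration of this kind of check, the sketch below computes a size-plus-checksum signature for a file using the CRC-32 routine in Python's standard `zlib` module. The CRC variant, the function name `crc32_signature`, and the chunk size are assumptions made for the example; this page does not say which CRC formula earlier versions of Space Hound used.

```python
import zlib

def crc32_signature(path, chunk_size=65536):
    """Return (size_in_bytes, crc32) for the file at `path`.

    Hypothetical helper: CRC-32 is an assumed stand-in for whatever
    CRC formula a given tool actually uses.
    """
    crc = 0
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so a large file is never loaded into memory whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # fold this chunk into the running CRC
            size += len(chunk)
    return size, crc
```

Two files would be treated as equal when their signatures match. Because the checksum is only 32 bits wide, two different files of the same size can occasionally share a signature.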
In the ten years since the product's initial release, there was only one reported instance where two different files of the same size produced the same CRC result. This occurred when a user found a set of files that were used with Microsoft's FrontPage web design product. After researching this phenomenon, the only explanation we could find was that these files had been specifically engineered by Microsoft so that they could be selectively overlaid into FrontPage itself without triggering anti-virus defenses that also use CRC checking to detect altered or otherwise dangerous files.
Most Duplicate File Finders on the market today use CRC checking to determine whether two files are identical. With a checksum, it is not necessary to perform an extremely time-consuming bit-by-bit comparison of all files that happen to share the same file size.
However, since Space Hound has always led the field in Duplicate File Finders, we felt that one case of specially engineered files was one too many, so we have adopted a new method called MD5.
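To make the idea concrete, here is a minimal sketch of an MD5-based duplicate search written with Python's standard `hashlib` module. It is not Space Hound's actual implementation: the function names `md5_digest` and `find_duplicates`, and the choice to group by file size before hashing, are assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict

def md5_digest(path, chunk_size=65536):
    """Return the MD5 digest of a file as a hex string."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(paths):
    """Return lists of paths that share both a size and an MD5 digest."""
    # Step 1 (assumed): files of different sizes cannot be duplicates,
    # so bucket by size first and skip sizes that occur only once.
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Step 2: hash each remaining file exactly once and bucket by digest.
        by_digest = defaultdict(list)
        for p in same_size:
            by_digest[md5_digest(p)].append(p)
        groups.extend(g for g in by_digest.values() if len(g) > 1)
    return groups
```

Each file is read at most once, and files with a unique size are never read at all.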
The MD5 specification (RFC 1321) says:
"It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given pre-specified target message digest."
Using MD5 instead of CRC means that exact duplicate searching will be a bit slower in this edition of Space Hound than in previous editions. However, user safety is our primary concern, and performance improvements in other areas (such as total search time) compensate for this overhead.
Also, a note about MD5 versus Byte-by-Byte comparison tools: Space Hound includes a Byte-by-Byte comparison tool for taking that extra step (if necessary). However, you don't want a tool that performs this during the search process. Why? Consider the situation where you have 50 files that need to be compared Byte-by-Byte. File #1 must be checked against all of the other 49. File #2 must also be checked against each of the others. The more files you have that undergo individual testing against other potential duplicates, the longer the process is going to take. An MD5 value, by contrast, is computed once per file, and the resulting values can then be matched directly. MD5 computations are highly reliable and, as the author of the program, I challenge anyone to show me two files of the same size with identical MD5 values that are not identical Byte-by-Byte.
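The arithmetic behind the 50-file example can be sketched as follows. Counting each pair of files once, n files need n(n-1)/2 direct comparisons in the worst case, against n digest computations. The helper names are hypothetical, and `confirm_identical` uses Python's standard `filecmp` module as a stand-in for a Byte-by-Byte tool, not Space Hound's own.

```python
import filecmp
from math import comb

def pairwise_comparisons(n):
    """Number of distinct file pairs among n files, i.e. n * (n - 1) / 2."""
    return comb(n, 2)

def confirm_identical(path_a, path_b):
    """Compare two files by content rather than by size and timestamp.

    shallow=False makes filecmp read both files instead of trusting
    os.stat() metadata.
    """
    return filecmp.cmp(path_a, path_b, shallow=False)

if __name__ == "__main__":
    n = 50
    print(f"{n} files: {pairwise_comparisons(n)} pairs to compare, "
          f"or {n} digests to compute")
```

A content comparison like `confirm_identical` is best kept for the handful of files that a search has already matched, for example when there is reason to think files were deliberately crafted to share a digest.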