Space Hound 4 - [ About Microsoft Word Duplicate Content Search ]
About Microsoft Word Duplicate Content Search (From Specialty Reports)
Duplicate Microsoft Word files often cannot be found with ordinary duplicate file tools because, at the byte level, they are frequently different files even when their text is identical. Microsoft Word documents are stored in a format called Compound Document format, which holds more than the visible text.
Note: This feature requires a local installation of Microsoft Word from the Office XP edition or later.
A recent test document containing a single sentence of 35 letters consumed over 26,000 bytes of file space. All Microsoft Word documents include a block of approximately 26,000 bytes that stores statistical and informational values such as total edit time, date last printed, and previous revision information.
This information can vary between two documents, and the informational blocks may not even be the same size. This is why other duplicate file finders often miss two Word documents with identical textual content.
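The effect can be illustrated with a small simulation. This is only a sketch: `make_fake_doc` and its `edit_minutes` field are hypothetical stand-ins for a metadata block, not the real Compound Document layout, and the hashing step represents how a typical byte-based duplicate finder might compare files.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Hash the raw bytes, as a byte-based duplicate finder might."""
    return hashlib.sha256(data).hexdigest()


def make_fake_doc(text: str, edit_minutes: int) -> bytes:
    """Build a simulated document: a metadata prefix followed by the text.

    Hypothetical layout for illustration only; real Word files are
    structured very differently.
    """
    metadata = f"edit_minutes={edit_minutes};".encode("utf-8")
    return metadata + text.encode("utf-8")


# Same sentence, different editing history.
doc_a = make_fake_doc("The quick brown fox.", edit_minutes=3)
doc_b = make_fake_doc("The quick brown fox.", edit_minutes=125)

same_bytes = file_digest(doc_a) == file_digest(doc_b)
same_size = len(doc_a) == len(doc_b)

print("identical bytes:", same_bytes)
print("identical size:", same_size)
```

Because the simulated metadata differs in both value and length, neither a hash comparison nor a file size comparison treats the two documents as duplicates, even though their text is the same.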
The Microsoft Word Duplicate Content Search in Space Hound is designed to find files based on what they contain, regardless of file name, file size, or the other miscellaneous information stored within them.
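The general idea of content-based matching can be sketched as follows. This is not Space Hound's actual implementation, which is not described here. The `extract_text` callable is a hypothetical placeholder for whatever obtains the text of a document (for example, through Microsoft Word), and collapsing whitespace before hashing is an assumption made for this sketch.

```python
import hashlib
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def content_key(text: str) -> str:
    """Hash the document text only, ignoring name, size, and metadata.

    Whitespace is collapsed first (an assumption of this sketch) so that
    trivial spacing differences do not hide a duplicate.
    """
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicate_groups(
    paths: Iterable[str],
    extract_text: Callable[[str], str],
) -> List[List[str]]:
    """Group paths whose extracted text is identical.

    `extract_text` is a hypothetical callable supplied by the caller.
    Only groups with two or more members are returned.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        groups[content_key(extract_text(path))].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Usage with an in-memory stand-in for real documents.
fake_docs = {
    "report.doc": "Quarterly results were strong.",
    "copy of report.doc": "Quarterly  results were strong.\n",
    "notes.doc": "Remember to call the supplier.",
}
duplicates = find_duplicate_groups(fake_docs, fake_docs.__getitem__)
print(duplicates)
```

In this usage example the two report files land in the same group despite their different names and spacing, while `notes.doc` is left out because nothing else shares its text.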
Special Instructions:
This tool may give unsatisfactory results in some situations, mostly those involving damaged Microsoft Word documents. A damaged document can cause unpredictable results within Microsoft Word itself. In many cases, Space Hound will detect these problems and bypass the affected documents. Each bypassed document is recorded in an Exceptions List (a file named EXCEPTION.TXT) stored in the folder where Space Hound is installed.
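The bypass-and-record behavior can be pictured with a short sketch. It assumes a hypothetical `extract_text` callable that raises an exception for a damaged document, and the one-line-per-file layout written here is an illustration only; the actual format of EXCEPTION.TXT is not specified above.

```python
from typing import Callable, Dict, Iterable


def scan_with_exceptions(
    paths: Iterable[str],
    extract_text: Callable[[str], str],
    exceptions_path: str,
) -> Dict[str, str]:
    """Extract text from each document, bypassing any that fail.

    Bypassed paths are written to `exceptions_path`, one per line with
    the error message (an assumed layout, not Space Hound's own).
    """
    texts: Dict[str, str] = {}
    bypassed = []
    for path in paths:
        try:
            texts[path] = extract_text(path)
        except Exception as exc:  # a damaged file can fail in many ways
            bypassed.append(f"{path}\t{exc}")
    if bypassed:
        with open(exceptions_path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(bypassed) + "\n")
    return texts


def fake_extract(path: str) -> str:
    """Hypothetical extractor that treats one file as damaged."""
    if path == "damaged.doc":
        raise ValueError("document could not be opened")
    return f"text of {path}"
```

Calling `scan_with_exceptions(["good.doc", "damaged.doc"], fake_extract, "EXCEPTION.TXT")` would return the text for `good.doc` only and leave a single line naming `damaged.doc` in the exceptions file.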