Known documents can be identified with a much more robust approach than conventional hash values. Even if a document was stored in a different file format (e.g. first PPT, then PPTX, then PDF), it can still be recognized. Changes that result from a "Save as" or from printing (which may update a "last printed" timestamp) do not prevent identification either. Very often a document can still be recognized even if text was inserted, removed, reordered, or revised. The technology is called FuzZyDoc.

FuzZyDoc hash values are stored in yet another hash database in X-Ways Forensics. So there are now 5 hash databases available in total, and counting. Hash sets based on selected documents can be added to the FuzZyDoc database exactly as hash sets are created in ordinary hash databases, and the FuzZyDoc hash database can be managed in the same dialog window as the other hash databases, so existing users will have no trouble locating and using the new functionality. For each selected document you can create 1 separate hash set, or you can create 1 hash set for all selected documents. Up to 65,535 hash sets are supported in a FuzZyDoc hash database.

FuzZyDoc is available to all users of X-Ways Forensics and X-Ways Investigator. It should work well with documents in practically all Western and Eastern European languages and many Asian languages (e.g. Chinese, Japanese, Korean, Indonesian, Malay, Tamil, Tagalog), but not Thai, Divehi, Tibetan, Punjabi. Note that the algorithm does not exploit numbers in spreadsheet cells, only text. Note also that only files with a confirmed or newly identified type will be matched against the FuzZyDoc hash database. For that reason, file type verification is applied automatically when FuzZyDoc matching is requested.

Documents whose contents are largely identical (e.g. invoices created by the same company with the same letterhead) are considered similar by the algorithm even if important details change (billing address, price), depending on the amount of identical text. That means that if you have 1 copy of an invoice of a company, matching against unknown documents will easily identify other invoices of the same company. For every document that is matched against the database, up to 4 matching hash sets are returned; if more than 4 match, the 4 best matching hash sets are picked. For every matching hash set, X-Ways Forensics also presents a percentage that roughly indicates to what degree the contents of the document match the hash set. For example, 100% means that all the textual contents that X-Ways Forensics deemed relevant in the given document can also be found in the hash set, and 50% means half of the contents. 100% does not rule out the possibility that the document(s) that the hash set is based on contain(s) much more (other) text. The matching percentage does not count characters one by one, and it works only on documents that actually make sense, not on small test files that only contain a few words.

Before matching files against the FuzZyDoc hash database (a new operation of Specialist | Refine Volume Snapshot), you can specify which types of files you would like to analyze, and you can unselect hash sets in the database that you are temporarily not interested in. Specifying fewer file types in the mask will of course require proportionally less time, but selecting fewer hash sets for matching does not as such save time.
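
To picture how a matching percentage and the limit of 4 returned hash sets could behave, here is a minimal Python sketch. It is not the FuzZyDoc algorithm, which the text above does not describe. It swaps in a plain word-shingle containment measure, and every name in it (`shingles`, `containment`, `best_matches`) is hypothetical and chosen only for illustration.

```python
import re
from heapq import nlargest


def shingles(text, k=3):
    """Return the set of k-word shingles found in text.

    Assumption for this sketch: only letters are kept, so pure numbers
    are ignored, loosely mirroring "only text, not numbers".
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    if len(words) < k:
        # Too little text to say anything meaningful.
        return set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(doc_shingles, hash_set_shingles):
    """Percentage of the document's shingles that also occur in the hash set."""
    if not doc_shingles:
        return 0.0
    shared = len(doc_shingles & hash_set_shingles)
    return 100.0 * shared / len(doc_shingles)


def best_matches(document, hash_sets, limit=4):
    """Return up to `limit` (name, percentage) pairs, best matches first.

    `hash_sets` maps a hash set name to a set of shingles.
    Hash sets that share nothing with the document are not reported.
    """
    doc = shingles(document)
    scored = [(name, containment(doc, known)) for name, known in hash_sets.items()]
    matching = [(name, pct) for name, pct in scored if pct > 0.0]
    return nlargest(limit, matching, key=lambda item: item[1])


if __name__ == "__main__":
    known = {
        "invoice": shingles(
            "Acme Ltd invoice payment due within thirty days "
            "thank you for your business"
        ),
        "memo": shingles("internal memo regarding the quarterly budget review"),
    }
    print(best_matches("Acme Ltd invoice payment due within thirty days", known))
```

The percentage here is measured relative to the document being examined, not to the hash set, so a short document fully contained in a much longer known document still scores 100%. That mirrors the remark that 100% does not rule out much more text on the hash set side.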