An Efficient Similarity Digests Database Lookup – A Logarithmic Divide & Conquer Approach

Details

Serval ID
serval:BIB_5BC283220ABB
Type
Article: article from journal or magazin.
Collection
Publications
Title
An Efficient Similarity Digests Database Lookup – A Logarithmic Divide & Conquer Approach
Journal
Journal of Digital Forensics, Security and Law
Author(s)
Breitinger Frank, Rathgeb Christian, Baier Harald
ISSN
1558-7223
Publication state
Published
Issued date
2014
Volume
9
Number
2
Language
english
Abstract
Investigating seized devices within digital forensics represents a challenging task due to the increasing amount of data. Common procedures utilize automated file identification, which reduces the amount of data an investigator has to examine manually. In the past years the research field of approximate matching arises to detect similar data. However, if n denotes the number of similarity digests in a database, then the lookup for a single similarity digest is of complexity of O(n).
This paper presents a concept to extend existing approximate matching algorithms, which reduces the lookup complexity from O(n) to O(log(n)). Our proposed approach is based on the well-known divide and conquer paradigm and builds a Bloom filter-based tree data structure in order to enable an efficient lookup of similarity digests. Further, it is demonstrated that the presented technique is highly scalable operating a trade-off between storage requirements and computational efficiency. We perform a theoretical assessment based on recently published results and reasonable magnitudes of input data, and show that the complexity reduction achieved by the proposed technique yields a 220-fold acceleration of look-up costs.
Keywords
Digital forensics, hashing, approximate matching, Bloom filter, mrsh-v2, sdhash, index- ing
Open Access
Yes
Create date
06/05/2021 11:01
Last modification date
06/05/2021 11:31
Usage data