Automated evaluation of approximate matching algorithms on real data

Breitinger, Frank; Roussev, Vassil

doi:10.1016/j.diin.2014.03.002

Details

Serval ID

serval:BIB_200E1BD52A3B

Type

Article: article from a journal or magazine.

Collection

Publications

Institution

External production

Title

Automated evaluation of approximate matching algorithms on real data

Journal

Digital Investigation

Authors

Breitinger Frank, Roussev Vassil

ISSN

1742-2876

Publication status

Published

Publication date

05/2014

Volume

Issue

Pages

S10-S17

Language

English

Abstract

Bytewise approximate matching is a relatively new area within digital forensics, but its importance is growing quickly as practitioners look for fast methods to screen and analyze the increasing amounts of data in forensic investigations. The essential idea is to complement cryptographic hash functions, which detect data objects with bytewise identical representations, with the capability to find objects with bytewise similar representations. Unlike cryptographic hash functions, which have been studied and tested for a long time, approximate matching functions are still in their early development stages, and the evaluation methodology is still evolving. Broadly, prior approaches have used either a human in the loop to manually evaluate the quality of similarity matches on real-world data, or controlled (pseudo-random) data to perform automated evaluation. This work's contribution is to introduce automated approximate matching evaluation on real data by relating approximate matching results to the longest common substring (LCS). Specifically, we introduce a computationally efficient LCS approximation and use it to obtain ground truth on the t5 set. Using the results, we evaluate three existing approximate matching schemes relative to LCS and analyze their performance.
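
To illustrate the quantity the abstract uses as ground truth, the following is a minimal sketch of an exact longest common substring computation over byte strings. It is a textbook dynamic-programming version with quadratic running time, not the computationally efficient approximation the paper introduces; the function name `longest_common_substring` is a hypothetical choice made for this example.

```python
def longest_common_substring(a: bytes, b: bytes) -> int:
    """Return the length of the longest contiguous byte run shared by a and b.

    Illustrative exact algorithm (O(len(a) * len(b)) time); this is an
    assumption-level sketch, not the approximation described in the paper.
    """
    best = 0
    # prev[j] holds the length of the common run ending at a[i-2] and b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended one byte earlier in both inputs.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


if __name__ == "__main__":
    print(longest_common_substring(b"forensic data", b"basic forensics"))
```

A similarity score reported by an approximate matching tool could then be compared against this length (for instance, relative to the size of the smaller object), which is the general idea of relating tool output to LCS.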

Keywords

Law, Medical Laboratory Technology, Computer Science Applications

DOI

10.1016/j.diin.2014.03.002

Web of Science

000335438900002

Publisher's website

http://www.sciencedirect.com/science/article/pii/S1742287614000073

Open Access

Yes

Record created

06/05/2021 12:01