The impact of excluding common blocks for approximate matching

Moia, Vitor Hugo Galhardo; Breitinger, Frank; Henriques, Marco Aurélio Amaral

doi:10.1016/j.cose.2019.101676

Details

Serval ID

serval:BIB_203F1FD91074

Type

Article: article from a periodical or magazine.

Collection

Publications

Institution

External production

Title

The impact of excluding common blocks for approximate matching

Journal

Computers & Security

Authors

Moia Vitor Hugo Galhardo, Breitinger Frank, Henriques Marco Aurélio Amaral

ISSN

0167-4048

Publication status

Published

Publication date

02/2020

Volume

Pages

101676

Language

English

Abstract

Approximate matching functions allow similarity to be identified efficiently at the bytewise level by creating and comparing compact representations of objects (a.k.a. digests). However, many similarity matches are due to common data that repeats across many different files and consists of inner structure, header and footer information, color tables, font specifications, etc.; that is, data created by applications rather than generated by users. Most of the time, this sort of information is less relevant from an investigator's perspective and should be avoided. In this work, we show how common data can be identified and filtered out using approximate matching, how it is spread over different file types, and how frequently it occurs. We assess the impact of removing it on similarity (i.e., on the number of matches) and the effects on performance. Our results show that, for a small price in performance, a reduction of about 87% in the number of matches can be achieved when removing such data.
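
The idea of filtering common blocks before comparing files can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paper works with approximate matching digests, whereas this sketch substitutes exact SHA-1 hashes of fixed-size blocks to keep the example short. The block size, the threshold, and all function names are hypothetical choices made for illustration only.

```python
import hashlib
from collections import Counter

BLOCK_SIZE = 64        # hypothetical block size in bytes
COMMON_THRESHOLD = 2   # hypothetical: a block is "common" if it occurs in more files than this


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return the set of SHA-1 hashes of the fixed-size blocks of `data`."""
    return {
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    }


def find_common_blocks(files, threshold=COMMON_THRESHOLD):
    """Return hashes of blocks that appear in more than `threshold` distinct files.

    `files` maps a file name to its content as bytes.
    """
    counts = Counter()
    for data in files.values():
        # block_hashes returns a set, so each file counts a block at most once
        counts.update(block_hashes(data))
    return {h for h, n in counts.items() if n > threshold}


def similarity(a, b, common=frozenset()):
    """Jaccard similarity of two byte strings over their non-common blocks."""
    ha = block_hashes(a) - common
    hb = block_hashes(b) - common
    if not ha or not hb:
        return 0.0
    return len(ha & hb) / len(ha | hb)
```

In this simplified setting, two files that share only an application-generated header would match when compared directly, but the match disappears once the header block has been identified as common across the collection and passed to `similarity` through the `common` argument.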

Keywords

General Computer Science, Law

DOI

10.1016/j.cose.2019.101676

Publisher's website

http://www.sciencedirect.com/science/article/pii/S0167404819302159

Record created

06/05/2021 12:01

Record last modified

06/05/2021 12:43
