Investigating Graph Embedding Methods for Cross-Platform Binary Code Similarity Detection

Details

Resource 1 Download: EuroSP_final.pdf (1551.73 KB)
State: Public
Version: author
License: Not specified
Serval ID
serval:BIB_36DEE8E777F9
Type
Inproceedings: an article in a conference proceedings.
Collection
Publications
Institution
Title
Investigating Graph Embedding Methods for Cross-Platform Binary Code Similarity Detection
Title of the conference
Proceedings of the IEEE European Symposium on Security and Privacy (EuroS&P)
Author(s)
Cochard Victor, Pfammatter Damian, Duong Chi Thang, Humbert Mathias
Publication state
Published
Issued date
08/06/2022
Peer-reviewed
Yes
Language
English
Abstract
IoT devices are increasingly present in both industry and consumer markets, but their security remains weak, which leads to an unprecedented number of attacks against them. To reduce the attack surface, one approach is to analyze the binary code of these devices to detect early whether they contain potential security vulnerabilities. More specifically, given a known vulnerable function, we can determine whether the firmware of an IoT device contains the same security flaw by searching for this function. However, searching for similar vulnerable functions is generally challenging because the source code is often not openly available and because it can be compiled for different architectures, using different compilers and compilation settings. To handle these varying settings, we can compare the similarity between the graph embeddings derived from the binary functions. In this paper, inspired by recent advances in deep learning, we propose a new method, GESS (graph embeddings for similarity search), to derive graph embeddings, and we compare it with various state-of-the-art methods. Our empirical evaluation shows that GESS reaches an AUC of 0.979, thereby outperforming the best known approach. Furthermore, for a fixed low false positive rate, GESS provides a true positive rate (or recall) about 36% higher than the best previous approach. Finally, for a large search space, GESS provides a recall between 50% and 60% higher than the best previous approach.
Create date
10/06/2022 14:39
Last modification date
11/06/2022 7:09