Our work on synthesizing semantic malware signatures has been accepted at NDSS’17! Here is the abstract:
This paper proposes a technique for automatically learning semantic malware signatures for Android from very few samples of a malware family. The key idea underlying our technique is to look for a maximally suspicious common subgraph (MSCS) that is shared between all known instances of a malware family. An MSCS describes the shared functionality between multiple Android applications in terms of inter-component call relations and their semantic metadata (e.g., data-flow properties). Our approach identifies such maximally suspicious common subgraphs by reducing the problem to maximum satisfiability. Once a semantic signature is learned, our approach uses a combination of static analysis and a new relaxed signature matching algorithm to determine whether an Android application matches the semantic signature characterizing a given malware family. We have implemented our approach in a tool called ASTROID and show that it has a number of advantages over state-of-the-art malware detection techniques. First, we compare the semantic malware signatures automatically synthesized by ASTROID with manually-written signatures used in previous work and show that the signatures learned by ASTROID perform better in terms of accuracy as well as precision. Second, we compare ASTROID against two state-of-the-art malware detection tools and demonstrate its advantages in terms of interpretability and accuracy. Finally, we demonstrate that ASTROID’s relaxed signature matching algorithm is resistant to behavioral obfuscation and that it can be used to detect zero-day malware. In particular, we were able to find 22 instances of zero-day malware in Google Play, and none of these malicious apps are reported as malware by existing tools.
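To make the idea of a maximally suspicious common subgraph a bit more concrete, here is a minimal toy sketch in Python. It is not ASTROID's implementation: the paper reduces the search to maximum satisfiability, whereas this sketch simply brute-forces every component mapping between two tiny apps. The graph encoding, the edge labels, the `WEIGHTS` table, and the function names are all assumptions made up for illustration.

```python
from itertools import permutations

# Hypothetical suspiciousness weights per edge label (illustrative only).
WEIGHTS = {"icc_call": 1, "flow:device_id->internet": 5}


def score(edges):
    """Sum the suspiciousness weights of a set of (src, label, dst) edges."""
    return sum(WEIGHTS.get(label, 0) for _, label, _ in edges)


def most_suspicious_common_subgraph(app_a, app_b):
    """Brute-force stand-in for the MaxSAT encoding; only viable for tiny graphs.

    Each app is (nodes, edges): nodes maps a component name to its type,
    edges is a set of (src, label, dst) triples. The returned edges use the
    component names of whichever app has fewer components.
    """
    nodes_a, edges_a = app_a
    nodes_b, edges_b = app_b
    if len(nodes_a) > len(nodes_b):  # map the smaller app into the larger one
        nodes_a, edges_a, nodes_b, edges_b = nodes_b, edges_b, nodes_a, edges_a
    names_a = sorted(nodes_a)
    best_score, best_edges = 0, set()
    # Try every injective mapping of the smaller app's components.
    for image in permutations(sorted(nodes_b), len(names_a)):
        mapping = dict(zip(names_a, image))
        # Keep an edge only if both endpoints map to components of the same
        # type and the mapped edge (same label) exists in the other app.
        shared = {
            (s, l, d)
            for s, l, d in edges_a
            if nodes_a[s] == nodes_b[mapping[s]]
            and nodes_a[d] == nodes_b[mapping[d]]
            and (mapping[s], l, mapping[d]) in edges_b
        }
        if score(shared) > best_score:
            best_score, best_edges = score(shared), shared
    return best_score, best_edges


# Two made-up samples of the same hypothetical malware family.
app1 = (
    {"Main": "activity", "Spy": "service", "Boot": "receiver"},
    {("Boot", "icc_call", "Spy"),
     ("Spy", "flow:device_id->internet", "Spy"),
     ("Main", "icc_call", "Spy")},
)
app2 = (
    {"A": "activity", "S": "service", "R": "receiver", "Ads": "service"},
    {("R", "icc_call", "S"),
     ("S", "flow:device_id->internet", "S"),
     ("A", "icc_call", "Ads")},
)
best_score, signature = most_suspicious_common_subgraph(app1, app2)
```

In this toy example the shared behavior is a boot receiver that starts a service which leaks the device ID, and that is what ends up in `signature`. Enumerating mappings grows factorially with the number of components, which is one reason a solver-based formulation like the one described in the abstract is attractive for real apps.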