A software birthmark is a unique characteristic of a program. Comparing the birthmarks of the plaintiff and defendant programs therefore provides an effective approach to software plagiarism detection. However, software birthmark generation faces two main challenges: the absence of source code, and the variety of code obfuscation techniques that attempt to hide the characteristics of a program. In this paper, we propose a new type of software birthmark called DYKIS (DYnamic Key Instruction Sequence) that can be extracted from an executable without the need for source code.
The plagiarism detection algorithm based on our new birthmark is resilient to both weak obfuscation techniques, such as compiler optimizations, and strong obfuscation techniques implemented in tools such as SandMark, Allatori and UPX. We have developed a tool called DYKIS-PD (DYKIS Plagiarism Detection tool) and conducted extensive experiments on a large number of binary programs. The tool, the benchmarks and the experimental results are all publicly available.