Finding Substitutable Binary Code By Synthesizing Adapters

Vaibhav Sharma, Kesha Hietala, Stephen McCamant

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Independently developed codebases typically contain many segments of code that perform the same or closely related operations (semantic clones). Finding functionally equivalent segments enables applications such as replacing a segment with a more efficient or more secure alternative. Such related segments often have different interfaces, so some glue code (an adapter) is needed to replace one with the other. We present an algorithm that searches for replaceable code segments by attempting to synthesize an adapter between them from a finite family of adapters; it terminates if it finds no possible adapter. We implement our technique using concrete adapter enumeration based on Intel's Pin framework and binary symbolic execution, and explore the relation between the size of the adapter search space and the total search time. We present examples of applying adapter synthesis to improve the security of binary functions and to switch between binary implementations of RC4. We present two large-scale evaluations: (1) we run adapter synthesis on more than 13,000 function pairs from the Linux C library, and (2) we reverse engineer fragments of ARM binary code by running more than a million adapter synthesis tasks. Our results confirm that several instances of adaptably equivalent binary functions exist in real-world code, and suggest that adapter synthesis can be applied to automatically replace binary code with its adaptably equivalent variants.
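The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it works on Python functions rather than binary code, and it replaces binary symbolic execution with an exhaustive check over a small integer domain. The adapter family (each inner argument is fed by a target argument or a small constant) and all names such as `synthesize_adapter`, `increment`, and `add` are assumptions chosen for illustration.

```python
from itertools import product


def synthesize_adapter(target, inner, n_target_args, n_inner_args,
                       constants=(0, 1), domain=range(-4, 5)):
    """Search a finite adapter family for one that makes `inner` behave like `target`.

    Hypothetical family: every argument of `inner` is either one of the
    target's arguments ("arg", i) or a constant ("const", c).
    Returns the adapter as a tuple of sources, or None if none exists.
    """
    sources = [("arg", i) for i in range(n_target_args)]
    sources += [("const", c) for c in constants]
    family = list(product(sources, repeat=n_inner_args))

    def adapt(adapter, args):
        # Build the inner function's argument tuple from the target's arguments.
        return tuple(args[v] if kind == "arg" else v for kind, v in adapter)

    def agrees(adapter, args):
        return target(*args) == inner(*adapt(adapter, args))

    tests = []  # counterexamples collected so far
    while True:
        # Step 1: pick any adapter consistent with every collected test.
        candidate = next(
            (a for a in family if all(agrees(a, t) for t in tests)), None)
        if candidate is None:
            return None  # family exhausted: no adapter exists
        # Step 2: look for an input on which the candidate is wrong.
        # (Bounded enumeration stands in for symbolic execution here.)
        counterexample = next(
            (args for args in product(domain, repeat=n_target_args)
             if not agrees(candidate, args)), None)
        if counterexample is None:
            return candidate  # equivalent on the whole checked domain
        tests.append(counterexample)


def increment(x):
    return x + 1


def add(a, b):
    return a + b


# Adapting add(a, b) to stand in for increment(x): a <- x, b <- 1.
print(synthesize_adapter(increment, add, 1, 2))
```

Each counterexample rules out at least the current candidate, and the family is finite, so the loop always terminates, either with an adapter or with `None`. An adapter returned by this sketch is only verified on the bounded domain that was checked, which is a much weaker guarantee than a symbolic equivalence check.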

Original language: English (US)
Journal: IEEE Transactions on Software Engineering
State: Accepted/In press - 2019

Keywords

  • Binary codes
  • Computer science
  • Libraries
  • Reverse engineering
  • Security
  • Task analysis
  • Tools
