On the feasibility of automated built-in function modeling for PHP symbolic execution

Penghui Li, Wei Meng, Kangjie Lu, Changhua Luo

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Symbolic execution has been widely applied in detecting vulnerabilities in web applications. Modeling language-specific built-in functions is essential for symbolic execution. Since built-in functions tend to be complicated and are typically implemented in low-level languages, a common strategy is to manually translate them into the SMT-LIB language for constraint solving. Such translation requires an excessive amount of human effort and deep understandings of the function behaviors. Incorrect translation can invalidate the final results. This problem aggravates in PHP applications because of their cross-language nature, i.e., , the built-in functions are written in C, but the rest code is in PHP. In this paper, we explore the feasibility of automating the process of modeling PHP built-in functions for symbolic execution. We synthesize C programs by transforming the constraint solving task in PHP symbolic execution into a C-compliant format and integrating them with C implementations of the built-in functions. We apply symbolic execution on the synthesized C program to find a feasible path, which gives a solution that can be applied to the original PHP constraints. In this way, we automate the modeling of built-in functions in PHP applications. We thoroughly compare our automated method with the state-of-the-art manual modeling tool. The evaluation results demonstrate that our automated method is more accurate with a higher function coverage, and can exploit a similar number of vulnerabilities. Our empirical analysis also shows that the manual and automated methods have different strengths, which complement each other in certain scenarios. Therefore, the best practice is to combine both of them to optimize the accuracy, correctness, and coverage of symbolic execution.

Original languageEnglish (US)
Title of host publicationThe Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021
PublisherAssociation for Computing Machinery, Inc
Pages58-69
Number of pages12
ISBN (Electronic)9781450383127
DOIs
StatePublished - Apr 19 2021
Event2021 World Wide Web Conference, WWW 2021 - Ljubljana, Slovenia
Duration: Apr 19 2021Apr 23 2021

Publication series

NameThe Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021

Conference

Conference2021 World Wide Web Conference, WWW 2021
Country/TerritorySlovenia
CityLjubljana
Period4/19/214/23/21

Bibliographical note

Funding Information:
The work was partly supported by a grant from the Research Grants Council of the Hong Kong SAR, China (CUHK 14210219). Kangjie Lu was supported in part by the NSF awards CNS-1815621 and CNS-1931208. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the RGC of HK or the NSF.

Publisher Copyright:
© 2021 ACM.

Keywords

  • Constraint solving
  • PHP
  • Symbolic execution

Fingerprint

Dive into the research topics of 'On the feasibility of automated built-in function modeling for PHP symbolic execution'. Together they form a unique fingerprint.

Cite this