Input prioritization for testing neural networks

Taejoon Byun, Vaibhav Sharma, Abhishek Vijayakumar, Sanjai Rayadurgam, Darren Cofer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

48 Scopus citations

Abstract

Deep neural networks (DNNs) are increasingly being adopted for sensing and control functions in a variety of safety- and mission-critical systems such as self-driving cars, autonomous air vehicles, medical diagnostics, and industrial robotics. Failures of such systems can lead to loss of life or property, which necessitates stringent verification and validation to provide high assurance. Though formal verification approaches are being investigated, testing remains the primary technique for assessing the dependability of such systems. Due to the nature of the tasks handled by DNNs, the cost of obtaining test oracle data (the expected output, or label, for a given input) is high, which significantly limits the amount and quality of testing that can be performed. Thus, prioritizing input data for testing DNNs in meaningful ways to reduce the cost of labeling can go a long way toward increasing testing efficacy. This paper proposes using gauges of the DNN's sentiment, derived from the computation performed by the model, as a means to identify inputs that are likely to reveal weaknesses. We empirically assessed the efficacy of three such sentiment measures for prioritization (confidence, uncertainty, and surprise) and compared their effectiveness in terms of their fault-revealing capability and retraining effectiveness. The results indicate that sentiment measures can effectively flag inputs that expose unacceptable DNN behavior. For MNIST models, the average percentage of inputs correctly flagged ranged from 88% to 94.8%.
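Of the three sentiment measures named in the abstract, confidence is the simplest to sketch. Below is a minimal, illustrative example of confidence-based prioritization, assuming a Keras-style classifier whose predict method returns per-class softmax probabilities; the helper prioritize_by_confidence and the variables x_unlabeled and budget are hypothetical names, not code from the paper, and the uncertainty and surprise measures would require additional machinery (e.g., repeated stochastic forward passes or activation-trace comparisons) not shown here.

```python
import numpy as np

def prioritize_by_confidence(model, inputs):
    """Rank unlabeled inputs by the classifier's top softmax probability,
    least confident first, so a limited labeling budget is spent on the
    inputs the model is most unsure about."""
    probs = model.predict(inputs)        # shape (n_inputs, n_classes), softmax outputs
    confidence = probs.max(axis=1)       # top-class probability per input
    order = np.argsort(confidence)       # ascending: least confident first
    return order, confidence[order]

# Illustrative usage: label and test only the `budget` least-confident inputs.
# order, conf = prioritize_by_confidence(model, x_unlabeled)
# to_label = x_unlabeled[order[:budget]]
```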

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE International Conference on Artificial Intelligence Testing, AITest 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 63-70
Number of pages: 8
ISBN (Electronic): 9781728104928
State: Published - May 17 2019
Event: 1st IEEE International Conference on Artificial Intelligence Testing, AITest 2019 - Newark, United States
Duration: Apr 4 2019 - Apr 9 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Artificial Intelligence Testing, AITest 2019

Conference

Conference: 1st IEEE International Conference on Artificial Intelligence Testing, AITest 2019
Country/Territory: United States
City: Newark
Period: 4/4/19 - 4/9/19

Bibliographical note

Funding Information:
This work is supported by AFRL and DARPA under contract FA8750-18-C-0099.

Publisher Copyright:
© 2019 IEEE.

Keywords

  • Coverage criteria
  • Machine learning
  • Neural networks
  • Test prioritization
