Analyzing aviation safety reports: From topic modeling to scalable multi-label classification

Amrudin Agovic, Hanhuai Shan, Arindam Banerjee

Research output: Contribution to conference › Paper › peer-review

6 Scopus citations

Abstract

The Aviation Safety Reporting System (ASRS) is used to collect voluntarily submitted aviation safety reports from pilots, controllers and others. As such it is particularly useful in researching aviation safety deficiencies. In this paper we address two challenges related to the analysis of ASRS data: (1) the unsupervised extraction of meaningful and interpretable topics from ASRS reports and (2) multi-label classification of ASRS data based on a set of predefined categories. For topic modeling we investigate the practical usefulness of Latent Dirichlet Allocation (LDA) when it comes to modeling ASRS reports in terms of interpretable topics. We also utilize LDA to generate a more compact representation of ASRS reports to be used in multi-label classification. For multi-label classification we propose a novel and highly scalable multi-label classification algorithm based on multi-variate regression. Empirical results indicate that our approach is superior to several baseline and state-of-the-art approaches.

Original language: English (US)
Pages: 83-97
Number of pages: 15
State: Published - 2010
Externally published: Yes
Event: NASA Conference on Intelligent Data Understanding, CIDU 2010 - Mountain View, CA, United States
Duration: Oct 5 2010 to Oct 6 2010

Other

Other: NASA Conference on Intelligent Data Understanding, CIDU 2010
Country/Territory: United States
City: Mountain View, CA
Period: 10/5/10 to 10/6/10
