In many applications, multilabel classification involves time-series predictors, as in multilabel video classification. How to account for temporal dependencies among the input variables remains an open issue, especially in action learning from videos. Motivated by the problem of video categorization and captioning, we propose a nonlinear multilabel classifier based on a hidden Markov model and a weighted loss that separates false positive and false negative classification errors. This allows us to account for both label dependence and temporal dependencies of input variables in classification. Computationally, we derive a decomposable algorithm based on block-wise coordinate descent for non-convex minimization, which permits not only block-wise updates but also label-wise updates, leading to scalable computation. Theoretically, we derive the Bayes rule and prove that the proposed method consistently recovers the optimal performance of the Bayes rule. In simulations, the proposed method compares favorably with competitors that ignore either label dependence or temporal dependence. Finally, the utility of the proposed method is demonstrated by an application to the ActivityNet Captions dataset for understanding a video's contents.
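To make the idea of a loss that separates false positive and false negative errors concrete, the following minimal sketch computes a weighted count of label-wise errors for a single instance. It is an illustration only and is not the loss defined in the paper: the function name `weighted_multilabel_loss` and the cost parameters `w_fp` and `w_fn` are hypothetical, and the hidden Markov component is not modeled here.

```python
def weighted_multilabel_loss(y_true, y_pred, w_fp=1.0, w_fn=1.0):
    """Weighted count of label-wise errors for one instance.

    y_true, y_pred: equal-length sequences of 0/1 labels.
    w_fp, w_fn: hypothetical costs assigned to each false positive
    and each false negative (assumed names, not from the paper).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label vectors must have the same length")
    # A false positive predicts a label that is absent.
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # A false negative misses a label that is present.
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return w_fp * false_pos + w_fn * false_neg


# One false positive (second label) and one false negative (third label).
loss = weighted_multilabel_loss([1, 0, 1, 0], [1, 1, 0, 0], w_fp=1.0, w_fn=2.0)
```

Setting `w_fn` larger than `w_fp` penalizes missed labels more heavily than spurious ones; with equal weights the quantity reduces to the ordinary Hamming error count.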
Bibliographical note
Funding Information:
Manuscript received August 31, 2019; revised April 17, 2020; accepted September 22, 2020. Date of publication September 28, 2020; date of current version October 9, 2020. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Elias Aboutanios. This work was supported in part by NSF under Grants DMS-1712564, DMS-1721216, DMS-1712580, and DMS-1721445. (Corresponding author: Yunzhang Zhu.) Yuezhang Che is with the School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200083, China (e-mail: email@example.com).
© 2020 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
- Hidden Markov models
- Label dependence
- Nonconvex minimization
- Video sequence