Group Learning for High-Dimensional Sparse Data

Vladimir Cherkassky; Hsiang Han Chen; Han Tai Shiao

doi:10.1109/IJCNN.2019.8852183

Electrical and Computer Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

We describe new methodology for supervised learning with sparse data, i.e., when the number of input features is (much) larger than the number of training samples (n). Under the proposed approach, all available (d) input features are split into several (t) subsets, effectively resulting in a larger number (t*n) of labeled training samples in lower-dimensional input space (of dimensionality d/t). This (modified) training data is then used to estimate a classifier for making predictions in lower-dimensional space. In this paper, standard SVM is used for training a classifier. During testing (prediction), a group of t predictions made by SVM classifier needs to be combined via intelligent post-processing rules, in order to make a prediction for a test input (in the original d-dimensional space). The novelty of our approach is in the design and empirical validation of these post-processing rules under Group Learning setting. We demonstrate that such post-processing rules effectively reflect general (common-sense) a priori knowledge (about application data). Specifically, we propose two different post-processing schemes and demonstrate their effectiveness for two real-life application domains, i.e., handwritten digit recognition and seizure prediction from iEEG signal.
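The data restructuring and group prediction described in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation. It assumes contiguous, equal-sized feature subsets, substitutes a simple nearest-centroid classifier for the SVM used in the paper, and combines the t predictions by majority vote. The paper's actual post-processing rules are not given in this record, so the voting rule is only a placeholder, and all function and class names are hypothetical.

```python
from collections import Counter
from statistics import mean


def split_features(x, t):
    """Split one d-dimensional input into t contiguous chunks of d/t features."""
    d = len(x)
    if d % t != 0:
        raise ValueError("this sketch assumes t divides d evenly")
    size = d // t
    return [x[i * size:(i + 1) * size] for i in range(t)]


def make_group_training_set(X, y, t):
    """Turn n samples of dimension d into t*n samples of dimension d/t.

    Every chunk inherits the label of the sample it was cut from.
    """
    X_group, y_group = [], []
    for x, label in zip(X, y):
        for chunk in split_features(x, t):
            X_group.append(chunk)
            y_group.append(label)
    return X_group, y_group


class NearestCentroid:
    """Stand-in classifier (NOT an SVM): predicts the class with the closest mean."""

    def fit(self, X, y):
        self.centroids_ = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            self.centroids_[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids_, key=lambda lab: sq_dist(self.centroids_[lab]))


def predict_group(clf, x, t):
    """Classify each of the t chunks, then combine by majority vote (assumed rule)."""
    votes = [clf.predict_one(chunk) for chunk in split_features(x, t)]
    return Counter(votes).most_common(1)[0][0]


# Toy data: n = 4 samples, d = 6 features, t = 3 subsets of 2 features each.
X = [
    [0.9, 1.1, 1.0, 0.8, 1.2, 1.0],
    [1.1, 0.9, 0.7, 1.0, 1.0, 1.3],
    [-1.0, -0.8, -1.2, -1.0, -0.9, -1.1],
    [-0.7, -1.1, -1.0, -1.3, -1.0, -0.8],
]
y = [1, 1, -1, -1]
t = 3

X_group, y_group = make_group_training_set(X, y, t)  # 12 samples of dimension 2
clf = NearestCentroid().fit(X_group, y_group)
prediction = predict_group(clf, [1.0, 0.9, -0.2, 1.1, 0.8, 1.2], t)
```

In this toy setup the classifier is trained on 12 two-dimensional samples instead of 4 six-dimensional ones. Pooling the chunks into one training set presumably only makes sense when the feature subsets are comparable to one another, which is an assumption of the sketch.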

Original language	English (US)
Title of host publication	2019 International Joint Conference on Neural Networks, IJCNN 2019
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781728119854
DOIs	https://doi.org/10.1109/IJCNN.2019.8852183
State	Published - Jul 2019
Event	2019 International Joint Conference on Neural Networks, IJCNN 2019 - Budapest, Hungary; Duration: Jul 14 2019 → Jul 19 2019

Publication series

Name	Proceedings of the International Joint Conference on Neural Networks
Volume	2019-July

Conference

Conference	2019 International Joint Conference on Neural Networks, IJCNN 2019
Country/Territory	Hungary
City	Budapest
Period	7/14/19 → 7/19/19

Bibliographical note

Funding Information:
This work was supported, in part, by NIH grants UH2NS095495 and R01NS092882.

Publisher Copyright:
© 2019 IEEE.

Keywords

Group Learning
SVM
binary classification
digit recognition
feature selection
histogram of projections
iEEG
seizure prediction
unbalanced data

Cite this

Cherkassky, V., Chen, H. H., & Shiao, H. T. (2019). Group Learning for High-Dimensional Sparse Data. In 2019 International Joint Conference on Neural Networks, IJCNN 2019, Article 8852183 (Proceedings of the International Joint Conference on Neural Networks; Vol. 2019-July). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IJCNN.2019.8852183

Group Learning for High-Dimensional Sparse Data. / Cherkassky, Vladimir; Chen, Hsiang Han; Shiao, Han Tai.
2019 International Joint Conference on Neural Networks, IJCNN 2019. Institute of Electrical and Electronics Engineers Inc., 2019. 8852183 (Proceedings of the International Joint Conference on Neural Networks; Vol. 2019-July).

Cherkassky, V, Chen, HH & Shiao, HT 2019, Group Learning for High-Dimensional Sparse Data, in 2019 International Joint Conference on Neural Networks, IJCNN 2019, 8852183, Proceedings of the International Joint Conference on Neural Networks, vol. 2019-July, Institute of Electrical and Electronics Engineers Inc., 2019 International Joint Conference on Neural Networks, IJCNN 2019, Budapest, Hungary, 7/14/19. https://doi.org/10.1109/IJCNN.2019.8852183

@inproceedings{9444a57b11e2463ba324a6a6acb26c69,
title = "Group Learning for High-Dimensional Sparse Data",
abstract = "We describe new methodology for supervised learning with sparse data, i.e., when the number of input features is (much) larger than the number of training samples (n). Under the proposed approach, all available (d) input features are split into several (t) subsets, effectively resulting in a larger number (t*n) of labeled training samples in lower-dimensional input space (of dimensionality d/t). This (modified) training data is then used to estimate a classifier for making predictions in lower-dimensional space. In this paper, standard SVM is used for training a classifier. During testing (prediction), a group of t predictions made by SVM classifier needs to be combined via intelligent post-processing rules, in order to make a prediction for a test input (in the original d-dimensional space). The novelty of our approach is in the design and empirical validation of these post-processing rules under Group Learning setting. We demonstrate that such post-processing rules effectively reflect general (common-sense) a priori knowledge (about application data). Specifically, we propose two different post-processing schemes and demonstrate their effectiveness for two real-life application domains, i.e., handwritten digit recognition and seizure prediction from iEEG signal.",
keywords = "Group Learning, SVM, binary classification, digit recognition, feature selection, histogram of projections, iEEG, seizure prediction, unbalanced data",
author = "Cherkassky, Vladimir and Chen, {Hsiang Han} and Shiao, {Han Tai}",
note = "Funding Information: This work was supported, in part, by NIH grants UH2NS095495 and R01NS092882. Publisher Copyright: {\textcopyright} 2019 IEEE.; 2019 International Joint Conference on Neural Networks, IJCNN 2019; Conference date: 14-07-2019 through 19-07-2019",
year = "2019",
month = jul,
doi = "10.1109/IJCNN.2019.8852183",
language = "English (US)",
series = "Proceedings of the International Joint Conference on Neural Networks",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "2019 International Joint Conference on Neural Networks, IJCNN 2019",
}

TY  - GEN
T1  - Group Learning for High-Dimensional Sparse Data
AU  - Cherkassky, Vladimir
AU  - Chen, Hsiang Han
AU  - Shiao, Han Tai
PY  - 2019/7
Y1  - 2019/7
N2  - We describe new methodology for supervised learning with sparse data, i.e., when the number of input features is (much) larger than the number of training samples (n). Under the proposed approach, all available (d) input features are split into several (t) subsets, effectively resulting in a larger number (t*n) of labeled training samples in lower-dimensional input space (of dimensionality d/t). This (modified) training data is then used to estimate a classifier for making predictions in lower-dimensional space. In this paper, standard SVM is used for training a classifier. During testing (prediction), a group of t predictions made by SVM classifier needs to be combined via intelligent post-processing rules, in order to make a prediction for a test input (in the original d-dimensional space). The novelty of our approach is in the design and empirical validation of these post-processing rules under Group Learning setting. We demonstrate that such post-processing rules effectively reflect general (common-sense) a priori knowledge (about application data). Specifically, we propose two different post-processing schemes and demonstrate their effectiveness for two real-life application domains, i.e., handwritten digit recognition and seizure prediction from iEEG signal.
AB  - We describe new methodology for supervised learning with sparse data, i.e., when the number of input features is (much) larger than the number of training samples (n). Under the proposed approach, all available (d) input features are split into several (t) subsets, effectively resulting in a larger number (t*n) of labeled training samples in lower-dimensional input space (of dimensionality d/t). This (modified) training data is then used to estimate a classifier for making predictions in lower-dimensional space. In this paper, standard SVM is used for training a classifier. During testing (prediction), a group of t predictions made by SVM classifier needs to be combined via intelligent post-processing rules, in order to make a prediction for a test input (in the original d-dimensional space). The novelty of our approach is in the design and empirical validation of these post-processing rules under Group Learning setting. We demonstrate that such post-processing rules effectively reflect general (common-sense) a priori knowledge (about application data). Specifically, we propose two different post-processing schemes and demonstrate their effectiveness for two real-life application domains, i.e., handwritten digit recognition and seizure prediction from iEEG signal.
KW  - Group Learning
KW  - SVM
KW  - binary classification
KW  - digit recognition
KW  - feature selection
KW  - histogram of projections
KW  - iEEG
KW  - seizure prediction
KW  - unbalanced data
UR  - http://www.scopus.com/inward/record.url?scp=85073220194&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85073220194&partnerID=8YFLogxK
U2  - 10.1109/IJCNN.2019.8852183
DO  - 10.1109/IJCNN.2019.8852183
M3  - Conference contribution
AN  - SCOPUS:85073220194
T3  - Proceedings of the International Joint Conference on Neural Networks
BT  - 2019 International Joint Conference on Neural Networks, IJCNN 2019
PB  - Institute of Electrical and Electronics Engineers Inc.
T2  - 2019 International Joint Conference on Neural Networks, IJCNN 2019
Y2  - 14 July 2019 through 19 July 2019
ER  -
