Fast and robust supervised learning in high dimensions using the geometry of the data

Ujjal Kumar Mukherjee; Subhabrata Majumdar; Snigdhansu Chatterjee

doi:10.1007/978-3-319-20910-4_9

Statistics (Twin Cities)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We develop a method, called the data cloud wrapper (DCW), for tracing out the shape of a cloud of sample observations in arbitrary dimensions. The DCW has strong theoretical properties, scales algorithmically, and is amenable to parallel computation. We further use the DCW to develop a new fast, robust, and accurate classification method in high dimensions, called the geometric learning algorithm (GLA). Two main features of the proposed algorithm are that it makes no assumptions about the geometric properties of the underlying data-generating distribution, and that it makes no parametric or other restrictive assumptions about either the data or the algorithm. The proposed methods are typically faster and more robust than established classification techniques, while being comparably accurate in most cases.

Original language	English (US)
Title of host publication	Advances in Data Mining
Subtitle of host publication	Applications and Theoretical Aspects - 15th Industrial Conference, ICDM 2015, Proceedings
Editors	Petra Perner
Publisher	Springer Verlag
Pages	109-123
Number of pages	15
ISBN (Print)	9783319209098
DOIs	https://doi.org/10.1007/978-3-319-20910-4_9
State	Published - 2015
Event	15th Industrial Conference on Data Mining, ICDM 2015 - Hamburg, Germany
Duration	Jul 11 2015 → Jul 24 2015

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	9165
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Other

Other	15th Industrial Conference on Data Mining, ICDM 2015
Country/Territory	Germany
City	Hamburg
Period	7/11/15 → 7/24/15

Bibliographical note

Funding Information:
This research is partially supported by NSF grant # IIS-1029711, NASA grant #-1502546, the Institute on the Environment (IonE), and the College of Liberal Arts (CLA) at the University of Minnesota.

Publisher Copyright:
© Springer International Publishing Switzerland 2015.

Cite this

Mukherjee, U. K., Majumdar, S., & Chatterjee, S. (2015). Fast and robust supervised learning in high dimensions using the geometry of the data. In P. Perner (Ed.), Advances in Data Mining: Applications and Theoretical Aspects - 15th Industrial Conference, ICDM 2015, Proceedings (pp. 109-123). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9165). Springer Verlag. https://doi.org/10.1007/978-3-319-20910-4_9

@inproceedings{823f21ce95004188b464e179a0036d3a,

title = "Fast and robust supervised learning in high dimensions using the geometry of the data",

abstract = "We develop a method, called the data cloud wrapper (DCW), for tracing out the shape of a cloud of sample observations in arbitrary dimensions. The DCW has strong theoretical properties, scales algorithmically, and is amenable to parallel computation. We further use the DCW to develop a new fast, robust, and accurate classification method in high dimensions, called the geometric learning algorithm (GLA). Two main features of the proposed algorithm are that it makes no assumptions about the geometric properties of the underlying data-generating distribution, and that it makes no parametric or other restrictive assumptions about either the data or the algorithm. The proposed methods are typically faster and more robust than established classification techniques, while being comparably accurate in most cases.",

author = "Mukherjee, {Ujjal Kumar} and Subhabrata Majumdar and Snigdhansu Chatterjee",

note = "Funding Information: This research is partially supported by NSF grant # IIS-1029711, NASA grant #-1502546, the Institute on the Environment (IonE), and the College of Liberal Arts (CLA) at the University of Minnesota. Publisher Copyright: {\textcopyright} Springer International Publishing Switzerland 2015.; 15th Industrial Conference on Data Mining, ICDM 2015 ; Conference date: 11-07-2015 Through 24-07-2015",

year = "2015",

doi = "10.1007/978-3-319-20910-4_9",

language = "English (US)",

isbn = "9783319209098",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Verlag",

pages = "109--123",

editor = "Petra Perner",

booktitle = "Advances in Data Mining",

}

TY - GEN

T1 - Fast and robust supervised learning in high dimensions using the geometry of the data

AU - Mukherjee, Ujjal Kumar

AU - Majumdar, Subhabrata

AU - Chatterjee, Snigdhansu

N1 - Funding Information: This research is partially supported by NSF grant # IIS-1029711, NASA grant #-1502546, the Institute on the Environment (IonE), and the College of Liberal Arts (CLA) at the University of Minnesota. Publisher Copyright: © Springer International Publishing Switzerland 2015.

PY - 2015

Y1 - 2015

AB - We develop a method, called the data cloud wrapper (DCW), for tracing out the shape of a cloud of sample observations in arbitrary dimensions. The DCW has strong theoretical properties, scales algorithmically, and is amenable to parallel computation. We further use the DCW to develop a new fast, robust, and accurate classification method in high dimensions, called the geometric learning algorithm (GLA). Two main features of the proposed algorithm are that it makes no assumptions about the geometric properties of the underlying data-generating distribution, and that it makes no parametric or other restrictive assumptions about either the data or the algorithm. The proposed methods are typically faster and more robust than established classification techniques, while being comparably accurate in most cases.

UR - http://www.scopus.com/inward/record.url?scp=84950121438&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84950121438&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-20910-4_9

DO - 10.1007/978-3-319-20910-4_9

M3 - Conference contribution

AN - SCOPUS:84950121438

SN - 9783319209098

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 109

EP - 123

BT - Advances in Data Mining

A2 - Perner, Petra

PB - Springer Verlag

T2 - 15th Industrial Conference on Data Mining, ICDM 2015

Y2 - 11 July 2015 through 24 July 2015

ER -
