Global linear neighborhoods for efficient label propagation

Ze Tian; Rui Kuang

doi:10.1137/1.9781611972825.74

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

29 Scopus citations

Abstract

Graph-based semi-supervised learning improves classification by combining labeled and unlabeled data through label propagation. It has been shown that a sparse representation of the graph by weighted local neighbors provides a better similarity measure between data points for label propagation. However, selecting local neighbors can lead to disjoint components and incorrect neighbors in the graph, and thus fail to capture the underlying global structure. In this paper, we propose to learn a nonnegative low-rank graph to capture global linear neighborhoods, under the assumption that each data point can be linearly reconstructed from weighted combinations of its direct neighbors and reachable indirect neighbors. The global linear neighborhoods utilize information from both direct and indirect neighbors to preserve the global cluster structures, while the low-rank property retains a compressed representation of the graph. An efficient algorithm based on a multiplicative update rule is designed to learn a nonnegative low-rank factorization matrix minimizing the neighborhood reconstruction error. Large-scale simulations and experiments on UCI datasets and high-dimensional gene expression datasets show that label propagation based on global linear neighborhoods captures the global cluster structures better and achieves more accurate classification results.
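To make the label propagation step concrete, here is a minimal sketch of propagating labels over a nonnegative low-rank graph W = F Fᵀ. It rests on several assumptions: it uses the widely known closed-form propagation (I − αS)⁻¹Y with symmetric normalization, which the abstract does not confirm as the paper's exact scheme; the factor `F`, the value `alpha=0.9`, and the function name `propagate_labels` are hypothetical and chosen for illustration. The paper's multiplicative update rule for learning the factor is not reproduced; `F` is simply given.

```python
import numpy as np


def propagate_labels(W, Y, alpha=0.9):
    """Closed-form label propagation: solve (I - alpha * S) X = Y,
    where S = D^{-1/2} W D^{-1/2} is the symmetrically normalized graph."""
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    # Guard against isolated nodes (zero degree).
    inv_sqrt = np.zeros_like(degrees)
    nonzero = degrees > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degrees[nonzero])
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, Y)


# Hypothetical nonnegative factor (6 points, rank 2); in the paper such a
# factor would be learned, here it is made up for illustration.
F = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.8, 0.0],
    [0.0, 1.0],
    [0.1, 0.9],
    [0.0, 0.8],
])
W = F @ F.T                 # nonnegative graph of rank at most 2
np.fill_diagonal(W, 0.0)    # drop self-loops (an assumption of this sketch)

# One labeled point per class; all other rows are unlabeled (all zeros).
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[3, 1] = 1.0

scores = propagate_labels(W, Y)
pred = scores.argmax(axis=1)
print(pred)
```

In this toy setup the two labeled points are enough to assign every unlabeled point to the class of its group, because the graph connects points through shared latent factors and not only through direct local neighbors.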

Original language	English (US)
Title of host publication	Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012
Publisher	Society for Industrial and Applied Mathematics Publications
Pages	863-872
Number of pages	10
ISBN (Print)	9781611972320
DOIs	https://doi.org/10.1137/1.9781611972825.74
State	Published - 2012
Event	12th SIAM International Conference on Data Mining, SDM 2012 - Anaheim, CA, United States
Duration	Apr 26 2012 → Apr 28 2012

Publication series

Name	Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012

Other

Other	12th SIAM International Conference on Data Mining, SDM 2012
Country/Territory	United States
City	Anaheim, CA
Period	4/26/12 → 4/28/12

Cite this

Tian, Z., & Kuang, R. (2012). Global linear neighborhoods for efficient label propagation. In Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012 (pp. 863-872). (Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611972825.74

Global linear neighborhoods for efficient label propagation. / Tian, Ze; Kuang, Rui.
Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012. Society for Industrial and Applied Mathematics Publications, 2012. p. 863-872 (Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012).

Tian, Z & Kuang, R 2012, Global linear neighborhoods for efficient label propagation. in Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012. Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012, Society for Industrial and Applied Mathematics Publications, pp. 863-872, 12th SIAM International Conference on Data Mining, SDM 2012, Anaheim, CA, United States, 4/26/12. https://doi.org/10.1137/1.9781611972825.74

@inproceedings{b1297f0504f8422db3d0997069db8f64,
  title = "Global linear neighborhoods for efficient label propagation",
  abstract = "Graph-based semi-supervised learning improves classification by combining labeled and unlabeled data through label propagation. It was shown that the sparse representation of graph by weighted local neighbors provides a better similarity measure between data points for label propagation. However, selecting local neighbors can lead to disjoint components and incorrect neighbors in graph, and thus, fail to capture the underlying global structure. In this paper, we propose to learn a nonnegative low-rank graph to capture global linear neighborhoods, under the assumption that each data point can be linearly reconstructed from weighted combinations of its direct neighbors and reachable indirect neighbors. The global linear neighborhoods utilize information from both direct and indirect neighbors to preserve the global cluster structures, while the low-rank property retains a compressed representation of the graph. An efficient algorithm based on a multiplicative update rule is designed to learn a nonnegative low-rank factorization matrix minimizing the neighborhood reconstruction error. Large scale simulations and experiments on UCI datasets and high-dimensional gene expression datasets showed that label propagation based on global linear neighborhoods captures the global cluster structures better and achieved more accurate classification results.",
  author = "Ze Tian and Rui Kuang",
  year = "2012",
  doi = "10.1137/1.9781611972825.74",
  language = "English (US)",
  isbn = "9781611972320",
  series = "Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012",
  publisher = "Society for Industrial and Applied Mathematics Publications",
  pages = "863--872",
  booktitle = "Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012",
  note = "12th SIAM International Conference on Data Mining, SDM 2012 ; Conference date: 26-04-2012 Through 28-04-2012",
}

TY - GEN

T1 - Global linear neighborhoods for efficient label propagation

AU - Tian, Ze

AU - Kuang, Rui

PY - 2012

Y1 - 2012

AB - Graph-based semi-supervised learning improves classification by combining labeled and unlabeled data through label propagation. It was shown that the sparse representation of graph by weighted local neighbors provides a better similarity measure between data points for label propagation. However, selecting local neighbors can lead to disjoint components and incorrect neighbors in graph, and thus, fail to capture the underlying global structure. In this paper, we propose to learn a nonnegative low-rank graph to capture global linear neighborhoods, under the assumption that each data point can be linearly reconstructed from weighted combinations of its direct neighbors and reachable indirect neighbors. The global linear neighborhoods utilize information from both direct and indirect neighbors to preserve the global cluster structures, while the low-rank property retains a compressed representation of the graph. An efficient algorithm based on a multiplicative update rule is designed to learn a nonnegative low-rank factorization matrix minimizing the neighborhood reconstruction error. Large scale simulations and experiments on UCI datasets and high-dimensional gene expression datasets showed that label propagation based on global linear neighborhoods captures the global cluster structures better and achieved more accurate classification results.

UR - http://www.scopus.com/inward/record.url?scp=84880250285&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84880250285&partnerID=8YFLogxK

U2 - 10.1137/1.9781611972825.74

DO - 10.1137/1.9781611972825.74

M3 - Conference contribution

AN - SCOPUS:84880250285

SN - 9781611972320

T3 - Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012

SP - 863

EP - 872

BT - Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012

PB - Society for Industrial and Applied Mathematics Publications

T2 - 12th SIAM International Conference on Data Mining, SDM 2012

Y2 - 26 April 2012 through 28 April 2012

ER -
