Online ℓ₁-dictionary learning with application to novel document detection

Shiva Prasad Kasiviswanathan; Huahua Wang; Arindam Banerjee; Prem Melville

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

38 Scopus citations

Abstract

Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a scalable manner. Motivated by this challenge, we introduce the problem of online ℓ₁-dictionary learning where, unlike traditional dictionary learning, which uses squared loss, the ℓ₁-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on the alternating directions method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online ℓ₁-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results.

Original language	English (US)
Title of host publication	Advances in Neural Information Processing Systems 25
Subtitle of host publication	26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012
Pages	2258-2266
Number of pages	9
State	Published - 2012
Event	26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012 - Lake Tahoe, NV, United States
Duration	Dec 3 2012 → Dec 6 2012

Publication series

Name	Advances in Neural Information Processing Systems
Volume	3
ISSN (Print)	1049-5258

Other

Other	26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012
Country/Territory	United States
City	Lake Tahoe, NV
Period	12/3/12 → 12/6/12

Online ℓ₁-dictionary learning with application to novel document detection. / Kasiviswanathan, Shiva Prasad; Wang, Huahua; Banerjee, Arindam et al.
Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012. 2012. p. 2258-2266 (Advances in Neural Information Processing Systems; Vol. 3).

Kasiviswanathan, SP, Wang, H, Banerjee, A & Melville, P 2012, Online ℓ₁-dictionary learning with application to novel document detection. in Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012. Advances in Neural Information Processing Systems, vol. 3, pp. 2258-2266, 26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012, Lake Tahoe, NV, United States, 12/3/12.

@inproceedings{849afe4316244361b5ab0f6b21713f20,

title = "Online ℓ1-dictionary learning with application to novel document detection",

abstract = "Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a scalable manner. Motivated by this challenge, we introduce the problem of online ℓ1-dictionary learning where, unlike traditional dictionary learning, which uses squared loss, the ℓ1-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on the alternating directions method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online ℓ1-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results.",

author = "Kasiviswanathan, {Shiva Prasad} and Huahua Wang and Arindam Banerjee and Prem Melville",

year = "2012",

language = "English (US)",

isbn = "9781627480031",

series = "Advances in Neural Information Processing Systems",

pages = "2258--2266",

booktitle = "Advances in Neural Information Processing Systems 25",

note = "26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012 ; Conference date: 03-12-2012 Through 06-12-2012",

}

TY - GEN

T1 - Online ℓ1-dictionary learning with application to novel document detection

AU - Kasiviswanathan, Shiva Prasad

AU - Wang, Huahua

AU - Banerjee, Arindam

AU - Melville, Prem

PY - 2012

Y1 - 2012

N2 - Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a scalable manner. Motivated by this challenge, we introduce the problem of online ℓ1-dictionary learning where, unlike traditional dictionary learning, which uses squared loss, the ℓ1-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on the alternating directions method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online ℓ1-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results.

AB - Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a scalable manner. Motivated by this challenge, we introduce the problem of online ℓ1-dictionary learning where, unlike traditional dictionary learning, which uses squared loss, the ℓ1-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on the alternating directions method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online ℓ1-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results.

UR - http://www.scopus.com/inward/record.url?scp=84877755328&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84877755328&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84877755328

SN - 9781627480031

T3 - Advances in Neural Information Processing Systems

SP - 2258

EP - 2266

BT - Advances in Neural Information Processing Systems 25

T2 - 26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012

Y2 - 3 December 2012 through 6 December 2012

ER -