Distributed sparse linear regression

Gonzalo Mateos; Juan Andrés Bazerque; Georgios B Giannakis

doi:10.1109/TSP.2010.2055862

Electrical and Computer Engineering

Research output: Contribution to journal › Article › peer-review

396 Scopus citations

Abstract

The Lasso is a popular technique for joint estimation and continuous variable selection, especially well-suited for sparse and possibly under-determined linear regression problems. This paper develops algorithms to estimate the regression coefficients via Lasso when the training data are distributed across different agents, and their communication to a central processing unit is prohibited for reasons such as communication cost or privacy. A motivating application is explored in the context of wireless communications, whereby sensing cognitive radios collaborate to estimate the radio-frequency power spectral density. Three novel algorithms, attaining different tradeoffs between complexity and convergence speed, are obtained after reformulating the Lasso into a separable form, which is iteratively minimized using the alternating-direction method of multipliers so as to gain the desired degree of parallelization. Interestingly, the per-agent estimate updates are given by simple soft-thresholding operations, and inter-agent communication overhead remains at an affordable level. Without exchanging elements from the different training sets, the local estimates reach consensus on the global Lasso solution, i.e., the fit that would be obtained if the entire data set were centrally available. Numerical experiments with both simulated and real data demonstrate the merits of the proposed distributed schemes, corroborating their convergence and global optimality. The ideas in this paper can be easily extended to fit related models in a distributed fashion, including the adaptive Lasso, elastic net, fused Lasso and nonnegative garrote.
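The consensus idea in the abstract can be illustrated with a small sketch. The code below is not one of the paper's three algorithms; it is a textbook global-consensus form of the alternating-direction method of multipliers, restricted to a single regression coefficient so that the pooled Lasso fit has a closed form to compare against. The function names, the toy data in `AGENT_DATA`, and the parameters `lam` and `rho` are all assumptions made for illustration, and the network-wide average stands in for whatever inter-agent exchange a real network would use.

```python
# Illustrative sketch only: scalar consensus ADMM for the Lasso.
# All names and the toy data below are hypothetical, not from the paper.

# Each agent holds its own (x, y) samples for the model y = x * b + noise.
AGENT_DATA = [
    ([1.0, 2.0], [2.1, 3.9]),
    ([0.5, 1.5], [1.2, 2.8]),
    ([1.0, -1.0], [1.8, -2.2]),
]


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the soft-thresholding operator)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def centralized_lasso(agent_data, lam):
    """Closed-form scalar Lasso on the pooled data, used as the benchmark.

    Minimizes 0.5 * sum((y - x * b) ** 2) + lam * abs(b) over b.
    """
    sxx = sum(x * x for xs, _ in agent_data for x in xs)
    sxy = sum(x * y for xs, ys in agent_data for x, y in zip(xs, ys))
    return soft_threshold(sxy, lam) / sxx


def consensus_lasso(agent_data, lam, rho=1.0, iterations=500):
    """Scaled-form consensus ADMM; returns (local estimates, consensus value)."""
    num_agents = len(agent_data)
    # Local statistics are computed once and stay with their agent.
    stats = [
        (sum(x * x for x in xs), sum(x * y for x, y in zip(xs, ys)))
        for xs, ys in agent_data
    ]
    duals = [0.0] * num_agents
    local = [0.0] * num_agents
    consensus = 0.0
    for _ in range(iterations):
        # Local step: each agent fits its own data, pulled toward the consensus.
        local = [
            (sxy + rho * (consensus - u)) / (sxx + rho)
            for (sxx, sxy), u in zip(stats, duals)
        ]
        # Consensus step: only local estimates plus duals are shared, not data.
        average = sum(b + u for b, u in zip(local, duals)) / num_agents
        consensus = soft_threshold(average, lam / (rho * num_agents))
        # Dual step: accumulate each agent's disagreement with the consensus.
        duals = [u + b - consensus for u, b in zip(duals, local)]
    return local, consensus
```

Calling `consensus_lasso(AGENT_DATA, 1.0)` and comparing the result with `centralized_lasso(AGENT_DATA, 1.0)` shows every local estimate settling on the pooled fit, even though only estimates and dual variables are shared. In this simplified variant the soft-thresholding appears in the consensus step, whereas the abstract states that the paper's per-agent updates are themselves soft-thresholding operations.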

Original language	English (US)
Article number	5499155
Pages (from-to)	5262-5276
Number of pages	15
Journal	IEEE Transactions on Signal Processing
Volume	58
Issue number	10
DOIs	https://doi.org/10.1109/TSP.2010.2055862
State	Published - Oct 2010

Bibliographical note

Funding Information:
Manuscript received January 29, 2010; accepted June 20, 2010. Date of publication July 01, 2010; date of current version September 15, 2010. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Huaiyu Dai. Work in this paper was supported by the NSF Grants CCF-0830480 and ECCS-0824007. Part of the paper was presented at the International Conference on Acoustics, Speech and Signal Processing, Dallas, TX, March 15-19, 2010.

Keywords

Distributed linear regression
Lasso
parallel optimization
sparse estimation

Cite this

@article{e0a6c6c0acef42bbb482b5d886b298b4,

title = "Distributed sparse linear regression",

abstract = "The Lasso is a popular technique for joint estimation and continuous variable selection, especially well-suited for sparse and possibly under-determined linear regression problems. This paper develops algorithms to estimate the regression coefficients via Lasso when the training data are distributed across different agents, and their communication to a central processing unit is prohibited for e.g., communication cost or privacy reasons. A motivating application is explored in the context of wireless communications, whereby sensing cognitive radios collaborate to estimate the radio-frequency power spectrum density. Attaining different tradeoffs between complexity and convergence speed, three novel algorithms are obtained after reformulating the Lasso into a separable form, which is iteratively minimized using the alternating-direction method of multipliers so as to gain the desired degree of parallelization. Interestingly, the per agent estimate updates are given by simple soft-thresholding operations, and inter-agent communication overhead remains at affordable level. Without exchanging elements from the different training sets, the local estimates consent to the global Lasso solution, i.e., the fit that would be obtained if the entire data set were centrally available. Numerical experiments with both simulated and real data demonstrate the merits of the proposed distributed schemes, corroborating their convergence and global optimality. The ideas in this paper can be easily extended for the purpose of fitting related models in a distributed fashion, including the adaptive Lasso, elastic net, fused Lasso and nonnegative garrote.",

keywords = "Distributed linear regression, Lasso, parallel optimization, sparse estimation",

author = "Mateos, Gonzalo and Bazerque, {Juan Andr{\'e}s} and Giannakis, {Georgios B}",

note = "Funding Information: Manuscript received January 29, 2010; accepted June 20, 2010. Date of publication July 01, 2010; date of current version September 15, 2010. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Huaiyu Dai. Work in this paper was supported by the NSF Grants CCF-0830480 and ECCS-0824007. Part of the paper was presented at the International Conference on Acoustics, Speech and Signal Processing, Dallas, TX, March 15-19, 2010.",

year = "2010",

month = oct,

doi = "10.1109/TSP.2010.2055862",

language = "English (US)",

volume = "58",

pages = "5262--5276",

journal = "IEEE Transactions on Signal Processing",

issn = "1053-587X",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "10",

}

TY - JOUR

T1 - Distributed sparse linear regression

AU - Mateos, Gonzalo

AU - Bazerque, Juan Andrés

AU - Giannakis, Georgios B

N1 - Funding Information: Manuscript received January 29, 2010; accepted June 20, 2010. Date of publication July 01, 2010; date of current version September 15, 2010. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Huaiyu Dai. Work in this paper was supported by the NSF Grants CCF-0830480 and ECCS-0824007. Part of the paper was presented at the International Conference on Acoustics, Speech and Signal Processing, Dallas, TX, March 15-19, 2010.

PY - 2010/10

Y1 - 2010/10

N2 - The Lasso is a popular technique for joint estimation and continuous variable selection, especially well-suited for sparse and possibly under-determined linear regression problems. This paper develops algorithms to estimate the regression coefficients via Lasso when the training data are distributed across different agents, and their communication to a central processing unit is prohibited for e.g., communication cost or privacy reasons. A motivating application is explored in the context of wireless communications, whereby sensing cognitive radios collaborate to estimate the radio-frequency power spectrum density. Attaining different tradeoffs between complexity and convergence speed, three novel algorithms are obtained after reformulating the Lasso into a separable form, which is iteratively minimized using the alternating-direction method of multipliers so as to gain the desired degree of parallelization. Interestingly, the per agent estimate updates are given by simple soft-thresholding operations, and inter-agent communication overhead remains at affordable level. Without exchanging elements from the different training sets, the local estimates consent to the global Lasso solution, i.e., the fit that would be obtained if the entire data set were centrally available. Numerical experiments with both simulated and real data demonstrate the merits of the proposed distributed schemes, corroborating their convergence and global optimality. The ideas in this paper can be easily extended for the purpose of fitting related models in a distributed fashion, including the adaptive Lasso, elastic net, fused Lasso and nonnegative garrote.

KW - Distributed linear regression

KW - Lasso

KW - parallel optimization

KW - sparse estimation

UR - http://www.scopus.com/inward/record.url?scp=77956773514&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77956773514&partnerID=8YFLogxK

U2 - 10.1109/TSP.2010.2055862

DO - 10.1109/TSP.2010.2055862

M3 - Article

AN - SCOPUS:77956773514

SN - 1053-587X

VL - 58

SP - 5262

EP - 5276

JO - IEEE Transactions on Signal Processing

JF - IEEE Transactions on Signal Processing

IS - 10

M1 - 5499155

ER -
