Smoothing spline ANOVA for super-large samples: Scalable computation via rounding parameters

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

In the current era of big data, researchers routinely collect and analyze data of super-large sample sizes. Data-oriented statistical methods have been developed to extract information from super-large data. Smoothing spline ANOVA (SSANOVA) is a promising approach for extracting information from noisy data; however, the heavy computational cost of SSANOVA hinders its wide application. In this paper, we propose a new algorithm for fitting SSANOVA models to super-large sample data. In this algorithm, we introduce rounding parameters to make the computation scalable. To demonstrate the benefits of the rounding parameters, we present a simulation study and a real data example using electroencephalography data. Our results reveal that, with the rounding parameters, a researcher can fit nonparametric regression models to very large samples within a few seconds using a standard laptop or tablet computer.
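
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind a rounding parameter, under stated assumptions: predictor values are rounded to a grid of width `r`, observations that share a rounded value are collapsed into a mean and a count, and the model is then fit to the much smaller set of unique values using the counts as weights. A weighted cubic polynomial stands in for the smoothing spline here purely for illustration; the function names, the parameter `r`, and the polynomial fit are hypothetical and are not taken from the paper.

```python
import numpy as np

def round_and_aggregate(x, y, r):
    """Round x to the nearest multiple of r and collapse duplicates.

    Returns the unique rounded x values, the mean of y at each value,
    and the number of observations at each value.
    (Illustrative helper; `r` plays the role of a rounding parameter.)
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    keys = np.rint(x / r).astype(np.int64)      # integer grid index per point
    ukeys, inv = np.unique(keys, return_inverse=True)
    counts = np.bincount(inv, minlength=ukeys.size)
    ysum = np.bincount(inv, weights=y, minlength=ukeys.size)
    return ukeys * r, ysum / counts, counts

def fit_weighted_poly(ux, ybar, counts, degree=3):
    """Weighted least squares on the aggregated data.

    Stand-in for a smoothing spline fit (assumption for illustration).
    np.polyfit applies weights to unsquared residuals, so sqrt(counts)
    gives each unique value a squared-error weight equal to its count.
    """
    return np.polyfit(ux, ybar, degree, w=np.sqrt(counts))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1_000_000
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.5, size=n)

    ux, ybar, counts = round_and_aggregate(x, y, r=0.01)
    coef = fit_weighted_poly(ux, ybar, counts)
    print(f"{n} observations reduced to {ux.size} unique values")
    print("fitted coefficients:", coef)
```

The point of the design is that the cost of the fit depends on the number of unique rounded values rather than on the sample size; the price is the approximation error introduced by rounding, which shrinks as `r` decreases.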

Original language: English (US)
Pages (from-to): 433-444
Number of pages: 12
Journal: Statistics and its Interface
Volume: 9
Issue number: 4
State: Published - 2016

Bibliographical note

Funding Information:
This research was partially supported by NSF grants DMS 1440037 and DMS 1438957, and start-up funds from the University of Minnesota.

Keywords

  • Rounding parameter
  • Scalable algorithm
  • Smoothing spline ANOVA
