TY - GEN
T1 - Toward data-driven, semi-automatic inference of phenomenological physical models
T2 - 12th SIAM International Conference on Data Mining, SDM 2012
AU - Pendse, Saurabh V.
AU - Tetteh, Isaac K.
AU - Semazzi, Fredrick
AU - Kumar, Vipin
AU - Samatova, Nagiza F.
PY - 2012
Y1 - 2012
N2 - First-principles-based predictive understanding of complex, dynamic physical phenomena, such as regional precipitation or hurricane intensity and frequency, is quite limited due to the lack of complete phenomenological models underlying their physics. To address this gap, hypothesis-driven, manually constructed, conceptual hurricane models and models for regional-scale precipitation extremes have been emerging. To complement both approaches, we propose a methodology for data-driven, semi-automatic inference of plausible phenomenological models and apply it to derive the model for eastern Sahel rainfall, an important factor for socioeconomic growth and development of this region. At its core, our methodology derives cause-effect relationships using the Lasso multivariate regression model and quantifies the compound effect that the complex interplay among the key predictors at their prominent temporal phases has on the response (rainfall). Specifically, we propose methods for (a) detecting and ranking predictors' prominent temporal phases, (b) optimizing the regularization penalty, (c) assessing predictor statistical significance, (d) performing impact analysis of data normalization on model inference, and (e) calculating the Expected Causality Impact (ECI) score to quantify impact analysis. The culmination of this study is the plausible phenomenological model of the eastern Sahel seasonal rainfall and quantified key climate drivers involved in the rainfall variability at different time lags. To the best of our knowledge, this is the first phenomenological model of this phenomenon; several of its components are consistent with the known evidence from the literature.
AB - First-principles-based predictive understanding of complex, dynamic physical phenomena, such as regional precipitation or hurricane intensity and frequency, is quite limited due to the lack of complete phenomenological models underlying their physics. To address this gap, hypothesis-driven, manually constructed, conceptual hurricane models and models for regional-scale precipitation extremes have been emerging. To complement both approaches, we propose a methodology for data-driven, semi-automatic inference of plausible phenomenological models and apply it to derive the model for eastern Sahel rainfall, an important factor for socioeconomic growth and development of this region. At its core, our methodology derives cause-effect relationships using the Lasso multivariate regression model and quantifies the compound effect that the complex interplay among the key predictors at their prominent temporal phases has on the response (rainfall). Specifically, we propose methods for (a) detecting and ranking predictors' prominent temporal phases, (b) optimizing the regularization penalty, (c) assessing predictor statistical significance, (d) performing impact analysis of data normalization on model inference, and (e) calculating the Expected Causality Impact (ECI) score to quantify impact analysis. The culmination of this study is the plausible phenomenological model of the eastern Sahel seasonal rainfall and quantified key climate drivers involved in the rainfall variability at different time lags. To the best of our knowledge, this is the first phenomenological model of this phenomenon; several of its components are consistent with the known evidence from the literature.
UR - http://www.scopus.com/inward/record.url?scp=84880236358&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880236358&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972825.4
DO - 10.1137/1.9781611972825.4
M3 - Conference contribution
AN - SCOPUS:84880236358
SN - 9781611972320
T3 - Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012
SP - 35
EP - 46
BT - Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012
PB - Society for Industrial and Applied Mathematics Publications
Y2 - 26 April 2012 through 28 April 2012
ER -