TY - GEN
T1 - Towards understanding dominant processes in complex dynamical systems
T2 - 6th International Workshop on Knowledge Discovery from Sensor Data, SensorKDD'12 - Held in Conjunction with the 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2012
AU - Das, Debasish
AU - Ganguly, Auroop
AU - Banerjee, Arindam
AU - Obradovic, Zoran
PY - 2012
Y1 - 2012
N2 - Complex dynamical systems such as precipitation extremes under climate variability or change are typically governed by multiple processes at multiple scales. The processes themselves may be manifested at multiple scales and would need to be captured through key indicator variables, which in turn may be better projected by physical models than the variables of interest. We posit that hybrid approaches combining physically motivated and data-driven methods, which in turn are conditioned on both observations and simulations from large-scale physics-based models, may offer novel and quantifiable insights. The data-driven approaches may need to extend and adapt methods developed and tested in statistics, data mining and machine learning to the concept of dominant processes. In this paper, we performed exploratory data analysis to characterize the effect of dominant processes on precipitation extremes, annually and seasonally, and from global and century scales to regional and decadal scales, and found insights that point to the need for improved understanding. We identified gaps in the understanding of regional drivers where data-driven methods can make useful improvements and eventually lead to a predictive model for precipitation extremes. Although we do not propose any specific method for solving the problem, we expect that any successful data mining solution should include all or a subset of tools such as dimensionality reduction, grouped variable selection, non-linear regression and graphical models. The concepts of dominant processes proposed here would likely generalize broadly to climate extremes, while the solution frameworks themselves may generalize beyond climate.
KW - Dominant processes
KW - Graphical models
KW - Precipitation extremes
KW - Predictive models
UR - http://www.scopus.com/inward/record.url?scp=84866629911&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84866629911&partnerID=8YFLogxK
U2 - 10.1145/2350182.2350184
DO - 10.1145/2350182.2350184
M3 - Conference contribution
AN - SCOPUS:84866629911
SN - 9781450315548
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 16
EP - 24
BT - Proceedings of the 6th Int. Workshop on Knowledge Discovery from Sensor Data, SensorKDD'12 - Held in Conjunction with the 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2012
Y2 - 12 August 2012
ER -