Statistical Significance of Climate Sensitivity Predictors Obtained by Data Mining

Title: Statistical Significance of Climate Sensitivity Predictors Obtained by Data Mining
Publication Type: Journal Article
Year of Publication: 2014
Authors: Peter M. Caldwell, Christopher S. Bretherton, Mark D. Zelinka, Stephen A. Klein, Benjamin D. Santer, and Benjamin M. Sanderson
Journal: Geophysical Research Letters
Abstract / Summary

Several recent efforts to estimate Earth's equilibrium climate sensitivity (ECS) focus on identifying quantities in the current climate which are skillful predictors of ECS yet can be constrained by observations. This study automates the search for observable predictors using data from Phase 5 of the Coupled Model Intercomparison Project (CMIP5). The primary focus of this paper is assessing statistical significance of the resulting predictive relationships. Failure to account for dependence between models, variables, locations, and seasons is shown to yield misleading results. A new technique for testing the field significance of data-mined correlations which avoids these problems is presented. Using this new approach, all 41,741 relationships we tested were found to be explainable by chance. This leads us to conclude that data mining is best used to identify potential relationships which are then validated or discarded using physically-based hypothesis testing.
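
The multiple-testing problem that motivates the abstract can be illustrated with a small Monte Carlo sketch. This is not the field-significance technique presented in the paper; it is a simplified illustration under stated assumptions: predictors are independent Gaussian noise (the paper stresses that real predictors are dependent), and the ensemble size of 20 models, the count of 200 predictors, and every function name are hypothetical choices made here. It estimates how large the best correlation with ECS becomes by chance alone when many candidate predictors are searched.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def max_abs_r_under_null(n_models, n_predictors, rng):
    """Largest |r| between a random 'ECS' vector and pure-noise predictors."""
    # Assumption: ECS and predictors are independent standard normal draws.
    ecs = [rng.gauss(0.0, 1.0) for _ in range(n_models)]
    best = 0.0
    for _ in range(n_predictors):
        predictor = [rng.gauss(0.0, 1.0) for _ in range(n_models)]
        best = max(best, abs(pearson_r(ecs, predictor)))
    return best


def null_threshold(n_models, n_predictors, n_trials=100, quantile=0.95, seed=0):
    """Monte Carlo estimate of a quantile of the null distribution of max |r|."""
    rng = random.Random(seed)
    maxima = sorted(
        max_abs_r_under_null(n_models, n_predictors, rng) for _ in range(n_trials)
    )
    index = min(n_trials - 1, int(quantile * n_trials))
    return maxima[index]


if __name__ == "__main__":
    # Hypothetical sizes: 20 models, 1 versus 200 candidate predictors.
    single = null_threshold(n_models=20, n_predictors=1)
    mined = null_threshold(n_models=20, n_predictors=200)
    print(f"95th percentile of |r|, one predictor tested: {single:.2f}")
    print(f"95th percentile of max |r|, 200 predictors searched: {mined:.2f}")
```

Under these assumptions, the threshold a correlation must exceed to be surprising rises as more predictors are searched, so a correlation that looks significant in a single test may be unremarkable as the best of many. Dependence between models, variables, locations, and seasons changes the null distribution further, which is the issue the paper's technique is designed to address.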

URL: http://onlinelibrary.wiley.com/doi/10.1002/2014GL059205/abstract
DOI: 10.1002/2014GL059205

Citation:
Caldwell, PM, CS Bretherton, MD Zelinka, SA Klein, BD Santer, and BM Sanderson. 2014. "Statistical Significance of Climate Sensitivity Predictors Obtained by Data Mining." Geophysical Research Letters. https://doi.org/10.1002/2014GL059205.