Reducing the Computational Cost of the ECF using a nuFFT: A fast and objective probability density estimation method

Title: Reducing the Computational Cost of the ECF using a nuFFT: A fast and objective probability density estimation method
Publication Type: Journal Article
Year of Publication: 2014
Journal: Computational Statistics and Data Analysis
Abstract / Summary

Geophysical research often calls for robust and thorough statistical characterization of model output and data, for example by estimating probability density functions (PDFs). Existing methods for robustly estimating PDFs are generally computationally expensive, which inhibits their use on high-resolution datasets. DOE researchers have recently developed a novel method that reduces the computational cost of a key stage in many statistical analyses, the calculation of the empirical characteristic function (ECF), by a factor of 100. This gain in computational efficiency makes it feasible to estimate very large sets of PDFs from high-resolution datasets. In this specific case, estimating atmospheric wind PDFs from 1 terabyte of model output required only 1,000 CPUs for an hour at DOE’s National Energy Research Scientific Computing Center (NERSC), whereas it would previously have required 100,000 CPUs. This ensemble of wind PDFs is currently being used to evaluate a theory about the multi-scale nature of updrafts in the atmosphere. This work paves the way toward routine, comprehensive statistical characterization of high-resolution datasets, which is critical for evaluating hypotheses about the climate system and other geophysical systems.
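
For readers unfamiliar with the quantity named in the summary: the empirical characteristic function of samples x_1, ..., x_N at frequency t is the sample mean of exp(i t x_j). The sketch below evaluates it by direct summation, which costs on the order of N × M operations for N samples and M frequencies. It is only an illustration of the stage whose cost the paper reduces, not the paper's nuFFT-based method; the function name `ecf_direct`, the synthetic Gaussian samples, and the frequency grid are all assumptions made for this example.

```python
import numpy as np


def ecf_direct(samples, t):
    """Empirical characteristic function by direct summation.

    phi(t_k) = (1/N) * sum_j exp(i * t_k * x_j)

    Hypothetical helper for illustration; cost is O(N * M) for
    N samples and M frequencies.
    """
    samples = np.asarray(samples, dtype=float)
    t = np.asarray(t, dtype=float)
    # Outer product builds an (M, N) array of phases t_k * x_j.
    phases = np.outer(t, samples)
    # Average the complex exponentials over the sample axis.
    return np.exp(1j * phases).mean(axis=1)


# Synthetic data (an assumption for this example, not from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
t = np.linspace(-5.0, 5.0, 101)  # symmetric frequency grid, t[50] is 0
phi = ecf_direct(x, t)
```

Because every frequency touches every sample, the direct sum becomes expensive when it has to be repeated for many PDFs drawn from a large dataset, which is the regime the summary describes.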

DOI: 10.1016/j.csda.2014.06.002
Citation:
O'Brien, TA, WD Collins, SA Rauscher, and TD Ringler. 2014. "Reducing the Computational Cost of the ECF using a nuFFT: A fast and objective probability density estimation method." Computational Statistics and Data Analysis. https://doi.org/10.1016/j.csda.2014.06.002.