Evaluating the Potential and Challenges of an Uncertainty Quantification Method for Long Short‐Term Memory Models for Soil Moisture Predictions

Title: Evaluating the Potential and Challenges of an Uncertainty Quantification Method for Long Short‐Term Memory Models for Soil Moisture Predictions
Publication Type: Journal Article
Year of Publication: 2020
Journal: Water Resources Research
Volume: 56
Number: 12
Abstract / Summary

Recently, recurrent deep networks have shown promise to harness newly available satellite‐sensed data for long‐term soil moisture projections. However, to be useful in forecasting, deep networks must also provide uncertainty estimates. Here we evaluated Monte Carlo dropout with an input‐dependent data noise term (MCD+N), an efficient uncertainty estimation framework originally developed in computer vision, for hydrologic time-series predictions. MCD+N simultaneously estimates a heteroscedastic input‐dependent data noise term (a trained error model attributable to observational noise) and a network weight uncertainty term (attributable to insufficiently constrained model parameters). Although MCD+N has appealing features, many heuristic approximations were employed during its derivation, and rigorous evaluations and evidence of its asserted capability to detect dissimilarity were lacking. To address this, we provided an in‐depth evaluation of the scheme's potential and limitations. We showed that for reproducing soil moisture dynamics recorded by the Soil Moisture Active Passive (SMAP) mission, MCD+N indeed gave a good estimate of predictive error, provided that we tuned a hyperparameter and used a representative training data set. The input‐dependent term responded strongly to observational noise, while the model term clearly acted as a detector for physiographic dissimilarity from the training data, behaving as intended. However, when the training and test data were characteristically different, the input‐dependent term could be misled, undermining its reliability. Additionally, due to the data‐driven nature of the model, data noise also influences network weight uncertainty, and therefore the two uncertainty terms are correlated. Overall, this approach has promise, but care is needed to interpret the results.
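As a rough illustration of how the two uncertainty terms described in the abstract can be combined, the sketch below assumes the common Monte Carlo dropout recipe: the network is run several times with dropout left on, and each pass returns a predicted value and a predicted log-variance of the data noise. The function name `combine_mc_dropout_samples` and the toy numbers are hypothetical, and the paper's actual implementation and hyperparameter tuning may differ.

```python
import math
from statistics import fmean, pvariance


def combine_mc_dropout_samples(means, log_vars):
    """Combine T stochastic forward passes into one prediction.

    means    -- predicted soil moisture from each dropout pass
    log_vars -- predicted log-variance of the data noise from each pass
    (Both inputs are assumed outputs of a hypothetical two-headed network.)
    """
    if len(means) != len(log_vars) or len(means) < 2:
        raise ValueError("need at least two passes with matching lengths")
    pred_mean = fmean(means)
    # Weight (model) uncertainty: spread of the predictions across dropout masks.
    model_var = pvariance(means, mu=pred_mean)
    # Data noise: average of the input-dependent variances the network predicts.
    data_var = fmean([math.exp(s) for s in log_vars])
    # Total predictive standard deviation, treating the two terms as additive.
    total_std = math.sqrt(model_var + data_var)
    return pred_mean, model_var, data_var, total_std


# Toy example with four passes (values are illustrative only).
example_means = [0.20, 0.22, 0.18, 0.20]
example_log_vars = [math.log(0.0004)] * 4
pred_mean, model_var, data_var, total_std = combine_mc_dropout_samples(
    example_means, example_log_vars
)
print(pred_mean, model_var, data_var, total_std)
```

Adding the two variances treats the terms as separable, which is a simplification: the abstract notes that data noise also influences the weight uncertainty, so the two terms are correlated in practice.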

URL: https://doi.org/10.1029/2020WR028095
DOI: 10.1029/2020WR028095
Publication Date: 11/2020

Citation:
Fang, K, D Kifer, K Lawson, and C Shen. 2020. "Evaluating the Potential and Challenges of an Uncertainty Quantification Method for Long Short‐Term Memory Models for Soil Moisture Predictions." Water Resources Research 56(12). https://doi.org/10.1029/2020WR028095.