Conditional Coverage Estimation for High-Quality Prediction Intervals
Ziyi Huang, Henry Lam, Haofeng Zhang
Journal of Systems Science and Systems Engineering, 2023, Vol. 32, Issue 3: 289-319.
Deep learning has recently been studied as a way to generate high-quality prediction intervals (PIs) for uncertainty quantification in regression tasks, including recent applications in simulation metamodeling. The high-quality criterion requires PIs to be as narrow as possible while maintaining a pre-specified level of data (marginal) coverage. However, most existing works on high-quality PIs lack accurate information on conditional coverage, which may lead to unreliable predictions when the conditional coverage is significantly smaller than the marginal coverage. To address this problem, we propose an end-to-end framework that outputs high-quality PIs and simultaneously provides an estimate of their conditional coverage. In doing so, we design a new loss function that is both easy to implement and theoretically justified via an exponential concentration bound. Our evaluation on real-world benchmark datasets and synthetic examples shows that our approach not only achieves competitive results on high-quality PIs in terms of average PI width, but also accurately estimates conditional coverage information that is useful in assessing model uncertainty.
Keywords: Uncertainty quantification / prediction intervals / conditional coverage / neural networks / calibration error
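As an illustration of the high-quality criterion described in the abstract (not the paper's proposed loss), the two standard quantities it trades off can be sketched as follows: marginal coverage, commonly measured by the prediction interval coverage probability (PICP), and sharpness, commonly measured by the mean prediction interval width (MPIW).

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction Interval Coverage Probability: the fraction of targets
    falling inside their intervals; should meet the pre-specified level
    (e.g., 0.95) for the marginal-coverage requirement."""
    return float(np.mean((y >= lower) & (y <= upper)))

def mpiw(lower, upper):
    """Mean Prediction Interval Width: narrower is better, subject to
    the coverage constraint above."""
    return float(np.mean(upper - lower))

# Toy example: four targets with hypothetical interval predictions.
y  = np.array([1.0, 2.0, 3.0, 4.0])
lo = np.array([0.5, 1.5, 2.5, 4.5])
hi = np.array([1.5, 2.5, 3.5, 5.5])
print(picp(y, lo, hi))  # 0.75: three of the four targets are covered
print(mpiw(lo, hi))     # 1.0: each interval has width 1.0
```

PICP measures only marginal coverage, averaged over the whole input space; the paper's point is that it can mask regions where conditional coverage is much lower, which motivates estimating conditional coverage alongside the intervals.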