Conditional Coverage Estimation for High-Quality Prediction Intervals
Ziyi Huang, Henry Lam, Haofeng Zhang
Journal of Systems Science and Systems Engineering, 2023, Vol. 32, Issue 3: 289-319.
Deep learning has recently been studied as a way to generate high-quality prediction intervals (PIs) for uncertainty quantification in regression tasks, including recent applications in simulation metamodeling. The high-quality criterion requires PIs to be as narrow as possible while maintaining a pre-specified level of data (marginal) coverage. However, most existing works on high-quality PIs lack accurate information on conditional coverage, which may lead to unreliable predictions when the conditional coverage is significantly smaller than the marginal coverage. To address this problem, we propose an end-to-end framework that outputs high-quality PIs and simultaneously provides an estimate of their conditional coverage. In doing so, we design a new loss function that is both easy to implement and theoretically justified via an exponential concentration bound. Our evaluation on real-world benchmark datasets and synthetic examples shows that our approach not only achieves competitive results on high-quality PIs in terms of average PI width, but also accurately estimates conditional coverage information that is useful in assessing model uncertainty.
Keywords: Uncertainty quantification / prediction intervals / conditional coverage / neural networks / calibration error
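As an illustration of the high-quality criterion described in the abstract (not the paper's proposed loss), the two standard quantities it trades off can be sketched as follows: marginal coverage, commonly measured by the prediction interval coverage probability (PICP), and sharpness, commonly measured by the mean prediction interval width (MPIW).

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction Interval Coverage Probability: the fraction of targets
    falling inside their intervals; should meet the pre-specified level
    (e.g., 0.95) for the marginal-coverage requirement."""
    return float(np.mean((y >= lower) & (y <= upper)))

def mpiw(lower, upper):
    """Mean Prediction Interval Width: narrower is better, subject to
    the coverage constraint above."""
    return float(np.mean(upper - lower))

# Toy example: four targets with hypothetical interval predictions.
y  = np.array([1.0, 2.0, 3.0, 4.0])
lo = np.array([0.5, 1.5, 2.5, 4.5])
hi = np.array([1.5, 2.5, 3.5, 5.5])
print(picp(y, lo, hi))  # 0.75: three of the four targets are covered
print(mpiw(lo, hi))     # 1.0: each interval has width 1.0
```

PICP measures only marginal coverage, averaged over the whole input space; the paper's point is that it can mask regions where conditional coverage is much lower, which motivates estimating conditional coverage alongside the intervals.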