Model calibration and evaluation via optimal subsampling using electronic health record data

Joochul Lee et al.

Journal of the Royal Statistical Society. Series A: Statistics in Society · 2026 · Article
https://doi.org/10.1093/jrsssa/qnag036
AJG 3 · Weight 0.50

Abstract

A common challenge in validating risk prediction models using electronic health record (EHR) data is that labels for the predicted outcome are not directly available. To enable efficient and unbiased model validation, we study optimal sampling designs for labelling an informative subset of patients in an EHR cohort. Given a pre-specified number of outcome labels, our design aims to minimize the asymptotic variance of an improved inverse probability weighted (‘I-IPW’) estimator for predictive accuracy metrics. Implementing the sampling requires accurate risk estimates and an estimate of the predictive accuracy metric of interest. We therefore propose to implement sampling in two steps. First, a portion of the target number of labels is acquired by applying entropy sampling to a random subset of the cohort. These initial labels are used to calibrate risk estimates and obtain an initial estimate of the predictive accuracy metric, both of which inform optimal sampling of the remaining labels. The final estimate of the predictive accuracy metric is obtained by applying the I-IPW estimator to the cohort and all acquired labels pooled together. Results from simulation studies and application to a real EHR dataset indicate superior efficiency of the proposed sampling design and I-IPW estimator.
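The two-step design described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' method: the abstract does not give the I-IPW estimator or the optimal sampling probabilities, so the sketch substitutes a plain Hájek (ratio) IPW estimator, Poisson sampling, the Brier score as the accuracy metric, and second-stage probabilities proportional to the square root of the expected squared deviation of each patient's loss from the initial metric estimate. The simulated cohort, the names run_two_stage and fit_recalibration, and all parameter values are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_recalibration(lp, y, w, iters=25):
    """Weighted logistic recalibration of y on the model logit (Newton steps)."""
    X = np.column_stack([np.ones_like(lp), lp])
    beta = np.zeros(2)
    for _ in range(iters):
        mu = sigmoid(X @ beta)
        grad = X.T @ (w * (y - mu))
        hess = (X * (w * mu * (1 - mu))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return beta

def run_two_stage(N=20000, q=0.25, n1=300, n2=700, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical cohort: outcomes y exist for everyone but are "hidden"
    # until a patient is selected for labelling.
    x = rng.normal(size=N)
    y = rng.binomial(1, sigmoid(-1.0 + 1.5 * x))
    lp = -0.3 + 1.2 * x            # logit of a miscalibrated model risk
    p = sigmoid(lp)
    z = (y - p) ** 2               # per-patient Brier loss (assumed metric)

    # Step 1: entropy sampling within a random subset (inclusion fraction q).
    h = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    pi_e = np.clip(n1 * h / (q * h.sum()), 1e-4, 1.0)
    pi1 = q * pi_e                 # marginal stage-1 inclusion probability
    lab1 = (rng.random(N) < q) & (rng.random(N) < pi_e)

    # Calibrate risk and get an initial metric estimate from stage-1 labels.
    w1 = 1.0 / pi1[lab1]
    beta = fit_recalibration(lp[lab1], y[lab1], w1)
    r = sigmoid(beta[0] + beta[1] * lp)          # calibrated risk
    theta0 = np.sum(w1 * z[lab1]) / np.sum(w1)   # initial Brier estimate

    # Step 2: stand-in "optimal" probabilities, pi2 proportional to
    # sqrt(E[(z - theta0)^2]) under the calibrated risk (an assumption).
    s = np.sqrt(r * ((1 - p) ** 2 - theta0) ** 2
                + (1 - r) * (p ** 2 - theta0) ** 2)
    pi2 = np.clip(n2 * s / s[~lab1].sum(), 1e-4, 1.0)
    lab2 = ~lab1 & (rng.random(N) < pi2)

    # Pooled Hajek IPW estimate; pi2 is treated as fixed given stage 1,
    # which is an approximation.
    lab = lab1 | lab2
    pi = pi1 + (1 - pi1) * pi2
    w = 1.0 / pi[lab]
    theta_hat = np.sum(w * z[lab]) / np.sum(w)
    return {"theta_hat": float(theta_hat),
            "theta_true": float(z.mean()),   # known only in simulation
            "n_labels": int(lab.sum())}

result = run_two_stage()
print(result)
```

Calling run_two_stage() with different seeds gives a rough sense of how the pooled estimate varies around the full-cohort Brier score, which is observable here only because the cohort is simulated.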


Cite this paper

https://doi.org/10.1093/jrsssa/qnag036

Or copy a formatted citation

@article{joochul2026,
  title        = {{Model calibration and evaluation via optimal subsampling using electronic health record data}},
  author       = {Lee, Joochul and others},
  journal      = {Journal of the Royal Statistical Society. Series A: Statistics in Society},
  year         = {2026},
  doi          = {10.1093/jrsssa/qnag036},
}

Paste directly into BibTeX, Zotero, or your reference manager.


Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.50 × 0.40 = 0.20
M · momentum: 0.50 × 0.15 = 0.075
V · venue signal: 0.50 × 0.05 = 0.025
R · text relevance †: 0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
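
The breakdown above reads as a weighted sum of four component scores. The following is a minimal sketch of that arithmetic, assuming the evidence weight is simply the sum of each component score times its Balanced-mode weight; the variable names are illustrative and not taken from the site.

```python
# Balanced-mode weights and component scores as listed on the page.
weights = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}
scores = {"F": 0.50, "M": 0.50, "V": 0.50, "R": 0.50}

# Assumed aggregation: weighted sum of the component scores.
contributions = {k: weights[k] * scores[k] for k in weights}
evidence_weight = sum(contributions.values())

print(contributions)
print(round(evidence_weight, 2))
```

Because the four weights sum to 1 and every component score is 0.50, the aggregate equals the common score.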