Calibrated Model Criticism Using Split Predictive Checks

Jiawei Li & Jonathan H. Huggins

Journal of the American Statistical Association · 2026 · https://doi.org/10.1080/01621459.2026.2649585 · preprint
AJG 4 · ABDC A*
Weight
0.40

What the paper says

Assessing how well a Bayesian model generalizes to unobserved data is essential, yet existing general-purpose model checks are either not properly calibrated (as in posterior predictive checks) or fail to be sufficiently general for practical use (e.g., due to requiring model-specific derivations). We propose split predictive checks (SPCs) as a simple, general-purpose class of predictive checks that maintain the usability of posterior predictive checks while directly targeting predictive generalization. SPCs work by splitting the data into training and test subsets, then fitting the model to the former and evaluating predictive discrepancies on the latter. We develop an asymptotic theory for two variants – single SPCs and divided SPCs – and show that, unlike posterior predictive checks, both yield asymptotically calibrated (hence interpretable) p-values. Our results show that single SPCs work well at identifying substantial misspecification, while divided SPCs are sensitive even to subtle departures from modeling assumptions. Through simulation studies and real-data applications, we show that SPCs provide reliable, flexible, and computationally efficient assessments of Bayesian model fit, often revealing issues with predictive generalization missed by other predictive checks.
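The split-fit-evaluate recipe described in the abstract can be sketched for a toy case. The following is a minimal illustration, not the authors' implementation: it assumes a conjugate normal model with known standard deviation `sigma` and a N(0, `prior_var`) prior on the mean, and it uses an upper-tail p-value for a user-supplied statistic. The function name `single_spc_pvalue` and all of its parameters are hypothetical choices made for this example.

```python
import numpy as np

def single_spc_pvalue(y, stat, train_frac=0.5, n_draws=2000,
                      sigma=1.0, prior_var=100.0, seed=0):
    """Toy single split predictive check (illustrative assumptions only).

    Assumed model: y_i ~ N(mu, sigma^2) with sigma known, mu ~ N(0, prior_var).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)

    # 1. Split the data into training and test subsets.
    perm = rng.permutation(len(y))
    n_train = int(train_frac * len(y))
    train, test = y[perm[:n_train]], y[perm[n_train:]]

    # 2. Fit the model to the training subset (closed-form conjugate posterior).
    post_var = 1.0 / (1.0 / prior_var + n_train / sigma**2)
    post_mean = post_var * train.sum() / sigma**2
    mu_draws = rng.normal(post_mean, np.sqrt(post_var), size=n_draws)

    # 3. Simulate replicated test sets from the posterior predictive.
    y_rep = rng.normal(mu_draws[:, None], sigma, size=(n_draws, len(test)))

    # 4. Compare the statistic on the held-out data with its replicates.
    t_rep = np.apply_along_axis(stat, 1, y_rep)
    t_obs = stat(test)
    return float(np.mean(t_rep >= t_obs))

# Usage: data whose spread matches the assumed sigma vs. data that is overdispersed.
data_rng = np.random.default_rng(1)
y_ok = data_rng.normal(0.3, 1.0, size=400)
y_bad = data_rng.normal(0.3, 3.0, size=400)
print(single_spc_pvalue(y_ok, np.var))
print(single_spc_pvalue(y_bad, np.var))
```

With the sample variance as the statistic, overdispersed data should drive the p-value toward zero, because replicated test sets drawn under the assumed `sigma` rarely show as much spread as the held-out data. The divided variant mentioned in the abstract is not sketched here, since the abstract does not describe how it combines results across splits.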

2 citations


Cite this paper

https://doi.org/10.1080/01621459.2026.2649585

Or copy a formatted citation

@article{jiawei2026,
  title        = {{Calibrated Model Criticism Using Split Predictive Checks}},
  author       = {Li, Jiawei and Huggins, Jonathan H.},
  journal      = {Journal of the American Statistical Association},
  year         = {2026},
  doi          = {10.1080/01621459.2026.2649585},
}

Paste directly into BibTeX, Zotero, or your reference manager.



Evidence weight

0.40

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.25 × 0.4 = 0.10
M · momentum: 0.53 × 0.15 = 0.08
V · venue signal: 0.50 × 0.05 = 0.03
R · text relevance †: 0.50 × 0.4 = 0.20
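The breakdown above reads as a weighted sum of the four component scores. The sketch below reproduces that arithmetic under the assumption that the displayed two-decimal values are the actual inputs (the page may round them); the function name `evidence_weight` is hypothetical.

```python
# Balanced-mode weights and component scores as displayed on the page.
weights = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}
scores = {"F": 0.25, "M": 0.53, "V": 0.50, "R": 0.50}

def evidence_weight(scores, weights):
    """Weighted sum of component scores (assumed aggregation rule)."""
    return sum(scores[key] * weights[key] for key in weights)

total = evidence_weight(scores, weights)
print(f"{total:.2f}")  # prints 0.40
```

Summing the unrounded products gives roughly 0.4045, which matches the displayed 0.40; adding the four rounded row values instead would give 0.41.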

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.