Score-Based Tests With Fixed Effects Person Parameters in Item Response Theory: Detecting Model Misspecification Including Differential Item Functioning

Rudolf Debelak & Charles Driver

Applied Psychological Measurement, 2026. Article. https://doi.org/10.1177/01466216261422480

AJG 2 · ABDC B · Weight: 0.50

Abstract

We present a fast, score-based test for detecting model misspecification in item response theory (IRT) models that remains valid when person parameters are treated as fixed effects, as may be done for very large data sets. The new approximation (i) eliminates the need to pre-specify ability groups or priors for person abilities, (ii) does not require explicit functional form assumptions, (iii) works with two estimators designed for very high item/person counts, constrained joint maximum likelihood (CJML) and joint maximum a posteriori (JMAP), and (iv) requires only a single model fit, making DIF screening faster and simpler than alternatives based on model comparisons. A spline-based residualization step further suppresses spurious Type I errors when the ordering covariate is correlated with ability. Simulations with the two-parameter logistic model show nominal error rates and high power once examinees contribute around 15-20 responses; only extremely short tests (around 10 items) still pose challenges under strong impact. An application to 1,602 reading items and 57,684 students from the Mindsteps platform demonstrates scalability and practical value, flagging 13% of items for gender-related DIF and correlating highly with conventional approaches that model DIF explicitly. Together, these results position the proposed test as a robust, computation-light diagnostic for large-scale assessments when classical random-effects approaches are infeasible, ability group structure is unknown or complex, or the shape of DIF effects is unknown or complex.
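The core idea of a score-based test can be sketched in a few lines: after a single model fit, compute each person's score (gradient) contributions for an item, order them by a covariate, and test whether the scaled cumulative sum wanders farther from zero than random fluctuation allows. The following is a minimal, hypothetical sketch for one 2PL item with fixed-effects abilities; the function names, the double-maximum statistic, and the approximate information matrix are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def item_scores(y, theta, a, b):
    """Per-person score contributions for one 2PL item.
    Columns: d log-lik / d a and d log-lik / d b (illustrative)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    resid = y - p
    return np.column_stack([(theta - b) * resid, -a * resid])

def double_max_stat(scores, order):
    """Max absolute scaled cumulative score, persons ordered by a covariate."""
    s = scores[np.argsort(order)]
    s = s - s.mean(axis=0)                    # center contributions
    n = len(s)
    cs = np.cumsum(s, axis=0) / np.sqrt(n)    # scaled cumulative score process
    info = s.T @ s / n                        # crude information estimate
    L = np.linalg.cholesky(np.linalg.inv(info))
    return np.abs(cs @ L).max()               # decorrelate, take double max

rng = np.random.default_rng(1)
n = 2000
theta = rng.normal(size=n)                    # fixed-effects abilities (given)
a, b = 1.2, 0.0                               # fitted item parameters (given)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-a * (theta - b))))  # item fits: no DIF
covariate = rng.normal(size=n)                # ordering covariate, e.g. age
stat = double_max_stat(item_scores(y, theta, a, b), covariate)
print(np.isfinite(stat) and stat >= 0)
```

Large values of such a statistic signal parameter instability along the covariate (i.e., DIF); critical values would come from the limiting process of the cumulative score under the null. Note that this sketch takes abilities and item parameters as given from one joint fit, which is exactly why only a single model fit is needed.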


Cite this paper

https://doi.org/10.1177/01466216261422480


@article{debelak2026,
  title   = {{Score-Based Tests With Fixed Effects Person Parameters in Item Response Theory: Detecting Model Misspecification Including Differential Item Functioning}},
  author  = {Debelak, Rudolf and Driver, Charles},
  journal = {Applied Psychological Measurement},
  year    = {2026},
  doi     = {10.1177/01466216261422480},
}



