Estimating Trends With Differential Item Functioning: A Comparison of Five IRT-Based Approaches

Oskar Engels et al.

Educational and Psychological Measurement · 2026 · https://doi.org/10.1177/00131644251408818 · article

ABDC A

What the paper says

In longitudinal assessments, tests are frequently used to estimate trends over time. However, when item parameters lack invariance, time-point comparisons can be distorted, necessitating appropriate statistical methods to achieve accurate estimation. This study compares trend estimates using the two-parameter logistic (2PL) model under item parameter drift (IPD) across five trend-estimation approaches for two time points: First, concurrent calibration, which jointly estimates item parameters across multiple time points. Second, fixed calibration, which estimates item parameters at a single time point and fixes them at the other time point. Third, robust linking with Haberman and Haebara as linking methods with L_p or L_0 losses. Fourth, non-invariant items are detected using likelihood-ratio tests or the root mean square deviation statistic with fixed or data-driven cutoffs, and trend estimates are then recomputed using only the detected invariant items under partial invariance. Fifth, regularized estimation under a smooth Bayesian information criterion (SBIC) is applied, shrinking small or null IPD effects toward zero while estimating all others as nonzero. Bias and relative root mean square error (RMSE) were evaluated for the mean and SD at T2. An empirical example using synthetic longitudinal reading data, applying the trend-estimation approaches, is provided. The results indicate that the regularized estimation with SBIC performed best across conditions, maintaining low bias and RMSE, followed by robust linking methods. Specifically, Haberman linking with the L_0 loss function showed superior performance under unbalanced IPD, outperforming the partial invariance approaches. Concurrent and fixed calibration showed the poorest trend recovery under unbalanced IPD conditions.
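To illustrate why the loss function matters under unbalanced IPD, here is a minimal sketch, not the authors' implementation. It assumes a heavily simplified setting: noise-free item difficulties, equal discriminations, and equal ability SDs at both time points, so that linking reduces to estimating one shift. The function name `lp_shift`, the item count, the drift size, and the trend value are all hypothetical choices made for illustration.

```python
import numpy as np

def lp_shift(d, p, n_grid=2001):
    """Grid search for the shift delta that minimizes sum(|d_i - delta|**p)."""
    grid = np.linspace(d.min(), d.max(), n_grid)
    loss = (np.abs(d[:, None] - grid[None, :]) ** p).sum(axis=0)
    return grid[np.argmin(loss)]

# Hypothetical setup: 20 items, true mean change of 0.30 between T1 and T2.
true_trend = 0.30
b_t1 = np.linspace(-2.0, 2.0, 20)      # T1 difficulties on the T1 scale

# Unbalanced IPD: four items become harder at T2, none become easier.
drift = np.zeros(20)
drift[:4] = 0.5

# T2 difficulties expressed on T2's own standardized ability scale.
b_t2 = b_t1 + drift - true_trend

# Item-wise difficulty differences; invariant items all equal the true trend.
d = b_t1 - b_t2

est_l2 = lp_shift(d, p=2.0)    # squared loss: behaves like the mean difference
est_l05 = lp_shift(d, p=0.5)   # p < 1: behaves more like the modal difference

print(f"true trend      : {true_trend:.3f}")
print(f"L_2 estimate    : {est_l2:.3f}")
print(f"L_0.5 estimate  : {est_l05:.3f}")
```

With squared loss, the drifting items pull the estimated shift away from the true trend, whereas a loss with p below 1 lets the majority of invariant items determine the estimate. Actual Haberman and Haebara linking also handles discriminations and sampling error, which this sketch leaves out.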

Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact	0.50 × 0.40 = 0.20
M · momentum	0.50 × 0.15 = 0.075
V · venue signal	0.50 × 0.05 = 0.025
R · text relevance †	0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
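
The breakdown above can be reproduced with a short sketch. It assumes the evidence weight is the plain sum of the four component contributions, which the page implies but does not state. The dictionary names are hypothetical.

```python
# Balanced-mode weights as listed on the page.
weights = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}

# Component scores shown for this paper (all 0.50 on the detail page).
scores = {"F": 0.50, "M": 0.50, "V": 0.50, "R": 0.50}

# Each contribution is score x weight; the total is assumed to be their sum.
contributions = {k: scores[k] * weights[k] for k in weights}
evidence_weight = sum(contributions.values())

for key, value in contributions.items():
    print(f"{key}: {scores[key]:.2f} x {weights[key]:.2f} = {value:.3f}")
print(f"evidence weight = {evidence_weight:.2f}")
```

Because the four weights sum to 1 and every component score is 0.50 here, the total is also 0.50.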