The role of reliability in experiments

Jeffrey N. Rouder et al.

British Journal of Mathematical and Statistical Psychology · 2026 · https://doi.org/10.1111/bmsp.70042 · article
ABDC B
Weight: 0.50

Abstract

We are concerned about an emphasis on reliability in the analysis of psychology experiments. Experiments have two elements of sample size, the number of individuals and the number of replicate trials within a task, and that complicates reliability measures. To account for these elements, we distinguish among three levels of analysis: (1) A foundational level that centers task properties without recourse to either element of sample size. An example statistic is the intraclass correlation, which is a proportion of variances defined without reference to sample sizes. (2) An intermediate level that centers the number of trials but not the number of individuals. An example statistic on this level is reliability, which describes variabilities with reference to numbers of trials but not numbers of individuals. (3) A final level that centers both the numbers of individuals and trials. An example quantity is the uncertainty in a correlation coefficient, which, ideally, reflects sample size limits in individuals and trials. Reliability describes an intermediate level: it is useful neither for communicating foundational task properties nor for interpreting correlations. We advocate that researchers consider all three levels and highlight the role of hierarchical models in doing so.
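
The difference between the first two levels can be made concrete with a short sketch. It assumes a simple one-way random-effects setup that the abstract does not spell out: true individual scores vary with variance `var_between`, each trial adds independent noise with variance `var_within`, and a person's observed score is the mean of `n_trials` trials. The function names, parameter names, and numeric values are illustrative assumptions, not taken from the paper.

```python
def icc(var_between: float, var_within: float) -> float:
    """Intraclass correlation: share of total variance due to individuals.

    Depends only on the two variances, not on any sample size.
    """
    return var_between / (var_between + var_within)


def reliability(var_between: float, var_within: float, n_trials: int) -> float:
    """Reliability of a per-person mean over n_trials trials.

    Averaging shrinks trial noise to var_within / n_trials, so the value
    depends on the number of trials but not on the number of individuals.
    """
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    return var_between / (var_between + var_within / n_trials)


if __name__ == "__main__":
    # Hypothetical task: trial noise variance is 99 times the true variance.
    vb, vw = 1.0, 99.0
    print("ICC:", icc(vb, vw))
    for n in (10, 99, 1000):
        print(f"reliability with {n} trials:", reliability(vb, vw, n))
```

Under these assumed values the intraclass correlation stays at 0.01 whatever the design, while reliability climbs from roughly 0.09 with 10 trials to 0.50 with 99 trials. The same task can therefore look unreliable or reliable depending only on how many trials were run.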


Cite this paper

https://doi.org/10.1111/bmsp.70042

Or copy a formatted citation

@article{jeffrey2026,
  title        = {{The role of reliability in experiments}},
  author       = {Rouder, Jeffrey N. and others},
  journal      = {British Journal of Mathematical and Statistical Psychology},
  year         = {2026},
  doi          = {10.1111/bmsp.70042},
}

Paste directly into BibTeX, Zotero, or your reference manager.


Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.50 × 0.4 = 0.20
M · momentum: 0.50 × 0.15 = 0.07
V · venue signal: 0.50 × 0.05 = 0.03
R · text relevance †: 0.50 × 0.4 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
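
The rows above read as a weighted sum of four component scores. The sketch below reproduces that arithmetic under the assumption that the overall weight is simply the sum of score × weight terms. The combination rule is inferred from the rows shown, not from any documented formula, and the dictionary and function names are hypothetical.

```python
# Balanced-mode weights and component scores as listed on this page.
WEIGHTS = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}
SCORES = {"F": 0.50, "M": 0.50, "V": 0.50, "R": 0.50}


def evidence_weight(scores: dict, weights: dict) -> float:
    """Weighted sum of component scores (assumed combination rule)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must use the same keys")
    return sum(scores[key] * weights[key] for key in weights)


if __name__ == "__main__":
    for key in WEIGHTS:
        # Unrounded contribution of each component.
        print(key, SCORES[key] * WEIGHTS[key])
    print("total:", round(evidence_weight(SCORES, WEIGHTS), 2))
```

The unrounded M and V contributions are 0.075 and 0.025, which the page displays as 0.07 and 0.03. The four unrounded terms sum to 0.50, matching the evidence weight shown.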