Heads we win, tails you lose: AI detectors in education
Mark Andrew Bassett et al.
Abstract
The increasing use of generative artificial intelligence (AI) in student assessment has led to institutional reliance on detection tools. Unlike plagiarism detection, AI detection relies on unverifiable probabilistic estimates. In this paper, we argue that generative AI detection should not be used in education due to its methodological imperfections, violation of procedural fairness, and unverifiable outputs. Generative AI detectors cannot be tested in real-world conditions where the true origin of a text is unknown. Attempts to validate results through linguistic markers, multiple tools, or comparisons with past work introduce confirmation bias rather than independent verification. Moreover, categorising text as human- or AI-generated imposes a false dichotomy that ignores work created with, not by, AI. Generative AI detection also raises security concerns. Academic integrity investigations must rely on evidence meeting the balance of probabilities standard, which generative AI detection scores do not satisfy.
1 citation