In the Blink of an AI: Exploring Large Language Models’ Capability to Infer Traits From LinkedIn
Tobias Marc Härtel
Abstract
Large language models (LLMs) are increasingly promoted to practitioners as tools for inferring personality traits from LinkedIn profiles, promising scalable and innovative assessments. Yet the psychometric foundations of such inferences remain untested. Building on the lens model, we presented 406 LinkedIn profiles to Microsoft Copilot (powered by GPT-4) twice, using single-shot prompting to assess personality (Big Five, narcissism) and intelligence. Inferences showed satisfactory intra-rater reliability for observable traits (up to r = .81) but poor reliability for less visible traits (as low as r = .31), suggesting that inferences for the latter are unstable. Correlations with ground-truth test scores indicate above-chance yet limited convergent validity for intelligence (r = .24), openness (r = .20), and extraversion (r = .20), but not for less visible traits. Analysis of 32 coded LinkedIn cues suggests that this above-chance convergence reflects Copilot drawing on LinkedIn information with some consistency and sensitivity to valid trait signals. While this suggests a rudimentary functional grasp of personality, inferences were undermined by serious flaws, including positivity bias, range restriction, poor discriminant validity, cue overgeneralization, and adverse demographic impacts. By extending the lens model to LLMs as perceivers, we offer a theoretical and empirical foundation for understanding LLM-based trait inferences. Overall, claims that LLMs can validly infer personality from LinkedIn profiles are not just overoptimistic but potentially harmful: they risk encouraging the adoption of practices that could lead to invalid selection decisions, unfair treatment of applicants, and legal exposure for organizations.
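The two headline statistics in the abstract are both correlations: intra-rater reliability compares the ratings from the two presentations of the same profiles, and convergent validity compares ratings with ground-truth test scores. The following is a minimal sketch of that arithmetic, not the study's analysis code. The function name `pearson_r`, the toy ratings, and the choice to use the first pass for the validity correlation are all assumptions made for illustration; the numbers are invented and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and the two sums of squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values for five profiles (NOT data from the study).
pass_1 = [3.0, 4.0, 2.5, 4.5, 3.5]       # trait ratings, first presentation
pass_2 = [3.5, 4.0, 2.0, 4.5, 3.0]       # trait ratings, second presentation
test_scores = [2.0, 4.5, 3.0, 3.5, 4.0]  # ground-truth test scores

# Intra-rater reliability: agreement of the rater with itself across passes.
intra_rater = pearson_r(pass_1, pass_2)

# Convergent validity: agreement of ratings with test scores.
# Using the first pass here is an assumption of this sketch.
convergent = pearson_r(pass_1, test_scores)

print(f"intra-rater r = {intra_rater:.2f}, convergent r = {convergent:.2f}")
```

In a setup like this, a trait with high intra-rater reliability but low convergent validity would indicate ratings that are stable yet not aligned with the measured trait, which is why the abstract reports the two quantities separately.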