How Well Do Ratings Reflect Sentiment? Evidence From a Large Italian Review Corpus

Nicolò Biasetton et al.

Applied Stochastic Models in Business and Industry · 2026 · article
https://doi.org/10.1002/asmb.70090
ABDC rating: B
Evidence weight: 0.50

Abstract

Understanding whether numerical ratings reliably reflect the sentiment expressed in user‐generated product reviews is critical for accurate interpretation of online feedback. Although star ratings provide immediate, quantifiable signals to consumers and businesses, they may not fully convey the nuanced sentiment contained in text. Thus, we investigate the relationship between review ratings and underlying sentiment using a large corpus of Italian online product reviews. Since review corpora typically lack explicit sentiment labels, we develop a predictive framework for sentiment. We use a BERT‐based encoder (specifically, AlBERTo), fine‐tuned on our large, domain‐specific corpus, and a multi‐task CORAL ordinal regression trained on a sample with multiple human annotations. Finally, we utilize Correspondence Analysis to compare user ratings with the predicted sentiment scores. Our sentiment model shows strong performance on the validation set when evaluated on a five‐point ordinal scale, achieving MAE below 0.62 and RMSE below 0.82. The comparison between ratings and sentiment predictions shows that ratings and textual sentiment are generally aligned at extreme and neutral points, but notable discrepancies exist for mid‐scale evaluations, where ratings often fail to capture underlying textual nuances.
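The abstract's sentiment model pairs a fine-tuned AlBERTo encoder with a CORAL ordinal regression head. A minimal sketch of the CORAL decoding step is below, following the standard formulation (one shared score per example plus K−1 ordered bias thresholds); this is an illustration of the general technique, not the authors' implementation, and the bias values are hypothetical.

```python
import numpy as np

def coral_decode(logit, biases):
    """Decode a CORAL score into an ordinal rank on a 1..K scale.

    CORAL produces one scalar score per example and adds K-1 ordered
    biases, yielding K-1 binary probabilities P(y > k). The predicted
    rank is 1 plus the number of thresholds the example clears.
    """
    probs = 1.0 / (1.0 + np.exp(-(logit + biases)))  # P(y > k), k = 1..K-1
    return 1 + int(np.sum(probs > 0.5))

# Hypothetical 5-point scale -> 4 thresholds (ordered, decreasing)
biases = np.array([2.0, 1.0, -1.0, -2.0])
print(coral_decode(0.0, biases))  # clears the first two thresholds -> rank 3
```

Because the biases are constrained to be ordered, the K−1 binary decisions are mutually consistent, which is what makes the decoded rank a valid ordinal prediction rather than K−1 independent classifiers.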


Cite this paper

https://doi.org/10.1002/asmb.70090

Or copy a formatted citation

@article{biasetton2026,
  title        = {{How Well Do Ratings Reflect Sentiment? Evidence From a Large Italian Review Corpus}},
  author       = {Biasetton, Nicolò and others},
  journal      = {Applied Stochastic Models in Business and Industry},
  year         = {2026},
  doi          = {10.1002/asmb.70090},
}

Paste directly into BibTeX, Zotero, or your reference manager.



Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact    0.50 × 0.40 = 0.20
M · momentum           0.50 × 0.15 = 0.07
V · venue signal       0.50 × 0.05 = 0.03
R · text relevance †   0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
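The breakdown above is a weighted sum of four component scores under the Balanced-mode mix. Assuming the displayed contributions are simple products (with the small discrepancies due to display rounding), the composite weight can be recomputed as:

```python
# Component scores shown on the detail page (all 0.50 here)
components = {"F": 0.50, "M": 0.50, "V": 0.50, "R": 0.50}
# Balanced-mode mix: F 0.40 / M 0.15 / V 0.05 / R 0.40
mix = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}

# Composite evidence weight = sum of score × mix over components
weight = sum(components[k] * mix[k] for k in mix)
print(round(weight, 2))  # 0.5
```

Note that the mix coefficients sum to 1.00, so when every component score is 0.50 the composite weight is exactly 0.50, matching the value shown.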