Accessible AI: comparing ChatGPT and traditional machine learning for sentiment analysis
Irem Önder
Abstract
Purpose This study aims to investigate the effectiveness and accessibility of GPT-4-turbo, a large language model (LLM), compared to traditional machine learning (ML) methods for sentiment analysis in restaurant reviews. Beyond predictive performance, the study emphasizes practical usability, focusing on accessibility, implementation effort and interpretability, particularly for hospitality professionals without technical expertise.
Design/methodology/approach A data set of 4,000 Yelp restaurant reviews was used to compare four traditional ML models (Logistic Regression, Support Vector Machine, Random Forest and Naive Bayes) against GPT-4-turbo, deployed in a zero-shot, no-code setting via the paid ChatGPT interface. All models were evaluated using accuracy, F1 score, recall, computational time and interpretability. Five-fold cross-validation and paired Welch’s t-tests were used to assess statistical significance among traditional models, while bootstrapping and error analysis were used to assess GPT-4-turbo’s classification behavior.
Findings Support Vector Machine achieved the highest accuracy (92.36%) and F1 score (95.23%) among traditional models, while GPT-4-turbo offered the highest recall (97.28%) and required minimal setup, with no coding or data preprocessing. Although GPT-4-turbo had a slightly higher overall error rate, its ease of use and strong recall suggest that it is a viable option for rapid sentiment analysis in operational hospitality contexts.
Practical implications While traditional ML models remain effective, GPT-4-turbo offers a fast, no-code alternative for hospitality professionals with limited technical expertise. Its high recall and ease of use make it well-suited for timely monitoring of customer feedback and rapid service recovery. Overall, user-friendly AI tools can lower adoption barriers and support data-informed decision-making in hospitality operations, strengthening service quality and guest experience management.
Originality/value This study shifts the focus from technical model optimization to practical usability and accessibility of sentiment analysis tools in hospitality. It uniquely examines GPT-4-turbo as a no-code, plug-and-play alternative to traditional ML models for binary sentiment classification. By evaluating not just accuracy but also implementation effort and interpretability, the study offers a practitioner-oriented perspective on AI adoption in hospitality analytics, contributing new insights into real-world applicability in service and experience management contexts.
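As an illustration of the evaluation metrics named in the abstract, the following minimal sketch computes accuracy, recall and F1 for a binary sentiment classifier and estimates a bootstrap confidence interval for recall. It is not the study's implementation: the abstract does not say which bootstrap variant was used, so the percentile bootstrap here is an assumption, the function names `binary_metrics` and `bootstrap_ci` are hypothetical, and the labels are toy values rather than data from the Yelp reviews.

```python
import random


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


def bootstrap_ci(y_true, y_pred, metric="recall",
                 n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one metric (assumed variant)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample review indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_true = [y_true[i] for i in idx]
        sample_pred = [y_pred[i] for i in idx]
        scores.append(binary_metrics(sample_true, sample_pred)[metric])
    scores.sort()
    low = scores[int((alpha / 2) * n_resamples)]
    high = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return low, high


# Hypothetical toy labels (1 = positive review, 0 = negative review).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
ci_low, ci_high = bootstrap_ci(y_true, y_pred, metric="recall")
print(metrics)
print(f"95% bootstrap CI for recall: [{ci_low:.2f}, {ci_high:.2f}]")
```

With only ten toy labels the interval is wide; on a data set the size of the one described above, the same procedure would give a much tighter estimate.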