ChatGPT-4: Can It Handle Real-World Accounting Cases?

Theresa F. Henry et al.

Journal of Emerging Technologies in Accounting · 2025 · article
https://doi.org/10.2308/jeta-2024-028
AJG 1 · ABDC B
Weight: 0.37

Abstract

In this study, we assess ChatGPT-4’s ability in terms of accuracy, appropriateness of support, and consistency by applying it to a sizable number of case questions within the Deloitte Trueblood Case Study series. We contribute to the literature in three ways. First, we evaluate ChatGPT-4 on its ability to provide open-ended responses to realistic case study questions. Second, we ask ChatGPT-4 to not only answer the case questions but also provide the appropriate support from the relevant FASB standard. Finally, we run the case questions through ChatGPT-4 three times, therefore assessing the variation in ChatGPT-4 responses. Our results show that ChatGPT-4’s ability to accurately answer and support the case questions with consistency is not at levels that would be expected by accounting professionals. ChatGPT-4’s performance indicates that it may not be ready for more advanced accounting applications or even be relied upon for supplementary support by an accountant.

1 citation

Cite this paper

https://doi.org/10.2308/jeta-2024-028

Or copy a formatted citation

@article{theresa2025,
  title        = {{ChatGPT-4: Can It Handle Real-World Accounting Cases?}},
  author       = {Henry, Theresa F. and others},
  journal      = {Journal of Emerging Technologies in Accounting},
  year         = {2025},
  doi          = {10.2308/jeta-2024-028},
}


Evidence weight

0.37

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.16 × 0.40 = 0.06
M · momentum: 0.53 × 0.15 = 0.08
V · venue signal: 0.50 × 0.05 = 0.03
R · text relevance †: 0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
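
The breakdown above reads as a weighted sum of four component scores, and the short sketch below checks that the listed components add up to the displayed evidence weight of 0.37. This is an inference from the figures shown, not a confirmed description of how the site computes the score. The names `WEIGHTS`, `scores`, and `evidence_weight` are hypothetical and chosen only for illustration.

```python
# Balanced-mode weights as displayed on the page (F / M / V / R).
WEIGHTS = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}

# Component scores as displayed for this paper.
# R is the page's default estimate of 0.50, not a query-specific relevance score.
scores = {"F": 0.16, "M": 0.53, "V": 0.50, "R": 0.50}


def evidence_weight(component_scores, weights=WEIGHTS):
    """Return the weighted sum of component scores (assumed formula)."""
    return sum(component_scores[key] * weights[key] for key in weights)


total = evidence_weight(scores)
print(round(total, 2))  # prints 0.37
```

Because the relevance term carries a weight of 0.40, opening the paper from a search result with a different relevance score could shift the total noticeably under this assumed formula.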