Auditing Artificial Intelligence: Context Engineering with Retrieval-Augmented Generation

Mark D. Sheldon

Journal of Information Systems, 2026 (article)
https://doi.org/10.2308/isys-2025-045
AJG 1 · ABDC A
Weight: 0.50

Abstract

Retrieval-augmented generation (RAG) systems enhance large language models (LLMs) by integrating dynamic external data, enabling more contextually relevant and accurate outputs. As these systems gain traction in accounting applications (e.g., financial reporting, auditing, and tax), concerns emerge regarding data reliability, control oversight, and auditability. This study applies Design Science Research Methodology to develop a practical control framework tailored to the four core stages of a typical RAG system: External Data Management, User Interaction and Query Flow, Retrieval and Prompt Construction, and LLM Inference and Response. The proposed framework includes control objectives linked to stage-specific risks relevant to financial reporting. Iterative validation was conducted using an innovative multi-LLM consensus process to ensure coverage and reduce hallucination risk. This study contributes a framework for evaluating artificial intelligence (AI)-integrated systems, guidance for auditors and regulators to assess probabilistic AI outputs, extension of existing audit control frameworks, and instructional materials for accounting educators.


Cite this paper

https://doi.org/10.2308/isys-2025-045

Or copy a formatted citation

@article{sheldon2026,
  title        = {{Auditing Artificial Intelligence: Context Engineering with Retrieval-Augmented Generation}},
  author       = {Sheldon, Mark D.},
  journal      = {Journal of Information Systems},
  year         = {2026},
  doi          = {10.2308/isys-2025-045},
}



Evidence weight

0.50

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.50 × 0.40 = 0.20
M · momentum: 0.50 × 0.15 = 0.07
V · venue signal: 0.50 × 0.05 = 0.03
R · text relevance †: 0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page. For your query’s actual relevance score, open this paper from a search result.
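
The breakdown above reads as a weighted sum of four component scores. A minimal sketch of that arithmetic follows, assuming the overall evidence weight is simply the sum of each component score times its Balanced-mode weight. The page does not state the formula explicitly, and the names `WEIGHTS` and `evidence_weight` are hypothetical. Before rounding to two decimals, the M and V contributions are 0.075 and 0.025.

```python
# Balanced-mode weights as listed on the page (F / M / V / R).
WEIGHTS = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}


def evidence_weight(scores, weights=WEIGHTS):
    """Return the weighted sum of component scores.

    Assumption: the page's overall score is a plain weighted sum,
    with weights that add up to 1.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[key] * scores[key] for key in weights)


# Component scores shown on the detail page (all 0.50).
scores = {"F": 0.50, "M": 0.50, "V": 0.50, "R": 0.50}
total = evidence_weight(scores)
print(f"{total:.2f}")  # prints 0.50
```

Because R carries a weight of 0.40, replacing the estimated 0.50 with a query-specific relevance score can move the total by up to 0.20 in either direction under this assumption.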