Enriched Construction Regulation Inquiry Responses: A Hybrid Search Approach for Large Language Models
Chuanni He et al.
Abstract
Existing automated compliance check tools have limited applicability in construction because they cannot provide end-to-end responses to the fragmented and unstructured compliance checking requirements found in practice. We explored the potential of large language models (LLMs) to fill this gap by proposing an improved retrieval-augmented generation (RAG) framework for question-answering (QA)-based construction quality checks. The framework contains a novel hybrid search engine that integrates term frequency–inverse document frequency (TF-IDF)-based keyword search with text-embedding search to facilitate domain semantic-aware extraction of regulation information. Building on this engine, we established a RAG-based chatbot that enables construction managers to obtain construction quality check results and their justification directly and precisely through conversation. The framework was tested on 110 real-world QA scenarios covering three concrete structure regulations totaling 148,170 words. Results show that the enhanced system improved hit rate and mean reciprocal rank (MRR) by 15.1% and 11.2%, respectively, compared with naïve RAG. Its natural language responses are also more precise and faithful than those of conventional LLMs. Our research contributes to the body of knowledge by proposing an improved RAG system that enhances the practicability of automated compliance checks. It also pushes the boundary of LLM applications in construction by revealing how domain-specific terminology facilitates knowledge extraction in LLM systems.
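To make the retrieval step and its two evaluation metrics concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how keyword and embedding scores are combined, so the min-max normalization, the weighted sum, and the weight `alpha` are assumptions, as are all function names and the clause identifiers. Hit rate at k and MRR follow their standard definitions.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two retrievers are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(keyword: Dict[str, float],
                embedding: Dict[str, float],
                alpha: float = 0.5) -> List[str]:
    """Rank chunks by a weighted sum of normalized scores.

    Assumption: a linear fusion with weight `alpha` on the keyword
    (e.g. TF-IDF) score; the paper may fuse the signals differently.
    """
    kw, em = min_max(keyword), min_max(embedding)
    fused = {
        doc_id: alpha * kw.get(doc_id, 0.0) + (1 - alpha) * em.get(doc_id, 0.0)
        for doc_id in set(kw) | set(em)
    }
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


def hit_rate(rankings: List[List[str]], relevant: List[str], k: int) -> float:
    """Fraction of queries whose relevant chunk appears in the top k."""
    hits = sum(1 for ranking, rel in zip(rankings, relevant) if rel in ranking[:k])
    return hits / len(rankings)


def mrr(rankings: List[List[str]], relevant: List[str]) -> float:
    """Mean of 1/rank of the relevant chunk (0 if it was not retrieved)."""
    total = 0.0
    for ranking, rel in zip(rankings, relevant):
        if rel in ranking:
            total += 1.0 / (ranking.index(rel) + 1)
    return total / len(rankings)


# Hypothetical scores for three regulation clauses and one query.
keyword_scores = {"clause_1": 2.0, "clause_2": 0.5, "clause_3": 0.0}
embedding_scores = {"clause_1": 0.2, "clause_2": 0.9, "clause_3": 0.4}
ranking = hybrid_rank(keyword_scores, embedding_scores, alpha=0.5)
```

In this sketch, raising `alpha` favors exact matches on domain terms, while lowering it favors semantic similarity from the embeddings.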
14 citations