Quality Inspection Information Extraction: Intelligent Parsing of Construction Quality Inspection Specifications Based on Natural Language Processing
Jilong Liu et al.
Abstract
Construction quality inspections play a vital role in ensuring project safety, regulatory compliance, and long-term sustainability. However, the diverse and intricate nature of construction quality inspection specifications poses substantial challenges for manual parsing. This paper presents a parsing method grounded in natural language processing (NLP) that efficiently extracts inspection information from a wide range of quality inspection specifications through a generalized, adaptable approach. A structured extraction method based on regular expressions is introduced, accompanied by a comprehensive labeling system for named entity recognition (NER) developed in this work; a corresponding dataset is constructed to address existing data deficiencies in this domain. By combining the robustly optimized bidirectional encoder representations from transformers pretraining approach (RoBERTa) with whole word masking (WWM), a bidirectional long short-term memory (BiLSTM) network, and a conditional random field (CRF) module, information extraction performance is significantly enhanced. Through prompt engineering, Cypher statements are generated from the NER results with the assistance of a large language model (LLM), enabling the construction of a knowledge graph of inspection information. The method effectively parses quality inspection specifications and automates the extraction of inspection data, thereby reducing reliance on manual operations. It lays a solid foundation for the future automation of quality inspection tasks and contributes to the advancement of intelligent management practices.
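The regex-based structured extraction and Cypher generation steps described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the sample clause, the regular expression, the entity field names, and the Cypher node/relationship labels are all hypothetical stand-ins for the paper's patterns, labeling system, and LLM-generated output.

```python
import re

# Hypothetical clause in the style of a quality inspection specification
# (illustrative text, not drawn from the paper's dataset).
clause = ("Wall plastering: surface flatness allowable deviation 4 mm, "
          "checked with a 2 m straightedge.")

# Toy pattern capturing an inspection item, quality index, tolerance, and
# checking method; the paper's actual extraction rules are more general.
pattern = re.compile(
    r"(?P<item>[\w\s]+?):\s*"
    r"(?P<index>[\w\s]+?)\s+allowable deviation\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>mm)"
    r"(?:,\s*checked with a\s+(?P<method>[^.]+))?"
)

m = pattern.search(clause)
entities = {k: v.strip() for k, v in m.groupdict().items() if v}

# From entities like these, an LLM could be prompted to emit a Cypher
# statement; here one is assembled by hand to show the target shape.
cypher = (
    f"CREATE (:Item {{name: '{entities['item']}'}})"
    f"-[:HAS_TOLERANCE {{value: {entities['value']}, "
    f"unit: '{entities['unit']}'}}]->"
    f"(:Index {{name: '{entities['index']}'}})"
)
print(entities)
print(cypher)
```

In the paper's pipeline the regex stage handles regularly structured clauses, while the RoBERTa-WWM + BiLSTM + CRF model labels entities in free-form text; both feed the same knowledge-graph construction step.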