Construction specification knowledge extraction method based on hybrid deep learning algorithm
Xufang DENG, Fei CHENG, Yuangeng LYU, Lun DENG, Leping LIU, Jingyi FENG
Water Resources and Hydropower Engineering, 2025, Vol. 56, Issue (S1): 76-84.
Engineering specifications are among the most important standard documents used during construction. Because these texts are unstructured, extracting the knowledge they contain efficiently and accurately, and presenting it visually, can substantially improve knowledge utilization and help management personnel understand the specifications. A deep learning-based method is proposed for extracting knowledge from typical engineering specification texts. For entity recognition, the method integrates ALBERT (A Lite Bidirectional Encoder Representations from Transformers), BiLSTM (Bidirectional Long Short-Term Memory), and CRF (Conditional Random Field) into a single model that enriches the semantic features of the text to identify entities in the specifications. For relation extraction, it combines an attention mechanism with BiLSTM. An engineering specification knowledge graph is then constructed from the extracted entities and relations. The method was validated on the “Construction and Acceptance Specifications for Water Supply and Drainage Pipeline Projects” as a typical example, yielding an F1 score of 78.18% for entity recognition, which is superior to traditional models, and an F1 score of 98.35% for relation extraction. The resulting knowledge graph supports both a global display of information and the retrieval of specific information, which improves the efficiency of using specification knowledge and assists on-site construction.
engineering specification / knowledge extraction / ALBERT pre-training model / BiLSTM / CRF / attention
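
The final step described in the abstract, assembling extracted entities and relations into a knowledge graph that supports both a global view and specific retrieval, can be illustrated with a minimal sketch. It assumes the extraction models output (head, relation, tail) triples. The triples, relation names, and function names below are hypothetical placeholders for illustration, not results or code from the paper.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triples; illustrative only, not taken
# from the specification or from the paper's extraction results.
TRIPLES = [
    ("trench excavation", "requires", "slope support"),
    ("trench excavation", "precedes", "pipe laying"),
    ("pipe laying", "precedes", "hydrostatic test"),
    ("hydrostatic test", "applies_to", "pressure pipeline"),
]


def build_graph(triples):
    """Index triples by head entity: head -> list of (relation, tail)."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return dict(graph)


def query(graph, head, relation=None):
    """Specific retrieval: tails linked to `head`, optionally filtered by relation."""
    return [
        tail
        for rel, tail in graph.get(head, [])
        if relation is None or rel == relation
    ]


def summary(graph):
    """Global view: number of outgoing edges per head entity."""
    return {head: len(edges) for head, edges in graph.items()}


graph = build_graph(TRIPLES)
```

Calling `query(graph, "trench excavation", "precedes")` looks up one relation for one entity, while `summary(graph)` gives a coarse overview of the whole graph. A production system would more likely store the triples in a graph database, which the abstract does not specify.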