GeoNER: Geological Named Entity Recognition with Enriched Domain Pre-Training Model and Adversarial Training

Kai MA , Xinxin HU , Miao TIAN , Yongjian TAN , Shuai ZHENG , Liufeng TAO , Qinjun QIU

Acta Geologica Sinica (English Edition) ›› 2024, Vol. 98 ›› Issue (5) : 1404 -1417.

DOI: 10.1002/1755-6724.15213
Original Article

Abstract

Geological reports are an important form of geological data and contain rich expert and geological knowledge, but a central challenge for current research on geological knowledge extraction and mining is how to understand these reports accurately under the guidance of domain knowledge. Generic named entity recognition models and tools can be applied to geoscience reports and documents, but their lack of domain-specific knowledge leads to a pronounced decline in recognition accuracy. This study summarizes six types of typical geological entities with reference to the ontological system of the geological domain and builds a high-quality corpus for the task of geological named entity recognition (GNER). In addition, GeoWoBERT-advBGP (Geological Word-base BERT-adversarial training Bi-directional Long Short-Term Memory Global Pointer) is proposed to address the ambiguity, diversity and nesting of geological entities. The model first uses the fine-tuned, word-granularity pre-training model GeoWoBERT (Geological Word-base BERT) and combines it with text features extracted by a BiLSTM (Bi-directional Long Short-Term Memory) network. An adversarial training algorithm is then applied to improve the robustness of the model and its resistance to interference, and decoding is finally performed with a global association pointer algorithm. Experimental results show that the proposed model achieves high performance on the constructed dataset and is capable of mining rich geological information.
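
To illustrate why a pointer-style decoder suits nested entities, the following is a minimal sketch of span decoding over per-type score matrices. It is an illustration only, not the authors' implementation: the function name `decode_spans`, the threshold of 0, the token sequence, the labels `ROCK` and `AGE`, and all score values are assumptions made for this example. Each entity type gets a matrix whose entry at row `start`, column `end` scores the span from token `start` to token `end`, and every span scoring above the threshold is emitted, so a span and a sub-span of it can both be recognized.

```python
from typing import Dict, List, Tuple

# (entity type, start token index, end token index, both inclusive)
Span = Tuple[str, int, int]


def decode_spans(
    scores: Dict[str, List[List[float]]], threshold: float = 0.0
) -> List[Span]:
    """Return every (type, start, end) span whose score exceeds the threshold.

    Only the upper triangle (end >= start) of each matrix is inspected,
    because a span cannot end before it starts.
    """
    spans: List[Span] = []
    for label, matrix in scores.items():
        for start, row in enumerate(matrix):
            for end in range(start, len(row)):
                if row[end] > threshold:
                    spans.append((label, start, end))
    return spans


# Hypothetical three-token input and hand-written scores (not model output).
tokens = ["Jurassic", "granite", "intrusion"]
NEG = -10.0  # stands in for a confidently rejected span
scores = {
    "ROCK": [
        [NEG, NEG, 2.1],  # tokens 0..2: "Jurassic granite intrusion"
        [NEG, 3.4, NEG],  # token 1 alone: "granite", nested in the span above
        [NEG, NEG, NEG],
    ],
    "AGE": [
        [1.7, NEG, NEG],  # token 0 alone: "Jurassic"
        [NEG, NEG, NEG],
        [NEG, NEG, NEG],
    ],
}

for label, start, end in decode_spans(scores):
    print(label, " ".join(tokens[start : end + 1]))
```

Because each candidate span is scored independently, overlapping and nested spans do not compete with one another, which is harder to express in a sequence-labeling decoder that assigns a single tag per token.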

Keywords

geological named entity recognition / geological report / adversarial training / confrontation training / global pointer / pre-training model

Cite this article

Kai MA, Xinxin HU, Miao TIAN, Yongjian TAN, Shuai ZHENG, Liufeng TAO, Qinjun QIU. GeoNER: Geological Named Entity Recognition with Enriched Domain Pre-Training Model and Adversarial Training. Acta Geologica Sinica (English Edition), 2024, 98(5): 1404-1417. DOI: 10.1002/1755-6724.15213

RIGHTS & PERMISSIONS

2024 Geological Society of China
