A Hybrid Approach of Prompt-based Learning and Rules for Domain Specific Named Entity Recognition

ZHANG Han; ZHANG Yazhou; XU Bingzhi; ZHANG Chengfang

doi:10.13705/j.issn.1671-6841.2024040

2025, Vol. 57, No. 05, pp. 31-38

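ZHANG Han^1,2, ZHANG Yazhou^2, XU Bingzhi^2, ZHANG Chengfang^1

1. Intelligent Policing Key Laboratory of Sichuan Province, Sichuan Police College; 2. School of Cyberspace Security, Zhengzhou University

Foundation: Open Project of the Intelligent Policing Key Laboratory of Sichuan Province (ZNJW2024KFQN005); Key Scientific Research Project of Higher Education Institutions of Henan Province (24A520047); Major Science and Technology Project of Henan Province (231100210200)

Email: chengfangzhang@scpolicec.edu.cn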

Release date: 2024-06-30

Publication date: 2024-06-30

Online publication date: 2024-06-30

Abstract:

Prompt-based fine-tuning offers a new direction for improving the performance of domain-specific named entity recognition (NER). However, existing prompt learning methods require manually constructed templates, produce lengthy prompts, and rely on fixed prompt templates. To address these problems, a domain-specific NER method that combines prompt learning with expert knowledge is proposed. First, the Bootstrapping algorithm is introduced to identify potential entities automatically, and the string matching algorithm used to obtain the types of unannotated entities that share the same context is improved so that more prompt templates can be obtained. Second, expert knowledge from the domain ontology is introduced to address the reliability of the prompt information. In addition, prompt information is expressed in the form of first-order predicates to reduce its length. Finally, experiments on a finance dataset and an information security dataset verify that the method effectively improves the performance of domain-specific NER.

Keywords: prompt-based learning; named entity recognition; natural language processing; low resource
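The abstract names two ideas that a small example can make concrete: growing the set of typed entities from shared contexts (bootstrapping), and writing the resulting prompt information as first-order predicates instead of full sentences. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names `bootstrap_entities`, `to_predicate`, and `to_sentence`, the single-token entities, the exact-match context window, and the toy finance sentences are all hypothetical. The improved string matching and the ontology check mentioned in the abstract are not modeled here.

```python
def context(tokens, i, window):
    """Return the (left, right) token context around position i."""
    left = tuple(tokens[max(0, i - window):i])
    right = tuple(tokens[i + 1:i + 1 + window])
    return left, right


def bootstrap_entities(sentences, seeds, window=2, rounds=3):
    """Grow a typed entity lexicon from seed entities (illustrative only).

    Assumption: an unlabeled token that appears in exactly the same
    context as a known entity receives that entity's type.
    """
    lexicon = dict(seeds)  # entity -> type
    for _ in range(rounds):
        # Step 1: collect context patterns around already known entities.
        patterns = {}
        for tokens in sentences:
            for i, tok in enumerate(tokens):
                if tok in lexicon:
                    patterns.setdefault(context(tokens, i, window), lexicon[tok])
        # Step 2: label unseen tokens that occur in a known context.
        added = False
        for tokens in sentences:
            for i, tok in enumerate(tokens):
                if tok in lexicon:
                    continue
                etype = patterns.get(context(tokens, i, window))
                if etype is not None:
                    lexicon[tok] = etype
                    added = True
        if not added:  # stop early once no new entity is found
            break
    return lexicon


def to_predicate(entity, etype):
    """Compact first-order predicate form of a prompt, e.g. Company(Acme)."""
    return f"{etype}({entity})"


def to_sentence(entity, etype):
    """Natural-language template, shown only for length comparison."""
    return f"{entity} is an entity of type {etype}."


# Hypothetical toy corpus and seed lexicon.
sentences = [
    ["shares", "of", "Acme", "rose", "today"],
    ["shares", "of", "Globex", "rose", "today"],
    ["the", "bank", "cut", "rates"],
]
seeds = {"Acme": "Company"}

lexicon = bootstrap_entities(sentences, seeds)
prompts = [to_predicate(e, t) for e, t in lexicon.items()]
print(lexicon)  # {'Acme': 'Company', 'Globex': 'Company'}
print(prompts)  # ['Company(Acme)', 'Company(Globex)']
```

In this toy run, `Globex` is labeled because it shares the context of the seed `Acme`, and the predicate form `Company(Globex)` is shorter than the equivalent sentence template, which is the length reduction the abstract refers to.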

Basic information:

CLC number: TP391.1; TP18

Citation:

ZHANG H, ZHANG Y Z, XU B Z, et al. A hybrid approach of prompt-based learning and rules for domain specific named entity recognition[J]. Journal of Zhengzhou University (Natural Science Edition), 2025, 57(05): 31-38. DOI: 10.13705/j.issn.1671-6841.2024040.
