Review on Entity Relation Extraction for Medical Text
Zan Hongying; Guan Tongfeng; Zhang Kunli; Odmaa Byambasuren; Sui Zhifang
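Abstract:
Entity extraction and relation extraction are important subtasks of information extraction, and in recent years many researchers have studied them in depth using a variety of techniques. Applying these techniques to the medical domain, to extract knowledge from unstructured and semi-structured medical text and build medical knowledge graphs, can serve downstream tasks. Starting from the basic concepts of entity relation extraction in the medical domain, this review classifies deep learning models from several perspectives. It then analyzes and discusses supervised learning models and distantly supervised multi-instance learning models, grouped by how their datasets are constructed. Finally, it outlines future research directions for entity relation extraction from medical text.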
Keywords: entity relation extraction; medical domain; supervised learning; multi-instance learning
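Foundation: National Key Research and Development Program of China (2017YFB1002101); Major Program of the National Social Science Foundation of China (18ZDA315); China Postdoctoral Science Foundation (2019TQ0286); Science and Technology Research Project of Henan Province (192102210260); Henan Province Medical Science and Technology Research Program, Province-Ministry Co-construction Project (SB201901021); Key Scientific Research Projects of Higher Education Institutions of Henan Province (19A520003, 20A520038)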
DOI: 10.13705/j.issn.1671-6841.2020190