2017, No. 04, Vol. 49, pp. 40-45
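Research on Chinese Obstetric Electronic Medical Records Based on Natural Language Processing
Foundation: 973 Program (2014CB340504); National Natural Science Foundation of China (61402419, 60970083); National Social Science Foundation of China (14BYY096); Basic Research Project of the Henan Provincial Department of Science and Technology (142300410231, 142300410308)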
DOI: 10.13705/j.issn.1671-6841.2017005
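Abstract (translated from the Chinese):

Electronic medical records contain a large amount of medical knowledge and patient health information, and the structuring of obstetric electronic medical records and the extraction of information from them are of great significance for clinical decision support and for improving the reproductive health of the population. First, the structural characteristics and content of Chinese obstetric electronic medical records were analyzed, and a rule-based method was used to clean and structure the record data. Second, the maximum entropy (ME) model and rule-based methods were used to classify the records by treatment type, with the classification F value reaching 88.16%. Finally, to make further use of the records for information extraction and knowledge mining, a support vector machine (SVM) model, taking the short clause as the unit and similarity as the measure, was used to deduplicate the first course records and to analyze their differences automatically; 68.6% of the clauses in the analysis results were screened out as repeated or similar.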

Abstract:

Electronic medical records (EMRs) contain a large amount of medical knowledge and patient health information. Structuring obstetric EMRs and extracting information from them is of great significance for clinical decision support and for improving reproductive health. First, the structural characteristics and content of Chinese obstetric EMRs were analyzed, and the EMR data were cleaned and structured using a rule-based method. Then, the EMRs were automatically classified by treatment type using the maximum entropy (ME) model together with rule-based methods, reaching an F value of 88.16%. Finally, to further use EMRs for information extraction and knowledge mining, a support vector machine (SVM) model, with the short clause as the unit and similarity as the measure, was used to remove repetition from first course-of-disease records; 68.6% of the repeated and similar clauses were removed from the records. This study is expected to contribute to further research on information extraction from obstetric EMRs.
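The clause-level deduplication step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the similarity features or the SVM configuration, so this example assumes character-bigram Jaccard similarity and replaces the SVM decision with a fixed threshold. The function names, the delimiter set, and the threshold of 0.6 are all hypothetical.

```python
import re

def split_clauses(text):
    """Split a record into short clauses on Chinese and ASCII commas/semicolons
    and the Chinese full stop (assumed delimiter set)."""
    return [c.strip() for c in re.split(r"[,。;,;]", text) if c.strip()]

def char_bigrams(clause):
    """Return the set of character bigrams; a one-character clause maps to itself."""
    return {clause[i:i + 2] for i in range(len(clause) - 1)} or {clause}

def jaccard(a, b):
    """Jaccard similarity between the bigram sets of two clauses."""
    ga, gb = char_bigrams(a), char_bigrams(b)
    return len(ga & gb) / len(ga | gb)

def flag_similar(clauses, reference_clauses, threshold=0.6):
    """Return the clauses whose similarity to any reference clause reaches
    the threshold. The threshold stands in for the SVM decision (assumption)."""
    return [
        c for c in clauses
        if any(jaccard(c, r) >= threshold for r in reference_clauses)
    ]

# Two invented example notes (not taken from the paper's data).
record_a = "患者孕39周,无腹痛,胎动正常。"    # 39 weeks pregnant, no abdominal pain, normal fetal movement
record_b = "患者孕40周,无腹痛,阴道流血1天。"  # 40 weeks pregnant, no abdominal pain, vaginal bleeding for 1 day

repeated = flag_similar(split_clauses(record_b), split_clauses(record_a))
print(repeated)  # ['无腹痛']
```

In this toy setting only the clause shared verbatim by both notes is flagged; a trained classifier, as described in the abstract, would learn the decision boundary from labelled clause pairs instead of relying on a hand-set threshold.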

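Basic information:

DOI: 10.13705/j.issn.1671-6841.2017005

CLC number: TP391.1

Citation:

[1] ZHANG Kunli, MA Hongchao, ZHAO Yueshu, et al. Research on Chinese obstetric electronic medical records based on natural language processing [J], 2017, 49(04): 40-45. DOI: 10.13705/j.issn.1671-6841.2017005.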
