Abstract: Obtaining an organism's genetic information quickly and accurately, and from it the genome sequence of the target species, is of great significance for life science research. For the genome assembly problem, the existing OLC (Overlap-Layout-Consensus) assembly technique is improved in four stages: data preprocessing; using the KMP algorithm to determine, in O(m + n) time, the maximum overlap between two base fragments; linking the reads into contigs according to the overlap graph; and determining the contig N50. The optimized model and algorithm are presented, and they solve the genome assembly problem reasonably well.
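To make the overlap and N50 stages concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "maximum overlap" means the longest suffix of one read that equals a prefix of another, and that N50 follows the common definition (the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total). The function names `prefix_function`, `max_overlap`, and `n50`, and the separator character `#`, are hypothetical choices made for illustration.

```python
def prefix_function(s):
    """KMP failure table: pi[i] is the length of the longest proper
    prefix of s[:i + 1] that is also a suffix of s[:i + 1]."""
    pi = [0] * len(s)
    k = 0
    for i in range(1, len(s)):
        # Fall back through shorter borders until the next character matches.
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    return pi


def max_overlap(a, b, sep="#"):
    """Length of the longest suffix of read `a` that equals a prefix of
    read `b`, computed in O(len(a) + len(b)) time.

    Assumption: `sep` does not occur in either read (true for A/C/G/T/N).
    """
    if not a or not b:
        return 0
    # In b + sep + a, the last failure-table value is the longest prefix
    # of b that is also a suffix of a; the separator caps it at len(b).
    return prefix_function(b + sep + a)[-1]


def n50(contig_lengths):
    """N50 under the common definition (assumed here, see lead-in)."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


if __name__ == "__main__":
    print(max_overlap("ACGTTGCA", "TGCAGGT"))
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Concatenating the two reads around a separator lets a single pass of the KMP failure table give the overlap, which is where the O(m + n) bound comes from; comparing every suffix against every prefix directly would be quadratic in the worst case.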
Basic information:
DOI: 10.13705/j.issn.1671-6841.2015266
Chinese Library Classification (CLC) number: Q811.4
Citation:
[1] 买阿丽, 杨雯雯. Research on improving the OLC algorithm in gene recombination (关于基因重组中OLC算法的改进研究) [J], 2016, 48(02): 34-39+46. DOI: 10.13705/j.issn.1671-6841.2015266.
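Funding:
National Natural Science Foundation of China (11526183); Shanxi Province Basic Research Project (2015021015); Yuncheng University Mathematics Discipline Research Projects (XK-2014035, XK-2014030); Yuncheng University Doctoral Start-up Project (YQ-2014011)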
2015-11-16
2015
2016-01-21
2016
2