2018, 01, v.50 72-76
A Weighted Random Forest Model Optimized by the Particle Swarm Optimization Algorithm
Foundation: National Natural Science Foundation of China (61473266)
DOI: 10.13705/j.issn.1671-6841.2017006
Submission date: 2017-01-09
Submission year: 2017
Final review date: 2017-04-11
Final review year: 2017
Review cycle (years): 1
Abstract:

Random forest (RF) is an efficient classification algorithm, but its voting mechanism gives decision trees with low training accuracy the same voting power as accurate ones, which reduces classification accuracy. The number of decision trees and the other model parameters are also usually difficult to select. To address these problems, a weighted random forest model is proposed: in voting, each decision tree's vote is multiplied by a weight proportional to its training accuracy, and the particle swarm optimization (PSO) algorithm is used to select the model's parameters through iterative optimization. Experiments on datasets from the UCI database show that the proposed weighted random forest model achieves higher classification accuracy than the standard random forest algorithm and traditional classification algorithms.
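The two ideas in the abstract, accuracy-weighted voting and PSO-based parameter selection, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each tree's weight equals its training accuracy (a proportionality constant of 1), uses a plain global-best PSO with commonly used coefficients, and replaces the real fitness function (which would train and validate a forest) with a hypothetical `toy_error` function. All function and parameter names below are illustrative.

```python
import random
from collections import defaultdict


def weighted_vote(tree_predictions, tree_weights):
    """Return the label with the largest summed weight across trees."""
    scores = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        scores[label] += weight
    # Ties are broken by the smallest label so the result is deterministic.
    return max(sorted(scores), key=lambda label: scores[label])


def pso_minimize(objective, bounds, n_particles=10, n_iters=30,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(params):
    """Hypothetical stand-in for validation error of a forest.

    params = [number_of_trees, features_per_split]; a real fitness
    function would train a forest and return 1 - accuracy instead.
    """
    n_trees, n_features = params
    return (n_trees - 120.0) ** 2 / 1e4 + (n_features - 4.0) ** 2


# Three weak trees vote for class 0, two accurate trees vote for class 1.
votes = [0, 0, 0, 1, 1]
train_acc = [0.55, 0.60, 0.58, 0.95, 0.93]
print(weighted_vote(votes, [1.0] * 5))  # plain majority vote -> 0
print(weighted_vote(votes, train_acc))  # accuracy-weighted vote -> 1

best, err = pso_minimize(toy_error, [(10.0, 500.0), (1.0, 10.0)])
print(best, err)
```

In the voting example, the three low-accuracy trees win a plain majority vote, while the weighted vote follows the two high-accuracy trees (summed weights 1.73 versus 1.88). In a real setting the continuous positions found by PSO would need rounding to integers before being used as a tree count or a feature count.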

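Basic information:

DOI: 10.13705/j.issn.1671-6841.2017006

CLC number: TP18

Citation:

[1] WANG Jie (王杰), CHENG Xuexin (程学新), PENG Jinzhu (彭金柱). A weighted random forest model optimized by the particle swarm optimization algorithm [J], 2018, 50(01): 72-76. DOI: 10.13705/j.issn.1671-6841.2017006.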
