Unsupervised Feature Selection Based on Local Neighborhood Embedding (基于局部邻域嵌入的无监督特征选择)
TUO Qianjuan (脱倩娟), ZHAO Hong (赵红)
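Abstract:
In machine learning, feature selection can effectively reduce the dimensionality of data. Manifold learning preserves the geometric structure of the original data, while the l_(2,1) norm guards against overfitting and improves a model's generalization ability; combining the two can improve both the effectiveness and the efficiency of feature selection. This paper combines the local neighborhood embedding (LNE) algorithm with the l_(2,1) norm to propose a new unsupervised feature selection method. The main idea has three steps: first, a similarity matrix is constructed from the distances between each data sample and its neighbors together with the reconstruction coefficients; second, a low-dimensional space is built and sparse regression is performed with the l_(2,1) norm; finally, the importance of each feature is computed and the optimal feature subset is selected. Experimental comparisons with several typical feature selection algorithms verify the effectiveness of the proposed algorithm.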
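Keywords: machine learning; local neighborhood embedding; manifold learning; unsupervised feature selection
Foundation: National Natural Science Foundation of China (61379049, 61472406); Natural Science Foundation of Fujian Province (2015J01269); Natural Science Foundation of Zhangzhou City (ZZ2016J35)
Author: TUO Qianjuan (脱倩娟), ZHAO Hong (赵红)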
DOI: 10.13705/j.issn.1671-6841.2016087
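
The three steps named in the abstract (neighborhood-based similarity, low-dimensional embedding, l_(2,1)-regularized sparse regression followed by feature scoring) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the exact LNE similarity matrix, so the sketch substitutes plain LLE-style reconstruction coefficients as an assumption, and it solves the sparse regression with a generic iteratively reweighted least-squares scheme. All function names and the parameters `k`, `dim`, `alpha`, and `reg` are hypothetical.

```python
import numpy as np

def lle_weights(X, k=5, reg=1e-3):
    """Reconstruct each sample from its k nearest neighbors (LLE-style stand-in for LNE)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    np.fill_diagonal(D, np.inf)                     # a sample is not its own neighbor
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[:k]
        Z = X[idx] - X[i]                           # neighbors centered on sample i
        G = Z @ Z.T                                 # local Gram matrix, k x k
        G += reg * (np.trace(G) + 1e-12) * np.eye(k)  # regularize for stability
        w = np.linalg.solve(G, np.ones(k))
        W[i, idx] = w / w.sum()                     # coefficients sum to one
    return W

def embed(W, dim):
    """Low-dimensional coordinates from the bottom eigenvectors of (I - W)^T (I - W)."""
    n = W.shape[0]
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)                     # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]                       # skip the constant eigenvector

def l21_regression(X, Y, alpha=1.0, n_iter=50, eps=1e-8):
    """Approximately minimize ||XW - Y||_F^2 + alpha * ||W||_{2,1} by reweighting."""
    d = X.shape[1]
    Dg = np.eye(d)
    for _ in range(n_iter):
        W = np.linalg.solve(X.T @ X + alpha * Dg, X.T @ Y)
        row_norms = np.sqrt(np.sum(W ** 2, axis=1) + eps)
        Dg = np.diag(1.0 / (2.0 * row_norms))       # reweighting from current row norms
    return W

def select_features(X, n_selected, k=5, dim=2, alpha=1.0):
    """Score each feature by the l2 norm of its row in W and keep the top ones."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    Y = embed(lle_weights(X, k), dim)
    W = l21_regression(X, Y, alpha)
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1][:n_selected], scores
```

Calling `select_features(X, n_selected=10)` on an n-by-d data matrix returns the indices of the ten highest-scoring features together with all d scores. The l_(2,1) penalty drives entire rows of `W` toward zero, which is why the row norm serves as a per-feature importance in this sketch.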