Webshell Detection Based on Convolutional Neural Network
Fu Jianming (傅建明); Li Lin (黎琳); Wang Yingjun (王应军)
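Abstract:
A Webshell is a command execution environment that exists as a web page file, such as an ASP, PHP, or JSP file, and can be used for remote access control of a Web server. Webshells use obfuscation and encryption, which makes them harder to analyze and to detect. Webshell detection methods based on signature matching struggle against obfuscation and encryption and cannot detect unknown Webshells. To address this, a CNN-based Webshell detection method is proposed. The method first compiles PHP files to obtain their opcode, then uses a vocabulary model to extract word-order features, and finally trains a CNN detection model. Experimental results show that the method outperforms traditional machine learning algorithms in precision, recall, and F1 score, and that its detection rate is higher than that of existing security tools, which demonstrates the effectiveness of the method.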
Keywords: Webshell; opcode; vocabulary model; deep learning
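Foundation: National Natural Science Foundation of China (61373168, U1636107); Open Project of the Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences, at the Institute of Information Engineering, Chinese Academy of Sciences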
DOI: 10.13705/j.issn.1671-6841.2018263
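The abstract says that a vocabulary model extracts word-order features from the opcode, but it does not give the details. The following is a minimal sketch of one common reading of that step, assuming each opcode name is mapped to an integer index and each file's opcode sequence is truncated or padded to a fixed length so that order is preserved for a CNN. The function names, the reserved `PAD` and `UNK` indices, and the sample opcode sequences are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

PAD, UNK = 0, 1  # reserved indices (an illustrative choice, not from the paper)


def build_vocab(sequences, max_size=None):
    """Map each opcode name to an integer index, most frequent first."""
    counts = Counter(op for seq in sequences for op in seq)
    # Sort by descending frequency, then by name, so the result is deterministic.
    ranked = sorted(counts, key=lambda op: (-counts[op], op))
    if max_size is not None:
        ranked = ranked[:max_size]
    # Indices start at 2 because 0 and 1 are reserved for PAD and UNK.
    return {op: i + 2 for i, op in enumerate(ranked)}


def encode(seq, vocab, length):
    """Turn an opcode sequence into a fixed-length index sequence, keeping order."""
    ids = [vocab.get(op, UNK) for op in seq[:length]]
    return ids + [PAD] * (length - len(ids))


# Hypothetical opcode sequences; real ones would come from compiling PHP files.
samples = [
    ["ECHO", "RETURN"],
    ["INIT_FCALL", "SEND_VAR", "DO_ICALL", "RETURN"],
]
vocab = build_vocab(samples)
encoded = [encode(s, vocab, length=5) for s in samples]
print(encoded)
```

Under these assumptions, the resulting index sequences could be fed to an embedding layer followed by convolutional layers. Opcodes not seen when the vocabulary was built fall back to `UNK`.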