2021, Vol. 53, No. 03: 1-8
Sentiment Analysis of Barrage Texts Based on ALBERT-CRNN
Foundation: National Natural Science Foundation of China (61977021, 61902114); Hubei Province 2019 Technology Innovation Special Project (2019ACA144)
DOI: 10.13705/j.issn.1671-6841.2020359
Abstract:

A barrage text sentiment analysis model, ALBERT-CRNN, is proposed, combining the ALBERT pre-trained language model with a convolutional recurrent neural network (CRNN). First, ALBERT is used to obtain dynamic feature representations of barrage texts, so that the same word receives different word vector representations in different contexts. These features are then trained with the CRNN, which takes full account of both local features and contextual semantic dependencies in the text. Finally, a Softmax function yields the sentiment polarity of each barrage text. Experiments on barrage text datasets from three video platforms, Bilibili, iQiyi, and Tencent Video, show that ALBERT-CRNN achieves accuracies of 94.3%, 93.5%, and 94.8% on the three datasets respectively, outperforming several traditional models.
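The pipeline described in the abstract (ALBERT token features → convolutional layer → recurrent layer → Softmax) can be sketched as follows. This is a minimal illustration, not the paper's reported configuration: the layer sizes, kernel width, mean pooling, and binary output are assumptions, and a random tensor stands in for the ALBERT encoder output.

```python
import torch
import torch.nn as nn

class CRNNHead(nn.Module):
    """Sketch of a CRNN classifier head: a 1-D convolution extracts
    local n-gram features from token representations, a bidirectional
    GRU models contextual dependencies, and a Softmax layer outputs
    the sentiment polarity distribution."""

    def __init__(self, hidden=768, conv_ch=128, rnn_hidden=64, n_classes=2):
        super().__init__()
        self.conv = nn.Conv1d(hidden, conv_ch, kernel_size=3, padding=1)
        self.rnn = nn.GRU(conv_ch, rnn_hidden,
                          batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * rnn_hidden, n_classes)

    def forward(self, x):
        # x: (batch, seq_len, hidden), e.g. ALBERT last hidden states
        h = torch.relu(self.conv(x.transpose(1, 2)))  # (batch, conv_ch, seq_len)
        out, _ = self.rnn(h.transpose(1, 2))          # (batch, seq_len, 2*rnn_hidden)
        pooled = out.mean(dim=1)                      # average over tokens
        return torch.softmax(self.fc(pooled), dim=-1)

# Stand-in for ALBERT output: 4 comments, 32 tokens, 768-dim features.
feats = torch.randn(4, 32, 768)
probs = CRNNHead()(feats)  # shape (4, 2); each row sums to 1
```

In a full implementation, `feats` would come from an ALBERT encoder (e.g. the `transformers` library's `AlbertModel`), so that each token's vector depends on its surrounding context, which is the "dynamic feature representation" property the abstract emphasizes.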

References

[1] ZHENG Y Y, XU J, XIAO Z. Utilization of sentiment analysis and visualization in online video bullet-screen comments[J]. New Technology of Library and Information Service, 2015(11): 82-90.

[2] HONG Q, WANG S Y, ZHAO Q P, et al. Video user group classification based on barrage comments sentiment analysis and clustering algorithms[J]. Computer Engineering & Science, 2018, 40(6): 1125-1139.

[3] ZHUANG X Q, LIU F A. Emotional analysis of bullet-screen comments based on AT-LSTM[J]. Digital Technology and Application, 2018, 36(2): 210-212.

[4] LI P, DAI Y M, WU D H. Application of dual-channel convolutional neural network in sentiment analysis[J]. Journal of Computer Applications, 2018, 38(6): 1542-1546.

[5] QIU N J, CONG L, ZHOU S C, et al. SVD-CNN barrage text classification algorithm combined with improved active learning[J]. Journal of Computer Applications, 2019, 39(3): 644-650.

[6] MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[C]//Proceedings of the 27th International Conference on Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2013: 3111-3119.

[7] MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[EB/OL]. [2020-08-17]. https://www.researchgate.net/publication/234131319.

[8] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. [2020-08-17]. https://www.researchgate.net/publication/328230984.

[9] PETERS M, NEUMANN M, IYYER M, et al. Deep contextualized word representations[C]//Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: Association for Computational Linguistics, 2018: 2227-2237.

[10] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[EB/OL]. [2020-08-17]. https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf.

[11] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[EB/OL]. [2020-03-02]. https://www.researchgate.net/publication/317558625.

[12] LAN Z Z, CHEN M D, GOODMAN S, et al. ALBERT: a lite BERT for self-supervised learning of language representations[EB/OL]. [2020-02-13]. https://www.researchgate.net/publication/336084032.

[13] KIM Y. Convolutional neural networks for sentence classification[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2014: 1746-1751.

[14] SOCHER R, LIN C Y, NG A Y, et al. Parsing natural scenes and natural language with recursive neural networks[C]//Proceedings of the 28th International Conference on Machine Learning. New York: ACM Press, 2011: 129-136.

[15] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.

[16] DEY R, SALEM F M. Gate-variants of gated recurrent unit (GRU) neural networks[C]//Proceedings of the 60th IEEE International Midwest Symposium on Circuits and Systems. Piscataway: IEEE Press, 2017: 1597-1600.

[17] CHEN R, REN C G, WANG Z Y, et al. Attention-based CRNN for text classification[J]. Computer Engineering and Design, 2019, 40(11): 3151-3157.

[18] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[EB/OL]. [2020-05-24]. https://www.researchgate.net/publication/265252627.

[19] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[EB/OL]. [2020-05-24]. https://www.researchgate.net/publication/262877889.

[20] YANG Z L, DAI Z H, YANG Y M, et al. XLNet: generalized autoregressive pretraining for language understanding[EB/OL]. [2020-05-26]. https://www.researchgate.net/publication/333892322.

Basic information:

CLC number: TP391.1

Citation:

[1] ZENG C, WEN C D, SUN Y M, et al. Sentiment analysis of barrage texts based on ALBERT-CRNN[J]. Journal of Zhengzhou University (Natural Science Edition), 2021, 53(03): 1-8. DOI: 10.13705/j.issn.1671-6841.2020359.

