| Downloads | Citations | Reads |
| 165 | 2 | 135 |
Abstract: Speech recognition and speech synthesis have been active research topics in recent years. Most related work performs statistical analysis on top of grapheme-phoneme matching, seeking the underlying regularities between spelling and pronunciation in order to convert between written form and sound. Because language itself is chaotic, grapheme-phoneme matching is highly complex. This paper proposes an interactive visualization tool that performs incremental grapheme-phoneme matching through coarse-to-fine dynamic classification. On this basis, global correlation analysis reveals the overall regularities of English pronunciation, and anomalies such as mismatches and special pronunciations are checked and handled promptly. Experiments were conducted on 35 182 words from five well-known corpora. The matching process was fast and accurate, and the discovered rules were highly interpretable, providing a solid foundation for further work in language learning and phonetic research.
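The idea of coarse-to-fine incremental matching can be illustrated with a toy sketch. This is not the authors' tool, which is interactive and visual. The two heuristics here are assumptions made for illustration: a coarse pass that aligns one letter to one phoneme when the counts agree, and a fine pass that admits graphemes of up to two letters and at most one previously unseen pair per word. The function names `coarse_pass`, `align`, and `fine_pass` and the five-word lexicon are likewise hypothetical.

```python
from collections import Counter

def coarse_pass(lexicon):
    """Coarse step: align one letter to one phoneme when the lengths agree."""
    counts, remaining = Counter(), []
    for spelling, phonemes in lexicon:
        if len(spelling) == len(phonemes):
            counts.update(zip(spelling, phonemes))
        else:
            remaining.append((spelling, phonemes))
    return counts, remaining

def align(spelling, phonemes, known, max_len=2, budget=1):
    """Split `spelling` into graphemes of 1..max_len letters, one per phoneme.

    At most `budget` pairs may be absent from `known`. Returns a list of
    (grapheme, phoneme) pairs, or None if no such split exists.
    """
    if not spelling and not phonemes:
        return []
    if not spelling or not phonemes:
        return None
    for size in range(1, min(max_len, len(spelling)) + 1):
        pair = (spelling[:size], phonemes[0])
        cost = 0 if pair in known else 1
        if cost > budget:
            continue
        rest = align(spelling[size:], phonemes[1:], known, max_len, budget - cost)
        if rest is not None:
            return [pair] + rest
    return None

def fine_pass(remaining, counts, max_len=2):
    """Fine step: repeat over unresolved words until no new word can be aligned."""
    progress = True
    while progress and remaining:
        progress, unresolved = False, []
        for spelling, phonemes in remaining:
            pairs = align(spelling, phonemes, counts, max_len)
            if pairs is None:
                unresolved.append((spelling, phonemes))
            else:
                counts.update(pairs)  # newly learned pairs help later words
                progress = True
        remaining = unresolved
    return counts, remaining

# Hypothetical mini-lexicon: (spelling, phoneme tuple).
lexicon = [
    ("cat", ("k", "æ", "t")),
    ("sip", ("s", "ɪ", "p")),
    ("ship", ("ʃ", "ɪ", "p")),
    ("chat", ("tʃ", "æ", "t")),
    ("box", ("b", "ɒ", "k", "s")),
]
counts, remaining = coarse_pass(lexicon)
counts, leftover = fine_pass(remaining, counts)
print(counts.most_common())
print("needs manual review:", leftover)
```

In this sketch, words that no pass can align (here "box", where one letter covers two phonemes) end up in `leftover`, which stands in for the anomalies that the abstract says are checked and handled separately.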
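Basic information:
DOI: 10.13705/j.issn.1671-6841.2021163
Chinese Library Classification number: TN912.3; H311
Citation information:
[1] LI Xin, LI Shan, GONG Wentao, et al. Visual matching and feature analysis of English graphemes and phonemes[J], 2021, 53(04): 44-52. DOI: 10.13705/j.issn.1671-6841.2021163.
Funding information:
CNPC Major Science and Technology Project (ZD2019-183-004); Fundamental Research Funds for the Central Universities (20CX05019A)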
2021-05-07
2021
2021-11-04