Research on Predictive Model of Spanish Vocabulary Pronunciation Based on Recurrent Neural Network
DOI: 10.12677/CSA.2020.1010182. Supported by research project funding.
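Authors: Zhao Jiaogu (赵皎谷), Ma Yanzhou (马延周), Huang Xiaohui (黄晓辉): Luoyang Campus, Strategic Support Force Information Engineering University, Luoyang, Henan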
Keywords: Spanish; Pronunciation Dictionary; Pronunciation Prediction; Grapheme-to-Phoneme Conversion; Recurrent Neural Network
Abstract: Based on the characteristics of Spanish words and phonemes and of the word transcription process, this paper models Spanish phonetic transcription of words as a sequence labeling task and proposes an end-to-end transcription model that combines character embedding, a recurrent neural network (RNN), and connectionist temporal classification (CTC). First, character embedding vectors are trained with the word2vec framework on a self-built Spanish lexicon, giving a distributed vector representation of Spanish characters. A Spanish word transcription model is then built on a recurrent neural network and the CTC algorithm, and is trained and tested on a self-built pronunciation dictionary corpus. The experimental results show that the character embedding + RNN + CTC model achieves higher transcription accuracy than the other statistical and neural network models it was compared with. It also has a simpler labeling workflow than traditional transcription models and places much lower requirements on the dataset, so it can effectively perform end-to-end phonetic transcription of Spanish words.
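The role of CTC in such a model can be illustrated with a minimal sketch of the decoding step. This is not the authors' implementation: the blank symbol `_`, the function names `ctc_collapse` and `greedy_decode`, and the hand-written frame labels are all assumptions made for illustration. The idea is that the network emits one label per input character, and CTC decoding merges consecutive repeats and removes blanks, so silent or merged letters need no manual character-to-phoneme alignment.

```python
BLANK = "_"  # hypothetical blank symbol, chosen for this sketch only


def ctc_collapse(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then drop blank labels."""
    output = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            output.append(label)
        previous = label
    return output


def greedy_decode(scores, alphabet, blank=BLANK):
    """Pick the highest-scoring label in each frame, then collapse.

    scores: one list of per-label scores for each input character.
    alphabet: labels in the same order as the score columns.
    """
    best = [alphabet[max(range(len(row)), key=row.__getitem__)] for row in scores]
    return ctc_collapse(best, blank)


# Hand-written frame labels (assumed, not model output) for "queso":
# the silent "u" is labeled with the blank symbol.
queso = ctc_collapse(["k", "_", "e", "s", "o"])

# For "perro", the two "r" frames collapse into a single trill symbol.
perro = ctc_collapse(["p", "e", "r", "r", "o"])

# Toy score matrix over a three-label alphabet.
toy = greedy_decode(
    [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.2, 0.1, 0.7]],
    ["_", "k", "e"],
)
```

Because collapsing can only shorten a sequence, this kind of decoding assumes the phoneme sequence is no longer than the character sequence.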
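Citation: Zhao Jiaogu, Ma Yanzhou, Huang Xiaohui. Research on Predictive Model of Spanish Vocabulary Pronunciation Based on Recurrent Neural Network [J]. Computer Science and Application, 2020, 10(10): 1714-1727. https://doi.org/10.12677/CSA.2020.1010182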
