Journal of Wuhan Institute of Technology [ISSN:1674-2869/CN:42-1779/TQ]

Volume:: 40
Issue:: 2018, No. 06

Pages:: 691-695

Column:: Mechanical, Electrical and Information Engineering

Publication date:: 2018-12-28

Article Info

Title:: HMM-Based Tone Speech Model

Article number:: 20180621

Author(s):: YI Xuerong¹; HUANG Wei*¹,²; HU Di¹; JIANG Yi¹

Affiliation(s):: 1. School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, China; 2. Hubei Key Laboratory of Intelligent Robot (Wuhan Institute of Technology), Wuhan 430205, China

Keywords:: speech recognition; hidden Markov model; tone model; transition probability

CLC number:: TP391

DOI:: 10.3969/j.issn.16742869.2018.06.021

Document code:: A

Abstract:: To improve the recognition of near-homophones (characters with the same initial and final but different tones) and of homophones (characters with the same initial, final, and tone), we introduce tone into the speech model and character transition probabilities into the language model. First, tone is divided into five forms and attached to the last phoneme of each Chinese syllable to form a new phoneme, which is then modeled with a Gaussian mixture hidden Markov model. Next, character-to-character transition probabilities in a specific context are computed by statistical counting. Finally, the tonal speech model and the language model with character transition probabilities are implemented with the Hidden Markov Model Toolkit (HTK). Experimental results show that adding tone improves the recognition rate of near-homophones, and that using context-specific character transition probabilities improves the recognition rate of homophones.
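The two modeling steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' HTK implementation: the tone labels 1 to 5 (assumed here to be the four lexical tones plus a neutral tone), the phoneme spelling, and the function names `add_tone` and `char_transition_probs` are all hypothetical, and the transition probabilities are plain maximum-likelihood counts over adjacent characters without smoothing.

```python
from collections import Counter, defaultdict

# Assumption: the "five forms" of tone are labeled 1-5
# (four lexical tones plus a neutral tone); the abstract does not list them.
TONES = {1, 2, 3, 4, 5}


def add_tone(phonemes, tone):
    """Return a new phoneme list with the tone label attached to the last phoneme."""
    if tone not in TONES:
        raise ValueError(f"unknown tone label: {tone}")
    if not phonemes:
        raise ValueError("a syllable needs at least one phoneme")
    # Only the final phoneme changes, e.g. ["m", "a"] with tone 3 -> ["m", "a3"].
    return phonemes[:-1] + [f"{phonemes[-1]}{tone}"]


def char_transition_probs(sentences):
    """Estimate P(next char | previous char) by counting adjacent character pairs."""
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        for prev, nxt in zip(sentence, sentence[1:]):
            pair_counts[prev][nxt] += 1

    probs = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        # Normalize the counts for each preceding character so they sum to 1.
        probs[prev] = {nxt: count / total for nxt, count in counter.items()}
    return probs
```

For example, `add_tone(["m", "a"], 3)` returns `["m", "a3"]`, and `char_transition_probs(["打开电灯", "打开电视"])` assigns probability 0.5 to each of 灯 and 视 following 电. A context-specific corpus, as described in the abstract, would replace the two toy sentences.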

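Memo

Memo:: Received: 2018-08-14. About the author: YI Xuerong, master's student. E-mail: 1143152674@qq.com. *Corresponding author: HUANG Wei, Ph.D., associate professor. E-mail: wei.huang@foxmail.com. Citation format: YI Xuerong, HUANG Wei. HMM-Based Tone Speech Model [J]. Journal of Wuhan Institute of Technology, 2018, 40(6): 691-695.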

Last Update: 2018-12-22
