Document details
A Kullback-Leibler Divergence Based Recurrent Mixture Density Network for Acoustic Modeling in Emotional Statistical Parametric Speech Synthesis
Document type: Conference paper
Authors: An, Xiaochun[1]; Zhang, Yuchao[2]; Liu, Bing[3]; Xue, Liumeng[4]; Xie, Lei[5]
Affiliations: [1]Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian, Peoples R China;
[2]Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian, Peoples R China;
[3]Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian, Peoples R China;
[4]Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian, Peoples R China;
[5]Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian, Peoples R China
Year: 2018
Corresponding author: Xie, L (reprint author), Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian, Peoples R China.
Conference: PROCEEDINGS OF THE JOINT WORKSHOP OF THE 4TH WORKSHOP ON AFFECTIVE SOCIAL MULTIMEDIA COMPUTING AND FIRST MULTI-MODAL AFFECTIVE COMPUTING OF LARGE-SCALE MULTIMEDIA DATA (ASMMC-MMAC'18)
Pages: 1-6
Conference start date: 2018-01-01
Indexed in: EI
Department: School of Computer Science
Discipline: Computer Science
Language: Foreign language
Keywords: Emotional statistical parametric speech synthesis; recurrent mixture density network; LSTM; KLD-RMDN
Abstract:
This paper proposes a Kullback-Leibler divergence (KLD) based recurrent mixture density network (RMDN) approach for acoustic modeling in emotional statistical parametric speech synthesis (SPSS), which aims at improving model accuracy and emotion naturalness. First, to improve model accuracy, we propose to use an RMDN as the acoustic model, which combines an LSTM with a mixture density network (MDN). Adding a mixture density layer allows us to do multimodal regression as well as to predict variances, thus ...
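As a rough illustration of the mixture density layer mentioned in the abstract, the sketch below turns raw network outputs into Gaussian mixture parameters (weights, means, variances) and computes the negative log-likelihood of target frames. It is not the authors' implementation: the function names `mdn_params` and `mdn_nll`, the output layout, and the diagonal-covariance assumption are all hypothetical choices made for this example, and the LSTM that would produce `raw` is omitted.

```python
import numpy as np

def mdn_params(raw, n_mix, dim):
    """Split raw outputs of shape (T, n_mix * (1 + 2 * dim)) into mixture parameters.

    Assumed layout (hypothetical): [weight logits | means | log-variances].
    """
    T = raw.shape[0]
    logits = raw[:, :n_mix]
    means = raw[:, n_mix:n_mix + n_mix * dim].reshape(T, n_mix, dim)
    log_var = raw[:, n_mix + n_mix * dim:].reshape(T, n_mix, dim)
    # Log-softmax over mixture components, shifted for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_w = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return log_w, means, log_var

def mdn_nll(raw, target, n_mix, dim):
    """Mean negative log-likelihood of target (T, dim) under a diagonal Gaussian mixture."""
    log_w, means, log_var = mdn_params(raw, n_mix, dim)
    diff = target[:, None, :] - means                      # (T, n_mix, dim)
    log_comp = -0.5 * (dim * np.log(2 * np.pi)
                       + log_var.sum(axis=2)
                       + (diff ** 2 / np.exp(log_var)).sum(axis=2))
    joint = log_w + log_comp                               # (T, n_mix)
    # Log-sum-exp over components.
    m = joint.max(axis=1, keepdims=True)
    log_lik = m[:, 0] + np.log(np.exp(joint - m).sum(axis=1))
    return -log_lik.mean()
```

Because each frame gets several weighted Gaussians, the layer can represent multimodal targets, and the predicted log-variances give a per-dimension uncertainty estimate alongside the means.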