Document details
Pre-alignment guided attention for improving training efficiency and model stability in end-to-end speech synthesis
Document type: Journal article
Authors: Zhu, Xiaolian[1]; Zhang, Yuchao[2]; Yang, Shan[3]; Xue, Liumeng[4]; Xie, Lei[5]
Affiliations: [1]School of Computer Science, Northwestern Polytechnical University, Xi'an, 710065, China | Public Computer Education Center, Hebei University of Economics and Business, Shijiazhuang, 050061, China
[2]School of Computer Science, Northwestern Polytechnical University, Xi'an, 710065, China
[3]School of Computer Science, Northwestern Polytechnical University, Xi'an, 710065, China
[4]School of Computer Science, Northwestern Polytechnical University, Xi'an, 710065, China
[5]School of Computer Science, Northwestern Polytechnical University, Xi'an, 710065, China
Corresponding author: Xie, Lei (lxie@nwpu.edu.cn)
Year: 2019
Journal: IEEE Access
Volume: 7
Pages: 65955-65964
Supplement: Regular issue
Indexed in: EI (20192507063896)
Department: School of Computer Science
Abstract:
Recently, end-to-end (E2E) neural text-to-speech systems, such as Tacotron2, have begun to surpass traditional multi-stage hand-engineered systems, offering both simplified system-building pipelines and high-quality speech. With a unique encoder-decoder neural structure, the Tacotron2 system no longer needs a separately learned text analysis front-end, duration model, acoustic model, and audio synthesis module. The key to such a system lies in the attention mechanism, which learns an alignment bet ...