Document details
Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search
Document type: Conference paper
Authors: Yuan, Yougen[1]; Leung, Cheung-Chi[2]; Xie, Lei[3]; Chen, Hongjie[4]; Ma, Bin[5]; Li, Haizhou[6]
Affiliations: [1] Northwestern Polytech Univ, Sch Comp Sci, Xian, Shaanxi, Peoples R China.
[2] Alibaba Inc, Singapore, Singapore.
[3] Northwestern Polytech Univ, Sch Comp Sci, Xian, Shaanxi, Peoples R China.
[4] Northwestern Polytech Univ, Sch Comp Sci, Xian, Shaanxi, Peoples R China.
[5] Alibaba Inc, Singapore, Singapore.
[6] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore.
Year: 2018
Corresponding author: Xie, L (reprint author), Northwestern Polytech Univ, Sch Comp Sci, Xian, Shaanxi, Peoples R China.
Conference: 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES
Pages: 97-101
Conference start date: 2018-01-01
Indexed in: CPCI-S (WOS:000465363900020)
Department: School of Computer Science
Times cited: 1
Language: Foreign (non-Chinese)
Keywords: acoustic word embeddings; word pairs; temporal context; triplet loss; query-by-example spoken term detection
Abstract:
We propose to learn acoustic word embeddings with temporal context for query-by-example (QbE) speech search. The temporal context includes the leading and trailing word sequences of a word. We assume that there exist spoken word pairs in the training database. We pad the word pairs with their original temporal context to form fixed-length speech segment pairs. We obtain the acoustic word embeddings through a deep convolutional neural network (CNN) which is trained on the speech segment pairs wit ...