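Speech - Topics | 元语音研究网

元语音

[Requires 0 points to view] Posted 2024-03-23 10:30:14

Recommended outstanding doctoral dissertation: Li Naihan (李乃寒), "Research and Application of Deep Learning Algorithms for Speech Synthesis"
Speech synthesis (also known as text-to-speech, TTS) is one of the key methods of human-computer interaction and aims to synthesize clear, natural-sounding audio. Its applications are very broad: voice assistants on phones and personal computers, the speech output stage of simultaneous interpretation, in-car navigation announcements, news reading, and more. Through ...

Likes 2

Comments 2

Views 1254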

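Speech
元语音

[Requires 0 points to view] Posted 2022-05-14 11:37:49

End-to-End Speech Recognition - 01 - Tian Zhengkun (田正坤)
Strengths of the paper: (1) an introduction to the CTC model; (2) the basic attention model; (3) a discussion of encoder models; (4) soft and hard attention mechanisms; (5) multi-task learning structures; (6) the Transformer architecture; (7) training tips and personal reflections. Paper download link:

Likes 3

Comments 22

Views 2489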

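Open-source sharing
元语音

[Requires 0 points to view] Posted 2022-03-06 21:41:54

Tsinghua University - 语音识别基本法 ("The Basic Law of Speech Recognition")
Download link

Likes 6

Comments 45

Views 3031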

Open-source sharing
Speech

[Requires 0 points to view] Posted 2025-03-06 14:10:10

【CP】Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets
Paper link, code link

Likes 2

Comments 1

Views 700

Speech
Speech

[Requires 0 points to view] Posted 2025-01-08 13:31:47

【CP】Breaking Through the Spike: Spike Window Decoding for Accelerated and Precise Automatic Speech Recognition
Paper link

Likes 3

Comments 1

Views 1150

Speech
Speech

[Requires 0 points to view] Posted 2024-12-24 11:22:07

【TR】Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
Paper link

Likes 2

Comments 1

Views 796

Speech
Speech

[Requires 0 points to view] Posted 2024-12-12 16:51:49

【CP】Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning
Paper link

Likes 2

Comments 1

Views 973

Speech
Speech

[Requires 0 points to view] Posted 2024-11-08 10:48:20

【Technical Report】Moshi: a speech-text foundation model for real-time dialogue
Paper link

Likes 2

Comments 1

Views 924

Speech
Speech

[Requires 2 points to view] Posted 2024-08-19 14:01:54

【Conference Paper】Mixture-of-Expert Conformer for Streaming Multilingual ASR
Paper link

Likes 2

Comments 1

Views 7

Speech
Speech

[Requires 0 points to view] Posted 2024-10-21 14:35:43

【Conference Paper】Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
Paper link, code link

Likes 2

Comments 1

Views 1070

Speech
Speech

[Requires 0 points to view] Posted 2024-10-11 10:48:39

【Highly recommended】Jason Wei
Personal website link

Likes 2

Comments 1

Views 1111

Speech
Speech

[Requires 0 points to view] Posted 2024-10-08 14:01:36

【Conference Paper】Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens
Paper link

Likes 2

Comments 1

Views 1244

Speech
Speech

[Requires 0 points to view] Posted 2024-10-08 14:40:19

【Conference Paper】Boosting CTC-based ASR using inter-layer attention-based CTC loss
Paper link

Likes 2

Comments 1

Views 1129

Speech
Speech

[Requires 0 points to view] Posted 2024-09-20 13:29:17

【Codes】BESTRQ NV NEMO
Code link

Likes 2

Comments 1

Views 1171

Speech
Speech

[Requires 0 points to view] Posted 2024-09-03 15:47:44

【Conference】BENCHMARKING JAPANESE SPEECH RECOGNITION ON ASR-LLM SETUPS WITH MULTI-PASS AUGMENTED GENERATIVE ERROR CORRECTION
Paper link

Likes 2

Comments 1

Views 967

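Speech
Speech

[Requires 0 points to view] Posted 2024-09-02 19:57:23

【Codes】MWER discriminative training code, implemented separately on the CTC side and the AED side
CTC-side implementation link, AED-side implementation link

Likes 2

Comments 1

Views 1114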

Speech
Speech

[Requires 2 points to view] Posted 2024-08-29 17:25:59

【Conference Paper】EFFICIENT DOMAIN ADAPTATION FOR SPEECH FOUNDATION MODELS
Paper link

Likes 2

Comments 1

Views 7

Speech
Speech

[Requires 2 points to view] Posted 2024-08-30 17:45:49

【Conference Paper】Re-investigating the Efficient Transfer Learning of Speech Foundation Model using Feature Fusion Methods
Paper link

Likes 3

Comments 1

Views 9

Speech
Speech

[Requires 2 points to view] Posted 2024-09-02 16:09:43

【Conference Paper】PARAMETER-EFFICIENT TRANSFER LEARNING UNDER FEDERATED LEARNING FOR AUTOMATIC SPEECH RECOGNITION
Paper link

Likes 1

Comments 1

Views 2

Speech
Speech

[Requires 8 points to view] Posted 2024-08-21 13:26:49

【Journal Paper & Codes】Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation
Paper link, code link, samples link

Likes 3

Comments 1

Views 14

Speech
