ASR inference accuracy drops on denoised audio #2522

Open
MiningIrving opened this issue May 9, 2024 · 2 comments

Comments

@MiningIrving

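I need to run ASR transcription on audio recorded in high-noise environments. To that end I ran a series of noise experiments with wenet and observed the following:
After adding white noise at SNR = 1 dB to my audio, the CER without any denoising is 4.55%.
After denoising, the CER rises to 28.12%. If I then add white noise in the 4000~8000 Hz band to the denoised audio and run ASR inference again, the CER improves to 7.67%.
In my use case I do need the denoising module, because it keeps VAD accurate. How should I improve the ASR so that whenever a human can make out the speech, the ASR also transcribes it correctly?
Do I need to fine-tune on denoised data, or is there another approach?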
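
For anyone trying to reproduce this kind of experiment, here is a minimal sketch of mixing white noise into a signal at a target SNR. It is not the code used for the numbers above: `add_white_noise` and `measured_snr_db` are hypothetical helper names, the 440 Hz sine is only a stand-in for real speech, and the band-limited (4000~8000 Hz) variant is not covered.

```python
import math
import random


def add_white_noise(samples, snr_db, seed=None):
    """Return `samples` plus Gaussian white noise scaled to the target SNR in dB.

    Hypothetical helper for illustration; not part of wenet.
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in samples) / len(samples)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = [rng.gauss(0.0, 1.0) for _ in samples]
    # Rescale so the realized noise power matches the target, not just its expectation.
    realized_power = sum(n * n for n in noise) / len(noise)
    scale = math.sqrt(noise_power / realized_power)
    return [x + scale * n for x, n in zip(samples, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB, treating (noisy - clean) as the noise."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Assumed test signal: one second of a 440 Hz sine at 16 kHz.
sample_rate = 16000
clean = [0.5 * math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noisy = add_white_noise(clean, snr_db=1.0, seed=0)
```

Checking `measured_snr_db(clean, noisy)` after mixing is a cheap way to confirm that the noisy test set really sits at the SNR being reported.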

@fclearner
Contributor

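Train end to end, so that the ASR adapts to the denoising module. The front-end experts say that any software-level denoising is lossy for ASR.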

@fclearner
Contributor

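Could you split this into two tasks? One audio stream that is meant to be listened to (denoised), and one that is used for recognition. For the ASR, you can add some production data (noisy) to training.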
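
One way to read the "two streams" idea, given that denoising is needed for VAD, is to run VAD on the denoised audio but pass the original noisy samples to the ASR. The sketch below is only an illustration of that routing: `transcribe_with_denoised_vad` is a hypothetical function, the three callables are toy stand-ins rather than wenet APIs, and it assumes the denoiser keeps the output sample-aligned with its input.

```python
def transcribe_with_denoised_vad(noisy, denoise, detect_speech, recognize):
    """Run VAD on denoised audio, but recognize the matching noisy segments.

    All three callables are placeholders supplied by the caller:
      denoise(samples) -> samples of the same length (assumed sample-aligned)
      detect_speech(samples) -> list of (start, end) sample indices
      recognize(samples) -> transcript string
    """
    denoised = denoise(noisy)            # stream for listening and for VAD
    segments = detect_speech(denoised)   # boundaries come from the clean stream
    results = []
    for start, end in segments:
        # The ASR never sees the denoised audio, only the original samples.
        results.append((start, end, recognize(noisy[start:end])))
    return denoised, results


# Toy stand-ins so the sketch runs end to end.
def toy_denoise(samples):
    # Crude noise gate: zero out low-amplitude samples.
    return [v if abs(v) > 0.1 else 0.0 for v in samples]


def toy_vad(samples):
    # Report each contiguous run of non-zero samples as a speech segment.
    segments, start = [], None
    for i, v in enumerate(samples):
        if v != 0.0 and start is None:
            start = i
        elif v == 0.0 and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments


def toy_recognize(samples):
    return f"<{len(samples)} samples>"


noisy_audio = [0.01] * 4 + [0.9] * 4 + [0.01] * 4
denoised_audio, transcripts = transcribe_with_denoised_vad(
    noisy_audio, toy_denoise, toy_vad, toy_recognize
)
```

Whether this helps in practice depends on the denoiser's latency and on how well the ASR was trained on noisy data, so treat it as a starting point for experiments rather than a fix.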

2 participants