Can I use shorter training and testing utterances? #27

youyou098888 · 2024-04-12T08:52:41Z

I notice that both training and testing utterances are 4 seconds long, and that inference is described as: "the evaluation utterances are first chunked to 4-second segments and processed by the network, with 2-second overlapping between consecutive segments."

If I want to shrink the input of the network, is there any chance I can use it on shorter audio, say 200 ms long?
Can I use 4-second utterances for training and 200 ms segments for inference?
If not, can I use 200 ms segments for both training and inference?

quancs · 2024-04-12T14:59:50Z

You can try. But I think 200 ms may not be a good choice for training/inference, as it does not give the neural network enough context to learn/predict.

> I notice that both training and testing utterances are 4 seconds long, and that inference is described as: "the evaluation utterances are first chunked to 4-second segments and processed by the network, with 2-second overlapping between consecutive segments."

Note: this configuration is for SpatialNet, not for online SpatialNet.
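
For reference, the chunked inference quoted above (4-second segments with 2-second overlap) could be sketched roughly as follows. This is only an illustration, not the repository's code: the function name `chunked_inference`, the `process` callback standing in for the network, the 16 kHz sample rate, the 1-D signal, and the averaging of overlapped regions are all assumptions, since the quoted sentence does not say how overlapping outputs are merged.

```python
import numpy as np


def chunked_inference(signal, process, sample_rate=16000, chunk_s=4.0, hop_s=2.0):
    """Apply `process` to overlapping chunks of a 1-D signal.

    `process` is a hypothetical stand-in for the network: it must map a
    1-D segment to a 1-D array of the same length. Overlapping outputs
    are averaged (an assumption, not taken from the thread).
    """
    chunk = int(round(chunk_s * sample_rate))  # samples per segment
    hop = int(round(hop_s * sample_rate))      # shift between segment starts
    n = len(signal)
    out = np.zeros(n, dtype=np.float64)
    weight = np.zeros(n, dtype=np.float64)     # how many chunks cover each sample

    start = 0
    while True:
        end = min(start + chunk, n)            # the last chunk may be shorter
        out[start:end] += process(signal[start:end])
        weight[start:end] += 1.0
        if end == n:
            break
        start += hop

    # Every sample is covered at least once; the guard only matters for n == 0.
    return out / np.maximum(weight, 1.0)
```

To experiment with the shorter setting asked about in this issue, the same sketch can be called with `chunk_s=0.2` and, for example, `hop_s=0.1`. Whether the network performs well with so little context is a separate question.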
