How I used SeamlessM4T Large to reach the top 5 of the Mozilla Common Voice competition hosted on Zindi

Shuyib/zindi_mcv_swahilli

zindi_mcv_swahilli

public word error rate: 0.114294524
private word error rate: 0.112809661
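
For context on the two scores above, here is a minimal sketch of the standard word error rate: word-level edit distance divided by the number of reference words. It assumes plain whitespace tokenization. Any text normalization Zindi applies before scoring is not covered here, and the function name `word_error_rate` is my own.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

One wrong word in a three-word reference gives a score of one third, so scores near 0.11 mean roughly one word error for every nine reference words.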

This is how I used SeamlessM4T Large to reach the top 5 of the Mozilla Common Voice competition. I only downloaded the test.tar.gz archive, then extracted it and resampled all the audio to 16 kHz. I noticed that some of the audio was muffled and of poor quality as provided, because of the sampling rates that were set. The script I used for the conversion is prepare_files.sh. Follow the instructions to install SeamlessM4T Large. I then ran inference on each audio file with python asr.py, which saved the output to asr_results.csv. Finally, python clean_submission.py reformatted that file into the submission format Zindi requires.
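
As an illustration of the resampling step, here is a minimal Python sketch that builds an ffmpeg command for one file. This is not the contents of prepare_files.sh. The use of ffmpeg, the mono downmix, the WAV output, and the helper name `build_resample_command` are all assumptions made for the example.

```python
from pathlib import Path

TARGET_SAMPLE_RATE = 16_000  # 16 kHz, as described above


def build_resample_command(
    src: Path, dst_dir: Path, sample_rate: int = TARGET_SAMPLE_RATE
) -> list[str]:
    """Return an ffmpeg argument list that resamples `src` into `dst_dir`."""
    dst = dst_dir / (src.stem + ".wav")  # assumption: write WAV output
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),           # input audio file
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-ac", "1",               # assumption: downmix to mono
        str(dst),
    ]
```

Each command can be passed to `subprocess.run(cmd, check=True)` inside a loop over the extracted files, provided ffmpeg is installed and the output directory exists.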

You can do all of this in one step:

make run
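
The Makefile itself is not reproduced here. As a rough sketch of what the single step chains together, the three scripts named above can be run in order from Python. The assumption is that each script takes no arguments and runs from the repository root; `PIPELINE` and `run_pipeline` are hypothetical names.

```python
import subprocess

# Steps in the order described above; arguments are assumed to be empty.
PIPELINE = [
    ["bash", "prepare_files.sh"],       # extract and resample the audio
    ["python", "asr.py"],               # transcribe, writes asr_results.csv
    ["python", "clean_submission.py"],  # format the results for Zindi
]


def run_pipeline(steps=PIPELINE, runner=subprocess.run):
    """Run each step in order and stop at the first failure."""
    for step in steps:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later steps never run on incomplete output.
        runner(step, check=True)
```

Stopping at the first failure matters here because clean_submission.py depends on the CSV that asr.py writes.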

Lesson

Review the Hugging Face leaderboard for ASR models before you start. Look for a model that is both fast and accurate.

leaderboard
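
One way to make "fast and accurate" concrete is to keep only the models that no other model beats on both axes. The sketch below assumes each leaderboard row gives a word error rate (lower is better) and a speed score (higher is better). The `LeaderboardEntry` type, the `pareto_front` function, and the field names are hypothetical, not the leaderboard's own schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderboardEntry:
    name: str
    wer: float    # word error rate, lower is better
    speed: float  # throughput score, higher is better


def pareto_front(entries):
    """Keep the entries that no other entry beats on both WER and speed."""
    front = []
    for entry in entries:
        dominated = any(
            other.wer <= entry.wer
            and other.speed >= entry.speed
            and (other.wer < entry.wer or other.speed > entry.speed)
            for other in entries
        )
        if not dominated:
            front.append(entry)
    # Most accurate candidates first.
    return sorted(front, key=lambda entry: entry.wer)
```

The entries that survive are the real trade-offs to weigh. Everything else is both slower and less accurate than at least one alternative.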

Facebook/Meta has a lot of speech models. Look for one that is capable of speech-to-text. The ones that primarily do one thing seem to be the best.
