asr

Here are 1,020 public repositories matching this topic...

metame-ai / awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation

awesome tts music-generation asr audio-generation zero-shot-tts awesome-music-generation

Updated Jun 3, 2024

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated Jun 3, 2024
Python

CheshireCC / faster-whisper-GUI

faster_whisper GUI with PySide6

openai vad whisper asr transcribe voice-transcription faster-whisper whisperx

Updated Jun 3, 2024
Python

SunbirdAI / dsa2024-speech-data-recording

Record audio data for ASR

audio speech-recognition asr

Updated Jun 3, 2024
JavaScript

common-voice / cv-dataset

Metadata and versioning details for the Common Voice dataset

voice open-data dataset speech-recognition asr open-datasets

Updated Jun 3, 2024
JavaScript

winstxnhdw / CapGen

A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.

docker caddy automatic-speech-recognition whisper asr fastapi uvicorn-gunicorn huggingface huggingface-spaces ctranslate2

Updated Jun 3, 2024
Python

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift

android windows macos linux raspberry-pi ios text-to-speech csharp cpp dotnet speech-to-text aarch64 mfc risc-v asr arm32 onnx vits openkylin

Updated Jun 3, 2024
C++

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated Jun 3, 2024
Python

Thoroldvix / youtube-transcript-api

Java library which allows you to retrieve subtitles/transcripts for a YouTube video.

java youtube youtube-video youtube-api captions subtitles transcript subtitle transcripts asr youtube-subtitles youtube-transcripts youtube-captions youtube-subtitle youtube-transcript-api youtube-transcript translating-transcripts youtube-asr

Updated Jun 3, 2024
Java

alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Updated Jun 3, 2024
Jupyter Notebook

smswg / FreeSwitch-Mod_FunAsr

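FreeSWITCH Mod_FunASR speech recognition module. It can detect unallocated numbers, powered-off phones, and other abnormal call states, as well as early-media audio, with no ASR service fees.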

freeswitch asr funasr

Updated Jun 3, 2024

smswg / FreeSwitch-Mod_Asr

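FreeSWITCH Alibaba Cloud Mod_ASR module, based on Alibaba Cloud's latest SDK 3 as of 2024 and proven stable through extensive production testing. Suitable for AI outbound-calling robots.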

freeswitch asr

Updated Jun 3, 2024

RitchieP / VerbaLex

Automatic Speech Recognition system for non-native English speakers

nlp asr huggingface-transformers

Updated Jun 3, 2024
Jupyter Notebook

wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

pytorch transformer speech-recognition automatic-speech-recognition production-ready whisper asr conformer e2e-models

Updated Jun 3, 2024
Python

DmitryRyumin / ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!