AI Tools (Tool)
About
Open-source voice AI framework with three components: automatic speech recognition (ASR) that transcribes 60-minute audio with speaker diarization, text-to-speech (TTS) that synthesizes 90-minute multi-speaker audio, and real-time streaming TTS. It operates at an ultra-low 7.5 Hz frame rate for efficient long-form audio processing.
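To give a sense of scale for the 7.5 Hz figure, the sketch below computes how many frames the stated audio lengths correspond to. It assumes only that one frame covers 1/7.5 s of audio; `frames_for` is a hypothetical helper written for illustration and is not part of the framework's API.

```python
FRAME_RATE_HZ = 7.5  # frame rate stated in the description above


def frames_for(minutes: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Number of frames covering `minutes` of audio at a fixed frame rate.

    Assumption: frames are evenly spaced, so count = seconds * rate.
    """
    return round(minutes * 60 * frame_rate_hz)


asr_frames = frames_for(60)  # 60-minute transcription input
tts_frames = frames_for(90)  # 90-minute multi-speaker synthesis
print(asr_frames, tts_frames)
```

Under that assumption, an hour of audio is a sequence of a few tens of thousands of frames, which is what makes single-pass long-form processing plausible.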
Tags
voice-ai, speech-recognition, text-to-speech, open-source, long-form-audio, speaker-diarization, streaming-tts, microsoft
Tech Stack
Python