VIBE
AI Tools | Open Source | Tool | 1mo ago | 49.2k

About

Open-source voice AI framework with three components: automatic speech recognition (ASR) for transcribing 60-minute audio with speaker diarization, text-to-speech (TTS) for 90-minute multi-speaker synthesis, and real-time streaming TTS. It operates at an ultra-low frame rate of 7.5 Hz, which keeps long-form audio processing efficient.
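
To give a sense of what a 7.5 Hz frame rate means for long-form audio, here is a minimal sketch of the arithmetic. It only multiplies duration by frame rate; it does not use the framework's API. The helper name `frames_for` and the 50 Hz comparison rate are illustrative assumptions, not figures from the listing.

```python
# Frame rate quoted in the description above.
FRAME_RATE_HZ = 7.5

# Hypothetical comparison rate, chosen only for illustration (an assumption,
# not a figure from the listing).
COMPARISON_RATE_HZ = 50.0


def frames_for(minutes: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Return the number of frames needed to cover `minutes` of audio."""
    return round(minutes * 60 * frame_rate_hz)


asr_frames = frames_for(60)  # 60-minute transcription: 27000 frames
tts_frames = frames_for(90)  # 90-minute synthesis: 40500 frames

# Same 60 minutes at the assumed comparison rate: 180000 frames.
comparison_frames = frames_for(60, COMPARISON_RATE_HZ)

print(asr_frames, tts_frames, comparison_frames)
```

Under these assumptions, an hour of audio is about 27,000 frames at 7.5 Hz, compared with 180,000 at the illustrative 50 Hz rate, so the sequence is roughly 6.7 times shorter.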

Tags

voice-ai, speech-recognition, text-to-speech, open-source, long-form-audio, speaker-diarization, streaming-tts, microsoft

Tech Stack

Python
