Transcript – Audio Transcription App with Speaker Diarization

April 8, 2026
13:48

Type: Software
Status: POC

Tech Stack: Python, FastAPI, SQLAlchemy (async), Celery 5, Redis 7, PostgreSQL 16, faster-whisper/WhisperX/OpenAI Whisper, PyTorch (CUDA 12.6), React 18, Vite, TypeScript, Docker

Problem Statement

Transcribing meetings, interviews, and conferences requires not only speech-to-text but also speaker recognition (who said what?). Cloud-based services are expensive and raise data privacy concerns (GDPR). Local alternatives rarely offer comparable quality, and they typically lack both a provider comparison and an admin dashboard for monitoring the quality and performance of different Whisper variants.

Description

A complete transcription platform with three Whisper variants (faster-whisper, OpenAI Whisper, WhisperX), automatic GPU acceleration (CUDA), optional speaker diarization, and provider selection per upload. It offers a user frontend and an admin dashboard with observability (provider comparison, job logs, error rates, charts), user/group management, audit logs, and export functions.
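
To make "provider selection per upload" and "automatic GPU acceleration" more concrete, the following is a minimal sketch of how a job configuration might be resolved when a file is uploaded. It is not taken from the repository: the names Provider, JobConfig, detect_device, and build_job_config are hypothetical, and the default provider is an assumption. The only third-party call is torch.cuda.is_available(), which is guarded so the sketch also runs without PyTorch installed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provider(str, Enum):
    """The three Whisper variants named in the description (identifiers are illustrative)."""
    FASTER_WHISPER = "faster-whisper"
    OPENAI_WHISPER = "openai-whisper"
    WHISPERX = "whisperx"


@dataclass(frozen=True)
class JobConfig:
    """Hypothetical per-upload settings handed to a background worker."""
    provider: Provider
    device: str
    diarize: bool


def detect_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # optional dependency; fall back to CPU if it is missing
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def build_job_config(
    provider_name: Optional[str],
    diarize: bool = False,
    default: Provider = Provider.FASTER_WHISPER,  # assumed default, not from the source
) -> JobConfig:
    """Resolve the provider chosen for one upload and pick a device.

    Raises ValueError if provider_name is not one of the known variants.
    """
    provider = Provider(provider_name) if provider_name else default
    return JobConfig(provider=provider, device=detect_device(), diarize=diarize)
```

Validating the provider name at upload time means an unknown value can be rejected immediately, before a transcription job is queued.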

Use Case

Upload audio files and automatically convert them to text, with recognition of who is speaking.
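
One common way to combine a transcript with diarization output is to label each transcribed segment with the speaker whose turns overlap it the most in time. The sketch below illustrates that idea only; it is not the project's implementation, and the Segment and Turn types, the assign_speakers function, and the SPEAKER_00-style labels are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Segment:
    """A transcribed span of audio (times in seconds)."""
    start: float
    end: float
    text: str
    speaker: Optional[str] = None


@dataclass
class Turn:
    """A diarization result: one speaker talking from start to end."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments: List[Segment], turns: List[Turn]) -> List[Segment]:
    """Return copies of the segments, each labeled with its dominant speaker.

    A segment that overlaps no turn keeps speaker=None.
    """
    labeled: List[Segment] = []
    for seg in segments:
        # Sum the overlap per speaker, since one speaker may have several turns.
        totals: Dict[str, float] = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        speaker = max(totals, key=totals.get) if totals else None
        labeled.append(Segment(seg.start, seg.end, seg.text, speaker))
    return labeled


if __name__ == "__main__":
    segments = [Segment(0.0, 2.0, "Hello everyone."), Segment(2.0, 5.0, "Thanks for joining.")]
    turns = [Turn(0.0, 2.2, "SPEAKER_00"), Turn(2.2, 5.0, "SPEAKER_01")]
    for seg in assign_speakers(segments, turns):
        print(f"[{seg.speaker}] {seg.text}")
```

Summing overlap per speaker, rather than taking the single longest turn, keeps the label stable when a speaker's turn is split into several short pieces by the diarizer.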

Link: https://github.com/rawk7000/transcript (private repo)
