WAAS

Type: Open Source Projects

Last Updated: 2025/10/14

Description: WAAS (Whisper as a Service) is an open-source GUI and API for OpenAI's Whisper, enabling easy audio and video transcription with email notifications and a local browser-based editor.

speech-to-text

audio transcription

video transcription

Whisper API

OpenAI

Overview of WAAS

WAAS: Whisper as a Service - GUI and API for OpenAI Whisper

WAAS (Whisper as a Service) is an open-source project that wraps OpenAI's Whisper in a graphical user interface (GUI) for uploading and transcribing files and an API for programmatic access, making audio and video transcription more accessible.

What is WAAS?

WAAS provides an interface for uploading and transcribing audio or video files. After transcription, users receive an email with links to download the result as a Jojo-file, SRT, or plain text. A key feature is the local browser-based editor for correcting transcription errors.

Key Features

GUI for Upload and Transcription: Simple interface for uploading audio and video files.
Email Notifications: Receive email notifications with download links after transcription.
Multiple Output Formats: Download transcriptions in Jojo-file, SRT, or plain text formats.
Local Browser-Based Editor: Correct transcription errors within the browser.
API Access: Programmatic access to transcription services via API.

How does WAAS work?

WAAS allows users to upload audio or video files through a GUI (named Jojo) or via an API. The uploaded file is then processed using OpenAI's Whisper model for transcription. Once the transcription is complete, the user receives an email containing links to download the transcription in various formats. The browser-based editor allows users to refine and correct any errors in the transcription before saving the final result.

API Documentation

The WAAS API provides several endpoints for transcription and related tasks:

POST /v1/transcribe: Adds a new transcription job to the queue.
- Required parameters: email_callback or webhook_id.
- Optional parameters: language, model, task, filename.
- Body: Raw audio data.
OPTIONS /v1/transcribe: Retrieves available options for the transcription route.
POST /v1/detect: Detects the language of the audio file.
- Optional parameter: model.
- Body: Raw audio data.
OPTIONS /v1/detect: Retrieves available options for the detect route.
GET /v1/download/<job_id>: Retrieves the completed transcription in the requested output format.
- Optional parameter: output (json, timecode_txt, txt, vtt, srt).
OPTIONS /v1/download/<job_id>: Retrieves available options for the download route.
GET /v1/jobs/<job_id>: Retrieves the status and metadata of the specified job.
GET /v1/queue: Retrieves the current length of the queue.
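
As a rough illustration of how the endpoints above might be called, the following Python sketch builds (but does not send) a request for submitting a job and a URL for downloading its result. It assumes that the parameters are passed as URL query parameters and that the server is reachable at the placeholder address http://localhost:5000; neither detail is stated above, and the helper names are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: placeholder address; use the BASE_URL of your own deployment.
BASE_URL = "http://localhost:5000"

# Parameter names and output formats taken from the endpoint list above.
OPTIONAL_PARAMS = ("language", "model", "task", "filename")
OUTPUT_FORMATS = ("json", "timecode_txt", "txt", "vtt", "srt")


def build_transcribe_request(audio, email_callback=None, webhook_id=None, **options):
    """Build a POST /v1/transcribe request with the raw audio as the body.

    Assumption: parameters travel in the query string.
    """
    if not (email_callback or webhook_id):
        raise ValueError("email_callback or webhook_id is required")
    unknown = set(options) - set(OPTIONAL_PARAMS)
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")

    params = {}
    if email_callback:
        params["email_callback"] = email_callback
    if webhook_id:
        params["webhook_id"] = webhook_id
    for name in OPTIONAL_PARAMS:
        if options.get(name) is not None:
            params[name] = options[name]

    url = f"{BASE_URL}/v1/transcribe?{urlencode(params)}"
    return Request(url, data=audio, method="POST")


def build_download_url(job_id, output="srt"):
    """Build the GET /v1/download/<job_id> URL for one output format."""
    if output not in OUTPUT_FORMATS:
        raise ValueError(f"output must be one of {OUTPUT_FORMATS}")
    query = urlencode({"output": output})
    return f"{BASE_URL}/v1/download/{quote(job_id, safe='')}?{query}"
```

A request built this way could be sent with urllib.request.urlopen(request). The shape of the response is not described here, so any parsing of it would need to be checked against a running instance.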

Webhook Integration

WAAS supports webhook notifications. Upon successful or failed transcription, a POST request is sent to the configured webhook URL with a JSON payload and an X-WAAS-Signature header for content verification.
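
A receiver would typically check the X-WAAS-Signature header before trusting the payload. The sketch below shows one way to do that in Python, under the assumption that the header carries a hex-encoded HMAC-SHA256 of the raw request body, keyed with the webhook's token; the algorithm is not stated above, so confirm it against the project's documentation. The function name is hypothetical.

```python
import hashlib
import hmac


def is_valid_signature(body: bytes, signature: str, token: str) -> bool:
    """Return True if the signature matches the raw request body.

    Assumption: signature = hex(HMAC-SHA256(token, body)).
    """
    expected = hmac.new(token.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The check has to run on the raw bytes of the request body, before any JSON parsing, because re-serializing the payload can change its byte representation.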

Who is WAAS for?

Researchers needing to transcribe interviews or lectures.
Journalists working with audio or video content.
Developers integrating transcription services into their applications.
Anyone needing to quickly and accurately transcribe audio or video files.

Installation

To install and run WAAS, follow these steps:

Clone the repository.
Create a virtual environment.
Install the required Python packages using pip install -r requirements.txt.
Configure environment variables such as BASE_URL, EMAIL_SENDER_ADDRESS, EMAIL_SENDER_PASSWORD, and EMAIL_SENDER_HOST.
Run the setup using Docker Compose.
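
Before starting the services, it can help to confirm that the environment variables named above are actually set. The following Python sketch is a hypothetical pre-flight check, not part of WAAS; it covers only the four variables listed here, and a real deployment may need others.

```python
import os

# Variable names taken from the installation steps above.
REQUIRED_VARS = (
    "BASE_URL",
    "EMAIL_SENDER_ADDRESS",
    "EMAIL_SENDER_PASSWORD",
    "EMAIL_SENDER_HOST",
)


def missing_settings(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```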

Running with Docker Compose

Create a .envrc file with the necessary environment variables.
Add an allowed_webhooks.json file (if using webhooks) with valid webhook URLs and tokens.
Run docker-compose --env-file .envrc up.

Using NVIDIA CUDA

To enable GPU acceleration with NVIDIA CUDA:

Install NVIDIA Docker.
Edit the docker-compose.yml file to use Dockerfile.gpu and uncomment the device reservation.
Run docker-compose --env-file .envrc up.

Why choose WAAS?

WAAS offers a user-friendly interface and API for OpenAI's Whisper model. Features such as email notifications, multiple output formats, and local browser-based editing make it convenient for audio and video transcription, and it can either be run locally or integrated into existing systems via the API.

In short, WAAS is open source and easy to use, which makes it a good fit for both personal and professional transcription of audio or video content.

Best Alternative Tools to "WAAS"

WhisperAPI

WhisperAPI offers a fast and accurate video & audio transcription API powered by OpenAI Whisper. Get 5 free transcriptions daily. Supports multiple formats, generous limits, and privacy-first approach.

audio transcription

WhisperUI

WhisperUI provides affordable speech to text conversion using OpenAI Whisper. Convert audio files to text and SRT formats easily. Get started with a free account!

audio transcription

Buzz Captions

Buzz Captions is an offline audio transcription and translation tool powered by OpenAI's Whisper. It supports various audio/video formats and exports to CSV, SRT, TXT, and VTT.

audio transcription

speech to text

Speech Studio

Azure AI Speech Studio empowers developers with speech-to-text, text-to-speech, and translation tools. Explore features like custom models, voice avatars, and real-time transcription to enhance app accessibility and engagement.

speech transcription

voice synthesis

Whisper API

Whisper API: Affordable audio transcription API powered by OpenAI. Easy integration, speaker detection, supports 100+ languages. Free trial available!

audio transcription API

AIverse

AIverse is an all-in-one platform granting access to thousands of AI models for image/video generation, LLMs, speech-to-text, music creation, and more. Enjoy unlimited use for $20/month with easy integration.

image upscaling

background removal

superwhisper

Superwhisper is an AI-powered voice-to-text app for macOS and iPhone, enabling faster typing and seamless integration with any application. Transcribe audio and video, translate languages, and boost productivity.

voice transcription

speech to text

Lemonfox.ai Speech-To-Text API

Lemonfox.ai's Speech-To-Text API transcribes audio files quickly and affordably. It supports 100+ languages, speaker recognition, and offers high accuracy with secure data processing. Try it free for one month!

speech-to-text

transcription

SubEasy

SubEasy.ai offers AI-powered automatic transcription and translation services with high accuracy, context-aware AI, and support for 100+ languages.

AI transcription

video subtitles

Yescribe.ai

Yescribe.ai is an AI-powered transcription service that converts audio and video to text with 99.9% accuracy, supporting 98+ languages. It offers fast, secure, and affordable transcription solutions for various industries.

audio transcription

TurboScribe

TurboScribe offers unlimited AI-powered audio and video transcription with 99.8% accuracy in 98+ languages. Transcribe files in seconds, generate subtitles, and enjoy speaker recognition—all starting with 3 free daily transcripts.

audio transcription

video subtitles

Transcript LOL

Transcript LOL provides AI-powered audio and video transcription with high accuracy, speaker recognition, and unlimited minutes. Perfect for content creators, researchers, and businesses.

AI transcription

speech to text

Whisper Notes

Whisper Notes is an offline speech-to-text app for iOS/macOS, utilizing Whisper AI for private, accurate transcription. It supports 80+ languages, audio file import, and offers lifetime access with a one-time purchase.

offline transcription

speech to text

TranscriptionPlus

TranscriptionPlus offers fast and accurate AI-powered transcription with up to 99% accuracy. Transcribe audio and video files effortlessly with speaker identification, summary generation, and topic extraction.

audio transcription

speech to text
