How It Works

Record or Select

Record or select your stereo WAV audio (max 10s, 16-bit) to begin.

Upload

Drag and drop your audio. The system automatically queues it for processing.

AI Model

Our model begins transcription with real-time progress.

Receive Transcript

Get both raw (CTC) and KenLM-refined transcript. Status updates automatically.

After processing an audio file, the AI model returns two versions of the transcript. First, it applies CTC greedy decoding to the predicted probabilities, collapsing repeated characters to generate the initial result. Next, the KenLM language model uses beam search to autocorrect and find the most probable matches among candidate transcriptions, further improving the accuracy.