whisper-medium

This documentation is valid for the following models:

  • #g1_whisper-medium

Model Overview

The Whisper models are intended primarily for AI research into model robustness, generalization, and biases; they are also effective for English speech recognition. Using Whisper models to transcribe recordings made without consent, or in high-risk decision-making contexts, is strongly discouraged due to potential inaccuracies and ethical concerns.

The models were trained on 680,000 hours of audio and corresponding transcripts collected from the internet: 65% English audio with English transcripts, 18% non-English audio with English transcripts, and 17% non-English audio with matching non-English transcripts, covering 98 languages in total.

Set up your API Key

If you don’t have an API key for the Apilaplas API yet, feel free to use our Quickstart guide.

Submit a request

API Schema

Creating and sending a speech-to-text conversion task to the server

POST /v1/stt/create

Authorizations

  • Authorization (string, required): Bearer key.

Body

  • model (enum, required): one of the model IDs listed above, e.g. #g1_whisper-medium
  • custom_intent (string or string[], optional)
  • custom_topic (string or string[], optional)
  • custom_intent_mode (enum, optional)
  • custom_topic_mode (enum, optional)
  • detect_language (boolean, optional)
  • detect_entities (boolean, optional)
  • detect_topics (boolean, optional)
  • diarize (boolean, optional)
  • dictation (boolean, optional)
  • diarize_version (string, optional)
  • extra (string, optional)
  • filler_words (boolean, optional)
  • intents (boolean, optional)
  • keywords (string, optional)
  • language (string, optional)
  • measurements (boolean, optional)
  • multi_channel (boolean, optional)
  • numerals (boolean, optional)
  • paragraphs (boolean, optional)
  • profanity_filter (boolean, optional)
  • punctuate (boolean, optional)
  • search (string, optional)
  • sentiment (boolean, optional)
  • smart_format (boolean, optional)
  • summarize (string, optional)
  • tag (string[], optional)
  • topics (boolean, optional)
  • utterances (boolean, optional)
  • utt_split (number, optional)

Responses

  • 201: Success
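
As a minimal illustration of the body above, a creation request might combine the required model with a few of the optional flags. The url field below is an assumption for the audio source; the extracted schema does not show the audio input parameter.

```python
# Sketch of a request body for POST /v1/stt/create (Python dict form).
payload = {
    "model": "#g1_whisper-medium",  # required enum; see the model list above
    "url": "https://example.com/audio/sample.mp3",  # assumed audio-source field
    "detect_language": True,  # optional booleans from the Body schema
    "punctuate": True,
    "paragraphs": True,
    "utt_split": 0.8,  # optional number from the Body schema
}
```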

Requesting the result of the task from the server using the generation_id
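
The schema for the result request is not shown on this page. As a minimal sketch, assuming a companion GET endpoint keyed by the generation_id (the base URL and path below are placeholders, not confirmed here):

```python
import requests

API_KEY = "YOUR_APILAPLAS_API_KEY"  # see the Quickstart guide

# Placeholder URL: neither the base URL nor the result path is documented
# on this page.
resp = requests.get(
    "https://api.apilaplas.com/v1/stt/GENERATION_ID",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```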

Quick Code Examples

Let's use the #g1_whisper-medium model to transcribe a sample audio fragment:

Example #1: Processing a Speech Audio File via URL
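
A minimal Python sketch using the requests library. The base URL, the url audio-source field, and the polling details (result path, status values) are assumptions, since this page does not document them:

```python
import time

import requests

API_KEY = "YOUR_APILAPLAS_API_KEY"
BASE_URL = "https://api.apilaplas.com"  # assumed base URL

# Step 1: create the speech-to-text task.
create = requests.post(
    f"{BASE_URL}/v1/stt/create",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "#g1_whisper-medium",
        "url": "https://example.com/audio/sample.mp3",  # assumed field name
    },
    timeout=30,
)
create.raise_for_status()
generation_id = create.json()["generation_id"]

# Step 2: poll for the finished transcript; the result path and the
# status values below are assumptions.
while True:
    result = requests.get(
        f"{BASE_URL}/v1/stt/{generation_id}",
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    result.raise_for_status()
    body = result.json()
    if body.get("status") not in ("queued", "processing"):
        break
    time.sleep(2)

print(body)
```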

Response
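
A successful creation request returns HTTP 201; following the task flow above, the response body carries the generation_id used to request the finished transcript. The exact response shape may vary.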

Example #2: Processing a Speech Audio File via File Path
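
A sketch of the same call with a local file. Sending the file as a multipart upload under an audio field is an assumption; this page's schema does not show the upload field:

```python
import requests

API_KEY = "YOUR_APILAPLAS_API_KEY"
BASE_URL = "https://api.apilaplas.com"  # assumed base URL

# Upload a local audio file; the multipart form layout and the "audio"
# field name are assumptions.
with open("sample.mp3", "rb") as audio_file:
    create = requests.post(
        f"{BASE_URL}/v1/stt/create",
        headers={"Authorization": f"Bearer {API_KEY}"},
        data={"model": "#g1_whisper-medium"},
        files={"audio": audio_file},
        timeout=60,
    )
create.raise_for_status()
print(create.json()["generation_id"])  # use this ID to fetch the result
```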

Response
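
As in Example #1, expect HTTP 201 and a generation_id in the response body for the follow-up result request.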
