nova-2

This documentation applies to the following models:

#g1_nova-2-automotive
#g1_nova-2-conversationalai
#g1_nova-2-drivethru
#g1_nova-2-finance
#g1_nova-2-general
#g1_nova-2-medical
#g1_nova-2-meeting
#g1_nova-2-phonecall
#g1_nova-2-video
#g1_nova-2-voicemail

Model Overview

Nova-2 builds on the advancements of Nova-1 with speech-specific optimizations to its Transformer architecture, refined data curation techniques, and a multi-stage training approach. These improvements result in a lower word error rate (WER) and better entity recognition (including proper nouns and alphanumeric sequences), as well as enhanced punctuation and capitalization.

Nova-2 offers the following model options:

automotive: Optimized for audio with automotive-oriented vocabulary.
conversationalai: Optimized for use cases in which a human is talking to an automated bot, such as IVR, a voice assistant, or an automated kiosk.
drivethru: Optimized for audio sources from drive-thrus.
finance: Optimized for multiple speakers with varying audio quality, such as might be found on a typical earnings call. The vocabulary is heavily finance-oriented.
general: Optimized for everyday audio processing.
medical: Optimized for audio with medical-oriented vocabulary.
meeting: Optimized for conference room settings, which include multiple speakers with a single microphone.
phonecall: Optimized for low-bandwidth audio phone calls.
video: Optimized for audio sourced from videos.
voicemail: Optimized for low-bandwidth audio clips with a single speaker. Derived from the phonecall model.
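
The option names above map onto the model identifiers listed at the top of this page by adding the "#g1_nova-2-" prefix. A minimal sketch of that mapping follows; the helper name nova2_model_id is hypothetical and not part of the API.

```python
# Option names copied from the list above.
NOVA2_OPTIONS = (
    "automotive", "conversationalai", "drivethru", "finance", "general",
    "medical", "meeting", "phonecall", "video", "voicemail",
)


def nova2_model_id(option: str) -> str:
    """Return the model identifier for a Nova-2 option (hypothetical helper)."""
    name = option.strip().lower()
    if name not in NOVA2_OPTIONS:
        raise ValueError(f"Unknown Nova-2 option: {option!r}")
    return f"#g1_nova-2-{name}"


print(nova2_model_id("meeting"))  # prints "#g1_nova-2-meeting"
```

Validating the name locally catches typos before a request is sent.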

Set up your API Key

If you don't have an API key for the Apilaplas API yet, follow our Quickstart guide to get one.

Submit a request

API Schema

Creating and sending a speech-to-text conversion task to the server

POST /v1/stt/create

Authorizations

Bearer token, sent in the Authorization header.

Body

model: enum, required
custom_intent: string or string[], optional
custom_topic: string or string[], optional
custom_intent_mode: string (enum), optional
custom_topic_mode: string (enum), optional
detect_language: boolean, optional
detect_entities: boolean, optional
detect_topics: boolean, optional
diarize: boolean, optional
dictation: boolean, optional
diarize_version: string, optional
extra: string, optional
filler_words: boolean, optional
intents: boolean, optional
keywords: string, optional
language: string, optional
measurements: boolean, optional
multi_channel: boolean, optional
numerals: boolean, optional
paragraphs: boolean, optional
profanity_filter: boolean, optional
punctuate: boolean, optional
search: string, optional
sentiment: boolean, optional
smart_format: boolean, optional
summarize: string, optional
tag: string[], optional
topics: boolean, optional
utterances: boolean, optional
utt_split: number, optional

Responses

201: Success (application/json)

POST /v1/stt/create HTTP/1.1
Host: api.apilaplas.com
Authorization: Bearer <YOUR_LAPLASAPI_KEY>
Content-Type: application/json
Accept: */*
Content-Length: 596

{
  "model": "#g1_nova-2-automotive",
  "custom_intent": "text",
  "custom_topic": "text",
  "custom_intent_mode": "strict",
  "custom_topic_mode": "strict",
  "detect_language": true,
  "detect_entities": true,
  "detect_topics": true,
  "diarize": true,
  "dictation": true,
  "diarize_version": "text",
  "extra": "text",
  "filler_words": true,
  "intents": true,
  "keywords": "text",
  "language": "text",
  "measurements": true,
  "multi_channel": true,
  "numerals": true,
  "paragraphs": true,
  "profanity_filter": true,
  "punctuate": true,
  "search": "text",
  "sentiment": true,
  "smart_format": true,
  "summarize": "text",
  "tag": [
    "text"
  ],
  "topics": true,
  "utterances": true,
  "utt_split": 1
}

201 Success

{
  "generation_id": "123e4567-e89b-12d3-a456-426614174000"
}
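
The request example above sets every field at once for illustration. Since only model is marked as required, a request body can presumably carry just the options you need. The sketch below builds such a body and checks the boolean options locally. The helper name build_stt_payload is hypothetical, and the url field is taken from the quick code example on this page rather than from the Body list, so treat it as an assumption.

```python
from typing import Any, Dict, Optional

# Boolean options taken from the Body list above.
BOOLEAN_OPTIONS = {
    "detect_language", "detect_entities", "detect_topics", "diarize",
    "dictation", "filler_words", "intents", "measurements", "multi_channel",
    "numerals", "paragraphs", "profanity_filter", "punctuate", "sentiment",
    "smart_format", "topics", "utterances",
}


def build_stt_payload(model: str, url: Optional[str] = None, **options: Any) -> Dict[str, Any]:
    """Build a request body that contains only the options that were set (hypothetical helper)."""
    payload: Dict[str, Any] = {"model": model}
    if url is not None:
        # Assumption: "url" is the field used by the quick code example on this page.
        payload["url"] = url
    for name, value in options.items():
        if value is None:
            continue  # skip options that were left unset
        if name in BOOLEAN_OPTIONS and not isinstance(value, bool):
            raise TypeError(f"{name} must be a boolean, got {type(value).__name__}")
        payload[name] = value
    return payload


payload = build_stt_payload(
    "#g1_nova-2-meeting",
    url="https://audio-samples.github.io/samples/mp3/blizzard_primed/sample-0.mp3",
    diarize=True,
    punctuate=True,
    language=None,  # unset, so it is left out of the body
)
print(payload)
```

The resulting dictionary can be passed as the json argument of requests.post, as the quick code examples do.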

Requesting the result of the task from the server using the generation_id
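
The quick code examples on this page fetch the result with a GET request to /v1/stt/{generation_id} and keep polling while the status is "waiting" or "active". A minimal sketch of that polling loop is shown here with the HTTP call injected as a function, so the logic can be tried without network access. The helper name poll_stt and the final status value "completed" in the demo are assumptions, not documented API values.

```python
import time
from typing import Any, Callable, Dict, Optional

# Statuses that the examples on this page treat as "not finished yet".
PENDING_STATUSES = {"waiting", "active"}


def poll_stt(
    fetch: Callable[[str], Dict[str, Any]],
    gen_id: str,
    interval: float = 10.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Dict[str, Any]]:
    """Call fetch(gen_id) until the status is no longer pending or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = fetch(gen_id)
        if data.get("status") not in PENDING_STATUSES:
            return data
        sleep(interval)
    return None  # timed out


# Demo with a stand-in fetch function; "completed" is a hypothetical final status.
_replies = iter([
    {"status": "waiting"},
    {"status": "active"},
    {"status": "completed", "result": {}},
])
result = poll_stt(lambda gen_id: next(_replies), "demo-id", interval=0, sleep=lambda s: None)
print(result["status"])  # prints "completed"
```

In real use, fetch would wrap requests.get on the result URL with the Authorization header, as get_stt does in the examples.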

Quick Code Examples

Let's use the #g1_nova-2-meeting model to transcribe a sample audio fragment.

Example #1: Processing a Speech Audio File via URL

import time
import requests

base_url = "https://api.apilaplas.com/v1"
# Insert your LAPLAS API Key instead of <YOUR_LAPLASAPI_KEY>:
api_key = "<YOUR_LAPLASAPI_KEY>"

# Creating and sending a speech-to-text conversion task to the server
def create_stt():
    url = f"{base_url}/stt/create"
    headers = {
        "Authorization": f"Bearer {api_key}", 
    }

    data = {
        "model": "#g1_nova-2-meeting",
        "url": "https://audio-samples.github.io/samples/mp3/blizzard_primed/sample-0.mp3"
    }
 
    response = requests.post(url, json=data, headers=headers)
    
    if response.status_code >= 400:
        print(f"Error: {response.status_code} - {response.text}")
    else:
        response_data = response.json()
        print(response_data)
        return response_data

# Requesting the result of the task from the server using the generation_id
def get_stt(gen_id):
    url = f"{base_url}/stt/{gen_id}"
    headers = {
        "Authorization": f"Bearer {api_key}", 
    }
    response = requests.get(url, headers=headers)
    return response.json()
    
# First, start the generation, then repeatedly request the result from the server every 10 seconds.
def main():
    stt_response = create_stt()
    # create_stt() returns None when the request fails
    if stt_response is None:
        return None

    gen_id = stt_response.get("generation_id")
    if not gen_id:
        print("Error: no generation_id in the response")
        return None

    start_time = time.time()
    timeout = 600  # seconds
    while time.time() - start_time < timeout:
        response_data = get_stt(gen_id)

        if response_data is None:
            print("Error: No response from API")
            return None

        status = response_data.get("status")

        if status == "waiting" or status == "active":
            print("Still waiting... Checking again in 10 seconds.")
            time.sleep(10)
        else:
            print("Processing complete:\n", response_data["result"]["results"]["channels"][0]["alternatives"][0]["transcript"])
            return response_data

    print("Timeout reached. Stopping.")
    return None


if __name__ == "__main__":
    main()

Response

{'generation_id': 'h66460ba-0562-1dd9-b440-a56d947e72a3'}
Processing complete:
 He doesn't belong to you and i don't see how you have anything to do with what is be his power yet he's he persona from this stage to you be fine
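
The example reads the transcript from result.results.channels[0].alternatives[0].transcript. A minimal sketch of a more defensive reader is shown here. It returns the first alternative of every channel and an empty list when the response has a different shape. The helper name extract_transcripts is hypothetical, and the sample dictionary is made up to mirror the path used in the example.

```python
from typing import Any, Dict, List


def extract_transcripts(response_data: Dict[str, Any]) -> List[str]:
    """Return the first transcript of every channel, or [] if the shape differs."""
    try:
        channels = response_data["result"]["results"]["channels"]
        return [channel["alternatives"][0]["transcript"] for channel in channels]
    except (KeyError, IndexError, TypeError):
        return []


# Made-up response that follows the path used in the example above.
sample = {
    "result": {
        "results": {
            "channels": [
                {"alternatives": [{"transcript": "hello world"}]},
            ]
        }
    }
}
print(extract_transcripts(sample))               # prints ['hello world']
print(extract_transcripts({"status": "error"}))  # prints []
```

This avoids a KeyError when the task ends in a state that carries no transcript.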

Example #2: Processing a Speech Audio File via File Path

import time
import requests

base_url = "https://api.apilaplas.com/v1"
# Insert your LAPLAS API Key instead of <YOUR_LAPLASAPI_KEY>:
api_key = "<YOUR_LAPLASAPI_KEY>"

# Creating and sending a speech-to-text conversion task to the server
def create_stt():
    url = f"{base_url}/stt/create"
    headers = {
        "Authorization": f"Bearer {api_key}", 
    }

    data = {
        "model": "#g1_nova-2-meeting",
    }
    with open("stt-sample.mp3", "rb") as file:
        files = {"audio": ("sample.mp3", file, "audio/mpeg")}
        response = requests.post(url, data=data, headers=headers, files=files)
    
    if response.status_code >= 400:
        print(f"Error: {response.status_code} - {response.text}")
    else:
        response_data = response.json()
        print(response_data)
        return response_data

# Requesting the result of the task from the server using the generation_id
def get_stt(gen_id):
    url = f"{base_url}/stt/{gen_id}"
    headers = {
        "Authorization": f"Bearer {api_key}", 
    }
    response = requests.get(url, headers=headers)
    return response.json()
    
# First, start the generation, then repeatedly request the result from the server every 10 seconds.
def main():
    stt_response = create_stt()
    # create_stt() returns None when the request fails
    if stt_response is None:
        return None

    gen_id = stt_response.get("generation_id")
    if not gen_id:
        print("Error: no generation_id in the response")
        return None

    start_time = time.time()
    timeout = 600  # seconds
    while time.time() - start_time < timeout:
        response_data = get_stt(gen_id)

        if response_data is None:
            print("Error: No response from API")
            return None

        status = response_data.get("status")

        if status == "waiting" or status == "active":
            print("Still waiting... Checking again in 10 seconds.")
            time.sleep(10)
        else:
            print("Processing complete:\n", response_data["result"]["results"]["channels"][0]["alternatives"][0]["transcript"])
            return response_data

    print("Timeout reached. Stopping.")
    return None


if __name__ == "__main__":
    main()

Response

{'generation_id': 'd793a81c-f8d8-40e0-a7c6-049ec6f54446'}
Processing complete:
 He doesn't belong to you, and I don't see how you have anything to do with what is be his power yet. He's he pursuing that from this stage to you.

