The most accurate speech models available — pick the right one
Nova-3 at a glance
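All three model names belong to the same generation of Deepgram's neural architecture, but they're tuned for different jobs. The shortest version of the comparison: nova-3 and nova-3-general are aliases for the same general-purpose multilingual model, while nova-3-medical is an English-only variant fine-tuned for clinical speech. The detailed breakdown follows below.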
Nova-3: Deepgram's general-purpose flagship
When people say "Nova-3," they usually mean the foundation model behind the entire family. It's the most capable ASR engine Deepgram has shipped for general-purpose use, and it's built to handle the messy, complex audio scenarios that older models struggled with.
Foundation Model
Nova-3

Nova-3 is the result of substantial retraining on a more diverse and representative dataset than its predecessor, Nova-2. The most visible result is a 54.2% reduction in word error rate for streaming audio and 47.4% for batch processing when compared against major competitors, numbers that translate directly into fewer missed words and cleaner transcripts in production.
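To make the percentages concrete, here is a small sketch of how a relative word error rate reduction is computed. It assumes the published figures are relative reductions, which the text does not state explicitly. The baseline and new WER values are hypothetical, chosen only so that the arithmetic works out to 54.2%.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the fractional WER reduction relative to the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers, not measured results: a baseline that gets 10.0% of
# words wrong versus a model that gets 4.58% of words wrong.
reduction = relative_wer_reduction(0.100, 0.0458)
print(f"{reduction:.1%}")
```

Under that reading, a transcript that previously contained about 100 errors per 1,000 words would contain about 46.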
Beyond raw accuracy, Nova-3 introduces several capabilities that matter for real applications: it can handle conversations where speakers freely switch between languages mid-sentence (multilingual codeswitching), it understands domain-specific vocabulary without requiring custom model training, and it supports optional on-the-fly redaction of personal information like phone numbers and names.
The self-serve customization feature is worth highlighting separately. Nova-3 is the first Deepgram model that allows vocabulary adaptation (keyterm prompting) without submitting a model retraining request. If your application regularly hears uncommon product names, internal jargon, or specialized terminology, you can provide keyterms at request time and the model will prioritize recognizing them immediately.
- Industry-leading word error rate
- Real-time multilingual codeswitching
- 50+ supported languages
- Enhanced domain-specific terminology
- Keyterm prompting (no retraining)
- PII redaction support
- Speaker diarization
- Batch and streaming support
- Smart formatting for numbers/dates
- Punctuation and paragraph detection
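Several of the capabilities listed above are typically switched on per request. The sketch below builds a query string for such a request. The parameter names (`diarize`, `smart_format`, `punctuate`, `paragraphs`, `redact`) and the `pii` redaction value are assumptions based on Deepgram's public API conventions rather than anything stated in this article, and `build_feature_query` is a hypothetical helper, so check the current API reference before relying on them.

```python
from urllib.parse import urlencode


def build_feature_query(model: str = "nova-3", redact_pii: bool = False) -> str:
    """Build a query string that enables common transcript features."""
    params = [
        ("model", model),
        ("diarize", "true"),       # assumed name: speaker diarization
        ("smart_format", "true"),  # assumed name: number/date formatting
        ("punctuate", "true"),     # assumed name: punctuation
        ("paragraphs", "true"),    # assumed name: paragraph detection
    ]
    if redact_pii:
        # Assumed parameter name and value for optional PII redaction.
        params.append(("redact", "pii"))
    return urlencode(params)


print(build_feature_query(redact_pii=True))
```

Keeping redaction behind an explicit flag mirrors the article's description of it as optional.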
Recommended use cases
Meetings & conference calls
Multi-speaker audio, with accurate speaker diarization and robust handling of noisy conditions.
Live event captioning
Real-time accuracy for broadcasts, webinars, and live presentations.
Multilingual applications
Products where users switch between languages naturally in the same conversation.
Far-field & noisy audio
Smart home, in-car, and ambient environments with unpredictable sound quality.
Nova-3 Medical: purpose-built for clinical environments
Healthcare speech recognition is its own discipline. The vocabulary is enormous, pronunciation is inconsistent across specialties and accents, and the cost of a transcription error is categorically higher than in a general context. Nova-3 Medical was built specifically for this problem.
Medical Specialist
Nova-3 Medical

Nova-3 Medical carries all the architectural improvements of Nova-3 but is additionally fine-tuned on a deep corpus of medical speech — clinical dictation recordings, physician notes, discharge summaries, and the full range of specialty-specific language that a general model would frequently miss or mangle.
The practical difference shows up in terminology like drug names, anatomical structures, diagnostic codes, procedural terms, and the specific phrasing patterns that clinicians use when dictating under time pressure. A model that hasn't been trained on this domain will struggle with words like "methicillin-resistant Staphylococcus aureus," "ventriculoperitoneal shunt," or specialty abbreviations that don't appear in everyday speech. Nova-3 Medical handles these reliably.
It currently supports English across eight regional variants, which covers the major clinical markets where English is the primary working language of documentation.
- Medical vocabulary fine-tuning
- Pharmacological term accuracy
- Anatomical & procedural terminology
- Clinical dictation patterns
- 8 English regional variants
- Specialty-specific language
- High-accuracy medical abbreviations
Nova-3 Medical provides transcription assistance and should always be used with appropriate clinical oversight. It is not a clinical decision support tool and does not replace qualified medical documentation review. Healthcare organizations should ensure regulatory compliance, including HIPAA requirements, in their integration. Always validate transcriptions before they enter official medical records.
Where it fits in healthcare workflows
Physician dictation
SOAP notes, discharge summaries, referral letters, and operative reports dictated at natural speech pace.
EHR integration
Real-time transcription piped directly into electronic health record fields, reducing manual entry time.
Pharmacy documentation
Drug names, dosages, routes, and clinical instructions transcribed accurately, without being mistaken for similar-sounding everyday words.
Medical research
Interview transcription for qualitative research, focus groups, and clinical study documentation.
Nova-3 General: when versatility is the requirement

The nova-3-general model string (identical to nova-3) is Deepgram's recommended default for any application that doesn't have a specific domain requirement. It's the broadest, most capable model for general-purpose transcription work.
General Purpose
nova-3-general
If your use case doesn't align with a particular industry or niche, Nova-3 General provides a versatile solution. It performs strongly on customer service calls, podcast transcription, accessibility captioning, content localization, and any application where the audio environment and vocabulary are unpredictable.
The model's multilingual capability is a genuine differentiator here. In independent testing, Nova-3 showed up to 8:1 user preference ratios over competing models for certain language pairs — not just English. For teams building global applications, this means you can rely on a single model instead of managing separate pipelines for different languages.
Industries where it performs well
Contact centers
Agent call transcription, quality assurance, post-call summarization, and compliance monitoring.
Media & content
Podcast transcripts, video subtitles, interview archives, and content accessibility workflows.
Enterprise productivity
Meeting notes, internal knowledge capture, voice search, and document generation from spoken content.
EdTech & eLearning
Lecture transcription, accessibility compliance, language learning feedback, and interactive voice exercises.
Performance characteristics
Nova-3 represents a major step forward over Nova-2 and over competing general-purpose speech models. Here's a simplified view of where the models stand relative to each other on key dimensions.
Which Nova-3 model should you use?
The right model depends on your domain, language requirements, and what kind of audio you're processing. Here's a practical decision framework.
Use nova-3 or nova-3-general when:
- Your users speak multiple languages or code-switch
- Audio comes from noisy or far-field environments
- You need the broadest language coverage
- Your use case spans multiple industries
- You want the flexibility of keyterm prompting
- You're building multilingual consumer or enterprise apps
Use nova-3-medical when:
- You're building for clinical or healthcare workflows
- Your audio contains drug names, anatomical terms, or procedures
- Accuracy on specialist vocabulary is critical
- You're integrating with EHR or clinical documentation systems
- Your users are primarily English-speaking clinicians
- Regulatory and compliance context demands domain accuracy
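The checklist above can be condensed into a small routing function. This is only a sketch: the `Workload` fields and the `choose_model` helper are hypothetical names introduced for illustration, and real routing would likely weigh more signals than these two.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    clinical: bool      # audio comes from a clinical or healthcare workflow
    english_only: bool  # speakers are expected to use English only


def choose_model(workload: Workload) -> str:
    """Return a model string following the decision lists above."""
    if workload.clinical and workload.english_only:
        return "nova-3-medical"
    # Multilingual, code-switching, noisy, or cross-industry audio.
    return "nova-3-general"


print(choose_model(Workload(clinical=True, english_only=True)))
print(choose_model(Workload(clinical=False, english_only=False)))
```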
Heads up on Nova-2 Medical: If your current integration uses nova-2-medical, consider testing nova-3-medical — the generation upgrade brings meaningfully better accuracy for most medical audio. Nova-2 remains available for backward compatibility and for use cases requiring filler word detection, which Nova-3 doesn't support yet.
Start building with Nova-3 today
Access Deepgram's Nova-3, Nova-3 General, and Nova-3 Medical through AI/ML API, alongside hundreds of other models, in one place.
Frequently asked questions
What's the difference between nova-3 and nova-3-general?
They're the same model. Deepgram uses both strings as aliases that resolve to identical underlying model weights. You can use either in your API calls and get exactly the same results. The nova-3-general string makes the intent more explicit when working with a team or reading code later.
Does Nova-3 Medical work for non-English languages?
Not currently. Nova-3 Medical supports English only, across eight regional variants (en, en-US, en-AU, en-CA, en-GB, en-IE, en-IN, en-NZ). If you need medical transcription in other languages, you would need to use nova-3-general with appropriate keyterm prompting, though the specialist tuning won't be present.
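A minimal sketch of that fallback logic follows. The language codes come from the list above; the `model_for_clinical_audio` helper is a hypothetical name, and treating codes as case-sensitive exact matches is a simplifying assumption.

```python
MEDICAL_LANGUAGES = frozenset(
    {"en", "en-US", "en-AU", "en-CA", "en-GB", "en-IE", "en-IN", "en-NZ"}
)


def model_for_clinical_audio(language: str) -> str:
    """Pick nova-3-medical when the language is supported, else fall back."""
    if language in MEDICAL_LANGUAGES:
        return "nova-3-medical"
    # No specialist tuning exists for this language, so pair the general
    # model with keyterm prompting for the clinical vocabulary that matters.
    return "nova-3-general"


print(model_for_clinical_audio("en-GB"))
print(model_for_clinical_audio("es"))
```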
What is keyterm prompting and does it work with all Nova-3 models?
Keyterm prompting lets you provide a list of words or phrases that the model should treat as high-priority vocabulary. This is useful for proper nouns, brand names, internal terminology, or rare words. You pass these at request time using the keyterm parameter. It works with Nova-3 / Nova-3 General and Nova-3 Medical alike — no model retraining required.
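As a rough illustration, the snippet below encodes a keyterm list into a query string. The `keyterm` parameter name comes from the answer above; repeating the parameter once per term and percent-encoding spaces in multi-word phrases are assumptions about the wire format, and `build_keyterm_query` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode


def build_keyterm_query(model: str, keyterms: list[str]) -> str:
    """Encode the model plus one `keyterm` entry per term or phrase."""
    params = [("model", model)]
    params += [("keyterm", term) for term in keyterms]
    # quote_via=quote encodes spaces as %20 instead of "+".
    return urlencode(params, quote_via=quote)


query = build_keyterm_query(
    "nova-3-medical", ["ventriculoperitoneal shunt", "methicillin"]
)
print(query)
```

Because the terms travel with each request, different callers can send different vocabularies to the same model.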
How does multilingual codeswitching work in Nova-3?
When you set language=multi, Nova-3 actively detects which language is being spoken and transcribes it accordingly — even within a single utterance. The supported languages for multi mode are English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch. This is different from simple language detection: the model handles transitions in real time rather than labeling the dominant language of a whole segment.
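One way to decide when to request multi mode is sketched below. The language set mirrors the list above, written as ISO 639-1 codes. The `language_param` helper and its policy (use `multi` only when more than one supported language is expected) are illustrative assumptions, not Deepgram guidance.

```python
MULTI_LANGUAGES = frozenset(
    {"en", "es", "fr", "de", "hi", "ru", "pt", "ja", "it", "nl"}
)


def language_param(expected: set[str]) -> str:
    """Return the value to send as `language` for the expected speakers."""
    if not expected:
        raise ValueError("expected must contain at least one language code")
    if len(expected) == 1:
        # A single known language does not need code-switching support.
        return next(iter(expected))
    unsupported = expected - MULTI_LANGUAGES
    if unsupported:
        raise ValueError(f"not covered by multi mode: {sorted(unsupported)}")
    return "multi"


print(language_param({"en", "es"}))
print(language_param({"ja"}))
```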



