Stable Audio generates high-quality audio from text prompts with innovative features like audio transformation and extensive creative control.
Stable Audio is an advanced audio generation model designed to create high-quality audio tracks from textual prompts.
Key Features:
- Intended use: musicians, sound designers, and developers creating music, sound effects, or ambient audio for applications such as games, films, or interactive media.
- Language support: text prompts are primarily in English, though the model can process multilingual inputs depending on the context of the prompt.
- Architecture: a latent diffusion model optimized for audio generation, combining a highly compressed autoencoder for efficient representation of audio waveforms with a diffusion transformer (DiT) that excels at modeling long sequences.
- Training data: a diverse dataset sourced from the AudioSparx music library, comprising over 800,000 audio files spanning music, sound effects, and single-instrument stems.
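As a rough illustration of that latent-diffusion pipeline (an autoencoder compresses audio into a compact latent sequence, a denoiser iteratively refines noise into latents, and a decoder reconstructs the waveform), here is a minimal toy sketch. Every function here is a stand-in for illustration only, not Stability AI's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(waveform, factor=64):
    # Toy "encoder": downsample by block-averaging, standing in for the
    # learned, highly compressed latent representation of the autoencoder.
    n = len(waveform) // factor * factor
    return waveform[:n].reshape(-1, factor).mean(axis=1)

def decode(latents, factor=64):
    # Toy "decoder": upsample by repetition.
    return np.repeat(latents, factor)

def denoise_step(x, t, cond):
    # Stand-in for the diffusion transformer's denoising update,
    # nudging the noisy latents toward the text-conditioning signal.
    return x * 0.9 + cond * 0.1

def generate(cond, steps=10, latent_len=256):
    x = rng.standard_normal(latent_len)  # start from pure noise
    for t in reversed(range(steps)):
        x = denoise_step(x, t, cond)
    return decode(x)                     # latents -> waveform

audio = generate(cond=np.zeros(256))
print(audio.shape)  # (16384,)
```

The key design point mirrored here is that diffusion runs in the compact latent space, not on raw samples, which is what makes long audio sequences tractable for the transformer.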
Stable Audio has demonstrated impressive performance metrics.
The model is available on the AI/ML API platform as "Stable Audio".
Detailed API Documentation is available here.
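For orientation, a request to the API might be built as in the sketch below. The endpoint path, model identifier, and parameter names are assumptions for illustration only; consult the API documentation for the actual contract:

```python
import json
import urllib.request

# Assumed endpoint and field names -- check the AI/ML API docs before use.
API_URL = "https://api.aimlapi.com/v2/generate/audio"

def build_request(prompt, duration_seconds=30, api_key="YOUR_API_KEY"):
    payload = {
        "model": "stable-audio",          # assumed model identifier
        "prompt": prompt,                 # the text description of the audio
        "seconds_total": duration_seconds # assumed duration parameter
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("lo-fi hip hop beat with vinyl crackle", 45)
print(req.get_full_url())
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) would require a valid API key from the platform.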
Stability AI emphasizes ethical considerations in AI development by promoting transparency regarding the model's capabilities and limitations. The organization ensures that all training data respects copyright laws and provides options for artists to opt out of data usage.
Stable Audio is available under a commercial license that grants both research and commercial usage rights while ensuring compliance with ethical standards regarding creator rights.
Get Stable Audio API here.