Stability AI Unveils Open-Source Text-to-Audio Generator

Stable Audio Open allows users to create short audio samples for non-commercial use, with fine-tuning options available.

Overview

Stable Audio Open generates audio samples up to 47 seconds long from text prompts.
The model supports sound effects, instrument riffs, and ambient noises but not full songs or vocals.
Users can fine-tune the model with custom audio data to create unique samples.
The training dataset lacks cultural diversity, and the generated samples reflect that limitation.
Commercial use requires a subscription to Stability AI's premium Stable Audio service.