Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models

Welcome to the demo page for our paper "Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models" (MSG-LD). Our model builds upon the foundation of latent diffusion models and introduces advancements that address the complexities of music generation, arrangement generation, and source separation.

We introduce a latent diffusion-based multi-track generation model capable of both source separation and multi-track music synthesis by learning the joint probability distribution of tracks sharing a musical context. Our model also enables arrangement generation by creating any subset of tracks given the others (e.g., generating a guitar track based on provided bass and drum tracks). We trained our model on the Slakh2100 dataset, compared it with an existing multi-track generative model, and observed significant improvements in objective metrics across the source separation, music generation, and arrangement generation tasks.
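For readers curious how generating a subset of tracks conditioned on the others can work mechanically, one common approach with diffusion models is inpainting-style sampling: at each reverse step, the latents of the provided tracks are re-noised to the current noise level and clamped in place, while the missing tracks are denoised jointly with them. The sketch below illustrates this idea in PyTorch; the function names, arguments, and sampler details are illustrative assumptions, not MSG-LD's actual implementation.

```python
# Minimal sketch of arrangement generation via inpainting-style sampling
# over a multi-track latent stack. `denoise_step` and `alphas_cumprod`
# stand in for a trained diffusion model's reverse step and noise
# schedule; both are hypothetical placeholders, not the paper's API.
import torch

@torch.no_grad()
def arrange(denoise_step, alphas_cumprod, given_latents, given_mask):
    """Generate the missing tracks of a latent stack of shape
    (tracks, channels, time), keeping the provided tracks fixed.

    given_latents: clean latents for the known tracks (zeros elsewhere)
    given_mask:    1.0 where a track is provided, 0.0 where generated
    """
    z = torch.randn_like(given_latents)  # start the whole stack from noise
    for t in reversed(range(len(alphas_cumprod))):
        # Re-noise the known tracks to the current noise level so the
        # joint stack stays on the model's learned trajectory.
        a_t = alphas_cumprod[t]
        noised = a_t.sqrt() * given_latents + (1 - a_t).sqrt() * torch.randn_like(z)
        z = given_mask * noised + (1 - given_mask) * z
        # One reverse-diffusion step over the joint multi-track latent.
        z = denoise_step(z, t)
    # Restore the exact clean latents of the provided tracks at the end.
    return given_mask * given_latents + (1 - given_mask) * z
```

Because the model learns the joint distribution over tracks, the same sampler covers all the scenarios demonstrated below: a full mask yields source separation-like conditioning, an empty mask yields total generation, and partial masks yield arrangement generation.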

On this page, we present separation and generation demos of MSG-LD in three scenarios: source separation, total generation, and arrangement generation.

Source Separation


Total Generation


Arrangement Generation

Arrangement Generation B: Bass


Arrangement Generation D: Drums


Arrangement Generation G: Guitar


Arrangement Generation P: Piano


Arrangement Generation BD: Bass and Drums


Arrangement Generation BG: Bass and Guitar


Arrangement Generation BP: Bass and Piano


Arrangement Generation DG: Drums and Guitar


Arrangement Generation DP: Drums and Piano


Arrangement Generation GP: Guitar and Piano


Arrangement Generation BDG: Bass, Drums, and Guitar


Arrangement Generation BDP: Bass, Drums, and Piano


Arrangement Generation BGP: Bass, Guitar, and Piano


Arrangement Generation DGP: Drums, Guitar, and Piano