
AI for Music Arrangement

This is part of a series on Opportunities for AI in Music Production.

Problem / When writing a song using a Digital Audio Workstation (DAW) like Ableton Live, Logic Pro, or GarageBand, a key point of friction is arranging individual instrument parts (audio and MIDI recordings) into a cohesive song. Most amateur musicians get blocked at the arrangement phase: they can create loops that sound good together, but fail to turn them into a finished song.

Song arrangements are highly templatized. For example, a typical pop song has an arrangement made up of sections like intro, verse, chorus, verse, chorus, bridge, chorus, chorus, outro. Across sections, instrument tracks come in or out. Electronic dance music songs usually have a more additive arrangement, where tracks are added or removed over time.

Solution / An “Auto Arranger” DAW plugin that analyzes existing songs, extracts the arrangement, and creates a template (a.k.a. a “ghost arrangement”) for the composer to fill in with their existing parts.
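
As a rough sketch of what such a template might contain (the names and fields below are illustrative, not a committed format), a ghost arrangement could be as simple as an ordered list of sections, each with the set of instrument roles that are active:

```python
# Hypothetical sketch of a "ghost arrangement": an ordered list of sections,
# each recording which instrument roles are active. Names are illustrative.
from dataclasses import dataclass

@dataclass
class Section:
    name: str                  # "intro", "verse", ... or just "Part A"/"Part B"
    start_bar: int
    length_bars: int
    active_tracks: frozenset   # instrument roles present in this section

pop_template = [
    Section("intro",   0,  8, frozenset({"drums", "pads"})),
    Section("verse",   8, 16, frozenset({"drums", "bass", "pads"})),
    Section("chorus", 24, 16, frozenset({"drums", "bass", "pads", "lead", "vocals"})),
    Section("verse",  40, 16, frozenset({"drums", "bass", "pads", "lead"})),
    Section("chorus", 56, 16, frozenset({"drums", "bass", "pads", "lead", "vocals"})),
    Section("bridge", 72,  8, frozenset({"pads", "vocals"})),
    Section("chorus", 80, 16, frozenset({"drums", "bass", "pads", "lead", "vocals"})),
    Section("outro",  96,  8, frozenset({"drums", "pads"})),
]
```

Filling in the blanks then amounts to assigning the composer’s own clips to each section’s active_tracks slots.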

Auto Arranger will analyze an audio file, identifying sections of the song as well as when individual instruments enter, exit, or change in the mix. We can train it by generating audio files with known components.
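
A minimal sketch of that data-generation step, assuming a library of single-instrument stems (the function name and half-second label window are hypothetical choices, not decisions from this post):

```python
# Hypothetical sketch: mix known stems according to a per-window on/off
# schedule, so every training example comes with exact instrument labels.
import numpy as np

SR = 44100       # sample rate
WIN = SR // 2    # label resolution: half-second windows

def render_example(stems, schedule):
    """Mix stems per a boolean on/off schedule.

    stems:    {name: mono float32 waveform}, each long enough to cover the schedule
    schedule: {name: boolean array, one flag per window}
    Returns (mix, labels), where labels is an (n_windows, n_stems) 0/1 matrix.
    """
    names = sorted(stems)
    n_windows = min(len(schedule[n]) for n in names)
    mix = np.zeros(n_windows * WIN, dtype=np.float32)
    labels = np.zeros((n_windows, len(names)), dtype=np.float32)
    for i, name in enumerate(names):
        on = schedule[name][:n_windows].astype(np.float32)
        mix += np.repeat(on, WIN) * stems[name][: n_windows * WIN]
        labels[:, i] = on
    peak = float(np.max(np.abs(mix))) or 1.0   # avoid division by zero
    return mix / peak, labels
```

Generating many random schedules over a stem library yields as much labeled audio as we want, even if none of it sounds like a real song.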

Key questions:

  • What approach should we use to extract instrument presence? We could generate a dataset for this fairly easily: lots of songs with variations of parts coming in and out, and the training set doesn’t even have to sound like music. I think this would entail some kind of auto-encoder that maps from a data file representing the elements included in a given time window to the waveform; later you’d reverse the encoding to evaluate, given a waveform, which elements (instruments) occur at which times (a simpler baseline is sketched after this list).
  • How about identifying verse vs. chorus within a song? This is probably done like deCoda (below). Don’t call it “verse” and “chorus”; instead, categorize each region as belonging to “Part A” or “Part B” based on regions of similarity within the song, keeping in mind that one verse might not have the same instruments as another, but is probably melodically similar (see the segmentation sketch after this list).
  • Assuming we can generate templates, integration with the DAW is critical. Most likely, our software would run as a plugin, suck up clips from the project (ideally matching each clip to a template part by similarity), and write them back to the arrangement view.
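
For the first question, one hedged baseline (simpler than, and not necessarily as good as, the encoder idea above) is to treat instrument presence as multi-label classification: compute spectrogram features for each time window of the mix and predict which instruments are active. A sketch using librosa and scikit-learn, with hypothetical names:

```python
# Hypothetical baseline for per-window instrument presence: mel-spectrogram
# features per half-second window, one binary classifier per instrument.
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

SR, WIN = 44100, 44100 // 2

def window_features(wave):
    """Return one mel-spectrogram feature vector per half-second window."""
    feats = []
    for w in range(len(wave) // WIN):
        chunk = wave[w * WIN : (w + 1) * WIN]
        mel = librosa.feature.melspectrogram(y=chunk, sr=SR, n_mels=64)
        feats.append(librosa.power_to_db(mel).mean(axis=1))  # average over time frames
    return np.array(feats)

# X/Y would come from the synthetic generator sketched earlier:
#   X = np.vstack([window_features(mix) for mix in mixes])
#   Y = np.vstack(label_matrices)
#   clf = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
#   presence = clf.predict(window_features(new_song))  # (n_windows, n_instruments)
```

For the second question, a common approach (perhaps similar in spirit to deCoda, though I’m only guessing at its internals) is to segment the song on feature similarity and then cluster the segments into unnamed parts. Clustering on chroma rather than instrumentation matches the point above: two verses may differ in instrumentation but still be melodically similar. The section and part counts below are hypothetical parameters:

```python
# Hypothetical sketch: split a song into sections via chroma similarity,
# then group similar sections into parts ("Part A", "Part B", ...).
import numpy as np
import librosa
from sklearn.cluster import KMeans

def label_sections(path, n_sections=8, n_parts=2):
    y, sr = librosa.load(path, sr=None, mono=True)
    chroma = librosa.feature.chroma_cqt(y=y, sr=sr)              # harmonic/melodic content
    bounds = librosa.segment.agglomerative(chroma, n_sections)   # frame index of each section start
    times = librosa.frames_to_time(bounds, sr=sr)
    edges = list(bounds) + [chroma.shape[1]]
    # Average chroma within each section, then cluster sections by similarity.
    profiles = np.array([chroma[:, edges[i]:edges[i + 1]].mean(axis=1)
                         for i in range(n_sections)])
    parts = KMeans(n_clusters=n_parts, n_init=10).fit_predict(profiles)
    return [(float(t), f"Part {chr(65 + p)}") for t, p in zip(times, parts)]
```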

Some existing products, for visual inspiration:

iZotope RX visualizes an audio file for editing, but doesn’t derive a template from it.
deCoda extracts sections and melodies, but doesn’t create a new arrangement.
Splice visualizes the tracks and clips in your project, but can’t detect them in other songs.
Song Sketch solves the second half of the problem: it lets you manually place parts into a template (but the templates are created by humans).

Next: AI for Audio Sample Browsing
