This is part of a series on Opportunities for AI in Music Production.
Problem / When writing a song using a Digital Audio Workstation (DAW) like Ableton Live, Logic Pro, or GarageBand, a key point of friction is arranging individual instrument parts (audio and MIDI recordings) into a cohesive song. Most amateur musicians get blocked at the arrangement phase: they can create loops that sound good together, but can't turn those loops into a finished song.
Song arrangements are highly templatized. For example, a typical pop song has an arrangement made up of sections like intro, verse, chorus, verse, chorus, bridge, chorus, chorus, outro. Across sections, instrument tracks come in or out. Electronic dance music songs usually have a more additive arrangement, where tracks are added or removed over time.
Solution / An “Auto Arranger” DAW plugin that analyzes existing songs, extracts the arrangement, and creates a template (AKA a “ghost arrangement”) that the composer fills in with their existing parts.
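To make the idea of a “ghost arrangement” concrete, here is a minimal sketch of how such a template could be represented as data. The `Section` and `GhostArrangement` classes, the track role names, and the bar counts are all hypothetical, chosen only for illustration, and the example song is a shortened version of the pop structure described above.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    label: str                      # e.g. "intro", "verse", "chorus"
    bars: int                       # section length in bars
    active_tracks: set[str] = field(default_factory=set)  # roles playing here


@dataclass
class GhostArrangement:
    sections: list[Section]

    def track_spans(self) -> dict[str, list[tuple[int, int]]]:
        """Return, per track role, the (start_bar, end_bar) spans where it plays.

        Spans that touch across a section boundary are merged into one.
        """
        spans: dict[str, list[tuple[int, int]]] = {}
        bar = 0
        for section in self.sections:
            end = bar + section.bars
            for role in sorted(section.active_tracks):
                role_spans = spans.setdefault(role, [])
                if role_spans and role_spans[-1][1] == bar:
                    # Role was already playing in the previous section: extend.
                    role_spans[-1] = (role_spans[-1][0], end)
                else:
                    role_spans.append((bar, end))
            bar = end
        return spans


# Hypothetical template: a shortened pop arrangement with three track roles.
pop = GhostArrangement([
    Section("intro", 4, {"drums"}),
    Section("verse", 8, {"drums", "bass"}),
    Section("chorus", 8, {"drums", "bass", "lead"}),
    Section("verse", 8, {"drums", "bass"}),
    Section("chorus", 8, {"drums", "bass", "lead"}),
    Section("outro", 4, {"drums"}),
])

for role, role_spans in pop.track_spans().items():
    print(role, role_spans)
```

The composer's job is then to assign one of their own clips to each role; the spans say where that clip should sit on the timeline.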
Auto Arranger will analyze an audio file and identify the sections of the song, as well as the points where individual instruments enter, exit, or change in the mix. We would train it on generated audio files whose components are known in advance.
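As a rough sketch of what “generating audio files with known components” could look like, the snippet below mixes a few sine tones that switch on and off at random and keeps the per-window on/off labels alongside the waveform. Everything here is an assumption made for illustration: the sine tones standing in for instruments, the sample rate, the window length, and the `make_example` helper. A real dataset would presumably render actual instrument stems.

```python
import math
import random

SAMPLE_RATE = 8000       # Hz; deliberately low to keep the example light
WINDOW_SECONDS = 0.5     # label resolution (assumption)
# Hypothetical stand-ins for instruments: one sine tone per "instrument".
INSTRUMENTS = {"bass": 110.0, "pad": 440.0, "lead": 1760.0}


def make_example(num_windows, seed=None):
    """Return (waveform, labels) for one synthetic training example.

    labels[w][name] is True if instrument `name` plays during window w.
    """
    rng = random.Random(seed)
    window_len = int(SAMPLE_RATE * WINDOW_SECONDS)
    labels = [
        {name: rng.random() < 0.5 for name in INSTRUMENTS}
        for _ in range(num_windows)
    ]
    waveform = []
    for w, active in enumerate(labels):
        for i in range(window_len):
            t = (w * window_len + i) / SAMPLE_RATE
            sample = sum(
                math.sin(2 * math.pi * freq * t)
                for name, freq in INSTRUMENTS.items()
                if active[name]
            )
            # Divide by the instrument count so the mix stays within [-1, 1].
            waveform.append(sample / len(INSTRUMENTS))
    return waveform, labels


waveform, labels = make_example(num_windows=4, seed=0)
print(len(waveform), "samples,", len(labels), "labelled windows")
```

Because the labels are produced before the audio, the ground truth is exact, which is the main appeal of a synthetic training set.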
- What approach should we use to extract instrument presence? We could generate a dataset for this pretty easily: lots of songs with variations of parts coming in and out. The training set doesn’t even have to sound like music. I think this would entail some kind of auto-encoder that maps from a data file listing the elements present in each time window to the waveform. Later you’d reverse the encoding so that, given a waveform, it reports which elements (instruments) occur at which times.
- How about identifying verse vs. chorus within a song? This is probably done like deCoda below. Don’t call it “verse” and “chorus” but categorize each region as belonging to “Part A” or “Part B” based on regions of similarity within a song (keeping in mind one verse might not have the same instruments as another, but is probably melodically similar).
- Assuming we can generate templates, integration with the DAW is critical. Most likely, our software would run as a plugin, suck up clips from the project (ideally matching them by “similarity” for the template part), and write them back to the arrangement view.
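The “Part A” / “Part B” idea from the list above can be sketched as a greedy grouping by similarity. This is only an illustration, not a description of how deCoda or any other product works. It assumes each region has already been reduced to a feature vector that captures melodic content rather than instrumentation (for example, an averaged pitch-class profile); the `label_regions` helper, the toy vectors, and the 0.9 threshold are all hypothetical.

```python
import math
from string import ascii_uppercase


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_regions(features, threshold=0.9):
    """Assign "Part A", "Part B", ... to regions, reusing a label when a region
    is similar enough to the first region that received that label.

    Supports at most 26 distinct parts, which is plenty for a sketch.
    """
    exemplars = []   # first feature vector seen for each part
    labels = []
    for vec in features:
        scores = [cosine(vec, ex) for ex in exemplars]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            labels.append(f"Part {ascii_uppercase[best]}")
        else:
            exemplars.append(vec)
            labels.append(f"Part {ascii_uppercase[len(exemplars) - 1]}")
    return labels


# Toy feature vectors: the second verse differs slightly from the first.
regions = [
    [1.0, 0.0, 1.0, 0.0],   # verse
    [0.0, 1.0, 0.0, 1.0],   # chorus
    [0.9, 0.1, 1.0, 0.0],   # verse, slightly varied
    [0.0, 1.0, 0.0, 1.0],   # chorus
]
print(label_regions(regions))
# → ['Part A', 'Part B', 'Part A', 'Part B']
```

Comparing against the first exemplar only is the simplest possible choice; a real implementation would more likely cluster over a full self-similarity matrix.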
Some existing products, for visual inspiration: