CompoNet: Toward Incorporating Human Perspective in Automatic Music Generation Using Deep Learning



The artistic nature of music makes it difficult, if not impossible, to extract solid rules from composed pieces and express them mathematically. As a result, the AI literature on automated music composition makes little use of expert musical knowledge. In this study, we employ intervals, which are the building blocks of music, to represent musical data in a way that is closer to a human composer's perspective. Based on intervals, we developed and trained OrchNet, which translates musical data to and from a numerical vector representation. A second model, CompoNet, is developed and trained to generate music. Using intervals and a novel monitor-and-inject mechanism, we address the two main limitations of the literature: lack of orchestration and lack of long-term memory. The music generated by CompoNet is evaluated with a Turing test, which asks whether human judges can distinguish pieces composed by humans from pieces generated by our system. The Turing test results were compared using the Mann-Whitney U test, which showed no statistically significant difference between human-composed music and the music generated by our system.
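To illustrate what an interval-based representation can look like, the following minimal sketch encodes a melody as a starting pitch plus successive semitone differences and decodes it again. It is an illustration only and not OrchNet's actual encoding, which the abstract does not specify. The function names `to_intervals` and `from_intervals` are hypothetical, and representing pitches as MIDI note numbers is an assumption.

```python
from itertools import accumulate


def to_intervals(pitches):
    """Return the first pitch and the semitone steps between consecutive pitches.

    Assumption: pitches are MIDI note numbers (60 = middle C).
    """
    if not pitches:
        raise ValueError("need at least one pitch")
    steps = [b - a for a, b in zip(pitches, pitches[1:])]
    return pitches[0], steps


def from_intervals(start, steps):
    """Rebuild absolute pitches by accumulating the steps from the start pitch."""
    return list(accumulate(steps, initial=start))


melody = [60, 62, 64, 65, 67]  # C4 D4 E4 F4 G4
start, steps = to_intervals(melody)

# Transposing up a perfect fourth (5 semitones) changes only the start pitch;
# the interval sequence, which carries the melodic shape, stays the same.
transposed = from_intervals(start + 5, steps)
```

The point of the sketch is that the interval sequence is invariant under transposition, which is one reason intervals sit closer to how a composer thinks about a melody than absolute pitches do.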
