Convolutional and Graph-Theoretic Audio Analyses for Favorable Music Recommendations

Presenter: Anirudh Kamath

Research Category: Interdisciplinary Topics, Centers and Institutes

Ever listen to music on shuffle and press the “skip” button countless times before finding the “right” song? PyRate aims to find similarities between songs using both concrete song data and latent features extracted through raw audio analysis. With these similarities in hand, song recommendation and shuffling become nearly trivial.

A Neo4j database was implemented to link songs through the artists, albums, and playlists they have in common. On this graph-theoretic data model, collaborative filtering was used to recommend songs that share elements with a given song (the same artist, album, or playlist). However, this approach can only reach songs that are connected to the starting song by some path in the graph, and not every pair of songs in the database is connected.
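The shared-element idea above can be sketched without a database. The following is a minimal in-memory stand-in for the graph, not the PyRate implementation or a Neo4j query: the toy catalog, the `build_index` and `recommend` helpers, and the scoring rule (count of shared artists, albums, and playlists) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy catalog: song -> set of (kind, name) nodes it is linked to.
SONGS = {
    "song_a": {("artist", "X"), ("album", "X1"), ("playlist", "chill")},
    "song_b": {("artist", "X"), ("album", "X2"), ("playlist", "chill")},
    "song_c": {("artist", "Y"), ("album", "Y1"), ("playlist", "chill")},
    "song_d": {("artist", "Z"), ("album", "Z1")},  # shares nothing with the rest
}


def build_index(songs):
    """Invert the song -> attribute map into attribute -> songs."""
    index = defaultdict(set)
    for song, attrs in songs.items():
        for attr in attrs:
            index[attr].add(song)
    return index


def recommend(seed, songs, index, k=5):
    """Rank other songs by how many attributes they share with the seed."""
    scores = Counter()
    for attr in songs[seed]:
        for other in index[attr]:
            if other != seed:
                scores[other] += 1
    # Sort by score (descending), then by name for a deterministic order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


INDEX = build_index(SONGS)
print(recommend("song_a", SONGS, INDEX))  # [('song_b', 2), ('song_c', 1)]
print(recommend("song_d", SONGS, INDEX))  # []
```

The empty result for `song_d` illustrates the limitation described above: a song with no connection to the rest of the graph cannot be reached by this kind of recommendation.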

To address these gaps in collaborative filtering, songs were also evaluated using audio analysis. Spotify provides audio data that breaks a song down into sequences of segments and sections, each described by pitch, timbre, tempo, and loudness down to the millisecond. Using a convolutional autoencoder, we consider these elements (pitch, timbre, loudness, and rhythm) for each segment together with the same properties of the segments around it. In doing so, we reduce the dimensionality of the original song data to a standardized vector that can be used for more scalable and reliable song recommendations alongside collaborative filtering.
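As a rough illustration of how a convolution over neighboring segments can turn songs of different lengths into vectors of one fixed size, here is a sketch of the encoder half only. It is not the authors' architecture: the weights are random rather than learned, the decoder and training loop of a real autoencoder are omitted, and the segment field names (`pitches`, `timbre`, `loudness_max`), the 25-value segment layout, the kernel width, and the code length are all assumptions.

```python
import math
import random

FEATURES = 25  # assumed per-segment layout: 12 pitches + 12 timbre + 1 loudness
KERNEL = 3     # each output sees a segment plus one neighbor on each side
CODE_DIM = 8   # length of the fixed-size song vector (arbitrary choice)


def segment_row(seg):
    """Flatten one segment dict into a FEATURES-long list."""
    return list(seg["pitches"]) + list(seg["timbre"]) + [seg["loudness_max"]]


def make_kernels(seed=0):
    """Untrained stand-in weights; a real autoencoder would learn these."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.1, 0.1) for _ in range(FEATURES)]
             for _ in range(KERNEL)] for _ in range(CODE_DIM)]


def encode(segments, kernels):
    """1-D convolution over time, ReLU, then mean-pool to CODE_DIM values."""
    rows = [segment_row(s) for s in segments]
    steps = len(rows) - KERNEL + 1
    if steps < 1:
        raise ValueError("need at least KERNEL segments")
    code = []
    for kernel in kernels:
        total = 0.0
        for t in range(steps):
            act = sum(w * x
                      for k in range(KERNEL)
                      for w, x in zip(kernel[k], rows[t + k]))
            total += max(act, 0.0)  # ReLU
        code.append(total / steps)  # pooling removes the length dependence
    return code


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fake_song(n_segments, seed):
    """Synthetic segments standing in for real audio analysis data."""
    rng = random.Random(seed)
    return [{"pitches": [rng.random() for _ in range(12)],
             "timbre": [rng.uniform(-50, 50) for _ in range(12)],
             "loudness_max": rng.uniform(-60, 0)} for _ in range(n_segments)]


kernels = make_kernels()
short_vec = encode(fake_song(40, seed=1), kernels)
long_vec = encode(fake_song(300, seed=2), kernels)
print(len(short_vec), len(long_vec))  # 8 8
print(cosine(short_vec, long_vec))    # similarity score between the two songs
```

Because both songs map to vectors of the same length, any pair can be compared directly, for example with cosine similarity, regardless of whether the two songs are connected in the graph.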

Thus, by combining collaborative filtering in a graph database with convolutional autoencoding, we provide a scalable and accurate music recommendation platform based on unsupervised similarity ranking.