
Resolve "Analysis of playlist corpus"

Andreas Rubin-Schwarz requested to merge 3-analysis-of-playlist-corpus into master

Closes #3 (closed)

Performs exploratory data analysis (EDA) on the complete playlist data set to offer insights for sampling. Steps include:

  • Popularity aggregations of songs and tracks
  • Adding popularity stats per playlist
  • Overview of most popular items and cross-check with official stats
  • Density plots on continuous features (including new popularity feature)
  • New ms_mean feature to represent playlist length on a track basis
  • Bivariate analysis (correlations, p-values)
  • Analysis of text information in header and description (including emojis, tf-idf scores, normal frequencies and bigrams)
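
As an illustration of the popularity and ms_mean steps listed above, here is a minimal sketch using only the standard library. It is not the code from this merge request. The toy corpus, the field names (`pid`, `tracks`, `track_uri`, `duration_ms`), the helper names, and the reading of ms_mean as the mean track duration in milliseconds per playlist are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy corpus; the real data set and its field names may differ.
playlists = [
    {"pid": 0, "tracks": [{"track_uri": "t1", "duration_ms": 200_000},
                          {"track_uri": "t2", "duration_ms": 100_000}]},
    {"pid": 1, "tracks": [{"track_uri": "t1", "duration_ms": 200_000},
                          {"track_uri": "t3", "duration_ms": 300_000}]},
    {"pid": 2, "tracks": [{"track_uri": "t1", "duration_ms": 200_000}]},
]


def track_popularity(playlists):
    """Count the number of playlists each track appears in.

    A track that occurs several times in one playlist is counted once.
    """
    counts = Counter()
    for playlist in playlists:
        counts.update({t["track_uri"] for t in playlist["tracks"]})
    return counts


def playlist_stats(playlists):
    """Attach mean track popularity and mean track duration to each playlist."""
    popularity = track_popularity(playlists)
    stats = {}
    for playlist in playlists:
        tracks = playlist["tracks"]
        if not tracks:
            # Empty playlists have no defined mean; skip them in this sketch.
            continue
        stats[playlist["pid"]] = {
            "popularity_mean": mean(popularity[t["track_uri"]] for t in tracks),
            "ms_mean": mean(t["duration_ms"] for t in tracks),
        }
    return stats


stats = playlist_stats(playlists)
for pid, values in stats.items():
    print(pid, values)
```

Counting playlists rather than raw occurrences keeps a single playlist with many repeats of one track from inflating that track's popularity; whether the analysis in this branch makes the same choice is not stated here.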
