Resolve "Analysis of playlist corpus"
Closes #3
Performs exploratory data analysis (EDA) on the complete playlist data set to provide insights for sampling. Steps include:
- Popularity aggregations of songs and tracks
- Adding popularity stats per playlist
- Overview of most popular items and cross-check with official stats
- Density plots on continuous features (including new popularity feature)
- New ms_mean feature to represent playlist length on a track basis
- Bivariate analysis (correlations, p-values)
- Analysis of text in headers and descriptions (emojis, TF-IDF scores, raw term frequencies, and bigrams)
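
For reviewers, a rough sketch of how the popularity aggregation and the `ms_mean` feature from the steps above could be computed. This is not the code in this MR: the toy corpus, the field names, and the helper functions `track_popularity` and `playlist_stats` are hypothetical, and it assumes that popularity means the number of playlists a track appears in and that `ms_mean` is the mean track duration in milliseconds per playlist.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy corpus; the structure and field names are assumptions,
# not the project's actual schema. Each track is (track_id, duration_ms).
playlists = [
    {"name": "road trip", "tracks": [("t1", 200_000), ("t2", 180_000)]},
    {"name": "focus", "tracks": [("t1", 200_000), ("t3", 240_000), ("t2", 180_000)]},
    {"name": "chill", "tracks": [("t1", 200_000)]},
]


def track_popularity(playlists):
    """Count in how many playlists each track appears (assumed definition)."""
    counts = Counter()
    for playlist in playlists:
        # Use a set so a track repeated within one playlist is counted once.
        counts.update({track_id for track_id, _ in playlist["tracks"]})
    return counts


def playlist_stats(playlists, popularity):
    """Attach mean track popularity and mean track duration to each playlist."""
    stats = []
    for playlist in playlists:
        pops = [popularity[track_id] for track_id, _ in playlist["tracks"]]
        durations = [ms for _, ms in playlist["tracks"]]
        stats.append(
            {
                "name": playlist["name"],
                "popularity_mean": mean(pops),
                "ms_mean": mean(durations),  # assumed meaning of ms_mean
            }
        )
    return stats


popularity = track_popularity(playlists)
stats = playlist_stats(playlists, popularity)
```

On the real corpus the same aggregation would more likely be a group-by over a flattened track table; the plain-Python version is only meant to make the definitions explicit.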