Create training & dev sets
To move forward with experiments, we need to split the available data into training and validation (dev) sets.
Steps to complete the task might include the following:
- Coming up with a sampling (and potentially subsampling) method
- Develop solid reasoning behind splits
- Potentially create reproducible sampling methods
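
One possible way to make the sampling reproducible is to derive each item's split from a hash of its ID rather than from a random draw. The sketch below is only an illustration: the function names, the `salt` value, the 10% dev fraction, and the idea that every item has a stable string ID are all assumptions, not decisions made in this task.

```python
import hashlib


def assign_split(item_id: str, dev_fraction: float = 0.1, salt: str = "split-v1") -> str:
    """Deterministically assign an item to 'train' or 'dev' from its ID.

    Hypothetical helper: assumes every item has a stable string ID.
    """
    digest = hashlib.sha256(f"{salt}:{item_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a roughly uniform value in [0, 1].
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "dev" if bucket < dev_fraction else "train"


def split_ids(item_ids, dev_fraction: float = 0.1, salt: str = "split-v1"):
    """Split an iterable of IDs into (train, dev) lists, preserving input order."""
    train, dev = [], []
    for item_id in item_ids:
        if assign_split(item_id, dev_fraction, salt) == "dev":
            dev.append(item_id)
        else:
            train.append(item_id)
    return train, dev
```

Because the assignment depends only on the ID and the salt, re-running the split or adding new items later does not move existing items between sets. Changing the salt produces a different, equally reproducible split.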
Additional data sets for validation purposes (last.fm, AOTM, etc.) will be added at a later point.
Edited by Andreas Rubin-Schwarz