Danceformer - End to End Dance Step Prediction on Chart Difficulty
Master's Thesis by Cassandra Grzonkowski
Installation
- Clone and setup repository
$ git clone https://gitlab.com/cassandra.grzonkowski/danceformer.git
$ cd danceformer/
$ pip install -r requirements.txt
- Load some songs from https://zenius-i-vanisher.com/v5.2/simfiles.php, for example the following pack: https://zenius-i-vanisher.com/v5.2/viewsimfilecategory.php?categoryid=127.
First Preliminary Experiment - Check Note Distances and BPMs
Shows statistics of the downloaded songs found under the given path.
ToDo: also make use of multiple BPMs
$ python preliminary_exp_main.py
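The BPM statistics this relies on could be gathered roughly as follows; the #BPMS field is standard in StepMania .sm/.ssc simfiles, while the function name and surrounding code are only illustrative:

```python
import re

def parse_bpms(simfile_text):
    """Extract the #BPMS field of a StepMania simfile as (beat, bpm) pairs."""
    match = re.search(r"#BPMS:([^;]*);", simfile_text)
    if not match:
        return []
    pairs = []
    for entry in match.group(1).split(","):
        beat, bpm = entry.strip().split("=")
        pairs.append((float(beat), float(bpm)))
    return pairs

# A song with one BPM change mid-way:
text = "#TITLE:Example;\n#BPMS:0.000=140.000,64.000=70.000;"
print(parse_bpms(text))  # [(0.0, 140.0), (64.0, 70.0)]
```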
Create (Mel-)Spectrogram images from music
Given a .ogg file or multiple files (a path to the folder), create a (mel-)spectrogram.
$ python audio_to_image.py
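In sketch form (the actual script may rely on a library such as librosa; the parameters here are illustrative defaults), a magnitude spectrogram is a windowed short-time Fourier transform:

```python
import numpy as np

def magnitude_spectrogram(samples, n_fft=1024, hop=256):
    """Frame the signal, apply a Hann window, take FFT magnitudes per frame."""
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(samples) - n_fft + 1, hop):
        frame = samples[start:start + n_fft] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames).T  # shape: (n_fft // 2 + 1, n_frames)

# 1 s of a 440 Hz tone at a 22050 Hz sampling rate
sr = 22050
t = np.arange(sr) / sr
spec = magnitude_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (513, 83): frequency bins x time frames
```

A mel-spectrogram additionally maps the frequency axis through a mel filter bank before saving the result as an image.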
Naive Approach
Given songs with their difficulty and mel-spectrogram images, train a CNN to preprocess the images and an MLP to preprocess the difficulty (yielding a tensor each). Afterwards, feed the preprocessed image into a Transformer encoder and the preprocessed difficulty into the Transformer decoder.
ToDo: encode/decode the chart as a sequence of tokens via a dictionary and feed it to the decoder
$ python main.py
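The ToDo above, encoding a chart as a token sequence via a dictionary, might look like this; the four-character rows mimic .sm note rows (e.g. '1000' = left arrow), and all names are illustrative:

```python
def build_vocabulary(charts):
    """Map every distinct note row to an integer id, after special tokens."""
    vocab = {"<pad>": 0, "<bos>": 1, "<eos>": 2}
    for chart in charts:
        for row in chart:
            if row not in vocab:
                vocab[row] = len(vocab)
    return vocab

def encode(chart, vocab):
    """Wrap a chart's rows in begin/end tokens and map them to ids."""
    return [vocab["<bos>"]] + [vocab[row] for row in chart] + [vocab["<eos>"]]

charts = [["1000", "0100", "0000", "0011"], ["1000", "0000"]]
vocab = build_vocabulary(charts)
print(encode(charts[1], vocab))  # [1, 3, 5, 2]
```

Decoding is the inverse lookup, stopping at the first `<eos>` token.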
In order to save everything, the following arguments need to be specified:
$ python main.py --save_vocabulary --save_dataset --folder --model_params_path --save_dir --details_dir --save_dir_models
- save_vocabulary: folder the vocabulary is saved in.
- save_dataset: folder the dataset is saved in.
- folder: folder containing the song packs, i.e. the original data of our training dataset.
- save_dir_models: folder the model files are saved in.
- model_params_path: exact file path under which the model parameters are saved.
- save_dir: folder all remaining data is saved in.
- details_dir: folder in which details such as accuracy during training are saved.
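Presumably main.py wires these flags up with argparse, roughly like this sketch (the argument names follow the list above; everything else is illustrative):

```python
import argparse

def build_parser():
    """Sketch of main.py's saving-related command-line flags."""
    parser = argparse.ArgumentParser(description="Danceformer training/evaluation")
    parser.add_argument("--save_vocabulary", help="folder the vocabulary is saved in")
    parser.add_argument("--save_dataset", help="folder the dataset is saved in")
    parser.add_argument("--folder", help="folder with the original song packs")
    parser.add_argument("--model_params_path", help="exact path for the model files")
    parser.add_argument("--save_dir", help="folder for all remaining data")
    parser.add_argument("--details_dir", help="folder for details such as accuracy")
    parser.add_argument("--save_dir_models", help="folder the model files are saved in")
    return parser

args = build_parser().parse_args(["--folder", "songpacks/", "--save_dir", "out/"])
print(args.folder)  # songpacks/
```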
Furthermore, we need to specify whether we train or evaluate:
$ python main.py --eval
For evaluation, we need to provide a model to load. A model can also be loaded during training. In order to load an already trained model, provide its path via:
$ python main.py --load_model_params_path
Optionally, one can adjust the batch size and the number of epochs, and save the spectrograms created during training to a provided folder:
$ python main.py --batch_size --number_epochs --specs_folder
In order to load an already created dataset, we need the paths to the vocabulary and the dataset:
$ python main.py --get_vocabulary --dataset
As a special case, if the data has already been created in a folder, we can provide it via
$ python main.py --built_dataset --built_dataset_charts
Charts are provided separately because they can change per thresholding and approach.
Note that the arguments --old_version and --old_version_voc, as well as --model_params_path_nan_problem, were only needed during changes to the data building in earlier experiments and are irrelevant for newly created datasets.
ET approach and ST approach
Go to branch fix_prediction_combine_tokens.
Call it in the same way as before. Chart generation uses the script combine_tokens to create the different charts.
For the ET approach, set the parameter version_mines_rolls in preprocess to False; for the ST approach, set it to True.
Thresholding is applied after chart creation (which can be done separately with the script built_data): run the script thresholding before training. After the built_data and thresholding scripts, call the main script and provide the corresponding built_dataset and built_dataset_charts folders.
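The exact thresholding rule is not spelled out here; one plausible reading, sketched below under that assumption, is that tokens rarer than a cutoff are replaced by a fallback token:

```python
from collections import Counter

def threshold_tokens(sequences, min_count, unk="<unk>"):
    """Replace tokens occurring fewer than min_count times with a fallback."""
    counts = Counter(tok for seq in sequences for tok in seq)
    return [[tok if counts[tok] >= min_count else unk for tok in seq]
            for seq in sequences]

seqs = [["0000", "1000", "0110"], ["0000", "1000", "0000"]]
print(threshold_tokens(seqs, min_count=2))
# [['0000', '1000', '<unk>'], ['0000', '1000', '0000']]
```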
Analyse token occurrence
In order to analyse the token occurrence, use the script:
$ python analyse_token_occurrence.py
Analyse zero occurrence
In order to analyse the zero token occurrence, use the script:
$ python analyse_zero_occurrence.py
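In sketch form, both analyses amount to counting note rows; the assumption here is that the zero token is the empty row '0000':

```python
from collections import Counter

def analyse(charts, zero_token="0000"):
    """Return per-token counts and the share of empty rows across all charts."""
    counts = Counter(tok for chart in charts for tok in chart)
    total = sum(counts.values())
    zero_share = counts[zero_token] / total if total else 0.0
    return counts, zero_share

charts = [["0000", "1000", "0000"], ["0100", "0000"]]
counts, zero_share = analyse(charts)
print(counts["0000"], round(zero_share, 2))  # 3 0.6
```

A heavily skewed zero share is exactly what motivates the thresholding described above.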
Conditioning Testing
For this, go to branch test_difficulty.
This version is adjusted for evaluation: during evaluation it runs the difficulty test, counting how often a chart generated for a given difficulty matches its corresponding difficulty rather than any of the other difficulties.
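The difficulty test can be pictured as a matching count: for each conditioning difficulty, check whether the generated chart is closest to the reference chart of that same difficulty. The toy row-wise distance below is a stand-in for whatever the branch actually uses:

```python
def count_matches(generated, references):
    """Count difficulties whose generated chart is closest to the
    reference chart of the same difficulty."""
    def distance(a, b):
        # Toy distance: differing rows plus the length difference.
        return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))
    matches = 0
    for diff, gen in generated.items():
        best = min(references, key=lambda d: distance(gen, references[d]))
        matches += best == diff
    return matches

refs = {"Easy": ["0000", "1000"], "Hard": ["1111", "0110", "1001"]}
gens = {"Easy": ["0000", "0100"], "Hard": ["1111", "0110", "0001"]}
print(count_matches(gens, refs))  # 2
```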
Visualization
In order to visualize the results gathered in the details folder during training and evaluation, we provide:
$ python visualize.py
To afterwards visualize a confusion matrix that is only returned in the log files:
$ python visualize_matrix.py
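visualize_matrix.py presumably plots such a matrix; computing one from true and predicted labels is straightforward (the plotting itself is left to the script):

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

classes = ["Easy", "Medium", "Hard"]
y_true = ["Easy", "Easy", "Hard", "Medium"]
y_pred = ["Easy", "Medium", "Hard", "Medium"]
print(confusion_matrix(y_true, y_pred, classes))
# [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
```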