# dlc2action.project

## Project interface
The most convenient way to use `dlc2action` is through the high-level project interface. It is defined in the
`project` module, and its main functions are managing configuration files and keeping track of experiments.
When you create a `project.Project` instance with a previously unused name, it generates a new project folder with results,
history and configuration files.
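The generated folder skeleton can be sketched with the standard library. The layout below mirrors the tree shown next, but the helper name `create_project_skeleton` is illustrative and is not part of the dlc2action API:

```python
from pathlib import Path

# Illustrative sketch of the folder skeleton a new project gets;
# create_project_skeleton is NOT part of the dlc2action API.
def create_project_skeleton(root: str) -> Path:
    project = Path(root)
    for sub in ("config", "meta", "saved_datasets"):
        (project / sub).mkdir(parents=True, exist_ok=True)
    for sub in ("logs", "models", "searches", "splits", "suggestions", "predictions"):
        (project / "results" / sub).mkdir(parents=True, exist_ok=True)
    return project
```

Calling `create_project_skeleton("project_name")` produces exactly the top-level directories listed in the tree; the files inside them appear later, as experiments are run.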
```
.
project_name
├── config
├── meta
├── saved_datasets
└── results
    ├── logs
    │   └── episode.txt
    ├── models
    │   └── episode
    │       ├── epoch25.pt
    │       └── epoch50.pt
    ├── searches
    │   └── search
    │       ├── search_param_importances.html_docs
    │       └── search_contour.html_docs
    ├── splits
    │   ├── time_25.0%validation_10.0%test.txt
    │   └── random_20.0%validation_10.0%test.txt
    ├── suggestions
    │   └── active_learning
    │       ├── video1_suggestion.pickle
    │       ├── video2_suggestion.pickle
    │       └── al_points.pickle
    └── predictions
        ├── episode_epoch25.pickle
        └── episode_epoch50_newdata.pickle
```
Here is an explanation of this structure.
The **config** folder contains .yaml configuration files. `Project` instances can read them into a parameter dictionary and update them. Those readers recognize several blank values for parameters that can be inferred from the data at runtime:
* `'dataset_features'` will be replaced with the shape of the per-frame features in the data,
* `'dataset_classes'` will be replaced with the number of classes,
* `'dataset_inverse_weights'` (in losses.yaml) will be replaced with a list of float values that are inversely proportional to the number of frames labeled with the corresponding classes,
* `'dataset_len_segment'` will be replaced with the length of the segments in the data,
* `'model_features'` will be replaced with the shape of the per-frame features in the model feature extraction output (the input to the SSL modules).
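A minimal sketch of this substitution mechanism, assuming a plain dictionary config and a pre-computed `data_info` mapping (the function name `fill_blanks` and the inferred values are invented for illustration; dlc2action's actual readers are more involved):

```python
# Illustrative sketch of blank substitution in a config dictionary;
# fill_blanks and the data_info keys are assumptions, not the real API.
def fill_blanks(config: dict, data_info: dict) -> dict:
    blanks = {
        "dataset_features": data_info["feature_shape"],
        "dataset_classes": data_info["num_classes"],
        "dataset_len_segment": data_info["len_segment"],
    }
    # replace recognized blank strings, leave every other value untouched
    return {
        key: blanks.get(value, value) if isinstance(value, str) else value
        for key, value in config.items()
    }

config = {"num_classes": "dataset_classes", "lr": 1e-3, "len_segment": "dataset_len_segment"}
filled = fill_blanks(config, {"feature_shape": (2, 16), "num_classes": 4, "len_segment": 128})
```

Here `"dataset_classes"` and `"dataset_len_segment"` are swapped for the inferred values, while `lr` passes through unchanged.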
Pickled history files go in the **meta** folder. They are all pandas dataframes that store the relevant task
parameters, a summary of the experiment results (where applicable) and some meta information, such as additional
parameters or the time when the record was added. There are separate files for the history of training episodes,
hyperparameter searches, predictions, saved datasets and active learning file generations. The classes that handle
those files are defined in the `meta` module.
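The record-keeping can be pictured as appending one row per experiment to a pickled dataframe. This is only a sketch of the pattern: the helper `add_record`, the file name and the columns are invented here and are not the `meta` module's actual interface:

```python
import pandas as pd
from pathlib import Path

# Illustrative history-file sketch: one row per experiment record;
# add_record and the column names are assumptions, not the real meta API.
def add_record(history_file: str, record: dict) -> pd.DataFrame:
    path = Path(history_file)
    history = pd.read_pickle(path) if path.exists() else pd.DataFrame()
    history = pd.concat([history, pd.DataFrame([record])], ignore_index=True)
    history.to_pickle(path)  # persist the updated history
    return history
```

Each call reloads the pickle, appends the new record and writes the file back, so the history survives across sessions.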
When a dataset is generated (the features are extracted and cut), it is saved in the **saved_datasets** folder. Every time you create a new task, `Project` will check the saved dataset records and load pre-computed features if they exist. You can always safely delete the saved datasets to reclaim space with the `remove_datasets()` function.
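This caching behavior is a standard compute-or-load pattern, sketched below. The helper name and the key-based file naming are assumptions for illustration, not the actual dlc2action implementation:

```python
import pickle
from pathlib import Path

# Illustrative compute-or-load pattern for saved datasets; the helper and the
# key-based naming scheme are assumptions, not the real implementation.
def load_or_compute_dataset(saved_dir: str, key: str, compute):
    path = Path(saved_dir) / f"{key}.pickle"
    if path.exists():  # a matching saved dataset exists: reuse it
        with open(path, "rb") as f:
            return pickle.load(f)
    dataset = compute()  # otherwise extract and cut the features
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(dataset, f)
    return dataset
```

The expensive `compute()` step runs only on the first call for a given key; later calls with the same key load the pickle instead.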
Everything else is stored in the **results** folder. The text training log files go into the **logs** subfolder. Model
checkpoints (with `'model_state_dict'`, `'optimizer_state_dict'` and `'epoch'` keys) are saved in the **models**
subfolder. The main results of hyperparameter searches (the best parameters and best values) are kept in the meta files,
but searches also generate html_docs plots that can be accessed in the **searches** subfolder. Split text files can be found
in the **splits** subfolder. They are also checked every time you create a task, and if a split with the same
parameters already exists, it will be loaded instead of generated anew. Active learning files are saved in the **suggestions** subfolder.
Suggestions for each video are named *{video_id}_suggestion.pickle*, and the active learning file is always
*al_points.pickle*. Finally, prediction files (pickled dictionaries) are stored in the **predictions** subfolder.
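The split-reuse check can be sketched as building the canonical file name from the split parameters and testing whether it already exists. The name formatting mirrors the files in the tree above (e.g. *time_25.0%validation_10.0%test.txt*), but the two helpers themselves are illustrative, not part of dlc2action:

```python
from pathlib import Path

# Illustrative sketch of split-file reuse; split_filename and
# find_existing_split are assumptions, not the real dlc2action API.
def split_filename(method: str, val_frac: float, test_frac: float) -> str:
    return f"{method}_{val_frac * 100:.1f}%validation_{test_frac * 100:.1f}%test.txt"

def find_existing_split(splits_dir: str, method: str, val_frac: float, test_frac: float):
    # return the path to a matching split file, or None if a new one is needed
    path = Path(splits_dir) / split_filename(method, val_frac, test_frac)
    return path if path.exists() else None

print(split_filename("time", 0.25, 0.1))  # time_25.0%validation_10.0%test.txt
```

When `find_existing_split` returns a path, the stored split is loaded; otherwise a new split is generated and written under that name, so identical parameters always map to the same file.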
Copyright 2020-present by A. Mathis Group and contributors. All rights reserved. This project and all its files are licensed under GNU AGPLv3 or later version; a copy is included in dlc2action/LICENSE.AGPL.