dlc2action
dlc2action is an action segmentation package that makes running and tracking experiments easy.
Usage
dlc2action is designed to be modular.
You can use the high-level project interface for convenient experiment management or just import a metric
or an SSL module if you want more freedom. Here are some tutorials for using the package.
Project
Project is the class that can create and maintain configuration files and keep track of your experiments.
Creating
To start a new project, you can create a new dlc2action.project.project.Project instance in python.
from dlc2action.project import Project
project = Project(
'project_name',
data_type='data_type',
annotation_type='annotation_type',
data_path='path/to/data/folder',
annotation_path='path/to/annotation/folder',
)
Alternatively, if you have installed the package with pip, you can run a command in your terminal.
$ dlc2action_init --name project_name -d data_type -a annotation_type -dp path/to/data_folder -ap path/to/annotation_folder
A new folder will be created at projects_path/project_name with all the necessary files. The default projects path is
a DLC2Action folder that will be created in your home directory.
The project structure looks like this.
.
project_name
├── config # Settings files
├── meta # Project meta files (experiment records)
├── saved_datasets # Pre-computed dataset files
└── results
├── logs # Training logs (human readable)
│ └── episode.txt
├── models # Model checkpoints
│ └── episode
│ ├── epoch25.pt
│ └── epoch50.pt
├── searches # Hyperparameter search results (graphs)
│ └── search
│ ├── search_param_importances.html_docs
│ └── search_contour.html_docs
├── splits # Split files
│ ├── time_25.0%validation_10.0%test.txt
│ └── random_20.0%validation_10.0%test.txt
├── suggestions # Suggestion and active learning files
│ └── active_learning
│ ├── video1_suggestion.pickle
│ ├── video2_suggestion.pickle
│ └── al_points.pickle
└── predictions # Prediction files (pickled dictionaries)
├── episode_epoch25.pickle
└── episode_epoch50_newdata.pickle
You can find a more detailed explanation of the structure at dlc2action.project.
After the project is created, you can modify the
parameters manually in the project_name/config folder or with the project.update_parameters() function.
Make sure to fill in all the fields marked with ???.
This can also be done through the terminal. With an --all flag the command will iterate through all parameters;
otherwise it will only ask you to fill in the ??? blanks.
$ dlc2action_fill --name project_name --all
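For illustration, here is a sketch of updating parameters from python instead; it assumes that project.update_parameters() accepts a nested dictionary of updates shaped like the config files (mirroring the parameters_update argument of run_episode shown later in this tutorial).
from dlc2action.project import Project

project = Project('project_name')
# A sketch (assumed signature): first-level keys are config file names,
# values are the parameters to overwrite.
project.update_parameters(
    parameters_update={
        'general': {'model_name': 'ms_tcn3'},
        'training': {'num_epochs': 100, 'batch_size': 32},
    },
)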
Training
When you want to start your experiments, just create a dlc2action.project.project.Project instance again
(or use the one you created
to initialize the project). This time you don't have to set any parameters except the project name (and, if
you used it when creating the project, projects_path).
from dlc2action.project import Project
project = Project('project_name')
The first thing you will want to do is train some models. There are three ways to run a training episode
in dlc2action.
Run a single episode
project.run_episode('episode_1')
We have now run a training episode with the default project parameters (read from the configuration files) and saved it in the meta files under the name episode_1.
Run multiple episodes in a row
project.run_episodes(['episode_2', 'episode_3', 'episode_4'])
That way the dlc2action.task.universal_task.Task instance will not be re-created every time, which might save you some time.
Continue a previously run episode
project.continue_episode('episode_2', num_epochs=500)
In case you decide you want to continue an older episode, you can load all parameters and state dictionaries and set a new number of epochs. That way training will go on from where it has stopped and in the end the episode will be re-saved under the same name. Note that num_epochs denotes the new total number of epochs, so if episode_2 has already been trained for 300 epochs, for example, it will now run for 200 epochs more, not 500.
Of course, changing the default parameters every time you want to run a new configuration is not very convenient.
And, luckily, you don't have to do that. Instead you can add a parameters_update parameter to
dlc2action.project.project.Project.run_episode (or parameters_updates to
dlc2action.project.project.Project.run_episodes; all other parameters generalize to multiple episodes
in a similar way). The third
function does not take many additional parameters since it aims to continue an experiment from exactly where
it ended.
project.run_episode(
'episode_5',
parameters_update={
'general': {'ssl': ['contrastive']},
'ssl': {'contrastive': {'ssl_weight': 0.01}},
},
)
In order to find the parameters you can modify, just open the config folder of your project and browse through
the files or call dlc2action_fill in your terminal (see [creating] section). The first-level keys
are the filenames without the extension ('augmentations', 'data', 'general', 'losses',
'metrics', 'ssl', 'training'). Note that there are no
model.yaml or features.yaml files there for the 'model' and 'features' keys in the parameter
dictionaries. Those parameters are read from the files in the model and features folders that correspond
to the options you set in the 'general' dictionary. For example, if at general.yaml model_name is set to
'ms_tcn3', the 'model' dictionary will be read from model/ms_tcn3.yaml.
If you want to create a new episode with modified parameters that loads a previously trained model, you can do that
by adding a load_episode parameter. By default we will load the last saved epoch, but you can also specify
which epoch you want with load_epoch. In that case the closest checkpoint will be chosen.
project.run_episode(
'episode_6',
parameters_update={'training': {'batch_size': 64}},
load_episode='episode_2',
load_epoch=100,
)
Optimizing
dlc2action also has tools for easy hyperparameter optimization. We use the optuna auto-ML package to perform
the searches and then save the best parameters and the search graphs (contour and parameter importance).
The best parameters
can then be loaded when you run an episode or saved as the defaults in the configuration files.
To start a search, you need to run the project.project.Project.run_hyperparameter_search command. Let's
say we want to optimize for four parameters: overlap length, SSL task type, learning rate and
number of feature maps in the model. Here is the process
to figure out what we want to run.
Find the parameter names
We look those parameters up in the config files and find them in data.yaml, general.yaml, training.yaml and models/ms_tcn3.yaml, respectively. That means that our parameter names are 'data/overlap', 'general/ssl', 'training/lr' and 'model/num_f_maps'.
Define the search space
There are five types of search spaces in dlc2action:
int: integer values (uniform sampling),
int_log: integer values (logarithmic scale sampling),
float: float values (uniform sampling),
float_log: float values (logarithmic sampling),
categorical: choice between several values.
The first four are defined with their minimum and maximum value while categorical requires a list of possible values. So the search spaces are described with tuples that look either like (search_space_type, min, max) or like ("categorical", list_of_values).
We suspect that the optimal overlap is somewhere between 10 and 80 frames, SSL tasks should be either only contrastive or contrastive and masked_features together, a sensible learning rate is between 10^-2 and 10^-4 and the number of feature maps should be between 8 and 64. That makes our search spaces ("int", 10, 80) for the overlap, ("categorical", [["contrastive"], ["contrastive", "masked_features"]]) for the SSL tasks, ("float_log", 1e-4, 1e-2) for the learning rate and ("int_log", 8, 64) for the feature maps.
Choose the search parameters
You need to decide which metric you are optimizing for and for how many trials. Note that it has to be one of the metrics you are computing: check metric_functions at general.yaml and add your metric if it's not there. The direction parameter determines whether this metric is minimized or maximized. The metric can also be averaged over a few of the most successful epochs (average parameter). If you want to use optuna's pruning feature, set prune to True.
You can also use parameter updates and load older experiments, as in the [training] section of this tutorial.
Here we will maximize the recall averaged over 5 top epochs and run 50 trials with pruning.
Now we are ready to run!
project.run_hyperparameter_search(
search_space={
"data/overlap": ("int", 10, 80),
"general/ssl": (
"categorical",
[["contrastive"], ["contrastive", "masked_features"]]
),
"training/lr": ("float_log": 1e-4, 1e-2),
"model/num_f_maps": ("int_log", 8, 64),
},
metric="recall",
n_trials=50,
average=5,
search_name="search_1",
direction="maximize",
prune=True,
)
After a search is finished, the best parameters are saved in the meta files and search graphs are saved at
project_name/results/searches/search_1. You can see the best parameters by running
project.project.Project.list_best_parameters:
project.list_best_parameters('search_1')
Those results can also be loaded in a training episode or saved in the configuration files directly.
project.update_parameters(
load_search='search_1',
load_parameters=['training/lr', 'model/num_f_maps'],
round_to_binary=['model/num_f_maps'],
)
project.run_episode(
'episode_best_params',
load_search='search_1',
load_parameters=['data/overlap', 'general/ssl'],
)
In this example we saved the learning rate and the number of feature maps in the configuration files and
loaded the other parameters to run
the episode_best_params training episode. Note how we used the round_to_binary parameter.
It will round the number of feature maps to the closest power of two (7 to 8, 35 to 32 and so on). This is useful
for parameters like the number of features or the batch size.
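The rounding itself is easy to reproduce; here is a minimal standalone sketch of the idea (not the library's internal implementation).
import math

def round_to_power_of_two(value: int) -> int:
    # Round a positive integer to the nearest power of two (7 -> 8, 35 -> 32).
    lower = 2 ** math.floor(math.log2(value))
    upper = lower * 2
    return lower if value - lower <= upper - value else upper

print(round_to_power_of_two(7))   # 8
print(round_to_power_of_two(35))  # 32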
Exploring results
After you run a bunch of experiments you will likely want to get an overview.
Visualization
You can get a feeling for the predictions made by a model by running project.project.Project.visualize_results.
project.visualize_results('episode_1', load_epoch=50)
This command will generate a prediction for a random sample and visualize it compared to the ground truth. There are a lot of parameters you can customize; check them out in the documentation.
Another available visualization type is training curve comparison with project.project.Project.plot_episodes.
You can compare different metrics and modes across several episodes or within one. For example, this command
will plot the validation accuracy curves for the two episodes.
project.plot_episodes(['episode_1', 'episode_2'], metrics=['accuracy'])
And this will plot training and validation recall curves for episode_3.
project.plot_episodes(['episode_3'], metrics=['recall'], modes=['train', 'val'])
You can also plot several episodes as one curve. That can be useful, for example, with episodes 2 and 6
in our tutorial, since episode_6 loaded the model from episode_2.
project.plot_episodes([['episode_2', 'episode_6'], 'episode_4'], metrics=['precision'])
Tables
Alternatively, you can start analyzing your experiments by putting them in a table. You can get a summary of
your training episodes with project.project.Project.list_episodes. It provides you with three ways to filter
the data.
Episode names
You can directly say which episodes you want to look at and disregard all others. Let's say we want to see episodes 1, 2, 3 and 4.
Parameters
In a similar fashion, you can choose to only display certain parameters. See all available parameters by running
project.list_episodes().columns. Here we are only interested in the time of adding the record, the recall results and the learning rate.
Value filter
The last option is to filter by parameter values. Filters are defined by strings that look like this:
'{parameter_name}::{sign}{value}'. You can use as many filters as you want and separate them with a comma. The parameter names are the same as in point 2, the sign can be either >, >=, <, <= or = and you choose the value. Let's say we want to only get the episodes that took at least 30 minutes to train and got to a recall that is higher than 0.4. That translates to 'results/recall::>0.4,meta/training_time::>=00:30:00'.
Putting it all together, we get this command.
project.list_episodes(
episodes=['episode_1', 'episode_2', 'episode_3', 'episode_4'],
display_parameters=['meta/time', 'results/recall', 'training/lr'],
episode_filter='results/recall::>0.4,meta/training_time::>=00:30:00',
)
There are similar functions for summarizing other history files: project.project.Project.list_searches,
project.project.Project.list_predictions, project.project.Project.list_suggestions and they all follow the
same pattern.
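The summary appears to be a table (it exposes a .columns attribute), so you can also post-process it yourself; the sketch below assumes that list_episodes returns a pandas DataFrame and that the displayed parameter names are usable as column labels.
# A sketch, assuming project.list_episodes() returns a pandas DataFrame.
table = project.list_episodes(
    display_parameters=['meta/time', 'results/recall', 'training/lr'],
)
# Sort by recall and keep the five best runs.
best = table.sort_values('results/recall', ascending=False).head(5)
print(best)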
Making predictions
When you feel good about one of the models, you can move on to making predictions. There are two ways to do that:
generate pickled prediction dictionaries with project.project.Project.run_prediction or active learning
and suggestion files for the [GUI] with project.project.Project.run_suggestion.
By default the predictions will be made for the entire dataset at the data path of the project. However, you
can also choose to only make them for the training, validation or test subset (set with the mode parameter) or
even for entirely new data (set with the data_path parameter). Note that if you set the data_path
value, the split
parameters will be disregarded and mode will be forced to 'all'!
Here is an example of running these functions.
project.run_prediction('prediction_1', episode_name='episode_3', load_epoch=150, mode='test')
project.run_suggestion(
'suggestion_new_data',
suggestion_episode='episode_4',
suggestion_classes=['sleeping', 'eating'],
suggestion_threshold=0.6,
exclude_classes=['inactive'],
data_path='/path/to/new_data_folder',
)
The first command will generate a prediction dictionary with the model from epoch 150 of episode_3 for the test
subset of the project data (split according to the configuration files) and save it at
project_name/results/predictions/prediction_1.pickle. The second command will create suggestion and active
learning files for data at /path/to/new_data_folder and save them at
project_name/results/suggestions/suggestion_new_data. There are a lot of parameters you can tune here to
get exactly what you need, so we really recommend reading the documentation of the
project.project.Project.run_suggestion function before using it.
The history of prediction and suggestion runs is recorded; the [exploring] section describes how to access it.
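Since prediction files are pickled dictionaries, inspecting one from python is straightforward; a minimal sketch (the exact dictionary layout depends on your data) could look like this.
import pickle

# Load a prediction dictionary saved by run_prediction.
with open('project_name/results/predictions/prediction_1.pickle', 'rb') as f:
    prediction = pickle.load(f)

# Print the top-level keys to see how the dictionary is organized.
print(type(prediction), list(prediction)[:5])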
Contribution
Here you can learn how to add new elements to dlc2action.
Models
Models in dlc2action are formalized as model.base_model.Model instances. This class inherits from
torch.nn.Module but adds a more formalized structure and defines interaction with SSL modules.
Note that SSL modules and base models are added separately to allow for 'mixing and matching', so make sure
the model you want to add only predicts the target value.
The process to add a new model would be as follows.
Separate your model into feature extraction and prediction
If we wanted to add an SSL module to your model, where would it go? Everything up to that point is your feature extractor and everything after is the prediction generator.
Write the modules
Create a {model_name}_modules.py file at the dlc2action/models folder and write all necessary modules there.
Formalize input and output
Make sure that your feature extraction and prediction generation modules accept a single input and return a single output.
Write the model
Create a {model_name}.py file at the dlc2action/models folder and write the code for the
model.base_model.Model child class. The model.base_model.Model._feature_extractor and model.base_model.Model._predictor functions need to return your feature extraction and prediction generation modules, respectively. See the model.ms_tcn code for reference.
Add options
Add your model to the models dictionary at dlc2action.options. Add {'{model_name}.{class_name}.dump_patches': False, '{model_name}.{class_name}.training': False} to the __pdoc__ dictionary at dlc2action.model.__init__.
Add config parameters
Read the [config] explanation and add your parameters.
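Putting the steps above together, a minimal skeleton might look like the sketch below. The class and method names (model.base_model.Model, _feature_extractor, _predictor) come from the steps above; the import path, constructor arguments and module shapes are assumptions for illustration only.
import torch
from torch import nn
from dlc2action.model.base_model import Model  # assumed import path


class MyFeatureExtractor(nn.Module):
    # Everything up to the point where an SSL module could be attached.
    def __init__(self, dims, num_f_maps):
        super().__init__()
        self.conv = nn.Conv1d(dims, num_f_maps, kernel_size=3, padding=1)

    def forward(self, x):
        # x: (batch, features, frames) -> (batch, num_f_maps, frames)
        return torch.relu(self.conv(x))


class MyPredictor(nn.Module):
    # Everything after the feature extractor: maps features to class scores.
    def __init__(self, num_f_maps, num_classes):
        super().__init__()
        self.head = nn.Conv1d(num_f_maps, num_classes, kernel_size=1)

    def forward(self, x):
        return self.head(x)


class MyModel(Model):
    # A sketch of a Model child class; the constructor arguments are illustrative.
    def __init__(self, dims, num_f_maps, num_classes, **kwargs):
        self.dims, self.num_f_maps, self.num_classes = dims, num_f_maps, num_classes
        super().__init__(**kwargs)

    def _feature_extractor(self):
        return MyFeatureExtractor(self.dims, self.num_f_maps)

    def _predictor(self):
        return MyPredictor(self.num_f_maps, self.num_classes)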
SSL
SSL tasks are formalized as ssl.base_ssl.SSLConstructor instances. Read the documentation at ssl to
learn more about this and follow the process.
Create a new file
Create a new file for your constructor at dlc2action/ssl and start defining a
ssl.base_ssl.SSLConstructor child class.
Determine the type of your task
You can find the descriptions of available SSL types at ssl.base_ssl.SSLConstructor. If what you want to do does not fit any of those types, you can add a new one. In that case you will have to modify the model.base_model code. More specifically, add the new name to the available_ssl_types variable and define the interaction at model.base_model.Model.forward. Set the type class variable to the type you decided on.
Define the data transformation
You need to write a function that will take a feature dictionary (see feature_extraction) as input and return SSL input and SSL target as output. Read more about that at ssl. The transformation method of your class should perform that function.
Define the network module
Take a look at ssl.modules and choose one of the existing modules or add your own. Remember that its forward method has to receive the output of a base model's feature extraction module (see the [model] tutorial for more details) as input and return a value that will go into the SSL loss function as output. Your class's construct_module method needs to return an instance of this module.
Define the loss function
Choose your loss at dlc2action.loss or add a new one. If you do decide to add a new loss, see the [loss] tutorial. Set the loss method of your class to apply this loss.
Add options
Add your constructor to the ssl_constructors dictionary at dlc2action.options and set a default loss weight in the ssl_weights dictionary at training.yaml.
Add config parameters
Read the [config] explanation and add your parameters.
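As an illustration of the steps above, a constructor for a hypothetical feature-masking task might be sketched like this; the type value, the transformation signature and the feature layout are assumptions, so check ssl.base_ssl.SSLConstructor for the real contract.
import torch
from torch import nn
from dlc2action.ssl.base_ssl import SSLConstructor  # assumed import path


class MaskedFeaturesSSL(SSLConstructor):
    # A sketch of an SSL constructor that reconstructs randomly masked features.
    type = "ssl_input"  # assumed value; pick one of the available SSL types

    def __init__(self, num_features=128, frac_masked=0.1):
        super().__init__()
        self.num_features = num_features
        self.frac_masked = frac_masked
        self._mse = nn.MSELoss()

    def transformation(self, sample_data: dict):
        # Turn a feature dictionary into (SSL input, SSL target); the layout is assumed.
        target = torch.cat(list(sample_data.values()), dim=0)
        masked = target.clone()
        masked[torch.rand_like(masked) < self.frac_masked] = 0
        return masked, target

    def construct_module(self):
        # Receives the base model's extracted features and reconstructs the input.
        return nn.Conv1d(self.num_features, self.num_features, kernel_size=1)

    def loss(self, predicted, target):
        return self._mse(predicted, target)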
Datasets
Datasets in dlc2action are defined through data.base_store.InputStore and data.base_store.BehaviorStore
classes. Before you start, please read the documentation at data to learn more about the assumptions we make
and the different classes involved in data processing. Then, follow these instructions.
Figure out what you need to add
What are the data and annotation types of your dataset? Can you find either of them at
options.input_stores and options.annotation_stores? If so, you are in luck. If not, have a look at data.annotation_store and data.input_store. It's likely that the existing classes already do most of what you need. For annotation, in most cases it should be enough to inherit from data.annotation_store.ActionSegmentationStore and only change the data.annotation_store.ActionSegmentationStore._open_annotations function. With input stores it can be more complicated, but you should definitely use the implementations at data.input_store as an example.
Write the class
Create a new class at either data.input_store or data.annotation_store, inherit from whatever is the closest to what you need and implement all abstract functions.
Add options
Add your store to options.annotation_stores or options.input_stores.
Add config parameters
Read the [config] explanation and add your parameters.
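For the common annotation case described above, the subclassing step might be sketched like this; the _open_annotations signature and the returned format are assumptions based on the description, not the library's exact contract.
from collections import defaultdict
from dlc2action.data.annotation_store import ActionSegmentationStore  # assumed import path


class CSVAnnotationStore(ActionSegmentationStore):
    # A sketch of a custom annotation store for a hypothetical CSV format.
    def _open_annotations(self, filename: str) -> dict:
        # Assumed output: {clip_id: {behavior: [[start_frame, end_frame], ...]}}.
        times = defaultdict(lambda: defaultdict(list))
        with open(filename) as f:
            for line in f:
                clip_id, behavior, start, end = line.strip().split(',')
                times[clip_id][behavior].append([int(start), int(end)])
        return {clip: dict(behaviors) for clip, behaviors in times.items()}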
Losses
Adding a new loss is fairly straightforward since dlc2action uses torch.nn.Module instances as losses.
Write the class
Add a new loss class inheriting from
torch.nn.Module at dlc2action.loss. Add it to one of the existing submodules or create a new one if it isn't a good fit for anything. Generally, the forward method of your loss should take prediction and target tensors as input, in that order, and return a float value.
Add options
If you want your new loss to be an option for the main loss function in training, you should add it to the dlc2action.options.losses dictionary. Note that in that case it has to have the input and output formats described above. If your loss expects input from a multi-stage model, like MS-TCN, add its name to dlc2action.options.losses_multistage as well.
Add config parameters
Read the [config] explanation and add your parameters.
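Following the prediction-then-target convention above, a new loss might be sketched like this (an illustrative class, not an existing dlc2action loss).
import torch
from torch import nn


class FocalCrossEntropy(nn.Module):
    # A sketch of a custom loss: cross-entropy down-weighted for easy frames.
    def __init__(self, gamma=2.0):
        super().__init__()
        self.gamma = gamma
        self.ce = nn.CrossEntropyLoss(reduction='none')

    def forward(self, prediction, target):
        # prediction: (batch, classes, frames) class scores; target: (batch, frames) labels.
        ce = self.ce(prediction, target)
        p = torch.exp(-ce)
        return ((1 - p) ** self.gamma * ce).mean()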
Metrics
All metrics in dlc2action inherit from metric.base_metric.Metric. They store internal parameters that get updated
with each batch and are then used to compute the metric value. You need to write the functions for resetting the
parameters, updating them and calculating the value.
Write the class
Add a new class at
metric.metrics, inheriting from metric.base_metric.Metric, and write the three abstract methods (read the documentation for details).
Add options
Add your new metric to the dlc2action.options.metrics dictionary. Next, if it is supposed to decrease with increasing prediction quality, add its name to the dlc2action.options.metrics_minimize list, and if it does not say anything about how good the predictions are, add it to dlc2action.options.metrics_no_direction. If it increases when predictions get better, you don't need to do anything else.
Add config parameters
If you added your metric to the options at the previous step, you also need to add it to the config files. Read the [config] explanation and do that.
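As a sketch, a frame-wise accuracy metric could combine the three pieces like this; the method names reset, update and calculate are assumptions for the abstract methods, so check metric.base_metric.Metric for the exact interface.
from dlc2action.metric.base_metric import Metric  # assumed import path


class FrameAccuracy(Metric):
    # A sketch of a metric with internal counters updated batch by batch.
    def reset(self):
        # Assumed abstract method: clear the internal parameters.
        self.correct = 0
        self.total = 0

    def update(self, prediction, target):
        # Assumed abstract method and signature: update the counters with one batch.
        labels = prediction.argmax(dim=1)  # (batch, classes, frames) -> (batch, frames)
        self.correct += (labels == target).sum().item()
        self.total += target.numel()

    def calculate(self):
        # Assumed abstract method: compute the final value from the counters.
        return self.correct / max(self.total, 1)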
Feature extractors
Feature extractors in dlc2action are defined fairly loosely. You need to make a new class that inherits from
one of the feature_extraction.FeatureExtractor subclasses (e.g. feature_extraction.PoseFeatureExtractor). Each
of those subclasses is defined for a subclass of data.base_store.InputStore for a specific broad type of
data (e.g. data.base_store.PoseInputStore for pose estimation data).
When extracting the features, data.base_store.InputStore instances will pass a data dictionary to the feature
extractor and they will also contain the methods to extract information from that dictionary. You only
need to use that information (coordinates, list of body parts, number of frames and so on) to compute
some new features (distances, speeds, accelerations). Read the feature_extraction documentation to find out more
about that. When you're done reading, you can start with these instructions.
Write the class
Create a new class at
feature_extraction that inherits from a feature_extraction.FeatureExtractor subclass. Look at the methods available for information extraction and write the extract_features function that will transform that information into some new features.
Add a transformer
Since you've added some new features, dlc2action does not know how to handle them when adding augmentations. You need to either create a new transformer.base_transformer.Transformer or modify an existing one to fix that. Read the [augmentations] tutorial to find out how.
Add options
Add your feature extractor to dlc2action.options.feature_extractors and set the corresponding transformer at dlc2action.options.extractor_to_transformer.
Add config parameters
Read the [config] explanation and add your parameters.
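A sketch of the first step for pose data might look like this; the extract_features signature and the information-extraction helpers (get_bodyparts, get_coords) are assumptions, so check feature_extraction.PoseFeatureExtractor and data.base_store.PoseInputStore for the real interface.
import numpy as np
from dlc2action.feature_extraction import PoseFeatureExtractor  # assumed import path


class SpeedFeatureExtractor(PoseFeatureExtractor):
    # A sketch of a feature extractor that adds per-keypoint speeds.
    def extract_features(self, data_dict: dict, clip_id: str) -> dict:
        # Assumed helpers for reading the data dictionary passed by the input store.
        bodyparts = self.get_bodyparts()
        coords = np.stack(
            [self.get_coords(data_dict, clip_id, bp) for bp in bodyparts], axis=1
        )  # (frames, bodyparts, 2)
        speeds = np.linalg.norm(np.diff(coords, axis=0, prepend=coords[:1]), axis=-1)
        return {'speed_joints': speeds}  # the feature key name is an assumption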
Augmentations
Augmentations in dlc2action are carried out by transformer.base_transformer.Transformer instances.
Since they need to know how to handle different features, each transformer handles the output of specific
feature_extraction.FeatureExtractor subclasses (only one transformer corresponds to each feature extractor,
but each transformer can
work with multiple feature extractors). Hence the instructions.
Figure out whether you need a new transformer
If the feature extractor you are interested in already has a corresponding transformer, you only need to add a new method to that transformer. You can find which transformer you need at
dlc2action.options.extractor_to_transformer. If the feature extractor is new, it might still be enough to modify an existing transformer. Take a look at transformer and see if any of the existing classes are close enough to what you need. If they are not, you will have to create a new class.
Add a new class (skip this step if you are not creating a new transformer)
If you did decide to create a new class, create a new {transformer_name}.py file at dlc2action.transformer and write it there, inheriting from transformer.base_transformer.Transformer.
Add augmentation functions
Add new methods to your transformer class. Each method has to perform one augmentation, take (main_input: dict, ssl_inputs: list, ssl_targets: list) as input and return an output of the same format. Here main_input is a feature dictionary of the sample data, ssl_inputs is a list of SSL input feature dictionaries and ssl_targets is a list of SSL target feature dictionaries. The same augmentations are applied to all inputs if they are not None and have feature keys that are recognized by your transformer.
When you are done writing the methods, add them with their names to the dictionary returned by transformer.base_transformer.augmentations_dict. If you think that your augmentation suits a very wide range of datasets, add its name to the list returned by transformer.base_transformer.default_augmentations as well.
Add options (skip this step if you are not creating a new transformer)
Add your new transformer to dlc2action.options.transformers.
Add config parameters
Read the [config] explanation and add your parameters.
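An augmentation method with the signature described above might be sketched like this; the 'coords' feature key is an assumption, and the method would still need to be registered in the dictionary returned by augmentations_dict.
import torch


def add_jitter(self, main_input: dict, ssl_inputs: list, ssl_targets: list):
    # A sketch of a Transformer method: add small Gaussian noise to coordinate features.
    for features in [main_input, *ssl_inputs, *ssl_targets]:
        if features is None:
            continue
        if 'coords' in features:  # the feature key name is an assumption
            features['coords'] = features['coords'] + 0.01 * torch.randn_like(features['coords'])
    return main_input, ssl_inputs, ssl_targets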
Config files
It is important to modify the config files when you add something new to dlc2action since it is supposed to
mainly be used through the project interface that relies on the files being up-to-date.
To add configuration parameters for a new model, feature extractor, input or annotation store, create a new file in the model, features, data or annotation subfolder of dlc2action/config, respectively. If you are adding a new loss, metric or SSL constructor, add a new dictionary to the corresponding file at the same folder. Make sure to use the same name as what you've added to the options file. Then just list all necessary parameters (use the same names as in the class signature), their default values and an explanation in the commentary.
The commentary for every parameter has to start with its supposed type. Use one of 'bool', 'int', 'float',
'str', 'list' or 'set' or define a set of possible values with square brackets (e.g. ['pose', 'image']).
The type has to be followed with a semicolon and a short explanation. Please try to keep the explanations one line
long. Here is an example:
frame_limit: 15 # int; tracklets shorter than this number of frames will be discarded
Note that if a parameter that you need is already defined at dlc2action/config/general.yaml, it will be pulled automatically, there is no need to add it to your new list of parameters. Just make sure that the name of the parameter in your class signature and at general.yaml is the same.
In addition, there are also several blanks you can use to set the default parameters. Those blanks
are filled at runtime after the dataset is created. Read more about them at the project documentation.
When you are finished with defining the parameters, you should also describe them in the documentation (perhaps in more detail) at the [options] section.
Options
Here you can find an overview of the parameters currently available in the settings.
Data types
'dlc_tracklet': DLC tracklet data,
'dlc_track': DLC track data
Annotation types
'dlc': the classic_annotation GUI format,
'boris': BORIS format
Data
Annotation
BORIS and DLC
behaviors: set
the set of string names of the behaviors to predict
annotation_suffix: set
the set of string suffixes such that annotation file names have the format of {video_id}{annotation_suffix}
correction: dict
a dictionary of label corrections where keys are the names to be replaced and values are the replacements (e.g. {'sleping': 'sleeping', 'calm locomotion': 'locomotion'})
error_class: str
a class denoting errors in pose estimation (the intervals annotated with this class will be ignored)
min_frames_action: int
the minimum number of frames for an action (shorter actions will be discarded during training)
filter_annotated: bool
if true, dlc2action will discard long unannotated intervals during training
filter_background: bool
if true, only label frames as background if a behavior is annotated somewhere close (in the same segment)
filter_visibility: bool
if true, only label behaviors if there is sufficient visibility (a fraction of frames in a segment larger than visibility_min_frac has a visibility score larger than visibility_min_score; the visibility score is defined by data.base_store.InputStore instances and should generally be from 0 to 1)
visibility_min_score: float
the minimum visibility score for visibility filtering (generally from 0 to 1)
visibility_min_frac: float
the minimum fraction of visible frames for visibility filtering
CalMS21
No parameters
Input
DLC track and tracklet
data_suffix: set
the set of suffixes such that the data files have the format of {video_id}{data_suffix}
data_prefix: set
a set of prefixes used in file names for different views such that the data files have the format of {data_prefix}{video_id}{data_suffix}
feature_suffix: str
the string suffix such that the precomputed feature files are named {video_id}{feature_suffix}; the files should be either stored at data_path or included in file_paths
contert_int_indices: bool
if true, convert any integer key i in feature files to 'ind{i}'
canvas_shape: list
the size of the canvas where the pose was defined
len_segment: int
the length of segments (in frames) to cut the videos into
overlap: int
the overlap (in frames) between neighboring segments
ignored_bodyparts: set
the set of string names of bodyparts to ignore
default_agent_name: str
the default agent name used in pose file generation
frame_limit: int (only for tracklets)
tracklets shorter than this number of frames will be discarded
CalMS21
len_segment: int
the length of segments (in frames) to cut the videos into
overlap: int
the overlap (in frames) between neighboring segments
load_unlabeled: bool
if true, unlabeled data will be loaded
file_prefix: str
the prefix such that the data files are named {prefix}train.npy and {prefix}test.npy
Features
Kinematic
interactive: bool
if true, features are computed for pairs of clips (agents)
keys: set
the keys to include (a subset of {"coords", "intra_distance", "speed_joints", "angle_joints_radian", "acc_joints", "inter_distance"}, where 'coords' is the raw coordinates, 'intra_distance' is the distances between all pairs of keypoints, 'speed_joints' is the approximated vector speeds of all keypoints, 'angle_joints_radian' is the approximated angular speeds of all keypoints, 'acc_joints' is the approximated accelerations of all keypoints and 'inter_distance' is the distances between all pairs of keypoints where the points belong to different agents (used if interactive is true))
Augmentations
Kinematic
augmentations: set
a list of augmentations (a subset of {'rotate', 'mask', 'add_noise', 'shift', 'zoom', 'mirror'})
use_default_augmentations: bool
if true, the default augmentation list will be used (['mirror', 'shift', 'add_noise'])
rotation_limits: list
list of float rotation angle limits ([low, high]; default [-pi/2, pi/2])
mirror_dim: set
set of integer dimension indices that can be mirrored
noise_std: float
standard deviation of added noise
zoom_limits: list
list of float zoom limits ([low, high]; default [0.5, 1.5])
masking_probability: float
the probability of masking a joint
Model
MS-TCN3
num_layers_PG: int
number of layers in the prediction generation stage
num_layers_R: int
number of layers in the refinement stages
num_R: int
number of refinement stages
num_f_maps: int
number of feature maps
dims: int
number of input features (set to 'dataset_features' to get the number from dlc2action.data.dataset.BehaviorDataset.dataset_features())
dropout_rate: float
dropout rate
skip_connections_refinement: bool
if true, skip connections are added to the refinement stages
block_size_prediction: int
if not null, skip connections are added to the prediction generation stage with this interval
direction_R: str
causality of the refinement stages; choose from None (no causality), bidirectional (a combination of forward and backward networks), forward (causal) and backward (anticausal)
direction_PG: str
causality of the prediction generation stage (see direction_R)
shared_weights: bool
if true, weights are shared across refinement stages
block_size_refinement: int
if not 0, skip connections are added to the refinement stages with this interval
PG_in_FE: bool
if true, the prediction generation stage is included in the feature extractor
rare_dilations: bool
if true, dilation increases with layer less often
num_heads: int
the number of parallel refinement stages
General
model_name: str
model name; choose from:
'ms_tcn3': the original MS-TCN++ model with options to share weights across refinement stages or add skip connections
num_classes: int
number of classes (set to 'dataset_num_classes' to get the number from dlc2action.data.dataset.BehaviorDataset.num_classes())
exclusive: bool
if true, single-label classification is used; otherwise multi-label
ssl: set
a set of SSL types to use; choose from:
'contrastive': contrastive SSL with NT-Xent loss,
'pairwise': pairwise comparison SSL with triplet or circle loss,
'masked_features': prediction of randomly masked features with MSE loss,
'masked_frames': prediction of randomly masked frames with MSE loss,
'masked_joints': prediction of randomly masked joints with MSE loss,
'masked_features_tcn': prediction of randomly masked features with MSE loss and a TCN module,
'masked_frames_tcn': prediction of randomly masked frames with MSE loss and a TCN module,
'masked_joints_tcn': prediction of randomly masked joints with MSE loss and a TCN module
metric_functions: set
set of metric names; choose from:
'recall': frame-wise recall,
'segmental_recall': segmental recall,
'precision': frame-wise precision,
'segmental_precision': segmental precision,
'f1': frame-wise F1 score,
'segmental_f1': segmental F1 score,
'edit_distance': edit distance (as a fraction of length of segment),
'count': the fraction of predictions labeled with a specific behavior
loss_function: str
name of loss function; choose from:
'ms_tcn': a loss designed for the MS-TCN network; cross-entropy + MSE for consistency
feature_extraction: str
the feature extraction method; choose from:
'kinematic': extracts distances, speeds and accelerations for pose estimation data
only_load_annotated: bool
if true, the input files that don't have a matching annotation file will be disregarded
ignored_clips: set
a set of string clip ids (agent names) to be ignored
Losses
MS_TCN
weights: list
list of weights for weighted cross-entropy
focal: bool
if True, focal loss will be used
gamma: float
the gamma parameter of focal loss
alpha: float
the weight of consistency loss
Metrics
Recall, precision and F1 score (segmental and not)
main_class: int
if not null, recall will only be calculated for main_class
average: str
averaging method; choose from 'macro', 'micro' or 'none'
ignored_classes: set
a set of class ids to ignore in calculation
iou_threshold: float (only for segmental metrics)
intervals with IoU larger than this threshold are considered correct
Count
classes: set
a set of the class indices to count the occurrences of
SSL
Contrastive
ssl_features: int
length of clip feature vectors
tau: float
tau (NT-Xent loss parameter)
len_segment: int
length of the segments that enter the SSL module
num_f_maps: list
shape of the segments that enter the SSL module
Pairwise
ssl_features: int
length of clip feature vectors
margin: float
margin (triplet loss parameter)
distance: str
either 'cosine' or 'euclidean'
loss: str
either 'triplet' or 'circle'
gamma: float
gamma (triplet and circle loss parameter)
len_segment: int
length of the segments that enter the SSL module
num_f_maps: list
shape of the segments that enter the SSL module
Masked joints, features and frames
frac_masked: float
fraction of features to be masked
num_ssl_layers: int
number of layers in the SSL module
num_ssl_f_maps: int
number of feature maps in the SSL module
dims: int
number of features per frame in the original input data
num_f_maps: list
shape of the segments that enter the SSL module
Contrastive masked
ssl_features: int
length of clip feature vectors
tau: float
tau (NT-Xent loss parameter)
len_segment: int
length of the segments that enter the SSL module
num_f_maps: list
shape of the segments that enter the SSL module
num_masked: int
number of frames to mask
Training
lr: float
learning rate
device: str
device name (recognized by torch)
verbose: bool
print the training process
augment_train: int
either 1 to use augmentations during training or 0 to not use them
augment_val: int
number of augmentations to average over at validation
ssl_weights: dict
dictionary of SSL loss function weights (keys are SSL names, values are weights)
num_epochs: int
number of epochs
to_ram: bool
transfer the dataset to RAM for training (preferred if the dataset fits in working memory)
batch_size: int
batch size
freeze_features: bool
freeze the feature extractor parameters
ignore_tags: bool
ignore meta tags (meta tags are generated by some datasets and used by some models and can contain information such as annotator id); when True, all meta tags are set to None
model_save_epochs: int
interval for saving training checkpoints (the last epoch is always saved)
use_test: float
the fraction of the test dataset to use in training without labels (for SSL tasks)
partition_method: str
the train/test/val partitioning method; choose from:
'random': sort videos into subsets randomly,
'random:test-from-name' (or 'random:test-from-name:{name}'): sort videos into training and validation subsets randomly and create the test subset from the video ids that start with a specific substring ('test' by default, or name if provided),
'random:equalize:segments' and 'random:equalize:videos': sort videos into subsets randomly but make sure that for the rarest classes at least 0.8 * val_frac of the videos/segments that contain occurrences of the class get into the validation subset and 0.8 * test_frac get into the test subset; this is ensured for all classes in order of increasing number of occurrences until the validation and test subsets are full,
'val-from-name:{val_name}:test-from-name:{test_name}': create the validation and test subsets from the video ids that start with specific substrings (val_name for validation and test_name for test) and sort all other videos into the training subset,
'folders': read videos from folders named test, train and val into the corresponding subsets,
'time': split each video into training, validation and test subsequences,
'time:strict': split each video into validation, test and training subsequences and throw out the last segment in validation and test (to get rid of overlaps),
'file': split according to a split file
val_frac: float
fraction of the dataset to use as validation
test_frac: float
fraction of the dataset to use as test
split_path: str
path to the split file (only used when partition_method is 'file'; otherwise it is disregarded and filled automatically)
1# 2# Copyright 2020-present by A. Mathis Group and contributors. All rights reserved. 3# 4# This project and all its files are licensed under GNU AGPLv3 or later version. 5# A copy is included in dlc2action/LICENSE.AGPL. 6# 7""" 8`dlc2action` is an action segmentation package that makes running and tracking experiments easy. 9 10# Usage 11`dlc2action` is designed to be modular. 12You can use the high-level project interface for convenient experiment management or just import a metric 13or an SSL module if you want more freedom. Here are some tutorials for using the package. 14 15## Project 16Project is the class that can create and maintain configuration files and keep track of your experiments. 17 18### Creating 19[creating]: #creating 20To start a new project, you can create a new `dlc2action.project.project.Project` instance in python. 21```python 22from dlc2action.project import Project 23 24project = Project( 25 'project_name', 26 data_type='data_type', 27 annotation_type='annotation_type', 28 data_path='path/to/data/folder', 29 annotation_path='path/to/annotation/folder', 30) 31``` 32 33Alternatively, if you have installed the package with pip, you can run a command in your terminal. 34 35``` 36$ dlc2action_init --name project_name -d data_type -a annotation_type -dp path/to/data_folder -ap path/to/annotation_folder 37``` 38 39A new folder will be created at `projects_path/project_name` with all the necessary files. The default projects path is 40a `DLC2Action` folder that will be created in your home directory. 41The project structure looks like this. 42``` 43. 44project_name 45├── config # Settings files 46├── meta # Project meta files (experiment records) 47├── saved_datasets # Pre-computed dataset files 48└── results 49 ├── logs # Training logs (human readable) 50 │ └── episode.txt 51 ├── models # Model checkpoints 52 │ └── episode 53 │ ├── epoch25.pt 54 │ └── epoch50.pt 55 ├── searches # Hyperparameter search results (graphs) 56 │ └── search 57 │ ├── search_param_importances.html_docs 58 │ └── search_contour.html_docs 59 ├── splits # Split files 60 │ ├── time_25.0%validation_10.0%test.txt 61 │ └── random_20.0%validation_10.0%test.txt 62 ├── suggestions # Suggestion and active learning files 63 │ └── active_learning 64 │ ├── video1_suggestion.pickle 65 │ ├── video2_suggestion.pickle 66 │ └── al_points.pickle 67 └── predictions # Prediction files (pickled dictionaries) 68 ├── episode_epoch25.pickle 69 └── episode_epoch50_newdata.pickle 70``` 71 72You can find a more detailed explanation of the structure at `dlc2action.project`. 73 74After the poprojectrject is created you can modify the 75parameters manually in the `project_name/config` folder or with the `project.update_parameters()` function. 76Make sure to fill in all the fields marked with `???`. 77 78This can also be done through the terminal. With an `--all` flag the command will iterate through all parameters 79and otherwise it will only ask you to fill in the `???` blanks. 80 81``` 82$ dlc2action_fill --name project_name --all 83``` 84 85### Training 86[training]: #training 87When you want to start your experiments, just create a `dlc2action.project.project.Project` instance again 88(or use the one you created 89to initialize the project). This time you don't have to set any parameters except the project name (and, if 90you used it when creating the project, `projects_path`). 
91 92```python 93from dlc2action.project import Project 94project = Project('project_name') 95``` 96 97The first thing you will want to do is train some models. There are three ways to run a *training episode* 98in `dlc2action`. 99 1001. **Run a single episode** 101 102 ```python 103 project.run_episode('episode_1') 104 ``` 105 106 We have now run a training episode with the default project parameters (read from the configuration files) 107 and saved it in the meta files under the name `episode_1`. 108 1092. **Run multiple episodes in a row** 110 111 ```python 112 project.run_episodes(['episode_2', 'episode_3', 'episode_4']) 113 ``` 114 115 That way the `dlc2action.task.universal_task.Task` instance will not be 116 re-created every time, which might save you some time. 117 1183. **Continue a previously run episode** 119 120 ```python 121 project.continue_episode('episode_2', num_epochs=500) 122 ``` 123 124 In case you decide you want to continue an older episode, you can load all parameters and state dictionaries 125 and set a new number of epochs. That way training will go on from where it has stopped and in the end the 126 episode will be re-saved under the same name. Note that `num_epochs` denotes the **new total number of epochs**, 127 so if `episode_2` has already been trained for 300 epochs, for example, it will now run for 200 epochs more, 128 not 500. 129 130Of course, changing the default parameters every time you want to run a new configuration is not very convenient. 131And, luckily, you don't have to do that. Instead you can add a `parameters_update` parameter to 132`dlc2action.project.project.Project.run_episode` (or `parameters_updates` to 133`dlc2action.project.project.Project.run_episodes`; all other parameters generalize to multiple episodes 134in a similar way). The third 135function does not take many additional parameters since it aims to continue an experiment from exactly where 136it ended. 137 138```python 139project.run_episode( 140 'episode_5', 141 parameters_update={ 142 'general': {'ssl': ['contrastive']}, 143 'ssl': {'contrastive': {'ssl_weight': 0.01}}, 144 }, 145) 146``` 147 148In order to find the parameters you can modify, just open the *config* folder of your project and browse through 149the files or call `dlc2action_fill` in your terminal (see [creating] section). The first-level keys 150are the filenames without the extension (`'augmentations'`, `'data'`, `'general'`, `'losses'`, 151`'metrics'`, `'ssl'`, `'training'`). Note that there are no 152*model.yaml* or *features.yaml* files there for the `'model'` and `'features'` keys in the parameter 153dictionaries. Those parameters are **read from the files in the *model* and *features* folders** that correspond 154to the options you set in the `'general'` dictionary. For example, if at *general.yaml* `model_name` is set to 155`'ms_tcn3'`, the `'model'` dictionary will be read from *model/ms_tcn3.yaml*. 156 157If you want to create a new episode with modified parameters that loads a previously trained model, you can do that 158by adding a `load_episode` parameter. By default we will load the last saved epoch, but you can also specify 159which epoch you want with `load_epoch`. In that case the closest checkpoint will be chosen. 160 161```python 162project.run_episode( 163 'episode_6', 164 parameters_update={'training': {'batch_size': 64}}, 165 load_episode='episode_2', 166 load_epoch=100, 167) 168``` 169 170### Optimizing 171`dlc2action` also has tools for easy hyperparameter optimization. 
We use the `optuna` auto-ML package to perform 172the searches and then save the best parameters and the search graphs (contour and parameter importance). 173The best parameters 174can then be loaded when you are run an episode or saved as the defaults in the configuration files. 175 176To start a search, you need to run the `project.project.Project.run_hyperparameter_search` command. Let's 177say we want to optimize for four parameters: overlap length, SSL task type, learning rate and 178number of feature maps in the model. Here is the process 179to figure out what we want to run. 180 1811. **Find the parameter names** 182 183 We look those parameters up in the config files and find them in *data.yaml*, *general.yaml*, *training.yaml* 184 and *models/ms_tcn3.yaml*, 185 respectively. 186 That means that our parameter names are `'data/overlap'`, `'general/ssl'`, `'training/lr'` and 187 `'model/num_f_maps'`. 188 1892. **Define the search space** 190 191 There are five types of search spaces in `dlc2action`: 192 193 - `int`: integer values (uniform sampling), 194 - `int_log`: integer values (logarithmic scale sampling), 195 - `float`: float values (uniform sampling), 196 - `float_log`: float values (logarithmic sampling), 197 - `categorical`: choice between several values. 198 199 The first four are defined with their minimum and maximum value while `categorical` requires a list of 200 possible values. So the search spaces are described with tuples that look either like 201 `(search_space_type, min, max)` or like `("categorical", list_of_values)`. 202 203 We suspect that the optimal overlap is somewhere between 10 and 80 frames, SSL tasks should be either 204 only contrastive or contrastive and masked_features together, sensible learning rate is between 205 10<sup>-2</sup> and 10<sup>-4</sup> and the number of feature maps should be between 8 and 64. 206 That makes our search spaces `("int", 10, 80)` for the overlap, 207 `("categorical", [["contrastive"], ["contrastive", "masked_features"]])` for the SSL tasks, 208 `("float_log", 1e-4, 1e-2)` for the learning rate and ("int_log", 8, 64) for the feature maps. 209 2103. **Choose the search parameters** 211 212 You need to decide which metric you are optimizing for and for how many trials. Note that it has to be one 213 of the metrics you are computing: check `metric_functions` at *general.yaml* and add your 214 metric if it's not there. The `direction` parameter determines whether this metric is minimized or maximized. 215 The metric can also be averaged over a few of the most successful epochs (`average` parameter). If you want to 216 use `optuna`'s pruning feature, set `prune` to `True`. 217 218 You can also use parameter updates and load older experiments, as in the [training] section of this tutorial. 219 220 Here we will maximize the recall averaged over 5 top epochs and run 50 trials with pruning. 221 222Now we are ready to run! 223 224```python 225project.run_hyperparameter_search( 226 search_space={ 227 "data/overlap": ("int", 10, 80), 228 "general/ssl": ( 229 "categorical", 230 [["contrastive"], ["contrastive", "masked_features"]] 231 ), 232 "training/lr": ("float_log": 1e-4, 1e-2), 233 "model/num_f_maps": ("int_log", 8, 64), 234 }, 235 metric="recall", 236 n_trials=50, 237 average=5, 238 search_name="search_1", 239 direction="maximize", 240 prune=True, 241) 242``` 243 244After a search is finished, the best parameters are saved in the meta files and search graphs are saved at 245*project_name/results/searches/search_1*. 
You can see the best parameters by running 246`project.project.Project.list_best_parameters`: 247 248```python 249project.list_best_parameters('search_1') 250``` 251Those results can also be loaded in a training episode or saved in the configuration files directly. 252 253```python 254project.update_parameters( 255 load_search='search_1', 256 load_parameters=['training/lr', 'model/num_f_maps'], 257 round_to_binary=['model/num_f_maps'], 258) 259project.run_episode( 260 'episode_best_params', 261 load_search='search_1', 262 load_parameters=['data/overlap', 'general/ssl'], 263) 264``` 265In this example we saved the learning rate and the number of feature maps in the configuration files and 266loaded the other parameters to run 267the `episode_best_params` training episode. Note how we used the `round_to_binary` parameter. 268It will round the number of feature maps to the closest power of two (7 to 8, 35 to 32 and so on). This is useful 269for parameters like the number of features or the batch size. 270 271### Exploring results 272[exploring]: #exploring-results 273After you run a bunch of experiments you will likely want to get an overview. 274 275#### Visualization 276You can get a feeling for the predictions made by a model by running `project.project.Project.visualize_results`. 277 278```python 279project.visualize_results('episode_1', load_epoch=50) 280``` 281This command will generate a prediction for a random sample and visualize it compared to the ground truth. 282There is a lot of parameters you can customize, check them out in the documentation). 283 284Another available visualization type is training curve comparison with `project.project.Project.plot_episodes`. 285You can compare different metrics and modes across several episode or within one. For example, this command 286will plot the validation accuracy curves for the two episodes. 287 288```python 289project.plot_episodes(['episode_1', 'episode_2'], metrics=['accuracy']) 290``` 291And this will plot training and validation recall curves for `episode_3`. 292 293```python 294project.plot_episodes(['episode_3'], metrics=['recall'], modes=['train', 'val']) 295``` 296You can also plot several episodes as one curve. That can be useful, for example, with episodes 2 and 6 297in our tutorial, since `episode_6` loaded the model from `episode_2`. 298 299```python 300project.plot_episodes([['episode_2', 'episode_6'], 'episode_4'], metrics=['precision']) 301``` 302 303#### Tables 304Alternatively, you can start analyzing your experiments by putting them in a table. You can get a summary of 305your training episodes with `project.project.Project.list_episodes`. It provides you with three ways to filter 306the data. 307 3081. **Episode names** 309 310 You can directly say which episodes you want to look at and disregard all others. Let's say we want to see 311 episodes 1, 2, 3 and 4. 312 3132. **Parameters** 314 315 In a similar fashion, you can choose to only display certain parameters. See all available parameters by 316 running `project.list_episodes().columns`. Here we are only interested in the time of adding the record, 317 the recall results and the learning rate. 318 3193. **Value filter** 320 321 The last option is to filter by parameter values. Filters are defined by strings that look like this: 322 `'{parameter_name}::{sign}{value}'`. You can use as many filters as ypu want and separate them with a comma. 
323 The parameter names are the same as in point 2, the sign can be either 324 `>`, `>=`, `<`, `<=` or `=` and you choose the value. Let's say we want to only get the episodes that took 325 at least 30 minutes to train and got to a recall that is higher than 0.4. That translates to 326 `'results/recall::>0.4,meta/training_time::>=00:30:00'`. 327 328Putting it all together, we get this command. 329 330```python 331project.list_episodes( 332 episodes=['episode_1', 'episode_2', 'episode_3', 'episode_4'], 333 display_parameters=['meta/time', 'results/recall', 'training/lr'], 334 episode_filter='results/recall::>0.4,meta/training_time::>=00:30:00', 335) 336``` 337 338There are similar functions for summarizing other history files: `project.project.Project.list_searches`, 339`project.project.Project.list_predictions`, `project.project.Project.list_suggestions` and they all follow the 340same pattern. 341 342### Making predictions 343When you feel good about one of the models, you can move on to making predictions. There are two ways to do that: 344generate pickled prediction dictionaries with `project.project.Project.run_prediction` or active learning 345and suggestion files for the [GUI] with `project.project.Project.run_suggestion`. 346 347By default the predictions will be made for the entire dataset at the data path of the project. However, you 348can also choose to only make them for the training, validation or test subset (set with `mode` parameter) or 349even for entirely new data (set with `data_path` parameter). Note that if you set the `data_path` 350value, **the split 351parameters will be disregarded** and `mode` will be forced to `'all'`! 352 353Here is an example of running these functions. 354 355```python 356project.run_prediction('prediction_1', episode_name='episode_3', load_epoch=150, mode='test') 357project.run_suggestion( 358 'suggestion_new_data', 359 suggestion_episode='episode_4', 360 suggestion_classes=['sleeping', 'eating'], 361 suggestion_threshold=0.6, 362 exclude_classes=['inactive'], 363 data_path='/path/to/new_data_folder', 364) 365``` 366The first command will generate a prediction dictionary with the model from epoch 150 of `episode_3` for the test 367subset of the project data (split according to the configuration files) and save it at 368*project_name/results/predictions/prediction_1.pickle*. The second command will create suggestion and active 369learning files for data at */path/to/new_data_folder* and save them at 370*project_name/results/suggestions/suggestion_new_data*. There is a lot of parameters you can tune here to 371get exactly what you need, so we really recommend reading the documentation of the 372`project.project.Project.run_suggestion` function before using it. 373 374The history of prediction and suggestion runs is recorded and at the [exploring] section 375we have described how to access it. 376 377[GUI]: https://github.com/amathislab/classic_annotation 378 379# Contribution 380Here you can learn how to add new elements to `dlc2action`. 381 382## Models 383[model]: #models 384Models in `dlc2action` are formalized as `model.base_model.Model` instances. This class inherits from 385`torch.nn.Module` but adds a more formalized structure and defines interaction with SSL modules. 386Note that **SSL modules and base models are added separately** to allow for 'mixing and matching', so make sure 387the model you want to add only predicts the target value. 388 389The process to add a new model would be as follows. 390 3911. 
**Separate your model into feature extraction and prediction** 392 393 If we want to add an SSL module to your module, where should it go? Everything up to that point is 394 your feature extractor and everything after is the prediction generator. 395 3962. **Write the modules** 397 398 Create a *{model_name}_modules.py* file at the *dlc2action/models* folder and write all necessary modules there. 399 4003. **Formalize input and output** 401 402 Make sure that your feature extraction and prediction generation modules accept a single input and return 403 a single output. 404 4054. **Write the model** 406 407 Create a *{model_name}.py* file at the *dlc2action/models* folder and write the code for the 408 `model.base_model.Model` child class. The `model.base_model.Model._feature_extractor` and 409 `model.base_model.Model._predictor` functions need to return your feature extraction and prediction 410 generation modules, respectively. See the `model.ms_tcn` code for reference. 411 4125. **Add options** 413 414 Add your model to the `models` dictionary at `dlc2action.options`. Add 415 `{'{model_name}.{class_name}.dump_patches': False, '{model_name}.{class_name}.training': False}` to the 416 `__pdoc__` dictionary at `dlc2action.model.__init__`. 417 4186. **Add config parameters** 419 420 Read the [config] explanation and add your parameters. 421 422## SSL 423SSL tasks are formalized as `ssl.base_ssl.SSLConstructor` instances. Read the documentation at `ssl` to 424learn more about this and follow the process. 425 4261. **Create a new file** 427 428 Create a new file for your constructor at *dlc2action/ssl* and start defining a `ssl.base_ssl.SSLConstructor` 429 child class. 430 4312. **Determine the type of your task** 432 433 You can find the descriptions of available SSL types at `ssl.base_ssl.SSLConstructor`. If what you want to do 434 does not fit any of those types, you can add a new one. In that case you will have to modify the 435 `model.base_model` 436 code. More specifically, add the new name to the `available_ssl_types` variable and define the interaction 437 at `model.base_model.Model.forward`. Set the `type` class variable to the type you decided on. 438 4393. **Define the data transformation** 440 441 You need to write a function that will take a *feature dictionary* (see `feature_extraction`) as input 442 and return SSL input and SSL target as output. Read more about that at `ssl`. The `transformation` method 443 of your class should perform that function. 444 4454. **Define the network module** 446 447 Take a look at `ssl.modules` and choose one of the existing modules or add your own. Remember that its 448 `forward` method has 449 to receive the output of a base model's feature extraction module (see the [model] tutorial for more details) 450 as input and return a value that will go into the SSL loss function as output. Your class's `construct_module` 451 method needs to return an instance of this module. 452 4535. **Define the loss function** 454 455 Choose your loss at `dlc2action.loss` or add a new one. 456 If you do decide to add a new loss, see the [loss] tutorial. 457 Set the `loss` method of your class to apply this loss. 458 4596. **Add options** 460 461 Add your model to the `ssl_constructors` dictionary at `dlc2action.options` and set a default loss weight 462 in the `ssl_weights` dictionary at *training.yaml*. 463 4647. **Add config parameters** 465 466 Read the [config] explanation and add your parameters. 
## Datasets
Datasets in `dlc2action` are defined through the `data.base_store.InputStore` and `data.base_store.BehaviorStore`
classes. Before you start, please read the documentation at `data` to learn more about the assumptions we make
and the different classes involved in data processing. Then follow these instructions.

1. **Figure out what you need to add**

    What are the data and annotation types of your dataset? Can you find either of them at `options.input_stores`
    and `options.annotation_stores`? If so, you are in luck. If not, have a look at `data.annotation_store`
    and `data.input_store`. It's likely that the existing classes already do most of what you need.
    For annotation, in most cases it should be enough to inherit from `data.annotation_store.ActionSegmentationStore`
    and only change the `data.annotation_store.ActionSegmentationStore._open_annotations` function. With input stores
    it can be more complicated, but you should definitely use the implementations at `data.input_store` as an example.

2. **Write the class**

    Create a new class at either `data.input_store` or `data.annotation_store`, inherit from whatever is closest
    to what you need and implement all abstract functions.

3. **Add options**

    Add your store to `options.annotation_stores` or `options.input_stores`.

4. **Add config parameters**

    Read the [config] explanation and add your parameters.

## Losses
[loss]: #losses
Adding a new loss is fairly straightforward since `dlc2action` uses `torch.nn.Module` instances as losses
(a minimal sketch follows the steps below).

1. **Write the class**

    Add a new loss class inheriting from `torch.nn.Module` at `dlc2action.loss`. Add it to one of the existing
    submodules or create a new one if it isn't a good fit for anything. Generally, the `forward` method of your
    loss should take prediction and target tensors as input, in that order, and return a float value.

2. **Add options**

    If you want your new loss to be an option for the main loss function in training, you should add
    it to the `dlc2action.options.losses` dictionary. Note that in that case it has to have the input and output
    formats described above. If your loss expects input from a multi-stage model, like MS-TCN, add its name to
    `dlc2action.options.losses_multistage` as well.

3. **Add config parameters**

    Read the [config] explanation and add your parameters.
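Here is the minimal sketch referenced above. Only the `forward(prediction, target)` convention comes from the
steps; the class itself is a made-up example.

```python
import torch
from torch import nn


class WeightedMSE(nn.Module):  # hypothetical example loss
    """Mean squared error scaled by a constant weight."""

    def __init__(self, weight: float = 1.0) -> None:
        super().__init__()
        self.weight = weight

    def forward(self, prediction: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # prediction and target come in that order; a single value goes out
        return self.weight * nn.functional.mse_loss(prediction, target)
```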
## Metrics
All metrics in `dlc2action` inherit from `metric.base_metric.Metric`. They store internal parameters that get updated
with each batch and are then used to compute the metric value. You need to write the functions for resetting the
parameters, updating them and calculating the value (a rough sketch follows the steps below).

1. **Write the class**

    Add a new class at `metric.metrics`, inheriting from `metric.base_metric.Metric`, and write the three
    abstract methods (read the documentation for details).

2. **Add options**

    Add your new metric to the `dlc2action.options.metrics` dictionary. Next, if it is supposed to decrease with
    increasing prediction quality, add its name to the `dlc2action.options.metrics_minimize` list, and if it does not
    say anything about how good the predictions are, add it to `dlc2action.options.metrics_no_direction`.
    If it increases when predictions get better, you don't need to do anything else.

3. **Add config parameters**

    If you added your metric to the options in the previous step, you also need to add it to the config files.
    Read the [config] explanation and do that.
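As a rough illustration of the three roles (resetting, updating and calculating), here is a sketch of a frame-wise
accuracy metric. The method names and signatures below are placeholders rather than the actual abstract interface,
so check `metric.base_metric.Metric` for the methods you really need to override.

```python
import torch

from dlc2action.metric.base_metric import Metric  # assumption: the import path mirrors the module layout


class FrameAccuracy(Metric):  # hypothetical example
    """Sketch only: the method names are placeholders for the real abstract methods."""

    def reset(self) -> None:
        # clear the internal parameters before a new evaluation run
        self.correct = 0
        self.total = 0

    def update(self, predicted: torch.Tensor, target: torch.Tensor) -> None:
        # update the internal parameters with one batch
        self.correct += int((predicted.argmax(dim=1) == target).sum())
        self.total += int(target.numel())

    def calculate(self) -> float:
        # compute the metric value from the accumulated parameters
        return self.correct / max(self.total, 1)
```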
## Feature extractors
Feature extractors in `dlc2action` are defined fairly loosely. You need to make a new class that inherits from
one of the `feature_extraction.FeatureExtractor` subclasses (e.g. `feature_extraction.PoseFeatureExtractor`). Each
of those subclasses is defined for a subclass of `data.base_store.InputStore` that covers a specific broad type of
data (e.g. `data.base_store.PoseInputStore` for pose estimation data).

When extracting the features, `data.base_store.InputStore` instances will pass a data dictionary to the feature
extractor and they will also contain the methods to extract information from that dictionary. You only
need to use that information (coordinates, list of body parts, number of frames and so on) to compute
some new features (distances, speeds, accelerations). Read the `feature_extraction` documentation to find out more
about that. When you're done reading, you can follow these instructions.

1. **Write the class**

    Create a new class at `feature_extraction` that inherits from a `feature_extraction.FeatureExtractor` subclass.
    Look at the methods available for information extraction and write the `extract_features` function that will
    transform that information into some new features.

2. **Add a transformer**

    Since you've added some new features, `dlc2action` does not know how to handle them when adding augmentations.
    You need to either create a new `transformer.base_transformer.Transformer` or modify an existing one to fix
    that. Read the [augmentations] tutorial to find out how.

3. **Add options**

    Add your feature extractor to `dlc2action.options.feature_extractors` and set the corresponding transformer
    at `dlc2action.options.extractor_to_transformer`.

4. **Add config parameters**

    Read the [config] explanation and add your parameters.

## Augmentations
[augmentations]: #augmentations
Augmentations in `dlc2action` are carried out by `transformer.base_transformer.Transformer` instances.
Since they need to know how to handle different features, each transformer handles the output of specific
`feature_extraction.FeatureExtractor` subclasses (only one transformer corresponds to each feature extractor,
but each transformer can work with multiple feature extractors). Hence these instructions.

1. **Figure out whether you need a new transformer**

    If the feature extractor you are interested in already has a corresponding transformer, you only need to
    add a new method to that transformer. You can find which transformer you need at
    `dlc2action.options.extractor_to_transformer`. If the feature extractor is new, it might still be enough to
    modify an existing transformer. Take a look at `transformer` and see if any of the existing classes are close
    enough to what you need. If they are not, you will have to create a new class.

2. **Add a new class** *(skip this step if you are not creating a new transformer)*

    If you did decide to create a new class, create a new *{transformer_name}.py* file at `dlc2action.transformer`
    and write it there, inheriting from `transformer.base_transformer.Transformer`.

3. **Add augmentation functions**

    Add new methods to your transformer class. Each method has to perform one augmentation, take
    `(main_input: dict, ssl_inputs: list, ssl_targets: list)`
    as input and return an output of the same format. Here `main_input` is a feature dictionary of the sample
    data, `ssl_inputs` is a list of SSL input feature dictionaries and `ssl_targets` is a list of SSL target
    feature dictionaries. The same augmentations are applied to all inputs as long as they are not `None` and
    have feature keys that are recognized by your transformer.

    When you are done writing the methods, add them with their names to the dictionary returned by
    `transformer.base_transformer.augmentations_dict`. If you think that your augmentation suits a very wide
    range of datasets, add its name to the list returned by `transformer.base_transformer.default_augmentations`
    as well.

4. **Add options** *(skip this step if you are not creating a new transformer)*

    Add your new transformer to `dlc2action.options.transformers`.

5. **Add config parameters**

    Read the [config] explanation and add your parameters.

## Config files
[config]: #config-files
It is important to modify the config files when you add something new to `dlc2action` since the package is supposed
to mainly be used through the `project` interface, which relies on the files being up-to-date.

To add configuration parameters for a new model, feature extractor,
input or annotation store, create a new file in the *model*, *features*,
*data* or *annotation* subfolder of *dlc2action/config*, respectively. If you are adding a new loss,
metric or SSL constructor, add a new dictionary to the corresponding file
in the same folder. **Make sure to use the same name as what you've added to the options file.** Then just
list all necessary parameters (use the same names as in the class signature), their default values
and an explanation in the commentary.

The commentary for every parameter has to start with its supposed type. Use one of `'bool'`, `'int'`, `'float'`,
`'str'`, `'list'` or `'set'`, or define a set of possible values with square brackets (e.g. `['pose', 'image']`).
The type has to be followed by a semicolon and a short explanation. Please try to keep the explanations one line
long. Here is an example:

```
frame_limit: 15 # int; tracklets shorter than this number of frames will be discarded
```

Note that if a parameter that you need is already defined at *dlc2action/config/general.yaml*, **it will be pulled
automatically**; there is no need to add it to your new list of parameters. Just make sure that the name of the
parameter in your class signature and at *general.yaml* is the same.

In addition, there are also several *blanks* you can use to set the default parameters. Those *blanks*
are filled at runtime after the dataset is created. Read more about them at the `project` documentation.

When you are finished with defining the parameters, you should also describe them in the documentation (perhaps in
more detail) at the [options] section.

# Options
[options]: #options
Here you can find an overview of the parameters currently available in the settings.
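Most of these parameters can be edited directly in the files in the *project_name/config* folder or passed through
`project.update_parameters()`. The nested dictionary layout below is an assumption made for illustration; check the
`project` documentation for the exact structure.

```python
from dlc2action.project import Project

project = Project('project_name')  # assumes the project already exists
# hypothetical sketch: the keys mirror the config sections described in this overview
project.update_parameters(
    {
        'general': {'model_name': 'ms_tcn3', 'exclusive': True},
        'training': {'lr': 1e-4, 'num_epochs': 100, 'batch_size': 32},
    }
)
```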
## Data types

 - `'dlc_tracklet'`: DLC tracklet data,
 - `'dlc_track'`: DLC track data.

## Annotation types

 - `'dlc'`: the classic_annotation GUI format,
 - `'boris'`: BORIS format.

## Data

### Annotation

#### BORIS and DLC

- `behaviors: set` <br />
  the set of string names of the behaviors to predict
- `annotation_suffix: set` <br />
  the set of string suffixes such that annotation file names have the format of
  *{video_id}{annotation_suffix}*
- `correction: dict` <br />
  a dictionary of label corrections where keys are the names to be replaced and values are the replacements
  (e.g. `{'sleping': 'sleeping', 'calm locomotion': 'locomotion'}`)
- `error_class: str` <br />
  a class denoting errors in pose estimation (the intervals annotated with this class will be ignored)
- `min_frames_action: int` <br />
  the minimum number of frames for an action (shorter actions will be discarded during training)
- `filter_annotated: bool` <br />
  if true, `dlc2action` will discard long unannotated intervals during training
- `filter_background: bool` <br />
  if true, only label frames as background if a behavior is annotated somewhere close (in the same segment)
- `filter_visibility: bool` <br />
  if true, only label behaviors if there is sufficient visibility (a fraction of frames in a segment larger than
  `visibility_min_frac` has a visibility score larger than `visibility_min_score`; the visibility score is defined by
  `data.base_store.InputStore` instances and should generally be from 0 to 1)
- `visibility_min_score: float` <br />
  the minimum visibility score for visibility filtering (generally from 0 to 1)
- `visibility_min_frac: float` <br />
  the minimum fraction of visible frames for visibility filtering

#### CalMS21

No parameters

### Input

#### DLC track and tracklet

- `data_suffix: set` <br />
  the set of suffixes such that the data files have the format of *{video_id}{data_suffix}*
- `data_prefix: set` <br />
  a set of prefixes used in file names for different views such that the data files have the format
  of *{data_prefix}{video_id}{data_suffix}*
- `feature_suffix: str` <br />
  the string suffix such that the precomputed feature files are named *{video_id}{feature_suffix}*;
  the files should be either stored at `data_path` or included in `file_paths`
- `contert_int_indices: bool` <br />
  if true, convert any integer key `i` in feature files to `'ind{i}'`
- `canvas_shape: list` <br />
  the size of the canvas where the pose was defined
- `len_segment: int` <br />
  the length of segments (in frames) to cut the videos into
- `overlap: int` <br />
  the overlap (in frames) between neighboring segments
- `ignored_bodyparts: set` <br />
  the set of string names of bodyparts to ignore
- `default_agent_name: str` <br />
  the default agent name used in pose file generation
- `frame_limit: int` (only for tracklets) <br />
  tracklets shorter than this number of frames will be discarded

#### CalMS21

- `len_segment: int` <br />
  the length of segments (in frames) to cut the videos into
- `overlap: int` <br />
  the overlap (in frames) between neighboring segments
- `load_unlabeled: bool` <br />
  if true, unlabeled data will be loaded
- `file_prefix: str` <br />
  the prefix such that the data files are named *{prefix}train.npy* and *{prefix}test.npy*
### Features

#### Kinematic

- `interactive: bool` <br />
  if true, features are computed for pairs of clips (agents)
- `keys: set` <br />
  the keys to include
  (a subset of `{"coords", "intra_distance", "speed_joints", "angle_joints_radian", "acc_joints", "inter_distance"}`,
  where
  - `'coords'` is the raw coordinates,
  - `'intra_distance'` is the distances between all pairs of keypoints,
  - `'speed_joints'` is the approximated vector speeds of all keypoints,
  - `'angle_joints_radian'` is the approximated angular speeds of all keypoints,
  - `'acc_joints'` is the approximated accelerations of all keypoints,
  - `'inter_distance'` is the distances between all pairs of keypoints where the points belong to different agents
    (used if `interactive` is true))

## Augmentations

### Kinematic

- `augmentations: set` <br />
  a set of augmentations (a subset of `{'rotate', 'mask', 'add_noise', 'shift', 'zoom', 'mirror'}`)
- `use_default_augmentations: bool` <br />
  if true, the default augmentation list will be used (`['mirror', 'shift', 'add_noise']`)
- `rotation_limits: list` <br />
  list of float rotation angle limits ([low, high]; default [-pi/2, pi/2])
- `mirror_dim: set` <br />
  set of integer dimension indices that can be mirrored
- `noise_std: float` <br />
  standard deviation of the added noise
- `zoom_limits: list` <br />
  list of float zoom limits ([low, high]; default [0.5, 1.5])
- `masking_probability: float` <br />
  the probability of masking a joint

## Model

### MS-TCN3

- `num_layers_PG: int` <br />
  number of layers in the prediction generation stage
- `num_layers_R: int` <br />
  number of layers in the refinement stages
- `num_R: int` <br />
  number of refinement stages
- `num_f_maps: int` <br />
  number of feature maps
- `dims: int` <br />
  number of input features (set to 'dataset_features' to get the number from
  `dlc2action.data.dataset.BehaviorDataset.dataset_features()`)
- `dropout_rate: float` <br />
  dropout rate
- `skip_connections_refinement: bool` <br />
  if true, skip connections are added to the refinement stages
- `block_size_prediction: int` <br />
  if not null, skip connections are added to the prediction generation stage with this interval
- `direction_R: str` <br />
  causality of the refinement stages; choose from:
  - `None`: no causality,
  - `bidirectional`: a combination of forward and backward networks,
  - `forward`: causal,
  - `backward`: anticausal
- `direction_PG: str` <br />
  causality of the prediction generation stage (see `direction_R` for the options)
- `shared_weights: bool` <br />
  if `True`, weights are shared across refinement stages
- `block_size_refinement: int` <br />
  if not 0, skip connections are added to the refinement stages with this interval
- `PG_in_FE: bool` <br />
  if `True`, the prediction generation stage is included in the feature extractor
- `rare_dilations: bool` <br />
  if `True`, dilation increases with layer less often
- `num_heads: int` <br />
  the number of parallel refinement stages

## General

- `model_name: str` <br />
  model name; choose from:
  - `'ms_tcn3'`: the original MS-TCN++ model with options to share weights across refinement stages or
    add skip connections.
- `num_classes: int` <br />
  number of classes (set to 'dataset_num_classes' to get the number from
  `dlc2action.data.dataset.BehaviorDataset.num_classes()`)
- `exclusive: bool` <br />
  if true, single-label classification is used; otherwise multi-label
- `ssl: set` <br />
  a set of SSL types to use; choose from:
  - `'contrastive'`: contrastive SSL with NT-Xent loss,
  - `'pairwise'`: pairwise comparison SSL with triplet or circle loss,
  - `'masked_features'`: prediction of randomly masked features with MSE loss,
  - `'masked_frames'`: prediction of randomly masked frames with MSE loss,
  - `'masked_joints'`: prediction of randomly masked joints with MSE loss,
  - `'masked_features_tcn'`: prediction of randomly masked features with MSE loss and a TCN module,
  - `'masked_frames_tcn'`: prediction of randomly masked frames with MSE loss and a TCN module,
  - `'masked_joints_tcn'`: prediction of randomly masked joints with MSE loss and a TCN module
- `metric_functions: set` <br />
  set of metric names; choose from:
  - `'recall'`: frame-wise recall,
  - `'segmental_recall'`: segmental recall,
  - `'precision'`: frame-wise precision,
  - `'segmental_precision'`: segmental precision,
  - `'f1'`: frame-wise F1 score,
  - `'segmental_f1'`: segmental F1 score,
  - `'edit_distance'`: edit distance (as a fraction of the length of the segment),
  - `'count'`: the fraction of predictions labeled with a specific behavior
- `loss_function: str` <br />
  name of the loss function; choose from:
  - `'ms_tcn'`: a loss designed for the MS-TCN network; cross-entropy + MSE for consistency
- `feature_extraction: str` <br />
  the feature extraction method; choose from:
  - `'kinematic'`: extracts distances, speeds and accelerations for pose estimation data
- `only_load_annotated: bool` <br />
  if true, input files that don't have a matching annotation file will be disregarded
- `ignored_clips: set` <br />
  a set of string clip ids (agent names) to be ignored

## Losses

### MS_TCN

- `weights: list` <br />
  list of weights for weighted cross-entropy
- `focal: bool` <br />
  if true, focal loss will be used
- `gamma: float` <br />
  the gamma parameter of focal loss
- `alpha: float` <br />
  the weight of the consistency loss

## Metrics

### Recall, precision and F1 score (segmental and not)

- `main_class: int` <br />
  if not null, the metric will only be calculated for `main_class`
- `average: str` <br />
  averaging method; choose from `'macro'`, `'micro'` or `'none'`
- `ignored_classes: set` <br />
  a set of class ids to ignore in the calculation
- `iou_threshold: float` (only for segmental metrics) <br />
  intervals with IoU larger than this threshold are considered correct

### Count

- `classes: set` <br />
  a set of the class indices to count the occurrences of

## SSL

### Contrastive

- `ssl_features: int` <br />
  length of clip feature vectors
- `tau: float` <br />
  tau (NT-Xent loss parameter)
- `len_segment: int` <br />
  length of the segments that enter the SSL module
- `num_f_maps: list` <br />
  shape of the segments that enter the SSL module

### Pairwise

- `ssl_features: int` <br />
  length of clip feature vectors
- `margin: float` <br />
  margin (triplet loss parameter)
- `distance: str` <br />
  either `'cosine'` or `'euclidean'`
- `loss: str` <br />
  either `'triplet'` or `'circle'`
- `gamma: float` <br />
  gamma (triplet and circle loss parameter)
- `len_segment: int` <br />
  length of the segments that enter the SSL module
- `num_f_maps: list` <br />
  shape of the segments that enter the SSL module

### Masked joints, features and frames

- `frac_masked: float` <br />
  fraction of features to be masked
- `num_ssl_layers: int` <br />
  number of layers in the SSL module
- `num_ssl_f_maps: int` <br />
  number of feature maps in the SSL module
- `dims: int` <br />
  number of features per frame in the original input data
- `num_f_maps: list` <br />
  shape of the segments that enter the SSL module

### Contrastive masked

- `ssl_features: int` <br />
  length of clip feature vectors
- `tau: float` <br />
  tau (NT-Xent loss parameter)
- `len_segment: int` <br />
  length of the segments that enter the SSL module
- `num_f_maps: list` <br />
  shape of the segments that enter the SSL module
- `num_masked: int` <br />
  number of frames to mask

## Training

- `lr: float` <br />
  learning rate
- `device: str` <br />
  device name (recognized by `torch`)
- `verbose: bool` <br />
  print the training process
- `augment_train: int` <br />
  either 1 to use augmentations during training or 0 to not use them
- `augment_val: int` <br />
  number of augmentations to average over at validation
- `ssl_weights: dict` <br />
  dictionary of SSL loss function weights (keys are SSL names, values are weights)
- `num_epochs: int` <br />
  number of epochs
- `to_ram: bool` <br />
  transfer the dataset to RAM for training (preferred if the dataset fits in working memory)
- `batch_size: int` <br />
  batch size
- `freeze_features: bool` <br />
  freeze the feature extractor parameters
- `ignore_tags: bool` <br />
  ignore meta tags (meta tags are generated by some datasets, are used by some models and can contain
  information such as annotator id); when `True`, all meta tags are set to `None`
- `model_save_epochs: int` <br />
  interval for saving training checkpoints (the last epoch is always saved)
- `use_test: float` <br />
  the fraction of the test dataset to use in training without labels (for SSL tasks)
- `partition_method: str` <br />
  the train/test/val partitioning method; choose from:
  - `'random'`: sort videos into subsets randomly,
  - `'random:test-from-name'` (or `'random:test-from-name:{name}'`): sort videos into training and validation
    subsets randomly and create the test subset from the video ids that start with a specific substring
    (`'test'` by default, or `name` if provided),
  - `'random:equalize:segments'` and `'random:equalize:videos'`: sort videos into subsets randomly while
    making sure that for the rarest classes at least `0.8 * val_frac` of the videos/segments that contain
    occurrences of the class get into the validation subset and `0.8 * test_frac` get into the test subset;
    this is ensured for all classes in order of increasing number of occurrences until the validation and
    test subsets are full,
  - `'val-from-name:{val_name}:test-from-name:{test_name}'`: create the validation and test
    subsets from the video ids that start with specific substrings (`val_name` for validation
    and `test_name` for test) and sort all other videos into the training subset,
  - `'folders'`: read videos from folders named *test*, *train* and *val* into corresponding subsets,
  - `'time'`: split each video into training, validation and test subsequences,
  - `'time:strict'`: split each video into validation, test and training subsequences
    and throw out the last segment in validation and test (to get rid of overlaps),
  - `'file'`: split according to a split file.
- `val_frac: float` <br />
  fraction of dataset to use as validation
- `test_frac: float` <br />
  fraction of dataset to use as test
- `split_path: str` <br />
  path to the split file (**only used when `partition_method` is `'file'`,
  otherwise disregarded and filled automatically**)
"""

from dlc2action.version import __version__, VERSION

__pdoc__ = {"options": False, "version": False, "scripts": False}