gaitmap.stride_segmentation.BaseDtw
- class gaitmap.stride_segmentation.BaseDtw(template: BaseDtwTemplate | dict[Union[collections.abc.Hashable, str], gaitmap_mad.stride_segmentation.dtw._dtw_templates.templates.BaseDtwTemplate] | None = None, resample_template: bool = True, find_matches_method: typing_extensions.Literal['min_under_thres', 'find_peaks'] = 'find_peaks', max_cost: float | None = None, min_match_length_s: float | None = None, max_match_length_s: float | None = None, max_template_stretch_ms: float | None = None, max_signal_stretch_ms: float | None = None, memory: Memory | None = None)
A basic implementation of subsequence dynamic time warping.
Note
This algorithm is only available via the gaitmap_mad package and distributed under an AGPL3 licence. To use it, you need to explicitly install the gaitmap_mad package. Learn more about that here.
This uses the DTW implementation of tslearn. This class offers a convenient wrapper around it by providing support for various datatypes to be used as inputs. These cover three main use cases (for examples of all of these, see the Examples section):
- A general purpose msDTW
If you require an msDTW with a class-based interface independent of the context of stride segmentation (or even this library), you can create a simple template from a (n x m) numpy array using create_dtw_template. This allows you to use just a simple numpy array as data input to the segment method. The data must have at least n samples and m columns. If it has more than m columns, the additional columns are ignored.
- A simple way to segment multiple sensors with the same template
If you are using the basic datatypes of this library, you can use this DTW implementation to easily apply a template to selected columns of multiple sensors. For this, the template is expected to be based on a pd.DataFrame wrapped in a DtwTemplate. The column names of this dataframe need to match the column names of the data the template should be applied to. The data can be passed as a single-sensor dataframe (the columns correspond to individual sensor axes) or a multi-sensor dataset (either a dictionary of single-sensor dataframes or a dataframe with two levels of column labels, where the upper level corresponds to the sensor name). In case of a single-sensor dataframe, the matching and the output are identical to passing just numpy arrays. In case of a multi-sensor input, the provided template is applied to each of the sensors individually. All outputs are then dictionaries of the single-sensor outputs with the sensor name as key. In both cases, if a dataset has columns that are not listed in the template, they are simply ignored.
- A way to apply specific templates to specific columns
In some cases, different templates are required for different sensors. To do this, the template must be a dictionary of DtwTemplate instances, where the key corresponds to the sensor the template should be applied to. Note that only dataframe templates are supported and not simple numpy array templates. The data input needs to be a multi-sensor dataset (see above for more information). The templates are then applied only to data belonging to the sensor with the same name. All sensors in the template dictionary must also be in the dataset. However, the dataset can have additional sensors, which are simply ignored. In this use case, the outputs are always dictionaries with the sensor name as key.
To better understand the different datatypes, have a look at the coordinate system guide.
- Parameters:
- template
The template used for matching. The required data type and shape depend on the use case. For more details, see the class docstring. Note that the scale parameter of the template is used to downscale the data before the matching is performed. If you have data in another data range (e.g. a different unit), the scale parameter of the template should be adjusted.
- resample_template
If True, the template will be resampled to match the sampling rate of the data. This requires a valid value for template.sampling_rate_hz. The resampling is performed using linear interpolation. Note that this might lead to unexpected results in case of short template arrays.
- max_cost
The maximal allowed cost to find a potential match in the cost function. Note that the cost is roughly calculated as:
sqrt(|template - data/template.scaling|). Its usage depends on the exact find_matches_method used. Refer to the specific function to learn more about this.
- min_match_length_s
The minimal length of a sequence in seconds to be considered a match. Matches that result in shorter sequences will be ignored. In general, this exclusion is performed as a post-processing step after the matching. If “find_peaks” is selected as find_matches_method, the parameter is additionally used in the detection of matches directly.
- max_match_length_s
The maximal length of a sequence in seconds to be considered a match. Matches that result in longer sequences will be ignored. This exclusion is performed as a post-processing step after the matching.
- max_template_stretch_ms
A local warping constraint for the DTW. It describes how many ms of the template are allowed to be mapped to just a single datapoint of the signal. The ms value will internally be converted to samples using the template sampling-rate (or the signal sampling-rate, if resample_template=True). If no template sampling-rate is provided, this constraint cannot be used.
- max_signal_stretch_ms
A local warping constraint for the DTW. It describes how many ms of the signal are allowed to be mapped to just a single datapoint of the template. The ms value will internally be converted to samples using the data sampling-rate.
- find_matches_method
Select the method used to find matches in the cost function.
- “min_under_thres”
Matches the implementation used in the paper [1] to detect strides in foot-mounted IMUs. In this case, find_matches_min_under_threshold will be used as method.
- “find_peaks”
Uses find_peaks with additional constraints to find stride candidates. In this case, find_matches_find_peaks will be used as method.
- memory
An optional joblib.Memory object that can be provided to cache the creation of cost matrices.
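The role of max_cost in a threshold-based search can be illustrated with a small, self-contained sketch. This is not the library's find_matches_min_under_threshold; the function name ends_under_threshold, the strict "<" comparison, and the choice to report one minimum per contiguous region are assumptions made purely for illustration.

```python
def ends_under_threshold(cost_function, max_cost):
    """Return the index of the minimum of every contiguous region below max_cost.

    Hypothetical helper: each returned index is a candidate match end.
    """
    cost = list(cost_function)
    ends = []
    region_start = None
    for i, value in enumerate(cost):
        below = value < max_cost  # assumption: strict comparison
        if below and region_start is None:
            region_start = i  # a new region below the threshold begins
        is_last = i == len(cost) - 1
        if region_start is not None and (not below or is_last):
            stop = i + 1 if below else i  # exclusive end of the region
            region = cost[region_start:stop]
            ends.append(region_start + region.index(min(region)))
            region_start = None
    return ends


# Example cost function with two clear minima at index 4 and index 8.
example_cost = [2.4, 2.4, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0]
candidate_ends = ends_under_threshold(example_cost, max_cost=1.0)
merged_ends = ends_under_threshold(example_cost, max_cost=1.5)
```

With max_cost=1.0 the two minima are reported separately; with max_cost=1.5 the whole stretch from index 2 onwards forms a single region and only its first minimum is reported, which is one way a threshold that is too permissive can hide matches.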
- Other Parameters:
- data
The data passed to the segment method.
- sampling_rate_hz
The sampling rate of the data.
- Attributes:
- matches_start_end_ : 2D array of shape (n_detected_strides x 2) or dictionary with such values
The start (column 1) and end (column 2) of each detected match.
- costs_ : list of length n_detected_strides or dictionary with such values
The cost value associated with each stride.
- acc_cost_mat_ : array with the shape (length_template x length_data) or dictionary with such values
The accumulated cost matrix of the DTW. The last row represents the cost function.
- cost_function_ : 1D array with the same length as the data or dictionary with such values
Cost function extracted from the accumulated cost matrix.
- paths_ : list of arrays with length n_detected_strides or dictionary with such values
The full path through the cost matrix of each detected stride. Note that the start and end values of the path might not match the start and the end values in matches_start_end_, if certain post-processing steps are applied.
- matches_start_end_original_ : 2D array of shape (n_detected_strides x 2) or dictionary with such values
The start and end of each match taken directly from the paths.
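To make the relation between the accumulated cost matrix and the cost function concrete, here is a minimal pure-Python sketch of a one-dimensional subsequence DTW. It is not the tslearn or gaitmap implementation: the function name, the squared-difference local cost, and the final square root are assumptions chosen to loosely mirror the cost description given for max_cost.

```python
import math


def subsequence_dtw_cost_matrix(template, signal):
    """Accumulated cost matrix of shape (len(template) x len(signal)).

    Hypothetical sketch: the first row carries no accumulated cost, so a
    match may start at any sample of the signal (free start).
    """
    n, m = len(template), len(signal)
    acc = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            local = (template[i] - signal[j]) ** 2
            if i == 0:
                best_previous = 0.0  # free start anywhere in the signal
            elif j == 0:
                best_previous = acc[i - 1][0]  # only a vertical step is possible
            else:
                best_previous = min(acc[i - 1][j], acc[i - 1][j - 1], acc[i][j - 1])
            acc[i][j] = local + best_previous
    # Assumption: report the square root of the accumulated squared cost.
    return [[math.sqrt(value) for value in row] for row in acc]


template = [1, 2, 1]
signal = [0, 0, 1, 2, 1, 0, 1, 2, 1, 0]
acc_cost_mat = subsequence_dtw_cost_matrix(template, signal)
cost_function = acc_cost_mat[-1]  # last row: cost of a match ending at each sample
```

For this toy input the cost function is exactly 0.0 at indices 4 and 8, the two samples where a perfect copy of the template ends.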
Notes
msDTW simply calculates the DTW distance of a template at every possible timepoint in the signal. Although the template is warped, it is advisable to use a template that has a similar length to the expected matches. Using resample_template can help with that. Further, the template should cover the same signal range as the original signal. You can use the scale parameter of the DtwTemplate to adapt your template to your data.
If you see unexpected matches or missing matches in your results, it is advisable to plot acc_cost_mat_ and cost_function_. They can provide insight into the matching process.
[1] Barth, J., Oberndorfer, C., Kugler, P., Schuldhaus, D., Winkler, J., Klucken, J., & Eskofier, B. (2013). Subsequence dynamic time warping as a method for robust step segmentation using gyroscope signals of daily life activities. Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, 6744-6747. https://doi.org/10.1109/EMBC.2013.6611104
Examples
Running a simple matching using arrays as input:
>>> import numpy as np
>>> from gaitmap.stride_segmentation import BaseDtw, DtwTemplate
>>> template_data = np.array([1, 2, 1])
>>> data = np.array([0, 0, 1, 2, 1, 0, 1, 2, 1, 0])
>>> template = DtwTemplate(template_data)
>>> dtw = BaseDtw(template=template, max_cost=1, resample_template=False)
>>> dtw = dtw.segment(data, sampling_rate_hz=1)  # Sampling rate is not important for this example
>>> dtw.matches_start_end_
array([[2, 4],
       [6, 8]])
Methods
clone(): Create a new instance of the class with all parameters copied over.
from_json(json_str): Import a gaitmap object from its json representation.
get_params([deep]): Get parameters for this algorithm.
segment(data, sampling_rate_hz, **_): Find matches by warping the provided template to the data.
set_params(**params): Set the parameters of this algorithm.
to_json(): Export the current object parameters as json.
- __init__(template: BaseDtwTemplate | dict[Union[collections.abc.Hashable, str], gaitmap_mad.stride_segmentation.dtw._dtw_templates.templates.BaseDtwTemplate] | None = None, resample_template: bool = True, find_matches_method: typing_extensions.Literal['min_under_thres', 'find_peaks'] = 'find_peaks', max_cost: float | None = None, min_match_length_s: float | None = None, max_match_length_s: float | None = None, max_template_stretch_ms: float | None = None, max_signal_stretch_ms: float | None = None, memory: Memory | None = None) -> None
- clone() -> Self
Create a new instance of the class with all parameters copied over.
This will create a new instance of the class itself and all nested objects.
- classmethod from_json(json_str: str) -> Self
Import a gaitmap object from its json representation.
For details, have a look at this example.
You can use the to_json method of a class to export it as a compatible json string.
- Parameters:
- json_str
JSON-formatted string
- get_params(deep: bool = True) -> dict[str, Any]
Get parameters for this algorithm.
- Parameters:
- deep
Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).
- Returns:
- params
Parameter names mapped to their values.
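The double-underscore prefix used for nested parameters can be pictured with a short, self-contained sketch. It only illustrates the naming convention and is not the library's get_params; the helper flatten_params is hypothetical, and plain dictionaries stand in for nested algorithm objects.

```python
def flatten_params(params, deep=True):
    """Flatten nested parameters using the 'name__subname' convention.

    Hypothetical stand-in: a nested dict plays the role of a nested
    algorithm object that has parameters of its own.
    """
    flat = {}
    for name, value in params.items():
        flat[name] = value  # the nested object itself is always listed
        if deep and isinstance(value, dict):
            for sub_name, sub_value in flatten_params(value, deep=True).items():
                flat[f"{name}__{sub_name}"] = sub_value
    return flat


# Illustrative values only; "scale" mirrors the template parameter named above.
example = {"max_cost": 1.0, "template": {"scale": 500.0}}
shallow = flatten_params(example, deep=False)
nested = flatten_params(example, deep=True)
```

Here nested additionally contains the key template__scale, while shallow lists only max_cost and template.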
- segment(data: ndarray | DataFrame | dict[Union[collections.abc.Hashable, str], pandas.core.frame.DataFrame], sampling_rate_hz: float, **_) -> Self
Find matches by warping the provided template to the data.
- Parameters:
- data : array, single-sensor dataframe, or multi-sensor dataset
The input data. For details on the required datatypes review the class docstring.
- sampling_rate_hz
The sampling rate of the data signal. This will be used to convert all parameters provided in seconds into a number of samples, and it will be used to resample the template if resample_template is True.
- Returns:
- self
The class instance with all result attributes populated.
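As a rough illustration of how time-based parameters such as min_match_length_s or max_signal_stretch_ms relate to sample counts, consider the sketch below. The helper names and the rounding to the nearest integer are assumptions; the library may round differently.

```python
def seconds_to_samples(duration_s, sampling_rate_hz):
    """Convert a duration in seconds to a whole number of samples.

    Assumption: round to the nearest sample.
    """
    return int(round(duration_s * sampling_rate_hz))


def ms_to_samples(duration_ms, sampling_rate_hz):
    """Convert a duration in milliseconds to a whole number of samples."""
    return seconds_to_samples(duration_ms / 1000.0, sampling_rate_hz)


# A 0.6 s minimal match length and a 120 ms stretch limit at 100 Hz.
min_match_length_samples = seconds_to_samples(0.6, 100.0)
max_stretch_samples = ms_to_samples(120.0, 100.0)
```

At 100 Hz this gives 60 samples for the match length and 12 samples for the stretch limit, so the same parameter values translate to different sample counts whenever the sampling rate changes.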