gaitmap.stride_segmentation.BaseDtw

class gaitmap.stride_segmentation.BaseDtw(template: BaseDtwTemplate | dict[Union[collections.abc.Hashable, str], gaitmap_mad.stride_segmentation.dtw._dtw_templates.templates.BaseDtwTemplate] | None = None, resample_template: bool = True, find_matches_method: typing_extensions.Literal['min_under_thres', 'find_peaks'] = 'find_peaks', max_cost: float | None = None, min_match_length_s: float | None = None, max_match_length_s: float | None = None, max_template_stretch_ms: float | None = None, max_signal_stretch_ms: float | None = None, memory: Memory | None = None)

A basic implementation of subsequence dynamic time warping.

Note

This algorithm is only available via the gaitmap_mad package and is distributed under an AGPL3 licence. To use it, you need to explicitly install the gaitmap_mad package. Learn more about that here.

This uses the DTW implementation of tslearn. This class offers a convenient wrapper around it and supports various datatypes as inputs. These cover three main use cases (for examples of all of these, see the Examples section):

A general purpose msDTW

If you require an msDTW with a class-based interface independent of the context of stride segmentation (or even this library), you can create a simple template from an (n x m) numpy array using create_dtw_template. This allows you to use just a simple numpy array as data input to the segment method. The data must have at least n samples and m columns. If it has more than m columns, the additional columns are ignored.

A simple way to segment multiple sensors with the same template

If you are using the basic datatypes of this library, you can use this DTW implementation to easily apply a template to selected columns of multiple sensors. For this, the template is expected to be based on a pd.DataFrame wrapped in a DtwTemplate. The column names of this dataframe need to match the column names of the data the template should be applied to. The data can be passed as a single-sensor dataframe (the columns correspond to individual sensor axes) or a multi-sensor dataset (either a dictionary of single-sensor dataframes or a dataframe with two levels of column labels, where the upper level corresponds to the sensor name). In the case of a single-sensor dataframe, the matching and the output are identical to passing just numpy arrays. In the case of a multi-sensor input, the provided template is applied to each of the sensors individually. All outputs are then dictionaries of the single-sensor outputs with the sensor name as key. In both cases, columns of the dataset that are not listed in the template are simply ignored.
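The two multi-sensor layouts described above can be built with plain pandas. The following is a minimal sketch; the sensor names (left_sensor, right_sensor) and the column names (gyr_x, gyr_y) are hypothetical placeholders chosen for this example, not names required by the library.

```python
import numpy as np
import pandas as pd

axes = ["gyr_x", "gyr_y"]  # hypothetical column names
n_samples = 5

# Single-sensor dataframe: one column per sensor axis.
single_sensor = pd.DataFrame(np.zeros((n_samples, len(axes))), columns=axes)

# Multi-sensor dataset, variant 1: a dictionary of single-sensor dataframes.
dataset_dict = {
    "left_sensor": single_sensor.copy(),
    "right_sensor": single_sensor.copy(),
}

# Multi-sensor dataset, variant 2: one dataframe with two column levels,
# where the upper level holds the sensor name.
dataset_df = pd.concat(dataset_dict, axis=1)

# Selecting a sensor on the upper level yields a single-sensor dataframe again.
left = dataset_df["left_sensor"]
```

Both variants carry the same information, so either one can be converted into the other when a downstream tool expects a specific layout.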

A way to apply specific templates to specific columns

In some cases, different templates are required for different sensors. To do this, the template must be a dictionary of DtwTemplate instances, where each key corresponds to the sensor the template should be applied to. Note that only dataframe templates are supported, not simple numpy array templates. The data input needs to be a multi-sensor dataset (see above for more information). The templates are then applied only to data belonging to the sensor with the same name. All sensors in the template dictionary must also be in the dataset. However, the dataset can have additional sensors, which are simply ignored. In this use case, the outputs are always dictionaries with the sensor name as key.

To better understand the different datatypes have a look at the coordinate system guide.

Parameters:
template

The template used for matching. The required data type and shape depend on the use case. For more details, see the class docstring. Note that the scale parameter of the template is used to downscale the data before the matching is performed. If you have data in another data range (e.g. a different unit), the scale parameter of the template should be adjusted.

resample_template

If True, the template will be resampled to match the sampling rate of the data. This requires a valid value for template.sampling_rate_hz. The resampling is performed using linear interpolation. Note that this might lead to unexpected results in the case of short template arrays.
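As a rough illustration of why linear interpolation can distort short templates, consider the sketch below. It is not the library's implementation: the helper resample_linear and the way the output length is derived from the two sampling rates are assumptions made for this example.

```python
import numpy as np


def resample_linear(template, template_rate_hz, target_rate_hz):
    """Resample a 1D template by linear interpolation (illustrative helper)."""
    template = np.asarray(template, dtype=float)
    n_old = len(template)
    # Assumption of this sketch: the new length scales with the rate ratio.
    n_new = int(round(n_old * target_rate_hz / template_rate_hz))
    old_pos = np.linspace(0, 1, n_old)
    new_pos = np.linspace(0, 1, n_new)
    return np.interp(new_pos, old_pos, template)


# A 3-sample template recorded at 1 Hz, resampled to 2 Hz.
resampled = resample_linear([1, 2, 1], template_rate_hz=1, target_rate_hz=2)
print(len(resampled), resampled.max())
```

None of the six new sample positions coincides with the original peak, so the maximum of the resampled template drops from 2 to roughly 1.8. With longer templates this effect becomes negligible.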

max_cost

The maximal allowed cost for a potential match in the cost function. Note that the cost is roughly calculated as: sqrt(|template - data/template.scaling|). Its usage depends on the exact find_matches_method used. Refer to the specific function to learn more about this.

min_match_length_s

The minimal length of a sequence in seconds to be considered a match. Matches that result in shorter sequences will be ignored. In general, this exclusion is performed as a post-processing step after the matching. If “find_peaks” is selected as find_matches_method, the parameter is additionally used directly during the detection of matches.

max_match_length_s

The maximal length of a sequence in seconds to be considered a match. Matches that result in longer sequences will be ignored. This exclusion is performed as a post-processing step after the matching.
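The length-based post-processing described for min_match_length_s and max_match_length_s can be pictured as follows. This is a minimal sketch under stated assumptions: the helper filter_matches_by_length is hypothetical, the match length is taken as end minus start, and both bounds are treated as inclusive. The library may handle these details differently.

```python
import numpy as np


def filter_matches_by_length(matches_start_end, sampling_rate_hz,
                             min_length_s=None, max_length_s=None):
    """Drop matches that are shorter or longer than the allowed duration."""
    matches = np.asarray(matches_start_end)
    # Convert the match length from samples to seconds.
    lengths_s = (matches[:, 1] - matches[:, 0]) / sampling_rate_hz
    keep = np.ones(len(matches), dtype=bool)
    if min_length_s is not None:
        keep &= lengths_s >= min_length_s
    if max_length_s is not None:
        keep &= lengths_s <= max_length_s
    return matches[keep]


# Three candidate matches at 100 Hz lasting 0.5 s, 1.2 s, and 4.0 s.
matches = np.array([[0, 50], [100, 220], [300, 700]])
kept = filter_matches_by_length(
    matches, sampling_rate_hz=100, min_length_s=0.6, max_length_s=3.0
)
print(kept)  # only the 1.2 s match remains
```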

max_template_stretch_ms

A local warping constraint for the DTW. It describes how many ms of the template are allowed to be mapped to just a single datapoint of the signal. The ms value will internally be converted to samples using the template sampling rate (or the signal sampling rate, if resample_template=True). If no template sampling rate is provided, this constraint cannot be used.

max_signal_stretch_ms

A local warping constraint for the DTW. It describes how many ms of the signal are allowed to be mapped to just a single datapoint of the template. The ms value will internally be converted to samples using the data sampling rate.
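The conversion of the two stretch constraints from milliseconds to samples can be sketched as below. Both helpers are hypothetical, and rounding to the nearest sample is an assumption of this example; the text above only states which sampling rate is used for which constraint.

```python
def ms_to_samples(duration_ms, sampling_rate_hz):
    """Convert a duration in ms to a number of samples (rounding is an assumption)."""
    return int(round(duration_ms * sampling_rate_hz / 1000))


def template_stretch_samples(max_template_stretch_ms, template_rate_hz,
                             signal_rate_hz, resample_template):
    """Pick the sampling rate for the template constraint as described above."""
    # After resampling, the template shares the sampling rate of the signal.
    rate = signal_rate_hz if resample_template else template_rate_hz
    if rate is None:
        raise ValueError("A sampling rate is required to convert ms to samples.")
    return ms_to_samples(max_template_stretch_ms, rate)


# 120 ms with a 200 Hz template and a 100 Hz signal.
with_resampling = template_stretch_samples(120, 200, 100, resample_template=True)
without_resampling = template_stretch_samples(120, 200, 100, resample_template=False)
signal_stretch = ms_to_samples(120, 100)
print(with_resampling, without_resampling, signal_stretch)  # 12 24 12
```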

find_matches_method

Select the method used to find matches in the cost function.

  • “min_under_thres”

    Matches the implementation used in the paper [1] to detect strides in foot mounted IMUs. In this case find_matches_min_under_threshold will be used as method.

  • “find_peaks”

    Uses find_peaks with additional constraints to find stride candidates. In this case find_matches_find_peaks will be used as method.
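To give an intuition for threshold-based matching, the sketch below picks the lowest sample of every contiguous run of the cost function that lies below max_cost. This is only one plausible reading of the name “min_under_thres”; the helper minima_under_threshold is hypothetical, and the referenced find_matches_min_under_threshold function may differ in its details (for example in how the threshold comparison is made).

```python
import numpy as np


def minima_under_threshold(cost_function, max_cost):
    """Return the index of the minimum of each run of values below max_cost."""
    cost = np.asarray(cost_function, dtype=float)
    below = cost < max_cost
    # Pad with False so that every run has a detectable start and end.
    padded = np.concatenate(([False], below, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    starts, ends = edges[::2], edges[1::2]  # ends are exclusive
    return np.array(
        [s + np.argmin(cost[s:e]) for s, e in zip(starts, ends)], dtype=int
    )


cost_function = [5, 3, 0.5, 2, 4, 0.8, 0.2, 3]
match_ends = minima_under_threshold(cost_function, max_cost=1)
print(match_ends)  # [2 6]
```

Each returned index marks the end of a candidate match in the signal. A peak-based search would instead look for local minima of the cost function, which allows several candidates within one region below the threshold.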

memory

An optional joblib.Memory object that can be provided to cache the creation of cost matrices.

Other Parameters:
data

The data passed to the segment method.

sampling_rate_hz

The sampling rate of the data.

Attributes:
matches_start_end_ : 2D array of shape (n_detected_strides x 2) or dictionary with such values

The start (column 1) and end (column 2) of each detected match.

costs_ : list of length n_detected_strides or dictionary with such values

The cost value associated with each stride.

acc_cost_mat_ : array of shape (length_template x length_data) or dictionary with such values

The accumulated cost matrix of the DTW. The last row represents the cost function.

cost_function_ : 1D array with the same length as the data or dictionary with such values

Cost function extracted from the accumulated cost matrix.

paths_ : list of arrays with length n_detected_strides or dictionary with such values

The full path through the cost matrix of each detected stride. Note that the start and end values of the path might not match the start and end values in matches_start_end_ if certain post-processing steps are applied.

matches_start_end_original_ : 2D array of shape (n_detected_strides x 2) or dictionary with such values

The start and end of each match, taken directly from the paths.
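How start and end values follow from a warping path can be shown with a tiny example. The path below is made up, and the column order (template index first, signal index second) is an assumption of this sketch that should be checked against the actual content of paths_.

```python
import numpy as np

# Hypothetical warping path: each row is (template_index, signal_index).
path = np.array([[0, 2], [1, 3], [1, 4], [2, 5]])

# The match spans the signal samples covered by the first and last path step.
start, end = int(path[0, 1]), int(path[-1, 1])
print(start, end)  # 2 5
```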

Notes

msDTW simply calculates the DTW distance of a template at every possible timepoint in the signal. Although the template is warped, it is advisable to use a template that has a similar length to the expected matches. Using resample_template can help with that. Further, the template should cover the same signal range as the original signal. You can use the scale parameter of the DtwTemplate to adapt your template to your data.

If you see unexpected or missing matches in your results, it is advisable to plot acc_cost_mat_ and cost_function_. They can provide insight into the matching process.
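The relationship between the accumulated cost matrix and the cost function can be made concrete with a small, self-contained sketch of a 1D subsequence DTW. This is not the tslearn implementation used by the class: the function subsequence_dtw_cost, the squared-difference local cost, and the final square root are assumptions chosen to keep the example short.

```python
import numpy as np


def subsequence_dtw_cost(template, signal):
    """Accumulated cost matrix of a 1D subsequence DTW (illustrative sketch)."""
    template = np.asarray(template, dtype=float)
    signal = np.asarray(signal, dtype=float)
    n, m = len(template), len(signal)
    # Local cost: squared difference of every template/signal sample pair.
    local = (template[:, None] - signal[None, :]) ** 2
    acc = np.empty((n, m))
    # The first row is not accumulated along the signal axis,
    # so a match may start at any sample of the signal.
    acc[0, :] = local[0, :]
    # The first column can only be reached by moving down the template.
    acc[1:, 0] = local[0, 0] + np.cumsum(local[1:, 0])
    for i in range(1, n):
        for j in range(1, m):
            acc[i, j] = local[i, j] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc


template = [1, 2, 1]
signal = [0, 0, 1, 2, 1, 0, 1, 2, 1, 0]
acc_cost_mat = subsequence_dtw_cost(template, signal)
# The last row holds the cost of the best match ending at each signal sample.
cost_function = np.sqrt(acc_cost_mat[-1, :])
print(np.flatnonzero(cost_function == 0))  # [4 8]
```

The cost function reaches zero at samples 4 and 8, which are the end points of the two exact occurrences of the template in the signal. Plotting the matrix and its last row for real data shows in the same way where the template fits well and where it does not.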

[1]

Barth, J., Oberndorfer, C., Kugler, P., Schuldhaus, D., Winkler, J., Klucken, J., & Eskofier, B. (2013). Subsequence dynamic time warping as a method for robust step segmentation using gyroscope signals of daily life activities. Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, 6744-6747. https://doi.org/10.1109/EMBC.2013.6611104

Examples

Running a simple matching using arrays as input:

>>> import numpy as np
>>> from gaitmap.stride_segmentation import BaseDtw, DtwTemplate
>>> template_data = np.array([1, 2, 1])
>>> data = np.array([0, 0, 1, 2, 1, 0, 1, 2, 1, 0])
>>> template = DtwTemplate(data=template_data)
>>> dtw = BaseDtw(template=template, max_cost=1, resample_template=False)
>>> dtw = dtw.segment(data, sampling_rate_hz=1)  # Sampling rate is not important for this example
>>> dtw.matches_start_end_
array([[2, 4],
       [6, 8]])

Methods

clone()

Create a new instance of the class with all parameters copied over.

from_json(json_str)

Import a gaitmap object from its json representation.

get_params([deep])

Get parameters for this algorithm.

segment(data, sampling_rate_hz, **_)

Find matches by warping the provided template to the data.

set_params(**params)

Set the parameters of this Algorithm.

to_json()

Export the current object parameters as json.

__init__(template: BaseDtwTemplate | dict[Union[collections.abc.Hashable, str], gaitmap_mad.stride_segmentation.dtw._dtw_templates.templates.BaseDtwTemplate] | None = None, resample_template: bool = True, find_matches_method: typing_extensions.Literal['min_under_thres', 'find_peaks'] = 'find_peaks', max_cost: float | None = None, min_match_length_s: float | None = None, max_match_length_s: float | None = None, max_template_stretch_ms: float | None = None, max_signal_stretch_ms: float | None = None, memory: Memory | None = None) → None
clone() → Self

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and of all nested objects.

classmethod from_json(json_str: str) → Self

Import a gaitmap object from its json representation.

For details, have a look at this example.

You can use the to_json method of a class to export it as a compatible json string.

Parameters:
json_str

json formatted string

get_params(deep: bool = True) → dict[str, Any]

Get parameters for this algorithm.

Parameters:
deep

Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).

Returns:
params

Parameter names mapped to their values.

segment(data: ndarray | DataFrame | dict[Union[collections.abc.Hashable, str], pandas.core.frame.DataFrame], sampling_rate_hz: float, **_) → Self

Find matches by warping the provided template to the data.

Parameters:
data : array, single-sensor dataframe, or multi-sensor dataset

The input data. For details on the required datatypes, see the class docstring.

sampling_rate_hz

The sampling rate of the data signal. This is used to convert all parameters provided in seconds into a number of samples and to resample the template if resample_template is True.

Returns:
self

The class instance with all result attributes populated.

set_params(**params: Any) → Self

Set the parameters of this Algorithm.

To set parameters of nested objects use nested_object_name__para_name=.

to_json() → str

Export the current object parameters as json.

For details, have a look at this example.

You can use the from_json method of any gaitmap algorithm to load the object again.

Warning

This will only export the parameters of the instance, but not any results!

Examples using gaitmap.stride_segmentation.BaseDtw

BaseDtw simple segmentation
