SegmentationModel Training#

This example illustrates how a Hidden Markov Model (HMM), implemented by the RothSegmentationHmm class, can be trained from IMU data and pre-segmented stride lists. The implementation is based on the work of Roth et al. [1].

import numpy as np
from matplotlib import pyplot as plt

np.random.seed(1)

Getting some example data#

We use example data recorded from a healthy subject during a 2x20 m walk test, which contains regular walking. The IMU signals are already rotated so that they align with the gaitmap sensor frame (SF) coordinate system. The data contains signals from two sensors, one on the left and one on the right foot.

from gaitmap.example_data import get_healthy_example_imu_data

data = get_healthy_example_imu_data()
sampling_rate_hz = 204.8
data.sort_index(axis=1).head(1)

sensor	left_sensor						right_sensor
axis	acc_x	acc_y	acc_z	gyr_x	gyr_y	gyr_z	acc_x	acc_y	acc_z	gyr_x	gyr_y	gyr_z
0.0	0.880811	2.762208	9.40865	-0.112402	-0.032157	-0.062261	0.311553	-2.398646	9.513275	-0.323037	0.084604	-0.025288

Preparing the data#

The HMM only uses the gyroscope signals. Furthermore, the model expects the data to be in the gaitmap body frame, so that the same model can be used for the left and the right foot. Therefore, we need to transform the dataset into the body frame.

from gaitmap.utils.coordinate_conversion import convert_to_fbf

# We use the `..._like` parameters to identify the data of the left and the right foot based on the name of the sensor.
bf_data = convert_to_fbf(data, left_like="left_", right_like="right_")

Getting the example stride list#

Here we use the ground truth stride list provided with the example data. For new data, such a stride list can be generated by running the algorithms provided in the stride_segmentation module and manually correcting the result, or by deriving it from ground truth data.

from gaitmap.example_data import get_healthy_example_stride_borders

stride_list = get_healthy_example_stride_borders()

from gaitmap.data_transform import ButterworthFilter

Initialize Model Parameters - Feature Transformation#

Here we define the feature space in which model training and later prediction will take place. You can choose different axes and/or feature combinations, as well as downsampling, filtering, and standardization steps. The following configuration has proved to work well in most cases.

from gaitmap.stride_segmentation.hmm import RothHmmFeatureTransformer

feature_transform = RothHmmFeatureTransformer(
    sampling_rate_feature_space_hz=51.2,
    axes=["gyr_ml"],
    features=["raw", "gradient"],
    low_pass_filter=ButterworthFilter(order=4, cutoff_freq_hz=10),
    window_size_s=0.2,
    standardization=True,
)
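To make the feature space more concrete, the following NumPy sketch shows conceptually what such a transformation does: downsample the signal, keep the raw values, add a sliding-window gradient, and standardize both columns. This is not gaitmap's implementation. The low-pass filter is omitted, the helper names (`sliding_gradient`, `toy_feature_transform`) are hypothetical, and the input is a synthetic sine wave rather than real gyroscope data.

```python
import numpy as np


def sliding_gradient(signal, window_size):
    """Slope of a least-squares line fitted to each centered window."""
    half = window_size // 2
    # Pad with edge values so the output has the same length as the input.
    padded = np.pad(signal, (half, window_size - 1 - half), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window_size)
    t = np.arange(window_size) - (window_size - 1) / 2  # centered time axis
    return windows @ t / np.sum(t**2)


def toy_feature_transform(gyr_ml, sampling_rate_hz, feature_rate_hz, window_size_s):
    # Naive downsampling: keep every n-th sample (assumes an integer ratio
    # and a signal that has already been low-pass filtered).
    step = int(round(sampling_rate_hz / feature_rate_hz))
    raw = gyr_ml[::step]
    window_size = max(2, int(round(window_size_s * feature_rate_hz)))
    features = np.column_stack([raw, sliding_gradient(raw, window_size)])
    # Standardize each feature column to zero mean and unit variance.
    return (features - features.mean(axis=0)) / features.std(axis=0)


# Synthetic stand-in for a gyr_ml signal: a 1 Hz sine sampled at 204.8 Hz.
t = np.arange(0, 5, 1 / 204.8)
toy_signal = 200 * np.sin(2 * np.pi * t)
toy_features = toy_feature_transform(toy_signal, 204.8, 51.2, 0.2)
print(toy_features.shape)
```

With `features=["raw", "gradient"]` and a single axis, each sample in the feature space is a two-dimensional vector, which matches the two-dimensional Gaussians shown in the trained model output of this example.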

Initialize Model Parameters - Sub HMMs#

The segmentation process is defined as a two-class problem with the classes “strides” and “transitions/null”. For each class we define a separate HMM and all of its components. Notice that the stride and transition models differ in architecture, number of states, and number of Gaussian mixture model (GMM) components. In this example all configurable parameters are exposed. These parameters might require optimization for your specific type of dataset!

from gaitmap.stride_segmentation.hmm import SimpleHmm

stride_model = SimpleHmm(
    n_states=20,
    n_gmm_components=6,
    algo_train="baum-welch",
    stop_threshold=1e-9,
    max_iterations=5,
    architecture="left-right-strict",
    verbose=True,
    name="stride_model",
)

transition_model = SimpleHmm(
    n_states=5,
    n_gmm_components=3,
    algo_train="baum-welch",
    stop_threshold=1e-9,
    max_iterations=5,
    architecture="left-right-loose",
    verbose=True,
    name="transition_model",
)
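The architecture names are not explained further here. Judging from the trained transition matrix printed later in this example, the stride states only allow staying in a state or stepping to the next one, while the last transition state can additionally jump back to the first. The sketch below builds two such topologies with NumPy. The function name and the uniform initial probabilities are assumptions made for illustration, not gaitmap's internal construction.

```python
import numpy as np


def toy_left_right_matrix(n_states, wrap_around):
    """Uniformly initialized left-to-right transition matrix (illustrative only)."""
    matrix = np.eye(n_states)  # every state may stay where it is
    matrix[np.arange(n_states - 1), np.arange(1, n_states)] = 1.0  # or step forward
    if wrap_around:
        matrix[-1, 0] = 1.0  # the last state may jump back to the first
    # Normalize each row so that it forms a probability distribution.
    return matrix / matrix.sum(axis=1, keepdims=True)


strict = toy_left_right_matrix(4, wrap_around=False)
loose = toy_left_right_matrix(4, wrap_around=True)
print(strict)
print(loose)
```

Training can only adjust the non-zero entries, so the chosen topology directly constrains which state sequences the model can represent.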

Initialize Model Parameters - Segmentation Model#

Finally, we can combine the feature extraction and the sub-HMMs defined above into the actual segmentation model, on which we can invoke the training process. Again, all configurable parameters are exposed for demonstration purposes. These parameters should work for most use cases.

from gaitmap.stride_segmentation.hmm import RothSegmentationHmm

segmentation_model = RothSegmentationHmm(
    stride_model=stride_model,
    transition_model=transition_model,
    feature_transform=feature_transform,
    algo_predict="viterbi",
    algo_train="baum-welch",
    stop_threshold=1e-9,
    max_iterations=1,
    initialization="labels",
    verbose=True,
    name="segmentation_model",
)
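The `algo_predict="viterbi"` setting refers to Viterbi decoding, which finds the single most likely hidden state sequence for a series of observations. The following is a generic textbook version in NumPy for a toy two-state model. It is a sketch for intuition only and not the pomegranate-based implementation that gaitmap uses; all probabilities are made up.

```python
import numpy as np


def viterbi(log_start, log_trans, log_emission):
    """Most likely state path; log_emission has shape (n_samples, n_states)."""
    n_samples, n_states = log_emission.shape
    score = log_start + log_emission[0]
    backpointer = np.zeros((n_samples, n_states), dtype=int)
    for t in range(1, n_samples):
        # candidates[i, j]: best score of being in state i at t-1, then moving to j.
        candidates = score[:, None] + log_trans
        backpointer[t] = candidates.argmax(axis=0)
        score = candidates.max(axis=0) + log_emission[t]
    # Trace the best path backwards, starting at the best final state.
    path = [int(score.argmax())]
    for t in range(n_samples - 1, 0, -1):
        path.append(int(backpointer[t, path[-1]]))
    return path[::-1]


log_start = np.log([0.5, 0.5])
log_trans = np.log([[0.9, 0.1], [0.1, 0.9]])  # states tend to persist
# Emission likelihoods per sample; the third sample is an outlier favoring state 1.
log_emission = np.log(
    [[0.8, 0.2], [0.8, 0.2], [0.4, 0.6], [0.8, 0.2], [0.2, 0.8], [0.2, 0.8]]
)
toy_path = viterbi(log_start, log_trans, log_emission)
print(toy_path)
```

Because switching states is unlikely in this toy model, the decoded path stays in state 0 through the outlier and only switches once the evidence for state 1 persists.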

Prepare Data for Training#

The HMM does not differentiate between left and right strides (this is why our data must be in the body-frame convention!). The main input to the training process is a set of gait sequences that include transitions as well as valid strides. To train on multiple sequences, we can simply pass a list of gait sequences to the model. For each gait sequence we also need a valid stride list. In this example we handle the data from the left and right foot as separate gait sequences and add them to a simple list. We do the same for the stride lists.

data_train_sequence = [bf_data["left_sensor"], bf_data["right_sensor"]]
stride_list_sequence = [stride_list["left_sensor"], stride_list["right_sensor"]]
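To picture how a stride list relates to the two classes, the sketch below turns stride borders into a per-sample mask that separates stride samples from transition samples. This only illustrates the idea. The function name and the borders are hypothetical, and treating the end index as exclusive is an assumption.

```python
import numpy as np


def toy_stride_mask(n_samples, stride_borders):
    """Mark every sample inside a stride with 1 and everything else with 0."""
    mask = np.zeros(n_samples, dtype=int)
    for start, end in stride_borders:
        mask[start:end] = 1  # assumption: `end` is treated as exclusive
    return mask


# Hypothetical borders (in samples): two consecutive strides, a pause, one more stride.
toy_borders = [(10, 30), (30, 50), (70, 90)]
toy_mask = toy_stride_mask(100, toy_borders)
print(toy_mask.sum(), "of", toy_mask.size, "samples belong to a stride")
```

Samples outside every stride, such as the pause between samples 50 and 70, are the ones a transition model has to explain.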

Training#

Finally! Sit back, relax, and let the magic happen (depending on the number of input sequences, this can take more than 30 minutes). However, this small example runs quite fast! The model will internally perform the feature transformation of the dataset, train the individual sub-models, and finally combine them into a flattened segmentation model.

segmentation_model = segmentation_model.self_optimize(
    data_train_sequence, stride_list_sequence, sampling_rate_hz=sampling_rate_hz
)

[1] Improvement: 341.2437178557084      Time (s): 0.2085
[2] Improvement: 109.29279107726325     Time (s): 0.2077
[3] Improvement: 25.279694613596803     Time (s): 0.2082
[4] Improvement: 52.723999946801996     Time (s): 0.2078
[5] Improvement: -3.5209243279850853    Time (s): 0.2088
Total Training Improvement: 525.0192791653853
Total Training Time (s): 1.2424
/home/docs/checkouts/readthedocs.org/user_builds/gaitmap/checkouts/v2.6.0/packages/gaitmap_mad/src/gaitmap_mad/stride_segmentation/hmm/_utils.py:333: UserWarning: During training the improvement per epoch became NaN/infinite or negative! Run `self_optimize_with_info` and inspect the history element for more information. With a high likelihood, the final model is not usable and will result in errors during prediction. This usually happens when there is not enough data for a large number of distributions and states. To avoid this issue, reduce the number of distributions per state or the number of states. Or ideally, provide more data.
  warnings.warn(
[1] Improvement: 723.4614150029679      Time (s): 0.008724
[2] Improvement: 62.761125573766094     Time (s): 0.009014
[3] Improvement: 28.507724633026555     Time (s): 0.009044
[4] Improvement: 16.114249821317117     Time (s): 0.008944
[5] Improvement: 33.07876970054485      Time (s): 0.00881
Total Training Improvement: 863.9232847316225
Total Training Time (s): 0.0545
[1] Improvement: 43.84714447262195      Time (s): 0.2094
Total Training Improvement: 43.84714447262195
Total Training Time (s): 0.4585
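The warning in the output above is raised because the improvement in epoch 5 of the first training block is negative. A check of this kind can be sketched with the logged values copied from that block. The variable names are illustrative, and nothing here calls gaitmap.

```python
import numpy as np

# Per-epoch improvements of the first training block, copied from the log above.
improvements = np.array(
    [
        341.2437178557084,
        109.29279107726325,
        25.279694613596803,
        52.723999946801996,
        -3.5209243279850853,
    ]
)

# Flag epochs whose improvement is negative or not finite (epochs are 1-indexed).
suspicious = ~np.isfinite(improvements) | (improvements < 0)
bad_epochs = np.flatnonzero(suspicious) + 1
print("Suspicious epochs:", bad_epochs)
print("Total improvement:", improvements.sum())
```

The sum of the per-epoch values reproduces the logged total training improvement, so the negative epoch is already included in that total.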

Inspecting the Results#

All internal models that were initialized as “None” should now be populated by pomegranate models. We can have a look at the final transition matrix or the trained distributions (GMMs). From here, you could either use the model to predict stride borders on an unseen sequence or save it to a JSON file for later use.

np.set_printoptions(precision=3, linewidth=180, suppress=True)

print(segmentation_model.model.dense_transition_matrix()[0:-2, 0:-2])

print(segmentation_model.model.states[10])

[[0.891 0.109 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.925 0.075 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.86  0.14  0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.949 0.051 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.158 0.    0.    0.    0.795 0.047 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.667 0.333 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.669 0.331 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.627 0.373 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.716 0.284 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.664 0.336 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.666 0.334 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.652 0.348 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.589 0.411 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.516 0.484 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.527 0.473 0.    0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.525 0.475 0.    0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.447 0.553 0.    0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.632 0.368 0.    0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.82  0.18  0.    0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.494 0.506 0.    0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.618 0.382 0.    0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.67  0.33  0.    0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.647 0.353 0.   ]
 [0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.645 0.355]
 [0.017 0.    0.    0.    0.    0.304 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.68 ]]
{
    "class" : "State",
    "distribution" : {
        "class" : "GeneralMixtureModel",
        "distributions" : [
            {
                "class" : "Distribution",
                "name" : "MultivariateGaussianDistribution",
                "parameters" : [
                    [
                        1.5126628727953728,
                        -1.2325509070704483
                    ],
                    [
                        [
                            0.005687593991405892,
                            0.0023472485533848997
                        ],
                        [
                            0.0023472485533848997,
                            0.040877832841576395
                        ]
                    ]
                ],
                "frozen" : false
            },
            {
                "class" : "Distribution",
                "name" : "MultivariateGaussianDistribution",
                "parameters" : [
                    [
                        1.246494912094264,
                        -1.5631885798768912
                    ],
                    [
                        [
                            0.02247323938290144,
                            0.00967374480066069
                        ],
                        [
                            0.00967374480066069,
                            0.04359681875993598
                        ]
                    ]
                ],
                "frozen" : false
            },
            {
                "class" : "Distribution",
                "name" : "MultivariateGaussianDistribution",
                "parameters" : [
                    [
                        -0.14468908268310057,
                        -0.32692151837939903
                    ],
                    [
                        [
                            0.004734574595004851,
                            -0.007669695586920108
                        ],
                        [
                            -0.007669695586920108,
                            0.01245693964354575
                        ]
                    ]
                ],
                "frozen" : false
            },
            {
                "class" : "Distribution",
                "name" : "MultivariateGaussianDistribution",
                "parameters" : [
                    [
                        1.7391578250883177,
                        -1.4912029930432096
                    ],
                    [
                        [
                            0.0011035263021169747,
                            0.002104007563097542
                        ],
                        [
                            0.002104007563097542,
                            0.0349554610296975
                        ]
                    ]
                ],
                "frozen" : false
            },
            {
                "class" : "Distribution",
                "name" : "MultivariateGaussianDistribution",
                "parameters" : [
                    [
                        1.1445625377737405,
                        -1.7356582771533349
                    ],
                    [
                        [
                            0.015585621689172442,
                            -0.01362748197089966
                        ],
                        [
                            -0.01362748197089966,
                            0.031239367536969237
                        ]
                    ]
                ],
                "frozen" : false
            },
            {
                "class" : "Distribution",
                "name" : "MultivariateGaussianDistribution",
                "parameters" : [
                    [
                        0.7189351972580712,
                        -1.886781134785975
                    ],
                    [
                        [
                            0.02468903546241029,
                            0.00967892554067703
                        ],
                        [
                            0.00967892554067703,
                            0.02033165397565948
                        ]
                    ]
                ],
                "frozen" : false
            }
        ],
        "weights" : [
            0.30984372119247067,
            0.27134257464797806,
            0.01725767107205237,
            0.041169920711634715,
            0.09302063408620882,
            0.2673654782896554
        ]
    },
    "name" : "sa",
    "weight" : 1.0
}
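One way to sanity-check the trained transition matrix is to convert the self-transition probabilities into expected state durations. The sketch below assumes that rows 0 to 4 of the printed matrix belong to the 5 transition states and rows 5 to 24 to the 20 stride states, which is inferred from the configured state counts rather than stated by the library. The values are the rounded diagonal entries copied from the output above.

```python
import numpy as np

feature_rate_hz = 51.2  # sampling rate of the feature space used in this example

# Self-transition probabilities of the 20 stride states, i.e. the diagonal
# entries of rows 5 to 24 in the (rounded) matrix printed above.
stride_self_probs = np.array(
    [
        0.667, 0.669, 0.627, 0.716, 0.664, 0.666, 0.652, 0.589, 0.516, 0.527,
        0.525, 0.447, 0.632, 0.820, 0.494, 0.618, 0.670, 0.647, 0.645, 0.680,
    ]
)

# The time spent in a state with self-transition probability p is geometrically
# distributed with a mean of 1 / (1 - p) samples.
expected_samples = 1 / (1 - stride_self_probs)
expected_stride_s = expected_samples.sum() / feature_rate_hz
print(f"Expected stride duration: {expected_stride_s:.2f} s")
```

Under these assumptions the expected duration of one pass through the stride states is about 1.1 s, which is in the same range as the stride lengths of roughly 220 samples at 204.8 Hz predicted in the next section.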

Applying the Model to a Sequence#

In the following, we apply the model to the same sequence we used for training, just to show that the model “learned” something. We also plot the results to see how well the model performs.

from gaitmap.stride_segmentation.hmm import HmmStrideSegmentation

hmm = HmmStrideSegmentation(segmentation_model).segment(bf_data, sampling_rate_hz=sampling_rate_hz)
hmm.stride_list_

{'left_sensor':       start   end
s_id
     364   584
     584   802
     802  1023
    1023  1242
    1242  1458
    1458  1672
    1672  1887
    1887  2104
    2104  2327
    2327  2546
   2546  2773
   2773  2998
   2998  3231
   3231  3466
   3934  4163
   4163  4382
   4382  4603
   4603  4822
   4822  5043
   5043  5267
   5267  5489
   5489  5713
   5713  5936
   5936  6167
   6167  6395
   6395  6628
   6628  6858
   6858  7107, 'right_sensor':       start   end
s_id
     475   691
     691   913
     913  1133
    1133  1350
    1350  1565
    1565  1779
    1779  1995
    1995  2216
    2216  2436
    2436  2659
   2659  2887
   2887  3114
   3114  3351
   3351  3567
   3567  3816
   3816  4049
   4049  4274
   4274  4492
   4492  4712
   4712  4933
   4933  5153
   5153  5381
   5381  5601
   5601  5826
   5826  6051
   6051  6280
   6280  6511
   6511  6742
   6742  6966
   6966  7246}
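The start and end values are sample indices at the original sampling rate, so stride durations and gaps between strides follow directly from them. The sketch below uses a few consecutive rows copied from the left_sensor output above; the variable names are illustrative.

```python
import numpy as np

sampling_rate_hz = 204.8

# A few consecutive rows of the left_sensor stride list printed above.
strides = np.array(
    [[2773, 2998], [2998, 3231], [3231, 3466], [3934, 4163], [4163, 4382]]
)

durations_s = (strides[:, 1] - strides[:, 0]) / sampling_rate_hz
# A gap exists wherever a stride does not start where the previous one ended.
gap_samples = strides[1:, 0] - strides[:-1, 1]
gap_after = np.flatnonzero(gap_samples > 0)

print("Stride durations (s):", np.round(durations_s, 2))
print("Gap after row(s):", gap_after)
print("Gap length (s):", gap_samples[gap_after] / sampling_rate_hz)
```

The only gap in these rows lies between samples 3466 and 3934. It presumably corresponds to the turn between the two 20 m walks, where the model assigns transition states instead of strides.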

Plotting the Results#

sensor = "left_sensor"

fig, axs = plt.subplots(nrows=2, sharex=True, figsize=(10, 5))
axs[0].set_title("gaitmap Body Frame Dataset")
axs[0].plot(bf_data.reset_index(drop=True)[sensor]["gyr_ml"])
for start, end in hmm.stride_list_[sensor].to_numpy():
    axs[0].axvline(start, c="r")
    axs[0].axvline(end, c="r")
    axs[0].axvspan(start, end, alpha=0.2)
axs[0].set_ylabel("gyr-ml [deg/s]")

axs[1].set_title("Predicted Hidden State Sequence")
axs[1].plot(hmm.hidden_state_sequence_[sensor])
for start, end in hmm.matches_start_end_original_[sensor]:
    axs[1].axvline(start, c="g")
    axs[1].axvline(end, c="g")
    axs[1].axvspan(start, end, alpha=0.2)
axs[1].set_ylabel("Hidden State [N]")

# Format as a float: "%d" would truncate 204.8 to 204 in the axis label.
axs[1].set_xlabel(f"Samples @ {sampling_rate_hz:.1f} Hz")
plt.xlim([6000, 7200])
fig.tight_layout()
plt.show()

Figure: two stacked panels, "gaitmap Body Frame Dataset" (gyr-ml signal with the predicted stride borders) and "Predicted Hidden State Sequence".

Total running time of the script: ( 0 minutes 6.260 seconds)

Estimated memory usage: 9 MB
