gaitmap.evaluation_utils.calculate_aggregated_parameter_errors

gaitmap.evaluation_utils.calculate_aggregated_parameter_errors(*, reference_parameter: DataFrame | dict[Union[collections.abc.Hashable, str], pandas.core.frame.DataFrame], predicted_parameter: DataFrame | dict[Union[collections.abc.Hashable, str], pandas.core.frame.DataFrame], calculate_per_sensor: bool = True, scoring_errors: Literal['ignore', 'warn', 'raise'] = 'warn', id_column: str = 's_id') -> DataFrame

Calculate various error metrics between a predicted parameter and a given ground truth.

This method can be applied to stride-level parameters or to parameters aggregated over a gait test/participant/… In both cases, the reference and the predicted values are simply aligned based on the specified id_column. All entries that are not present in both inputs are ignored when calculating the error metrics.

In general, we calculate four different groups of errors:

  • The error between the predicted and the reference value (predicted - reference)

  • The relative error between the predicted and the reference value ((predicted - reference) / reference)

  • The absolute error between the predicted and the reference value (abs(predicted - reference))

  • The absolute relative error between the predicted and the reference value (abs(predicted - reference) / abs(reference))
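The four error groups listed above can be sketched with plain pandas. This is only an illustration of the formulas, not the library's implementation; the values are the para1 column from the first example in the Examples section.

```python
import pandas as pd

# Values of "para1" from the first example in the Examples section.
predicted = pd.Series([7, 3, 5, 9], name="para1")
reference = pd.Series([3, 6, 7, 8], name="para1")

# The four error groups, following the formulas in the list above.
error = predicted - reference
rel_error = error / reference
abs_error = error.abs()
abs_rel_error = error.abs() / reference.abs()

errors = pd.DataFrame(
    {
        "error": error,
        "rel_error": rel_error,
        "abs_error": abs_error,
        "abs_rel_error": abs_rel_error,
    }
)

# Mean of each error group over all entries.
mean_errors = errors.mean()
print(mean_errors)
```

The resulting means correspond to the error_mean, rel_error_mean, abs_error_mean, and abs_rel_error_mean rows for para1 in the first example output.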

For each of these groups of errors, we calculate the maximum, minimum, mean, median, standard deviation, the 0.05/0.95 quantiles, and the upper/lower limits of agreement (loa). In addition, the ICC (intraclass correlation coefficient) with the respective 0.05/0.95 quantiles is calculated. All metrics are calculated for all columns that are available in both the predicted parameters and the reference. It is up to you to decide whether a specific error metric makes sense for a given parameter.
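As a rough sketch of the summary statistics, the snippet below aggregates the raw error of para1 from the first example. The limits of agreement are computed as mean ± 1.96 · std, which is an assumption inferred from the example output rather than taken from the library source. The ICC is left out because it requires a dedicated statistics routine.

```python
import pandas as pd

# Raw error (predicted - reference) of "para1" from the first example.
error = pd.Series([4, -3, -2, 1], dtype=float)

mean = error.mean()
std = error.std()  # pandas uses the sample standard deviation (ddof=1)

summary = pd.Series(
    {
        "error_mean": mean,
        "error_std": std,
        "error_median": error.median(),
        "error_q05": error.quantile(0.05),
        "error_q95": error.quantile(0.95),
        "error_max": error.max(),
        "error_min": error.min(),
        # Assumption: limits of agreement as mean -/+ 1.96 * std.
        "error_loa_lower": mean - 1.96 * std,
        "error_loa_upper": mean + 1.96 * std,
    }
)
print(summary)
```

Under this assumption, the values agree with the error_* rows for para1 shown in the first example output.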

In addition, the number of common entries (based on the id_column), additional entries in the reference, and additional entries in the predicted values are calculated. These metrics are helpful, as parameter errors are only calculated for entries that are present in both inputs. Entries in the predicted and the reference values are matched based on the column/index name specified by the id_column parameter. For a normal output of the temporal or spatial parameter calculation, use id_column="s_id". If you are using the “pretty” output of these calculations, use id_column="stride id". In case you used custom calculations/aggregations, make sure you have a column/index that contains unique ids that match correctly between the predicted and the reference parameters. Parameters with np.nan are considered to be missing for the respective entry.
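The matching logic can be pictured with pandas index operations. The sketch below is illustrative only: the column name stride_length and all values are hypothetical, and it assumes that an entry whose value is np.nan is dropped before matching, so that it counts as an additional entry of the other input.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "stride_length" is an invented parameter name.
predicted = pd.DataFrame(
    {"stride_length": [1.2, 1.3, np.nan, 1.1]},
    index=pd.Index([0, 1, 2, 4], name="s_id"),
)
reference = pd.DataFrame(
    {"stride_length": [1.0, 1.4, 1.2, 1.3]},
    index=pd.Index([0, 1, 2, 3], name="s_id"),
)

# Assumption: NaN values are treated as missing entries before matching.
pred = predicted["stride_length"].dropna()
ref = reference["stride_length"].dropna()

# Entries are aligned by their id; only common ids contribute to the errors.
common = pred.index.intersection(ref.index)
n_common = len(common)
n_additional_reference = len(ref.index.difference(pred.index))
n_additional_predicted = len(pred.index.difference(ref.index))

error = pred.loc[common] - ref.loc[common]
print(n_common, n_additional_reference, n_additional_predicted)
```

Here only the ids 0 and 1 contribute to the error; the ids 2 and 3 exist only in the reference (id 2 because the predicted value is NaN), and id 4 exists only in the predicted values.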

The metrics can either be calculated per sensor or for all sensors combined (see the calculate_per_sensor parameter). In the latter case, the error per entry is calculated first, and then the entries of all sensors are combined before the summary metrics (mean, std, …) are calculated. This might be desired if you have one sensor per foot, but want statistics over all entries independent of the foot.
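The difference between the two modes can be sketched as follows, using the sensor values from the multi-sensor example in the Examples section. This is a simplified illustration with plain pandas, not the library code.

```python
import pandas as pd

# Values from the multi-sensor example in the Examples section.
predicted = {
    "left_sensor": pd.Series([23, 82, 42]),
    "right_sensor": pd.Series([26, -58, -3]),
}
reference = {
    "left_sensor": pd.Series([21, 86, 65]),
    "right_sensor": pd.Series([96, -78, 86]),
}

# The per-entry error is always calculated within each sensor.
errors = {name: predicted[name] - reference[name] for name in predicted}

# Per sensor: one summary value per sensor.
per_sensor_mean = {name: err.mean() for name, err in errors.items()}

# Combined: pool the per-entry errors of all sensors, then summarize.
combined_mean = pd.concat(errors).mean()

print(per_sensor_mean, combined_mean)
```

The per-sensor means match the error_mean row of the second example, and the pooled mean matches the error_mean row of the third example (calculate_per_sensor=False).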

Parameters:
reference_parameter

The reference that the predicted values should be compared against. This must be the same type (i.e. single/multi sensor) as the predicted input. Further, sensor names, column names, and unique ids must match those of predicted_parameter.

predicted_parameter

The predicted parameter values. Usually, this is the output of the temporal or spatial parameter calculation, but you can also pass a custom calculation/aggregation. Make sure you adjust the id_column parameter accordingly. This can be a DataFrame or a dict of such DataFrames.

calculate_per_sensor

A bool that can be set to False if you wish to calculate error metrics as if the entries were all taken by one sensor. Default is True.

scoring_errors

How to handle errors during the scoring. Can be one of ignore, warn, or raise. Default is warn. At the moment, this only affects the calculation of the ICC. In case of ignore, we will also ignore warnings that might be raised during the calculation. In all cases the value for a given metric is set to np.nan.

id_column

The name of the column/index that contains unique entry ids. This will be used to align the predicted and reference parameters. For a normal output of the temporal or spatial parameter calculation, use id_column="s_id" (the default). If you are using the “pretty” output of these calculations, use id_column="stride id". For custom calculations/aggregations, make sure you have a column/index that contains unique ids.

Returns:
output

A DataFrame with one row per error metric and one column per parameter. In case of a multi-sensor input (and calculate_per_sensor=True), the DataFrame has 2 column levels. The first level is the sensor name and the second one is the parameter name.

Notes

We are using pandas.quantile() to calculate the quantiles. In case the input is NaN/Inf for some values, the quantile function might return NaN/Inf as well. Even if the method returns a value, note that pandas calculates the quantiles by first removing NaN values and then passing the remaining values to numpy.percentile(). The result will be different from passing the data to numpy.percentile() directly. These NaN errors might happen for the relative errors if the reference parameter is 0.
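The following sketch illustrates the NaN behaviour described above with invented values. A reference value of 0 combined with a predicted value of 0 yields a NaN relative error; the pandas quantile then skips that entry, while numpy.percentile() on the raw array propagates the NaN.

```python
import numpy as np
import pandas as pd

# Invented values: the second entry has a reference (and predicted) value of 0.
predicted = pd.Series([1.0, 0.0, 3.0, 4.0])
reference = pd.Series([2.0, 0.0, 4.0, 8.0])

# 0 / 0 results in NaN for the second entry.
rel_error = (predicted - reference) / reference

# pandas drops the NaN before calculating the quantile ...
pandas_q50 = rel_error.quantile(0.5)

# ... while numpy.percentile on the raw values returns NaN.
numpy_q50 = np.percentile(rel_error.to_numpy(), 50)

print(pandas_q50, numpy_q50)
```

The pandas result is therefore based on three of the four entries only, which is worth keeping in mind when interpreting the quantiles of relative errors.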

Examples

>>> predicted_param = pd.DataFrame({"para1": [7, 3, 5, 9], "para2": [7, -1, 7, -6]}).rename_axis("trial id")
>>> reference = pd.DataFrame({"para1": [3, 6, 7, 8], "para2": [-7, -1, 6, -5]}).rename_axis("trial id")
>>> calculate_aggregated_parameter_errors(
...     predicted_parameter=predicted_param,
...     reference_parameter=reference,
...     id_column="trial id",
... )  
                             para1      para2
predicted_mean            6.000000   1.750000
reference_mean            6.000000  -1.750000
error_mean                0.000000   3.500000
abs_error_mean            2.500000   4.000000
rel_error_mean            0.168155  -0.408333
abs_rel_error_mean        0.561012   0.591667
predicted_std             2.581989   6.396614
reference_std             2.160247   5.737305
error_std                 3.162278   7.047458
abs_error_std             1.290994   6.683313
rel_error_std             0.818928   1.064712
abs_rel_error_std         0.537307   0.942956
predicted_median          6.000000   3.000000
reference_median          6.500000  -3.000000
error_median             -0.500000   0.500000
abs_error_median          2.500000   1.000000
rel_error_median         -0.080357   0.083333
abs_rel_error_median      0.392857   0.183333
predicted_q05             3.300000  -5.250000
reference_q05             3.450000  -6.700000
error_q05                -2.850000  -0.850000
abs_error_q05             1.150000   0.150000
rel_error_q05            -0.467857  -1.700000
abs_rel_error_q05         0.149107   0.025000
predicted_q95             8.700000   7.000000
reference_q95             7.850000   4.950000
error_q95                 3.550000  12.050000
abs_error_q95             3.850000  12.050000
rel_error_q95             1.152083   0.195000
abs_rel_error_q95         1.208333   1.730000
predicted_max             9.000000   7.000000
reference_max             8.000000   6.000000
error_max                 4.000000  14.000000
abs_error_max             4.000000  14.000000
rel_error_max             1.333333   0.200000
abs_rel_error_max         1.333333   2.000000
predicted_min             3.000000  -6.000000
reference_min             3.000000  -7.000000
error_min                -3.000000  -1.000000
abs_error_min             1.000000   0.000000
rel_error_min            -0.500000  -2.000000
abs_rel_error_min         0.125000   0.000000
predicted_loa_lower       0.939302 -10.787363
reference_loa_lower       1.765916 -12.995117
error_loa_lower          -6.198064 -10.313018
abs_error_loa_lower      -0.030349  -9.099293
rel_error_loa_lower      -1.436945  -2.495168
abs_rel_error_loa_lower  -0.492111  -1.256528
predicted_loa_upper      11.060698  14.287363
reference_loa_upper      10.234084   9.495117
error_loa_upper           6.198064  17.313018
abs_error_loa_upper       5.030349  17.099293
rel_error_loa_upper       1.773254   1.678502
abs_rel_error_loa_upper   1.614135   2.439861
icc                       0.256198   0.328814
icc_q05                  -0.710000  -0.670000
icc_q95                   0.920000   0.940000
n_common                  4.000000   4.000000
n_additional_reference    0.000000   0.000000
n_additional_predicted    0.000000   0.000000
>>> pd.set_option("display.max_columns", None)
>>> pd.set_option("display.width", 0)
>>> predicted_sensor_left = pd.DataFrame(columns=["para"], data=[23, 82, 42]).rename_axis("s_id")
>>> reference_sensor_left = pd.DataFrame(columns=["para"], data=[21, 86, 65]).rename_axis("s_id")
>>> predicted_sensor_right = pd.DataFrame(columns=["para"], data=[26, -58, -3]).rename_axis("s_id")
>>> reference_sensor_right = pd.DataFrame(columns=["para"], data=[96, -78, 86]).rename_axis("s_id")
>>> calculate_aggregated_parameter_errors(
...     predicted_parameter={"left_sensor": predicted_sensor_left, "right_sensor": predicted_sensor_right},
...     reference_parameter={"left_sensor": reference_sensor_left, "right_sensor": reference_sensor_right},
...     id_column="s_id",
... )  
                        left_sensor right_sensor
                               para         para
predicted_mean            49.000000   -11.666667
reference_mean            57.333333    34.666667
error_mean                -8.333333   -46.333333
abs_error_mean             9.666667    59.666667
rel_error_mean            -0.101707    -0.673487
abs_rel_error_mean         0.165199     0.673487
predicted_std             30.116441    42.665365
reference_std             33.171273    97.700222
error_std                 13.051181    58.226569
abs_error_std             11.590226    35.641736
rel_error_std              0.229574     0.392212
abs_rel_error_std          0.165180     0.392212
predicted_median          42.000000    -3.000000
reference_median          65.000000    86.000000
error_median              -4.000000   -70.000000
abs_error_median           4.000000    70.000000
rel_error_median          -0.046512    -0.729167
abs_rel_error_median       0.095238     0.729167
predicted_q05             24.900000   -52.500000
reference_q05             25.400000   -61.600000
error_q05                -21.100000   -87.100000
abs_error_q05              2.200000    25.000000
rel_error_q05             -0.323113    -1.004312
abs_rel_error_q05          0.051384     0.303686
predicted_q95             78.000000    23.100000
reference_q95             83.900000    95.000000
error_q95                  1.400000    11.000000
abs_error_q95             21.100000    87.100000
rel_error_q95              0.081063    -0.303686
abs_rel_error_q95          0.327985     1.004312
predicted_max             82.000000    26.000000
reference_max             86.000000    96.000000
error_max                  2.000000    20.000000
abs_error_max             23.000000    89.000000
rel_error_max              0.095238    -0.256410
abs_rel_error_max          0.353846     1.034884
predicted_min             23.000000   -58.000000
reference_min             21.000000   -78.000000
error_min                -23.000000   -89.000000
abs_error_min              2.000000    20.000000
rel_error_min             -0.353846    -1.034884
abs_rel_error_min          0.046512     0.256410
predicted_loa_lower      -10.028224   -95.290781
reference_loa_lower       -7.682361  -156.825768
error_loa_lower          -33.913649  -160.457409
abs_error_loa_lower      -13.050176   -10.191136
rel_error_loa_lower       -0.551671    -1.442223
abs_rel_error_loa_lower   -0.158554    -0.095249
predicted_loa_upper      108.028224    71.957448
reference_loa_upper      122.349028   226.159101
error_loa_upper           17.246982    67.790742
abs_error_loa_upper       32.383509   129.524469
rel_error_loa_upper        0.348258     0.095249
abs_rel_error_loa_upper    0.488952     1.442223
icc                        0.909121     0.628853
icc_q05                    0.130000    -0.570000
icc_q95                    1.000000     0.990000
n_additional_predicted     0.000000     0.000000
n_additional_reference     0.000000     0.000000
n_common                   3.000000     3.000000
>>> calculate_aggregated_parameter_errors(
...     predicted_parameter={"left_sensor": predicted_sensor_left, "right_sensor": predicted_sensor_right},
...     reference_parameter={"left_sensor": reference_sensor_left, "right_sensor": reference_sensor_right},
...     calculate_per_sensor=False,
... )  
                               para
predicted_mean            18.666667
reference_mean            46.000000
error_mean               -27.333333
abs_error_mean            34.666667
rel_error_mean            -0.387597
abs_rel_error_mean         0.419343
predicted_std             46.851539
reference_std             66.425899
error_std                 43.098337
abs_error_std             36.219700
rel_error_std              0.425081
abs_rel_error_std          0.387238
predicted_median          24.500000
reference_median          75.500000
error_median             -13.500000
abs_error_median          21.500000
rel_error_median          -0.305128
abs_rel_error_median       0.305128
predicted_q05            -44.250000
reference_q05            -53.250000
error_q05                -84.250000
abs_error_q05              2.500000
rel_error_q05             -0.958454
abs_rel_error_q05          0.058693
predicted_q95             72.000000
reference_q95             93.500000
error_q95                 15.500000
abs_error_q95             84.250000
rel_error_q95              0.059801
abs_rel_error_q95          0.958454
predicted_max             82.000000
reference_max             96.000000
error_max                 20.000000
abs_error_max             89.000000
rel_error_max              0.095238
abs_rel_error_max          1.034884
predicted_min            -58.000000
reference_min            -78.000000
error_min                -89.000000
abs_error_min              2.000000
rel_error_min             -1.034884
abs_rel_error_min          0.046512
predicted_loa_lower      -73.162349
reference_loa_lower      -84.194761
error_loa_lower         -111.806074
abs_error_loa_lower      -36.323945
rel_error_loa_lower       -1.220755
abs_rel_error_loa_lower   -0.339643
predicted_loa_upper      110.495682
reference_loa_upper      176.194761
error_loa_upper           57.139408
abs_error_loa_upper      105.657279
rel_error_loa_upper        0.445561
abs_rel_error_loa_upper    1.178329
icc                        0.663797
icc_q05                   -0.090000
icc_q95                    0.940000
n_additional_predicted     0.000000
n_additional_reference     0.000000
n_common                   6.000000