gaitmap.evaluation_utils.calculate_aggregated_parameter_errors
- gaitmap.evaluation_utils.calculate_aggregated_parameter_errors(*, reference_parameter: DataFrame | dict[Union[collections.abc.Hashable, str], pandas.core.frame.DataFrame], predicted_parameter: DataFrame | dict[Union[collections.abc.Hashable, str], pandas.core.frame.DataFrame], calculate_per_sensor: bool = True, scoring_errors: Literal['ignore', 'warn', 'raise'] = 'warn', id_column: str = 's_id') -> DataFrame
Calculate various error metrics between a predicted parameter and a given ground truth.
This method can be applied to stride-level parameters or to parameters aggregated over a gait test/participant/… . In both cases, the reference and the predicted values are simply aligned based on the specified id_column. All non-common entries are ignored for the calculation of the error metrics.
In general, we calculate four different groups of errors:
- The error between the predicted and the reference value (predicted - reference)
- The relative error between the predicted and the reference value ((predicted - reference) / reference)
- The absolute error between the predicted and the reference value (abs(predicted - reference))
- The absolute relative error between the predicted and the reference value (abs(predicted - reference) / abs(reference))
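The four error groups listed above can be reproduced by hand with plain pandas. The following is a minimal sketch, not the library's implementation; it uses the "para1" values from the first example on this page, and the variable names are chosen here for illustration only.

```python
import pandas as pd

# "para1" values from the first example on this page.
predicted = pd.Series([7, 3, 5, 9], name="para1")
reference = pd.Series([3, 6, 7, 8], name="para1")

# The four error groups, computed entry by entry.
error = predicted - reference                  # signed error
rel_error = error / reference                  # signed relative error
abs_error = error.abs()                        # absolute error
abs_rel_error = error.abs() / reference.abs()  # absolute relative error

print(pd.DataFrame(
    {
        "error": error,
        "rel_error": rel_error,
        "abs_error": abs_error,
        "abs_rel_error": abs_rel_error,
    }
).mean())
```

The printed means correspond to the error_mean, rel_error_mean, abs_error_mean and abs_rel_error_mean rows of the "para1" column in that example.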
For each of these groups of errors, we calculate the maximum, minimum, mean, median, standard deviation, the 0.05/0.95 quantiles, and the upper/lower limits of agreement (loa). In addition, the ICC (intraclass correlation coefficient) with the respective 0.05/0.95 quantiles is calculated. All metrics are calculated for all columns that are available in both the predicted parameters and the reference. It is up to you to decide whether a specific error metric makes sense for a given parameter.
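As a rough illustration of these summary metrics, the sketch below summarizes one error series with pandas. It assumes that the limits of agreement follow the common Bland-Altman convention of mean ± 1.96 × sample standard deviation; this assumption is consistent with the example values on this page, but it is not taken from the library's source code. The helper name summarize is hypothetical, and the ICC is left out.

```python
import pandas as pd

# Signed errors (predicted - reference) of "para1" from the first example.
error = pd.Series([4.0, -3.0, -2.0, 1.0])


def summarize(values: pd.Series) -> pd.Series:
    """Summarize one error series (hypothetical helper, not part of gaitmap)."""
    mean = values.mean()
    std = values.std()  # pandas default: sample standard deviation (ddof=1)
    return pd.Series(
        {
            "mean": mean,
            "std": std,
            "median": values.median(),
            "q05": values.quantile(0.05),
            "q95": values.quantile(0.95),
            "max": values.max(),
            "min": values.min(),
            # Assumed convention: mean +/- 1.96 * std.
            "loa_lower": mean - 1.96 * std,
            "loa_upper": mean + 1.96 * std,
        }
    )


summary = summarize(error)
print(summary)
```

With these inputs the values line up with the error_* rows of "para1" in the first example (for instance error_q05 = -2.85 and error_loa_upper ≈ 6.198).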
In addition, the number of common entries (based on the id_column), the number of additional entries in the reference, and the number of additional entries in the predicted values are calculated. These metrics are helpful, as parameter errors are only calculated for entries that are present in both inputs.
Entries of the predicted and the reference parameters are matched based on the column/index name specified by the id_column parameter. For a normal output of the temporal or spatial parameter calculation, use id_column="s_id". If you are using the “pretty” output of these calculations, use id_column="stride id". In case you used custom calculations/aggregations, make sure you have a column/index that contains unique ids that match correctly between the predicted and the reference parameters. Parameters with np.nan are considered to be missing for the respective entry.
The metrics can either be calculated per sensor or for all sensors combined (see the calculate_per_sensor parameter). In the latter case, the error per entry is calculated first, and then the entries of all sensors are combined before calculating the summary metrics (mean, std, …). This might be desired if you have one sensor per foot, but want statistics over all entries independent of the foot.
- Parameters:
- reference_parameter
The reference the predicted values should be compared against. This must be the same type (i.e. single/multi sensor) as the predicted input. Further, sensor names, column names, and unique ids must match those of predicted_parameter.
- predicted_parameter
The predicted parameter values. Usually, this is the output of the temporal or spatial parameter calculation, but you can also pass a custom calculation/aggregation. Make sure you adjust the id_column parameter accordingly. This can be a DataFrame or a dict of such DataFrames.
- calculate_per_sensor
A bool that can be set to False if you wish to calculate error metrics as if the entries were all taken by one sensor. Default is True.
- scoring_errors
How to handle errors during the scoring. Can be one of ignore, warn, or raise. Default is warn. At the moment, this only affects the calculation of the ICC. In case of ignore, we will also ignore warnings that might be raised during the calculation. In all cases the value for a given metric is set to np.nan.
- id_column
The name of the column/index that contains unique entry ids. This will be used to align the predicted and reference parameters. For a normal output of the temporal or spatial parameter calculation, use id_column="s_id" (default). If you are using the “pretty” output of these calculations, use id_column="stride id". For custom calculations/aggregations, make sure you have a column/index that contains unique ids.
- Returns:
- output
A DataFrame with one row per error metric and one column per parameter. In case of a multi-sensor input (and calculate_per_sensor=True), the DataFrame has 2 column levels. The first level is the sensor name and the second one the parameter name.
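To make the difference between the per-sensor and the combined output more concrete, here is a small pandas-only sketch. It mimics the two output shapes for a single metric (the mean signed error) and is not the library's code; the data are those of the multi-sensor example on this page, and all variable names are illustrative.

```python
import pandas as pd

# Data from the multi-sensor example on this page.
predicted = {
    "left_sensor": pd.DataFrame({"para": [23, 82, 42]}).rename_axis("s_id"),
    "right_sensor": pd.DataFrame({"para": [26, -58, -3]}).rename_axis("s_id"),
}
reference = {
    "left_sensor": pd.DataFrame({"para": [21, 86, 65]}).rename_axis("s_id"),
    "right_sensor": pd.DataFrame({"para": [96, -78, 86]}).rename_axis("s_id"),
}

# Error per entry, computed separately for every sensor.
errors = {name: predicted[name] - reference[name] for name in predicted}

# Per sensor: summarize each sensor on its own and place the results side by side.
# Concatenating a dict along axis=1 yields two column levels (sensor, parameter).
per_sensor = pd.concat(
    {name: err.mean().to_frame("error_mean").T for name, err in errors.items()},
    axis=1,
)

# Combined: pool the per-entry errors of all sensors first, then summarize once.
pooled = pd.concat(list(errors.values()), ignore_index=True)
combined = pooled.mean().to_frame("error_mean").T

print(per_sensor)
print(combined)
```

A single value of the two-level result can then be selected with a tuple, for example per_sensor.loc["error_mean", ("left_sensor", "para")].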
Notes
We are using pandas.DataFrame.quantile() to calculate the quantiles. In case the input is NaN/Inf for some values, the quantile function might return NaN/Inf as well. Even if the method returns a value, note that pandas calculates the quantiles by first removing NaN values and then passing the remaining values to numpy.percentile(). The result will be different from passing the data to numpy.percentile() directly. These NaN/Inf values can occur for the relative errors if the reference parameter is 0.
Examples
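The quantile behaviour described in the notes can be checked in isolation. The sketch below uses made-up numbers and only pandas and NumPy calls; it shows that a pandas quantile skips NaN while numpy.percentile() propagates it, and how a reference value of 0 produces Inf or NaN relative errors.

```python
import numpy as np
import pandas as pd

values = pd.Series([1.0, 2.0, np.nan, 4.0])

# pandas drops the NaN first, so this is the median of [1, 2, 4].
pandas_median = values.quantile(0.5)

# numpy.percentile() on the raw data propagates the NaN.
numpy_median = np.percentile(values.to_numpy(), 50)

# Relative errors with a reference of 0: x / 0 gives inf, 0 / 0 gives NaN.
predicted = pd.Series([1.0, 2.0, 0.0])
reference = pd.Series([2.0, 0.0, 0.0])
rel_error = (predicted - reference) / reference

print(pandas_median, numpy_median)
print(rel_error.tolist())
```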
>>> import pandas as pd
>>> from gaitmap.evaluation_utils import calculate_aggregated_parameter_errors
>>> predicted_param = pd.DataFrame({"para1": [7, 3, 5, 9], "para2": [7, -1, 7, -6]}).rename_axis("trial id")
>>> reference = pd.DataFrame({"para1": [3, 6, 7, 8], "para2": [-7, -1, 6, -5]}).rename_axis("trial id")
>>> calculate_aggregated_parameter_errors(
...     predicted_parameter=predicted_param,
...     reference_parameter=reference,
...     id_column="trial id",
... )
                             para1       para2
predicted_mean            6.000000    1.750000
reference_mean            6.000000   -1.750000
error_mean                0.000000    3.500000
abs_error_mean            2.500000    4.000000
rel_error_mean            0.168155   -0.408333
abs_rel_error_mean        0.561012    0.591667
predicted_std             2.581989    6.396614
reference_std             2.160247    5.737305
error_std                 3.162278    7.047458
abs_error_std             1.290994    6.683313
rel_error_std             0.818928    1.064712
abs_rel_error_std         0.537307    0.942956
predicted_median          6.000000    3.000000
reference_median          6.500000   -3.000000
error_median             -0.500000    0.500000
abs_error_median          2.500000    1.000000
rel_error_median         -0.080357    0.083333
abs_rel_error_median      0.392857    0.183333
predicted_q05             3.300000   -5.250000
reference_q05             3.450000   -6.700000
error_q05                -2.850000   -0.850000
abs_error_q05             1.150000    0.150000
rel_error_q05            -0.467857   -1.700000
abs_rel_error_q05         0.149107    0.025000
predicted_q95             8.700000    7.000000
reference_q95             7.850000    4.950000
error_q95                 3.550000   12.050000
abs_error_q95             3.850000   12.050000
rel_error_q95             1.152083    0.195000
abs_rel_error_q95         1.208333    1.730000
predicted_max             9.000000    7.000000
reference_max             8.000000    6.000000
error_max                 4.000000   14.000000
abs_error_max             4.000000   14.000000
rel_error_max             1.333333    0.200000
abs_rel_error_max         1.333333    2.000000
predicted_min             3.000000   -6.000000
reference_min             3.000000   -7.000000
error_min                -3.000000   -1.000000
abs_error_min             1.000000    0.000000
rel_error_min            -0.500000   -2.000000
abs_rel_error_min         0.125000    0.000000
predicted_loa_lower       0.939302  -10.787363
reference_loa_lower       1.765916  -12.995117
error_loa_lower          -6.198064  -10.313018
abs_error_loa_lower      -0.030349   -9.099293
rel_error_loa_lower      -1.436945   -2.495168
abs_rel_error_loa_lower  -0.492111   -1.256528
predicted_loa_upper      11.060698   14.287363
reference_loa_upper      10.234084    9.495117
error_loa_upper           6.198064   17.313018
abs_error_loa_upper       5.030349   17.099293
rel_error_loa_upper       1.773254    1.678502
abs_rel_error_loa_upper   1.614135    2.439861
icc                       0.256198    0.328814
icc_q05                  -0.710000   -0.670000
icc_q95                   0.920000    0.940000
n_additional_predicted    0.000000    0.000000
n_additional_reference    0.000000    0.000000
n_common                  4.000000    4.000000
>>> pd.set_option("display.max_columns", None)
>>> pd.set_option("display.width", 0)
>>> predicted_sensor_left = pd.DataFrame(columns=["para"], data=[23, 82, 42]).rename_axis("s_id")
>>> reference_sensor_left = pd.DataFrame(columns=["para"], data=[21, 86, 65]).rename_axis("s_id")
>>> predicted_sensor_right = pd.DataFrame(columns=["para"], data=[26, -58, -3]).rename_axis("s_id")
>>> reference_sensor_right = pd.DataFrame(columns=["para"], data=[96, -78, 86]).rename_axis("s_id")
>>> calculate_aggregated_parameter_errors(
...     predicted_parameter={"left_sensor": predicted_sensor_left, "right_sensor": predicted_sensor_right},
...     reference_parameter={"left_sensor": reference_sensor_left, "right_sensor": reference_sensor_right},
...     id_column="s_id",
... )
                        left_sensor right_sensor
                               para         para
predicted_mean            49.000000   -11.666667
reference_mean            57.333333    34.666667
error_mean                -8.333333   -46.333333
abs_error_mean             9.666667    59.666667
rel_error_mean            -0.101707    -0.673487
abs_rel_error_mean         0.165199     0.673487
predicted_std             30.116441    42.665365
reference_std             33.171273    97.700222
error_std                 13.051181    58.226569
abs_error_std             11.590226    35.641736
rel_error_std              0.229574     0.392212
abs_rel_error_std          0.165180     0.392212
predicted_median          42.000000    -3.000000
reference_median          65.000000    86.000000
error_median              -4.000000   -70.000000
abs_error_median           4.000000    70.000000
rel_error_median          -0.046512    -0.729167
abs_rel_error_median       0.095238     0.729167
predicted_q05             24.900000   -52.500000
reference_q05             25.400000   -61.600000
error_q05                -21.100000   -87.100000
abs_error_q05              2.200000    25.000000
rel_error_q05             -0.323113    -1.004312
abs_rel_error_q05          0.051384     0.303686
predicted_q95             78.000000    23.100000
reference_q95             83.900000    95.000000
error_q95                  1.400000    11.000000
abs_error_q95             21.100000    87.100000
rel_error_q95              0.081063    -0.303686
abs_rel_error_q95          0.327985     1.004312
predicted_max             82.000000    26.000000
reference_max             86.000000    96.000000
error_max                  2.000000    20.000000
abs_error_max             23.000000    89.000000
rel_error_max              0.095238    -0.256410
abs_rel_error_max          0.353846     1.034884
predicted_min             23.000000   -58.000000
reference_min             21.000000   -78.000000
error_min                -23.000000   -89.000000
abs_error_min              2.000000    20.000000
rel_error_min             -0.353846    -1.034884
abs_rel_error_min          0.046512     0.256410
predicted_loa_lower      -10.028224   -95.290781
reference_loa_lower       -7.682361  -156.825768
error_loa_lower          -33.913649  -160.457409
abs_error_loa_lower      -13.050176   -10.191136
rel_error_loa_lower       -0.551671    -1.442223
abs_rel_error_loa_lower   -0.158554    -0.095249
predicted_loa_upper      108.028224    71.957448
reference_loa_upper      122.349028   226.159101
error_loa_upper           17.246982    67.790742
abs_error_loa_upper       32.383509   129.524469
rel_error_loa_upper        0.348258     0.095249
abs_rel_error_loa_upper    0.488952     1.442223
icc                        0.909121     0.628853
icc_q05                    0.130000    -0.570000
icc_q95                    1.000000     0.990000
n_additional_predicted     0.000000     0.000000
n_additional_reference     0.000000     0.000000
n_common                   3.000000     3.000000
>>> calculate_aggregated_parameter_errors(
...     predicted_parameter={"left_sensor": predicted_sensor_left, "right_sensor": predicted_sensor_right},
...     reference_parameter={"left_sensor": reference_sensor_left, "right_sensor": reference_sensor_right},
...     calculate_per_sensor=False,
... )
                                para
predicted_mean             18.666667
reference_mean             46.000000
error_mean                -27.333333
abs_error_mean             34.666667
rel_error_mean             -0.387597
abs_rel_error_mean          0.419343
predicted_std              46.851539
reference_std              66.425899
error_std                  43.098337
abs_error_std              36.219700
rel_error_std               0.425081
abs_rel_error_std           0.387238
predicted_median           24.500000
reference_median           75.500000
error_median              -13.500000
abs_error_median           21.500000
rel_error_median           -0.305128
abs_rel_error_median        0.305128
predicted_q05             -44.250000
reference_q05             -53.250000
error_q05                 -84.250000
abs_error_q05               2.500000
rel_error_q05              -0.958454
abs_rel_error_q05           0.058693
predicted_q95              72.000000
reference_q95              93.500000
error_q95                  15.500000
abs_error_q95              84.250000
rel_error_q95               0.059801
abs_rel_error_q95           0.958454
predicted_max              82.000000
reference_max              96.000000
error_max                  20.000000
abs_error_max              89.000000
rel_error_max               0.095238
abs_rel_error_max           1.034884
predicted_min             -58.000000
reference_min             -78.000000
error_min                 -89.000000
abs_error_min               2.000000
rel_error_min              -1.034884
abs_rel_error_min           0.046512
predicted_loa_lower       -73.162349
reference_loa_lower       -84.194761
error_loa_lower          -111.806074
abs_error_loa_lower       -36.323945
rel_error_loa_lower        -1.220755
abs_rel_error_loa_lower    -0.339643
predicted_loa_upper       110.495682
reference_loa_upper       176.194761
error_loa_upper            57.139408
abs_error_loa_upper       105.657279
rel_error_loa_upper         0.445561
abs_rel_error_loa_upper     1.178329
icc                         0.663797
icc_q05                    -0.090000
icc_q95                     0.940000
n_additional_predicted      0.000000
n_additional_reference      0.000000
n_common                    6.000000
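All examples above use inputs whose ids match completely, so n_additional_predicted and n_additional_reference are always 0. The following pandas-only sketch illustrates the matching rules described earlier (alignment on the id index, NaN treated as missing) with made-up data. It is an interpretation of the documented behaviour, not the library's implementation, and the variable names are chosen for illustration.

```python
import numpy as np
import pandas as pd

# Made-up inputs: id 0 exists only in the prediction, id 4 only in the reference,
# and the predicted value for id 3 is NaN, so it counts as missing.
predicted = pd.DataFrame(
    {"para": [1.0, 2.0, 3.0, np.nan]}, index=pd.Index([0, 1, 2, 3], name="s_id")
)
reference = pd.DataFrame(
    {"para": [1.5, 2.5, 3.5]}, index=pd.Index([1, 2, 4], name="s_id")
)

# Drop missing values per parameter before matching the ids.
pred = predicted["para"].dropna()
ref = reference["para"].dropna()

common_ids = pred.index.intersection(ref.index)
n_common = len(common_ids)
n_additional_predicted = len(pred.index.difference(ref.index))
n_additional_reference = len(ref.index.difference(pred.index))

# Errors are only defined for entries present in both inputs.
error = pred.loc[common_ids] - ref.loc[common_ids]

print(n_common, n_additional_predicted, n_additional_reference)
print(error)
```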