gaitmap.evaluation_utils.evaluate_segmented_stride_list
- gaitmap.evaluation_utils.evaluate_segmented_stride_list(*, ground_truth: DataFrame | dict[Union[collections.abc.Hashable, str], pandas.core.frame.DataFrame], segmented_stride_list: DataFrame | dict[Union[collections.abc.Hashable, str], pandas.core.frame.DataFrame], tolerance: int | float = 0, one_to_one: bool = True, stride_list_postfix: str = '', ground_truth_postfix: str = '_ground_truth') -> DataFrame | dict[Union[collections.abc.Hashable, str], pandas.core.frame.DataFrame]
Find True Positives, False Positives and False Negatives by comparing a segmented stride list with ground truth.
This compares a segmented stride list with a ground truth segmented stride list and returns True Positive, False Positive and False Negative matches. The comparison is based purely on the start and end values of each stride in the lists. Two strides are considered a positive match if both their start and their end values differ by no more than the tolerance.
By default (controlled by the one_to_one parameter), if multiple strides of the segmented stride list would match a single ground truth stride (or vice versa), only the stride with the lowest distance is considered an actual match. If one_to_one is set to False, all matches are considered True Positives. This might lead to unexpected results in certain cases and should not be used to calculate traditional metrics like precision and recall.
It is highly recommended to order the stride lists and remove strides with large overlaps before applying this method to get reliable results.
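The matching rule described above can be illustrated with a small pure-Python sketch. This is not gaitmap's implementation: the helper names stride_distance and candidate_matches are hypothetical, and taking the distance as the larger of the start and end differences is an assumption chosen so that "distance <= tolerance" is equivalent to both values being within the tolerance.

```python
def stride_distance(stride_a, stride_b):
    """Larger of the absolute start and end differences (assumed distance)."""
    return max(abs(stride_a[0] - stride_b[0]), abs(stride_a[1] - stride_b[1]))


def candidate_matches(segmented, ground_truth, tolerance=0):
    """Return every (segmented_index, ground_truth_index) pair within tolerance.

    This corresponds to the situation before any one-to-one resolution:
    a stride may appear in more than one pair.
    """
    return [
        (i, j)
        for i, seg in enumerate(segmented)
        for j, gt in enumerate(ground_truth)
        if stride_distance(seg, gt) <= tolerance
    ]


# Same (start, end) values as in the Examples section of this page.
ground_truth = [(10, 21), (20, 34), (31, 40)]
segmented = [(10, 20), (21, 30), (31, 40), (50, 60)]
pairs = candidate_matches(segmented, ground_truth, tolerance=2)
print(pairs)  # [(0, 0), (2, 2)]

# Two segmented strides close to the same ground truth stride.
ambiguous = candidate_matches([(10, 20), (11, 21)], [(10, 20)], tolerance=2)
print(ambiguous)  # [(0, 0), (1, 0)]
```

In the second case both segmented strides are candidates for ground truth stride 0. Under the one-to-one behaviour described above, only the pair with the lowest distance, here (0, 0), would be kept; without it, both pairs would count as True Positives.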
- Parameters:
- ground_truth
The ground truth stride list.
- segmented_stride_list
The list of segmented strides.
- tolerance
The allowed tolerance between labels. Its unit depends on the units used in the stride lists. The comparison is done as distance <= tolerance.
- one_to_one
If True, only a single unique match per stride is considered. In case of multiple matches, the one with the lowest distance is considered. In case of multiple matches with the same distance, the first match is considered. If False, multiple matches are possible. If this is set to False, some metrics calculated from these matches might not be well defined!
- stride_list_postfix
A postfix that will be appended to the index name of the segmented stride list in the output.
- ground_truth_postfix
A postfix that will be appended to the index name of the ground truth in the output.
- Returns:
- matches
A 3-column dataframe with the column names s_id{stride_list_postfix}, s_id{ground_truth_postfix} and match_type. Each row is a match containing the index values of the left and the right list that belong together. The match_type column indicates the type of match. For all segmented strides that have a match in the ground truth list, this will be "tp" (true positive). Segmented strides that do not have a match will be mapped to NaN and the match_type will be "fp" (false positive). All ground truth strides that do not have a segmented counterpart are marked as "fn" (false negative). In case MultiSensorStrideLists were used as inputs, a dictionary of such dataframes is returned.
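As a rough sketch of how the match_type labels could be turned into summary metrics, the following counts the "tp", "fp" and "fn" entries and derives precision, recall and F1 from them. The helper precision_recall_f1 is hypothetical and not part of the documented API; returning NaN for empty denominators is an assumed convention. It is only meaningful for matches obtained with one_to_one=True.

```python
from collections import Counter


def precision_recall_f1(match_types):
    """Compute precision, recall and F1 from an iterable of "tp"/"fp"/"fn" labels."""
    counts = Counter(match_types)
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    # NaN is returned when a metric is undefined (assumed convention).
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = float("nan")
    return precision, recall, f1


# match_type values as shown in the first example on this page.
match_types = ["tp", "fp", "tp", "fp", "fn"]
precision, recall, f1 = precision_recall_f1(match_types)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.5 0.667 0.571
```

With a returned dataframe, the match_type column (for example matches["match_type"]) could be passed in directly, since the function only iterates over the labels.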
See also
gaitmap.evaluation_utils.match_stride_lists: Find matching strides between stride lists.
gaitmap.evaluation_utils.evaluate_stride_event_list: Find matching strides between stride event lists.
Examples
>>> import pandas as pd
>>> from gaitmap.evaluation_utils import evaluate_segmented_stride_list
>>> stride_list_ground_truth = pd.DataFrame([[10, 21], [20, 34], [31, 40]], columns=["start", "end"]).rename_axis(
...     "s_id"
... )
>>> stride_list_seg = pd.DataFrame([[10, 20], [21, 30], [31, 40], [50, 60]], columns=["start", "end"]).rename_axis(
...     "s_id"
... )
>>> matches = evaluate_segmented_stride_list(
...     ground_truth=stride_list_ground_truth, segmented_stride_list=stride_list_seg, tolerance=2
... )
>>> matches
  s_id s_id_ground_truth match_type
0    0                 0         tp
1    1               NaN         fp
2    2                 2         tp
3    3               NaN         fp
4  NaN                 1         fn
>>> stride_list_ground_truth_left = pd.DataFrame(
...     [[10, 21], [20, 34], [31, 40]], columns=["start", "end"]
... ).rename_axis("s_id")
>>> stride_list_ground_truth_right = pd.DataFrame(
...     [[10, 21], [20, 34], [31, 40]], columns=["start", "end"]
... ).rename_axis("s_id")
>>> stride_list_seg_left = pd.DataFrame(
...     [[10, 20], [21, 30], [31, 40], [50, 60]], columns=["start", "end"]
... ).rename_axis("s_id")
>>> stride_list_seg_right = pd.DataFrame([[10, 21], [20, 34], [31, 40]], columns=["start", "end"]).rename_axis(
...     "s_id"
... )
>>> matches = evaluate_segmented_stride_list(
...     ground_truth={"left_sensor": stride_list_ground_truth_left, "right_sensor": stride_list_ground_truth_right},
...     segmented_stride_list={"left_sensor": stride_list_seg_left, "right_sensor": stride_list_seg_right},
...     tolerance=2,
... )
>>> matches["left_sensor"]
  s_id s_id_ground_truth match_type
0    0                 0         tp
1    1               NaN         fp
2    2                 2         tp
3    3               NaN         fp
4  NaN                 1         fn