.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/advanced_features/multi_process.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_advanced_features_multi_process.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_advanced_features_multi_process.py:


.. _example_mulit_process:

Running multiple pipelines in parallel
=======================================

When working with large datasets, using multiple CPU cores in parallel can dramatically speed up calculations.
Python is usually bad at this, as the Global Interpreter Lock (GIL) limits every Python process to a single
core.
While gaitmap tries to make use of lower-level C implementations (that release the GIL) for many of the heavy-lifting
tasks, this does not yield the performance increase one might expect from going from a 2-core to a 4-core
processor.
To make proper use of multiple cores, Python - and in turn gaitmap - needs to run multiple separate processes, each of
which can run on a different core.

The following example shows how a gaitmap algorithm (or pipeline of algorithms) can be run in parallel with different
parameter combinations.
A similar multiprocessing approach could be used to compute multiple subjects or recordings in parallel.

.. note:: To get the best performance, you need to select a number of parallel processes that makes sense for your CPU.
          While it might be tempting to set this number to the number of available processing threads, this might not
          always yield the best results.
          Modern CPUs have adaptive clock speeds, and hitting the processor with an all-core load usually results in a
          reduction of per-core performance.
          Hence, more parallel processes might not always result in the best overall performance.
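
As a starting point for choosing the number of processes, the following minimal sketch derives a worker count from the
number of logical CPUs and leaves some of them idle.
The helper `pick_n_jobs` and its parameters are hypothetical and not part of gaitmap or joblib; the best value for your
machine should still be determined by benchmarking.

```python
import os
from typing import Optional


def pick_n_jobs(reserve: int = 1, upper_limit: Optional[int] = None) -> int:
    """Suggest a worker count that leaves `reserve` logical CPUs idle (hypothetical helper)."""
    # os.cpu_count() reports logical CPUs (threads) and may return None.
    available = os.cpu_count() or 1
    # Always use at least one worker, even on a single-core machine.
    n_jobs = max(1, available - reserve)
    if upper_limit is not None:
        n_jobs = min(n_jobs, max(1, upper_limit))
    return n_jobs


print(pick_n_jobs())
```

The returned value could then be passed as `n_jobs` to the process pool used further below.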

The following example shows how you can run a parameter sweep for the stride segmentation algorithm using `joblib`
as a helper module.
Other Python helpers to spawn multiple processes will of course work as well.

.. GENERATED FROM PYTHON SOURCE LINES 33-37

.. code-block:: default


    from pprint import pprint
    from typing import Any


.. GENERATED FROM PYTHON SOURCE LINES 38-40

Load some example data
----------------------

.. GENERATED FROM PYTHON SOURCE LINES 40-49

.. code-block:: default

    from gaitmap.example_data import get_healthy_example_imu_data
    from gaitmap.utils.coordinate_conversion import convert_to_fbf

    data = get_healthy_example_imu_data()
    bf_data = convert_to_fbf(data, left_like="left_", right_like="right_")
    sampling_rate_hz = 204.8

    from sklearn.model_selection import ParameterGrid


.. GENERATED FROM PYTHON SOURCE LINES 50-56

Preparing the stride segmentation
---------------------------------
In this example we simulate a grid search.
To make this example as real-life as possible, we use the sklearn `ParameterGrid` to set up our parameter sweep.
Note that we make use of gaitmap's `set_params` method later.
Hence, we can specify parameters for the nested template object in the parameter grid using the double underscore
("__") notation.
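
To illustrate the idea behind the grid and the double underscore notation, here is a minimal sketch in plain Python.
The helpers `expand_grid` and `split_nested` are hypothetical and only mimic the concept; they are not the sklearn or
gaitmap implementations.

```python
from itertools import product
from typing import Any


def expand_grid(grid: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Return one dict per combination of the listed values (hypothetical helper)."""
    # Sort the keys so that the order of the combinations is deterministic.
    keys = sorted(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


def split_nested(name: str) -> tuple[str, str]:
    """Split "outer__inner" at the first "__"; plain names get an empty inner part."""
    outer, _, inner = name.partition("__")
    return outer, inner


combinations = expand_grid(
    {"max_cost": [1800, 2200], "template__use_cols": [("gyr_ml",), ("gyr_ml", "gyr_si")]}
)
print(len(combinations))
print(split_nested("template__use_cols"))
```

A name such as "template__use_cols" therefore addresses the parameter "use_cols" of the object stored under
"template", while a name without "__" addresses a parameter of the outer object itself.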

.. GENERATED FROM PYTHON SOURCE LINES 56-63

.. code-block:: default

    from gaitmap.stride_segmentation import BarthDtw

    dtw = BarthDtw()
    parameter_grid = ParameterGrid({"max_cost": [1800, 2200], "template__use_cols": [("gyr_ml",), ("gyr_ml", "gyr_si")]})

    pprint(list(parameter_grid))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [{'max_cost': 1800, 'template__use_cols': ('gyr_ml',)},
     {'max_cost': 1800, 'template__use_cols': ('gyr_ml', 'gyr_si')},
     {'max_cost': 2200, 'template__use_cols': ('gyr_ml',)},
     {'max_cost': 2200, 'template__use_cols': ('gyr_ml', 'gyr_si')}]


.. GENERATED FROM PYTHON SOURCE LINES 64-71

Creating a function for multiprocessing
---------------------------------------
To run something in parallel, we need to write a function that will be executed in every process.
The function below returns the entire dtw object, which might not be desired, as each object contains a copy of the
entire data.
Further, the current approach copies the entire data to each process.
This could be further optimized by using a read-only shared memory object for the data.

.. GENERATED FROM PYTHON SOURCE LINES 71-80

.. code-block:: default


    def run(dtw: BarthDtw, parameter: dict[str, Any]) -> BarthDtw:
        # For this run, change the parameters on the dtw object
        dtw = dtw.set_params(**parameter)
        dtw = dtw.segment(data=bf_data, sampling_rate_hz=sampling_rate_hz)
        return dtw
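
The cost of returning the full object can be made visible with a small stand-in.
The sketch below uses a hypothetical `ToyAlgorithm` class (not a gaitmap class) that, like the dtw object, keeps a
reference to its input data, and compares the pickled size of the full object with the size of the result alone.
The pickled size is used here as a rough proxy for what has to be sent back from a worker process.

```python
import pickle
from dataclasses import dataclass, field


@dataclass
class ToyAlgorithm:
    """Hypothetical stand-in for an algorithm object that stores its input data."""

    threshold: float = 0.5
    data: list = field(default_factory=list)
    matches_: list = field(default_factory=list)

    def segment(self, data: list) -> "ToyAlgorithm":
        # Keep the input on the object and store the indices above the threshold.
        self.data = data
        self.matches_ = [i for i, value in enumerate(data) if value > self.threshold]
        return self


signal = [(i % 100) / 100 for i in range(50_000)]


def run_full(threshold: float) -> ToyAlgorithm:
    # Returns the whole object, including the stored input data.
    return ToyAlgorithm(threshold).segment(signal)


def run_small(threshold: float) -> list:
    # Returns only the result that is needed afterwards.
    return ToyAlgorithm(threshold).segment(signal).matches_


full_size = len(pickle.dumps(run_full(0.98)))
small_size = len(pickle.dumps(run_small(0.98)))
print(full_size, small_size)
```

If only the segmentation result is needed afterwards, returning just that result from the worker function keeps the
amount of data transferred between processes small.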


.. GENERATED FROM PYTHON SOURCE LINES 81-84

Do the Parallel Run
-------------------
Finally, we use joblib to run the code in parallel with 2 worker processes.

.. GENERATED FROM PYTHON SOURCE LINES 84-88

.. code-block:: default

    from joblib import Parallel, delayed

    results = Parallel(n_jobs=2)(delayed(run)(dtw, para) for para in parameter_grid)


.. GENERATED FROM PYTHON SOURCE LINES 89-90

We will not inspect the results here, but we can see that each dtw object has different parameters set.

.. GENERATED FROM PYTHON SOURCE LINES 90-92

.. code-block:: default

    for r in results:
        pprint(r.get_params())


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {'conflict_resolution': True,
     'find_matches_method': 'find_peaks',
     'max_cost': 1800,
     'max_match_length_s': 3.0,
     'max_signal_stretch_ms': None,
     'max_template_stretch_ms': None,
     'memory': None,
     'min_match_length_s': 0.6,
     'resample_template': True,
     'snap_to_min_axis': 'gyr_ml',
     'snap_to_min_win_ms': 300,
     'template': BarthOriginalTemplate(scaling=FixedScaler(offset=0, scale=500.0), use_cols=('gyr_ml',)),
     'template__scaling': FixedScaler(offset=0, scale=500.0),
     'template__scaling__offset': 0,
     'template__scaling__scale': 500.0,
     'template__use_cols': ('gyr_ml',)}
    {'conflict_resolution': True,
     'find_matches_method': 'find_peaks',
     'max_cost': 1800,
     'max_match_length_s': 3.0,
     'max_signal_stretch_ms': None,
     'max_template_stretch_ms': None,
     'memory': None,
     'min_match_length_s': 0.6,
     'resample_template': True,
     'snap_to_min_axis': 'gyr_ml',
     'snap_to_min_win_ms': 300,
     'template': BarthOriginalTemplate(scaling=FixedScaler(offset=0, scale=500.0), use_cols=('gyr_ml', 'gyr_si')),
     'template__scaling': FixedScaler(offset=0, scale=500.0),
     'template__scaling__offset': 0,
     'template__scaling__scale': 500.0,
     'template__use_cols': ('gyr_ml', 'gyr_si')}
    {'conflict_resolution': True,
     'find_matches_method': 'find_peaks',
     'max_cost': 2200,
     'max_match_length_s': 3.0,
     'max_signal_stretch_ms': None,
     'max_template_stretch_ms': None,
     'memory': None,
     'min_match_length_s': 0.6,
     'resample_template': True,
     'snap_to_min_axis': 'gyr_ml',
     'snap_to_min_win_ms': 300,
     'template': BarthOriginalTemplate(scaling=FixedScaler(offset=0, scale=500.0), use_cols=('gyr_ml',)),
     'template__scaling': FixedScaler(offset=0, scale=500.0),
     'template__scaling__offset': 0,
     'template__scaling__scale': 500.0,
     'template__use_cols': ('gyr_ml',)}
    {'conflict_resolution': True,
     'find_matches_method': 'find_peaks',
     'max_cost': 2200,
     'max_match_length_s': 3.0,
     'max_signal_stretch_ms': None,
     'max_template_stretch_ms': None,
     'memory': None,
     'min_match_length_s': 0.6,
     'resample_template': True,
     'snap_to_min_axis': 'gyr_ml',
     'snap_to_min_win_ms': 300,
     'template': BarthOriginalTemplate(scaling=FixedScaler(offset=0, scale=500.0), use_cols=('gyr_ml', 'gyr_si')),
     'template__scaling': FixedScaler(offset=0, scale=500.0),
     'template__scaling__offset': 0,
     'template__scaling__scale': 500.0,
     'template__use_cols': ('gyr_ml', 'gyr_si')}


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  5.440 seconds)

**Estimated memory usage:**  217 MB


.. _sphx_glr_download_auto_examples_advanced_features_multi_process.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: multi_process.py <multi_process.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: multi_process.ipynb <multi_process.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_