Posts by Collection

publications

Computing Wasserstein Barycenter via operator splitting: the method of averaged marginals

Published as an arXiv preprint, 2023

Links: paper / PDF / code.

Abstract:

The Wasserstein barycenter (WB) is an important tool for summarizing sets of probability measures. It finds applications in applied probability, clustering, image processing, etc. When the measures’ supports are finite, computing a balanced WB can be done by solving a linear optimization problem whose dimensions generally exceed standard solvers’ capabilities. In the more general setting where measures have different total masses, we propose a convex nonsmooth optimization formulation for the so-called unbalanced WB problem. Because both problems have colossal dimensions, we introduce a decomposition scheme based on the Douglas-Rachford splitting method that can be applied to both balanced and unbalanced WB problem variants. Our algorithm, which has the interesting interpretation of being built upon averaging marginals, performs a series of simple (and exact) projections that can be parallelized and even randomized, making it suitable for large-scale datasets. Numerical comparisons against state-of-the-art methods on several datasets from the literature illustrate the method’s performance.

Recommended citation: Mimouni, D., Malisani, P., Zhu, J., & de Oliveira, W. (2023). Computing Wasserstein Barycenter via operator splitting: the method of averaged marginals. arXiv preprint arXiv:2309.05315. https://arxiv.org/pdf/2309.05315

An Interpretable Distance Measure for Multivariate Non-Stationary Physiological Signals

Published in Proceedings of the International Conference on Data Mining Workshops (ICDMW), 2023

$d_{symb}$: data-driven symbolic representation and distance measure for multivariate time series.

Recommended citation: S. W. Combettes, C. Truong and L. Oudre, "An Interpretable Distance Measure for Multivariate Non-Stationary Physiological Signals," 2023 IEEE International Conference on Data Mining Workshops (ICDMW), Shanghai, China, 2023, pp. 533-539, doi: 10.1109/ICDMW60847.2023.00076. https://ieeexplore.ieee.org/abstract/document/10411636

Arm-CODA: A Dataset of Upper-limb Human Movement during Routine Examination

Published in Image Processing On Line, 2024

Arm-CODA dataset: open-access dataset of multivariate physiological signals (biomedical time series).

Recommended citation: Sylvain W. Combettes, Paul Boniol, Antoine Mazarguil, Danping Wang, Diego Vaquero-Ramos, Marion Chauveau, Laurent Oudre, Nicolas Vayatis, Pierre-Paul Vidal, Alexandra Roren, Marie-Martine Lefèvre-Colau, Arm-CODA: A Dataset of Upper-limb Human Movement during Routine Examination, Image Processing On Line, 14 (2024), pp. 1–13. https://www.ipol.im/pub/art/2024/494/

$d_{symb}$ playground: an interactive tool to explore large multivariate time series datasets

Published in Proceedings of the International Conference on Data Engineering (ICDE) (to appear), 2024

$d_{symb}$ playground: Streamlit application for using $d_{symb}$.

Recommended citation: S. W. Combettes, P. Boniol, C. Truong, and L. Oudre. "$d_{symb}$ playground: an interactive tool to explore large multivariate time series datasets." In Proceedings of the International Conference on Data Engineering (ICDE) (to appear), Utrecht, Netherlands, 2024. http://www.laurentoudre.fr/publis/dsymb_demo.pdf

teaching