Various useful functions
Apply a scaler to two arrays.
Dispatches based on the type of scaler:
None : returns inputs unchanged.
Object with a .transform() method : calls scaler.transform() on each.
Callable : calls scaler() on each (covers functions, lambdas,
PyTorch transforms, neural network encoders, etc.).
X_s (array-like) – Source samples.
X_t (array-like) – Target samples.
scaler (None, object with .transform(), or callable, optional) – Preprocessing to apply.
X_s_out (array-like) – Possibly transformed source samples.
X_t_out (array-like) – Possibly transformed target samples.
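The three-way dispatch described above can be sketched in a few lines. This is an illustrative re-implementation, not the library's code: the name `apply_scaler_sketch`, the toy `HalfScaler` class, and the order in which the cases are tested (`.transform()` before plain callables) are assumptions.

```python
def apply_scaler_sketch(X_s, X_t, scaler=None):
    """Hypothetical sketch of the None / .transform() / callable dispatch."""
    if scaler is None:
        # No preprocessing requested: return the inputs untouched.
        return X_s, X_t
    if hasattr(scaler, "transform"):
        # sklearn-style object: use its transform() method on each array.
        return scaler.transform(X_s), scaler.transform(X_t)
    if callable(scaler):
        # Plain function, lambda, or any other callable object.
        return scaler(X_s), scaler(X_t)
    raise TypeError("scaler must be None, expose .transform(), or be callable")


class HalfScaler:
    """Toy object exposing transform(); purely illustrative."""

    def transform(self, X):
        return [[v / 2 for v in row] for row in X]


X_s = [[2.0, 4.0]]
X_t = [[6.0, 8.0]]
same = apply_scaler_sketch(X_s, X_t)
halved = apply_scaler_sketch(X_s, X_t, HalfScaler())
doubled = apply_scaler_sketch(X_s, X_t, lambda X: [[2 * v for v in row] for row in X])
```

Testing for `.transform()` first means that an object which is both callable and has a `transform` method is treated as a fitted transformer.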
Checks whether or not the requested number of threads has a valid value.
Turn seed into a np.random.RandomState instance
seed (None | int | instance of RandomState) – If seed is None, return the RandomState singleton used by np.random. If seed is an int, return a new RandomState instance seeded with seed. If seed is already a RandomState instance, return it. Otherwise raise ValueError.
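The seed handling described above follows a common pattern. The sketch below mirrors the four cases with plain NumPy; the name `check_random_state_sketch` is hypothetical, and the use of `np.random.mtrand._rand` for the global singleton is an assumption borrowed from common practice.

```python
import numpy as np


def check_random_state_sketch(seed):
    """Hypothetical sketch: turn seed into a np.random.RandomState."""
    if seed is None or seed is np.random:
        # Global RandomState singleton used by the np.random.* functions.
        return np.random.mtrand._rand
    if isinstance(seed, (int, np.integer)):
        # Fresh, reproducible generator.
        return np.random.RandomState(seed)
    if isinstance(seed, np.random.RandomState):
        # Already a generator: pass it through unchanged.
        return seed
    raise ValueError("%r cannot be used to seed a RandomState instance" % (seed,))


rng = np.random.RandomState(0)
a = check_random_state_sketch(42).rand(3)
b = check_random_state_sketch(42).rand(3)
```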
Remove all components with zero weights in \(\mathbf{a}\) and \(\mathbf{b}\)
Apply normalization to the loss matrix
C (ndarray, shape (n1, n2)) – The cost matrix to normalize.
norm (str) – Type of normalization, one of ‘median’, ‘max’, ‘log’, ‘loglog’. Any other value leaves the matrix unnormalized.
C – The input cost matrix normalized according to given norm.
ndarray, shape (n1, n2)
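As an illustration of the four normalization modes, here is a NumPy sketch. The text above only names the modes; the exact formulas used below (division by the median or maximum, `log(1 + C)`, and the nested logarithm) are assumptions, as is the function name.

```python
import numpy as np


def cost_normalization_sketch(C, norm=None):
    """Hypothetical sketch of cost-matrix normalization."""
    C = np.asarray(C, dtype=float)
    if norm == "median":
        return C / np.median(C)  # assumed: rescale so the median cost is 1
    if norm == "max":
        return C / np.max(C)  # assumed: rescale so the largest cost is 1
    if norm == "log":
        return np.log(1 + C)  # assumed: compress large costs
    if norm == "loglog":
        return np.log(1 + np.log(1 + C))  # assumed: compress twice
    # Any other value: no normalization.
    return C


C = np.array([[1.0, 2.0], [3.0, 4.0]])
C_max = cost_normalization_sketch(C, "max")
C_median = cost_normalization_sketch(C, "median")
C_same = cost_normalization_sketch(C, "something-else")
```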
Compute distance between samples in \(\mathbf{x_1}\) and \(\mathbf{x_2}\)
Note
This function is backend-compatible and will work on arrays from all compatible backends for the following metrics: ‘sqeuclidean’, ‘euclidean’, ‘cityblock’, ‘minkowski’, ‘cosine’, ‘correlation’.
x1 (array-like, shape (n1,d)) – matrix with n1 samples of size d
x2 (array-like, shape (n2,d), optional) – matrix with n2 samples of size d (if None then \(\mathbf{x_2} = \mathbf{x_1}\))
metric (str | callable, optional) – ‘sqeuclidean’ or ‘euclidean’ on all backends. With the NumPy backend, the function also accepts the metrics supported by scipy.spatial.distance.cdist: ‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘kulczynski1’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘wminkowski’, ‘yule’.
p (float, optional) – p-norm for the Minkowski and the Weighted Minkowski metrics. Default value is 2.
w (array-like, rank 1) – Weights for the weighted metrics.
backend (str, optional) – Backend to use for the computation. If ‘auto’, the backend is automatically selected based on the input data. If ‘scipy’, the scipy.spatial.distance.cdist function is used (and gradients are detached).
use_tensor (bool, optional) – If true use tensorized computation for the distance matrix which can cause memory issues for large datasets. Default is False and the parameter is used only for the ‘cityblock’ and ‘minkowski’ metrics.
nx (Backend, optional) – Backend to perform computations on. If omitted, the backend defaults to that of x1.
M – distance matrix computed with given metric
array-like, shape (n1, n2)
Compute standard cost matrices of size (n, n) for OT problems
Considering the rows of \(\mathbf{X}\) (and \(\mathbf{Y} = \mathbf{X}\)) as vectors, compute the distance matrix between each pair of vectors.
Note
This function is backend-compatible and will work on arrays from all compatible backends.
X (array-like, shape (n_samples_1, n_features))
Y (array-like, shape (n_samples_2, n_features))
squared (boolean, optional) – Return squared Euclidean distances.
distances
array-like, shape (n_samples_1, n_samples_2)
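The pairwise computation and the effect of `squared` can be sketched with NumPy broadcasting. This is an assumption-laden illustration (hypothetical name, NumPy only, no backend dispatch), not the library's implementation.

```python
import numpy as np


def euclidean_distances_sketch(X, Y=None, squared=False):
    """Hypothetical sketch of pairwise Euclidean distances between rows."""
    X = np.asarray(X, dtype=float)
    Y = X if Y is None else np.asarray(Y, dtype=float)
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 <x, y>, evaluated for all pairs at once.
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * (X @ Y.T)
    # Round-off can produce tiny negative values; clip them before the square root.
    sq = np.maximum(sq, 0.0)
    return sq if squared else np.sqrt(sq)


X = np.array([[0.0, 0.0], [3.0, 4.0]])
D = euclidean_distances_sketch(X)
D2 = euclidean_distances_sketch(X, squared=True)
```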
Exponential map in Bures-Wasserstein space at Sigma:
Sigma (array-like (d,d)) – SPD matrix
S (array-like (d,d)) – Symmetric matrix
nx (module, optional) – The numerical backend module to use. If not provided, the backend will be fetched from the input matrices Sigma, S.
P – SPD matrix obtained as the exponential map of S at Sigma
array-like (d,d)
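The formula itself is not reproduced above. A commonly used closed form for this map is \(\exp_\Sigma(S) = (I_d + S)\,\Sigma\,(I_d + S)\); the NumPy sketch below assumes that form, and the function name is hypothetical.

```python
import numpy as np


def exp_bures_sketch(Sigma, S):
    """Hypothetical sketch, assuming exp_Sigma(S) = (I + S) Sigma (I + S)."""
    Sigma = np.asarray(Sigma, dtype=float)
    S = np.asarray(S, dtype=float)
    A = np.eye(S.shape[-1]) + S
    return A @ Sigma @ A


Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
zero_step = exp_bures_sketch(Sigma, np.zeros((2, 2)))
scaled = exp_bures_sketch(np.eye(2), np.eye(2))
```

With a zero tangent vector the map returns Sigma itself, which is a quick sanity check for any implementation.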
Convert a function to a numpy function.
fun_numpy – The converted function.
callable
For \(x\in S^1 \subset \mathbb{R}^2\), returns the angular coordinates expressed in turns (in [0,1[).
x (ndarray, shape (n, 2)) – Samples on the circle with ambient coordinates
x_t – Coordinates on [0,1[
ndarray, shape (n,)
Examples
>>> import numpy as np
>>> from ot.utils import get_coordinate_circle
>>> u = np.array([[0.2,0.5,0.8]]) * (2 * np.pi)
>>> x1, y1 = np.cos(u), np.sin(u)
>>> x = np.concatenate([x1, y1]).T
>>> get_coordinate_circle(x)
array([0.2, 0.5, 0.8])
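One way to obtain such a coordinate is \(u = (\pi + \operatorname{atan2}(-x_2, -x_1)) / (2\pi)\). The pure-Python sketch below assumes that formula (it is not stated in the text above) and uses a hypothetical function name.

```python
import math


def coordinate_circle_sketch(points):
    """Hypothetical sketch: map points of the unit circle to turns in [0, 1)."""
    coords = []
    for x1, x2 in points:
        # atan2 of the antipodal point lies in [-pi, pi]; shifting by pi gives [0, 2*pi].
        u = (math.pi + math.atan2(-x2, -x1)) / (2 * math.pi)
        # Fold the endpoint 1.0 back onto 0.0 to keep the interval half-open.
        coords.append(u % 1.0)
    return coords


turns = [0.2, 0.5, 0.8]
points = [(math.cos(2 * math.pi * t), math.sin(2 * math.pi * t)) for t in turns]
recovered = coordinate_circle_sketch(points)
```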
Get a low rank LazyTensor T=Q@R^T or T=Q@diag(d)@R^T
Q (ndarray, shape (n, r)) – First factor of the lowrank tensor
R (ndarray, shape (m, r)) – Second factor of the lowrank tensor
d (ndarray, shape (r,), optional) – Diagonal of the lowrank tensor
nx (Backend, optional) – Backend to use for the reduction
T – Lowrank tensor T=Q@R^T or T=Q@diag(d)@R^T
Extract a pair of parameters from a given parameter. Used in unbalanced OT and COOT solvers to handle marginal regularization and entropic regularization.
parameter (float or indexable object)
nx (backend object)
param_1 (float)
param_2 (float)
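A minimal sketch of this kind of unpacking is shown below. It assumes that a scalar is duplicated, a length-1 container is duplicated, and a length-2 container is split; the function name and the error on longer inputs are assumptions.

```python
def get_parameter_pair_sketch(parameter):
    """Hypothetical sketch: expand a scalar or short sequence into a pair."""
    if isinstance(parameter, (int, float)):
        # A single number applies to both marginals.
        return parameter, parameter
    if len(parameter) == 1:
        return parameter[0], parameter[0]
    if len(parameter) == 2:
        return parameter[0], parameter[1]
    raise ValueError("parameter must be a scalar or have length 1 or 2")


pair_scalar = get_parameter_pair_sketch(0.5)
pair_single = get_parameter_pair_sketch([0.1])
pair_double = get_parameter_pair_sketch((1.0, 2.0))
```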
Transform labels to start at a given value
y – The input vector of labels normalized according to given start value.
array-like, shape (n1, )
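Shifting labels so that the smallest one equals a chosen start value can be sketched as follows. The function name and the `start` argument are hypothetical, and the sketch works on plain lists rather than backend arrays.

```python
def label_normalization_sketch(y, start=0):
    """Hypothetical sketch: shift integer labels so that min(y) == start."""
    shift = start - min(y)
    return [label + shift for label in y]


labels = [3, 5, 3, 4]
from_zero = label_normalization_sketch(labels)
from_one = label_normalization_sketch(labels, start=1)
```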
Transforms (n_samples,) vector of labels into a (n_samples, n_labels) matrix of masks.
y (array-like, shape (n_samples, )) – The vector of labels.
type_as (array_like) – Array of the same type as the expected output.
nx (Backend, optional) – Backend to perform computations on. If omitted, the backend defaults to that of y.
masks – The (n_samples, n_labels) matrix of label masks.
array-like, shape (n_samples, n_labels)
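The label-to-mask conversion is a one-hot encoding. The pure-Python sketch below assumes that columns follow the sorted unique labels; that ordering and the function name are assumptions.

```python
def labels_to_masks_sketch(y):
    """Hypothetical sketch: (n_samples,) labels -> (n_samples, n_labels) 0/1 masks."""
    classes = sorted(set(y))
    column = {label: k for k, label in enumerate(classes)}
    masks = [[0] * len(classes) for _ in y]
    for row, label in enumerate(y):
        # Exactly one entry per row is set: the column of that sample's label.
        masks[row][column[label]] = 1
    return masks


masks = labels_to_masks_sketch([2, 0, 2, 1])
```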
Parallel map for multiprocessing. This function is deprecated and only performs a regular map.
Project a symmetric matrix onto the space of symmetric matrices with eigenvalues greater than or equal to vmin.
S (array_like (n, d, d) or (d, d)) – The input symmetric matrix or matrices.
nx (module, optional) – The numerical backend module to use. If not provided, the backend will be fetched from the input matrix S.
vmin (float, optional) – The minimum value for the eigenvalues. Eigenvalues below this value will be clipped to vmin.
Note
This function is backend-compatible and will work on arrays from all compatible backends.
P – The projected symmetric positive definite matrix.
ndarray (n, d, d) or (d, d)
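Clipping eigenvalues from below is typically done through an eigendecomposition. The NumPy sketch below illustrates the idea for a single matrix or a batch; the function name and the default `vmin=0.0` are assumptions.

```python
import numpy as np


def proj_sdp_sketch(S, vmin=0.0):
    """Hypothetical sketch: clip the eigenvalues of symmetric S at vmin."""
    S = np.asarray(S, dtype=float)
    # eigh handles a single (d, d) matrix as well as a stack of shape (n, d, d).
    w, V = np.linalg.eigh(S)
    w = np.clip(w, vmin, None)
    # Rebuild V diag(w) V^T; broadcasting scales column k of V by w[k].
    return (V * w[..., None, :]) @ np.swapaxes(V, -1, -2)


S = np.array([[1.0, 0.0], [0.0, -1.0]])
P = proj_sdp_sketch(S)
P_batch = proj_sdp_sketch(np.stack([S, 2.0 * S]))
```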
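Compute the closest point (orthogonal projection) on the generalized (n-1)-simplex of a vector \(\mathbf{v}\) with respect to the Euclidean distance, thus solving:
\(\mathop{\arg \min}_\gamma \| \gamma - \mathbf{v} \|_2 \quad \text{s.t.} \quad \gamma^T \mathbf{1} = z, \ \gamma \geq 0\)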
If \(\mathbf{v}\) is a 2d array, compute all the projections wrt. axis 0
Note
This function is backend-compatible and will work on arrays from all compatible backends.
v ({array-like}, shape (n, d))
z (int, optional) – ‘size’ of the simplex (each vector sums to z, 1 by default)
h – Array of projections on the simplex
ndarray, shape (n, d)
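A standard way to compute this projection is the sort-and-threshold algorithm. The sketch below handles a single vector given as a Python list; it illustrates the technique under that assumption (hypothetical name, no 2d or backend support) and is not the library's code.

```python
def proj_simplex_sketch(v, z=1.0):
    """Hypothetical sketch: Euclidean projection of v onto {g >= 0, sum(g) = z}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - z) / k
        # The condition holds for the first rho sorted entries and fails afterwards,
        # so the last accepted candidate is the threshold.
        if u_k - candidate > 0:
            theta = candidate
    # Shift by the threshold and clip negative entries to zero.
    return [max(x - theta, 0.0) for x in v]


inside = proj_simplex_sketch([0.5, 0.5])
corner = proj_simplex_sketch([2.0, 0.0])
mixed = proj_simplex_sketch([0.2, 0.9, -0.3])
```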
Projection of \(\mathbf{V}\) onto the simplex with cardinality constraint (maximum number of non-zero elements) and then scaled by z.
V (1-dim or 2-dim ndarray)
max_nz (int) – Maximum number of non-zero elements in the projection. If max_nz is larger than the number of elements in V, then the projection is equivalent to proj_simplex(V, z).
z (float or array) – If array, len(z) must be compatible with \(\mathbf{V}\)
axis (None or int) –
axis=None: project \(\mathbf{V}\) by \(P(\mathbf{V}.\mathrm{ravel}(), \text{max_nz}, z)\)
axis=1: project each \(\mathbf{V}_i\) by \(P(\mathbf{V}_i, \text{max_nz}, z_i)\)
axis=0: project each \(\mathbf{V}_{:, j}\) by \(P(\mathbf{V}_{:, j}, \text{max_nz}, z_j)\)
projection
ndarray, shape \(\mathbf{V}\).shape
Reduce a LazyTensor along an axis with function func using batches.
When axis=None, reduce the LazyTensor to a scalar as a sum of func over batches taken along axis 0.
Warning
This function works for tensors of any order, but the reduction can be done only along the first two axes (or globally). To work, it also requires that the slice of size batch_size along the axis to reduce (or axis 0 if axis=None) can be computed and fits in memory.
a (LazyTensor) – LazyTensor to reduce
func (callable) – Function to apply to the LazyTensor
axis (int, optional) – Axis along which to reduce the LazyTensor. If None, reduce the LazyTensor to a scalar as a sum of func over batches taken along axis 0. If 0 or 1, reduce the LazyTensor to a vector/matrix as a sum of func over batches taken along axis.
nx (Backend, optional) – Backend to use for the reduction
batch_size (int, optional) – Size of the batches to use for the reduction (default=100)
res – Result of the reduction
array-like
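The global (axis=None) case can be illustrated without any tensor library: only one slice of rows is materialized at a time, and the per-batch results are summed. The names `reduce_in_batches_sketch` and `get_rows` are hypothetical stand-ins for the lazy tensor and its slicing.

```python
def reduce_in_batches_sketch(n_rows, get_rows, func, batch_size=100):
    """Hypothetical sketch: sum func over row batches of a lazily defined matrix."""
    total = 0
    for start in range(0, n_rows, batch_size):
        stop = min(start + batch_size, n_rows)
        # Only rows start..stop-1 exist in memory at this point.
        total += func(get_rows(start, stop))
    return total


v = list(range(5))


def get_rows(start, stop):
    # Entry (i, j) of the lazy matrix is v[i] + v[j].
    return [[v[i] + v[j] for j in range(len(v))] for i in range(start, stop)]


def block_sum(block):
    return sum(sum(row) for row in block)


total_small_batches = reduce_in_batches_sketch(5, get_rows, block_sum, batch_size=2)
total_one_batch = reduce_in_batches_sketch(5, get_rows, block_sum, batch_size=100)
```

The result does not depend on the batch size; only peak memory does.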
Compute ot distance between samples in \(\mathbf{x_1}\) and \(\mathbf{x_2}\) with sparse weights given by w for the pairs of samples with indices i and j.
Note
This function is backend-compatible and will work on arrays from all compatible backends for the following metrics: ‘sqeuclidean’, ‘euclidean’, ‘cityblock’, ‘minkowski’.
x1 (array-like, shape (n1,d)) – matrix with n1 samples of size d
x2 (array-like, shape (n2,d), optional) – matrix with n2 samples of size d (if None then \(\mathbf{x_2} = \mathbf{x_1}\))
i (array-like, shape (k,)) – indices of samples in x1 to compute distance from
j (array-like, shape (k,)) – indices of samples in x2 to compute distance to
w (array-like, shape (k,), optional) – weights for each pair of samples to compute distance between. If None, all pairs are weighted equally (=1/k).
metric (str | callable, optional) – ‘sqeuclidean’, ‘euclidean’, ‘cityblock’ or ‘minkowski’.
p (float, optional) – p-norm for the Minkowski metric. Default value is 2.
batch_size (int, optional) – If specified, compute the distance in batches of size batch_size to avoid memory issues for large datasets. Default is None (no batching).
dist – sum of the distance between \(\mathbf{x_1}_i\) and \(\mathbf{x_2}_j\) computed with given metric and weighted by w
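The weighted sum over index pairs can be sketched as follows for the squared Euclidean metric. The function name is hypothetical, and the sketch uses plain lists without batching.

```python
def sparse_pair_cost_sketch(x1, x2, i, j, w=None):
    """Hypothetical sketch: sum_k w[k] * ||x1[i[k]] - x2[j[k]]||^2."""
    k = len(i)
    if w is None:
        # Uniform weights 1/k, as described for w=None.
        w = [1.0 / k] * k
    total = 0.0
    for a, b, weight in zip(i, j, w):
        sq = sum((p - q) ** 2 for p, q in zip(x1[a], x2[b]))
        total += weight * sq
    return total


x1 = [[0.0, 0.0], [1.0, 1.0]]
x2 = [[3.0, 4.0], [1.0, 1.0]]
uniform = sparse_pair_cost_sketch(x1, x2, i=[0, 1], j=[0, 1])
unit_weights = sparse_pair_cost_sketch(x1, x2, i=[0, 1], j=[0, 1], w=[1.0, 1.0])
```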
Return a uniform histogram of length n (simplex).
n (int) – number of bins in the histogram
type_as (array-like) – array of the same type as the expected output (numpy/pytorch/jax)
h – histogram of length n such that \(\forall i, \mathbf{h}_i = \frac{1}{n}\)
array-like, shape (n,)
Base class for OT barycenter results.
X (array-like, shape (n, d)) – Barycenter features.
C (array-like, shape (n, n)) – Barycenter structure for Gromov Wasserstein solutions.
b (array-like, shape (n,)) – Barycenter weights.
value (float, array-like) – Full transport cost, including possible regularization terms and quadratic term for Gromov Wasserstein solutions.
value_linear (float, array-like) – The linear part of the transport cost, i.e. the product between the transport plan and the cost.
value_quad (float, array-like) – The quadratic part of the transport cost for Gromov-Wasserstein solutions.
log (dict) – Dictionary containing potential information about the solver.
list_res (list of OTResult) – List of results for the individual OT matching with input distributions considered as sources and the learned barycenter distribution as target.
Barycenter features.
array-like, shape (n, d)
Barycenter structure for Gromov Wasserstein solutions.
array-like, shape (n, n)
Barycenter weights.
array-like, shape (n,)
Full transport cost, including possible regularization terms and quadratic term for Gromov Wasserstein solutions.
float, array-like
The linear part of the transport cost, i.e. the product between the transport plan and the cost.
float, array-like
The quadratic part of the transport cost for Gromov-Wasserstein solutions.
float, array-like
Barycenter structure for Gromov Wasserstein solutions.
Barycenter features.
Barycenter weights.
Appropriate citation(s) for this result, in plain text and BibTex formats.
List of results for the individual OT matching.
Dictionary containing potential information about the solver.
Optimization status of the solver.
Full transport cost, including possible regularization terms and quadratic term for Gromov Wasserstein solutions.
The “minimal” transport cost, i.e. the product between the transport plan and the cost.
The quadratic part of the transport cost for Gromov-Wasserstein solutions.
Base class for most objects in POT
Code adapted from sklearn BaseEstimator class
Notes
All estimators should specify all the parameters that can be set
at the class level in their __init__ as explicit keyword
arguments (no *args or **kwargs).
Get parameters for this estimator.
deep (bool, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
params – Parameter names mapped to their values.
mapping of string to any
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as pipelines). The latter have parameters of the form
<component>__<parameter> so that it’s possible to update each
component of a nested object.
self
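The `<component>__<parameter>` convention can be illustrated with a small stand-alone class. This is a simplified sketch: the real base class is described as adapted from scikit-learn, whereas the `_param_names` tuple and the `ParamsSketch`, `Inner`, and `Outer` classes below are hypothetical.

```python
class ParamsSketch:
    """Hypothetical minimal estimator with sklearn-style parameter access."""

    _param_names = ()

    def get_params(self, deep=True):
        out = {}
        for name in self._param_names:
            value = getattr(self, name)
            out[name] = value
            if deep and hasattr(value, "get_params"):
                # Expose nested parameters as <component>__<parameter>.
                for sub, sub_value in value.get_params(deep=True).items():
                    out[name + "__" + sub] = sub_value
        return out

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub = key.partition("__")
            if name not in self._param_names:
                raise ValueError("Invalid parameter %s" % name)
            if sep:
                # Delegate the remainder of the key to the nested object.
                getattr(self, name).set_params(**{sub: value})
            else:
                setattr(self, name, value)
        return self


class Inner(ParamsSketch):
    _param_names = ("reg",)

    def __init__(self, reg=1.0):
        self.reg = reg


class Outer(ParamsSketch):
    _param_names = ("solver", "verbose")

    def __init__(self, solver=None, verbose=False):
        self.solver = solver
        self.verbose = verbose


model = Outer(solver=Inner(reg=1.0))
returned = model.set_params(solver__reg=0.1, verbose=True)
params = model.get_params()
```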
Backend-aware data scaler with sklearn-compatible API.
Fit normalization statistics on a single array or on the concatenation of multiple arrays (joint fitting), then apply the same fixed transform to any array. Supports NumPy, PyTorch, JAX, and TensorFlow backends via POT’s backend abstraction.
norm (str, optional) –
Normalization method. One of:
'standard' (default) : zero mean, unit variance per feature
'minmax' : scale each feature to [0, 1]
'l2' : unit L2-norm per sample (row-wise, stateless)
Per-feature means (only for norm='standard').
array-like
Per-feature standard deviations (only for norm='standard').
array-like
Per-feature minimums (only for norm='minmax').
array-like
Per-feature maximums (only for norm='minmax').
array-like
Examples
>>> import numpy as np
>>> from ot.utils import DataScaler
>>> X_s = np.array([[1.0, 100.0], [2.0, 200.0]])
>>> X_t = np.array([[3.0, 300.0], [4.0, 400.0]])
>>> scaler = DataScaler(norm='standard').fit([X_s, X_t])
>>> X_s_scaled = scaler.transform(X_s)
Compute normalization statistics from one array or a list of arrays.
When given a list, arrays are concatenated along axis 0 before computing statistics (joint fitting).
X (array-like or list of array-like) – Data to fit on. If a list, arrays must have the same number of features (columns).
self
Apply the fitted transformation to X.
X (array-like or list of array-like) – Data to transform. If a list, each element is transformed and returned as a list.
X_scaled – Transformed data, same shape and backend as X. If X was a list, returns a list of transformed arrays.
array-like or list of array-like
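Joint fitting with norm='standard' amounts to computing one mean and one standard deviation per feature over the stacked arrays and reusing them for every array. The NumPy sketch below reproduces that idea by hand; using the population standard deviation without any epsilon is an assumption.

```python
import numpy as np

X_s = np.array([[1.0, 100.0], [2.0, 200.0]])
X_t = np.array([[3.0, 300.0], [4.0, 400.0]])

# Joint fit: statistics come from the concatenation along axis 0.
stacked = np.concatenate([X_s, X_t], axis=0)
mean = stacked.mean(axis=0)
std = stacked.std(axis=0)

# The same fixed transform is applied to each array separately.
X_s_scaled = (X_s - mean) / std
X_t_scaled = (X_t - mean) / std
```

Because the statistics are shared, the source samples end up below zero and the target samples above zero in this toy case, so the relative offset between the two domains is preserved.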
A lazy tensor is a tensor that is not stored in memory. Instead, it is defined by a function that computes its values on the fly from slices.
Examples
>>> import numpy as np
>>> from ot.utils import LazyTensor
>>> v = np.arange(5)
>>> def getitem(i,j, v):
...     return v[i,None]+v[None,j]
>>> T = LazyTensor((5,5),getitem, v=v)
>>> T[1,2]
array([3])
>>> T[1,:]
array([[1, 2, 3, 4, 5]])
>>> T[:]
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
Base class for OT results.
potentials (tuple of array-like, shape (n1, n2)) – Dual potentials, i.e. Lagrange multipliers for the marginal constraints. This pair of arrays has the same shape, numerical type and properties as the input weights “a” and “b”.
value (float, array-like) – Full transport cost, including possible regularization terms and quadratic term for Gromov Wasserstein solutions.
value_linear (float, array-like) – The linear part of the transport cost, i.e. the product between the transport plan and the cost.
value_quad (float, array-like) – The quadratic part of the transport cost for Gromov-Wasserstein solutions.
plan (array-like, shape (n1, n2)) – Transport plan, encoded as a dense array.
log (dict) – Dictionary containing potential information about the solver.
backend (Backend) – Backend used to compute the results.
sparse_plan (array-like, shape (n1, n2)) – Transport plan, encoded as a sparse array.
lazy_plan (LazyTensor) – Transport plan, encoded as a symbolic POT or KeOps LazyTensor.
batch_size (int) – Batch size used to compute the results/marginals for LazyTensor.
Dual potentials, i.e. Lagrange multipliers for the marginal constraints. This pair of arrays has the same shape, numerical type and properties as the input weights “a” and “b”.
tuple of array-like, shape (n1, n2)
First dual potential, associated to the “source” measure “a”.
array-like, shape (n1,)
Second dual potential, associated to the “target” measure “b”.
array-like, shape (n2,)
Full transport cost, including possible regularization terms and quadratic term for Gromov Wasserstein solutions.
float, array-like
The linear part of the transport cost, i.e. the product between the transport plan and the cost.
float, array-like
The quadratic part of the transport cost for Gromov-Wasserstein solutions.
float, array-like
Transport plan, encoded as a dense array.
array-like, shape (n1, n2)
Transport plan, encoded as a sparse array.
array-like, shape (n1, n2)
Transport plan, encoded as a symbolic POT or KeOps LazyTensor.
Marginals of the transport plan: should be very close to “a” and “b” for balanced OT.
tuple of array-like, shape (n1,), (n2,)
Marginal of the transport plan for the “source” measure “a”.
array-like, shape (n1,)
Marginal of the transport plan for the “target” measure “b”.
array-like, shape (n2,)
Displacement vectors from the first to the second measure.
Displacement vectors from the second to the first measure.
Appropriate citation(s) for this result, in plain text and BibTex formats.
Transport plan, encoded as a symbolic KeOps LazyTensor.
Dictionary containing potential information about the solver.
First marginal of the transport plan, with the same shape as “a”.
Second marginal of the transport plan, with the same shape as “b”.
Marginals of the transport plan: should be very close to “a” and “b” for balanced OT.
Transport plan, encoded as a dense array.
First dual potential, associated to the “source” measure “a”.
Second dual potential, associated to the “target” measure “b”.
Dual potentials, i.e. Lagrange multipliers for the marginal constraints.
This pair of arrays has the same shape, numerical type and properties as the input weights “a” and “b”.
Transport plan, encoded as a sparse array.
Optimization status of the solver.
Full transport cost, including possible regularization terms and quadratic term for Gromov Wasserstein solutions.
The “minimal” transport cost, i.e. the product between the transport plan and the cost.
The quadratic part of the transport cost for Gromov-Wasserstein solutions.
Decorator to mark a function or class as deprecated.
Adapted from the deprecated class of the scikit-learn package (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/deprecation.py). Issues a warning when the function is called or the class is instantiated, and adds a warning to the docstring. The optional extra argument is appended to the deprecation message and to the docstring.
Note
To use this with the default value for extra, use empty parentheses:
>>> from ot.utils import deprecated
>>> @deprecated()
... def some_function(): pass
extra (str) – To be added to the deprecation messages.
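The behaviour described above (warn on call, extend the docstring, append `extra`) can be sketched with the standard library. This sketch covers functions only, not classes, and the name `deprecated_sketch` and the message wording are assumptions.

```python
import functools
import warnings


def deprecated_sketch(extra=""):
    """Hypothetical sketch of a deprecation decorator for functions."""

    def decorator(func):
        message = "Function %s is deprecated" % func.__name__
        if extra:
            message += "; " + extra

        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            # Warn at call time, pointing at the caller's line.
            warnings.warn(message, category=DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        # Prepend the notice to the docstring as well.
        wrapped.__doc__ = "DEPRECATED: " + message + "\n\n" + (func.__doc__ or "")
        return wrapped

    return decorator


@deprecated_sketch("use new_function instead")
def old_function(x):
    """Double x."""
    return 2 * x
```

As with the decorator documented above, the default is applied with empty parentheses: `@deprecated_sketch()`.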
Aims at raising an exception when an undefined parameter is called.
Compute distance between samples in \(\mathbf{x_1}\) and \(\mathbf{x_2}\)
Note
This function is backend-compatible and will work on arrays from all compatible backends for the following metrics: ‘sqeuclidean’, ‘euclidean’, ‘cityblock’, ‘minkowski’, ‘cosine’, ‘correlation’.
x1 (array-like, shape (n1,d)) – matrix with n1 samples of size d
x2 (array-like, shape (n2,d), optional) – matrix with n2 samples of size d (if None then \(\mathbf{x_2} = \mathbf{x_1}\))
metric (str | callable, optional) – ‘sqeuclidean’ or ‘euclidean’ on all backends. On numpy the function also accepts from the scipy.spatial.distance.cdist function : ‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘kulczynski1’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘wminkowski’, ‘yule’.
p (float, optional) – p-norm for the Minkowski and the Weighted Minkowski metrics. Default value is 2.
w (array-like, rank 1) – Weights for the weighted metrics.
backend (str, optional) – Backend to use for the computation. If ‘auto’, the backend is
automatically selected based on the input data. if ‘scipy’,
the scipy.spatial.distance.cdist function is used (and gradients are
detached).
use_tensor (bool, optional) – If true use tensorized computation for the distance matrix which can cause memory issues for large datasets. Default is False and the parameter is used only for the ‘cityblock’ and ‘minkowski’ metrics.
nx (Backend, optional) – Backend to perform computations on. If omitted, the backend defaults to that of x1.
M – distance matrix computed with given metric
array-like, shape (n1, n2)
Compute standard cost matrices of size (n, n) for OT problems
Considering the rows of \(\mathbf{X}\) (and \(\mathbf{Y} = \mathbf{X}\)) as vectors, compute the distance matrix between each pair of vectors.
Note
This function is backend-compatible and will work on arrays from all compatible backends.
X (array-like, shape (n_samples_1, n_features))
Y (array-like, shape (n_samples_2, n_features))
squared (boolean, optional) – Return squared Euclidean distances.
distances
array-like, shape (n_samples_1, n_samples_2)
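A common way to compute such a matrix without forming every pairwise difference is the expansion ‖x − y‖² = ‖x‖² − 2 x·y + ‖y‖². The sketch below uses NumPy only; the function name `euclidean_distances_sketch` is hypothetical and the code is an assumption about one reasonable approach, not the library's code.

```python
import numpy as np


def euclidean_distances_sketch(X, Y=None, squared=False):
    """Pairwise Euclidean distances between rows of X and rows of Y."""
    X = np.asarray(X, dtype=float)
    Y = X if Y is None else np.asarray(Y, dtype=float)
    # ||x||^2 - 2 x.y + ||y||^2, evaluated for all pairs at once.
    sq = (X ** 2).sum(axis=1)[:, None] - 2.0 * X @ Y.T + (Y ** 2).sum(axis=1)[None, :]
    sq = np.maximum(sq, 0.0)  # clip tiny negatives caused by round-off
    return sq if squared else np.sqrt(sq)


X = np.array([[0.0, 0.0], [3.0, 4.0]])
D = euclidean_distances_sketch(X)
D2 = euclidean_distances_sketch(X, squared=True)
```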
Exponential map in Bures-Wasserstein space at Sigma:
\[\exp_\Sigma(S) = (I_d + S)\Sigma(I_d + S)\]
Sigma (array-like (d,d)) – SPD matrix
S (array-like (d,d)) – Symmetric matrix
nx (module, optional) – The numerical backend module to use. If not provided, the backend will be fetched from the input matrices Sigma, S.
P – SPD matrix obtained as the exponential map of S at Sigma
array-like (d,d)
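A minimal NumPy sketch of this map, assuming the usual closed form for the Bures-Wasserstein exponential map, exp_Σ(S) = (I + S) Σ (I + S). The function name `exp_bures_sketch` and the example matrices are illustrative only.

```python
import numpy as np


def exp_bures_sketch(Sigma, S):
    """Assumed closed form: (I + S) @ Sigma @ (I + S)."""
    Sigma = np.asarray(Sigma, dtype=float)
    S = np.asarray(S, dtype=float)
    Id = np.eye(Sigma.shape[-1])
    return (Id + S) @ Sigma @ (Id + S)


Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])  # SPD base point
S = np.array([[0.1, 0.0], [0.0, -0.2]])     # symmetric tangent vector
P = exp_bures_sketch(Sigma, S)
```

With S = 0 the map returns Sigma itself, and the result stays symmetric because S is symmetric.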
Convert a function to a numpy function.
fun_numpy – The converted function.
callable
For \(x\in S^1 \subset \mathbb{R}^2\), returns the coordinates in turn (in [0,1[).
x (ndarray, shape (n, 2)) – Samples on the circle with ambient coordinates
x_t – Coordinates on [0,1[
ndarray, shape (n,)
Examples
>>> u = np.array([[0.2,0.5,0.8]]) * (2 * np.pi)
>>> x1, y1 = np.cos(u), np.sin(u)
>>> x = np.concatenate([x1, y1]).T
>>> get_coordinate_circle(x)
array([0.2, 0.5, 0.8])
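One way to obtain such coordinates is to take the polar angle of each point and rescale it to a fraction of a full turn. The sketch below assumes this approach; `coordinate_on_circle` is a hypothetical name and the library may compute the value differently.

```python
import numpy as np


def coordinate_on_circle(x):
    """Map points (cos t, sin t) to t / (2 pi), up to floating-point rounding."""
    x = np.asarray(x, dtype=float)
    angle = np.arctan2(x[:, 1], x[:, 0])        # angle in (-pi, pi]
    return np.mod(angle, 2 * np.pi) / (2 * np.pi)


u = np.array([0.2, 0.5, 0.8]) * (2 * np.pi)
pts = np.stack([np.cos(u), np.sin(u)], axis=1)  # shape (3, 2)
t = coordinate_on_circle(pts)
```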
Get a low rank LazyTensor T=Q@R^T or T=Q@diag(d)@R^T
Q (ndarray, shape (n, r)) – First factor of the lowrank tensor
R (ndarray, shape (m, r)) – Second factor of the lowrank tensor
d (ndarray, shape (r,), optional) – Diagonal of the lowrank tensor
nx (Backend, optional) – Backend to use for the reduction
T – Lowrank tensor T=Q@R^T or T=Q@diag(d)@R^T
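To illustrate why a lazy low-rank representation is useful, the sketch below stores only the factors and materializes a block of T when it is indexed. The class `LowRankLazy` is a hypothetical stand-in written for this example, not the library's LazyTensor class.

```python
import numpy as np


class LowRankLazy:
    """Hypothetical lazy view of T = Q @ diag(d) @ R.T (d defaults to ones)."""

    def __init__(self, Q, R, d=None):
        self.Q = np.asarray(Q, dtype=float)
        self.R = np.asarray(R, dtype=float)
        self.d = np.ones(self.Q.shape[1]) if d is None else np.asarray(d, dtype=float)
        self.shape = (self.Q.shape[0], self.R.shape[0])

    def __getitem__(self, key):
        rows, cols = key
        # Only the requested block is computed; the full (n, m) matrix never is.
        return (self.Q[rows] * self.d) @ self.R[cols].T


rng = np.random.default_rng(0)
Q, R, d = rng.normal(size=(5, 2)), rng.normal(size=(4, 2)), np.array([2.0, 0.5])
T = LowRankLazy(Q, R, d)
block = T[1:3, 0:2]  # a (2, 2) block of the (5, 4) tensor
```

Storage is O((n + m) r) for the factors instead of O(n m) for the dense matrix.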
Extract a pair of parameters from a given parameter. Used in unbalanced OT and COOT solvers to handle marginal regularization and entropic regularization.
parameter (float or indexable object)
nx (backend object)
param_1 (float)
param_2 (float)
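The idea can be sketched in plain Python as follows. The function name `parameter_pair_sketch` and the exact handling of length-1 inputs and errors are assumptions made for illustration.

```python
def parameter_pair_sketch(parameter):
    """Return (param_1, param_2) from a scalar or a short indexable object."""
    if isinstance(parameter, (int, float)):
        # A single scalar is used for both parameters.
        return parameter, parameter
    if len(parameter) == 1:
        return parameter[0], parameter[0]
    if len(parameter) == 2:
        return parameter[0], parameter[1]
    raise ValueError("expected a scalar or an indexable object of length 1 or 2")


same = parameter_pair_sketch(0.5)             # one value shared by both
different = parameter_pair_sketch((1.0, 2.0))  # one value per parameter
```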
Transform labels to start at a given value
y – The input vector of labels normalized according to given start value.
array-like, shape (n1, )
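A minimal sketch of this kind of normalization, assuming it amounts to shifting all labels by a constant so that the smallest one equals the start value. The name `shift_labels` is hypothetical; note that a pure shift does not make the labels contiguous.

```python
import numpy as np


def shift_labels(y, start=0):
    """Shift labels so that the smallest label equals `start`."""
    y = np.asarray(y)
    return y - y.min() + start


y0 = shift_labels([2, 3, 5])           # smallest label becomes 0
y1 = shift_labels([2, 3, 5], start=1)  # smallest label becomes 1
```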
Transforms (n_samples,) vector of labels into a (n_samples, n_labels) matrix of masks.
y (array-like, shape (n_samples, )) – The vector of labels.
type_as (array_like) – Array of the same type of the expected output.
nx (Backend, optional) – Backend to perform computations on. If omitted, the backend defaults to that of y.
masks – The (n_samples, n_labels) matrix of label masks.
array-like, shape (n_samples, n_labels)
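The mask matrix is a one-hot encoding of the labels. A NumPy-only sketch, with the hypothetical name `labels_to_masks_sketch` and columns ordered by sorted distinct label (an assumption of this example):

```python
import numpy as np


def labels_to_masks_sketch(y):
    """Return an (n_samples, n_labels) 0/1 matrix, one column per distinct label."""
    y = np.asarray(y)
    labels = np.unique(y)  # sorted distinct labels
    return (y[:, None] == labels[None, :]).astype(float)


masks = labels_to_masks_sketch([0, 2, 2, 1])
```

Each row contains exactly one 1, in the column of that sample's label.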
Parallel map for multiprocessing. The function has been deprecated and only performs a regular map.
Project a symmetric matrix onto the space of symmetric matrices with eigenvalues larger or equal to vmin.
S (array_like (n, d, d) or (d, d)) – The input symmetric matrix or matrices.
nx (module, optional) – The numerical backend module to use. If not provided, the backend will be fetched from the input matrix S.
vmin (float, optional) – The minimum value for the eigenvalues. Eigenvalues below this value will be clipped to vmin.
Note
This function is backend-compatible and will work on arrays from all compatible backends.
P – The projected symmetric positive definite matrix.
ndarray (n, d, d) or (d, d)
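Such a projection is typically computed by an eigendecomposition followed by clipping of the eigenvalues. The NumPy sketch below follows that recipe; `proj_sdp_sketch` is a hypothetical name and the code is not taken from the library.

```python
import numpy as np


def proj_sdp_sketch(S, vmin=0.0):
    """Clip the eigenvalues of symmetric S (or a batch of them) at vmin."""
    S = np.asarray(S, dtype=float)
    w, V = np.linalg.eigh(S)    # works on (d, d) and batched (n, d, d) inputs
    w = np.clip(w, vmin, None)  # eigenvalues below vmin are raised to vmin
    # Rebuild V @ diag(w) @ V.T without forming diag(w) explicitly.
    return (V * w[..., None, :]) @ np.swapaxes(V, -1, -2)


S = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues -1 and 3
P = proj_sdp_sketch(S)
P_batch = proj_sdp_sketch(np.stack([S, S]))
```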
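Compute the closest point (orthogonal projection) on the generalized (n-1)-simplex of a vector \(\mathbf{v}\) with respect to the Euclidean distance, thus solving:
\[\mathcal{P}(\mathbf{v}) \in \mathop{\arg \min}_{\gamma} \| \gamma - \mathbf{v} \|_2 \quad \text{s.t.} \quad \gamma^T \mathbf{1} = z, \ \gamma \geq 0\]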
If \(\mathbf{v}\) is a 2d array, compute all the projections wrt. axis 0
Note
This function is backend-compatible and will work on arrays from all compatible backends.
v (array-like, shape (n, d))
z (int, optional) – ‘size’ of the simplex (each vector sums to z, 1 by default)
h – Array of projections on the simplex
ndarray, shape (n, d)
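A standard way to solve this projection is the sort-and-threshold algorithm: sort the entries in decreasing order, find how many stay positive, then subtract a common threshold and clip at zero. The sketch below applies it to each column of a 2d input (axis 0); `proj_simplex_sketch` is a hypothetical name and the code is an illustration rather than the library's implementation.

```python
import numpy as np


def proj_simplex_sketch(v, z=1.0):
    """Euclidean projection of each column of v onto {g >= 0, sum(g) = z}."""
    v = np.asarray(v, dtype=float)
    squeeze = v.ndim == 1
    if squeeze:
        v = v[:, None]
    n, d = v.shape
    u = -np.sort(-v, axis=0)                # each column sorted in decreasing order
    css = np.cumsum(u, axis=0) - z
    k = np.arange(1, n + 1)[:, None]
    rho = (u - css / k > 0).sum(axis=0)     # number of entries kept positive
    theta = css[rho - 1, np.arange(d)] / rho  # per-column threshold
    w = np.maximum(v - theta[None, :], 0.0)
    return w[:, 0] if squeeze else w


h = proj_simplex_sketch(np.array([[2.0, 0.2], [0.0, 0.2]]))
```

Here the first column (2, 0) maps to (1, 0) and the second column (0.2, 0.2) maps to (0.5, 0.5).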
Projection of \(\mathbf{V}\) onto the simplex with cardinality constraint (maximum number of non-zero elements) and then scaled by z.
V (1-dim or 2-dim ndarray)
max_nz (int) – Maximum number of non-zero elements in the projection. If max_nz is larger than the number of elements in V, then the projection is equivalent to proj_simplex(V, z).
z (float or array) – If array, len(z) must be compatible with \(\mathbf{V}\)
axis (None or int) –
axis=None: project \(\mathbf{V}\) by \(P(\mathbf{V}.\mathrm{ravel}(), \text{max_nz}, z)\)
axis=1: project each \(\mathbf{V}_i\) by \(P(\mathbf{V}_i, \text{max_nz}, z_i)\)
axis=0: project each \(\mathbf{V}_{:, j}\) by \(P(\mathbf{V}_{:, j}, \text{max_nz}, z_j)\)
projection
ndarray, shape \(\mathbf{V}\).shape
References
Anastasios Kyrillidis, Stephen Becker, Volkan Cevher, and Christoph Koch. Sparse projections onto the simplex. ICML 2013. https://arxiv.org/abs/1206.1529
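For a 1-dim input, the projection can be sketched as a two-step procedure: keep the max_nz largest entries, then project those entries onto the simplex of size z and set the rest to zero. The helper name `sparse_simplex_projection_1d` is hypothetical, and the sketch handles neither the axis argument nor array-valued z.

```python
import numpy as np


def sparse_simplex_projection_1d(v, max_nz, z=1.0):
    """Keep the max_nz largest entries of v and project them onto the z-simplex."""
    v = np.asarray(v, dtype=float)
    k = min(max_nz, v.size)
    idx = np.argsort(-v)[:k]  # indices of the k largest entries, in decreasing order
    u = v[idx]
    # Sort-and-threshold simplex projection restricted to the kept entries.
    css = np.cumsum(u) - z
    rho = int((u - css / np.arange(1, k + 1) > 0).sum())
    theta = css[rho - 1] / rho
    out = np.zeros_like(v)
    out[idx] = np.maximum(u - theta, 0.0)
    return out


p = sparse_simplex_projection_1d([0.6, 0.1, 0.5, 0.3], max_nz=2)
```

In this example only the entries 0.6 and 0.5 survive, and they are shifted to 0.55 and 0.45 so that the result sums to 1.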
Reduce a LazyTensor along an axis with function func using batches.
When axis=None, reduce the LazyTensor to a scalar as a sum of func over batches taken along axis 0.
Warning
This function works for tensors of any order, but the reduction can be done only along the first two axes (or globally). Also, in order to work, it requires that the slice of size batch_size along the axis to reduce (or axis 0 if axis=None) can be computed and fits in memory.
a (LazyTensor) – LazyTensor to reduce
func (callable) – Function to apply to the LazyTensor
axis (int, optional) – Axis along which to reduce the LazyTensor. If None, reduce the LazyTensor to a scalar as a sum of func over batches taken along axis 0. If 0 or 1, reduce the LazyTensor to a vector/matrix as a sum of func over batches taken along axis.
nx (Backend, optional) – Backend to use for the reduction
batch_size (int, optional) – Size of the batches to use for the reduction (default=100)
res – Result of the reduction
array-like
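The axis=None case can be sketched as a loop that materializes one batch of rows at a time and accumulates the reduced values. The names `reduce_in_batches` and `get_rows` are hypothetical; a callable returning rows start..stop stands in for slicing a lazy tensor.

```python
import numpy as np


def reduce_in_batches(get_rows, n_rows, func, batch_size=100):
    """Sum func(batch) over consecutive row batches; only one batch is in memory."""
    total = 0.0
    for start in range(0, n_rows, batch_size):
        stop = min(start + batch_size, n_rows)
        total = total + func(get_rows(start, stop))
    return total


rng = np.random.default_rng(0)
Q, R = rng.normal(size=(250, 3)), rng.normal(size=(40, 3))
# Rows of the implicit (250, 40) matrix Q @ R.T, computed on demand.
res = reduce_in_batches(lambda a, b: Q[a:b] @ R.T, Q.shape[0], np.sum, batch_size=100)
```

Peak memory is governed by batch_size times the remaining dimensions, not by the full tensor size.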
Compute ot distance between samples in \(\mathbf{x_1}\) and \(\mathbf{x_2}\) with sparse weights given by w for the pairs of samples with indices i and j.
Note
This function is backend-compatible and will work on arrays from all compatible backends for the following metrics: ‘sqeuclidean’, ‘euclidean’, ‘cityblock’, ‘minkowski’.
x1 (array-like, shape (n1,d)) – matrix with n1 samples of size d
x2 (array-like, shape (n2,d), optional) – matrix with n2 samples of size d (if None then \(\mathbf{x_2} = \mathbf{x_1}\))
i (array-like, shape (k,)) – indices of samples in x1 to compute distance from
j (array-like, shape (k,)) – indices of samples in x2 to compute distance to
w (array-like, shape (k,), optional) – weights for each pair of samples to compute distance between. If None, all pairs are weighted equally (=1/k).
metric (str | callable, optional) – ‘sqeuclidean’, ‘euclidean’, ‘cityblock’ or ‘minkowski’.
p (float, optional) – p-norm for the Minkowski metric. Default value is 2.
batch_size (int, optional) – If specified, compute the distance in batches of size batch_size to avoid memory issues for large datasets. Default is None (no batching).
dist – sum of the distance between \(\mathbf{x_1}_i\) and \(\mathbf{x_2}_j\) computed with given metric and weighted by w
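As an illustration of the quantity described above, the sketch below evaluates the distance only for the listed index pairs and returns their weighted sum, with uniform weights 1/k when w is None. The function name `dist_sparse_sketch` is hypothetical, batching is omitted, and the code is not the library's implementation.

```python
import numpy as np


def dist_sparse_sketch(x1, x2, i, j, w=None, metric="sqeuclidean", p=2):
    """Weighted sum of distances between x1[i[k]] and x2[j[k]] over the k pairs."""
    x1, x2 = np.asarray(x1, dtype=float), np.asarray(x2, dtype=float)
    i, j = np.asarray(i), np.asarray(j)
    diff = x1[i] - x2[j]  # shape (k, d): only the requested pairs
    if metric == "sqeuclidean":
        d = (diff ** 2).sum(axis=1)
    elif metric == "euclidean":
        d = np.sqrt((diff ** 2).sum(axis=1))
    elif metric == "cityblock":
        d = np.abs(diff).sum(axis=1)
    elif metric == "minkowski":
        d = (np.abs(diff) ** p).sum(axis=1) ** (1.0 / p)
    else:
        raise ValueError(f"unsupported metric in this sketch: {metric!r}")
    w = np.full(len(i), 1.0 / len(i)) if w is None else np.asarray(w, dtype=float)
    return float(np.sum(w * d))


x1 = np.array([[0.0, 0.0], [1.0, 1.0]])
x2 = np.array([[3.0, 4.0], [1.0, 1.0]])
val = dist_sparse_sketch(x1, x2, i=[0, 1], j=[0, 1])
```

Only k distances are computed, so the full (n1, n2) matrix is never formed.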
Return a uniform histogram of length n (simplex).
n (int) – number of bins in the histogram
type_as (array-like) – array of the same type of the expected output (numpy/pytorch/jax)
h – histogram of length n such that \(\forall i, \mathbf{h}_i = \frac{1}{n}\)
array-like, shape (n,)