Wasserstein DRO
- class dro.src.linear_model.wasserstein_dro.WassersteinDRO(input_dim, model_type='svm', fit_intercept=True, solver='MOSEK', kernel='linear')
Bases: BaseLinearDRO
Wasserstein Distributionally Robust Optimization (WDRO) model
This model minimizes a Wasserstein-robust loss function for both regression and classification.
The Wasserstein distance between two distributions is the minimum expected transport cost over all couplings of the two, where the cost of moving \((X_1, Y_1)\) to \((X_2, Y_2)\) is:
\[d((X_1, Y_1), (X_2, Y_2)) = (\|\Sigma^{1/2} (X_1 - X_2)\|_p)^{square} + \kappa |Y_1 - Y_2|,\]
where the parameters are:
\(\Sigma\): cost matrix (a PSD matrix);
\(\kappa\): weight on the label difference \(|Y_1 - Y_2|\);
\(p\): order of the norm applied to the feature difference;
square: exponent that depends on the model type, with square = 2 for ‘svm’, ‘logistic’, ‘lad’ and square = 1 for ‘ols’.
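The cost above can be evaluated directly for a single pair of points. The following is a minimal sketch, not the library's implementation: `ground_cost` and its `sigma_diag` argument are hypothetical names, and the cost matrix is assumed to be diagonal so that \(\Sigma^{1/2}\) reduces to a per-coordinate square root.

```python
def ground_cost(x1, y1, x2, y2, sigma_diag, p=2, kappa=1.0, square=2):
    """Cost d((x1, y1), (x2, y2)) for a diagonal cost matrix (illustrative only)."""
    # Sigma^{1/2} (x1 - x2): for diagonal Sigma, scale each coordinate by sqrt(s).
    scaled = [s ** 0.5 * (a - b) for s, a, b in zip(sigma_diag, x1, x2)]
    if p == 'inf':
        norm = max(abs(v) for v in scaled)
    else:
        norm = sum(abs(v) ** p for v in scaled) ** (1.0 / p)
    # Skip the label term when labels agree, so kappa = float('inf') stays finite.
    label_term = 0.0 if y1 == y2 else kappa * abs(y1 - y2)
    return norm ** square + label_term


# Unit feature shift and a label flip from +1 to -1, identity cost matrix.
cost = ground_cost([1.0, 0.0], 1, [0.0, 0.0], -1, sigma_diag=[1.0, 1.0])
print(cost)
```

With these inputs the feature term is \(1^2\) and the label term is \(1 \cdot 2\). Raising `kappa` makes label changes more expensive relative to feature changes.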
Reference:
[2] LAD / SVM / Logistic: <https://jmlr.org/papers/volume20/17-633/17-633.pdf>
Initialize Mahalanobis-Wasserstein DRO model.
- Parameters:
input_dim (int) – Dimension of the feature space. Must satisfy \(\text{input\_dim} \geq 1\).
model_type (str) –
Base model architecture. Supported:
'svm': Hinge loss (classification)
'logistic': Logistic loss (classification)
'ols': Least squares (regression)
'lad': Least absolute deviation (regression)
fit_intercept (bool) – Whether to learn the intercept term \(b\). Set to False for pre-centered data. Defaults to True.
solver (str) –
Convex optimization solver. Valid options:
'MOSEK' (commercial, recommended)
kernel (str) – the kernel type to be used in the optimization model, default = ‘linear’
- Raises:
If input_dim < 1
If unsupported solver is selected
- Example:
>>> model = WassersteinDRO(
...     input_dim=5,
...     model_type='svm',
...     solver='MOSEK'
... )
>>> model.cost_matrix.shape  # (5, 5)
Note
Changing cost_matrix after initialization requires calling update().
- update(config)
Update Wasserstein-DRO model parameters dynamically.
- Parameters:
config (dict) – Configuration dictionary with keys:
'cost_matrix': Mahalanobis metric matrix \(\Sigma^{-1} \succ 0\). Shape: (input_dim, input_dim). Type: numpy.ndarray
'eps': Wasserstein radius \(\epsilon \geq 0\)
'p': Wasserstein order \(p \geq 1\) or 'inf'
'kappa': Y-ambiguity radius \(\kappa \geq 0\) or 'inf'
- Raises:
If cost_matrix is not positive definite
If eps < 0
If p < 1 and p ≠ ‘inf’
If kappa < 0 and kappa ≠ ‘inf’
If cost_matrix is not numpy array
If numeric parameters are not float/int
- Return type:
- Example:
>>> import numpy as np
>>> model = WassersteinDRO(input_dim=3)
>>> new_config = {
...     'eps': 0.5,
...     'p': 2,
...     'cost_matrix': np.diag([1, 2, 3])
... }
>>> model.update(new_config)
>>> model.p  # 2.0
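The scalar constraints listed for update() can be checked before a configuration is passed in. The sketch below is an assumption-laden illustration: `validate_config` is a hypothetical helper, not part of the library, and the choice of `TypeError` and `ValueError` is a guess because the exception types are not stated here. It covers only 'eps', 'p', and 'kappa', not 'cost_matrix'.

```python
def validate_config(config):
    """Check 'eps', 'p', and 'kappa' against the documented ranges (illustrative)."""

    def check(name, lower, allow_inf):
        if name not in config:
            return
        value = config[name]
        # 'p' and 'kappa' may be given as the string 'inf'.
        if allow_inf and value == 'inf':
            return
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be int or float, got {type(value).__name__}")
        if value < lower:
            raise ValueError(f"{name} must be >= {lower}, got {value}")

    check('eps', 0, allow_inf=False)
    check('p', 1, allow_inf=True)
    check('kappa', 0, allow_inf=True)
    return config


checked = validate_config({'eps': 0.5, 'p': 'inf', 'kappa': 0})
```

Running such a check first gives an early, readable error instead of a failure inside the solver.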
- fit(X, y)
Fit the model using CVXPY to solve the WDRO problem.
- Parameters:
X (numpy.ndarray) – Training feature matrix of shape (n_samples, n_features). Must satisfy n_features == self.input_dim.
y (numpy.ndarray) –
Target values of shape (n_samples,). Format requirements:
Classification: ±1 labels
Regression: Continuous values
- Returns:
Dictionary containing trained parameters:
theta: Weight vector of shape (n_features,)
b: Intercept term
- Return type:
Dict[str, Any]
- Raises:
WassersteinDROError – If the optimization problem fails to solve.
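The returned dictionary is enough to score new points with a linear decision rule. The following is a small sketch under assumptions: `decision_values` and `predict_labels` are hypothetical helpers, the `params` dictionary is a hand-written stand-in for the output of fit(), and the sign rule applies to the classification models with ±1 labels.

```python
def decision_values(params, X):
    """Linear scores theta . x + b for each row of X."""
    theta, b = params['theta'], params['b']
    return [sum(t * x for t, x in zip(theta, row)) + b for row in X]


def predict_labels(params, X):
    """Map scores to ±1 labels (classification models only)."""
    return [1 if v >= 0 else -1 for v in decision_values(params, X)]


# Stand-in for the dictionary returned by fit(); the values are made up.
params = {'theta': [1.0, -1.0], 'b': 0.5}
X_new = [[2.0, 0.0], [0.0, 2.0]]
scores = decision_values(params, X_new)
labels = predict_labels(params, X_new)
```

For the regression models ('ols', 'lad'), the scores themselves would be the predictions.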
- worst_distribution(X, y, compute_type, gamma=0)
Compute worst-case distribution under Wasserstein ambiguity set.
- Parameters:
X (numpy.ndarray) – Input feature matrix. Shape: (n_samples, n_features). Must satisfy n_features == input_dim.
y (numpy.ndarray) –
Target vector. Shape: (n_samples,)
Classification: binary labels (-1/1)
Regression: continuous values
compute_type (str) –
Computation methodology. Options:
'asymp': Asymptotic approximation (faster, less accurate). Supported models: ['svm', 'logistic', 'lad']
'exact': Exact dual solution (slower, precise)
gamma (float) – Regularization parameter for the asymptotic method. Must satisfy \(\gamma > 0\) when compute_type='asymp'. Defaults to 0.
- Returns:
Dictionary containing:
'sample_pts': Worst-case sample locations. Shape: (m, n_features)
'weights': Probability weights. Shape: (m,) with \(\sum w_i = 1\)
- Return type:
- Raises:
If compute_type='asymp' with model_type='ols'
If compute_type='asymp' and kappa == 'inf'
If gamma ≤ 0 when required
If input dimensions mismatch
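The argument combinations listed above can be screened before calling worst_distribution(). This is only a sketch: `check_worst_distribution_args` is a hypothetical helper, and raising `ValueError` is an assumption since the exception types are not given here. The input-dimension check is omitted.

```python
def check_worst_distribution_args(model_type, compute_type, kappa, gamma=0):
    """Raise ValueError for combinations the documentation lists as invalid."""
    if compute_type not in ('asymp', 'exact'):
        raise ValueError(f"unknown compute_type: {compute_type!r}")
    if compute_type == 'asymp':
        if model_type == 'ols':
            raise ValueError("'asymp' is not documented for model_type='ols'")
        if kappa == 'inf':
            raise ValueError("'asymp' cannot be combined with kappa == 'inf'")
        if gamma <= 0:
            raise ValueError("gamma must be > 0 when compute_type='asymp'")
    return True


ok = check_worst_distribution_args('svm', 'asymp', kappa=1.0, gamma=0.1)
```

Note that the default gamma=0 fails this check for compute_type='asymp', so gamma has to be passed explicitly in that mode.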
- Example:
>>> import numpy as np
>>> X = np.random.randn(100, 3)
>>> y = 2 * np.random.randint(0, 2, 100) - 1  # labels in {-1, +1}
>>> model = WassersteinDRO(model_type='svm', input_dim=3)
>>> wc_dist = model.worst_distribution(X, y, 'asymp', gamma=0.1)
>>> wc_dist['weights'].sum()  # Approximately 1.0
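The returned dictionary describes a discrete distribution, so it can be sanity-checked and summarized with a few lines. The sketch below assumes only the two documented keys; `weighted_mean` is a hypothetical helper and the `dist` dictionary is hand-written example data, not real output.

```python
import math


def weighted_mean(dist, tol=1e-6):
    """Mean of the support points under the given weights, after basic checks."""
    pts, weights = dist['sample_pts'], dist['weights']
    if len(pts) != len(weights):
        raise ValueError("sample_pts and weights must have the same length")
    if abs(math.fsum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    n_features = len(pts[0])
    # Weighted average of each feature column.
    return [math.fsum(w * row[j] for w, row in zip(weights, pts))
            for j in range(n_features)]


# Hand-written stand-in with m = 2 support points in 2 dimensions.
dist = {'sample_pts': [[0.0, 2.0], [4.0, 0.0]], 'weights': [0.25, 0.75]}
mean = weighted_mean(dist)
```

Comparing this mean with the mean of the training features gives a rough sense of how far the worst-case distribution has moved the data.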
Note
The asymptotic method ignores curvature regularization (\(\kappa = \infty\)).
The exact method requires solver='MOSEK' for conic constraints.
Reference of Worst-case Distribution:
[1] SVM / Logistic / LAD: Theorem 20 (ii) in https://jmlr.org/papers/volume20/17-633/17-633.pdf, where eta is the theta in eq(27) and gamma = 0 in that equation.
[2] In all cases, we use a reduced dual case (e.g., Remark 5.2 of https://arxiv.org/pdf/2308.05414) to compute their worst-case distribution.
[3] General Worst-case Distributions can be found in: https://pubsonline.informs.org/doi/abs/10.1287/moor.2022.1275, where norm_theta is lambda* here.
- class dro.src.linear_model.wasserstein_dro.WassersteinDROsatisficing(input_dim, model_type, fit_intercept=True, solver='MOSEK', kernel='linear')
Bases: BaseLinearDRO
Robust satisficing version of Wasserstein DRO
This model solves an optimization problem subject to an approximated version of the robust satisficing constraint of Wasserstein DRO. The Wasserstein distance between two distributions is the minimum expected transport cost over all couplings of the two, where the cost of moving \((X_1, Y_1)\) to \((X_2, Y_2)\) is:
\[d((X_1, Y_1), (X_2, Y_2)) = (\|\Sigma^{1/2} (X_1 - X_2)\|_p)^{square} + \kappa |Y_1 - Y_2|,\]
Reference: <https://pubsonline.informs.org/doi/10.1287/opre.2021.2238>
Initialize Robust satisficing version of Wasserstein DRO.
- Parameters:
input_dim (int) – Feature space dimension. Must satisfy \(d \geq 1\)
model_type (str) –
Base model architecture. Supported:
'svm'
'logistic'
'ols'
'lad'
fit_intercept (bool) – Whether to learn intercept \(b\). Disable for standardized data. Defaults to True.
solver (str) –
Convex optimization solver. Options:
'MOSEK'
(commercial, recommended)
kernel (str) – the kernel type to be used in the optimization model, default = ‘linear’
- Raises:
If input_dim < 1
If invalid solver selected
- Initialization Defaults:
Cost matrix initialized as identity \(I_d\)
Target ratio \(\tau = 1/0.8\) (20% performance margin)
Wasserstein order \(p=1\) (earth mover’s distance)
- Example:
>>> model = WassersteinDROsatisficing(
...     input_dim=5,
...     model_type='svm',
...     solver='ECOS'
... )
>>> model.cost_matrix.shape  # (5, 5)
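The default target ratio can be read as plain arithmetic. The sketch below rests on an assumption that is not stated here: that \(\tau\) multiplies a baseline empirical loss to give the loss level the robust model is allowed to reach. `target_loss` and the baseline value are hypothetical and purely illustrative.

```python
def target_loss(baseline_loss, tau=1 / 0.8):
    """Loss target under the ASSUMED reading: target = tau * baseline loss."""
    return tau * baseline_loss


baseline = 0.40                  # made-up empirical loss of a non-robust fit
target = target_loss(baseline)   # tau = 1 / 0.8 = 1.25
ratio = baseline / target        # baseline as a fraction of the target
```

Under this reading, the baseline loss is 80% of the target, which matches the "20% performance margin" wording.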
- update(config)
Update model parameters based on configuration.
- fit(X, y)
Fit model to data by solving an optimization problem.
- Parameters:
X (numpy.ndarray) – input covariates
y (numpy.ndarray) – input labels
- Return type:
- fit_oracle(X, y)
Deprecated. Find the optimum under the given ambiguity constraint.
- Args:
X (np.ndarray): Input feature matrix with shape (n_samples, n_features).
y (np.ndarray): Target vector with shape (n_samples,).
- Returns:
float: robust objective value
- worst_distribution(X, y)