Outlier-Robust WDRO
- class dro.src.linear_model.or_wasserstein_dro.ORWDRO(input_dim, model_type='svm', solver='MOSEK', eps=0.0, eta=0.0, dual_norm=1)
Bases: BaseLinearDRO
Outlier-Robust Wasserstein Distributionally Robust Optimization (OR-WDRO) model.
Implements TV-corrupted p-Wasserstein DRO with dual norm constraints:
\[\min_{\theta} \sup_{Q \in \mathcal{B}_\epsilon(P)} \mathbb{E}_Q[\ell(\theta;X,y)] + \eta \cdot \text{TV}(P,Q)\]
where \(\mathcal{B}_\epsilon(P)\) is the Wasserstein ball of radius \(\epsilon\) around \(P\) and \(\text{TV}\) is the total variation distance.
ORWDRO_Paper: https://arxiv.org/pdf/2311.05573
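To make the total variation term concrete, the following is a minimal sketch of TV between two discrete distributions on a shared support, using the standard identity TV(P, Q) = ½ Σ|pᵢ − qᵢ|. The helper name `total_variation` is hypothetical and is not part of the `dro` package; it only illustrates how relocating a fraction of the probability mass shows up as a TV distance of the same size.

```python
import numpy as np

def total_variation(p, q):
    """TV distance between two discrete distributions on the same support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

# Uniform empirical distribution over four points.
p = np.array([0.25, 0.25, 0.25, 0.25])
# Same support, but the mass of the last point has been moved onto the first.
q = np.array([0.50, 0.25, 0.25, 0.00])

tv = total_variation(p, q)  # one quarter of the mass was relocated
```

Here a quarter of the mass is moved, so the two distributions are at TV distance 0.25.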
Initialize OR-WDRO model with anomaly-robust parameters.
- Parameters:
input_dim (int) – Feature space dimension. Must be ≥ 1
model_type (str) –
Base learner type. Supported:
'svm': Hinge loss (classification)
'lad': Least absolute deviation (regression)
eps (float) –
Wasserstein robustness radius. Defaults to 0.0
0: Standard empirical risk minimization
>0: Controls distributional robustness
eta (float) – Expected outlier fraction. Must satisfy \(0 \leq \eta \leq 0.5\). Defaults to 0.0
dual_norm (int) –
Wasserstein dual norm order. Defaults to 1. Valid values:
1: ℓ¹-Wasserstein (transportation cost)
2: ℓ²-Wasserstein
solver (str) – Optimization solver. Defaults to 'MOSEK'.
- Raises:
If input_dim < 1
If model_type not in allowed set
If eps < 0 or eta < 0 or eta > 0.5
If dual_norm ∉ {1, 2}
- Example:
>>> model = ORWDRO(
...     input_dim=5,
...     model_type='svm',
...     eps=0.1,
...     eta=0.05,
...     dual_norm=2
... )
>>> model.sigma  # sqrt(5) ≈ 2.236
Note
Computational complexity scales as \(O(\epsilon^{-2})\).
Set eta=0 to disable outlier robustness.
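The two supported `model_type` values correspond to standard per-sample losses. The sketch below writes them out in plain NumPy for a linear predictor without an intercept. That is a simplifying assumption, and the function names `hinge_loss` and `lad_loss` are illustrative rather than taken from the library.

```python
import numpy as np

def hinge_loss(theta, X, y):
    """Per-sample hinge loss max(0, 1 - y * <theta, x>) for labels in {-1, +1}."""
    margins = y * (X @ theta)
    return np.maximum(0.0, 1.0 - margins)

def lad_loss(theta, X, y):
    """Per-sample absolute deviation |y - <theta, x>|."""
    return np.abs(y - X @ theta)

theta = np.array([1.0, -1.0])
X = np.array([[2.0, 0.0],
              [0.0, 0.5],
              [1.0, 1.0]])
y_cls = np.array([1.0, 1.0, -1.0])   # classification targets (±1)
y_reg = np.array([1.5, -0.5, 0.0])   # regression targets

hinge = hinge_loss(theta, X, y_cls)  # zero only where the margin is at least 1
lad = lad_loss(theta, X, y_reg)      # zero only where the prediction is exact
```

Both losses are piecewise linear in `theta`, which is what makes them amenable to conic solvers.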
- update(config)
Update robustness parameters for OR-WDRO optimization.
- Parameters:
config (dict) – Dictionary containing parameters to update. Valid keys:
'eps': New Wasserstein radius (ε ≥ 0)
'eta': New outlier fraction (0 ≤ η ≤ 0.5)
'dual_norm': Norm order (1 or 2)
- Raises:
If parameter values violate type or range constraints
If unknown parameters are provided
- Return type:
- Example:
>>> model.update({
...     'eps': 0.2,
...     'eta': 0.1,
...     'dual_norm': 2
... })  # Updates multiple parameters atomically
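One way to get the validate-then-apply behavior described above is to check every entry before changing anything. The sketch below does this on a plain dictionary. `apply_update` is a hypothetical helper, and raising `ValueError` is an assumption, since the exception type used by the library is not stated here.

```python
def apply_update(params, config):
    """Return a new parameter dict with `config` applied, or raise ValueError.

    Hypothetical helper: every entry is validated before anything changes,
    so a bad value leaves `params` untouched.
    """
    allowed = {"eps", "eta", "dual_norm"}
    unknown = set(config) - allowed
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    if "eps" in config and not config["eps"] >= 0:
        raise ValueError("eps must be >= 0")
    if "eta" in config and not 0 <= config["eta"] <= 0.5:
        raise ValueError("eta must lie in [0, 0.5]")
    if "dual_norm" in config and config["dual_norm"] not in (1, 2):
        raise ValueError("dual_norm must be 1 or 2")
    return {**params, **config}

params = {"eps": 0.0, "eta": 0.0, "dual_norm": 1}
updated = apply_update(params, {"eps": 0.2, "eta": 0.1})

# An out-of-range eta is rejected and the original parameters are unchanged.
try:
    apply_update(params, {"eps": 0.3, "eta": 0.9})
    rejected = False
except ValueError:
    rejected = True
```

Returning a new dictionary rather than mutating in place keeps a partially valid `config` from being half applied.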
- fit(X, y)
- Parameters:
X (numpy.ndarray) – Training feature matrix of shape (n_samples, n_features). Must satisfy n_features == self.input_dim.
y (numpy.ndarray) –
Target values of shape (n_samples,). Format requirements:
Classification: ±1 labels
Regression: Continuous values
- Returns:
Dictionary containing trained parameters:
theta: Weight vector of shape (n_features,)
- Return type:
Dict[str, Any]
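Because classification targets must be ±1 and the feature count must match `input_dim`, it can help to normalize the data before calling `fit`. The following is a sketch of such a preprocessing step for labels stored as 0/1. The helper `prepare_svm_data` is hypothetical and not part of the library.

```python
import numpy as np

def prepare_svm_data(X, y01, input_dim):
    """Check shapes and map {0, 1} labels to {-1, +1} (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y01 = np.asarray(y01)
    if X.ndim != 2 or X.shape[1] != input_dim:
        raise ValueError(f"X must have shape (n_samples, {input_dim})")
    if y01.shape != (X.shape[0],):
        raise ValueError("y must have shape (n_samples,)")
    if not np.isin(y01, [0, 1]).all():
        raise ValueError("labels must be 0 or 1")
    y = 2.0 * y01 - 1.0  # 0 -> -1, 1 -> +1
    return X, y

X_raw = [[0.1, 0.2, 0.3],
         [1.0, 0.0, 2.0],
         [0.5, 0.5, 0.5],
         [2.0, 1.0, 0.0]]
X, y = prepare_svm_data(X_raw, [0, 1, 1, 0], input_dim=3)
```

The resulting `X` and `y` have the shapes and label encoding listed in the parameter description above.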
- worst_distribution(X, y)
Compute the worst-case distribution.
Reference: Theorem 3 in https://arxiv.org/pdf/2311.05573
- Args:
X (np.ndarray): Input feature matrix with shape (n_samples, n_features).
y (np.ndarray): Target vector with shape (n_samples,).
- Returns:
Dict[str, Any]: Dictionary containing 'sample_pts' and 'weight' keys for the worst-case distribution.
- Raises:
ORWDROError: If the worst-case distribution optimization fails.
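A returned worst-case distribution can be used to evaluate the risk a fitted model would incur under it. The sketch below assumes, without confirmation from this page, that `'sample_pts'` holds a pair `[X_w, y_w]` of support points and that `'weight'` is a 1-D probability vector aligned with them. The dictionary is built by hand for illustration, and `weighted_hinge_risk` is a hypothetical helper.

```python
import numpy as np

def weighted_hinge_risk(theta, worst):
    """Expected hinge loss under a discrete distribution (assumed layout)."""
    X_w, y_w = worst["sample_pts"]            # assumed: [features, targets]
    w = np.asarray(worst["weight"], dtype=float)
    if (w < 0).any() or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    losses = np.maximum(0.0, 1.0 - y_w * (X_w @ theta))
    return float(w @ losses)

# Hand-built stand-in for the dictionary returned by worst_distribution.
worst = {
    "sample_pts": [np.array([[1.0, 0.0],
                             [0.0, 1.0]]),
                   np.array([1.0, -1.0])],
    "weight": np.array([0.25, 0.75]),
}
theta = np.array([0.5, 0.5])
risk = weighted_hinge_risk(theta, worst)
```

If the actual return layout differs, only the unpacking line needs to change; the weighted average itself is independent of how the points are stored.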