CVaR DRO

class dro.linear_model.cvar_dro.CVaRDRO(input_dim, model_type='svm', fit_intercept=True, solver='MOSEK', kernel='linear', alpha=1.0)

Bases: BaseLinearDRO

Conditional Value-at-Risk (CVaR) Distributionally Robust Optimization (DRO) model.

This model minimizes a robust loss function for both regression and classification tasks under a CVaR constraint.

Reference: <https://www.risk.net/journal-risk/2161159/optimization-conditional-value-risk>

Initialize a CVaR-constrained DRO model.

Parameters:

input_dim (int) – Number of input features. Must match training data dimension.
model_type (str) –
Base model architecture. Supported:
- 'svm': Hinge loss (classification)
- 'logistic': Logistic loss (classification)
- 'ols': Least squares (regression)
- 'lad': Least absolute deviation (regression)
fit_intercept (bool) – Whether to learn an intercept term. Disable if data is pre-centered. Defaults to True.
solver (str) – Convex optimization solver. Recommended: ‘MOSEK’ (commercial). Defaults to ‘MOSEK’.
alpha (float) – Risk level controlling the CVaR conservativeness. Must satisfy 0 < alpha ≤ 1. Defaults to 1.0 (equivalent to worst-case optimization).
kernel (str) – Kernel type. Defaults to ‘linear’.

Raises:

ValueError –

If model_type is not in [‘svm’, ‘logistic’, ‘ols’, ‘lad’]
If alpha is outside (0, 1]
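The four model_type options correspond to standard per-sample losses. The following is a minimal sketch of those losses in plain Python, assuming the usual textbook definitions and a precomputed linear score (weights times features, plus intercept). The function names and the LOSSES table are illustrative only and are not part of the library, whose internal scaling may differ.

```python
import math


def hinge_loss(y, score):
    """Hinge loss for a label y in {-1, +1} ('svm')."""
    return max(0.0, 1.0 - y * score)


def logistic_loss(y, score):
    """Logistic loss for a label y in {-1, +1} ('logistic')."""
    return math.log1p(math.exp(-y * score))


def squared_loss(y, score):
    """Squared error for a continuous target ('ols')."""
    return (y - score) ** 2


def absolute_loss(y, score):
    """Absolute error for a continuous target ('lad')."""
    return abs(y - score)


# Hypothetical lookup table keyed by the documented model_type strings.
LOSSES = {
    "svm": hinge_loss,
    "logistic": logistic_loss,
    "ols": squared_loss,
    "lad": absolute_loss,
}


def per_sample_loss(model_type, y, score):
    """Dispatch to the loss for model_type, rejecting unknown names."""
    if model_type not in LOSSES:
        raise ValueError(f"Unsupported model_type: {model_type!r}")
    return LOSSES[model_type](y, score)


print(per_sample_loss("svm", 1, 0.5))    # margin violated by 0.5
print(per_sample_loss("lad", 3.0, 1.0))  # absolute residual
```

The dispatch mirrors the documented ValueError for unsupported model_type values.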

update(config)

Dynamically update CVaR-DRO model configuration parameters.

Modifies risk-sensitive hyperparameters and optimization settings. Changes take effect immediately but require re-fitting the model to update solutions.

Parameters:

config (Dict[str, Any]) –

Dictionary of configuration updates. Supported keys:

alpha: Risk level parameter controlling CVaR conservativeness (must satisfy 0 < alpha ≤ 1)

Raises:

CVaRDROError –

If alpha is not in (0, 1]
If unrecognized configuration keys are provided (reserved for future use; per the note below, such keys are currently ignored)

Return type:

None

Example:

>>> from dro.linear_model.cvar_dro import CVaRDRO
>>> model = CVaRDRO(input_dim=5, alpha=0.95)
>>> model.update({"alpha": 0.9})  # Valid adjustment
>>> model.alpha
0.9
>>> model.update({"alpha": 1.5})  # Invalid value
Traceback (most recent call last):
    ...
CVaRDROError: Risk parameter 'alpha' must be in the range (0, 1].

Note

Decreasing alpha makes the model less conservative (focuses on average risk)
Requires manual re-fitting via fit() after configuration changes
Configuration keys other than alpha will be silently ignored
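The update semantics described above can be sketched with a small stand-in class. This is not the library's code: ToyCVaRConfig and its is_fitted flag are hypothetical, it raises ValueError rather than CVaRDROError, and it follows the note in treating keys other than alpha as ignored.

```python
class ToyCVaRConfig:
    """Hypothetical stand-in that mimics the documented update() behaviour."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.is_fitted = False  # illustrative flag, not a library attribute

    def update(self, config):
        if "alpha" in config:
            alpha = config["alpha"]
            if not 0 < alpha <= 1:
                raise ValueError(
                    "Risk parameter 'alpha' must be in the range (0, 1]."
                )
            self.alpha = float(alpha)
            # Any earlier solution was computed for the old alpha,
            # so it is marked stale until the model is fitted again.
            self.is_fitted = False
        # Keys other than 'alpha' are ignored, as the note describes.


cfg = ToyCVaRConfig(alpha=0.95)
cfg.update({"alpha": 0.9, "unused_key": 123})
print(cfg.alpha)
```

Validating before assignment keeps the previous alpha intact when an invalid value is supplied.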

fit(X, y)

Solve the CVaR-constrained distributionally robust optimization problem.

Constructs and solves the convex optimization problem to find model parameters that minimize the worst-case CVaR of the loss distribution. The solution defines a robust decision boundary/regression plane under adversarial distribution shifts.

Parameters:

X (numpy.ndarray) – Training feature matrix of shape (n_samples, n_features). Must match the input_dim specified during initialization.
y (numpy.ndarray) – Target values of shape (n_samples,). For classification, use ±1 labels; for regression, use continuous values.

Returns:

Dictionary containing trained parameters:

theta: Weight vector of shape (n_features,)
threshold: Optimal CVaR threshold value (stored as self.threshold_val)
b: Intercept term (only present if fit_intercept=True)

Return type:

Dict[str, Any]
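Because b is only present when fit_intercept=True, code that consumes the returned dictionary may want to read it defensively. The following is a minimal sketch, assuming a linear score of the form weights times features plus intercept. The helper decision_score is hypothetical and the parameter values are made up for illustration.

```python
def decision_score(params, x):
    """Linear score theta . x + b, treating a missing 'b' as 0."""
    theta = params["theta"]
    if len(theta) != len(x):
        raise ValueError("feature dimension mismatch")
    b = params.get("b", 0.0)  # absent when the model has no intercept
    return sum(t * xi for t, xi in zip(theta, x)) + b


# Made-up dictionaries shaped like the documented return value.
params_with_b = {"theta": [0.5, -1.0, 2.0], "threshold": 0.3, "b": 0.25}
params_no_b = {"theta": [0.5, -1.0, 2.0], "threshold": 0.3}

x = [2.0, 1.0, 0.5]
print(decision_score(params_with_b, x))  # 1.25
print(decision_score(params_no_b, x))    # 1.0
```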

Raises:

CVaRDROError –
- If the optimization solver fails to converge
- If the problem is infeasible with current alpha
ValueError –
- If X.shape[1] != self.input_dim
- If X.shape[0] != y.shape[0]

Optimization Formulation:

\[\min_{\theta,b} \ CVaR_\alpha(\ell(\theta, b; X, y)) \]

where:

\(CVaR_\alpha\) = Conditional Value-at-Risk at level \(\alpha\)

\(\ell\) = model-specific loss (hinge loss for SVM, squared loss for OLS, etc.)

Example:

>>> import numpy as np
>>> from dro.linear_model.cvar_dro import CVaRDRO
>>> model = CVaRDRO(input_dim=3, alpha=0.9, model_type='svm')
>>> X_train = np.random.randn(100, 3)
>>> y_train = np.sign(np.random.randn(100))
>>> params = model.fit(X_train, y_train)
>>> print(params["theta"].shape)  # (3,)
>>> print("threshold" in params)  # True
>>> print(model.threshold_val == params["threshold"])  # True

Note

Higher \(\alpha\) values require solving more conservative (pessimistic) scenarios
The threshold value represents the \(\alpha\)-quantile of the loss distribution
Warm-starting not supported due to CVaR’s non-smooth nature
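The notes above can be illustrated on a plain list of losses. The sketch below assumes the convention stated in these notes: the threshold is the empirical alpha-quantile, CVaR is the average of the losses at or above it, and alpha = 1 reduces to the maximum loss. The helper empirical_cvar is hypothetical, uses a simple order-statistic quantile, and does not reproduce the convex program solved by fit(), whose internal parameterization of alpha may differ.

```python
import math


def empirical_cvar(losses, alpha):
    """Return (threshold, cvar) for a list of losses at level alpha."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the range (0, 1].")
    ordered = sorted(losses)
    n = len(ordered)
    # Index of the empirical alpha-quantile; clamped so alpha = 1
    # selects the largest loss.
    k = min(n - 1, math.ceil(alpha * n) - 1)
    threshold = ordered[k]
    tail = ordered[k:]  # losses at or above the threshold
    return threshold, sum(tail) / len(tail)


losses = [float(v) for v in range(1, 11)]  # 1.0, 2.0, ..., 10.0
print(empirical_cvar(losses, 0.5))  # (5.0, 7.5)
print(empirical_cvar(losses, 1.0))  # (10.0, 10.0)
```

Raising alpha shrinks the tail toward the largest losses, which is the sense in which higher values are more conservative.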

worst_distribution(X, y, precision=1e-05)

Compute the worst-case distribution under CVaR constraint.

Identifies samples contributing to the alpha-tail risk distribution and assigns uniform weights to these high-loss scenarios. This represents the adversarial distribution that maximizes the conditional expected loss.

Parameters:

X (numpy.ndarray) – Feature matrix of shape (n_samples, n_features). Must match the model’s input_dim (n_features).
y (numpy.ndarray) –
Target vector of shape (n_samples,). Requires pre-processed labels:
- Binary classification: ±1 labels
- Regression: Continuous values
precision (float) – Perturbation tolerance for loss threshold comparison. Compensates for numerical instability in loss computations. Defaults to 1e-5.

Returns:

Dictionary containing:

sample_pts: Tuple (X_high, y_high) of the filtered feature matrix and targets, where X_high has shape (n_high_risk, n_features)
weight: Uniform probability weights of shape (n_high_risk,), summing to 1

Return type:

Dict[str, Any]

Raises:

CVaRDROError –

If model hasn’t been fitted (threshold_val is None)
If all sample losses fall below CVaR threshold (empty distribution)

Optimization Formulation:

\[\mathcal{P}_{\text{worst}} = \{ (X_i,y_i) \mid \ell(\theta;X_i,y_i) > \tau_\alpha + \epsilon \}\]

where:

\(\tau_\alpha\) = CVaR threshold from fit()

\(\epsilon\) = precision parameter

Weights are assigned uniformly: \(p_i = 1 / |\mathcal{P}_{\text{worst}}|\)
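The selection rule and uniform weighting above can be sketched in a few lines of plain Python. The helper worst_case_subset is hypothetical: it works on a precomputed list of per-sample losses, returns indices instead of filtered arrays, and raises ValueError where the library documents CVaRDROError.

```python
def worst_case_subset(losses, threshold, precision=1e-5):
    """Indices with loss > threshold + precision, plus uniform weights."""
    selected = [
        i for i, loss in enumerate(losses) if loss > threshold + precision
    ]
    if not selected:
        raise ValueError("no sample loss exceeds the threshold")
    # Every selected sample receives the same probability mass.
    weight = [1.0 / len(selected)] * len(selected)
    return selected, weight


losses = [0.1, 0.9, 0.5, 1.2]
selected, weight = worst_case_subset(losses, threshold=0.5)
print(selected)  # [1, 3]
print(weight)    # [0.5, 0.5]
```

The sample whose loss equals the threshold (index 2) is excluded because the comparison is strict and offset by the precision.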

Example:

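>>> import numpy as np
>>> from dro.linear_model.cvar_dro import CVaRDRO
>>> model = CVaRDRO(input_dim=3, alpha=0.95)
>>> X = np.random.randn(100, 3)
>>> y = np.sign(np.random.randn(100))
>>> _ = model.fit(X, y)  # returned parameter dict is not needed here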
>>> dist = model.worst_distribution(X, y)
>>> high_risk_X, high_risk_y = dist["sample_pts"]
>>> print(high_risk_X.shape[0] == len(dist["weight"]))  # True
>>> np.testing.assert_allclose(dist["weight"].sum(), 1.0, atol=1e-6)

Note

The returned distribution always includes samples with loss values exceeding \(\tau_\alpha + \epsilon\), where:
\(\tau_\alpha\) is the CVaR threshold from the fitted model
\(\epsilon\) (epsilon) is the precision parameter
If no sample loss exceeds \(\tau_\alpha + \epsilon\), returns an empty weight array and issues a UserWarning
The precision parameter mitigates false positives/negatives from floating-point errors in loss comparisons (default=1e-5)