CVaR DRO

class dro.src.linear_model.cvar_dro.CVaRDRO(input_dim, model_type='svm', fit_intercept=True, solver='MOSEK', kernel='linear', alpha=1.0)

Bases: BaseLinearDRO

Conditional Value-at-Risk (CVaR) Distributionally Robust Optimization (DRO) model.

This model minimizes a robust loss function for both regression and classification tasks under a CVaR constraint.

Reference: <https://www.risk.net/journal-risk/2161159/optimization-conditional-value-risk>

Initialize a CVaR-constrained DRO model.

Parameters:
  • input_dim (int) – Number of input features. Must match training data dimension.

  • model_type (str) –

    Base model architecture. Supported:

    • 'svm': Hinge loss (classification)

    • 'logistic': Logistic loss (classification)

    • 'ols': Least squares (regression)

    • 'lad': Least absolute deviation (regression)

  • fit_intercept (bool) – Whether to learn an intercept term. Disable if data is pre-centered. Defaults to True.

  • solver (str) – Convex optimization solver. Recommended: ‘MOSEK’ (commercial). Defaults to ‘MOSEK’.

  • alpha (float) – Risk level controlling the CVaR conservativeness. Must satisfy 0 < alpha ≤ 1. Defaults to 1.0 (equivalent to worst-case optimization).

  • kernel (str) – Kernel type. Defaults to ‘linear’.

Raises:

ValueError

  • If model_type is not in [‘svm’, ‘logistic’, ‘ols’, ‘lad’]

  • If alpha is outside (0, 1]
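The checks listed under Raises can be pictured with a small stand-alone sketch. The helper name `validate_cvar_args` and the error messages are hypothetical, and the accepted model types are taken from the Parameters list above; the library's own constructor may validate differently.

```python
# Model types named in the Parameters list of this page.
SUPPORTED_MODEL_TYPES = ("svm", "logistic", "ols", "lad")


def validate_cvar_args(model_type, alpha):
    """Hypothetical helper mirroring the documented ValueError conditions."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"Unsupported model_type: {model_type!r}")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    # Normalize alpha to float so later arithmetic behaves consistently.
    return model_type, float(alpha)
```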

update(config)

Dynamically update CVaR-DRO model configuration parameters.

Modifies risk-sensitive hyperparameters and optimization settings. Changes take effect immediately but require re-fitting the model to update solutions.

Parameters:

config (Dict[str, Any]) –

Dictionary of configuration updates. Supported keys:

  • alpha: Risk level parameter controlling CVaR conservativeness (must satisfy 0 < alpha ≤ 1)

Raises:

CVaRDROError

  • If alpha is not in (0, 1]

  • If unrecognized configuration keys are provided (future-proofing)

Return type:

None

Example:
>>> from dro.src.linear_model.cvar_dro import CVaRDRO
>>> model = CVaRDRO(input_dim=5, alpha=0.95)
>>> model.update({"alpha": 0.9})  # Valid adjustment
>>> model.alpha
0.9
>>> model.update({"alpha": 1.5})  # Invalid value
Traceback (most recent call last):
    ...
CVaRDROError: Risk parameter 'alpha' must be in the range (0, 1].

Note

  • Decreasing alpha makes the model less conservative (focuses on average risk)

  • Requires manual re-fitting via fit() after configuration changes

  • Configuration keys other than alpha will be silently ignored

fit(X, y)

Solve the CVaR-constrained distributionally robust optimization problem.

Constructs and solves the convex optimization problem to find model parameters that minimize the worst-case CVaR of the loss distribution. The solution defines a robust decision boundary/regression plane under adversarial distribution shifts.

Parameters:
  • X (numpy.ndarray) – Training feature matrix of shape (n_samples, n_features). Must match the input_dim specified during initialization.

  • y (numpy.ndarray) – Target values of shape (n_samples,). For classification, use ±1 labels; for regression, use continuous values.

Returns:

Dictionary containing trained parameters:

  • theta: Weight vector of shape (n_features,)

  • threshold: Optimal CVaR threshold value (stored as self.threshold_val)

  • b: Intercept term (only present if fit_intercept=True)

Return type:

Dict[str, Any]
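A returned dictionary can be consumed as in the sketch below. The `params` values are made-up numbers rather than fitted output, the helper `linear_scores` is hypothetical, and reading the sign of the score as the predicted ±1 label is an assumption about linear classifiers in general, not a documented method of this class.

```python
def linear_scores(params, rows):
    """Compute theta . x + b for each row; b defaults to 0 when absent."""
    theta = params["theta"]
    b = params.get("b", 0.0)  # "b" is only present if fit_intercept=True
    return [sum(w * x for w, x in zip(theta, row)) + b for row in rows]


# Made-up parameters in the documented layout (not real fit() output).
params = {"theta": [0.5, -1.0, 2.0], "threshold": 0.3, "b": 0.25}
rows = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

scores = linear_scores(params, rows)
labels = [1 if s >= 0 else -1 for s in scores]
```

Using `params.get("b", 0.0)` keeps the same code path working for models fitted with fit_intercept=False.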

Raises:
  • CVaRDROError

    • If the optimization solver fails to converge

    • If the problem is infeasible with current alpha

  • ValueError

    • If X.shape[1] != self.input_dim

    • If X.shape[0] != y.shape[0]

Optimization Formulation:
\[\min_{\theta,b} \ CVaR_\alpha(\ell(\theta, b; X, y)) \]

where:

  • \(CVaR_\alpha\) = Conditional Value-at-Risk at level \(\alpha\)

  • \(\ell\) = model-specific loss (hinge loss for SVM, squared loss for OLS, etc.)
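To make the objective concrete, the sketch below evaluates an empirical CVaR on a fixed list of losses using the Rockafellar-Uryasev form \(t + \frac{1}{\beta n}\sum_i \max(\ell_i - t, 0)\), minimized over \(t\). Here `tail_fraction` (\(\beta\)) is the fraction of worst losses being averaged; how it maps onto this class's alpha is not stated in this section, so treat that mapping, and the function names, as assumptions. The sketch uses only the standard library and does not call the solver.

```python
def ru_objective(losses, t, tail_fraction):
    """Rockafellar-Uryasev objective: t + mean(max(loss - t, 0)) / tail_fraction."""
    n = len(losses)
    excess = sum(max(loss - t, 0.0) for loss in losses)
    return t + excess / (tail_fraction * n)


def empirical_cvar(losses, tail_fraction):
    """Minimize the objective over t.

    The objective is convex and piecewise linear in t with kinks at the
    sample losses, so checking those candidates is enough.
    """
    if not 0 < tail_fraction <= 1:
        raise ValueError("tail_fraction must lie in (0, 1]")
    return min(ru_objective(losses, t, tail_fraction) for t in losses)


losses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
worst_fifth = empirical_cvar(losses, 0.2)  # mean of the two largest losses
everything = empirical_cvar(losses, 1.0)   # plain average of all losses
```

With these ten losses, a tail fraction of 0.2 gives the mean of the two largest losses (9.5), and a tail fraction of 1.0 reduces to the ordinary mean (5.5).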

Example:
>>> import numpy as np
>>> from dro.src.linear_model.cvar_dro import CVaRDRO
>>> model = CVaRDRO(input_dim=3, alpha=0.9, model_type='svm')
>>> X_train = np.random.randn(100, 3)
>>> y_train = np.sign(np.random.randn(100))
>>> params = model.fit(X_train, y_train)
>>> print(params["theta"].shape)  # (3,)
>>> print("threshold" in params)  # True
>>> print(model.threshold_val == params["threshold"])  # True

Note

  • Higher \(\alpha\) values require solving more conservative (pessimistic) scenarios

  • The threshold value represents the \(\alpha\)-quantile of the loss distribution

  • Warm-starting not supported due to CVaR’s non-smooth nature

worst_distribution(X, y, precision=1e-05)

Compute the worst-case distribution under CVaR constraint.

Identifies samples contributing to the alpha-tail risk distribution and assigns uniform weights to these high-loss scenarios. This represents the adversarial distribution that maximizes the conditional expected loss.

Parameters:
  • X (numpy.ndarray) – Feature matrix of shape (n_samples, n_features). Must match the model’s input_dim (n_features).

  • y (numpy.ndarray) –

    Target vector of shape (n_samples,). Requires pre-processed labels:

    • Binary classification: ±1 labels

    • Regression: Continuous values

  • precision (float) – Perturbation tolerance for loss threshold comparison. Compensates for numerical instability in loss computations. Defaults to 1e-5.

Returns:

Dictionary containing:

  • sample_pts: Tuple of filtered feature matrix and targets (X_high, y_high), where X_high has shape (n_high_risk, n_features)

  • weight: Uniform probability weights of shape (n_high_risk,) summing to 1

Return type:

Dict[str, Any]

Raises:

CVaRDROError

  • If model hasn’t been fitted (threshold_val is None)

  • If all sample losses fall below CVaR threshold (empty distribution)

Optimization Formulation:
\[\mathcal{P}_{\text{worst}} = \{ (X_i,y_i) \mid \ell(\theta;X_i,y_i) > \tau_\alpha + \epsilon \}\]

where:

  • \(\tau_\alpha\) = CVaR threshold from fit()

  • \(\epsilon\) = precision parameter

  • Weights are assigned uniformly: \(p_i = 1 / |\mathcal{P}_{\text{worst}}|\)
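The selection rule above can be sketched without the library. The function name `tail_distribution` and the sample numbers are hypothetical; the sketch returns indices instead of the (X_high, y_high) tuple to stay short, and raising ValueError on an empty selection is a placeholder for whatever the library does in that case.

```python
def tail_distribution(losses, threshold, precision=1e-5):
    """Pick samples with loss > threshold + precision and weight them uniformly."""
    selected = [i for i, loss in enumerate(losses) if loss > threshold + precision]
    if not selected:
        # Placeholder behavior; the library's own handling may differ.
        raise ValueError("no sample loss exceeds the threshold")
    weight = 1.0 / len(selected)
    return selected, [weight] * len(selected)


# Arbitrary per-sample losses and threshold, for illustration only.
indices, weights = tail_distribution([0.1, 0.9, 0.5, 1.2], threshold=0.5)
```

Here the third sample sits exactly at the threshold and is excluded, so the two remaining high-loss samples each receive weight 0.5.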

Example:
>>> import numpy as np
>>> from dro.src.linear_model.cvar_dro import CVaRDRO
>>> model = CVaRDRO(input_dim=3, alpha=0.95)
>>> X = np.random.randn(100, 3)
>>> y = np.sign(np.random.randn(100))
>>> model.fit(X, y)
>>> dist = model.worst_distribution(X, y)
>>> high_risk_X, high_risk_y = dist["sample_pts"]
>>> print(high_risk_X.shape[0] == len(dist["weight"]))  # True
>>> np.testing.assert_allclose(dist["weight"].sum(), 1.0, atol=1e-6)

Note

  • The returned distribution always includes the samples whose loss exceeds \(\tau_\alpha + \epsilon\), where \(\tau_\alpha\) is the CVaR threshold from the fitted model and \(\epsilon\) is the precision parameter

  • If all sample losses are below \(\tau_\alpha + \epsilon\), returns an empty weight array and emits a UserWarning

  • The precision parameter mitigates false positives/negatives from floating-point errors in loss comparisons (default=1e-5)
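A small floating-point illustration of why such a tolerance helps. The numbers are arbitrary and unrelated to any fitted model.

```python
threshold = 0.3
loss = 0.1 + 0.2   # mathematically equal to the threshold, stored as 0.30000000000000004
precision = 1e-5

# Strict comparison: rounding error alone pushes the sample into the tail.
strict = loss > threshold

# Comparison with tolerance: the rounding error is absorbed.
tolerant = loss > threshold + precision
```

In this case `strict` is True while `tolerant` is False, so the sample is only counted as high-loss when no tolerance is applied.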