Conditional CVaR DRO

class dro.src.linear_model.conditional_dro.ConditionalCVaRDRO(input_dim, model_type='svm', fit_intercept=True, solver='MOSEK', kernel='linear')

Bases: BaseLinearDRO

Y|X (ConditionalShiftBased) Conditional Value-at-Risk (Conditional-CVaR) Distributionally Robust Optimization (DRO) model that allows likelihood-ratio changes only in the conditional distribution Y|X.

This model minimizes a robust loss function for both regression and classification tasks under a CVaR constraint applied only to the distribution of Y|X.

The model follows Theorem 2 of the reference below. For simplicity, alpha(x) is taken to be beta^T x; alpha corresponds to Gamma in the paper.

Reference: <https://arxiv.org/pdf/2209.01754.pdf>

Initialize the linear DRO model.

Parameters:
  • input_dim (int) – Number of input features. Must be ≥ 1.

  • model_type (str) – Type of base model. Valid options: ‘svm’, ‘logistic’, ‘ols’, ‘lad’. Defaults to ‘svm’.

  • fit_intercept (bool) – Whether to learn an intercept term. Set False for pre-centered data. Defaults to True.

  • solver (str) – Optimization solver. See class-level documentation for recommended options. Defaults to ‘MOSEK’.

  • kernel (str) – Kernel type used in the optimization model. Defaults to ‘linear’.

Raises:

ValueError

  • If input_dim < 1

  • If model_type is not in [‘svm’, ‘logistic’, ‘ols’, ‘lad’]

Example:
>>> from dro.src.linear_model.conditional_dro import ConditionalCVaRDRO
>>> model = ConditionalCVaRDRO(input_dim=5, model_type='logistic')
>>> print(model.model_type)
logistic
>>> print(model.alpha)
1.0

update(config)

Update Conditional CVaR-DRO model configuration.

Modifies the controlled features and the risk level alpha. Changes apply to subsequent optimization only: the model is not re-fitted automatically, so call fit() again after updating.

Parameters:

config (Dict[str, Any]) –

Dictionary of configuration updates. Valid keys:

  • control_name: Indices of controlled features (0 ≤ index < input_dim)

  • alpha: Risk level for CVaR constraint (0 < alpha ≤ 1)

Raises:

ConditionalCVaRDROError

  • If control_name contains invalid indices

  • If alpha is outside (0, 1]

  • If unrecognized configuration keys are provided

Return type:

None

Control Features:

  • Controlled features (control_name) are protected from distribution shifts

  • Indices must satisfy: \(0 \leq \text{index} < \text{input\_dim}\)
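
The validation rules listed above can be sketched as a standalone function. This is a hypothetical stand-in, not the library's implementation: the name validate_config is invented for illustration, and it raises ValueError where update() is documented to raise ConditionalCVaRDROError.

```python
from typing import Any, Dict

# Keys documented as valid for update(); anything else is rejected.
VALID_KEYS = {"control_name", "alpha"}


def validate_config(config: Dict[str, Any], input_dim: int) -> None:
    """Hypothetical check mirroring the documented rules of update()."""
    unknown = set(config) - VALID_KEYS
    if unknown:
        raise ValueError(f"Unrecognized configuration keys: {sorted(unknown)}")

    # control_name=None is allowed and means "no controlled features".
    control = config.get("control_name")
    if control is not None:
        if not all(isinstance(i, int) and 0 <= i < input_dim for i in control):
            raise ValueError(
                "All indices in 'control_name' must be in [0, input_dim - 1]"
            )

    # alpha must lie in the half-open interval (0, 1].
    if "alpha" in config and not 0 < config["alpha"] <= 1:
        raise ValueError("alpha must be in (0, 1]")
```

Running the checks before mutating any state keeps a rejected update from leaving the model half-configured.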

Example:
>>> model = ConditionalCVaRDRO(input_dim=5)
>>> model.update({
...     'control_name': [0, 2],  # Protect 1st and 3rd features
...     'alpha': 0.95
... })
>>> model.update({'control_name': [5]})  # Invalid index for input_dim=5
Traceback (most recent call last):
    ...
ConditionalCVaRDROError: All indices in 'control_name' must be in [0, input_dim - 1]

Note

  • Setting control_name=None disables feature protection

  • Higher alpha values reduce conservatism: alpha = 1 (the default) corresponds to the average risk, while lower values put more weight on the worst-case tail

  • Configuration changes invalidate previous solutions (requires re-fitting)
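
To make the role of alpha concrete, the sketch below computes an empirical CVaR on a list of losses. It assumes the tail-fraction convention, in which CVaR at level alpha averages the worst alpha-fraction of losses; the function name empirical_cvar is hypothetical and the rounding to whole samples is a simplification.

```python
import math
from typing import Sequence


def empirical_cvar(losses: Sequence[float], alpha: float) -> float:
    """Average of the worst ceil(alpha * n) losses (tail-fraction convention)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    n = len(losses)
    k = max(1, math.ceil(alpha * n))          # number of tail samples kept
    worst = sorted(losses, reverse=True)[:k]  # largest losses first
    return sum(worst) / k


losses = [1.0, 2.0, 3.0, 4.0]
mean_risk = empirical_cvar(losses, 1.0)   # all samples: the plain average
tail_risk = empirical_cvar(losses, 0.5)   # worst half only
```

Under this convention, alpha = 1 reproduces the ordinary mean, and shrinking alpha can only increase the value.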

fit(X, y)

Solve the Conditional CVaR-constrained DRO problem via convex optimization.

Constructs and optimizes a distributionally robust model that minimizes the worst-case conditional expected loss, where the ambiguity set is determined by the CVaR level alpha and the controlled features (control_name).

Parameters:
  • X (numpy.ndarray) – Training feature matrix of shape (n_samples, n_features). Must satisfy n_features == self.input_dim.

  • y (numpy.ndarray) –

    Target values of shape (n_samples,). Format requirements:

    • Classification: ±1 labels

    • Regression: Continuous values
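
Classification data is often stored with 0/1 labels, which need to be mapped to the ±1 format before fitting. A minimal sketch, assuming binary labels in {0, 1} (the helper name to_signed_labels is hypothetical):

```python
from typing import List, Sequence


def to_signed_labels(y01: Sequence[int]) -> List[int]:
    """Map {0, 1} class labels to {-1, +1}."""
    if any(v not in (0, 1) for v in y01):
        raise ValueError("expected labels in {0, 1}")
    return [2 * v - 1 for v in y01]


y_signed = to_signed_labels([0, 1, 1, 0])
```

With NumPy arrays the same mapping is the elementwise expression 2 * y - 1.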

Returns:

Dictionary containing trained parameters:

  • theta: Weight vector of shape (n_features,)

  • b: Intercept term (present only if fit_intercept=True)

  • cvar_threshold: Optimal CVaR threshold value

Return type:

Dict[str, Any]
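
The returned dictionary can be used directly to score new points. The sketch below assumes a linear decision rule sign(theta^T x + b) for the classification case and treats a missing b as zero; predict_label is a hypothetical helper, and the library may provide its own prediction method.

```python
from typing import Any, Dict, Sequence


def predict_label(params: Dict[str, Any], x: Sequence[float]) -> int:
    """Return +1 or -1 from the sign of theta . x + b."""
    # 'b' is absent when the model was built with fit_intercept=False.
    score = sum(t * xi for t, xi in zip(params["theta"], x)) + params.get("b", 0.0)
    return 1 if score >= 0 else -1


params = {"theta": [1.0, -2.0], "b": 0.5}  # illustrative values, not fitted
label = predict_label(params, [1.0, 1.0])  # score = 1 - 2 + 0.5 = -0.5
```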

Raises:
  • ConditionalCVaRDROError

    • If optimization fails (solver error/infeasible)

    • If X and control features have dimension mismatch

  • ValueError

    • If X.shape[1] != self.input_dim

    • If X.shape[0] != y.shape[0]

Optimization Formulation:
\[\min_{\theta,b} \ \sup_{Q \in \mathcal{Q}} \mathbb{E}_Q[\ell(\theta,b;X,y)]\]

where the ambiguity set \(\mathcal{Q}\) satisfies:

\[\text{CVaR}_\alpha(\ell) \leq \tau \quad \text{and} \quad X_{\text{control}} = \mathbb{E}_P[X_{\text{control}}]\]

  • \(X_{\text{control}}\) = features specified by control_name

  • \(\tau\) = CVaR threshold (optimization variable)
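
As a rough numerical illustration, the sketch below evaluates a Rockafellar-Uryasev style objective in which the CVaR threshold is a linear function of the controlled features. Several things here are assumptions rather than the library's actual formulation: the hinge loss (as for model_type='svm'), the exact form of the objective, and the function names. fit() minimizes its objective over the model parameters with a convex solver; this code only evaluates one candidate.

```python
from typing import Sequence


def hinge(theta, b, x, y) -> float:
    """Hinge loss max(0, 1 - y * (theta . x + b)) for a single sample."""
    margin = y * (sum(t * xi for t, xi in zip(theta, x)) + b)
    return max(0.0, 1.0 - margin)


def conditional_cvar_objective(
    theta: Sequence[float],
    b: float,
    beta: Sequence[float],
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    control: Sequence[int],
    alpha: float,
) -> float:
    """Mean of eta(x) + max(0, loss - eta(x)) / alpha over the sample.

    eta(x) = beta . x[control] is the assumed per-sample threshold.
    """
    total = 0.0
    for xi, yi in zip(X, y):
        eta = sum(bj * xi[j] for bj, j in zip(beta, control))
        excess = max(0.0, hinge(theta, b, xi, yi) - eta)
        total += eta + excess / alpha
    return total / len(X)


X = [[1.0, 0.0], [0.0, 1.0]]
y = [1, -1]
# With theta = 0 and b = 0 every hinge loss equals 1.
avg = conditional_cvar_objective([0.0, 0.0], 0.0, [0.0], X, y, [0], alpha=1.0)
```

With a zero threshold and alpha = 1 the objective reduces to the average loss, which matches the reading of alpha = 1 as the non-robust case.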

Example:
>>> import numpy as np
>>> model = ConditionalCVaRDRO(input_dim=5)
>>> model.update({'control_name': [0, 2], 'alpha': 0.9})
>>> X_train = np.random.randn(100, 5)
>>> y_train = np.sign(np.random.randn(100))
>>> params = model.fit(X_train, y_train)
>>> assert params["theta"].shape == (5,)
>>> assert "b" in params
>>> print(f"CVaR threshold: {params['cvar_threshold']:.4f}")

Note

  • Controlled features (control_name) are assumed fixed under distribution shifts

  • Solution cache is invalidated after calling update()