Chi-square DRO

class dro.src.linear_model.chi2_dro.Chi2DRO(input_dim, model_type='svm', fit_intercept=True, solver='MOSEK', kernel='linear')

Bases: BaseLinearDRO

Chi-squared Distributionally Robust Optimization (chi2-DRO) model.

This model minimizes the worst-case expected loss over a chi-squared divergence ball centered at the empirical distribution, and supports both regression and classification losses.

Reference: <https://www.jmlr.org/papers/volume20/17-750/17-750.pdf>

Initialize the Chi-squared Distributionally Robust Optimization (Chi2-DRO) model.

Parameters:
  • input_dim (int) – Dimensionality of the input feature space. Must match the number of columns in the training data.

  • model_type (str) –

    Base model architecture. Supported:

    • 'svm': Hinge loss (classification)

    • 'logistic': Logistic loss (classification)

    • 'ols': Least squares (regression)

    • 'lad': Least absolute deviation (regression)

  • fit_intercept (bool) – Whether to learn an intercept/bias term. If False, assumes data is already centered. Defaults to True.

  • solver (str) – Convex optimization solver. Recommended: 'MOSEK' (requires a license). Defaults to 'MOSEK'.

  • kernel (str) – Kernel type used in the optimization model. Defaults to 'linear'.

Raises:

ValueError

  • If model_type is not in ['svm', 'logistic', 'ols', 'lad']

  • If input_dim ≤ 0

Note

  • 'lad' (L1 loss) produces sparse solutions but requires longer solve times
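
Example (a minimal instantiation sketch using the options above):
>>> clf = Chi2DRO(input_dim=10, model_type='logistic')  # classification, logistic loss
>>> reg = Chi2DRO(input_dim=10, model_type='lad', fit_intercept=False)  # regression, no intercept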

update(config={})

Update the Chi-squared DRO model configuration parameters.

Dynamically modify robustness settings and optimization parameters after model initialization. Changes will affect subsequent operations (e.g., re-fitting the model).

Parameters:

config (Dict[str, Any]) –

Dictionary containing configuration updates. Supported keys:

  • eps: Robustness parameter controlling the size of the chi-squared ambiguity set (must be ≥ 0)

  • solver: Optimization solver to use (must be installed)

Raises:

Chi2DROError

  • If eps is not a non-negative numeric value

  • If unrecognized configuration keys are provided

Example:
>>> model = Chi2DRO(input_dim=5)
>>> model.update({"eps": 0.5})  # Valid update
>>> model.eps  # Verify new value
0.5
>>> model.update({"eps": "invalid"})  # Will raise error
Traceback (most recent call last):
    ...
Chi2DROError: Robustness parameter 'eps' must be a non-negative float.

Note

  • Configuration changes don't trigger automatic re-optimization; re-fit the model to apply them (see the sketch below)

  • Larger eps values make solutions more conservative
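
Since updates are not applied retroactively, a typical pattern is to update and then re-solve (a usage sketch; X_train and y_train are placeholders shaped as described in fit()):
>>> model.update({"eps": 1.0})
>>> params = model.fit(X_train, y_train)  # re-solve under the new radius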

fit(X, y)

Train the Chi-squared DRO model by solving the convex optimization problem.

Constructs and solves the distributionally robust optimization problem using CVXPY, where the ambiguity set is defined by the chi-squared divergence. The optimization objective and constraints are built dynamically based on input data.

Parameters:
  • X (numpy.ndarray) – Training feature matrix. Must have shape (n_samples, n_features), where n_features should match the input_dim specified during initialization.

  • y (numpy.ndarray) – Target values. For classification tasks, expected to be binary (±1 labels). Shape must be (n_samples,).

Returns:

Dictionary containing the trained model parameters:

  • theta: Weight vector of shape (n_features,)

  • b: Intercept term (only present if fit_intercept=True)

Return type:

Dict[str, Any]

Raises:
  • Chi2DROError

    • If the optimization solver fails to converge

    • If the problem is infeasible due to invalid hyperparameters

  • ValueError

    • If X and y have inconsistent sample sizes (X.shape[0] != y.shape[0])

    • If X has incorrect feature dimension (X.shape[1] != input_dim)

Optimization Problem:
\[\min_{\theta,b} \max_{P \in \mathcal{P}} \mathbb{E}_P[\ell(\theta, b; X, y)]\]

where \(\mathcal{P}\) is the ambiguity set defined by chi-squared divergence:

\[\mathcal{P} = \{ P: D_{\chi^2}(P \| P_0) \leq \epsilon \}\]
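
When the non-negativity constraints on \(P\) are inactive at the maximizer, the inner maximum admits a closed form (via Cauchy–Schwarz; see also the reference above), revealing the variance-regularization effect of the chi-squared ball:

\[\max_{P \in \mathcal{P}} \mathbb{E}_P[\ell] = \mathbb{E}_{P_0}[\ell] + \sqrt{\epsilon \, \mathrm{Var}_{P_0}(\ell)}\]

where \(P_0\) is the empirical distribution. This is why larger eps values produce more conservative, variance-averse solutions.
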
Example:
>>> import numpy as np
>>> model = Chi2DRO(input_dim=5, fit_intercept=True)
>>> model.update({"eps": 0.5})  # set the robustness radius
>>> X_train = np.random.randn(100, 5)
>>> y_train = np.sign(np.random.randn(100))  # Binary labels
>>> params = model.fit(X_train, y_train)
>>> print(params["theta"].shape)  # (5,)
>>> print("b" in params)  # True

Note

  • Large values of eps increase robustness but may lead to conservative solutions

  • Warm-starting is not supported due to DRO problem structure
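
For intuition, the inner maximum over \(\mathcal{P}\) can be dualized (standard \(\phi\)-divergence duality with \(\phi(t) = (t-1)^2\)) into

\[\max_{P \in \mathcal{P}} \mathbb{E}_P[\ell] = \inf_{\lambda \geq 0,\, \eta} \ \lambda(\epsilon - 1) + \eta + \frac{1}{4 n \lambda} \sum_{i=1}^n \big(\ell_i - \eta + 2\lambda\big)_+^2\]

so the min-max collapses to a single convex program in \((\theta, b, \lambda, \eta)\). A minimal from-scratch sketch for the hinge loss ('svm'), illustrative only and not necessarily the library's internal formulation:
>>> import cvxpy as cp
>>> import numpy as np
>>> n, d, eps = 100, 5, 0.5
>>> X, y = np.random.randn(n, d), np.sign(np.random.randn(n))
>>> theta, b = cp.Variable(d), cp.Variable()
>>> lam, eta = cp.Variable(nonneg=True), cp.Variable()  # dual variables
>>> ell = cp.pos(1 - cp.multiply(y, X @ theta + b))  # per-sample hinge losses
>>> worst = lam * (eps - 1) + eta + cp.quad_over_lin(cp.pos(ell - eta + 2 * lam), 4 * n * lam)
>>> _ = cp.Problem(cp.Minimize(worst)).solve()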

worst_distribution(X, y)

Compute the worst-case distribution within the chi-squared ambiguity set.

This method solves a convex optimization problem to find the probability distribution that maximizes the expected loss under the chi-squared divergence constraint. The result characterizes the adversarial data distribution the model is robust against.

Parameters:
  • X (numpy.ndarray) – Feature matrix of shape (n_samples, n_features). Must match the model’s input_dim (n_features).

  • y (numpy.ndarray) – Target vector of shape (n_samples,). For regression tasks, continuous values are expected; for classification, ±1 labels.

Returns:

Dictionary containing:

  • sample_pts: Original data points as a tuple (X, y)

  • weight: Worst-case probability weights of shape (n_samples,)

Return type:

Dict[str, Any]

Raises:
  • Chi2DROError

    • If the optimization solver fails to converge

    • If the solution is infeasible or returns null weights

  • ValueError

    • If X and y have inconsistent sample sizes

    • If X feature dimension ≠ input_dim

Optimization Formulation:
\[\max_{p \in \Delta} \ \sum_{i=1}^n p_i \, \ell_i \quad \text{s.t.} \quad n \sum_{i=1}^n (p_i - 1/n)^2 \leq \epsilon\]

where:

  • \(\ell_i\) is the loss for the i-th sample

  • \(\Delta\) is the probability simplex

  • \(\epsilon\) is the robustness parameter self.eps
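
This finite-dimensional problem can be written directly in CVXPY. A minimal sketch, assuming a precomputed per-sample loss vector losses (the method itself computes the losses from the fitted parameters):
>>> import cvxpy as cp
>>> import numpy as np
>>> losses = np.random.rand(100)  # placeholder per-sample losses
>>> n, eps = len(losses), 0.5
>>> p = cp.Variable(n, nonneg=True)  # probability weights
>>> constraints = [cp.sum(p) == 1, n * cp.sum_squares(p - 1.0 / n) <= eps]
>>> _ = cp.Problem(cp.Maximize(losses @ p), constraints).solve()
>>> weights = p.value  # worst-case probability weights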

Example:
>>> import numpy as np
>>> model = Chi2DRO(input_dim=5)
>>> model.update({"eps": 0.5})
>>> X = np.random.randn(100, 5)
>>> y = np.sign(np.random.randn(100))  # Binary labels
>>> _ = model.fit(X, y)  # worst_distribution requires a fitted model
>>> dist = model.worst_distribution(X, y)
>>> print(dist["weight"].shape)  # (100,)
>>> np.testing.assert_allclose(dist["weight"].sum(), 1.0, rtol=1e-3)  # Sum to 1

Note

  • The weights are guaranteed to be non-negative and sum to 1

  • Larger eps allows more deviation from the empirical distribution

  • Requires prior model fitting via fit()
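
Since weight is aligned with the rows of X, it can be used to inspect which samples the adversarial distribution emphasizes (a usage sketch continuing the example above):
>>> hardest = np.argsort(dist["weight"])[-5:]  # five most heavily up-weighted samples
>>> X_hard, y_hard = X[hardest], y[hardest]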