Chi-square DRO

class dro.src.linear_model.chi2_dro.Chi2DRO(input_dim, model_type='svm', fit_intercept=True, solver='MOSEK', kernel='linear')

Bases: BaseLinearDRO

Chi-squared Distributionally Robust Optimization (chi2-DRO) model.

This model minimizes the worst-case expected loss over a chi-squared divergence ball centered at the empirical distribution, and supports both regression and classification losses.

Reference: <https://www.jmlr.org/papers/volume20/17-750/17-750.pdf>

Initialize the Chi-squared Distributionally Robust Optimization (Chi2-DRO) model.

Parameters:
  • input_dim (int) – Dimensionality of the input feature space. Must match the number of columns in the training data.

  • model_type (str) –

    Base model architecture. Supported:

    • 'svm': Hinge loss (classification)

    • 'logistic': Logistic loss (classification)

    • 'ols': Least squares (regression)

    • 'lad': Least absolute deviation (regression)

  • fit_intercept (bool) – Whether to learn an intercept/bias term. If False, assumes data is already centered. Defaults to True.

  • solver (str) – Convex optimization solver. Recommended: 'MOSEK' (requires a license). Defaults to 'MOSEK'.

  • kernel (str) – Kernel type used in the optimization model. Defaults to 'linear'.

Raises:

ValueError

  • If model_type is not in ['svm', 'logistic', 'ols', 'lad']

  • If input_dim ≤ 0

Note

  • 'lad' (L1 loss) produces sparse solutions but requires longer solve times
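
Example (a minimal instantiation sketch using the options above):
>>> clf = Chi2DRO(input_dim=10, model_type='logistic')  # classification, logistic loss
>>> reg = Chi2DRO(input_dim=10, model_type='lad', fit_intercept=False)  # regression, no intercept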

update(config={})

Update the Chi-squared DRO model configuration parameters.

Dynamically modify robustness settings and optimization parameters after model initialization. Changes will affect subsequent operations (e.g., re-fitting the model).

Parameters:

config (Dict[str, Any]) –

Dictionary containing configuration updates. Supported keys:

  • eps: Robustness parameter controlling the size of the chi-squared ambiguity set (must be ≥ 0)

  • solver: Optimization solver to use (must be installed)

Raises:

Chi2DROError

  • If eps is not a non-negative numeric value

  • If unrecognized configuration keys are provided

Example:
>>> model = Chi2DRO(input_dim=5)
>>> model.update({"eps": 0.5})  # Valid update
>>> model.eps  # Verify new value
0.5
>>> model.update({"eps": "invalid"})  # Will raise error
Traceback (most recent call last):
    ...
Chi2DROError: Robustness parameter 'eps' must be a non-negative float.

Note

  • Configuration changes don't trigger automatic re-optimization; re-fit the model to apply them (see the sketch below)

  • Larger eps values make solutions more conservative
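
Since updates are not applied retroactively, a typical pattern is to update and then re-solve (a usage sketch; X_train and y_train are placeholders shaped as described in fit()):
>>> model.update({"eps": 1.0})
>>> params = model.fit(X_train, y_train)  # re-solve under the new radius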

fit(X, y)

Train the Chi-squared DRO model by solving the convex optimization problem.

Constructs and solves the distributionally robust optimization problem using CVXPY, where the ambiguity set is defined by the chi-squared divergence. The optimization objective and constraints are built dynamically based on input data.

Parameters:
  • X (numpy.ndarray) – Training feature matrix. Must have shape (n_samples, n_features), where n_features should match the input_dim specified during initialization.

  • y (numpy.ndarray) – Target values. For classification tasks, expected to be binary (±1 labels). Shape must be (n_samples,).

Returns:

Dictionary containing the trained model parameters:

  • theta: Weight vector of shape (n_features,)

  • b: Intercept term (only present if fit_intercept=True)

Return type:

Dict[str, Any]

Raises:
  • Chi2DROError

    • If the optimization solver fails to converge

    • If the problem is infeasible due to invalid hyperparameters

  • ValueError

    • If X and y have inconsistent sample sizes (X.shape[0] != y.shape[0])

    • If X has incorrect feature dimension (X.shape[1] != input_dim)

Optimization Problem:
\[\min_{\theta,b} \max_{P \in \mathcal{P}} \mathbb{E}_P[\ell(\theta, b; X, y)]\]

where \(\mathcal{P}\) is the ambiguity set defined by chi-squared divergence:

\[\mathcal{P} = \{ P: D_{\chi^2}(P \| P_0) \leq \epsilon \}\]
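
When the non-negativity constraints on \(P\) are inactive at the maximizer, the inner maximum admits a closed form (via Cauchy–Schwarz; see also the reference above), revealing the variance-regularization effect of the chi-squared ball:

\[\max_{P \in \mathcal{P}} \mathbb{E}_P[\ell] = \mathbb{E}_{P_0}[\ell] + \sqrt{\epsilon \, \mathrm{Var}_{P_0}(\ell)}\]

where \(P_0\) is the empirical distribution. This is why larger eps values produce more conservative, variance-averse solutions.
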
Example:
>>> import numpy as np
>>> model = Chi2DRO(input_dim=5, fit_intercept=True)
>>> model.update({"eps": 0.5})  # set the robustness radius
>>> X_train = np.random.randn(100, 5)
>>> y_train = np.sign(np.random.randn(100))  # Binary labels
>>> params = model.fit(X_train, y_train)
>>> print(params["theta"].shape)  # (5,)
>>> print("b" in params)  # True

Note

  • Large values of eps increase robustness but may lead to conservative solutions

  • Warm-starting is not supported due to DRO problem structure
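
For intuition, the inner maximum over \(\mathcal{P}\) can be dualized (standard \(\phi\)-divergence duality with \(\phi(t) = (t-1)^2\)) into

\[\max_{P \in \mathcal{P}} \mathbb{E}_P[\ell] = \inf_{\lambda \geq 0,\, \eta} \ \lambda(\epsilon - 1) + \eta + \frac{1}{4 n \lambda} \sum_{i=1}^n \big(\ell_i - \eta + 2\lambda\big)_+^2\]

so the min-max collapses to a single convex program in \((\theta, b, \lambda, \eta)\). A minimal from-scratch sketch for the hinge loss ('svm'), illustrative only and not necessarily the library's internal formulation:
>>> import cvxpy as cp
>>> import numpy as np
>>> n, d, eps = 100, 5, 0.5
>>> X, y = np.random.randn(n, d), np.sign(np.random.randn(n))
>>> theta, b = cp.Variable(d), cp.Variable()
>>> lam, eta = cp.Variable(nonneg=True), cp.Variable()  # dual variables
>>> ell = cp.pos(1 - cp.multiply(y, X @ theta + b))  # per-sample hinge losses
>>> worst = lam * (eps - 1) + eta + cp.quad_over_lin(cp.pos(ell - eta + 2 * lam), 4 * n * lam)
>>> _ = cp.Problem(cp.Minimize(worst)).solve()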

worst_distribution(X, y)

Compute the worst-case distribution within the chi-squared ambiguity set.

This method solves a convex optimization problem to find the probability distribution that maximizes the expected loss under the chi-squared divergence constraint. The result characterizes the adversarial data distribution the model is robust against.

Parameters:
  • X (numpy.ndarray) – Feature matrix of shape (n_samples, n_features). Must match the model’s input_dim (n_features).

  • y (numpy.ndarray) – Target vector of shape (n_samples,). For regression tasks, continuous values are expected; for classification, ±1 labels.

Returns:

Dictionary containing:

  • sample_pts: Original data points as a tuple (X, y)

  • weight: Worst-case probability weights of shape (n_samples,)

Return type:

Dict[str, Any]

Raises:
  • Chi2DROError

    • If the optimization solver fails to converge

    • If the solution is infeasible or returns null weights

  • ValueError

    • If X and y have inconsistent sample sizes

    • If X feature dimension ≠ input_dim

Optimization Formulation:
\[\max_{p \in \Delta} \ \sum_{i=1}^n p_i \, \ell_i \quad \text{s.t.} \quad n \sum_{i=1}^n (p_i - 1/n)^2 \leq \epsilon\]

where:

  • \(\ell_i\) is the loss for the i-th sample

  • \(\Delta\) is the probability simplex

  • \(\epsilon\) is the robustness parameter self.eps
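
This finite-dimensional problem can be written directly in CVXPY. A minimal sketch, assuming a precomputed per-sample loss vector losses (the method itself computes the losses from the fitted parameters):
>>> import cvxpy as cp
>>> import numpy as np
>>> losses = np.random.rand(100)  # placeholder per-sample losses
>>> n, eps = len(losses), 0.5
>>> p = cp.Variable(n, nonneg=True)  # probability weights
>>> constraints = [cp.sum(p) == 1, n * cp.sum_squares(p - 1.0 / n) <= eps]
>>> _ = cp.Problem(cp.Maximize(losses @ p), constraints).solve()
>>> weights = p.value  # worst-case probability weights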

Example:
>>> import numpy as np
>>> model = Chi2DRO(input_dim=5)
>>> model.update({"eps": 0.5})
>>> X = np.random.randn(100, 5)
>>> y = np.sign(np.random.randn(100))  # Binary labels
>>> _ = model.fit(X, y)  # worst_distribution requires a fitted model
>>> dist = model.worst_distribution(X, y)
>>> print(dist["weight"].shape)  # (100,)
>>> np.testing.assert_allclose(dist["weight"].sum(), 1.0, rtol=1e-3)  # Sum to 1

Note

  • The weights are guaranteed to be non-negative and sum to 1

  • Larger eps allows more deviation from the empirical distribution

  • Requires prior model fitting via fit()
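
Since weight is aligned with the rows of X, it can be used to inspect which samples the adversarial distribution emphasizes (a usage sketch continuing the example above):
>>> hardest = np.argsort(dist["weight"])[-5:]  # five most heavily up-weighted samples
>>> X_hard, y_hard = X[hardest], y[hardest]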