Chi-square DRO¶
- class dro.src.linear_model.chi2_dro.Chi2DRO(input_dim, model_type='svm', fit_intercept=True, solver='MOSEK', kernel='linear')¶
Bases:
BaseLinearDRO
Chi-squared Distributionally Robust Optimization (chi2-DRO) model.
This model minimizes a chi-squared robust loss function for both regression and classification.
Reference: <https://www.jmlr.org/papers/volume20/17-750/17-750.pdf>
Initialize the Chi-squared Distributionally Robust Optimization (Chi2-DRO) model.
- Parameters:
input_dim (int) – Dimensionality of the input feature space. Must match the number of columns in the training data.
model_type (str) –
Base model architecture. Supported:
'svm': Hinge loss (classification)
'logistic': Logistic loss (classification)
'ols': Least squares (regression)
'lad': Least absolute deviation (regression)
fit_intercept (bool) – Whether to learn an intercept/bias term. If False, assumes data is already centered. Defaults to True.
solver (str) – Convex optimization solver. Recommended solver: 'MOSEK' (requires license). Defaults to 'MOSEK'.
kernel (str) – Kernel type used in the optimization model. Defaults to 'linear'.
- Raises:
If model_type is not in [‘svm’, ‘logistic’, ‘ols’, ‘lad’]
If input_dim ≤ 0
Note
‘lad’ (L1 loss) produces sparse solutions but requires longer solve times
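A minimal instantiation sketch (the import path follows the class signature above; adjust it to your installation):
>>> from dro.src.linear_model.chi2_dro import Chi2DRO
>>> clf = Chi2DRO(input_dim=5, model_type='svm')  # hinge-loss classifier
>>> reg = Chi2DRO(input_dim=5, model_type='lad')  # L1-loss regressor (slower to solve)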
- update(config={})¶
Update the Chi-squared DRO model configuration parameters.
Dynamically modify robustness settings and optimization parameters after model initialization. Changes will affect subsequent operations (e.g., re-fitting the model).
- Parameters:
config (Dict[str, Any]) –
Dictionary containing configuration updates. Supported keys:
eps: Robustness parameter controlling the size of the chi-squared ambiguity set (must be ≥ 0)
solver: Optimization solver to use (must be installed)
- Raises:
Chi2DROError –
If eps is not a non-negative numeric value
If unrecognized configuration keys are provided
- Example:
>>> model = Chi2DRO(input_dim=5)
>>> model.update({"eps": 0.5})  # Valid update
>>> model.eps  # Verify new value
0.5
>>> model.update({"eps": "invalid"})  # Will raise error
Traceback (most recent call last):
    ...
Chi2DROError: Robustness parameter 'eps' must be a non-negative float.
Note
Configuration changes don’t trigger automatic re-optimization
Larger eps values make solutions more conservative
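Because updates do not re-optimize, a changed eps takes effect only on the next fit. A brief sketch (X_train and y_train as in the fit() example below):
>>> model.update({"eps": 1.0})  # enlarge the ambiguity set
>>> params = model.fit(X_train, y_train)  # the new eps is used here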
- fit(X, y)¶
Train the Chi-squared DRO model by solving the convex optimization problem.
Constructs and solves the distributionally robust optimization problem using CVXPY, where the ambiguity set is defined by the chi-squared divergence. The optimization objective and constraints are built dynamically based on input data.
- Parameters:
X (numpy.ndarray) – Training feature matrix. Must have shape (n_samples, n_features), where n_features should match the input_dim specified during initialization.
y (numpy.ndarray) – Target values. For classification tasks, expected to be binary (±1 labels). Shape must be (n_samples,).
- Returns:
Dictionary containing the trained model parameters:
theta: Weight vector of shape (n_features,)
b: Intercept term (only present if fit_intercept=True)
- Return type:
Dict[str, Any]
- Raises:
Chi2DROError –
If the optimization solver fails to converge
If the problem is infeasible due to invalid hyperparameters
If X and y have inconsistent sample sizes (X.shape[0] != y.shape[0])
If X has incorrect feature dimension (X.shape[1] != input_dim)
- Optimization Problem:
- \[\min_{\theta,b} \max_{P \in \mathcal{P}} \mathbb{E}_P[\ell(\theta, b; X, y)]\]
where \(\mathcal{P}\) is the ambiguity set defined by chi-squared divergence:
\[\mathcal{P} = \{ P: D_{\chi^2}(P \| P_0) \leq \epsilon \}\]
- Example:
>>> model = Chi2DRO(input_dim=5, fit_intercept=True)
>>> model.update({"eps": 0.5})  # eps is set via update(), not the constructor
>>> X_train = np.random.randn(100, 5)
>>> y_train = np.sign(np.random.randn(100))  # Binary labels
>>> params = model.fit(X_train, y_train)
>>> print(params["theta"].shape)  # (5,)
>>> print("b" in params)  # True
Note
Large values of eps increase robustness but may lead to conservative solutions
Warm-starting is not supported due to DRO problem structure
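For intuition, the inner maximization over the chi-squared ball admits an exact dual (see the reference above), so the full min-max can be posed as a single convex program. Below is a minimal CVXPY sketch for the hinge loss under the ambiguity set defined in the formulation above; it illustrates the formulation only, not the library's internal code, and chi2_dro_hinge is a hypothetical helper:
>>> import numpy as np
>>> import cvxpy as cp
>>> def chi2_dro_hinge(X, y, eps):
...     n, d = X.shape
...     theta, b, eta = cp.Variable(d), cp.Variable(), cp.Variable()
...     losses = cp.pos(1 - cp.multiply(y, X @ theta + b))  # per-sample hinge loss
...     # Dual of sup over {P : D_chi2(P || P_n) <= eps} of E_P[loss]:
...     #   inf_eta  sqrt(1 + eps) * sqrt(mean((loss - eta)_+^2)) + eta
...     worst = np.sqrt(1 + eps) * cp.norm(cp.pos(losses - eta), 2) / np.sqrt(n) + eta
...     cp.Problem(cp.Minimize(worst)).solve()  # any installed conic solver works
...     return theta.value, b.value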
- worst_distribution(X, y)¶
Compute the worst-case distribution within the chi-squared ambiguity set.
This method solves a convex optimization problem to find the probability distribution that maximizes the expected loss under the chi-squared divergence constraint. The result characterizes the adversarial data distribution the model is robust against.
- Parameters:
X (numpy.ndarray) – Feature matrix of shape (n_samples, n_features). Must match the model’s input_dim (n_features).
y (numpy.ndarray) – Target vector of shape (n_samples,). For regression tasks, continuous values are expected; for classification, ±1 labels.
- Returns:
Dictionary containing:
sample_pts: Original data points as a tuple (X, y)
weight: Worst-case probability weights of shape (n_samples,)
- Return type:
Dict[str, Any]
- Raises:
Chi2DROError –
If the optimization solver fails to converge
If the solution is infeasible or returns null weights
If X and y have inconsistent sample sizes
If X feature dimension ≠ input_dim
- Optimization Formulation:
- \[\max_{p \in \Delta} \ \sum_{i=1}^n p_i \ell_i \quad \text{s.t.} \quad n \sum_{i=1}^n (p_i - 1/n)^2 \leq \epsilon\]
where:
\(\ell_i\) is the loss for the i-th sample
\(\Delta\) is the probability simplex
\(\epsilon\) is the robustness parameter self.eps
- Example:
>>> model = Chi2DRO(input_dim=5)
>>> model.update({"eps": 0.5})  # eps is set via update(), not the constructor
>>> X = np.random.randn(100, 5)
>>> y = np.sign(np.random.randn(100))  # Binary labels
>>> _ = model.fit(X, y)  # must fit before computing the worst-case distribution
>>> dist = model.worst_distribution(X, y)
>>> print(dist["weight"].shape)  # (100,)
>>> np.testing.assert_allclose(dist["weight"].sum(), 1.0, rtol=1e-3)  # Sums to 1
Note
The weights are guaranteed to be non-negative and sum to 1
Larger eps allows more deviation from the empirical distribution
Requires prior model fitting via fit()
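The weights can also be reproduced directly from the formulation above: given the fitted model's per-sample losses, the worst case is a small quadratically constrained program. A minimal CVXPY sketch (losses is assumed to be a NumPy array of per-sample losses; worst_case_weights is a hypothetical helper, not part of this API):
>>> import cvxpy as cp
>>> def worst_case_weights(losses, eps):
...     n = len(losses)
...     p = cp.Variable(n, nonneg=True)  # Delta: p >= 0 with sum(p) == 1
...     cons = [cp.sum(p) == 1,
...             n * cp.sum_squares(p - 1.0 / n) <= eps]  # chi-squared ball
...     cp.Problem(cp.Maximize(losses @ p), cons).solve()
...     return p.value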