TV DRO

class dro.src.linear_model.tv_dro.TVDRO(input_dim, model_type='svm', fit_intercept=True, solver='MOSEK', kernel='linear', eps=0.0)

Bases: BaseLinearDRO

Total Variation Distributionally Robust Optimization (TV-DRO) model.

Implements DRO with TV ambiguity set defined as:

\[\mathcal{P} = \{ Q \, | \, \text{TV}(Q, P) \leq \epsilon \}\]

where \(\text{TV}\) is the total variation distance.
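
For reference, a common special case: when \(P\) and \(Q\) are both supported on the same \(n\) points with weights \(p_i\) and \(q_i\) (an assumption made here for illustration; the source does not state which form the implementation uses), the total variation distance reduces to

\[\text{TV}(Q, P) = \frac{1}{2} \sum_{i=1}^{n} |q_i - p_i|\]

which always lies in \([0, 1]\). Under this form, \(\epsilon\) can be read as the largest fraction of probability mass that \(Q\) may move relative to \(P\).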

Initialize TV-constrained DRO model.

Parameters:
  • input_dim (int) – Feature space dimension. Must be ≥ 1.

  • model_type (str) –

    Base model architecture. Supported:

    • 'svm': Hinge loss (classification)

    • 'logistic': Logistic loss (classification)

    • 'ols': Least squares (regression)

    • 'lad': Least absolute deviation (regression)

  • fit_intercept (bool) – Whether to learn intercept term \(b\). Disable for pre-centered data. Defaults to True.

  • solver (str) – Convex optimization solver. Recommended: 'MOSEK' (commercial). Defaults to 'MOSEK'.

  • kernel (str) – Kernel type used in the optimization model. Defaults to 'linear'.

  • eps (float) –

    TV ambiguity radius. Special cases:

    • 0: Standard empirical risk minimization

    • >0: Controls distributional robustness
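
The four model_type options correspond to standard per-sample losses. The sketch below writes them out in their textbook form, assuming a linear score `score = theta·x + b`; the function name `sample_loss` is hypothetical, and the library may scale or parameterize these losses differently.

```python
import math


def sample_loss(model_type, score, y):
    """Textbook per-sample loss for each model_type (illustrative only).

    score is the linear prediction theta.x + b; y is a ±1 label for
    classification or a real value for regression.
    """
    if model_type == 'svm':
        # Hinge loss: zero once the margin y * score reaches 1.
        return max(0.0, 1.0 - y * score)
    if model_type == 'logistic':
        # Logistic loss: log(1 + exp(-y * score)).
        return math.log1p(math.exp(-y * score))
    if model_type == 'ols':
        # Squared error.
        return (y - score) ** 2
    if model_type == 'lad':
        # Absolute error.
        return abs(y - score)
    raise ValueError(f"Unsupported model_type: {model_type!r}")
```

With eps set to 0, the model minimizes the plain average of such losses over the training set; with eps > 0, it minimizes the worst-case expectation over the TV ambiguity set.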

Raises:

ValueError

  • If input_dim < 1

  • If eps < 0

Example:
>>> model = TVDRO(
...     input_dim=5,
...     model_type='svm',
...     eps=0.1
... )
>>> model.eps  # 0.1

threshold_val

Decision boundary threshold (set during fitting)

update(config)

Update the model configuration.

Parameters:

config (Dict[str, Any]) –

Dictionary containing configuration updates. Supported keys:

  • eps: Robustness parameter controlling the size of the TV ambiguity set (must be ≥ 0)

Raises:

TVDROError – If ‘eps’ is not in the valid range (0, 1).

Return type:

None

fit(X, y)

Fit the model using CVXPY to solve the robust optimization problem with TV constraint.

Parameters:
  • X (numpy.ndarray) – Training feature matrix of shape (n_samples, n_features). Must satisfy n_features == self.input_dim.

  • y (numpy.ndarray) –

    Target values of shape (n_samples,). Format requirements:

    • Classification: ±1 labels

    • Regression: Continuous values
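
Classification targets must be ±1, so labels stored in another binary encoding (for example 0/1) need converting before calling fit. A minimal sketch, assuming a hypothetical helper `to_plus_minus_one` that is not part of the library:

```python
def to_plus_minus_one(labels, positive):
    """Map binary labels to +1 (the positive class) and -1 (the other class)."""
    classes = set(labels)
    if len(classes) > 2:
        raise ValueError(f"Expected at most two classes, got {sorted(classes)}")
    if positive not in classes:
        raise ValueError(f"Positive class {positive!r} not found in labels")
    return [1 if label == positive else -1 for label in labels]


y_raw = [0, 1, 1, 0]
y_pm1 = to_plus_minus_one(y_raw, positive=1)  # [-1, 1, 1, -1]
```

The resulting list can be wrapped with numpy.asarray to obtain the (n_samples,) array that fit expects.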

Returns:

Dictionary containing trained parameters:

  • theta: Weight vector of shape (n_features,)

  • threshold: Decision boundary threshold (see threshold_val)

  • b: Intercept term

Return type:

Dict[str, Any]

Raises:

TVDROError – If the optimization problem fails to solve.

worst_distribution(X, y, precision=1e-05)

Compute the worst-case distribution based on TV constraint.

Parameters:
  • X (numpy.ndarray) – Feature matrix of shape (n_samples, n_features). Must match the model’s input_dim (n_features).

  • y (numpy.ndarray) – Target vector of shape (n_samples,). For regression tasks, continuous values are expected; for classification, ±1 labels.

  • precision (float) – Defaults to 1e-05.

Returns:

Dictionary containing:

  • sample_pts: Original data points as a tuple (X, y)

  • weight: Worst-case probability weights of shape (n_samples,)

Return type:

Dict[str, Any]

Raises:

TVDROError – If the worst-case distribution calculation fails.
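
To give intuition for what a worst-case distribution under a TV constraint looks like, the sketch below works out one simplified case: the adversary may only reweight the existing sample points, starting from uniform weights. In that setting the worst case moves up to eps of probability mass from the lowest-loss samples onto the highest-loss sample. This is an illustration under those stated assumptions, not the library's solver; the function name `worst_case_weights` and the loss values are hypothetical.

```python
def worst_case_weights(losses, eps):
    """Reweight uniform sample weights to maximize expected loss,
    subject to TV(Q, P) = 0.5 * sum(|q_i - p_i|) <= eps."""
    n = len(losses)
    weights = [1.0 / n] * n
    worst = max(range(n), key=lambda i: losses[i])
    # Mass that can still be moved; never more than exists off the worst point.
    budget = min(eps, 1.0 - weights[worst])
    # Take mass from the lowest-loss samples first.
    for i in sorted(range(n), key=lambda i: losses[i]):
        if budget <= 0.0:
            break
        if i == worst:
            continue
        moved = min(weights[i], budget)
        weights[i] -= moved
        weights[worst] += moved
        budget -= moved
    return weights


losses = [0.1, 0.5, 2.0, 0.3]
weights = worst_case_weights(losses, eps=0.25)
tv = 0.5 * sum(abs(w - 0.25) for w in weights)
```

Here the sample with loss 0.1 gives up all of its 0.25 mass to the sample with loss 2.0, so the total variation distance between the reweighted and the uniform distribution equals the eps budget of 0.25.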