Bayesian DRO

class dro.src.linear_model.bayesian_dro.BayesianDRO(input_dim, model_type='svm', fit_intercept=True, solver='MOSEK', eps=0.0, distance_type='KL')

Bases: BaseLinearDRO

This model minimizes a Bayesian distributionally robust optimization (DRO) objective for regression and other loss types (see model_type).

Reference: <https://epubs.siam.org/doi/10.1137/21M1465548>

Initialize the Bayesian DRO model.

Parameters:
  • input_dim (int) – Dimensionality of the input features.

  • model_type (str) – Type of base model. Supported values: ‘svm’ (SVM), ‘logistic’ (logistic regression), ‘ols’ (linear regression with L2 loss), ‘lad’ (linear regression with L1 loss). Defaults to ‘svm’.

  • fit_intercept (bool) – Whether to fit an intercept term. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered). Defaults to True.

  • solver (str) – Optimization solver. Supported solvers: ‘MOSEK’. Defaults to ‘MOSEK’.

  • eps (float) – Robustness parameter: the size of the ambiguity set, measured in the divergence selected by distance_type (KL by default). A higher value increases robustness. Defaults to 0.0 (non-robust).

  • distance_type (str) – Distance used to define the ambiguity set in the DRO model. Supported values: ‘KL’, ‘chi2’. Defaults to ‘KL’.
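The four model_type values correspond to four per-sample losses. The following is a minimal NumPy sketch of those losses, not the library's implementation: the helper name per_sample_loss is hypothetical, and the {-1, +1} label convention for ‘svm’ and ‘logistic’ is an assumption.

```python
import numpy as np

def per_sample_loss(theta, b, X, y, model_type="svm"):
    """Illustrative per-sample loss for each documented model_type (hypothetical helper)."""
    scores = X @ theta + b
    if model_type == "svm":        # hinge loss; labels assumed to be in {-1, +1}
        return np.maximum(0.0, 1.0 - y * scores)
    if model_type == "logistic":   # logistic loss; labels assumed to be in {-1, +1}
        return np.logaddexp(0.0, -y * scores)
    if model_type == "ols":        # squared (L2) loss
        return (y - scores) ** 2
    if model_type == "lad":        # absolute (L1) loss
        return np.abs(y - scores)
    raise ValueError(f"Unsupported model_type: {model_type!r}")

X = np.array([[1.0, 0.0], [0.0, 2.0]])
y = np.array([1.0, -1.0])
theta = np.array([0.5, 0.5])

hinge = per_sample_loss(theta, 0.0, X, y, "svm")     # [0.5, 2.0]
squared = per_sample_loss(theta, 0.0, X, y, "ols")   # [0.25, 4.0]
absolute = per_sample_loss(theta, 0.0, X, y, "lad")  # [0.5, 2.0]
```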

update(config)

Update the model configuration dynamically.

Modify parameters like robustness level (eps), optimization solver, distance metric type, or other algorithm settings during runtime.

Parameters:

config (Dict[str, Any]) –

Dictionary containing configuration key-value pairs to update. Supported keys include:

  • eps: Robustness parameter (non-negative float)

  • solver: Optimization solver (e.g., ‘MOSEK’, ‘SCS’)

  • distance_type: Distance metric (‘KL’ or ‘chi2’)

  • distribution_class: Distribution class

  • posterior_sample_ratio

  • posterior_param_num

  • Other model-specific parameters

Raises:

ValueError – If the configuration contains invalid keys, unsupported values, or negative values for parameters like eps.

Return type:

None

Note

  • Updating some parameters (e.g., solver) may trigger reinitialization of the optimizer.

  • For safety, avoid modifying input_dim or model_type after initialization.
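The validation rules described above can be sketched as a standalone check. This is only an illustration of the documented behavior, not the library's code: validate_config, ALLOWED_KEYS, and ALLOWED_DISTANCES are hypothetical names, and rejecting every key outside the listed ones is an assumption (the documentation also allows other model-specific parameters).

```python
from typing import Any, Dict

# Keys listed in the documentation; the real method may accept more (assumption).
ALLOWED_KEYS = {
    "eps", "solver", "distance_type", "distribution_class",
    "posterior_sample_ratio", "posterior_param_num",
}
ALLOWED_DISTANCES = {"KL", "chi2"}

def validate_config(config: Dict[str, Any]) -> None:
    """Raise ValueError for unknown keys, negative eps, or an unsupported distance."""
    unknown = set(config) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"Invalid configuration keys: {sorted(unknown)}")
    if "eps" in config and config["eps"] < 0:
        raise ValueError("eps must be non-negative")
    if "distance_type" in config and config["distance_type"] not in ALLOWED_DISTANCES:
        raise ValueError(f"Unsupported distance_type: {config['distance_type']!r}")

# Passes silently: every key is known and every value is in range.
validate_config({"eps": 0.5, "distance_type": "chi2", "solver": "MOSEK"})
```

Validating the whole dictionary before applying any change keeps the model from ending up in a partially updated state.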

Example:
>>> model.update({"eps": 0.5, "distance_type": "chi2", "solver": "MOSEK"})

resample(X, y)

Generate resampled data based on posterior parameters.

This method produces new feature and target arrays whose dimensions are determined by the model’s posterior parameters, input dimensionality, and the sample size of the original data.

Parameters:
  • X (numpy.ndarray) – Original feature matrix of shape (sample_size, input_dim).

  • y (numpy.ndarray) – Original target values of shape (sample_size,) or (sample_size, n_targets).

Returns:

A tuple containing the resampled feature matrix and target array.

  • Resampled X has shape (posterior_param_num, sample_size, input_dim)

  • Resampled y has shape (posterior_param_num, sample_size) for a single target, or (posterior_param_num, sample_size, n_targets) for multiple targets

Return type:

tuple[numpy.ndarray, numpy.ndarray]

Raises:

ValueError

  • If X and y have inconsistent sample sizes (first dimension mismatch).

  • If posterior_param_num is not initialized (e.g., model not yet fitted).
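The shape contract above can be illustrated without the library. In this sketch a plain bootstrap (sampling row indices with replacement) stands in for the posterior-based sampling, which is an assumption made only to show the output shapes; resample_shapes_sketch and its seed argument are hypothetical.

```python
import numpy as np

def resample_shapes_sketch(X, y, posterior_param_num, seed=0):
    """Return arrays with the documented output shapes (bootstrap placeholder)."""
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y have inconsistent sample sizes")
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # One row of n indices per posterior draw: shape (posterior_param_num, n).
    idx = rng.integers(0, n, size=(posterior_param_num, n))
    # Integer-array indexing on the first axis yields
    # (posterior_param_num, n, input_dim) for X and (posterior_param_num, n, ...) for y.
    return X[idx], y[idx]

X_demo = np.zeros((1000, 10))
y_demo = np.zeros(1000)
X_res, y_res = resample_shapes_sketch(X_demo, y_demo, posterior_param_num=5)
print(X_res.shape, y_res.shape)
```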

Example:
>>> # After fitting a BayesianDRO model
>>> X_new, y_new = model.resample(X_train, y_train)
>>> print(X_new.shape)  # (n_params, 1000, 10) if sample_size=1000, input_dim=10
>>> print(y_new.shape)  # (n_params, 1000)

fit(X, y)

Train the Bayesian DRO model by solving the convex optimization problem.

Parameters:
  • X (numpy.ndarray) – Feature matrix of shape (original_sample_size, input_dim).

  • y (numpy.ndarray) – Target values of shape (original_sample_size,) for classification or (original_sample_size, n_targets) for regression.

Returns:

Dictionary containing the trained parameters:

  • theta: Weight vector of shape (input_dim,)

  • b: Intercept scalar (only if fit_intercept=True)

Return type:

Dict[str, Union[List[float], float]]
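A short sketch of how the returned dictionary might be used for prediction. The params values below are made up for illustration, linear_scores is a hypothetical helper, and thresholding at zero to get {-1, +1} labels is an assumption that fits the ‘svm’ and ‘logistic’ model types.

```python
import numpy as np

def linear_scores(params, X):
    """Compute X @ theta + b from a fit()-style dictionary (hypothetical helper)."""
    theta = np.asarray(params["theta"], dtype=float)
    # "b" is only present when fit_intercept=True, so fall back to 0.0.
    b = float(params.get("b", 0.0))
    return X @ theta + b

params = {"theta": [0.5, -1.0], "b": 0.25}   # illustrative values, not real output
X_new = np.array([[2.0, 1.0], [0.0, 1.0]])

scores = linear_scores(params, X_new)        # [0.25, -0.75]
labels = np.where(scores >= 0, 1, -1)        # [1, -1]
```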

Raises:
  • BayesianDROError – If the optimization solver fails to converge.

  • ValueError

    • If X and y have inconsistent sample sizes.

    • If resampled data dimensions are incompatible with posterior_param_num.

Optimization Formulation:
Minimize:

(1/K) * Σ_k t_k + η * eps, where K = posterior_param_num and eps = self.eps

Subject to:

  • Exponential cone constraints: ExpCone(per_loss - t, η, epi_g)

  • Loss bounds: per_loss ≥ model-specific loss (e.g., SVM hinge loss)

  • Ambiguity set: Σ epi_g ≤ sample_size
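To make the objective concrete, the sketch below evaluates it numerically for fixed losses. It assumes the cone and sum constraints together encode the standard KL dual bound t_k ≥ η * log(mean_i exp(loss_ki / η)); this reading has not been checked against the library's source. The function names and the grid search over η are illustrative choices, whereas the model itself solves a conic program.

```python
import numpy as np

def kl_dual_objective(losses, eps, eta):
    """(1/K) * sum_k t_k + eta * eps, with each t_k at its smallest feasible value."""
    losses = np.atleast_2d(np.asarray(losses, dtype=float))  # shape (K, n)
    z = losses / eta
    z_max = z.max(axis=1, keepdims=True)
    # Numerically stable log(mean(exp(z))) per posterior draw k.
    log_mean_exp = z_max[:, 0] + np.log(np.mean(np.exp(z - z_max), axis=1))
    t = eta * log_mean_exp
    return float(t.mean() + eta * eps)

def kl_worst_case(losses, eps, etas=np.logspace(-2, 3, 400)):
    """Approximate the minimum over eta > 0 with a coarse grid search."""
    return min(kl_dual_objective(losses, eps, eta) for eta in etas)

losses = np.array([[0.0, 1.0, 2.0, 3.0]])    # K = 1 draw, n = 4 samples
nominal = kl_worst_case(losses, eps=0.0)     # close to the mean loss 1.5
robust = kl_worst_case(losses, eps=0.1)      # between the mean and the max loss
```

With eps = 0 the value approaches the average loss, and it grows toward the largest loss as eps increases, which matches the description of eps as a robustness parameter.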

Example:
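>>> import numpy as np
>>> from dro.src.linear_model.bayesian_dro import BayesianDRO
>>> model = BayesianDRO(input_dim=10, eps=0.1)  # defaults: model_type='svm', solver='MOSEK'
>>> X_train = np.random.randn(100, 10)
>>> y_train = np.sign(np.random.randn(100))  # binary labels in {-1, +1}
>>> params = model.fit(X_train, y_train)
>>> print(params["theta"])  # e.g., [0.5, -1.2, ..., 0.8]
>>> print(params["b"])      # e.g., 0.3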