Sinkhorn DRO

class dro.src.linear_model.sinkhorn_dro.SinkhornLinearDRO(input_dim, model_type='svm', fit_intercept=True, reg_param=0.001, lambda_param=100.0, output_dim=1, max_iter=1000.0, learning_rate=0.01, k_sample_max=5, device='cpu')

Bases: BaseLinearDRO

Sinkhorn Distributionally Robust Optimization with Linear Models.

Reference: <https://arxiv.org/abs/2109.11926>

Initialize Sinkhorn Distributionally Robust Optimization model.

Parameters:
  • input_dim (int) – Dimension of input feature space (d)

  • model_type (str) –

    Base model architecture. Supported:

    • 'svm': Support Vector Machine (hinge loss)

    • 'logistic': Logistic Regression (cross-entropy loss)

    • 'ols': Ordinary Least Squares (L2 loss)

    • 'lad': Least Absolute Deviation (L1 loss)

  • fit_intercept (bool) – Whether to learn bias term \(b\). Disable for pre-centered data. Defaults to True.

  • reg_param (float) – Entropic regularization strength \(\epsilon\) controlling transport smoothness. Must be > 0. Defaults to 1e-3.

  • lambda_param (float) – Loss scaling factor \(\lambda\) balancing Wasserstein distance and loss. Must be > 0. Defaults to 1e2.

  • output_dim (int) – Dimension of model output. 1 for regression/binary classification. Defaults to 1.

  • max_iter (int) – Maximum number of Sinkhorn iterations. Should be ≥ 100. Defaults to 1e3.

  • learning_rate (float) – Step size for gradient-based optimization. Typical range: [1e-4, 1e-1]. Defaults to 1e-2.

  • k_sample_max (int) – Maximum level for Multilevel Monte Carlo sampling. Higher values improve accuracy but increase computation. Defaults to 5.

  • device (str) – Computation device. Supported: 'cpu' or 'cuda'. Defaults to 'cpu'.

Raises:

ValueError

  • If any parameter violates numerical constraints (ε ≤ 0, λ ≤ 0, etc.)

  • If model_type is not in supported set

Example:
>>> from dro.src.linear_model.sinkhorn_dro import SinkhornLinearDRO
>>> model = SinkhornLinearDRO(
...     input_dim=10,
...     model_type='svm',
...     reg_param=0.01,
...     lambda_param=50.0
... )
>>> print(model.device)  # 'cpu'

Note

  • Setting device='cuda' requires a CUDA-enabled PyTorch installation

  • It is recommended to retain the default k_sample_max=5 to balance accuracy and computational efficiency
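
To make the roles of reg_param (ε), lambda_param (λ) and k_sample_max more concrete, here is a minimal NumPy sketch of an entropic, Sinkhorn-style robust loss in the spirit of the referenced paper. Each training point is perturbed with random noise and the per-sample loss is smoothed with a log-mean-exp at temperature λε. The function name `sinkhorn_objective_estimate`, the Gaussian noise with variance ε, and the use of 2**k_sample_max perturbed copies per point are illustrative assumptions. This is not the library's implementation.

```python
import numpy as np

def sinkhorn_objective_estimate(theta, X, y, reg=1e-3, lam=1e2,
                                k_sample_max=5, seed=0):
    """Monte Carlo estimate of an entropic (Sinkhorn-style) robust hinge loss.

    Assumptions of this sketch (not taken from the library):
      * inputs are perturbed by Gaussian noise with variance `reg`,
      * 2**k_sample_max perturbed copies are drawn per training point,
      * labels arrive as 0/1 and are mapped to -1/+1 for the hinge loss.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y_pm = 2.0 * np.asarray(y, dtype=float) - 1.0
    n, d = X.shape
    m = 2 ** k_sample_max

    # Perturbed copies of every training point: shape (m, n, d).
    noise = rng.standard_normal((m, n, d))
    Z = X[None, :, :] + np.sqrt(reg) * noise

    # Hinge loss of each perturbed copy: shape (m, n).
    losses = np.maximum(0.0, 1.0 - y_pm[None, :] * (Z @ theta))

    # Numerically stable log-mean-exp over the m copies of each point.
    scaled = losses / (lam * reg)
    peak = scaled.max(axis=0)
    log_mean_exp = peak + np.log(np.mean(np.exp(scaled - peak), axis=0))

    # Rescale back to loss units and average over the training points.
    return float(lam * reg * np.mean(log_mean_exp))

rng = np.random.default_rng(1)
X_demo = rng.standard_normal((50, 3))
y_demo = rng.integers(0, 2, size=50)
value_zero = sinkhorn_objective_estimate(np.zeros(3), X_demo, y_demo)
value_rand = sinkhorn_objective_estimate(rng.standard_normal(3), X_demo, y_demo)
```

In this sketch, the smoothed value is never below the plain average of the perturbed losses (Jensen's inequality), and a smaller product λε puts more weight on the worst perturbations of each point.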

update(config)

Update hyperparameters for Sinkhorn optimization.

Parameters:

config (dict[str, Any]) –

Dictionary containing parameter updates. Valid keys:

  • 'reg': Entropic regularization strength (ε > 0)

  • 'lambda': Loss scaling factor (λ > 0)

  • 'k_sample_max': Maximum MLMC sampling level (integer ≥ 1)

Raises:

ValueError – If any parameter value violates type or range constraints

Return type:

None

Example:
>>> model.update({
...     'reg': 0.01,
...     'lambda': 50.0,
...     'k_sample_max': 3
... })  # Explicit type conversion handled internally

Note

The computation device is not a valid key here. To run on a GPU, pass device='cuda' when initializing the model.
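
The documented constraints on the config keys can be checked before calling update(). The helper below is a hypothetical sketch, not part of the library: the name `validate_sinkhorn_config` and the choice to reject unknown keys are assumptions made for illustration.

```python
import math
from typing import Any, Dict

def validate_sinkhorn_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical pre-check mirroring the documented key constraints."""
    allowed = {"reg", "lambda", "k_sample_max"}
    unknown = sorted(set(config) - allowed)
    if unknown:
        # Assumption of this sketch: unknown keys are treated as errors.
        raise ValueError(f"Unknown config keys: {unknown}")

    clean: Dict[str, Any] = {}
    for key in ("reg", "lambda"):
        if key in config:
            try:
                value = float(config[key])
            except (TypeError, ValueError):
                raise ValueError(f"'{key}' must be a number") from None
            # Rejects zero, negatives, NaN and infinities.
            if not (math.isfinite(value) and value > 0):
                raise ValueError(f"'{key}' must be a finite value > 0")
            clean[key] = value

    if "k_sample_max" in config:
        k = config["k_sample_max"]
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(k, bool) or not isinstance(k, int) or k < 1:
            raise ValueError("'k_sample_max' must be an integer >= 1")
        clean["k_sample_max"] = k
    return clean

checked = validate_sinkhorn_config({"reg": "0.01", "lambda": 50, "k_sample_max": 3})
```

The returned dictionary could then be passed on, for example as `model.update(checked)`.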

predict(X)

Generate predictions using the optimized Sinkhorn DRO model.

Parameters:

X (numpy.ndarray) – Input feature matrix. Shape: (n_samples, n_features), where n_features must match the model's input dimension. Supported dtypes: float32, float64.

Returns:

Model predictions. Shape depends on task:

  • Regression: (n_samples, 1)

  • Classification: (n_samples,) with 0/1 labels

Return type:

numpy.ndarray

Raises:

ValueError

  • If input dimension mismatch occurs (n_features != model_dim)

  • If input contains NaN/Inf values

Example:
>>> X_test = np.random.randn(10, 5).astype(np.float32)
>>> preds = model.predict(X_test)  # Shape: (10, 1) for regression

Note

  • Input data is automatically converted to PyTorch tensors

  • For large datasets (>1M samples), use batch prediction
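
Batch prediction can be done with a small wrapper around any predict-style callable. The helper `predict_in_batches` and its `batch_size` default are hypothetical and not part of the library. The stand-in `linear_predict` function below only makes the sketch runnable on its own.

```python
import numpy as np

def predict_in_batches(predict_fn, X, batch_size=100_000):
    """Apply predict_fn to row chunks of X and concatenate the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    n_samples = X.shape[0]
    if n_samples == 0:
        return predict_fn(X)
    chunks = [
        predict_fn(X[start:start + batch_size])
        for start in range(0, n_samples, batch_size)
    ]
    return np.concatenate(chunks, axis=0)

# Stand-in predictor so the example runs without the library.
weights = np.array([0.5, -1.0, 2.0])

def linear_predict(X):
    return (X @ weights).reshape(-1, 1)

X_big = np.random.default_rng(0).standard_normal((1050, 3))
batched = predict_in_batches(linear_predict, X_big, batch_size=256)
full = linear_predict(X_big)
```

With a fitted model, the same wrapper would be called as `predict_in_batches(model.predict, X_big)`.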

score(X, y)

Evaluate model performance on given data.

Parameters:
  • X (numpy.ndarray) – Input feature matrix. Shape: (n_samples, n_features). Must match the model's expected input dimension.

  • y (numpy.ndarray) –

    Target values. Shape requirements:

    • Regression: (n_samples,) or (n_samples, 1)

    • Classification: (n_samples,) with binary labels (0/1)

Returns:

Performance metrics:

  • Regression: Mean Squared Error (MSE) as float

  • Classification: Tuple of (accuracy, macro-F1 score), each in the [0, 1] range

Return type:

Union[float, Tuple[float, float]]

Example:
>>> # Regression
>>> X_reg, y_reg = np.random.randn(100, 5), np.random.randn(100)
>>> model = SinkhornLinearDRO(input_dim=5, model_type='ols')
>>> _ = model.fit(X_reg, y_reg)
>>> mse = model.score(X_reg, y_reg)  # float (MSE)
>>> # Classification
>>> X_clf, y_clf = np.random.randn(100, 5), np.random.randint(0, 2, 100)
>>> model = SinkhornLinearDRO(input_dim=5, model_type='svm')
>>> _ = model.fit(X_clf, y_clf)
>>> acc, f1 = model.score(X_clf, y_clf)  # tuple (accuracy, macro-F1)

Note

  • For regression tasks, outputs are not thresholded

  • Computation uses all available samples (no mini-batching)
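
For reference, the metrics named above can be computed directly with NumPy. This is a sketch of the standard definitions for binary 0/1 labels, not the library's own scoring code. The function names are hypothetical, and the macro-F1 here always averages over both classes, which may differ from other implementations when a class is absent.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """MSE; accepts (n_samples,) or (n_samples, 1) arrays."""
    y_true = np.asarray(y_true, dtype=float).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=float).reshape(-1)
    return float(np.mean((y_true - y_pred) ** 2))

def accuracy_and_macro_f1(y_true, y_pred):
    """Accuracy and macro-averaged F1 for binary 0/1 labels."""
    y_true = np.asarray(y_true).reshape(-1)
    y_pred = np.asarray(y_pred).reshape(-1)
    accuracy = float(np.mean(y_true == y_pred))

    f1_per_class = []
    for label in (0, 1):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_per_class.append(2 * tp / denom if denom > 0 else 0.0)
    return accuracy, float(np.mean(f1_per_class))

mse_value = mean_squared_error([1.0, 2.0, 3.0], [[1.0], [2.0], [5.0]])
acc_value, f1_value = accuracy_and_macro_f1([0, 1, 1, 0], [0, 1, 0, 0])
```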

fit(X, y, optimization_type='SG')

Train the Sinkhorn DRO model with specified optimization strategy.

Parameters:
  • X (numpy.ndarray) – Training feature matrix. Shape: (n_samples, n_features). Must match the model's input dimension (n_features == input_dim).

  • y (numpy.ndarray) –

    Target values. Shape requirements:

    • Regression: (n_samples,) continuous values

    • Classification: (n_samples,) binary labels (0/1)

  • optimization_type (str) –

    Optimization algorithm selection (defaults to 'SG'):

    • 'SG': Standard Stochastic Gradient (baseline)

    • 'MLMC': Multilevel Monte Carlo acceleration

    • 'RTMLMC': Real-Time MLMC with adaptive sampling

Returns:

Learned parameters containing:

  • 'theta': Model coefficients (n_features,)

  • 'bias': Intercept term (if fit_intercept=True)

Return type:

dict[str, numpy.ndarray]

Raises:

ValueError

  • If X/y have mismatched sample counts (n_samples)

  • If optimization_type is not in {'SG', 'MLMC', 'RTMLMC'}

  • If input_dim ≠ X.shape[1]

Example:
>>> X_train, y_train = np.random.randn(100, 5), np.random.randint(0, 2, 100)
>>> model = SinkhornLinearDRO(input_dim=5, model_type='svm')
>>> params = model.fit(X_train, y_train, 'MLMC')
>>> print(params['theta'])  # e.g. [-0.12, 1.45, ...]
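
The conditions listed under Raises can also be checked up front, before training starts. The helper `check_fit_inputs` below is a hypothetical sketch that mirrors those documented conditions. It is not the library's internal validation.

```python
import numpy as np

VALID_OPTIMIZATION_TYPES = {"SG", "MLMC", "RTMLMC"}

def check_fit_inputs(X, y, input_dim, optimization_type="SG"):
    """Raise ValueError if the inputs violate the documented fit() constraints."""
    X = np.asarray(X)
    y = np.asarray(y)
    if X.ndim != 2:
        raise ValueError("X must have shape (n_samples, n_features)")
    if X.shape[1] != input_dim:
        raise ValueError(
            f"X has {X.shape[1]} features but input_dim is {input_dim}"
        )
    if X.shape[0] != y.shape[0]:
        raise ValueError(
            f"X has {X.shape[0]} samples but y has {y.shape[0]}"
        )
    if optimization_type not in VALID_OPTIMIZATION_TYPES:
        raise ValueError(
            f"optimization_type must be one of {sorted(VALID_OPTIMIZATION_TYPES)}"
        )
    return X, y

X_ok = np.zeros((20, 5))
y_ok = np.zeros(20)
X_checked, y_checked = check_fit_inputs(X_ok, y_ok, input_dim=5,
                                        optimization_type="MLMC")
```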