NN DRO Base Class¶
- class dro.src.neural_model.base_nn.Linear(input_dim, num_classes)¶
Bases:
Module
Fully-connected neural layer for classification/regression.
Implements the linear transformation:
\[Y = XW^\top + b\]
where:
\(X \in \mathbb{R}^{N \times d}\): input features
\(W \in \mathbb{R}^{K \times d}\): weight matrix
\(b \in \mathbb{R}^K\): bias term
\(Y \in \mathbb{R}^{N \times K}\): output logits
- Parameters:
input_dim (int) – Input feature dimension \(d \geq 1\)
num_classes (int) – Output dimension \(K \geq 1\)
- Example::
>>> model = Linear(input_dim=5, num_classes=3)
>>> x = torch.randn(32, 5)  # batch_size=32
>>> y = model(x)
>>> y.shape
torch.Size([32, 3])
Initialize internal Module state, shared by both nn.Module and ScriptModule.
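Note: For reference, a minimal sketch of what this class plausibly reduces to, assuming it delegates to torch.nn.Linear (the actual initialization scheme may differ):

import torch
import torch.nn as nn

class Linear(nn.Module):
    """Sketch of the fully-connected layer: Y = X W^T + b."""
    def __init__(self, input_dim, num_classes):
        super().__init__()
        # nn.Linear stores W in R^{K x d} and b in R^K, matching the formula above
        self.linear = nn.Linear(input_dim, num_classes)

    def forward(self, X):
        # X: (N, d) -> logits: (N, K)
        return self.linear(X)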
- forward(X)¶
Forward pass of the linear layer.
- Parameters:
X (torch.Tensor) – Input tensor of shape \((N, d)\) where \(N\) = batch size
- Returns:
Output logits of shape \((N, K)\)
- Return type:
torch.Tensor
- class dro.src.neural_model.base_nn.MLP(input_dim, num_classes, hidden_units=16, activation=ReLU(), dropout_rate=0.1)¶
Bases:
Module
Multi-Layer Perceptron with dropout regularization.
Implements the forward computation:
\[\begin{split}h_1 &= \sigma(W_1 x + b_1) \\ h_2 &= \sigma(W_2 h_1 + b_2) \\ y &= W_o h_2 + b_o\end{split}\]
where:
\(\sigma\): activation function (default: ReLU)
\(p \in [0,1)\): dropout probability
\(x \in \mathbb{R}^d\): input features
\(y \in \mathbb{R}^K\): output logits
- Parameters:
input_dim (int) – Input feature dimension \(d \geq 1\)
num_classes (int) – Output dimension \(K \geq 1\)
hidden_units (int) – Hidden layer dimension \(h \geq 1\), defaults to 16
activation (torch.nn.Module) – Nonlinear activation module, defaults to
torch.nn.ReLU
dropout_rate (float) – Dropout probability \(p \in [0,1)\), defaults to 0.1
- Example::
>>> model = MLP(
...     input_dim=64,
...     num_classes=10,
...     hidden_units=32,
...     activation=nn.GELU(),
...     dropout_rate=0.2
... )
>>> x = torch.randn(128, 64)  # batch_size=128
>>> y = model(x)
>>> y.shape
torch.Size([128, 10])
Initialize internal Module state, shared by both nn.Module and ScriptModule.
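Note: A minimal sketch consistent with the two-hidden-layer formula above; the exact placement of dropout relative to the activations is an assumption:

import torch
import torch.nn as nn

class MLP(nn.Module):
    """Sketch of the two-hidden-layer MLP with dropout."""
    def __init__(self, input_dim, num_classes, hidden_units=16,
                 activation=nn.ReLU(), dropout_rate=0.1):
        super().__init__()
        # h1 = sigma(W1 x + b1), h2 = sigma(W2 h1 + b2), y = Wo h2 + bo
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_units), activation,
            nn.Dropout(dropout_rate),
            nn.Linear(hidden_units, hidden_units), activation,
            nn.Dropout(dropout_rate),
            nn.Linear(hidden_units, num_classes),
        )

    def forward(self, X):
        return self.net(X)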
- forward(X)¶
Forward propagation with dropout regularization.
- Parameters:
X (torch.Tensor) – Input tensor of shape \((N, d)\) where \(N\) = batch size
- Returns:
Output logits of shape \((N, K)\)
- Return type:
torch.Tensor
- class dro.src.neural_model.base_nn.BaseNNDRO(input_dim, num_classes, task_type='classification', model_type='mlp', device=device(type='cpu'))¶
Bases:
object
Neural Network-based Distributionally Robust Optimization (DRO) framework.
Implements the core DRO optimization objective:
\[\min_{\theta} \sup_{Q \in \mathcal{B}_\epsilon(P)} \mathbb{E}_Q[\ell(f_\theta(X), y)]\]
where:
\(f_\theta\): Parametric neural network
\(\mathcal{B}_\epsilon(P)\): Wasserstein ambiguity set
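The inner supremum over \(\mathcal{B}_\epsilon(P)\) is intractable in general and is approximated in practice. Below is a conceptual sketch of one common surrogate (projected gradient ascent on input perturbations within an \(\epsilon\)-ball); it illustrates the objective only and is not necessarily the inner solver this class uses:

import torch
import torch.nn.functional as F

def worst_case_loss(model, X, y, epsilon=0.1, step=0.02, n_steps=5):
    # Approximate sup_Q E_Q[loss] by adversarially perturbing the inputs
    # within an L2 ball of radius epsilon (hypothetical inner solver).
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(n_steps):
        loss = F.cross_entropy(model(X + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step * grad / (grad.norm() + 1e-12)  # ascent step
            if delta.norm() > epsilon:                    # project back onto the ball
                delta *= epsilon / delta.norm()
    # The outer minimization trains theta against the perturbed batch
    return F.cross_entropy(model(X + delta), y)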
Initialize neural DRO framework.
- Parameters:
input_dim (int) – Input feature dimension \(d \geq 1\)
num_classes (int) – Output dimension. Classification: \(K \geq 2\) (number of classes); Regression: automatically set to 1
task_type (str) – Learning task type. Supported:
'classification': Cross-entropy loss
'regression': MSE loss
model_type (str) – Neural architecture type. Supported:
'mlp': Multi-Layer Perceptron (default)
'linear'
'resnet'
'alexnet'
device (torch.device) – Target computation device, defaults to CPU
- Raises:
If input_dim < 1
If classification task with num_classes < 2
If unsupported model_type
Example (Classification):
>>> model = BaseNNDRO(
...     input_dim=64,
...     num_classes=10,
...     task_type="classification",
...     model_type="mlp",
...     device=torch.device("cuda")
... )
Example (Regression):
>>> model = BaseNNDRO(
...     input_dim=8,
...     num_classes=5,  # Auto-override to 1
...     task_type="regression"
... )
- update(input_dim, num_classes, model, task_type='classification', device=device(type='cpu'))¶
Replace the internal network with a user-supplied model.
- Parameters:
input_dim (int) – Input feature dimension \(d \geq 1\)
num_classes (int) – Output dimension. Classification: \(K \geq 2\) (number of classes); Regression: automatically set to 1
model (torch.nn.Module) – User’s own model
task_type (str) – Learning task type. Supported:
'classification': Cross-entropy loss
'regression': MSE loss
device (torch.device) – Target computation device, defaults to CPU
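A usage sketch for plugging in a custom architecture (the Sequential network below is illustrative, not part of the library):

>>> import torch.nn as nn
>>> custom = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
>>> model = BaseNNDRO(input_dim=20, num_classes=3)
>>> model.update(input_dim=20, num_classes=3, model=custom,
...              task_type="classification")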
- fit(X, y, train_ratio=0.8, lr=0.001, batch_size=32, epochs=100, verbose=True)¶
Train neural DRO model with Wasserstein robust optimization.
- Parameters:
X (Union[numpy.ndarray, torch.Tensor]) –
Input feature matrix/tensor. Shape:
\((N, d)\) where \(N\) = total samples
Supports both numpy arrays and torch tensors
y (Union[numpy.ndarray, torch.Tensor]) –
Target labels. Shape:
Classification: \((N,)\) (class indices; note that labels here take values in {0, 1})
Regression: \((N,)\) or \((N, 1)\)
train_ratio (float) – Train-validation split ratio \(\in (0,1)\), defaults to 0.8
lr (float) – Learning rate \(\eta > 0\), defaults to 1e-3
batch_size (int) – Mini-batch size \(B \geq 1\), defaults to 32
epochs (int) – Maximum training epochs \(T \geq 1\), defaults to 100
verbose (bool) – Whether to print epoch-wise metrics, defaults to True
- Returns:
Dictionary containing:
'acc' and 'f1': for classification
'mse': for regression
- Return type:
dict
- Raises:
If input dimensions mismatch
If train_ratio ∉ (0,1)
If batch_size > dataset size
If learning rate ≤ 0
Example (Classification):
>>> X, y = np.random.randn(1000, 64), np.random.randint(0, 2, 1000)
>>> model = BaseNNDRO(input_dim=64, num_classes=2)
>>> metrics = model.fit(X, y, lr=5e-4, epochs=50)
>>> plt.plot(metrics['val_accuracy'])
Example (Regression):
>>> X = torch.randn(500, 8)
>>> y = X @ torch.randn(8, 1) + 0.1 * torch.randn(500, 1)
>>> model = BaseNNDRO(input_dim=8, task_type='regression')
>>> model.fit(X, y.squeeze(), batch_size=64)
- score(X, y)¶
Calculate classification accuracy.
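A usage sketch, assuming score returns a scalar accuracy on held-out data (X_test and y_test are hypothetical arrays shaped like the inputs to fit):

>>> acc = model.score(X_test, y_test)
>>> print(f"Accuracy: {acc:.3f}")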