f-Divergence DRO

[7]:
import numpy as np

# Prepare Data
from dro.src.data.dataloader_regression import regression_basic
from dro.src.data.dataloader_classification import classification_basic
from dro.src.data.draw_utils import draw_classification

X, y = classification_basic(d = 2, num_samples = 100, radius = 2, visualize = True)
../../_images/api_notebooks_f_dro_tutorial_1_0.png

Standard f-divergence DRO

We include the chi2, CVaR, KL, and TV distances, which correspond to the standard definitions of (generalized) f-divergences.

The following steps include model fitting and illustrations of the worst-case distributions.
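For reference, the first three divergences can be written through their generator functions f; a minimal numpy sketch (not the library's internals) computing D_f(P||Q) = sum_i q_i * f(p_i / q_i) for discrete distributions. CVaR is the exception: its ambiguity set comes from a bound on the likelihood ratio rather than a smooth generator.

```python
import numpy as np

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_i q_i * f(p_i / q_i) for discrete P, Q."""
    t = p / q
    return np.sum(q * f(t))

chi2 = lambda t: (t - 1) ** 2          # chi-squared divergence
kl   = lambda t: t * np.log(t)         # Kullback-Leibler divergence
tv   = lambda t: 0.5 * np.abs(t - 1)   # total variation distance

p = np.array([0.5, 0.3, 0.2])
q = np.array([1/3, 1/3, 1/3])          # uniform reference distribution
for name, f in [('chi2', chi2), ('kl', kl), ('tv', tv)]:
    print(name, round(f_divergence(p, q, f), 4))
```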

[3]:
from dro.src.linear_model.chi2_dro import Chi2DRO
from dro.src.linear_model.cvar_dro import CVaRDRO
from dro.src.linear_model.tv_dro import TVDRO
from dro.src.linear_model.kl_dro import KLDRO


clf_model_chi2 = Chi2DRO(input_dim=2, model_type = 'logistic')
clf_model_cvar = CVaRDRO(input_dim=2, model_type = 'logistic')
clf_model_kl = KLDRO(input_dim = 2, model_type = 'logistic')
clf_model_tv = TVDRO(input_dim = 2, model_type = 'logistic')
[4]:
## model fitting
clf_model_chi2.update({'eps': 1})
print(clf_model_chi2.fit(X, y))
clf_model_kl.update({'eps': 1})
print(clf_model_kl.fit(X, y))
clf_model_tv.update({'eps': 0.3})
print(clf_model_tv.fit(X, y))
clf_model_cvar.update({'alpha':0.8})
print(clf_model_cvar.fit(X, y))
{'theta': [-0.6347179732189498, 1.9680342006517346], 'b': array(-0.51071442)}
{'theta': [-0.21802718101618923, 0.7358683074228745], 'dual': 0.16652943997935335, 'b': array(-0.233648)}
{'theta': [-1.4147453349806304e-07, 2.8321768527388613e-07], 'threshold': array(0.69314691), 'b': array(-4.42798951e-08)}
{'theta': [-1.3466023842845223, 3.823698339513048], 'threshold': array(0.00019594), 'b': array(-0.7506087)}
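The returned parameters can be used for prediction directly. A sketch assuming a logistic model with the `theta` and `b` values from the chi2 fit above (the decision rule is just the sign of the linear score):

```python
import numpy as np

# parameters as returned by clf_model_chi2.fit(X, y) above
theta = np.array([-0.6347179732189498, 1.9680342006517346])
b = -0.51071442

def predict_proba(X, theta, b):
    """Sigmoid of the linear score X @ theta + b."""
    return 1.0 / (1.0 + np.exp(-(X @ theta + b)))

X_new = np.array([[0.0, 1.0], [1.0, -1.0]])
probs = predict_proba(X_new, theta, b)
labels = (probs >= 0.5).astype(int)   # threshold at 1/2
```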
[5]:
# worst case distribution for each method
worst_chi2 = clf_model_chi2.worst_distribution(X, y)
draw_classification(worst_chi2['sample_pts'][0], worst_chi2['sample_pts'][1], weight = worst_chi2['weight'], title = 'worst-chi2')

worst_kl = clf_model_kl.worst_distribution(X, y)
draw_classification(worst_kl['sample_pts'][0], worst_kl['sample_pts'][1], weight = worst_kl['weight'], title = 'worst-kl')

worst_tv = clf_model_tv.worst_distribution(X, y)
draw_classification(worst_tv['sample_pts'][0], worst_tv['sample_pts'][1], weight = worst_tv['weight'], title = 'worst-tv')

worst_cvar = clf_model_cvar.worst_distribution(X, y)
draw_classification(worst_cvar['sample_pts'][0], worst_cvar['sample_pts'][1], weight = worst_cvar['weight'], title = 'worst-cvar')
../../_images/api_notebooks_f_dro_tutorial_5_0.png
../../_images/api_notebooks_f_dro_tutorial_5_1.png
../../_images/api_notebooks_f_dro_tutorial_5_2.png
../../_images/api_notebooks_f_dro_tutorial_5_3.png
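Each worst-case distribution is a reweighting of the empirical sample. A useful sanity check (a self-contained sketch using synthetic weights, not the `'weight'` arrays returned above) is the effective sample size, which quantifies how much the adversary concentrates mass on hard points:

```python
import numpy as np

def effective_sample_size(w):
    """ESS = (sum w)^2 / sum w^2; equals n for uniform weights,
    smaller when mass concentrates on a few points."""
    w = np.asarray(w, dtype=float)
    return np.sum(w) ** 2 / np.sum(w ** 2)

# uniform weights over n points vs. CVaR-style mass on the worst 20%
n = 100
uniform = np.full(n, 1 / n)
cvar_like = np.zeros(n)
cvar_like[:20] = 1 / 20   # all mass on the 20 highest-loss points

print(effective_sample_size(uniform))
print(effective_sample_size(cvar_like))
```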

Data-driven evaluation

[2]:
import numpy as np
from sklearn.datasets import make_regression
from dro.src.linear_model.chi2_dro import Chi2DRO


# Data generation
sample_num = 200
X, y = make_regression(n_samples = sample_num, n_features=10, noise = 5, random_state=42)
dro_model = Chi2DRO(input_dim = 10, model_type = 'ols', fit_intercept=False)
dro_model.update({'eps': 0.5})
dro_model.fit(X, y)
dro_model.evaluate(X, y)
[2]:
26.40481855042916
[3]:
errors = (dro_model.predict(X) - y) ** 2
np.mean(errors)
[3]:
24.05810847832465

These f-divergence DRO models are suited to general distribution shifts with likelihood misspecification, but they can be overly conservative (too worst-case) in practice.
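The gap between the two numbers above is where the conservatism comes from: the adversary up-weights high-loss samples within a chi-squared budget, so the robust risk always upper-bounds the empirical mean. A small closed-form sketch (a standard derivation, not the library's solver):

```python
import numpy as np

def chi2_worst_case_mean(losses, rho):
    """Closed-form sup of E_q[loss] over chi2(q || uniform) <= rho,
    ignoring the q >= 0 constraint (valid for small rho):
    mean + sqrt(rho * var)."""
    losses = np.asarray(losses, dtype=float)
    return losses.mean() + np.sqrt(rho * losses.var())

losses = np.array([1.0, 2.0, 3.0, 4.0])
print(losses.mean())                      # empirical risk: 2.5
print(chi2_worst_case_mean(losses, 0.1))  # robust risk is strictly larger
```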

Partial Distribution Shift

Some specialized DRO models handle particular kinds of worst-case distribution shift: covariate shift (marginal_dro) and conditional shift (conditional_dro). Both are built on CVaR-DRO.

[8]:
from dro.src.linear_model.conditional_dro import *
from dro.src.linear_model.marginal_dro import *
from dro.src.data.dataloader_classification import classification_basic

X, y = classification_basic(d = 2, num_samples = 100, radius = 2, visualize = True)
../../_images/api_notebooks_f_dro_tutorial_11_0.png
[9]:
clf_model_margin = MarginalCVaRDRO(input_dim = 2, model_type = 'svm')

clf_model_cond = ConditionalCVaRDRO(input_dim = 2, model_type = 'logistic')

clf_model_margin.update({'alpha': 0.8})
clf_model_cond.update({'alpha': 0.4})

print('marginal', clf_model_margin.fit(X, y)['theta'])
print('conditional', clf_model_cond.fit(X, y))
marginal [-1.1801432134409149, 2.363217574296201]
conditional {'theta': [-1.2843606942632635, 3.5303014542999374], 'b': array(-0.64246721)}
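Covariate shift, the setting MarginalCVaRDRO guards against, can be simulated directly by reweighting the sample with a function of X alone, which moves the marginal of X while leaving P(Y|X) unchanged. A self-contained numpy sketch (synthetic data, independent of the models above):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = (X[:, 1] > 0).astype(int)          # P(y | x) is deterministic here

# covariate shift: up-weight points in the right half-plane (depends on X only)
w = np.where(X[:, 0] > 0, 2.0, 1.0)
w = w / w.sum()

# the marginal of X moves under the reweighted distribution...
print(np.average(X[:, 0], weights=w) > X[:, 0].mean())
# ...but the conditional P(y=1 | x_2 > 0) is still exactly 1
mask = X[:, 1] > 0
print(np.average(y[mask], weights=w[mask]))
```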