FLamby integration in Fed-BioMed¶
This notebook showcases some examples of the integration between FLamby and Fed-BioMed.
For a thorough understanding, please visit the Tutorials section of our documentation.
This tutorial assumes that you know and understand the basics of Fed-BioMed, that you have already set up the network component, and that you are familiar with the flow of adding data through the node CLI. For an introduction to Fed-BioMed, please follow our PyTorch MNIST tutorial.
Downloading FLamby datasets¶
Before using FLamby, you need to download the FLamby datasets that you plan to use. For licensing reasons, these are not included directly in the FLamby installation.
To download the fed_ixi dataset in ${FEDBIOMED_DIR}/data, follow the FLamby download instructions. In a nutshell:
- execute on the researcher
# use the environment where Fed-BioMed researcher is installed
pip install nibabel
- then execute on each node (where ${FEDBIOMED_DIR} is the base directory of Fed-BioMed):
# use the environment where Fed-BioMed node is installed
pip install nibabel
python $(find $CONDA_PREFIX -path '*/fed_ixi/dataset_creation_scripts/download.py') -o ${FEDBIOMED_DIR}/data
To download the fed_heart_disease dataset in ${FEDBIOMED_DIR}/data, follow the FLamby download instructions. In a nutshell:
- execute on the researcher
# use the environment where Fed-BioMed researcher is installed
pip install wget
- then execute on each node (where ${FEDBIOMED_DIR} is the base directory of Fed-BioMed):
# use the environment where Fed-BioMed node is installed
pip install wget
python $(find $CONDA_PREFIX -path '*/fed_heart_disease/dataset_creation_scripts/download.py') --output-folder ${FEDBIOMED_DIR}/data
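The `find` invocations above locate FLamby's `download.py` inside the active conda environment. If you would rather locate the script from Python, a sketch along the following lines performs the same kind of search. The helper `find_download_script` is hypothetical (it is not part of FLamby or Fed-BioMed), and the demonstration builds a throwaway directory tree instead of touching a real environment.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_download_script(prefix: Path, dataset: str) -> Optional[Path]:
    """Return the first <dataset>/dataset_creation_scripts/download.py under prefix.

    Hypothetical helper for illustration; mirrors the `find -path` pattern.
    """
    wanted = (dataset, "dataset_creation_scripts", "download.py")
    for candidate in sorted(prefix.rglob("download.py")):
        # Compare the last three path components with the expected layout.
        if candidate.parts[-3:] == wanted:
            return candidate
    return None


# Demonstration on a temporary tree that imitates an installed FLamby package.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    script = (root / "lib" / "flamby" / "datasets" / "fed_ixi"
              / "dataset_creation_scripts" / "download.py")
    script.parent.mkdir(parents=True)
    script.touch()

    found_ok = find_download_script(root, "fed_ixi") == script
    missing = find_download_script(root, "fed_heart_disease")
```

In a real setup you would pass `Path(os.environ["CONDA_PREFIX"])` as `prefix`, assuming that variable is set in your shell.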
Install dependencies¶
If you haven't done so already, install the additional dependencies required by the FLamby datasets/features that you intend to use.
You can check which dependencies each dataset needs directly in FLamby's setup.py file. In our case we'll be using the federated IXI and federated heart disease datasets, hence we'll need wget, monai and nibabel.
! pip install wget nibabel # monai comes already packaged within fed-biomed
Running a FLamby experiment in a federated setting with Fed-BioMed¶
Before running a federated experiment, we need to add a FLamby dataset to a node. From a terminal, cd to the Fed-BioMed root installation directory and run
fedbiomed node dataset add
Then follow these instructions:
- select option 6 (flamby) when prompted about the data type
- type any name for the database (suggested flamby-ixi), press Enter to continue
- type flixi when prompted for tags, press Enter to continue
- type any description (suggested flamby-ixi), press Enter to continue
- select option 3 (fed_ixi) when prompted for the FLamby dataset to be configured
- type a number in the given range, press Enter to continue
- type any description for the data loading plan (suggested flamby-ixi-dlp), press Enter to continue
Optionally, repeat the instructions above for the fed_heart_disease dataset, using flheart for tags.
Finally, start the node with
fedbiomed node start
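The tags entered during dataset registration are what the researcher later passes to `Experiment(tags=...)` to select datasets. As a mental model only, and assuming that a dataset is selected when it carries every requested tag, the matching can be sketched as follows. The function `matching_datasets` and the registry contents are hypothetical and do not reproduce Fed-BioMed's internal code.

```python
from typing import Dict, List


def matching_datasets(registry: Dict[str, List[str]], requested: List[str]) -> List[str]:
    """Return names of datasets whose tags include every requested tag.

    Assumption for illustration: selection requires ALL requested tags.
    """
    return sorted(
        name for name, tags in registry.items()
        if all(tag in tags for tag in requested)
    )


# Hypothetical registry, using the names and tags suggested in this tutorial.
registry = {
    "flamby-ixi": ["flixi"],
    "flamby-heart": ["flheart"],
}

ixi_matches = matching_datasets(registry, ["flixi"])
no_matches = matching_datasets(registry, ["flixi", "flheart"])
print(ixi_matches, no_matches)
```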
Basic example: Fed-IXI¶
The first example will use the model, optimizer and loss function provided by FLamby for the IXI dataset.
The instructions for using FLamby are:
- define a TorchTrainingPlan
- in the training_data function, instantiate a FlambyDataset
- make sure to include the necessary dependencies in the init_dependencies function
from fedbiomed.common.training_plans import TorchTrainingPlan
from flamby.datasets.fed_ixi import Baseline, BaselineLoss, Optimizer
from fedbiomed.common.data.flamby_dataset import FlambyDataset
from fedbiomed.common.data import DataManager


class MyTrainingPlan(TorchTrainingPlan):
    def init_model(self, model_args):
        return Baseline()

    def init_optimizer(self, optimizer_args):
        return Optimizer(self.model().parameters(), lr=optimizer_args["lr"])

    def init_dependencies(self):
        return ["from flamby.datasets.fed_ixi import Baseline, BaselineLoss, Optimizer",
                "from fedbiomed.common.data.flamby_dataset import FlambyDataset",
                "from fedbiomed.common.data import DataManager"]

    def training_step(self, data, target):
        output = self.model().forward(data)
        return BaselineLoss().forward(output, target)

    def training_data(self):
        dataset = FlambyDataset()
        loader_arguments = {'shuffle': True}
        return DataManager(dataset, **loader_arguments)
model_args = {}

training_args = {
    'loader_args': {'batch_size': 8},
    'optimizer_args': {
        'lr': 1e-3
    },
    'epochs': 1,
    'dry_run': False,
    'batch_maxnum': 2  # Fast pass for development: only use (batch_maxnum * batch_size) samples
}
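To make the effect of `batch_maxnum` concrete, the sketch below computes an upper bound on the number of samples a node processes per epoch with the settings above. The helper `samples_per_epoch` and the dataset size of 100 are hypothetical; the sketch also assumes that leaving `batch_maxnum` unset (`None`) means no cap, which you should verify against the Fed-BioMed documentation.

```python
from typing import Optional


def samples_per_epoch(dataset_size: int, batch_size: int,
                      batch_maxnum: Optional[int]) -> int:
    """Upper bound on samples seen in one epoch (illustrative helper only)."""
    if batch_maxnum is None:
        # Assumption: no cap is applied when batch_maxnum is unset.
        return dataset_size
    # At most batch_maxnum batches of batch_size samples, never more than the dataset.
    return min(dataset_size, batch_maxnum * batch_size)


# With batch_size=8 and batch_maxnum=2, at most 16 samples are used per epoch.
capped = samples_per_epoch(dataset_size=100, batch_size=8, batch_maxnum=2)
print(capped)
```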
from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags = ['flixi']
rounds = 1

exp = Experiment(tags=tags,
                 model_args=model_args,
                 training_plan_class=MyTrainingPlan,
                 training_args=training_args,
                 round_limit=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)

exp.run_once(increase=True)
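The `FedAverage` aggregator selects federated averaging as the aggregation strategy. As a rough illustration of the general idea (not Fed-BioMed's implementation), the sketch below averages per-node parameters weighted by the number of samples each node trained on. The function `federated_average`, the flat dictionary layout of the parameters, and the sample counts are all assumptions made for the example.

```python
from typing import Dict, List


def federated_average(node_params: List[Dict[str, float]],
                      n_samples: List[int]) -> Dict[str, float]:
    """Sample-weighted average of per-node parameters (illustrative only)."""
    total = sum(n_samples)
    if total <= 0:
        raise ValueError("at least one node must contribute samples")
    weights = [n / total for n in n_samples]
    return {
        name: sum(w * params[name] for w, params in zip(weights, node_params))
        for name in node_params[0]
    }


# Two hypothetical nodes; the first holds three times as many samples.
node_a = {"w": 1.0, "b": 0.0}
node_b = {"w": 3.0, "b": 2.0}
aggregated = federated_average([node_a, node_b], n_samples=[30, 10])
print(aggregated)
```

The node with more samples pulls the aggregated parameters towards its own values, which is the main design choice behind sample-weighted averaging.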
Save trained model to file
exp.training_plan().export_model('./trained_model')
Basic example: Fed-Heart-Disease¶
This example mirrors the Fed-IXI case above, using FLamby's Heart Disease dataset instead.
from fedbiomed.common.training_plans import TorchTrainingPlan
from flamby.datasets.fed_heart_disease import Baseline, BaselineLoss, Optimizer
from fedbiomed.common.data.flamby_dataset import FlambyDataset
from fedbiomed.common.data import DataManager


class FedHeartTrainingPlan(TorchTrainingPlan):
    def init_model(self, model_args):
        return Baseline()

    def init_optimizer(self, optimizer_args):
        return Optimizer(self.model().parameters(), lr=optimizer_args["lr"])

    def init_dependencies(self):
        return ["from flamby.datasets.fed_heart_disease import Baseline, BaselineLoss, Optimizer",
                "from fedbiomed.common.data.flamby_dataset import FlambyDataset",
                "from fedbiomed.common.data import DataManager"]

    def training_step(self, data, target):
        output = self.model().forward(data)
        return BaselineLoss().forward(output, target)

    def training_data(self):
        dataset = FlambyDataset()
        train_kwargs = {'shuffle': True}
        return DataManager(dataset, **train_kwargs)
training_args = {
    'loader_args': {'batch_size': 4},
    'optimizer_args': {
        'lr': 0.001,
    },
    'epochs': 1,
    'dry_run': False,
    'log_interval': 2,
    'batch_maxnum': 8,
    'test_ratio': 0.0,
    'test_on_global_updates': False,
    'test_on_local_updates': False,
}

model_args = {}
from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags = ['flheart']
num_rounds = 1

exp = Experiment(tags=tags,
                 training_plan_class=FedHeartTrainingPlan,
                 training_args=training_args,
                 model_args=model_args,
                 round_limit=num_rounds,
                 aggregator=FedAverage(),
                 )

exp.run_once(increase=True)
Complex example: Fed-IXI with data preprocessing and custom training elements¶
This example demonstrates how to define transformations for data preprocessing and how to provide a customized model, optimizer, and loss function. Incidentally, it also shows how to use model_args and training_args to parametrize the model, optimizer, and training loop.
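The model in this example reads each hyperparameter with `model_args.get(name, default)`, so any key the researcher omits falls back to a default. The sketch below isolates that pattern; the helper `resolve_model_args` and the trimmed set of defaults are illustrative only and are not part of Fed-BioMed.

```python
# A trimmed, illustrative subset of the defaults used by the model in this example.
DEFAULTS = {
    "in_channels": 1,
    "out_classes": 2,
    "dimensions": 2,
    "num_encoding_blocks": 5,
}


def resolve_model_args(model_args: dict) -> dict:
    """Combine user-supplied model_args with defaults (hypothetical helper)."""
    return {key: model_args.get(key, default) for key, default in DEFAULTS.items()}


# Only two keys are overridden; the others keep their default values.
resolved = resolve_model_args({"dimensions": 3, "num_encoding_blocks": 3})
print(resolved)
```

This keeps the training plan usable with an empty `model_args` while still letting the researcher override individual settings from the notebook.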
Definition of preprocessing transforms¶
This is achieved in the training_data function. After instantiating the FlambyDataset, you may use the init_transform function to attach a preprocessing transformation for your data. Note that the transform that you define must be of type torchvision.transforms.Compose or monai.transforms.Compose.
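Conceptually, a `Compose` object applies its transforms in order, feeding the output of each one to the next. The sketch below mimics that behaviour on a plain Python list; `SimpleCompose`, `add_one` and `double` are hypothetical stand-ins and are not the MONAI or torchvision classes.

```python
from typing import Callable, List, Sequence


class SimpleCompose:
    """Apply a sequence of callables in order (illustrative stand-in only)."""

    def __init__(self, transforms: Sequence[Callable]):
        self.transforms = list(transforms)

    def __call__(self, data):
        for transform in self.transforms:
            data = transform(data)
        return data


def add_one(values: List[float]) -> List[float]:
    return [v + 1 for v in values]


def double(values: List[float]) -> List[float]:
    return [v * 2 for v in values]


# The order of the transforms matters: (x + 1) * 2 differs from (x * 2) + 1.
add_then_double = SimpleCompose([add_one, double])([1, 2])
double_then_add = SimpleCompose([double, add_one])([1, 2])
print(add_then_double, double_then_add)
```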
Definition of custom model, optimizer and loss¶
This is achieved just as in any TorchTrainingPlan, through the functions init_model, init_optimizer, and training_step.
from fedbiomed.common.training_plans import TorchTrainingPlan
from torch.optim import AdamW
from torch import nn
import torch.nn.functional as F
from unet import UNet
from monai.transforms import Compose, NormalizeIntensity, Resize
from fedbiomed.common.data.flamby_dataset import FlambyDataset
from fedbiomed.common.data import DataManager  # needed by training_data below


class UNetTrainingPlan(TorchTrainingPlan):
    class MyUNet(nn.Module):
        CHANNELS_DIMENSION = 1

        def __init__(self, model_args):
            super().__init__()
            self.unet = UNet(
                in_channels = model_args.get('in_channels',1),
                out_classes = model_args.get('out_classes',2),
                dimensions = model_args.get('dimensions',2),
                num_encoding_blocks = model_args.get('num_encoding_blocks',5),
                out_channels_first_layer = model_args.get('out_channels_first_layer',64),
                normalization = model_args.get('normalization', None),
                pooling_type = model_args.get('pooling_type', 'max'),
                upsampling_type = model_args.get('upsampling_type','conv'),
                preactivation = model_args.get('preactivation',False),
                residual = model_args.get('residual',False),
                padding = model_args.get('padding',0),
                padding_mode = model_args.get('padding_mode','zeros'),
                activation = model_args.get('activation','ReLU'),
                initial_dilation = model_args.get('initial_dilation',None),
                dropout = model_args.get('dropout',0),
                monte_carlo_dropout = model_args.get('monte_carlo_dropout',0)
            )

        def forward(self, x):
            x = self.unet.forward(x)
            x = F.softmax(x, dim=UNetTrainingPlan.MyUNet.CHANNELS_DIMENSION)
            return x

    def init_model(self, model_args):
        return UNetTrainingPlan.MyUNet(model_args)

    def init_dependencies(self):
        return ["from torch import nn",
                'import torch.nn.functional as F',
                'from torch.optim import AdamW',
                'from unet import UNet',
                'from monai.transforms import Compose, NormalizeIntensity, Resize',
                'from fedbiomed.common.data.flamby_dataset import FlambyDataset',
                'from fedbiomed.common.data import DataManager']

    def init_optimizer(self, optimizer_args):
        return AdamW(self.model().parameters(),
                     lr=optimizer_args["lr"],
                     betas=optimizer_args["betas"],
                     eps=optimizer_args["eps"])

    @staticmethod
    def get_dice_loss(output, target, epsilon=1e-9):
        SPATIAL_DIMENSIONS = 2, 3, 4
        p0 = output
        g0 = target
        p1 = 1 - p0
        g1 = 1 - g0
        tp = (p0 * g0).sum(dim=SPATIAL_DIMENSIONS)
        fp = (p0 * g1).sum(dim=SPATIAL_DIMENSIONS)
        fn = (p1 * g0).sum(dim=SPATIAL_DIMENSIONS)
        num = 2 * tp
        denom = 2 * tp + fp + fn + epsilon
        dice_score = num / denom
        return 1. - dice_score

    def training_step(self, data, target):
        output = self.model().forward(data)
        loss = UNetTrainingPlan.get_dice_loss(output, target)
        avg_loss = loss.mean()
        return avg_loss

    def testing_step(self, data, target):
        prediction = self.model().forward(data)
        loss = UNetTrainingPlan.get_dice_loss(prediction, target)
        avg_loss = loss.mean()  # average per batch
        return avg_loss

    def training_data(self):
        dataset = FlambyDataset()
        transform = Compose([Resize((48,60,48)), NormalizeIntensity()])
        dataset.init_transform(transform)
        train_kwargs = {'shuffle': True}
        return DataManager(dataset, **train_kwargs)
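The `get_dice_loss` method computes a soft Dice loss from true positives, false positives, and false negatives. The following pure-Python sketch reproduces the same arithmetic on a flat list of four voxels, so that the numbers can be checked by hand. The function `dice_loss_flat` and the toy values are illustrative; unlike the method in the training plan, the sketch does not handle batch or channel dimensions.

```python
from typing import Sequence


def dice_loss_flat(pred: Sequence[float], target: Sequence[float],
                   epsilon: float = 1e-9) -> float:
    """Soft Dice loss over a flat sequence of voxels (illustrative only)."""
    tp = sum(p * g for p, g in zip(pred, target))        # true positives
    fp = sum(p * (1 - g) for p, g in zip(pred, target))  # false positives
    fn = sum((1 - p) * g for p, g in zip(pred, target))  # false negatives
    dice_score = (2 * tp) / (2 * tp + fp + fn + epsilon)
    return 1.0 - dice_score


# One voxel is correctly predicted as foreground, one is a false positive,
# one is a false negative: tp = fp = fn = 1, so the Dice score is 2 / 4.
half_overlap = dice_loss_flat([1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0])

# A perfect prediction gives a loss that is zero up to the epsilon term.
perfect = dice_loss_flat([1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0])
print(half_overlap, perfect)
```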
model_args = {
    'in_channels': 1,
    'out_classes': 2,
    'dimensions': 3,
    'num_encoding_blocks': 3,
    'out_channels_first_layer': 8,
    'normalization': 'batch',
    'upsampling_type': 'linear',
    'padding': True,
    'activation': 'PReLU',
}

training_args = {
    'loader_args': {'batch_size': 16},
    'optimizer_args': {
        'lr': 0.001,
        'betas': (0.9, 0.999),
        'eps': 1e-08
    },
    'epochs': 1,
    'dry_run': False,
    'log_interval': 2,
    'test_ratio': 0.0,
    'test_on_global_updates': False,
    'test_on_local_updates': False,
    'batch_maxnum': 2  # Fast pass for development: only use (batch_maxnum * batch_size) samples
}
from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags = ['flixi']
num_rounds = 1

exp = Experiment(tags=tags,
                 model_args=model_args,
                 training_plan_class=UNetTrainingPlan,
                 training_args=training_args,
                 round_limit=num_rounds,
                 aggregator=FedAverage(),
                 )

exp.run_once(increase=True)