FLamby in Fed-BioMed¶
This tutorial demonstrates how to use FLamby datasets in Fed-BioMed. You'll learn:
- How to download FLamby datasets
- How to deploy FLamby datasets for different centers using separate data partitioning
- How to define datasets for FLamby examples in your federated learning experiments
Overview¶
FLamby is a comprehensive benchmark suite for federated learning in healthcare. The datasets are not included directly in the FLamby installation due to licensing and size constraints. Each dataset must be downloaded separately using dedicated download scripts provided by the FLamby library.
This notebook provides a comprehensive guide on how to:
- Discover available FLamby datasets - Find which datasets are available in your FLamby installation
- Download datasets programmatically - Use Python subprocess to execute download scripts
- Deploy downloaded datasets - Configure datasets for use with Fed-BioMed nodes
- Verify successful downloads - Ensure datasets are complete and properly configured
For detailed information about FLamby integration concepts and training plan implementation, please visit the FLamby dataset introduction tutorial.
This hands-on tutorial will focus specifically on deploying the Fed Heart Disease dataset that comes with FLamby, providing you with step-by-step instructions to successfully deploy this dataset for federated learning experiments.
Prerequisites¶
Before starting, ensure you have:
- FLamby installed: `pip install git+https://github.com/owkin/FLamby@main`
- wget dependency: `pip install wget`
- Fed-BioMed installed: Make sure your Fed-BioMed environment is properly configured
- Sufficient disk space: FLamby datasets can be several GB in size
- Internet connection: Required for downloading datasets from external sources
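To quickly confirm that these prerequisites are met, you can check that the required packages are importable. This is only a convenience snippet, not part of FLamby or Fed-BioMed.
import importlib.util
# Optional sanity check: confirm that the required packages are installed
for pkg in ("flamby", "fedbiomed", "wget"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'installed' if found else 'NOT installed'}")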
Import Required Libraries¶
In this section, we'll import the necessary libraries and explore the available FLamby datasets. This step helps us understand what datasets are available in your FLamby installation before proceeding with downloads.
The code below will:
- Import essential Python libraries for file handling and dataset discovery
- Load the FLamby datasets module
- Display a list of all available FLamby datasets in your installation
import pkgutil
from pathlib import Path
from flamby import datasets
# List of available FLamby datasets
list(i.name for i in pkgutil.iter_modules(datasets.__path__))
Datasets have to be downloaded using the download.py script provided in the FLamby library/module. Therefore, we have to find the correct download script for the given dataset.
import flamby.datasets.fed_heart_disease
dataset_root = Path(flamby.datasets.fed_heart_disease.__file__).parent
print(dataset_root)
download_script = dataset_root / "dataset_creation_scripts" / "download.py"
# Pipe "yes" to the script to automatically confirm the download prompt
!yes | python {download_script} --output-folder ./data/fed_heart_disease
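Once the script finishes, verify that the output folder actually contains data files. The exact file names depend on the FLamby version, so this is only a quick sanity check:
from pathlib import Path
data_dir = Path("./data/fed_heart_disease")
# List the downloaded files and their sizes to confirm the download succeeded
for p in sorted(data_dir.rglob("*")):
    if p.is_file():
        print(p.relative_to(data_dir), p.stat().st_size, "bytes")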
Deploying Datasets¶
After the datasets are downloaded, they can be deployed on Fed-BioMed nodes. To deploy them, the Fed-BioMed `CustomDataset` type will be used.
Please execute the following commands to create Fed-BioMed node components:
!fedbiomed component create -c node --path ./node-1 -n my-first-node
!fedbiomed component create -c node --path ./node-2 -n my-second-node
After the nodes are created, the FLamby dataset can be deployed. To do that, a JSON file has to be created that describes where the data is located and which center/partition is going to be used for that dataset.
Please keep in mind that this is a testing scenario: the FLamby dataset is downloaded once and is already partitioned by center, so the dataset definition (which center to use) is passed through a JSON file so that two different nodes can each use a different partition of the same download.
import os
import json
# Get the absolute path to the downloaded FLamby dataset
abs_path = os.path.abspath("./data/fed_heart_disease")
# Create dataset configuration for Node 1 (using center/partition 1)
dataset_description = {"center": 1, "dataset-path": abs_path}
# Create dataset configuration for Node 2 (using center/partition 2)
dataset_description_2 = {"center": 2, "dataset-path": abs_path}
# Save dataset configuration for Node 1
node_data_path = os.path.abspath("./node-1/data/dataset_description.json")
with open(node_data_path, 'w') as f:
    json.dump(dataset_description, f)
# Save dataset configuration for Node 2
node_data_path_2 = os.path.abspath("./node-2/data/dataset_description.json")
with open(node_data_path_2, 'w') as f:
    json.dump(dataset_description_2, f)
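Optionally, read the two descriptor files back to confirm their content before deploying them:
# Read the descriptor files back to confirm their content
for path in (node_data_path, node_data_path_2):
    with open(path) as f:
        print(path, "->", json.load(f))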
The JSON files defined above are used in the TrainingPlan to load the correct partition for each node. The datasets still need to be deployed on the nodes. There are two options: (1) use the interactive CLI to define the dataset name, tags, and data path one by one; or (2) use a JSON file that contains the dataset metadata (tags, name, path). To keep this tutorial simple, we will use the JSON file method to add datasets.
dataset_for_node_1 = {"name": "fed_heart_disease_node_1",
                      "data_type": "custom",
                      "tags": "flamby,fed_heart_disease",
                      "description": "Heart disease dataset for federated learning",
                      "path": node_data_path}

dataset_for_node_2 = {"name": "fed_heart_disease_node_2",
                      "data_type": "custom",
                      "tags": "flamby,fed_heart_disease",
                      "description": "Heart disease dataset for federated learning",
                      "path": node_data_path_2}

with open('./node_1_dataset_metadata.json', 'w') as f:
    json.dump(dataset_for_node_1, f)

with open('./node_2_dataset_metadata.json', 'w') as f:
    json.dump(dataset_for_node_2, f)
After the dataset metadata/descriptor JSON files are saved, the datasets can be deployed on the nodes using the commands below.
!fedbiomed node -p ./node-1 dataset add --file ./node_1_dataset_metadata.json
!fedbiomed node -p ./node-2 dataset add --file ./node_2_dataset_metadata.json
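If your Fed-BioMed version provides the `dataset list` sub-command, you can verify that the datasets were registered on each node:
!fedbiomed node -p ./node-1 dataset list
!fedbiomed node -p ./node-2 dataset list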
Writing the TrainingPlan¶
Let's now write the training plan that trains FLamby's baseline model on the Fed Heart Disease dataset:
from fedbiomed.common.dataset import CustomDataset
from fedbiomed.common.training_plans import TorchTrainingPlan
from flamby.datasets.fed_heart_disease import (
FedHeartDisease,
Baseline,
BaselineLoss,
Optimizer
)
from fedbiomed.common.data import DataManager
class FedHeartTrainingPlan(TorchTrainingPlan):

    def init_model(self, model_args):
        # FLamby provides a baseline model for each of its datasets
        return Baseline()

    def init_optimizer(self, optimizer_args):
        # FLamby also provides a reference optimizer for the baseline model
        return Optimizer(self.model().parameters(), lr=optimizer_args["lr"])

    def init_dependencies(self):
        # Import statements the nodes need in order to execute this training plan
        return ["from flamby.datasets.fed_heart_disease import FedHeartDisease, Baseline, BaselineLoss, Optimizer",
                "from fedbiomed.common.data import DataManager",
                "from fedbiomed.common.dataset import CustomDataset"]

    def training_step(self, data, target):
        output = self.model().forward(data)
        return BaselineLoss().forward(output, target)

    class MyFedHeartDataset(CustomDataset):
        def read(self):
            """Read FLamby data"""
            # Read the dataset descriptor JSON file that is deployed on the node
            import json
            with open(self.path) as f:
                flamby_data = json.load(f)

            # Instantiate the FLamby dataset for the node's center/partition
            self.data = FedHeartDisease(
                center=flamby_data["center"],
                data_path=flamby_data["dataset-path"]
            )

        def get_item(self, item):
            """Get item"""
            return self.data[item]

        def __len__(self):
            """Dataset length"""
            return len(self.data)

    def training_data(self, batch_size=2):
        dataset = self.MyFedHeartDataset()
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset, **train_kwargs)
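Before launching the federated run, you can optionally check on the researcher side that both centers load correctly with FLamby. This is only a convenience check and assumes abs_path (defined in the deployment step) still points to the downloaded data.
# Local sanity check: load each center directly with FLamby
from flamby.datasets.fed_heart_disease import FedHeartDisease

for center in (1, 2):
    ds = FedHeartDisease(center=center, train=True, data_path=abs_path)
    x, y = ds[0]
    print(f"center {center}: {len(ds)} samples, x shape {tuple(x.shape)}, y shape {tuple(y.shape)}")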
After defining the training plan, set model_args and training_args. This tutorial uses the simple FLamby baseline model, so no additional model-specific arguments are required (you can leave model_args empty).
model_args = {}
training_args = {
    'loader_args': {'batch_size': 16},
    'optimizer_args': {
        'lr': 0.001,
    },
    'epochs': 2,
    'dry_run': False,
    'log_interval': 2,
    'test_ratio': 0.2,
    'test_batch_size': 16,
    'test_on_global_updates': True,
    'test_on_local_updates': True,
    'batch_maxnum': 10  # Fast pass for development: only use (batch_maxnum * batch_size) samples
}
Define the Experiment¶
The Experiment ties nodes, datasets, the training plan and aggregation into a single federated run.
Key fields:
- `tags` — dataset tags used to select participating nodes.
- `training_plan_class` — training plan implementing model, loss and optimizer.
- `model_args` — params passed to the training plan for model init.
- `training_args` — data loader, optimizer, epoch and runtime options.
- `aggregator` — server-side aggregation strategy (e.g. `FedAverage()`).
- `round_limit` — number of federated rounds to execute.
Please make sure that the two nodes with the deployed datasets are up and running before running your experiment:
fedbiomed node -p ./node-1 start
fedbiomed node -p ./node-2 start
Note: ensure deployed datasets use matching tags and correct descriptor JSONs. For fast development, lower round_limit, enable dry_run, or set batch_maxnum.
from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage
tags = ['flamby', 'fed_heart_disease']
num_rounds = 2
exp = Experiment(tags=tags,
model_args=model_args,
training_plan_class=FedHeartTrainingPlan,
training_args=training_args,
round_limit=num_rounds,
aggregator=FedAverage(),
)
exp.run()
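After the rounds finish, the aggregated model parameters can usually be inspected directly on the Experiment object. The sketch below assumes the aggregated_params() accessor available in recent Fed-BioMed versions.
# Inspect which rounds produced aggregated parameters (accessor names may differ across versions)
aggregated = exp.aggregated_params()
print("Rounds with aggregated parameters:", list(aggregated.keys()))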
Troubleshooting¶
- Ensure the FLamby dataset is downloaded to the location referenced by the dataset descriptor JSON files (the path in `"dataset-path"`). An empty or missing data folder will cause data-loading errors (for example `IndexError`).
- Verify each node's `dataset_description.json` exists and contains the required fields: at minimum `"dataset-path"` (absolute or relative path to the downloaded FLamby data) and `"center"` (the partition/center number).
- If you get errors when adding a dataset with the `fedbiomed` CLI using the `--file` option, validate the JSON for correct syntax, field names, and valid paths. If the issue persists, add the dataset interactively instead: `fedbiomed node -p <path-to-node> dataset add` and follow the prompts to provide the dataset name, tags, and path.
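As a quick way to catch the first two issues, you could run a small check like the one below on the machine hosting the nodes. This helper is just an illustration, not part of Fed-BioMed.
import json
import os

# Check that each node's descriptor JSON exists, has the required fields,
# and points to an existing data folder
for descriptor in ("./node-1/data/dataset_description.json",
                   "./node-2/data/dataset_description.json"):
    with open(descriptor) as f:
        desc = json.load(f)
    missing = [k for k in ("dataset-path", "center") if k not in desc]
    if missing:
        print(f"{descriptor}: missing fields {missing}")
    elif not os.path.isdir(desc["dataset-path"]):
        print(f"{descriptor}: data folder not found at {desc['dataset-path']}")
    else:
        print(f"{descriptor}: OK -> {desc}")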