Advanced optimizers in Fed-BioMed
Difficulty level: advanced
Introduction
This tutorial shows how to deal with heterogeneous datasets by changing the Optimizer. In Fed-BioMed, one can specify two sorts of Optimizer:
- an Optimizer on the Node side, defined in the Training Plan
- an Optimizer on the Researcher side, configured in the Experiment
Advanced Optimizers are backed by declearn, a Python package focused on optimization for Federated Learning. Advanced Optimizers can be used regardless of the machine learning framework (they are compatible with both scikit-learn and PyTorch).
In this tutorial you will learn:
- how to use and chain one or several Optimizers on the Node and Researcher sides
- how to use FedOpt
- how to use Optimizers that exchange auxiliary variables, such as Scaffold
For further details, you can refer to the Optimizer section in the User Guide as well as the declearn documentation on Optimizers.
1. Configuring Nodes
Before starting, we need to configure several Nodes and add the MedNIST dataset to each of them. The Node configuration steps require the fedbiomed-node conda environment. Please make sure that you have the necessary conda environment: this is explained in the installation tutorial.
Please open a terminal, cd to the base directory of the cloned fedbiomed project and follow the steps below.
- Configuration Steps:
  - Run fedbiomed node dataset add in the terminal.
  - It will ask you to select the data type that you want to add. The third option has been configured to add the MedNIST dataset. Please type 3 and continue.
  - Please use the default tags, which are #MEDNIST and #dataset.
  - For the next step, please select the directory where you want to download the MedNIST dataset.
  - After the download is completed, you will see the details of the MedNIST dataset on the screen.
Please run the commands below in the same terminal to add the MedNIST dataset to the Node my-node and then start that Node.
$ fedbiomed node --path my-node dataset add
$ fedbiomed node --path my-node start
In another terminal, you may proceed by launching a second Node. Please repeat the above configuration steps, but specify another Node (for instance with --path my-second-node, as in the commands below).
$ fedbiomed node --path my-second-node dataset add
$ fedbiomed node --path my-second-node start
2. Defining an Optimizer on Node side
Optimizers are defined through the init_optimizer method of the Training Plan. They must be set using the Fed-BioMed Optimizer object (i.e. fedbiomed.common.optimizers.optimizer.Optimizer).
2.1 With PyTorch framework
A previous tutorial showcased the use of a PyTorch model with native PyTorch optimizers, such as torch.optim.SGD. In the present tutorial, we will see how to use declearn cross-framework optimizers.
PyTorch Training Plan
Below, a short Training Plan using the native PyTorch optimizer torch.optim.SGD is shown first, followed by the equivalent full Training Plan using a declearn SGD Optimizer:
class MyTrainingPlan(TorchTrainingPlan):
    ...
    def init_optimizer(self, optimizer_args):
        return torch.optim.SGD(self.model().parameters(), lr=optimizer_args['lr'])
import torch
import torch.nn as nn
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms
from torchvision.models import densenet121
from fedbiomed.common.optimizers.optimizer import Optimizer

# Here we define the model to be used.
# We will use the densenet121 model.
class MyTrainingPlan(TorchTrainingPlan):
    def init_dependencies(self):
        deps = ["from torchvision import datasets, transforms",
                "from torchvision.models import densenet121",
                "from fedbiomed.common.optimizers.optimizer import Optimizer"]
        return deps

    def init_model(self):
        self.loss_function = torch.nn.CrossEntropyLoss()
        model = densenet121(pretrained=True)
        model.classifier = nn.Sequential(nn.Linear(1024, 512), nn.Softmax())
        return model

    def init_optimizer(self, optimizer_args):
        # Defines and returns a declearn optimizer
        # equivalent: Optimizer(lr=optimizer_args['lr'], modules=[], regularizers=[])
        return Optimizer(lr=optimizer_args['lr'])

    def training_data(self, batch_size=48):
        preprocess = transforms.Compose([transforms.ToTensor(),
                                         transforms.Normalize(
                                             mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                                         )])
        train_data = datasets.ImageFolder(self.dataset_path, transform=preprocess)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset=train_data, **train_kwargs)

    def training_step(self, data, target):
        output = self.model().forward(data)
        loss = self.loss_function(output, target)
        return loss
2.2 Sklearn Training Plan
For another machine learning framework such as scikit-learn, the init_optimizer method syntax is the same:
from fedbiomed.common.training_plans import FedSGDClassifier
from fedbiomed.common.data import DataManager
from fedbiomed.common.optimizers.optimizer import Optimizer
from torchvision import datasets, transforms
import torch

class MyTrainingPlan(FedSGDClassifier):
    # Declares and returns dependencies
    def init_dependencies(self):
        deps = ["from torchvision import datasets, transforms",
                "from fedbiomed.common.optimizers.optimizer import Optimizer",
                "import torch"]
        return deps

    def training_data(self, batch_size):
        # In comparison to the PyTorch Training Plan, preprocessing involves additional steps
        # in order to be used with sklearn SGDClassifier, which expects vectors in lieu of arrays.
        # Here we are grayscaling and reshaping images.
        squeezer = lambda x: torch.squeeze(x)  # removes extra dimensions
        preprocess = transforms.Compose([transforms.ToTensor(),
                                         transforms.Normalize(
                                             mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                                         ),
                                         transforms.Grayscale(1),
                                         transforms.Resize((64*64, 1)),
                                         transforms.Lambda(squeezer)
                                         ])
        train_data = datasets.ImageFolder(self.dataset_path, transform=preprocess)
        return DataManager(dataset=train_data, batch_size=batch_size)

    # Defines and returns a declearn optimizer
    def init_optimizer(self, optimizer_args):
        return Optimizer(lr=optimizer_args['lr'])
2.3 Using a more advanced Optimizer with Regularizer
An Optimizer from fedbiomed.common.optimizers.optimizer with a learning rate of .1 can be written as Optimizer(lr=.1, decay=0., modules=[], regularizers=[]), where:
- decay is the weight decay;
- modules is a Python list containing zero, one or several declearn OptiModules;
- regularizers is a Python list containing zero, one or several declearn Regularizers.
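To make the roles of lr, decay, modules and regularizers more concrete, here is a minimal pure-Python sketch of how such a pipeline could chain its pieces. It is not the declearn or Fed-BioMed implementation: the names ridge, Momentum and step are hypothetical, the order (regularizers first, then modules, then the learning-rate step) and the way decay is applied are assumptions of this sketch.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def ridge(alpha: float) -> Callable[[Vector, Vector], Vector]:
    """Hypothetical regularizer: adds the gradient of alpha * ||w||^2."""
    def run(grads: Vector, weights: Vector) -> Vector:
        return [g + 2.0 * alpha * w for g, w in zip(grads, weights)]
    return run


class Momentum:
    """Hypothetical stateful module: v <- beta * v + g, then v replaces g."""

    def __init__(self, beta: float = 0.9) -> None:
        self.beta = beta
        self.velocity: Vector = []

    def __call__(self, grads: Vector) -> Vector:
        if not self.velocity:
            self.velocity = [0.0] * len(grads)
        self.velocity = [self.beta * v + g for v, g in zip(self.velocity, grads)]
        return list(self.velocity)


def step(weights: Vector, grads: Vector, lr: float, decay: float = 0.0,
         modules: Sequence = (), regularizers: Sequence = ()) -> Vector:
    """One update: regularizers, then modules, then the lr / decay step."""
    for reg in regularizers:      # regularizers see gradients and weights
        grads = reg(grads, weights)
    for mod in modules:           # modules transform gradients, in list order
        grads = mod(grads)
    # Assumption of this sketch: decay is applied directly to the weights.
    return [w - lr * g - decay * w for w, g in zip(weights, grads)]


weights = [1.0, -2.0]
grads = [0.5, 0.5]
plain = step(weights, grads, lr=0.1)  # empty lists: plain SGD
chained = step(weights, grads, lr=0.1,
               modules=[Momentum()], regularizers=[ridge(0.01)])
print(plain, chained)
```

With empty modules and regularizers lists the step reduces to plain SGD, which is why Optimizer(lr=...) alone plays the same role as torch.optim.SGD in the Training Plans above.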
We will re-use the PyTorch Training Plan already defined above and show how to use an Adam Optimizer with Ridge as the Regularizer. For that, we need to import the declearn versions of Adam and Ridge (AdamModule and RidgeRegularizer).
Then the Training Plan can be defined as follows (for PyTorch):
import torch
import torch.nn as nn
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms
from torchvision.models import densenet121
from fedbiomed.common.optimizers.optimizer import Optimizer
from fedbiomed.common.optimizers.declearn import AdamModule, RidgeRegularizer

# Here we define the model to be used.
# We will use the densenet121 model.
class MyTrainingPlan(TorchTrainingPlan):
    def init_dependencies(self):
        # Each entry must be a complete import statement
        deps = ["from torchvision import datasets, transforms",
                "from torchvision.models import densenet121",
                "from fedbiomed.common.optimizers.optimizer import Optimizer",
                "from fedbiomed.common.optimizers.declearn import AdamModule",
                "from fedbiomed.common.optimizers.declearn import RidgeRegularizer"]
        return deps

    def init_model(self):
        self.loss_function = torch.nn.CrossEntropyLoss()
        model = densenet121(pretrained=True)
        model.classifier = nn.Sequential(nn.Linear(1024, 512), nn.Softmax())
        return model

    def init_optimizer(self, optimizer_args):
        # Defines and returns a declearn optimizer:
        # an Adam module chained with a Ridge regularizer
        return Optimizer(lr=optimizer_args['lr'], modules=[AdamModule()], regularizers=[RidgeRegularizer()])

    def training_data(self, batch_size=48):
        preprocess = transforms.Compose([transforms.ToTensor(),
                                         transforms.Normalize(
                                             mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                                         )])
        train_data = datasets.ImageFolder(self.dataset_path, transform=preprocess)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset=train_data, **train_kwargs)

    def training_step(self, data, target):
        output = self.model().forward(data)
        loss = self.loss_function(output, target)
        return loss
2.4. Create the Experiment
Once the Training Plan has been created with a specific framework model, the definition of the Experiment is the same as for a regular PyTorch or scikit-learn Training Plan, as shown below.
Note: A few additional parameters have to be configured in model_args for scikit-learn. They have been added below and are ignored for a PyTorch model.
lr = 1e-3
model_args = {'n_features': 64*64,
              'n_classes': 6,
              'eta0': lr}
training_args = {
    'loader_args': {
        'batch_size': 8,
    },
    'optimizer_args': {
        "lr": lr
    },
    'dry_run': False,
    'num_updates': 50
}
tags = ['#dataset', '#MEDNIST']
rounds = 2
from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators import FedAverage
from fedbiomed.researcher.strategies.default_strategy import DefaultStrategy
exp = Experiment()
exp.set_training_plan_class(training_plan_class=MyTrainingPlan)
exp.set_model_args(model_args=model_args)
exp.set_training_args(training_args=training_args)
exp.set_tags(tags = tags)
exp.set_aggregator(aggregator=FedAverage())
exp.set_round_limit(rounds)
exp.set_training_data(training_data=None, from_tags=True)
exp.set_strategy(node_selection_strategy=DefaultStrategy())
exp.run(increase=True)
Save trained model to file
exp.training_plan().export_model('./trained_model')
To list all the OptiModules (respectively the Regularizers) available and compatible with Fed-BioMed, one can use list_optim_modules (resp. list_optim_regularizers), as shown below:
from fedbiomed.common.optimizers.declearn import list_optim_modules, list_optim_regularizers
list_optim_modules(), list_optim_regularizers()
3. Defining an Optimizer on Researcher side: FedOpt
In some cases, you may want to use Adaptive Federated Optimization, also called FedOpt: the idea behind FedOpt is to also optimize the global model on the Researcher side, in addition to the Nodes' local models, mainly to tackle data heterogeneity. Optimization on the Researcher side is done by computing a pseudo-gradient, which is the difference between the updates of 2 successive Rounds.
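The pseudo-gradient idea can be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not Fed-BioMed code: weighted_average and fedopt_step are hypothetical helpers, the pseudo-gradient is taken as the previous global weights minus the aggregated Node weights, and the Researcher-side optimizer is reduced to a plain gradient step.

```python
from typing import List

Vector = List[float]


def weighted_average(node_weights: List[Vector], n_samples: List[int]) -> Vector:
    """FedAvg-style aggregation: average Node weights by sample count."""
    total = sum(n_samples)
    size = len(node_weights[0])
    return [sum(w[i] * n for w, n in zip(node_weights, n_samples)) / total
            for i in range(size)]


def fedopt_step(global_weights: Vector, node_weights: List[Vector],
                n_samples: List[int], server_lr: float) -> Vector:
    """Treat (previous global - aggregate) as a gradient and take one step."""
    aggregated = weighted_average(node_weights, n_samples)
    pseudo_grad = [g - a for g, a in zip(global_weights, aggregated)]
    return [g - server_lr * p for g, p in zip(global_weights, pseudo_grad)]


global_weights = [0.0, 1.0]
node_weights = [[1.0, 1.0], [3.0, 0.0]]   # weights returned by two Nodes
n_samples = [100, 300]

# A server learning rate of 1 reproduces plain weighted averaging.
same_as_average = fedopt_step(global_weights, node_weights, n_samples, server_lr=1.0)
# A smaller rate moves the global model only part of the way.
damped = fedopt_step(global_weights, node_weights, n_samples, server_lr=0.5)
print(same_as_average, damped)
```

In this sketch, replacing the plain step by an adaptive rule (Adam, Yogi, and so on) applied to pseudo_grad is what turns simple averaging into an adaptive federated optimizer.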
Adaptive Federated Optimization can be done in Fed-BioMed with declearn modules through the Experiment.set_agg_optimizer method.
Important: Please note that it is not possible to use native framework optimizers on the Researcher side (such as torch.optim.Optimizer, for instance). Only Fed-BioMed/declearn Optimizers can be used.
For instance, if one wants to use FedYogi with the first Training Plan (the one based on the SGD optimizer), the Experiment will be written as:
from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators import FedAverage
from fedbiomed.researcher.strategies.default_strategy import DefaultStrategy
from fedbiomed.common.optimizers.declearn import YogiModule as FedYogi
from fedbiomed.common.optimizers.optimizer import Optimizer
exp = Experiment()
exp.set_training_plan_class(training_plan_class=MyTrainingPlan)
exp.set_model_args(model_args=model_args)
exp.set_training_args(training_args=training_args)
exp.set_tags(tags = tags)
exp.set_aggregator(aggregator=FedAverage())
exp.set_round_limit(rounds)
exp.set_training_data(training_data=None, from_tags=True)
exp.set_strategy(node_selection_strategy=DefaultStrategy())
# here we are adding an Optimizer on Researcher side (FedYogi)
fed_opt = Optimizer(lr=.8, modules=[FedYogi()])
exp.set_agg_optimizer(fed_opt)
exp.run(increase=True)
Save trained model to file
exp.training_plan().export_model('./trained_model')
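For intuition about what a Yogi-style module does to the pseudo-gradient on the Researcher side, here is a scalar sketch of the published Yogi update rule. It is not the declearn YogiModule: the class name YogiSketch and the hyperparameter values are illustrative assumptions, and bias correction is deliberately left out.

```python
import math


def sign(x: float) -> int:
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


class YogiSketch:
    """Scalar Yogi rule: momentum on the gradient, additive variance update."""

    def __init__(self, beta_1: float = 0.9, beta_2: float = 0.99,
                 eps: float = 1e-3) -> None:
        self.beta_1, self.beta_2, self.eps = beta_1, beta_2, eps
        self.m = 0.0  # first moment (momentum)
        self.v = 0.0  # second moment estimate

    def run(self, grad: float) -> float:
        g2 = grad * grad
        self.m = self.beta_1 * self.m + (1.0 - self.beta_1) * grad
        # Yogi moves v towards g2 by a step proportional to g2,
        # instead of Adam's exponential moving average.
        self.v = self.v - (1.0 - self.beta_2) * sign(self.v - g2) * g2
        return self.m / (math.sqrt(self.v) + self.eps)


yogi = YogiSketch()
global_weight = 1.0
for pseudo_grad in (2.0, 0.1):       # two successive Rounds
    global_weight -= 0.8 * yogi.run(pseudo_grad)
print(global_weight, yogi.m, yogi.v)
```

When the pseudo-gradient suddenly shrinks, v decreases only slightly, so the effective step size does not jump from one Round to the next.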
4. Defining Scaffold through Optimizer
In the following subsection, we will present Scaffold: the purpose of Scaffold is to limit the so-called client drift that may happen when dealing with heterogeneous datasets across Nodes. For that, Scaffold involves the exchange between Nodes and Researcher of additional parameters called correction states, which quantify how much clients have drifted (drift can be considered as the difference between a client's local extremum and the global extremum). In Fed-BioMed, additional parameters that are required by Optimizers are called auxiliary variables: the correction states in Scaffold are one example.
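The role of the correction states can be illustrated with a scalar toy problem. The sketch below follows the update rule of the Scaffold paper, not the declearn modules: scaffold_local_training and grad_node_1 are hypothetical names, and the two-Node quadratic setting is an assumption chosen so that the numbers are easy to follow.

```python
from typing import Callable, Tuple


def scaffold_local_training(x_global: float, c_global: float, c_local: float,
                            grad_fn: Callable[[float], float],
                            lr: float, n_steps: int) -> Tuple[float, float]:
    """Run corrected local SGD steps and return (local model, new local state)."""
    y = x_global
    for _ in range(n_steps):
        # The local gradient is corrected by (global state - local state).
        y -= lr * (grad_fn(y) - c_local + c_global)
    # New local state, estimated from the total displacement of the model.
    new_c_local = c_local - c_global + (x_global - y) / (n_steps * lr)
    return y, new_c_local


def grad_node_1(y: float) -> float:
    """Gradient of the local loss 0.5 * (y - 1)^2 held by the first Node."""
    return y - 1.0


# Assume a second Node with loss 0.5 * (y - 3)^2: the global optimum is 2.
x_global = 2.0

# Without correction (both states at 0), the Node drifts towards its own optimum.
drifted, _ = scaffold_local_training(x_global, 0.0, 0.0, grad_node_1,
                                     lr=0.1, n_steps=5)

# With exact states (local gradient 1 at x_global, global average 0),
# the global optimum becomes a fixed point of local training.
corrected, new_state = scaffold_local_training(x_global, 0.0, 1.0, grad_node_1,
                                               lr=0.1, n_steps=5)
print(drifted, corrected, new_state)
```

The uncorrected run ends between the local optimum 1 and the global optimum 2, while the corrected run stays at 2: this is the drift that the exchanged auxiliary variables are meant to cancel.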
declearn comes with Scaffold as 2 OptiModules:
- a ScaffoldClientModule on the Node side;
- a ScaffoldServerModule on the Researcher side.
Important: the FedAvg Aggregator in Fed-BioMed refers to the way model weights are aggregated, and should not be confused with the FedAvg algorithm, which is basically an SGD optimizer run on the Node side combined with the FedAvg Aggregator.
For plain Scaffold, the Training Plan would look like this (for a PyTorch model):
import torch
import torch.nn as nn
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms
from torchvision.models import densenet121
from fedbiomed.common.optimizers.optimizer import Optimizer
from fedbiomed.common.optimizers.declearn import ScaffoldClientModule

# Here we define the model to be used.
# We will use the densenet121 model.
class MyTrainingPlan(TorchTrainingPlan):
    def init_dependencies(self):
        deps = ["from torchvision import datasets, transforms",
                "from torchvision.models import densenet121",
                "from fedbiomed.common.optimizers.optimizer import Optimizer",
                "from declearn.optimizer.modules import ScaffoldClientModule"]
        return deps

    def init_model(self):
        self.loss_function = torch.nn.CrossEntropyLoss()
        model = densenet121(pretrained=True)
        model.classifier = nn.Sequential(nn.Linear(1024, 512), nn.Softmax())
        return model

    def init_optimizer(self, optimizer_args):
        # Defines and returns a declearn optimizer
        return Optimizer(lr=optimizer_args['lr'], modules=[ScaffoldClientModule()])

    def training_data(self, batch_size=48):
        preprocess = transforms.Compose([transforms.ToTensor(),
                                         transforms.Normalize(
                                             mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                                         )])
        train_data = datasets.ImageFolder(self.dataset_path, transform=preprocess)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset=train_data, **train_kwargs)

    def training_step(self, data, target):
        output = self.model().forward(data)
        loss = self.loss_function(output, target)
        return loss
The Experiment is then defined as follows, with an Optimizer configured with ScaffoldServerModule:
lr = 1e-3
model_args = {}
training_args = {
    'loader_args': {
        'batch_size': 8,
    },
    'optimizer_args': {
        "lr": lr
    },
    'dry_run': False,
    'num_updates': 50
}
tags = ['#dataset', '#MEDNIST']
rounds = 2
from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators import FedAverage
from fedbiomed.researcher.strategies.default_strategy import DefaultStrategy
from fedbiomed.common.optimizers.declearn import ScaffoldServerModule
from fedbiomed.common.optimizers.optimizer import Optimizer
exp = Experiment()
exp.set_training_plan_class(training_plan_class=MyTrainingPlan)
exp.set_model_args(model_args=model_args)
exp.set_training_args(training_args=training_args)
exp.set_tags(tags = tags)
exp.set_aggregator(aggregator=FedAverage())
exp.set_round_limit(rounds)
exp.set_training_data(training_data=None, from_tags=True)
exp.set_strategy(node_selection_strategy=DefaultStrategy())
# here we are adding an Optimizer on Researcher side (Scaffold server module)
fed_opt = Optimizer(lr=.8, modules=[ScaffoldServerModule()])
exp.set_agg_optimizer(fed_opt)
exp.run(increase=True)
Save trained model to file
exp.training_plan().export_model('./trained_model')
exp.run(rounds=1, increase=True)
Save trained model to file
exp.training_plan().export_model('./trained_model')
5. Explore advanced Optimizer features through declearn and the Fed-BioMed user guide
Congrats!
In this tutorial, you learned how to conduct your Experiment using the advanced cross-framework Optimizers provided by declearn. declearn modules offer the possibility to chain Optimizers and Regularizers, making it possible to customize your federated Experiment as much as possible. The declearn modules compatible with Fed-BioMed are provided in fedbiomed.common.optimizers.declearn.
For a more in-depth analysis of the declearn Optimizer, please refer to the Optimizer section in the User Guide.
Please also check the declearn documentation for further details regarding the declearn package.