Transfer-learning in Fed-BioMed tutorial¶

Goal of this tutorial¶

This tutorial shows how to classify 2D images from the MedNIST dataset using a pretrained PyTorch model.

The goal of this tutorial is to provide an example of transfer learning methods with Fed-BioMed for medical image classification.

About the model¶

The model used is DenseNet-121 (“Densely Connected Convolutional Networks”), pretrained on the ImageNet dataset. We use the pretrained PyTorch DenseNet-121 model to perform image classification on the MedNIST dataset, that is, to predict the class of MedNIST medical images.

About MedNIST¶

MedNIST provides an artificial 2d classification dataset created by gathering different medical imaging datasets from TCIA, the RSNA Bone Age Challenge, and the NIH Chest X-ray dataset. The dataset is kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic) under the Creative Commons CC BY-SA 4.0 license.

The MedNIST dataset is downloaded from the resources provided by the MONAI project.

The dataset MedNIST has 58954 images of size (3, 64, 64) distributed into 6 classes (10000 images per class except for BreastMRI class which has 8954 images). Classes are AbdomenCT, BreastMRI, CXR, ChestCT, Hand, HeadCT. It has the structure:

└── MedNIST/
    ├── AbdomenCT/
    ├── BreastMRI/
    ├── CXR/
    ├── ChestCT/
    ├── Hand/
    └── HeadCT/
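
As a quick sanity check of this layout, the sketch below counts the image files found in each class folder. It is only an illustration: the helper name count_images_per_class and the list of accepted file extensions are assumptions, not part of Fed-BioMed or MONAI.

```python
from pathlib import Path

# Assumption: images are stored with one of these common extensions.
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def count_images_per_class(root):
    """Return {class_name: number_of_image_files} for a one-folder-per-class tree."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Calling count_images_per_class on the MedNIST folder should report six classes; on a sampled dataset, the counts should be roughly equal across classes.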

Transfer-learning¶

Transfer learning is a machine learning technique where a model trained on one task is repurposed or adapted for a second, related task. It starts from a neural network pre-trained on a large dataset: here, the DenseNet model was trained on ImageNet to classify a wide diversity of images.

The objective is that the knowledge gained from learning one task can be useful for learning another task (as we do here, the knowledge of DenseNet model trained on ImageNet is used to classify medical images in 6 categories). This is particularly beneficial when the amount of labeled data for the target task is limited, as the pre-trained model has already learned useful features and representations from a large dataset.

Transfer learning is typically applied in one of two ways:

(I) Feature Extraction: In this approach, the pre-trained model is used as a fixed feature extractor. The earlier layers of the neural network, which capture general features and patterns, are frozen, and only the later layers are replaced or retrained for the new task.
(II) Fine-tuning: In this approach, the pre-trained model is further trained or partially trained on the new task. This allows the model to adapt its learned representations to the specifics of the new task while retaining some of the knowledge gained from the original task.

In this example, we load a sampled MedNIST dataset on two nodes (500 images and 1000 images) to illustrate the effectiveness of transfer learning. Each sampled dataset is built from a random selection of images with balanced classes, to avoid classification bias. We will run two independent TrainingPlan experiments, one without transfer learning and one with transfer learning. We will compare these two experiments, both running on the DenseNet model, using loss value and accuracy as metrics to evaluate the effectiveness of transfer learning methods.
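
To make the idea of a balanced random sample concrete, here is a minimal sketch using only the Python standard library. It is not the code of the download script: the function name balanced_sample and its arguments are hypothetical, and it simply draws the same number of items from every class.

```python
import random


def balanced_sample(files_by_class, n_samples, seed=None):
    """Draw about n_samples items, the same number from every class.

    files_by_class maps a class name to a list of file names.
    Returns a shuffled list of (file_name, class_name) pairs.
    """
    rng = random.Random(seed)
    per_class = n_samples // len(files_by_class)
    sample = []
    for label in sorted(files_by_class):
        files = files_by_class[label]
        if per_class > len(files):
            raise ValueError(f"class {label!r} has only {len(files)} items")
        sample.extend((name, label) for name in rng.sample(files, per_class))
    rng.shuffle(sample)
    return sample
```

With 6 classes, a request for 500 samples yields 83 images per class (498 in total) in this sketch, since 500 is not divisible by 6; the actual script may handle the remainder differently.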

Note: This transfer learning example is not to be confused with Federated Transfer Learning (FTL) (see for example this paper). It only showcases transfer learning in a federated learning use case.

1. Load dataset or sampled dataset¶

In a new Fed-BioMed environment, run the Python script: python fbm-researcher/notebooks/transfer-learning/download_sample_of_mednist.py -n 2, where -n 2 is the number of Nodes you want to create (for more details about this script, run python fbm-researcher/notebooks/transfer-learning/download_sample_of_mednist.py --help).
For each Node created, the script asks for the number of samples you want in its dataset. For example, enter 500 the first time the script asks for the number of samples, and 1000 the second time. The script outputs a component directory for each Node, with a configured database, using the naming convention node_MedNIST_<i>_sampled, where <i> is the index of the Node created. Components are created in the directory where the script is executed. Finally, the script adds the dataset to the created Nodes.
Then launch your Nodes by running: fedbiomed node --path node_MedNIST_1_sampled start. In another terminal, run fedbiomed node --path node_MedNIST_2_sampled start.

Wait until each Node displays Starting task manager.

2. Launch the researcher¶

From the root directory of Fed-BioMed, run: fedbiomed researcher start
This opens a Jupyter notebook.

To make sure that the MedNIST dataset is loaded on the nodes, we can send a request to the network to list the datasets available on each node. The list command should output an entry for the MedNIST data.

from fedbiomed.researcher.requests import Requests
from fedbiomed.researcher.config import config
req = Requests(config)
req.list()

Import of libraries¶

import torch
import torch.nn as nn
from torchvision.models.densenet import DenseNet121_Weights
import pandas as pd
from fedbiomed.common.training_plans import TorchTrainingPlan

from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

Adapt the last layer to your classification goal¶

Here we use the DenseNet model for classification. We adapt it to the MedNIST dataset by replacing the last layer with our own classifier: the model.classifier layer of the DenseNet-121 model then classifies images into 6 classes. In the Training Plan, this is done by setting the num_classes value (passed through the model_args argument).

Data augmentation¶

You can perform data augmentation in the preprocessing step if you need it, for example with random flips, rotations and crops. The training plans in this tutorial only apply the basic preprocessing of images: transforms.Resize, transforms.ToTensor and transforms.Normalize.

I. Run an experiment for image classification without transfer learning¶

Here we propose to run a first experiment, MyTrainingPlan1, with the untrained DenseNet model. We will then compare its loss value with the one obtained with transfer learning methods.

We don't use the pre-trained weights. It is important to adapt the learning rate. A starting value of lr=1e-4 is suggested (note that the training arguments below use lr=1e-3); adjust the learning rate according to the evaluation metric.

I - 1. Define the training plan experiment¶

class MyTrainingPlan1(TorchTrainingPlan):

    def init_model(self, model_args):
        model = models.densenet121(weights=None)  # here model coefficients are set to random weights

        # add the classifier 
        num_classes = model_args['num_classes'] 
        num_ftrs = model.classifier.in_features
        model.classifier= nn.Sequential(
            nn.Linear(num_ftrs, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes)
        )
      
        return model

    def init_dependencies(self):
        return [
            "from torchvision import datasets, transforms, models",
            "import torch.optim as optim",
            "from torchvision.models import densenet121"
        ]


    def init_optimizer(self, optimizer_args):        
        return optim.Adam(self.model().parameters(), lr=optimizer_args["lr"])

    
    # training data
    
    def training_data(self):

        # Resize, convert to tensor and normalize images (no data augmentation here)
        preprocess = transforms.Compose([
                transforms.Resize((224,224)),
                transforms.ToTensor(),
                transforms.Normalize(mean = [0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225])
           ])
    
        train_data = datasets.ImageFolder(self.dataset_path,transform = preprocess)
        train_kwargs = { 'shuffle': True}
        return DataManager(dataset=train_data, **train_kwargs)

    def training_step(self, data, target):
        output = self.model().forward(data)
        loss_func = nn.CrossEntropyLoss()
        loss   = loss_func(output, target)
        return loss

training_args = {
    'loader_args': { 'batch_size': 32, }, 
    'optimizer_args': {'lr': 1e-3}, 
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 100, # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
    'random_seed': 1234
}

model_args = {
    'num_classes': 6, # adapt this number to the number of classes in your dataset
}
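
To see how batch_size and batch_maxnum interact with the dataset size, the sketch below estimates the number of training batches per epoch on one node. It rests on assumptions that are not taken from the Fed-BioMed documentation: the validation split (10% in this tutorial) is removed before batching, the last partial batch is kept, and the helper name batches_per_epoch is hypothetical.

```python
import math


def batches_per_epoch(n_samples, test_ratio, batch_size, batch_maxnum=None):
    """Return (n_train, n_test, n_batches) for one node and one epoch."""
    n_test = round(n_samples * test_ratio)      # samples held out for validation
    n_train = n_samples - n_test                # samples left for training
    n_batches = math.ceil(n_train / batch_size)  # keep the last partial batch
    if batch_maxnum:
        n_batches = min(n_batches, batch_maxnum)  # development cap
    return n_train, n_test, n_batches


for n in (500, 1000):
    print(n, batches_per_epoch(n, test_ratio=0.1, batch_size=32, batch_maxnum=100))
```

Under these assumptions, both nodes stay well below the cap of 100 batches, so batch_maxnum only becomes limiting for larger datasets (more than 3200 training samples with a batch size of 32).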

tags =  ['#MEDNIST', '#dataset']

rounds = 1  # adjust the number of rounds

exp = Experiment(tags=tags,
                 training_plan_class=MyTrainingPlan1,
                 model_args=model_args,
                 training_args=training_args,
                 round_limit=rounds,
                 aggregator=FedAverage())

# testing section 
from fedbiomed.common.metrics import MetricTypes
exp.set_test_ratio(.1) 
exp.set_test_on_global_updates(True)
exp.set_test_metric(MetricTypes.ACCURACY)

exp.set_tensorboard(True)
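
The experiment above aggregates node updates with FedAverage. As a rough illustration of the idea behind federated averaging, the sketch below computes a sample-weighted mean of per-node parameters stored as plain Python lists. It is not Fed-BioMed's implementation: the function name, the data layout and the weighting by sample count are assumptions made for this example.

```python
def federated_average(node_params, node_sample_counts):
    """Sample-weighted average of per-node parameters.

    node_params maps node_id -> {param_name: list of floats}.
    node_sample_counts maps node_id -> number of training samples.
    """
    total = sum(node_sample_counts.values())
    first = next(iter(node_params.values()))
    averaged = {}
    for name, values in first.items():
        averaged[name] = [
            sum(
                node_params[node][name][i] * node_sample_counts[node] / total
                for node in node_params
            )
            for i in range(len(values))
        ]
    return averaged
```

In this sketch, a node holding 1000 samples contributes twice as much to the aggregated parameters as a node holding 500 samples.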

I - 3. Run your experiment¶

exp.run()

For example, at the end of the training experiment, I obtained:¶

fedbiomed INFO - VALIDATION ON GLOBAL UPDATES 
					 NODE_ID: node_mednist_1_sampled 
					 Round 2 | Iteration: 1/1 (100%) | Samples: 50/50
 					 ACCURACY: 0.740000 
					 ---------

fedbiomed INFO - VALIDATION ON GLOBAL UPDATES 
					 NODE_ID: node_mednist_2_sampled 
					 Round 2 | Iteration: 1/1 (100%) | Samples: 100/100
 					 ACCURACY: 0.780000 
					 ---------

I - 4. Save your model¶

You can save your model to reuse it later in a new TrainingPlan. The saved file lets you import the model, including your layer modifications and weight values.

#save model 
exp.training_plan().export_model('./training_plan1_densenet_MedNIST')

I - 5. Results in tensorboard¶

from fedbiomed.researcher.config import config
tensorboard_dir = config.vars['TENSORBOARD_RESULTS_DIR']

%load_ext tensorboard

%tensorboard --logdir "$tensorboard_dir"

I - 6. Training timing¶

print("\nList the training rounds : ", exp.training_replies().keys())

print("\nList the nodes for the last training round and their timings : ")
round_data = exp.training_replies()[rounds - 1]
for r in round_data.values():
    print("\t- {id} :\
    \n\t\trtime_training={rtraining:.2f} seconds\
    \n\t\tptime_training={ptraining:.2f} seconds\
    \n\t\trtime_total={rtotal:.2f} seconds".format(id = r['node_id'],
        rtraining = r['timing']['rtime_training'],
        ptraining = r['timing']['ptime_training'],
        rtotal = r['timing']['rtime_total']))
print('\n')

II - Run an experiment for image classification using transfer learning¶

II-1. Downloading the pretrained model's weights¶

Here I download the model's weights through torch.hub and save them in the file 'pretrained_model.pt', using the command below.

model = torch.hub.load('pytorch/vision:v0.10.0', 'densenet121', weights=DenseNet121_Weights.DEFAULT)
torch.save(model.state_dict(), 'pretrained_model.pt')
torch.save(model.state_dict(), 'pretrained_model2.pt')

II-2. Adapt the last layer to your classification's goal¶

Here we train the DenseNet model on 1500 samples (spread over 2 nodes). We adapt it to the MedNIST dataset by replacing the last layer with our own classifier: the model.classifier layer of the DenseNet-121 model then classifies images into 6 classes. In the Training Plan, this is done by setting the num_classes value (passed through the model_args argument).

The dataset is defined in the TrainingPlan below, as previously shown.

You could also import the model you saved to perform your second TrainingPlan experiment (see below).

In this experiment I will unfreeze the last two blocks and the classifier layers. The other layers stay frozen (i.e. they do not change during the experiment).

I introduce a new argument in model_args called num_unfrozen_blocks. This argument specifies the number of blocks left unfrozen. In the DenseNet model, layers are grouped within blocks. There is a total of 12 blocks, containing several layers each. In our experiment, we freeze whole blocks of layers rather than individual layers.
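
The effect of num_unfrozen_blocks can be previewed without loading the model. The sketch below applies the same slicing as features[:-n] to the names of the 12 child modules of model.features in torchvision's DenseNet-121. The helper split_blocks is hypothetical and only mimics the slice used in the training plan.

```python
# Names of the 12 child modules of torchvision's densenet121().features
DENSENET121_FEATURE_BLOCKS = [
    "conv0", "norm0", "relu0", "pool0",
    "denseblock1", "transition1",
    "denseblock2", "transition2",
    "denseblock3", "transition3",
    "denseblock4", "norm5",
]


def split_blocks(block_names, num_unfrozen_blocks):
    """Mimic features[:-n]: return (frozen, unfrozen) block names."""
    frozen = block_names[:-num_unfrozen_blocks]
    unfrozen = block_names[len(frozen):]
    return frozen, unfrozen


frozen, unfrozen = split_blocks(DENSENET121_FEATURE_BLOCKS, 2)
print("frozen:  ", frozen)
print("unfrozen:", unfrozen)
```

With num_unfrozen_blocks set to 2, only denseblock4 and norm5 (plus the classifier) remain trainable. Note one pitfall of this slicing: with a value of 0 the slice [:-0] is empty, so nothing is frozen at all rather than everything.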

from fedbiomed.common.training_plans import TorchTrainingPlan
class MyTrainingPlan2(TorchTrainingPlan):

    def init_model(self, model_args):
        model = models.densenet121(weights=None)
        # freeze every block except the last `num_unfrozen_blocks` ones
        num_unfrozen_layer = model_args['num_unfrozen_blocks']
        for param in model.features[:-num_unfrozen_layer].parameters():
            param.requires_grad = False

        # add the classifier 
        num_ftrs = model.classifier.in_features
        num_classes = model_args['num_classes'] 
        model.classifier = nn.Sequential(
            nn.Linear(num_ftrs, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, num_classes)       
            )
        
        return model

    def init_dependencies(self):
        return [
            "from torchvision import datasets, transforms, models",
            "import torch.optim as optim"
        ]


    def init_optimizer(self, optimizer_args):        
        return optim.Adam(self.model().parameters(), lr=optimizer_args["lr"])

    def training_data(self):
        
        # Load MedNIST with ImageFolder; resize, convert to tensor and normalize images
       
        preprocess = transforms.Compose([
                transforms.Resize((224,224)),  
                transforms.ToTensor(),
                transforms.Normalize(mean = [0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225])
           ])
        train_data = datasets.ImageFolder(self.dataset_path,transform = preprocess)
        train_kwargs = { 'shuffle': True}
        return DataManager(dataset=train_data, **train_kwargs)



    def training_step(self, data, target):
        output = self.model().forward(data)
        loss_func = nn.CrossEntropyLoss()
        loss   = loss_func(output, target)
        return loss

from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

training_args = {
    'loader_args': { 'batch_size': 32, }, 
    'optimizer_args': {'lr': 1e-4}, # You could decrease the learning rate
    'epochs': 1,  # you can increase the number of epochs (e.g. 10)
    'dry_run': False,
    'random_seed': 1234,
    'batch_maxnum': 100 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}
model_args={
    'num_classes': 6,
    'num_unfrozen_blocks': 2  
}
tags =  ['#MEDNIST', '#dataset']
rounds = 1  # you can increase the number of rounds

exp = Experiment(tags=tags,
                 training_plan_class=MyTrainingPlan2,
                 model_args=model_args,
                 training_args=training_args,
                 round_limit=rounds,
                 aggregator=FedAverage())

from fedbiomed.common.metrics import MetricTypes
exp.set_test_ratio(.1)
exp.set_test_on_global_updates(True)
exp.set_test_metric(MetricTypes.ACCURACY)

exp.set_tensorboard(True)

# here we load the model we have saved with torch-hub weights

exp.training_plan().import_model('pretrained_model.pt')

II - 3. Run your experiment¶

exp.run()

For example, at the end of the training experiment, I obtained:¶

fedbiomed INFO - VALIDATION ON GLOBAL UPDATES 
					 NODE_ID: node_mednist_1_sampled 
					 Round 2 | Iteration: 1/1 (100%) | Samples: 50/50
 					 ACCURACY: 1.0000
					 ---------

fedbiomed INFO - VALIDATION ON GLOBAL UPDATES 
					 NODE_ID: node_mednist_2_sampled 
					 Round 2 | Iteration: 1/1 (100%) | Samples: 100/100
 					 ACCURACY: 1.0000 
					 ---------
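
To compare the two experiments with a single number, one option is to pool the per-node validation accuracies, weighting each node by its number of validation samples. The sketch below does this with the accuracies and sample counts reported in the logs of this tutorial; the helper pooled_accuracy is hypothetical and not a Fed-BioMed function.

```python
def pooled_accuracy(results):
    """results: list of (accuracy, n_samples) pairs, one pair per node."""
    total = sum(n for _, n in results)
    return sum(acc * n for acc, n in results) / total


# (accuracy, validation samples) per node, taken from the logs above
without_tl = pooled_accuracy([(0.74, 50), (0.78, 100)])
with_tl = pooled_accuracy([(1.0, 50), (1.0, 100)])

print(f"without transfer learning: {without_tl:.3f}")
print(f"with transfer learning:    {with_tl:.3f}")
```

On these example runs, the pooled accuracy goes from roughly 0.77 without transfer learning to 1.0 with transfer learning. Your own values will differ depending on the sampled data and training arguments.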

print("\nList the training rounds : ", exp.training_replies().keys())

print("\nList the nodes for the last training round and their timings : ")
round_data = exp.training_replies()[rounds - 1]
for r in round_data.values():
    print("\t- {id} :\
    \n\t\trtime_training={rtraining:.2f} seconds\
    \n\t\tptime_training={ptraining:.2f} seconds\
    \n\t\trtime_total={rtotal:.2f} seconds".format(id = r['node_id'],
        rtraining = r['timing']['rtime_training'],
        ptraining = r['timing']['ptime_training'],
        rtotal = r['timing']['rtime_total']))
print('\n')

II - 4. Export your model¶

#save model 
exp.training_plan().export_model('./training_plan2_densenet_MedNIST')

II - 5. Display losses on Tensorboard¶

%reload_ext tensorboard

%tensorboard --logdir "$tensorboard_dir" --port 6007

II - 6. Save and Import your model and parameters¶

You could import your first model from TrainingPlan1 instead of loading the original DenseNet. You could also retrieve the model's features.

# import your model from a file
model_features_ = torch.load('./training_plan2_densenet_MedNIST')
model_features_

II - 7. check model parameters changed/unchanged¶

Here we simply make sure that the layers that were supposed to be modified have indeed been modified between the original model downloaded from PyTorch Hub and the trained model.

We will discard the batch normalization layers, since those may have changed during the transfer learning operation.

Let's first have a look at the layers in the model that we kept frozen.

# layers kept frozen during transfer learning (MyTrainingPlan2)
model_features = exp.training_plan().model()
model_features.features[:-model_args['num_unfrozen_blocks']]

# Here we check if Layers of the DenseNet model have changed between the initial model and the model extracted
# from the training plan (after transfer learning)
model_features = exp.training_plan().model()

table = pd.DataFrame(columns=["Layer name", "Layer set to frozen", "Is Layer changed?"])
ref_model = torch.load('pretrained_model.pt')  # reloading model downloaded from pytorch hub


remove_norm_layers= lambda name : not any([x in name for x in ('norm', 'batch') ])
    

layers = list(ref_model.keys())
ours_layers = model_features.features[:-model_args['num_unfrozen_blocks']]
ours_layers = ['features.'+ x for x in ours_layers.state_dict().keys()]

_counter = 0
for i, (layer_name, param) in enumerate(model_features.state_dict().items()):
    if i >= len(layers):
        continue
    l = layers[i]

    if remove_norm_layers(l) :
        r_tensor = ref_model[l]
        if 'classifier' in layer_name:
            table.loc[_counter] = [l, l in ours_layers, "non comparable"]

        else:
            t = model_features.get_parameter(l)
            _is_close = bool(torch.isclose(r_tensor, t).all())
            table.loc[_counter] = [l, l in ours_layers, not _is_close, ]

    _counter += 1

# display comparison table content
table

The table lists the layers and shows which ones were modified and which were left untouched during training. "non comparable" marks layers that were replaced in the original model for our use case: these are the classifier layers.
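
The logic of this comparison can also be sketched on plain Python data, independently of PyTorch. In the example below, each "state dict" is a dictionary of lists of floats; the helper changed_layers and its tolerance defaults are assumptions made for illustration, and a layer that is missing or has a different size is reported as not comparable (None).

```python
import math


def changed_layers(reference, trained, rel_tol=1e-5, abs_tol=1e-8):
    """Return {layer_name: True if changed, False if unchanged, None if not comparable}."""
    report = {}
    for name, ref_values in reference.items():
        if name not in trained or len(trained[name]) != len(ref_values):
            report[name] = None  # layer was replaced or reshaped
            continue
        same = all(
            math.isclose(r, t, rel_tol=rel_tol, abs_tol=abs_tol)
            for r, t in zip(ref_values, trained[name])
        )
        report[name] = not same
    return report
```

Frozen layers are expected to map to False, unfrozen layers to True once training has updated them, and the replaced classifier to None.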

Conclusions¶

Through these experiments, we observed better accuracy and a faster decrease of the loss value with transfer learning methods than with the untrained model.

To conclude, the choice of a transfer learning method depends on how much data you have. You can choose to train more layers and compare the metrics with those of partial fine-tuning, then keep the method that gives the best metrics for your experiment.