Classes that simplify imports from fedbiomed.common.dataloadingplan
Classes
DataLoadingBlock
DataLoadingBlock()
Bases: ABC
The building blocks of a DataLoadingPlan.
A DataLoadingBlock describes an intermediary layer between the researcher and the node's filesystem. It allows the node to specify a customization in the way data is "perceived" by the data loaders during training.
A DataLoadingBlock is identified by its type_id attribute. Thus, this attribute should be unique among all DataLoadingBlockTypes in the same DataLoadingPlan. Moreover, we may test equality between a DataLoadingBlock and a string by checking its type_id, as a means of easily testing whether a DataLoadingBlock is contained in a collection.
Correct usage of this class requires creating ad-hoc subclasses. The DataLoadingBlock class is not intended to be instantiated directly.
Subclasses of DataLoadingBlock must respect the following conditions:
- implement a default constructor
- the implemented constructor must call super().__init__()
- extend the serialize(self) and the deserialize(self, load_from: dict) functions
- both serialize and deserialize must call super's serialize and deserialize respectively
- the deserialize function must always return self
- the serialize function must update the dict returned by super's serialize
- implement an apply function that takes arbitrary arguments and applies the logic of the loading_block
- update the _validation_scheme to define rules for all new fields returned by the serialize function
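As an illustration, here is a minimal sketch of a conforming subclass. The class name FolderPrefixBlock, its prefix field, and the import path are illustrative assumptions, not part of the library:

```python
from fedbiomed.common.dataloadingplan import DataLoadingBlock  # import path is an assumption

class FolderPrefixBlock(DataLoadingBlock):
    """Hypothetical block that prepends a node-defined prefix to a path."""

    def __init__(self):
        super().__init__()  # mandatory: initializes the serialization id and validator
        self.prefix = ""
        # declare validation rules for every new field returned by serialize
        self._serialization_validator.update_validation_scheme({
            "prefix": {"rules": [str], "required": True},
        })

    def serialize(self) -> dict:
        ret = super().serialize()            # start from the parent's dict ...
        ret.update({"prefix": self.prefix})  # ... and extend it with the new field
        return ret

    def deserialize(self, load_from: dict):
        super().deserialize(load_from)
        self.prefix = load_from["prefix"]
        return self                          # deserialize must always return self

    def apply(self, path: str) -> str:
        # the block's logic, called through DataLoadingPlanMixin.apply_dlb
        return f"{self.prefix}/{path}"
```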
Attributes:
| Name | Type | Description |
|---|---|---|
__serialization_id | str | identifies one serialized instance of the DataLoadingBlock |
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def __init__(self):
self.__serialization_id = "serialized_dlb_" + str(uuid.uuid4())
self._serialization_validator = SerializationValidation()
self._serialization_validator.update_validation_scheme(
SerializationValidation.dlb_default_scheme()
)
Functions
apply abstractmethod
apply(*args, **kwargs)
Abstract method representing an application of the DataLoadingBlock
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
@abstractmethod
def apply(self, *args, **kwargs):
"""Abstract method representing an application of the DataLoadingBlock"""
pass
deserialize
deserialize(load_from)
Reconstruct the DataLoadingBlock from a serialized version.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
load_from | dict | a dictionary as obtained by the serialize function. | required |
Returns: the self instance
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def deserialize(self, load_from: dict) -> TDataLoadingBlock:
"""Reconstruct the DataLoadingBlock from a serialized version.
Args:
load_from (dict): a dictionary as obtained by the serialize function.
Returns:
the self instance
"""
self._serialization_validator.validate(
load_from, FedbiomedLoadingBlockValueError
)
self.__serialization_id = load_from["dlb_id"]
return self
get_serialization_id
get_serialization_id()
Expose serialization id as read-only
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def get_serialization_id(self):
"""Expose serialization id as read-only"""
return self.__serialization_id
instantiate_class staticmethod
instantiate_class(loading_block)
Instantiate one DataLoadingBlock object of the type defined in the arguments.
Uses the loading_block_module and loading_block_class fields of the loading_block argument to identify the type of DataLoadingBlock to be instantiated, then calls its default constructor. Note that this function does not call deserialize.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
loading_block | dict | DataLoadingBlock metadata in the format returned by the serialize function. | required |
Returns: A default-constructed instance of a DataLoadingBlock of the type defined in the metadata.
Raises: FedbiomedLoadingBlockError: if the instantiation process raised any exception.
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
@staticmethod
def instantiate_class(loading_block: dict) -> TDataLoadingBlock:
"""Instantiate one [DataLoadingBlock][fedbiomed.common.dataloadingplan.DataLoadingBlock]
object of the type defined in the arguments.
Uses the `loading_block_module` and `loading_block_class` fields of the loading_block argument to
identify the type of [DataLoadingBlock][fedbiomed.common.dataloadingplan.DataLoadingBlock]
to be instantiated, then calls its default constructor.
Note that this function **does not call deserialize**.
Args:
loading_block (dict): [DataLoadingBlock][fedbiomed.common.dataloadingplan.DataLoadingBlock]
metadata in the format returned by the serialize function.
Returns:
A default-constructed instance of a
[DataLoadingBlock][fedbiomed.common.dataloadingplan.DataLoadingBlock]
of the type defined in the metadata.
Raises:
FedbiomedLoadingBlockError: if the instantiation process raised any exception.
"""
try:
dlb_module = import_module(loading_block["loading_block_module"]) # noqa: F841
dlb = eval(f"dlb_module.{loading_block['loading_block_class']}()")
except Exception as e:
msg = (
f"{ErrorNumbers.FB614.value}: could not instantiate DataLoadingBlock from the following metadata: "
+ f"{loading_block} because of {type(e).__name__}: {e}"
)
logger.debug(msg)
raise FedbiomedLoadingBlockError(msg) from e
return dlb
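As a hedged usage sketch (assuming DataLoadingBlock and MapperBlock are importable from fedbiomed.common.dataloadingplan), a block can be rebuilt from its serialized metadata in two steps:

```python
from fedbiomed.common.dataloadingplan import DataLoadingBlock, MapperBlock  # assumed import path

original = MapperBlock()
original.map = {"center-1": "/data/site1"}   # hypothetical mapping
metadata = original.serialize()

rebuilt = DataLoadingBlock.instantiate_class(metadata)  # step 1: default-constructed instance only
rebuilt = rebuilt.deserialize(metadata)                 # step 2: instantiate_class does not call deserialize
assert rebuilt.map == original.map
```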
instantiate_key staticmethod
instantiate_key(key_module, key_classname, loading_block_key_str)
Imports and loads a DataLoadingBlockTypes key based on the passed arguments
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key_module | str | name of the module where the key enum class is defined | required |
key_classname | str | name of the DataLoadingBlockTypes subclass defining the key | required |
loading_block_key_str | str | the string value of the key member to instantiate | required |
Raises:
| Type | Description |
|---|---|
FedbiomedDataLoadingPlanError | if the key could not be imported and instantiated from the given module, class name and value |
Returns:
| Name | Type | Description |
|---|---|---|
DataLoadingBlockTypes | DataLoadingBlockTypes | the instantiated key |
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
@staticmethod
def instantiate_key(
key_module: str, key_classname: str, loading_block_key_str: str
) -> DataLoadingBlockTypes:
"""Imports and loads [DataLoadingBlockTypes][fedbiomed.common.constants.DataLoadingBlockTypes]
regarding the passed arguments
Args:
key_module (str): _description_
key_classname (str): _description_
loading_block_key_str (str): _description_
Raises:
FedbiomedDataLoadingPlanError: _description_
Returns:
DataLoadingBlockTypes: _description_
"""
try:
keys = import_module(key_module) # noqa: F841
loading_block_key = eval(f"keys.{key_classname}('{loading_block_key_str}')")
except Exception as e:
msg = (
f"{ErrorNumbers.FB615.value} Error deserializing loading block key "
+ f"{loading_block_key_str} with path {key_module}.{key_classname} "
+ f"because of {type(e).__name__}: {e}"
)
logger.debug(msg)
raise FedbiomedDataLoadingPlanError(msg) from e
return loading_block_key
serialize
serialize()
Serializes the class in a format similar to json.
Returns:
| Type | Description |
|---|---|
dict | a dictionary of key-value pairs sufficient for reconstructing the DataLoadingBlock. |
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def serialize(self) -> dict:
"""Serializes the class in a format similar to json.
Returns:
a dictionary of key-value pairs sufficient for reconstructing
the DataLoadingBlock.
"""
return dict(
loading_block_class=self.__class__.__qualname__,
loading_block_module=self.__module__,
dlb_id=self.__serialization_id,
)
DataLoadingPlan
DataLoadingPlan(*args, **kwargs)
Bases: Dict[DataLoadingBlockTypes, DataLoadingBlock]
Customizations to the way the data is loaded and presented for training.
A DataLoadingPlan is a dictionary of {name: DataLoadingBlock} pairs. Each DataLoadingBlock represents a customization to the way data is loaded and presented to the researcher. These customizations are defined by the node, but they operate on a Dataset class, which is defined by the library and instantiated by the researcher.
To exploit this functionality, a Dataset must be modified to accept the customizations provided by the DataLoadingPlan. To simplify this process, we provide the DataLoadingPlanMixin class below.
The DataLoadingPlan class should be instantiated directly; no subclassing is needed. The DataLoadingPlan is a dict, and exposes the same interface as a dict.
Attributes:
| Name | Type | Description |
|---|---|---|
dlp_id | str | a unique plan id (auto-generated) |
desc | str | an optional user-friendly short description |
target_dataset_type | DatasetTypes | the type of dataset targeted by this DataLoadingPlan |
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def __init__(self, *args, **kwargs):
super(DataLoadingPlan, self).__init__(*args, **kwargs)
self.dlp_id = "dlp_" + str(uuid.uuid4())
self.desc = ""
self.target_dataset_type = DatasetTypes.NONE
self._serialization_validation = SerializationValidation()
self._serialization_validation.update_validation_scheme(
SerializationValidation.dlp_default_scheme()
)
Attributes
desc instance-attribute
desc = ''
dlp_id instance-attribute
dlp_id = 'dlp_' + str(uuid4())
target_dataset_type instance-attribute
target_dataset_type = NONE
Functions
deserialize
deserialize(serialized_dlp, serialized_loading_blocks)
Reconstruct the DataLoadingPlan from a serialized version.
Calling this function will clear the contained DataLoadingBlockTypes.
This function cannot be used to "update" or to "append to" a DataLoadingPlan.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
serialized_dlp | dict | a dictionary of data loading plan metadata, as obtained from the first output of the serialize function | required |
serialized_loading_blocks | List[dict] | a list of dictionaries of loading_block metadata, as obtained from the second output of the serialize function | required |
Returns: the self instance
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def deserialize(
self, serialized_dlp: dict, serialized_loading_blocks: List[dict]
) -> TDataLoadingPlan:
"""Reconstruct the DataLoadingPlan][fedbiomed.common.dataloadingplan.DataLoadingPlan] from a serialized version.
!!! warning "Calling this function will *clear* the contained [DataLoadingBlockTypes]."
This function may not be used to "update" nor to "append to"
a [DataLoadingPlan][fedbiomed.common.dataloadingplan.DataLoadingPlan].
Args:
serialized_dlp: a dictionary of data loading plan metadata, as obtained from the first output of the
serialize function
serialized_loading_blocks: a list of dictionaries of loading_block metadata, as obtained from the
second output of the serialize function
Returns:
the self instance
"""
self._serialization_validation.validate(
serialized_dlp, FedbiomedDataLoadingPlanValueError
)
self.clear()
self.dlp_id = serialized_dlp["dlp_id"]
self.desc = serialized_dlp["dlp_name"]
self.target_dataset_type = DatasetTypes(serialized_dlp["target_dataset_type"])
for loading_block_key_str, dlb_id in serialized_dlp["loading_blocks"].items():
key_module, key_classname = serialized_dlp["key_paths"][
loading_block_key_str
]
loading_block_key = DataLoadingBlock.instantiate_key(
key_module, key_classname, loading_block_key_str
)
loading_block = next(
filter(lambda x: x["dlb_id"] == dlb_id, serialized_loading_blocks)
)
dlb = DataLoadingBlock.instantiate_class(loading_block)
self[loading_block_key] = dlb.deserialize(loading_block)
return self
infer_dataset_type staticmethod
infer_dataset_type(dataset)
Infer the type of a given dataset.
This function provides the mapping between a dataset's class and the DatasetTypes enum. If the dataset exposes the correct interface (i.e. the get_dataset_type method) then it directly calls that, otherwise it tries to apply some heuristics to guess the type of dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset | Any | the dataset whose type we want to infer. | required |
Returns: a DatasetTypes enum element which identifies the type of the dataset.
Raises: FedbiomedDataLoadingPlanValueError: if the dataset does not have a get_dataset_type method and moreover the type could not be guessed.
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
@staticmethod
def infer_dataset_type(dataset: Any) -> DatasetTypes:
"""Infer the type of a given dataset.
This function provides the mapping between a dataset's class and the DatasetTypes enum. If the dataset exposes
the correct interface (i.e. the get_dataset_type method) then it directly calls that, otherwise it tries to
apply some heuristics to guess the type of dataset.
Args:
dataset: the dataset whose type we want to infer.
Returns:
a DatasetTypes enum element which identifies the type of the dataset.
Raises:
FedbiomedDataLoadingPlanValueError: if the dataset does not have a `get_dataset_type` method and moreover
the type could not be guessed.
"""
if hasattr(dataset, "get_dataset_type"):
return dataset.get_dataset_type()
elif dataset.__class__.__name__ == "ImageFolder":
# ImageFolder could be both an images type or mednist. Try to identify mednist with some heuristic.
if hasattr(dataset, "classes") and all(
[
x in dataset.classes
for x in [
"AbdomenCT",
"BreastMRI",
"CXR",
"ChestCT",
"Hand",
"HeadCT",
]
]
):
return DatasetTypes.MEDNIST
else:
return DatasetTypes.IMAGES
elif dataset.__class__.__name__ == "MNIST":
return DatasetTypes.DEFAULT
msg = (
f"{ErrorNumbers.FB615.value} Trying to infer dataset type of {dataset} is not supported "
+ f"for datasets of type {dataset.__class__.__qualname__}"
)
logger.debug(msg)
raise FedbiomedDataLoadingPlanValueError(msg)
serialize
serialize()
Serializes the class in a format similar to json.
Returns:
| Type | Description |
|---|---|
Tuple[dict, List] | a tuple sufficient for reconstructing the DataLoadingPlan. It includes: a dictionary of key-value pairs with the DataLoadingPlan parameters, and a list of dicts containing the data for reconstructing all the DataLoadingBlocks of the DataLoadingPlan. |
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def serialize(self) -> Tuple[dict, List]:
"""Serializes the class in a format similar to json.
Returns:
a tuple sufficient for reconstructing the DataLoading plan. It includes:
- a dictionary of key-value pairs with the
[DataLoadingPlan][fedbiomed.common.dataloadingplan.DataLoadingPlan] parameters.
- a list of dict containing the data for reconstruction all the DataLoadingBlock
of the [DataLoadingPlan][fedbiomed.common.dataloadingplan.DataLoadingPlan]
"""
return dict(
dlp_id=self.dlp_id,
dlp_name=self.desc,
target_dataset_type=self.target_dataset_type.value,
loading_blocks={
key.value: dlb.get_serialization_id() for key, dlb in self.items()
},
key_paths={
key.value: (f"{key.__module__}", f"{key.__class__.__qualname__}")
for key in self.keys()
},
), [dlb.serialize() for dlb in self.values()]
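A hedged end-to-end sketch of building, serializing and restoring a plan. The MyLoadingBlockTypes enum and the mapping values are hypothetical, and the import paths are assumptions:

```python
from enum import Enum

from fedbiomed.common.constants import DataLoadingBlockTypes          # assumed import path
from fedbiomed.common.dataloadingplan import DataLoadingPlan, MapperBlock  # assumed import path

class MyLoadingBlockTypes(DataLoadingBlockTypes, Enum):
    """Hypothetical keys identifying the blocks of this plan."""
    FOLDER_MAP = "folder_map"

# Build a plan mapping logical modality names to node-side folder names.
block = MapperBlock()
block.map = {"T1": "T1_weighted"}

dlp = DataLoadingPlan({MyLoadingBlockTypes.FOLDER_MAP: block})
dlp.desc = "Map modality names to folder names"

# serialize returns two objects: the plan metadata and a list of block metadata.
dlp_metadata, blocks_metadata = dlp.serialize()

# Deserialization clears and repopulates a (possibly new) plan instance.
restored = DataLoadingPlan().deserialize(dlp_metadata, blocks_metadata)
assert MyLoadingBlockTypes.FOLDER_MAP in restored
```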
DataLoadingPlanMixin
DataLoadingPlanMixin()
Utility class to enable DLP functionality in a dataset.
Any Dataset class that inherits from DataLoadingPlanMixin will have the basic tools necessary to support a DataLoadingPlan. Typically, the logic of each specific DataLoadingBlock in the DataLoadingPlan will be implemented in the form of hooks that are called within the Dataset's implementation using the helper function apply_dlb defined below.
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def __init__(self):
self._dlp = None
Functions
apply_dlb
apply_dlb(default_ret_value, dlb_key, *args, **kwargs)
Apply one DataLoadingBlock identified by its key.
Note that we want to easily support the case where the DataLoadingPlan is not activated, or the requested loading block is not contained in the DataLoadingPlan. This is achieved by providing a default return value to be returned when the above conditions are met. Hence, most of the calls to apply_dlb will look like this:
value = self.apply_dlb(value, 'my-loading-block', my_apply_args)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
default_ret_value | Any | the value to be returned in case that the dlp functionality is not required | required |
dlb_key | DataLoadingBlockTypes | the key of the DataLoadingBlock to be applied | required |
*args | Optional[Any] | forwarded to the DataLoadingBlock's apply function | () |
**kwargs | Optional[Any] | forwarded to the DataLoadingBlock's apply function | {} |
Returns: the output of the DataLoadingBlock's apply function, or the default_ret_value when dlp is None or it does not contain the requested loading block
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def apply_dlb(
self,
default_ret_value: Any,
dlb_key: DataLoadingBlockTypes,
*args: Optional[Any],
**kwargs: Optional[Any],
) -> Any:
"""Apply one DataLoadingBlock identified by its key.
Note that we want to easily support the case where the DataLoadingPlan
is not activated, or the requested loading block is not contained in the
DataLoadingPlan. This is achieved by providing a default return value
to be returned when the above conditions are met. Hence, most of the
calls to apply_dlb will look like this:
```
value = self.apply_dlb(value, 'my-loading-block', my_apply_args)
```
This will ensure that value is not changed if the DataLoadingPlan is
not active.
Args:
default_ret_value: the value to be returned in case that the dlp
functionality is not required
dlb_key: the key of the DataLoadingBlock to be applied
*args: forwarded to the DataLoadingBlock's apply function
**kwargs: forwarded to the DataLoadingBlock's apply function
Returns:
the output of the DataLoadingBlock's apply function, or
the default_ret_value when dlp is None or it does not contain
the requested loading block
"""
if not isinstance(dlb_key, DataLoadingBlockTypes):
raise FedbiomedDataLoadingPlanValueError(
f"Key {dlb_key} is not of enum type DataLoadingBlockTypes"
f" in DataLoadingPlanMixin.apply_dlb"
)
if self._dlp is not None and dlb_key in self._dlp:
return self._dlp[dlb_key].apply(*args, **kwargs)
else:
return default_ret_value
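A hedged sketch of a dataset enabling this mechanism. Class and method names are illustrative, MyLoadingBlockTypes refers to the hypothetical enum from the plan sketch above, and the import paths are assumptions:

```python
from fedbiomed.common.constants import DatasetTypes                # assumed import path
from fedbiomed.common.dataloadingplan import DataLoadingPlanMixin  # assumed import path

class MyFolderDataset(DataLoadingPlanMixin):
    """Hypothetical dataset whose folder lookup can be customized by a DLP."""

    def __init__(self, root: str):
        super().__init__()   # initializes self._dlp to None
        self.root = root

    @staticmethod
    def get_dataset_type() -> DatasetTypes:
        # exposing this lets DataLoadingPlan.infer_dataset_type avoid heuristics
        return DatasetTypes.IMAGES

    def folder_for(self, modality: str) -> str:
        # fall back to the plain modality name when no DLP (or no such block) is set
        return self.apply_dlb(modality, MyLoadingBlockTypes.FOLDER_MAP, modality)

# Node-side activation: attach the plan built in the earlier sketch.
dataset = MyFolderDataset("/data/center-1")
dataset.set_dlp(dlp)
dataset.folder_for("T1")   # -> "T1_weighted" when the plan is active
```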
clear_dlp
clear_dlp()
Removes the DataLoadingPlan associated with the dataset, if any.
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def clear_dlp(self):
self._dlp = None
set_dlp
set_dlp(dlp)
Sets the dlp if the target dataset type is appropriate
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def set_dlp(self, dlp: DataLoadingPlan):
"""Sets the dlp if the target dataset type is appropriate"""
if not isinstance(dlp, DataLoadingPlan):
msg = (
f"{ErrorNumbers.FB615.value} Trying to set a DataLoadingPlan but the argument is of type "
+ f"{type(dlp).__name__}"
)
logger.debug(msg)
raise FedbiomedDataLoadingPlanValueError(msg)
dataset_type = DataLoadingPlan.infer_dataset_type(
self
) # `self` here will refer to the Dataset instance
if (
dlp.target_dataset_type != DatasetTypes.NONE
and dataset_type != dlp.target_dataset_type
):
raise FedbiomedDataLoadingPlanValueError(
f"Trying to set {dlp} on dataset of type {dataset_type.value} but "
f"the target type is {dlp.target_dataset_type}"
)
elif dlp.target_dataset_type == DatasetTypes.NONE:
dlp.target_dataset_type = dataset_type
self._dlp = dlp
MapperBlock
MapperBlock()
Bases: DataLoadingBlock
A DataLoadingBlock for mapping values.
This DataLoadingBlock can be used whenever an "indirect mapping" is needed. For example, it can be used to implement a correspondence between a set of "logical" abstract names and a set of folder names on the filesystem.
The apply function of this DataLoadingBlock takes a "key" as input (a str) and returns the mapped value corresponding to map[key]. Note that while the constructor of this class sets a value for type_id, developers are recommended to set a more meaningful value that better speaks to their application.
Multiple instances of this loading_block may be used in the same DataLoadingPlan, provided that they are given different type_id via the constructor.
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def __init__(self):
super(MapperBlock, self).__init__()
self.map = {}
self._serialization_validator.update_validation_scheme(
MapperBlock._extra_validation_scheme()
)
Attributes
map instance-attribute
map = {}
Functions
apply
apply(key)
Returns the value mapped to the key, if it exists.
Raises:
| Type | Description |
|---|---|
FedbiomedLoadingBlockError | if map is not a dict or the key does not exist. |
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def apply(self, key):
"""Returns the value mapped to the key, if it exists.
Raises:
FedbiomedLoadingBlockError: if map is not a dict or the key does not exist.
"""
if not isinstance(self.map, dict) or key not in self.map:
msg = f"{ErrorNumbers.FB614.value} Mapper block error: no key '{key}' in mapping dictionary"
logger.debug(msg)
raise FedbiomedLoadingBlockError(msg)
return self.map[key]
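A hedged usage sketch of the mapping behaviour (the mapping values are illustrative and the exception import path is an assumption):

```python
from fedbiomed.common.dataloadingplan import MapperBlock           # assumed import path
from fedbiomed.common.exceptions import FedbiomedLoadingBlockError  # assumed import path

block = MapperBlock()
block.map = {"T1": "T1_weighted", "T2": "T2_weighted"}

block.apply("T1")          # -> "T1_weighted"

try:
    block.apply("FLAIR")   # keys absent from the map raise rather than returning a default
except FedbiomedLoadingBlockError:
    pass
```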
deserialize
deserialize(load_from)
Reconstruct the DataLoadingBlock from a serialized version.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
load_from | dict | a dictionary as obtained by the serialize function. | required |
Returns: the self instance
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def deserialize(self, load_from: dict) -> DataLoadingBlock:
"""Reconstruct the [DataLoadingBlock][fedbiomed.common.dataloadingplan.DataLoadingBlock]
from a serialized version.
Args:
load_from (dict): a dictionary as obtained by the serialize function.
Returns:
the self instance
"""
super(MapperBlock, self).deserialize(load_from)
self.map = load_from["map"]
return self
serialize
serialize()
Serializes the class in a format similar to json.
Returns:
| Type | Description |
|---|---|
dict | a dictionary of key-value pairs sufficient for reconstructing the DataLoadingBlock. |
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def serialize(self) -> dict:
"""Serializes the class in a format similar to json.
Returns:
a dictionary of key-value pairs sufficient for reconstructing
the [DataLoadingBlock][fedbiomed.common.dataloadingplan.DataLoadingBlock].
"""
ret = super(MapperBlock, self).serialize()
ret.update({"map": self.map})
return ret
SerializationValidation
SerializationValidation()
Provide validation capabilities for serializing/deserializing a DataLoadingBlock or DataLoadingPlan.
When a developer inherits from DataLoadingBlock to define a custom loading block, they are required to call the _serialization_validator.update_validation_scheme function with a dictionary argument containing the rules to validate all the additional fields that will be used in the serialization of their loading block.
These rules must follow the syntax explained in the SchemeValidator class.
For example:
class MyLoadingBlock(DataLoadingBlock):
    def __init__(self):
        super().__init__()
        self.my_custom_data = {}
        self._serialization_validator.update_validation_scheme({
            'custom_data': {
                'rules': [dict, ...any other rules],
                'required': True
            }
        })
    def serialize(self):
        serialized = super().serialize()
        serialized.update({'custom_data': self.my_custom_data})
        return serialized
Attributes:
| Name | Type | Description |
|---|---|---|
_validation_scheme | dict | an extensible set of rules to validate the DataLoadingBlock metadata |
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def __init__(self):
self._validation_scheme = {}
Functions
dlb_default_scheme classmethod
dlb_default_scheme()
The dictionary of default validation rules for a serialized DataLoadingBlock.
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
@classmethod
def dlb_default_scheme(cls) -> Dict:
"""The dictionary of default validation rules for a serialized [DataLoadingBlock]."""
return {
"loading_block_class": {
"rules": [str, cls._identifier_validation_hook],
"required": True,
},
"loading_block_module": {
"rules": [str, cls._identifier_validation_hook],
"required": True,
},
"dlb_id": {
"rules": [str, cls._serial_id_validation_hook],
"required": True,
},
}
dlp_default_scheme classmethod
dlp_default_scheme()
The dictionary of default validation rules for a serialized DataLoadingPlan.
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
@classmethod
def dlp_default_scheme(cls) -> Dict:
"""The dictionary of default validation rules for a serialized [DataLoadingPlan]."""
return {
"dlp_id": {
"rules": [str],
"required": True,
},
"dlp_name": {
"rules": [str],
"required": True,
},
"target_dataset_type": {
"rules": [str, cls._target_dataset_type_validator],
"required": True,
},
"loading_blocks": {
"rules": [dict, cls._loading_blocks_types_validator],
"required": True,
},
"key_paths": {"rules": [dict, cls._key_paths_validator], "required": True},
}
update_validation_scheme
update_validation_scheme(new_scheme)
Updates the validation scheme.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
new_scheme | dict | new dict of rules | required |
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def update_validation_scheme(self, new_scheme: dict) -> None:
"""Updates the validation scheme.
Args:
new_scheme: (dict) new dict of rules
"""
self._validation_scheme.update(new_scheme)
validate
validate(dlb_metadata, exception_type, only_required=True)
Validate a dict of dlb_metadata according to the _validation_scheme.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dlb_metadata | Dict | the DataLoadingBlock metadata, as returned by serialize or as loaded from the node database. | required |
exception_type | Type[FedbiomedError] | the type of the exception to be raised when validation fails. | required |
only_required | bool | see SchemeValidator.populate_with_defaults | True |
Raises: exception_type: if the validation fails.
Source code in fedbiomed/common/dataloadingplan/_data_loading_plan.py
def validate(
self,
dlb_metadata: Dict,
exception_type: Type[FedbiomedError],
only_required: bool = True,
) -> None:
"""Validate a dict of dlb_metadata according to the _validation_scheme.
Args:
dlb_metadata (dict) : the [DataLoadingBlock] metadata, as returned by serialize or as loaded from the
node database.
exception_type (Type[FedbiomedError]): the type of the exception to be raised when validation fails.
only_required (bool) : see SchemeValidator.populate_with_defaults
Raises:
exception_type: if the validation fails.
"""
try:
sc = SchemeValidator(self._validation_scheme)
except RuleError as e:
msg = ErrorNumbers.FB614.value + f": {e}"
logger.critical(msg)
raise exception_type(msg) from e
try:
dlb_metadata = sc.populate_with_defaults(
dlb_metadata, only_required=only_required
)
except ValidatorError as e:
msg = ErrorNumbers.FB614.value + f": {e}"
logger.critical(msg)
raise exception_type(msg) from e
try:
sc.validate(dlb_metadata)
except ValidateError as e:
msg = ErrorNumbers.FB614.value + f": {e}"
logger.critical(msg)
raise exception_type(msg) from e
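A hedged sketch of validating block metadata before deserializing it. The exception import path is an assumption, and MapperBlock is reused from above:

```python
from fedbiomed.common.dataloadingplan import MapperBlock, SerializationValidation  # assumed import path
from fedbiomed.common.exceptions import FedbiomedLoadingBlockValueError            # assumed import path

validator = SerializationValidation()
validator.update_validation_scheme(SerializationValidation.dlb_default_scheme())

metadata = MapperBlock().serialize()
metadata.pop("map")   # keep only the fields covered by the default scheme
validator.validate(metadata, FedbiomedLoadingBlockValueError)   # well-formed metadata passes silently

metadata.pop("dlb_id")
try:
    validator.validate(metadata, FedbiomedLoadingBlockValueError)
except FedbiomedLoadingBlockValueError:
    pass   # a missing required field fails validation with the caller-supplied exception type
```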