mlreflect.training package
Submodules
mlreflect.training.footprint module
- class mlreflect.training.footprint.FootprintRescaler(reflectivity: numpy.ndarray, true_ratio: float, errors: list, q: Optional[numpy.ndarray] = None, wavelength: Optional[float] = None, theta: Optional[numpy.ndarray] = None)[source]
Bases: object
Returns reflectivity curves with rescaled (“incorrect”) footprint for standard (specular) XRR geometries.
- Parameters
reflectivity – Reflected intensity values of one curve (1D) or several curves (2D).
true_ratio – Assumed “true” ratio of beam width to sample length used to produce the fictional footprint on the curves. E.g. for a beam width of 200 microns and a sample length of 10 mm, true_ratio = 0.02.
errors – List of error factors that true_ratio can be multiplied by to produce the new (“incorrect”) footprint corrections.
q – ndarray of q-values corresponding to the provided intensity matrix. If this value is provided, the wavelength has to be specified as well.
wavelength – Fictional wavelength in Angstroms that is used to calculate the scattering angle.
theta – ndarray of angle values corresponding to the provided intensity matrix. If this is provided, q and wavelength do not need to be provided.
- static apply_footprint(intensity: numpy.ndarray, scattering_angle: numpy.ndarray, ratio: float) numpy.ndarray [source]
- static correct_footprint(intensity: numpy.ndarray, scattering_angle: numpy.ndarray, ratio: float) numpy.ndarray [source]
- static normalize_to_first(rescaled_intensity: numpy.ndarray, original_intensity: numpy.ndarray)[source]
- static normalize_to_max(rescaled_intensity: numpy.ndarray, original_intensity: numpy.ndarray)[source]
- property rescaled_reflectivity
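The footprint rescaling described for FootprintRescaler can be illustrated with a small NumPy sketch. It is not the library's implementation: the function names (q_to_theta, footprint_factor, apply_footprint_sketch, correct_footprint_sketch) are hypothetical, and the geometric model (illuminated fraction sin(theta) / ratio, capped at 1) is an assumption about a standard specular geometry.

```python
import numpy as np


def q_to_theta(q, wavelength):
    """Convert momentum transfer q (1/Angstrom) to incidence angle theta (radians).

    Uses the specular relation q = 4*pi*sin(theta)/wavelength.
    """
    return np.arcsin(np.asarray(q) * wavelength / (4 * np.pi))


def footprint_factor(theta, ratio):
    """Fraction of the beam that hits the sample (assumed geometric model).

    ratio = beam width / sample length; the illuminated fraction is
    sin(theta) / ratio, capped at 1 once the footprint fits on the sample.
    """
    return np.minimum(1.0, np.sin(theta) / ratio)


def apply_footprint_sketch(intensity, theta, ratio):
    # Reduce intensity at small angles, where part of the beam misses the sample.
    return intensity * footprint_factor(theta, ratio)


def correct_footprint_sketch(intensity, theta, ratio):
    # Inverse of apply_footprint_sketch; theta must be > 0 to avoid dividing by 0.
    return intensity / footprint_factor(theta, ratio)


q = np.linspace(0.01, 0.4, 8)           # 1/Angstrom
theta = q_to_theta(q, wavelength=1.0)   # fictional wavelength in Angstroms
reflectivity = np.ones_like(q)          # flat dummy curve

true_ratio = 0.02                       # 200 micron beam on a 10 mm sample
error = 1.2                             # assume the ratio is off by 20 %

# Imprint the "true" footprint, then correct with the wrong ratio.
measured = apply_footprint_sketch(reflectivity, theta, true_ratio)
rescaled = correct_footprint_sketch(measured, theta, true_ratio * error)
```

In this sketch the low-angle points of rescaled are off by the error factor, while points at angles where the footprint already fits on the sample are unchanged.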
mlreflect.training.noise_generator module
- class mlreflect.training.noise_generator.BaseGenerator(reflectivity, labels, batch_functions, batch_size=32, shuffle=True)[source]
Bases: tensorflow.python.keras.utils.data_utils.Sequence
- class mlreflect.training.noise_generator.NoiseGenerator(reflectivity: numpy.ndarray, labels: numpy.ndarray, input_preprocessor: mlreflect.training.preprocessing.InputPreprocessor, batch_size=32, shuffle=True, mode='single', noise_range=None, background_range=None, relative_background_spread: float = 0.1)[source]
Bases: mlreflect.training.noise_generator.BaseGenerator
Generator object that returns a standardized batch of reflectivity and labels with random noise and background.
- Parameters
reflectivity – Training reflectivity curves
labels – Training labels in the same order as reflectivity
input_preprocessor – InputPreprocessor object with or without stored standardization values
batch_size – Number of samples per mini batch
shuffle – If True, shuffles reflectivity and labels after every epoch
noise_range – Tuple (min, max) between which the shot noise levels are randomly generated
background_range – Tuple (min, max) between which the background levels are randomly generated
mode – ‘single’: random noise and background levels are generated for every curve of a mini batch. ‘batch’: random noise and background levels are generated once for each mini batch.
relative_background_spread – Relative standard deviation of the normal distribution (e.g. a value of 0.1 means the standard deviation is 10% of the mean)
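The difference between the ‘single’ and ‘batch’ modes can be sketched as follows. This is an illustration only: draw_levels and add_noise_and_background are hypothetical helpers, and the noise model (Gaussian scatter proportional to the signal, plus a normally distributed background) is an assumption rather than the generator's actual code.

```python
import numpy as np


def draw_levels(rng, level_range, n_curves, mode):
    """Draw one level per curve ('single') or one shared level ('batch')."""
    low, high = level_range
    if mode == "single":
        return rng.uniform(low, high, size=(n_curves, 1))
    if mode == "batch":
        return np.full((n_curves, 1), rng.uniform(low, high))
    raise ValueError(f"unknown mode: {mode!r}")


def add_noise_and_background(batch, rng, noise_range, background_range,
                             mode="single", relative_background_spread=0.1):
    """Return a noisy copy of a (n_curves, n_points) reflectivity batch."""
    n_curves = batch.shape[0]
    noise_level = draw_levels(rng, noise_range, n_curves, mode)
    bg_mean = draw_levels(rng, background_range, n_curves, mode)

    # Assumed noise model: Gaussian scatter proportional to the signal.
    noisy = batch * (1 + noise_level * rng.standard_normal(batch.shape))
    # Background per point: normal around bg_mean with a relative spread.
    background = rng.normal(bg_mean, relative_background_spread * bg_mean,
                            size=batch.shape)
    return noisy + background, noise_level, bg_mean


rng = np.random.default_rng(0)
curves = np.tile(np.logspace(0, -6, 50), (4, 1))  # four identical dummy curves

noisy_batch, level_batch, bg_batch = add_noise_and_background(
    curves, rng, noise_range=(0.01, 0.1), background_range=(1e-8, 1e-6),
    mode="batch")
noisy_single, level_single, bg_single = add_noise_and_background(
    curves, rng, noise_range=(0.01, 0.1), background_range=(1e-8, 1e-6),
    mode="single")
```

In ‘batch’ mode every curve of the mini batch shares one noise level and one background mean; in ‘single’ mode each curve gets its own.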
- class mlreflect.training.noise_generator.NoiseGeneratorLog(reflectivity, labels, batch_size=32, mode='single', shuffle=True, noise_range=None, background_range=None, relative_background_spread: float = 0.1)[source]
mlreflect.training.prediction module
- class mlreflect.training.prediction.Prediction(model_path: str, label_names: List[str])[source]
Bases: object
- mean_absolute_error(predicted_labels: Union[pandas.core.frame.DataFrame, numpy.ndarray], test_labels: Union[pandas.core.frame.DataFrame, numpy.ndarray])[source]
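The method above is listed without a description. As a rough illustration of what a mean absolute error over label arrays computes, the following sketch averages the absolute deviation per label column. The function name and the choice to average per column are assumptions; the actual return format of Prediction.mean_absolute_error is not stated here.

```python
import numpy as np


def mean_absolute_error_sketch(predicted_labels, test_labels):
    """Mean absolute error per label column (rows are samples)."""
    predicted = np.asarray(predicted_labels, dtype=float)
    expected = np.asarray(test_labels, dtype=float)
    if predicted.shape != expected.shape:
        raise ValueError("predicted and test labels must have the same shape")
    # Average the absolute deviations over the sample axis.
    return np.mean(np.abs(predicted - expected), axis=0)


# Two samples with two hypothetical labels each (e.g. thickness, roughness).
predicted = [[100.0, 2.0], [110.0, 2.5]]
expected = [[104.0, 2.0], [108.0, 3.0]]
mae = mean_absolute_error_sketch(predicted, expected)
```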
mlreflect.training.preprocessing module
- class mlreflect.training.preprocessing.InputPreprocessor[source]
Bases: object
Allows standardization while storing mean and standard deviation for later use.
- Returns
InputPreprocessor
- property has_saved_standardization
- reset_mean_and_std()[source]
Resets previously stored mean and standard deviation for standardization.
- property standard_mean
- property standard_std
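The idea of standardizing while storing the mean and standard deviation for later use can be sketched with a small stand-in class. StandardizerSketch and its standardize method are hypothetical names chosen for illustration; only has_saved_standardization and reset_mean_and_std mirror names listed above, and their behaviour here is an assumption.

```python
import numpy as np


class StandardizerSketch:
    """Standardizes data and remembers the statistics of the first call."""

    def __init__(self):
        self._mean = None
        self._std = None

    @property
    def has_saved_standardization(self):
        return self._mean is not None and self._std is not None

    def standardize(self, data):
        data = np.asarray(data, dtype=float)
        if not self.has_saved_standardization:
            # First call: compute and store per-column statistics.
            self._mean = data.mean(axis=0)
            self._std = data.std(axis=0)
        # Later calls reuse the stored values, e.g. for test data.
        # Note: a column with zero spread would divide by zero here.
        return (data - self._mean) / self._std

    def reset_mean_and_std(self):
        self._mean = None
        self._std = None


preprocessor = StandardizerSketch()
train = np.array([[1.0, 10.0], [3.0, 30.0]])
train_std = preprocessor.standardize(train)            # stores mean and std
test_std = preprocessor.standardize([[2.0, 20.0]])     # reuses stored values
```

Reusing the stored statistics keeps training and test data on the same scale; calling reset_mean_and_std before standardizing a new training set discards the old values.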
- class mlreflect.training.preprocessing.OutputPreprocessor(sample: mlreflect.data_generation.multilayer.MultilayerStructure, normalization: str = 'min_to_zero')[source]
Bases: object
Class for preprocessing reflectivity labels for training and validation.
- Parameters
sample – MultilayerStructure object where the sample layers and their names and parameter ranges are defined.
normalization – Defines how the output labels are normalized. “min_to_zero” (default): shifts the minimum value to 0 and scales the maximum value to 1. “absolute_max”: scales the absolute maximum value to 1.
- Returns
OutputPreprocessor
- add_constant_labels(predicted_labels_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]
Adds all labels in constant_labels to predicted_labels_df.
- property all_label_names
- property all_label_parameters
- apply_preprocessing(labels: Union[pandas.core.frame.DataFrame, numpy.ndarray]) Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] [source]
Removes all constant labels and applies normalization to the non-constant labels.
- Parameters
labels – Pandas DataFrame or ndarray of randomly generated labels.
- Returns
normalized_labels: DataFrame
constant_labels: DataFrame
- property constant_labels
- normalize_labels(label_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]
Normalizes all constant labels and returns normalized DataFrame.
- property number_of_labels
- property number_of_layers
- remove_labels(label_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]
Removes labels in constant_labels from label_df and returns DataFrame.
- renormalize_labels(label_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]
Removes normalization from all labels in label_df.
- restore_labels(predicted_labels: Union[pandas.core.frame.DataFrame, numpy.ndarray]) pandas.core.frame.DataFrame [source]
Takes the predicted labels, reverts the normalization, adds the constant labels, and returns the result as a DataFrame.
- property used_labels
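The two normalization options and the handling of constant labels can be illustrated in plain Python. The label names, the ranges, and the helpers split_constant, normalize and renormalize are hypothetical; treating a label as constant when its minimum equals its maximum is an assumption made for this sketch.

```python
import math

# Hypothetical parameter ranges (min, max) for three labels.
ranges = {
    "thickness": (20.0, 300.0),
    "roughness": (0.0, 60.0),
    "sld": (10.0, 10.0),        # min == max: treated as a constant label
}


def split_constant(label_ranges):
    """Separate labels that vary from labels fixed to a single value."""
    used = {k: r for k, r in label_ranges.items() if r[0] != r[1]}
    constant = {k: r[0] for k, r in label_ranges.items() if r[0] == r[1]}
    return used, constant


def normalize(value, label_range, normalization="min_to_zero"):
    low, high = label_range
    if normalization == "min_to_zero":
        # Shift the minimum to 0 and scale the maximum to 1.
        return (value - low) / (high - low)
    if normalization == "absolute_max":
        # Scale so that the largest absolute bound maps to 1.
        return value / max(abs(low), abs(high))
    raise ValueError(f"unknown normalization: {normalization!r}")


def renormalize(value, label_range, normalization="min_to_zero"):
    """Invert normalize() to recover the label in physical units."""
    low, high = label_range
    if normalization == "min_to_zero":
        return value * (high - low) + low
    if normalization == "absolute_max":
        return value * max(abs(low), abs(high))
    raise ValueError(f"unknown normalization: {normalization!r}")


used, constant = split_constant(ranges)
half = normalize(160.0, used["thickness"])                     # min_to_zero
scaled = normalize(160.0, used["thickness"], "absolute_max")   # absolute_max
restored = renormalize(half, used["thickness"])
```

A restore step in this picture would renormalize the predicted values and then add the entries of constant back in.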
mlreflect.training.training module
- class mlreflect.training.training.Trainer(sample_structure: mlreflect.data_generation.multilayer.MultilayerStructure, q_values: numpy.ndarray, random_seed=None)[source]
Bases: object
Train a neural network model for a given sample structure and q values.
- Parameters
sample_structure – MultilayerStructure object that describes the sample. Should only have one non-constant layer.
q_values – ndarray of the q values used for training. Should be similar to the experimental q values.
random_seed – random seed for the training data generation. None means the seed is chosen randomly.
- generate_training_data(training_samples: int = 131072)[source]
Generate a training data set for training the neural network.
- property has_training_data
- train(n_epochs=175, batch_size=512, verbose=1, val_split=0.2) Tuple[mlreflect.models.trained_model.TrainedModel, History] [source]
Train a fully-connected neural network with the generated training data.
- Parameters
n_epochs – Number of epochs to train for.
batch_size – Number of curves per training batch. Must be smaller than val_split times the training set size.
verbose – Determines the amount of text output during training (0, 1, 2).
val_split – The fraction of the training set that is withheld for validation.
- Returns
trained_model: TrainedModel object that contains the trained keras model as well as other parameters necessary to predict test data.
history: Training history output from keras model.fit().
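The constraint on batch_size can be checked with simple arithmetic before training. The helpers below are hypothetical and not part of mlreflect; flooring the training part of the split is an assumption about how the validation split is rounded, so the sizes may differ by one sample from what the library actually uses.

```python
import math


def split_sizes(training_samples, val_split):
    """Return (n_train, n_val) for a given validation fraction."""
    # Assumed rounding: floor the training part, the rest is validation.
    n_train = math.floor(training_samples * (1 - val_split))
    return n_train, training_samples - n_train


def batch_size_is_valid(training_samples, val_split, batch_size):
    """Check that batch_size is smaller than the validation set size."""
    _, n_val = split_sizes(training_samples, val_split)
    return batch_size < n_val


# Default values from generate_training_data() and train().
n_train, n_val = split_sizes(131072, 0.2)
defaults_ok = batch_size_is_valid(131072, 0.2, 512)

# A small data set for which the default batch size would be too large.
small_ok = batch_size_is_valid(1000, 0.2, 512)
```

With the default values, roughly 26,000 curves are withheld for validation, so a batch size of 512 satisfies the constraint; with only 1000 training samples it does not.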
Module contents
- class mlreflect.training.InputPreprocessor[source]
Bases: object
Allows standardization while storing mean and standard deviation for later use.
- Returns
InputPreprocessor
- property has_saved_standardization
- reset_mean_and_std()[source]
Resets previously stored mean and standard deviation for standardization.
- property standard_mean
- property standard_std
- class mlreflect.training.NoiseGenerator(reflectivity: numpy.ndarray, labels: numpy.ndarray, input_preprocessor: mlreflect.training.preprocessing.InputPreprocessor, batch_size=32, shuffle=True, mode='single', noise_range=None, background_range=None, relative_background_spread: float = 0.1)[source]
Bases: mlreflect.training.noise_generator.BaseGenerator
Generator object that returns a standardized batch of reflectivity and labels with random noise and background.
- Parameters
reflectivity – Training reflectivity curves
labels – Training labels in the same order as reflectivity
input_preprocessor – InputPreprocessor object with or without stored standardization values
batch_size – Number of samples per mini batch
shuffle – If True, shuffles reflectivity and labels after every epoch
noise_range – Tuple (min, max) between which the shot noise levels are randomly generated
background_range – Tuple (min, max) between which the background levels are randomly generated
mode – ‘single’: random noise and background levels are generated for every curve of a mini batch. ‘batch’: random noise and background levels are generated once for each mini batch.
relative_background_spread – Relative standard deviation of the normal distribution (e.g. a value of 0.1 means the standard deviation is 10% of the mean)
- class mlreflect.training.NoiseGeneratorLog(reflectivity, labels, batch_size=32, mode='single', shuffle=True, noise_range=None, background_range=None, relative_background_spread: float = 0.1)[source]
- class mlreflect.training.OutputPreprocessor(sample: mlreflect.data_generation.multilayer.MultilayerStructure, normalization: str = 'min_to_zero')[source]
Bases: object
Class for preprocessing reflectivity labels for training and validation.
- Parameters
sample – MultilayerStructure object where the sample layers and their names and parameter ranges are defined.
normalization – Defines how the output labels are normalized. “min_to_zero” (default): shifts the minimum value to 0 and scales the maximum value to 1. “absolute_max”: scales the absolute maximum value to 1.
- Returns
OutputPreprocessor
- add_constant_labels(predicted_labels_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]
Adds all labels in constant_labels to predicted_labels_df.
- property all_label_names
- property all_label_parameters
- apply_preprocessing(labels: Union[pandas.core.frame.DataFrame, numpy.ndarray]) Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] [source]
Removes all constant labels and applies normalization to the non-constant labels.
- Parameters
labels – Pandas DataFrame or ndarray of randomly generated labels.
- Returns
normalized_labels: DataFrame
constant_labels: DataFrame
- property constant_labels
- normalize_labels(label_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]
Normalizes all constant labels and returns normalized DataFrame.
- property number_of_labels
- property number_of_layers
- remove_labels(label_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]
Removes labels in constant_labels from label_df and returns DataFrame.
- renormalize_labels(label_df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame [source]
Removes normalization from all labels in label_df.
- restore_labels(predicted_labels: Union[pandas.core.frame.DataFrame, numpy.ndarray]) pandas.core.frame.DataFrame [source]
Takes the predicted labels, reverts the normalization, adds the constant labels, and returns the result as a DataFrame.
- property used_labels
- class mlreflect.training.Prediction(model_path: str, label_names: List[str])[source]
Bases: object
- mean_absolute_error(predicted_labels: Union[pandas.core.frame.DataFrame, numpy.ndarray], test_labels: Union[pandas.core.frame.DataFrame, numpy.ndarray])[source]
- class mlreflect.training.Trainer(sample_structure: mlreflect.data_generation.multilayer.MultilayerStructure, q_values: numpy.ndarray, random_seed=None)[source]
Bases: object
Train a neural network model for a given sample structure and q values.
- Parameters
sample_structure – MultilayerStructure object that describes the sample. Should only have one non-constant layer.
q_values – ndarray of the q values used for training. Should be similar to the experimental q values.
random_seed – random seed for the training data generation. None means the seed is chosen randomly.
- generate_training_data(training_samples: int = 131072)[source]
Generate a training data set for training the neural network.
- property has_training_data
- train(n_epochs=175, batch_size=512, verbose=1, val_split=0.2) Tuple[mlreflect.models.trained_model.TrainedModel, History] [source]
Train a fully-connected neural network with the generated training data.
- Parameters
n_epochs – Number of epochs to train for.
batch_size – Number of curves per training batch. Must be smaller than val_split times the training set size.
verbose – Determines the amount of text output during training (0, 1, 2).
val_split – The fraction of the training set that is withheld for validation.
- Returns
trained_model: TrainedModel object that contains the trained keras model as well as other parameters necessary to predict test data.
history: Training history output from keras model.fit().