fsml.learn package
Subpackages
Submodules
fsml.learn.config module
fsml.learn.reverse module
Train and test the inverse problem, i.e., given the mean and variance, predict the values of the parameters. This problem is harder than the previous one because of the presence of outliers. For this reason a more robust approach is used: RandomForest.
- fsml.learn.reverse.get_ranges(in_number: Any) List[int]
Given an integer it returns [i / 2, i, i + i / 2]
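As an illustration, the behaviour described above might look like the following minimal sketch. It assumes floor division, since the declared return type is List[int]; the name get_ranges_sketch is hypothetical and not part of the package.

```python
from typing import List


def get_ranges_sketch(in_number: int) -> List[int]:
    """Return [i / 2, i, i + i / 2] for an integer i.

    Assumption: halves are computed with floor division so that
    the result stays a list of integers.
    """
    half = in_number // 2
    return [half, in_number, in_number + half]


print(get_ranges_sketch(100))  # [50, 100, 150]
```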
- fsml.learn.reverse.perform_grid_search(params: Dict[str, Any], train_x: List[List[float]], train_y: List[List[float]], test_x: List[List[float]], test_y: List[List[float]]) Tuple[float, RandomForestRegressor]
Perform a grid search starting from the results obtained with the random search, and compare the resulting accuracy with the one given as input.
- Parameters:
params – The parameters from RandomSearch
random_acc – the input accuracy
- Returns:
A tuple with the accuracy and the best estimator
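One plausible way to turn random-search results into a grid is to expand each integer hyperparameter into a small range around its value. The sketch below assumes this is done with the [i / 2, i, i + i / 2] rule described for get_ranges; the helper names and the example parameters are hypothetical, and the actual implementation may differ.

```python
from itertools import product
from typing import Any, Dict, List


def build_grid(params: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Expand each integer value v into [v // 2, v, v + v // 2].

    Non-integer values (and booleans) are kept as single-element lists.
    """
    grid: Dict[str, List[Any]] = {}
    for name, value in params.items():
        # bool is a subclass of int, so it is excluded explicitly
        if isinstance(value, int) and not isinstance(value, bool):
            half = value // 2
            grid[name] = [half, value, value + half]
        else:
            grid[name] = [value]
    return grid


def enumerate_grid(grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """List every combination of values in the grid."""
    names = list(grid)
    combos = product(*(grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# Hypothetical output of a previous random search
random_params = {"n_estimators": 200, "max_depth": 10, "bootstrap": True}
grid = build_grid(random_params)
candidates = enumerate_grid(grid)
print(len(candidates))  # 9 combinations: 3 * 3 * 1
```

A dictionary of lists such as grid has the shape that scikit-learn's GridSearchCV accepts as its param_grid argument.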
- fsml.learn.reverse.perform_random_search(train_x: List[List[float]], train_y: List[List[float]], test_x: List[List[float]], test_y: List[List[float]]) Tuple[float, Dict[str, Any]]
Perform Random Search
- Parameters:
train_x – the input features of the train set
train_y – the ground truth of the train set
test_x – the input features of the test set
test_y – the ground truth of the test set
- Returns:
A tuple with the accuracy and the best parameters found
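The general idea of random search can be sketched without any library: sample candidate configurations at random, score each one, and keep the best. The search space, the toy scoring function, and all names below are hypothetical; the real function presumably scores a RandomForest regressor on the test set instead.

```python
import random
from typing import Any, Callable, Dict, List, Tuple


def random_search_sketch(
    space: Dict[str, List[Any]],
    score_fn: Callable[[Dict[str, Any]], float],
    n_iter: int = 20,
    seed: int = 0,
) -> Tuple[float, Dict[str, Any]]:
    """Sample n_iter random configurations and return (best score, best params)."""
    rng = random.Random(seed)
    best_score = float("-inf")
    best_params: Dict[str, Any] = {}
    for _ in range(n_iter):
        # Draw one value per hyperparameter, independently
        candidate = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(candidate)
        if score > best_score:
            best_score, best_params = score, candidate
    return best_score, best_params


# Hypothetical search space and toy score (higher is better)
space = {"n_estimators": [50, 100, 200, 400], "max_depth": [5, 10, 20]}


def toy_score(p: Dict[str, Any]) -> float:
    return -abs(p["n_estimators"] - 200) / 200 - abs(p["max_depth"] - 10) / 10


best_score, best_params = random_search_sketch(space, toy_score, n_iter=50)
print(best_score, best_params)
```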
- fsml.learn.reverse.save_model(estimator: RandomForestRegressor, filepath: str) None
Save the input model into the input file path
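A minimal sketch of persisting an estimator to a file path, assuming the standard pickle module is used; the package may rely on a different serializer. The helper names are hypothetical, and a plain dictionary stands in for a trained RandomForestRegressor.

```python
import os
import pickle
import tempfile
from pathlib import Path
from typing import Any


def save_model_sketch(estimator: Any, filepath: str) -> None:
    """Serialize the estimator to filepath, creating parent folders if needed."""
    path = Path(filepath)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(estimator, fh)


def load_model_sketch(filepath: str) -> Any:
    """Load an object previously written by save_model_sketch."""
    with open(filepath, "rb") as fh:
        return pickle.load(fh)


# Stand-in object; a fitted estimator would be saved the same way
stand_in = {"n_estimators": 200, "max_depth": 10}
with tempfile.TemporaryDirectory() as tmp_dir:
    model_file = os.path.join(tmp_dir, "models", "reverse.pkl")
    save_model_sketch(stand_in, model_file)
    restored = load_model_sketch(model_file)
```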
- fsml.learn.reverse.split(columns: Iterable[str]) Tuple[List[str], List[str]]
Split the input columns into the parameter names and the variable names
- fsml.learn.reverse.train_and_test(data_file: str, random_search: bool = False, grid_search: bool = False) None
Train and test a RandomForest regressor. The regressor can either be a default one, defined in the config file, or the best estimator obtained by running first Random Search and then Grid Search, or just Grid Search with a set of default possibilities. Finally, the model is saved to the default model path.
- Parameters:
data_file – the file with the dataset
random_search – True to apply random search
grid_search – True to apply Grid Search
- Returns:
fsml.learn.test module
- class fsml.learn.test.Tester(model_path: str, test_dataloader: FSMLDataLoader, **kwargs)
Bases: object
Just a class for testing
- test() float
Just run the test and return the final accuracy
- fsml.learn.test.test(paths_and_dataloaders: List[Tuple[str, FSMLDataLoader]], **kwargs) List[float]
Run the testing phase against some pre-trained models.
- Parameters:
paths_and_dataloaders – a list of tuples (model_path, dataloader)
num_hidden_input – The number of hidden layers on the input side
num_hidden_output – The number of hidden layers on the output side
hidden_input_size – The number of neurons in each input hidden layer
hidden_output_size – The number of neurons in each output hidden layer
- Returns:
The list with all the accuracies
fsml.learn.train module
- class fsml.learn.train.KFoldCrossValidationWrapper
Bases: object
A simple wrapper class for K-Fold Cross Validation
- static kFoldValidation(dataset: FSMLOneMeanStdDataset, model: FSML_MLP_Predictor, epoch: int, kf_split: int, batch_size: int, use: bool = True) Callable
Run kFold Cross Validation
- static setup_kFold_validation(dataset: FSMLOneMeanStdDataset, kf_split: int, batch_size: int) List[Tuple[int, FSMLDataLoader]]
Set up the k-fold validation, i.e., return a list of triples (fold index, train dataloader, validation dataloader)
- Parameters:
dataset – The dataset to split
kf_split – The total number of splits
batch_size – the batch_size argument to the dataloader
- Returns:
a list of triples (fold index, train dataloader, validation dataloader)
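The splitting step behind k-fold validation can be sketched with plain index lists. This is an illustration only: the function name is hypothetical, no shuffling is applied, and in the package each index list would presumably back a dataloader rather than be returned directly.

```python
from typing import List, Tuple


def k_fold_indices(
    n_samples: int, kf_split: int
) -> List[Tuple[int, List[int], List[int]]]:
    """Return triples (fold index, train indices, validation indices)."""
    indices = list(range(n_samples))
    # Earlier folds get one extra sample when n_samples is not divisible
    fold_sizes = [
        n_samples // kf_split + (1 if i < n_samples % kf_split else 0)
        for i in range(kf_split)
    ]
    folds = []
    start = 0
    for fold_index, size in enumerate(fold_sizes):
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((fold_index, train_idx, val_idx))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, kf_split=3)
for fold_index, train_idx, val_idx in folds:
    print(fold_index, len(train_idx), len(val_idx))
```

Each sample appears in exactly one validation fold, and the train and validation indices of a fold never overlap.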
- class fsml.learn.train.Trainer(train_dataset: FSMLOneMeanStdDataset, train_dataloader: FSMLDataLoader, optimizer: Optimizer, model: Module, lr_scheduler: _LRScheduler, grad_clip: int, num_epochs: int = 30, criterion: _Loss | _WeightedLoss = <class 'torch.nn.modules.loss.MSELoss'>, k_fold: int = 5, accuracy_threshold: float = 0.94, imgs_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/moments-learning/checkouts/latest/docs/img', model_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/moments-learning/checkouts/latest/docs/models')
Bases: object
Just a class for training
- plot() None
Plot the result (train loss and train acc over epochs)
- run() str
Run the training and return the path of the model
- fsml.learn.train.train(path: str, criterion: _Loss | _WeightedLoss = <class 'torch.nn.modules.loss.MSELoss'>, batch_size: int = 32, lr: float = 0.0001, k_fold: int = 3, num_epochs: int = 250, accuracy_threshold: float = 2.0, patience: float = 10, min_lr: float = 1e-06, grad_clip: float = 5, factor: float = 0.5, mode: str = 'min', **kwargs) List[Tuple[str, FSMLDataLoader]]
Run the training. The input path can be either a path to a CSV file that contains the dataset, or a path to a folder with multiple CSV files (i.e., multiple datasets). In the first case the training procedure is run only for that file; in the second case it is run once for each file in the folder.
- Parameters:
path – the path to a CSV file or CSV folder
criterion – the PyTorch type of loss (or a custom one)
batch_size – the batch size for the dataloader
lr – initial learning rate
k_fold – number of folds for the cross-validation
num_epochs – The total number of epochs
accuracy_threshold – Stop when the current accuracy exceeds this value
patience – Number of epochs with no improvement after which learning rate will be reduced
min_lr – A lower bound on the learning rate of all param groups
grad_clip – the gradient clipping value
factor – Factor by which the learning rate will be reduced
mode – The mode in which the scheduler reduces the learning rate
- Returns:
A list of tuples (model_path, dataloader)
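The descriptions of patience, factor, min_lr and mode resemble those of PyTorch's ReduceLROnPlateau scheduler. Assuming that scheduler (or something similar) is used, the interplay of these parameters for mode='min' can be sketched in plain Python. This is a simplified illustration that ignores the improvement threshold and cooldown options; the function name is hypothetical.

```python
from typing import List


def reduce_on_plateau(
    losses: List[float],
    lr: float = 1e-4,
    patience: int = 10,
    factor: float = 0.5,
    min_lr: float = 1e-6,
) -> List[float]:
    """Return the learning rate in effect after each epoch (mode='min')."""
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in losses:
        if loss < best:
            # Improvement: remember it and reset the counter
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            # Too many epochs without improvement: shrink lr, bounded below
            lr = max(lr * factor, min_lr)
            bad_epochs = 0
        history.append(lr)
    return history


# The loss improves once, then stalls for six epochs
history = reduce_on_plateau([1.0, 0.9] + [0.9] * 6, lr=1e-4, patience=2)
print(history)
```

With patience=2, the learning rate is halved on the third consecutive epoch without improvement, and the counter then restarts.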