logai.algorithms.nn_model.forecast_nn package
Submodules
logai.algorithms.nn_model.forecast_nn.base_nn module
- class logai.algorithms.nn_model.forecast_nn.base_nn.Embedder(vocab_size: int, embedding_dim: int, pretrain_matrix: array | None = None, freeze: bool = False)
Bases:
Module
Learnable embedder for embedding loglines.
- Parameters:
vocab_size – vocabulary size.
embedding_dim – embedding dimension.
pretrain_matrix – torch.Tensor object containing the pretrained embedding of the vocabulary tokens.
freeze – Freeze embeddings to pretrained ones if set to True, otherwise makes the embeddings learnable.
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class logai.algorithms.nn_model.forecast_nn.base_nn.ForecastBasedNN(config: ForecastBasedNNParams)
Bases:
Module
Model for learning log representations through a forecasting based self-supervised task.
- Parameters:
config – ForecastBasedNNParams config class for parameters of forecasting based neural log representation models.
- fit(train_loader: DataLoader, dev_loader: DataLoader | None = None)
Fit method for training model
- Parameters:
train_loader – dataloader (torch.utils.data.DataLoader) for the train dataset.
dev_loader – dataloader (torch.utils.data.DataLoader) for the dev dataset. Defaults to None, in which case no evaluation is run.
- Returns:
dict containing the best loss on dev dataset.
- load_model(model_save_file: str = '')
Loading model from file.
- Parameters:
model_save_file – path to the file from which the model is loaded.
- predict(test_loader: DataLoader, dtype: str = 'test')
Predict method on test data.
- Parameters:
test_loader – dataloader (torch.utils.data.DataLoader) for test (or development) dataset.
dtype – can be of type “test” or “dev” based on which the predict method is called for.
- Returns:
dict object containing the overall evaluation metrics for test (or dev) data.
- save_model()
Saving model to file as specified in config
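The fit method reports the best loss on the dev dataset, and the patience setting in ForecastBasedNNParams controls early stopping. The following is a minimal sketch of that pattern in plain Python, not the library's training loop: the helper name train_with_patience and the list of dev losses are hypothetical, and the sketch assumes one evaluation per step and that a lower loss counts as an improvement.

```python
from typing import Dict, Iterable


def train_with_patience(dev_losses: Iterable[float], patience: int = 5) -> Dict[str, float]:
    """Hypothetical helper: consume one dev loss per evaluation and stop
    once `patience` consecutive evaluations bring no improvement."""
    best_loss = float("inf")
    stale_evals = 0  # evaluations since the last improvement
    evals_run = 0
    for loss in dev_losses:
        evals_run += 1
        if loss < best_loss:
            best_loss = loss
            stale_evals = 0
        else:
            stale_evals += 1
            if stale_evals >= patience:
                break  # early stopping
    return {"best_loss": best_loss, "evals_run": evals_run}


# Illustrative dev losses only; these are not measured results.
result = train_with_patience([0.9, 0.7, 0.8, 0.75, 0.71, 0.6], patience=3)
```

With patience=3, the loop stops after three evaluations fail to beat 0.7, so the final value 0.6 is never reached.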
- class logai.algorithms.nn_model.forecast_nn.base_nn.ForecastBasedNNParams(model_name: str | None = None, metadata_filepath: str | None = None, output_dir: str | None = None, feature_type: str = '', label_type: str = '', eval_type: str = 'session', topk: int = 10, embedding_dim: int = 100, hidden_size: int = 100, freeze: bool = False, gpu: int | None = None, patience: int = 5, num_train_epochs: int = 100, batch_size: int = 1024, learning_rate: int = 0.0001)
Bases:
Config
Config for neural representation learning for logs using forecasting based self-supervised tasks.
- Parameters:
model_name – name of the model.
metadata_filepath – path to file containing metadata (pretrained token embeddings, in case semantic log representations are used as the feature type).
output_dir – path to output directory where the model would be dumped.
feature_type – type of log feature representations used for the log-lines or log-sequences (should be “semantics” or “sequential”).
label_type – type of label (should be “anomaly” or “next_log”), depending on whether a supervised or an unsupervised (forecast based) model is being used.
eval_type – whether to aggregate and report the evaluation metrics at the level of sessions (based on the span_id in the log data) or at the level of each logline (should be “session” or None).
topk – the prediction at top-k to consider, when deciding whether an evaluation instance is an anomaly or not.
embedding_dim – dimension of the embedding space. Both for sequential and semantic type feature representation, the input log feature representation is passed through an embedding layer which projects it to the embedding_dim.
hidden_size – dimension of the hidden representations.
freeze – whether to freeze the embedding layer to use the pretrained embeddings or to further train it on the given task.
gpu – device number if gpu is used (otherwise -1 or None will use cpu).
patience – number of eval_steps the model waits for performance on validation data to improve before early stopping the training.
num_train_epochs – number of training epochs.
batch_size – batch size.
learning_rate – learning rate (annotated as int in the signature, although the default value 0.0001 is a float).
- batch_size: int
- embedding_dim: int
- eval_type: str
- feature_type: str
- freeze: bool
- gpu: int
- label_type: str
- learning_rate: int
- metadata_filepath: str
- model_name: str
- num_train_epochs: int
- output_dir: str
- patience: int
- topk: int
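The topk and eval_type parameters above can be illustrated with a small sketch. It is not the library's evaluation code. It assumes the common forecasting convention that an instance is anomalous when the observed next log is not among the top-k predicted ones, and it assumes a session is anomalous if any of its instances is. Both rules, and the helper names is_anomaly_at_topk and aggregate_by_session, are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def is_anomaly_at_topk(scores: Sequence[float], true_next: int, topk: int) -> bool:
    """Assumed rule: anomalous if the observed next log id is not among
    the topk highest-scored ids."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_next not in ranked[:topk]


def aggregate_by_session(session_ids: Sequence[str], flags: Sequence[bool]) -> Dict[str, bool]:
    """Assumed rule: a session is anomalous if any of its instances is."""
    per_session: Dict[str, bool] = defaultdict(bool)
    for sid, flag in zip(session_ids, flags):
        per_session[sid] = per_session[sid] or flag
    return dict(per_session)


# Illustrative scores, one per vocabulary id (made-up values).
scores = [0.05, 0.60, 0.25, 0.10]
flags: List[bool] = [
    is_anomaly_at_topk(scores, true_next=2, topk=2),  # id 2 is ranked second
    is_anomaly_at_topk(scores, true_next=0, topk=2),  # id 0 is ranked last
    is_anomaly_at_topk(scores, true_next=1, topk=2),  # id 1 is ranked first
]
sessions = aggregate_by_session(["s1", "s1", "s2"], flags)
```

A larger topk makes the per-instance rule more permissive, so fewer instances are flagged.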
logai.algorithms.nn_model.forecast_nn.cnn module
- class logai.algorithms.nn_model.forecast_nn.cnn.CNN(config: CNNParams)
Bases:
ForecastBasedNN
CNN based model for learning log representation through a self-supervised forecasting task over log sequences.
- Parameters:
config – parameters for CNN log representation learning model.
- forward(input_dict)
Forward method for cnn model.
- Parameters:
input_dict – dict containing the session_idx, features, window_anomalies and window_labels as in ForecastNNVectorizedDataset object.
- Returns:
dict containing loss and prediction tensor.
- class logai.algorithms.nn_model.forecast_nn.cnn.CNNParams(model_name: str | None = None, metadata_filepath: str | None = None, output_dir: str | None = None, feature_type: str = '', label_type: str = '', eval_type: str = 'session', topk: int = 10, embedding_dim: int = 100, hidden_size: int = 100, freeze: bool = False, gpu: int | None = None, patience: int = 5, num_train_epochs: int = 100, batch_size: int = 1024, learning_rate: int = 0.0001, kernel_sizes: list = [2, 3, 4])
Bases:
ForecastBasedNNParams
Config for CNN based log representation learning.
- Parameters:
kernel_sizes – list of convolution kernel sizes (default: [2, 3, 4]).
- kernel_sizes: list
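To give an intuition for kernel_sizes, the sketch below lists the windows that a one-dimensional convolution of each size would slide over a short log sequence. It assumes stride 1 and no padding, neither of which is stated in this reference, and the helper sliding_windows and the token names are hypothetical.

```python
from typing import Dict, List, Sequence


def sliding_windows(tokens: Sequence[str], kernel_size: int) -> List[List[str]]:
    """Hypothetical helper: windows seen by a 1D kernel of the given size
    (assumes stride 1 and no padding)."""
    return [list(tokens[i:i + kernel_size])
            for i in range(len(tokens) - kernel_size + 1)]


tokens = ["open", "read", "read", "write", "close"]  # made-up log event sequence
windows_per_kernel: Dict[int, List[List[str]]] = {
    k: sliding_windows(tokens, k) for k in [2, 3, 4]
}
# Under these assumptions a kernel of size k yields len(tokens) - k + 1 windows.
counts = {k: len(w) for k, w in windows_per_kernel.items()}
```

Using several kernel sizes side by side lets a model look at short and longer runs of consecutive log events at the same time.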
logai.algorithms.nn_model.forecast_nn.lstm module
- class logai.algorithms.nn_model.forecast_nn.lstm.Attention(input_size, max_seq_len)
Bases:
Module
Attention model for lstm based log representation learning.
- Parameters:
input_size – input dimension.
max_seq_len – maximum sequence length.
- forward(lstm_input)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- glorot(tensor)
- zeros(tensor)
- class logai.algorithms.nn_model.forecast_nn.lstm.LSTM(config: LSTMParams)
Bases:
ForecastBasedNN
LSTM based model for learning log representation through a self-supervised forecasting task over log sequences.
- Parameters:
config – parameters for lstm based model.
- forward(input_dict)
Forward method for lstm model.
- Parameters:
input_dict – dict containing the session_idx, features, window_anomalies and window_labels as in ForecastNNVectorizedDataset object.
- Returns:
dict containing loss and prediction tensor.
- class logai.algorithms.nn_model.forecast_nn.lstm.LSTMParams(model_name: str | None = None, metadata_filepath: str | None = None, output_dir: str | None = None, feature_type: str = '', label_type: str = '', eval_type: str = 'session', topk: int = 10, embedding_dim: int = 100, hidden_size: int = 100, freeze: bool = False, gpu: int | None = None, patience: int = 5, num_train_epochs: int = 100, batch_size: int = 1024, learning_rate: int = 0.0001, num_directions: int = 2, num_layers: int = 1, max_token_len: int | None = None, use_attention: bool = False)
Bases:
ForecastBasedNNParams
Config for lstm based log representation learning.
- Parameters:
num_directions – number of directions: 2 for a bidirectional model, 1 for a unidirectional (left to right) model.
num_layers – number of hidden layers in the neural network.
max_token_len – maximum token length of the input.
use_attention – whether to use attention or not.
- max_token_len: int
- num_directions: int
- num_layers: int
- use_attention: bool
logai.algorithms.nn_model.forecast_nn.transformer module
- class logai.algorithms.nn_model.forecast_nn.transformer.Transformer(config: TransformerParams)
Bases:
ForecastBasedNN
Transformer based model for learning log representation through a self-supervised forecasting task over log sequences.
- forward(input_dict)
Forward method of transformer based model.
- Parameters:
input_dict – dict containing the session_idx, features, window_anomalies and window_labels as in ForecastNNVectorizedDataset object.
- Returns:
dict containing loss and prediction tensor.
- class logai.algorithms.nn_model.forecast_nn.transformer.TransformerParams(model_name: str | None = None, metadata_filepath: str | None = None, output_dir: str | None = None, feature_type: str = '', label_type: str = '', eval_type: str = 'session', topk: int = 10, embedding_dim: int = 100, hidden_size: int = 100, freeze: bool = False, gpu: int | None = None, patience: int = 5, num_train_epochs: int = 100, batch_size: int = 1024, learning_rate: int = 0.0001, nhead: int = 4, num_layers: int = 1)
Bases:
ForecastBasedNNParams
Config for transformer based log representation learning.
- Parameters:
nhead – number of attention heads.
num_layers – number of hidden layers in the neural network.
- nhead: int
- num_layers: int
logai.algorithms.nn_model.forecast_nn.utils module
- logai.algorithms.nn_model.forecast_nn.utils.seed_everything(seed=1234)
Fix the random seeds throughout the python environment
- Parameters:
seed – Seed value. Defaults to 1234.
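As a rough illustration of what fixing random seeds achieves, the sketch below seeds only the Python standard library and checks that two runs produce the same numbers. The helper seed_stdlib is hypothetical. The documented seed_everything may also seed other libraries such as NumPy and PyTorch, which this sketch deliberately leaves out.

```python
import os
import random


def seed_stdlib(seed: int = 1234) -> None:
    """Hypothetical helper: seed the standard-library sources of randomness."""
    random.seed(seed)
    # Changing PYTHONHASHSEED at runtime only affects child processes
    # started afterwards, not the hash seed of the current interpreter.
    os.environ["PYTHONHASHSEED"] = str(seed)


seed_stdlib(1234)
first_run = [random.random() for _ in range(3)]

seed_stdlib(1234)
second_run = [random.random() for _ in range(3)]
```

Reseeding with the same value replays the same pseudo-random sequence, which is what makes experiments repeatable.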
- logai.algorithms.nn_model.forecast_nn.utils.set_device(gpu: int | None = None)
Set device (cpu or gpu). Use -1 to specify cpu. If the device is not set manually, it is automatically set to gpu when a gpu is available; otherwise cpu is used.
- Parameters:
gpu – device number of gpu (use -1 for cpu). Defaults to None.
- Returns:
torch device type object.
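The selection rule described for set_device can be sketched without PyTorch as a function that returns a device name. The helper pick_device_name and its cuda_available flag are hypothetical stand-ins. Falling back to cpu when a gpu index is requested but no gpu is available is an assumption, since this reference does not cover that case.

```python
from typing import Optional


def pick_device_name(gpu: Optional[int] = None, cuda_available: bool = False) -> str:
    """Hypothetical helper mirroring the documented rule as a device string."""
    if gpu == -1:
        return "cpu"  # cpu explicitly requested
    if not cuda_available:
        return "cpu"  # assumption: fall back to cpu when no gpu is present
    if gpu is None:
        return "cuda"  # not set manually: use the available gpu
    return f"cuda:{gpu}"  # specific gpu device number


choices = {
    "forced_cpu": pick_device_name(-1, cuda_available=True),
    "auto_gpu": pick_device_name(None, cuda_available=True),
    "auto_cpu": pick_device_name(None, cuda_available=False),
    "gpu_1": pick_device_name(1, cuda_available=True),
}
```

In real code the returned string could be passed to torch.device, and the flag would come from torch.cuda.is_available().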
- logai.algorithms.nn_model.forecast_nn.utils.tensor2flatten_arr(tensor)
Convert tensor to flat numpy array.
- Parameters:
tensor – tensor object
- Returns:
flat numpy array