logai.algorithms.nn_model.forecast_nn package

Submodules

logai.algorithms.nn_model.forecast_nn.base_nn module

class logai.algorithms.nn_model.forecast_nn.base_nn.Embedder(vocab_size: int, embedding_dim: int, pretrain_matrix: array | None = None, freeze: bool = False)

Bases: Module

Learnable embedder for embedding loglines.

Parameters:
  • vocab_size – vocabulary size.

  • embedding_dim – embedding dimension.

  • pretrain_matrix – torch.Tensor object containing the pretrained embedding of the vocabulary tokens.

  • freeze – if True, the embeddings are kept fixed at their pretrained values; otherwise the embeddings are learnable.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class logai.algorithms.nn_model.forecast_nn.base_nn.ForecastBasedNN(config: ForecastBasedNNParams)

Bases: Module

Model for learning log representations through a forecasting based self-supervised task.

Parameters:

config – ForecastBasedNNParams config class for parameters of forecasting based neural log representation models.

fit(train_loader: DataLoader, dev_loader: DataLoader | None = None)

Fit method for training model

Parameters:
  • train_loader – dataloader (torch.utils.data.DataLoader) for the train dataset.

  • dev_loader – dataloader (torch.utils.data.DataLoader) for the dev dataset. Defaults to None, in which case no evaluation is run.

Returns:

dict containing the best loss on dev dataset.

load_model(model_save_file: str = '')

Loading model from file.

Parameters:

model_save_file – path to the file from which the model is loaded.

predict(test_loader: DataLoader, dtype: str = 'test')

Predict method on test data.

Parameters:
  • test_loader – dataloader (torch.utils.data.DataLoader) for test (or development) dataset.

  • dtype – either “test” or “dev”, indicating the dataset for which the predict method is called.

Returns:

dict object containing the overall evaluation metrics for test (or dev) data.

save_model()

Saving model to file as specified in config

class logai.algorithms.nn_model.forecast_nn.base_nn.ForecastBasedNNParams(model_name: str | None = None, metadata_filepath: str | None = None, output_dir: str | None = None, feature_type: str = '', label_type: str = '', eval_type: str = 'session', topk: int = 10, embedding_dim: int = 100, hidden_size: int = 100, freeze: bool = False, gpu: int | None = None, patience: int = 5, num_train_epochs: int = 100, batch_size: int = 1024, learning_rate: int = 0.0001)

Bases: Config

Config for neural representation learning for logs using forecasting based self-supervised tasks.

Parameters:
  • model_name – name of the model.

  • metadata_filepath – path to the file containing metadata (pretrained token embeddings, if semantic log representations are used as the feature type).

  • output_dir – path to output directory where the model would be dumped.

  • feature_type – type of log feature representation used for the log lines or log sequences (should be “semantics” or “sequential”).

  • label_type – type of label (should be “anomaly” or “next_log”), depending on whether a supervised or an unsupervised (forecast based) model is being used.

  • eval_type – (should be “session” or None) whether to aggregate and report the evaluation metrics at the level of sessions (based on the span_id in the log data) or at the level of each logline.

  • topk – the prediction at top-k to consider, when deciding whether an evaluation instance is an anomaly or not.

  • embedding_dim – dimension of the embedding space. Both for sequential and semantic type feature representation, the input log feature representation is passed through an embedding layer which projects it to the embedding_dim.

  • hidden_size – dimension of the hidden representations.

  • freeze – whether to freeze the embedding layer to use the pretrained embeddings or to further train it on the given task.

  • gpu – device number if gpu is used (otherwise -1 or None will use cpu).

  • patience – number of eval_steps the model waits for performance on the validation data to improve before stopping the training early.

  • num_train_epochs – number of training epochs.

  • batch_size – batch size.

  • learning_rate – learning rate (a float such as the default 0.0001, although the signature annotates it as int).
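The interplay of topk and eval_type can be illustrated with a minimal sketch. It assumes that a window is flagged as anomalous when the observed next log is not among the top-k predicted candidates, and that a session is anomalous if any of its windows is flagged. The function names, the scores and the aggregation rule are illustrative assumptions, not LogAI's implementation.

```python
from collections import defaultdict


def is_window_anomalous(scores, true_next, topk):
    """Flag a window when the observed next log id is outside the top-k predictions."""
    # Rank candidate log ids by descending score and keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_next not in ranked[:topk]


def aggregate_by_session(session_ids, window_flags):
    """Assumed rule: a session is anomalous if any of its windows is flagged."""
    sessions = defaultdict(bool)
    for sid, flag in zip(session_ids, window_flags):
        sessions[sid] = sessions[sid] or flag
    return dict(sessions)


# Three windows over a 4-template vocabulary; the scores are hypothetical model outputs.
scores = [
    [0.10, 0.60, 0.25, 0.05],
    [0.70, 0.15, 0.10, 0.05],
    [0.05, 0.10, 0.80, 0.05],
]
true_next = [1, 3, 2]              # observed next log id for each window
session_ids = ["s1", "s1", "s2"]   # stand-in for the span_id of each window

flags = [is_window_anomalous(s, t, topk=2) for s, t in zip(scores, true_next)]
session_flags = aggregate_by_session(session_ids, flags)
```

With eval_type set to None, the per-window flags would be evaluated directly instead of the per-session ones.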

batch_size: int
embedding_dim: int
eval_type: str
feature_type: str
freeze: bool
gpu: int
hidden_size: int
label_type: str
learning_rate: int
metadata_filepath: str
model_name: str
num_train_epochs: int
output_dir: str
patience: int
topk: int
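The role of patience can be sketched as a plain early-stopping loop. This is a minimal sketch under stated assumptions: the list of dev losses is hypothetical and stands in for one evaluation per step, and the helper name is not part of LogAI.

```python
def simulate_early_stopping(dev_losses, patience):
    """Return (best_loss, evaluations_run) for a sequence of dev losses."""
    best_loss = float("inf")
    evals_without_improvement = 0
    evals_run = 0
    for loss in dev_losses:
        evals_run += 1
        if loss < best_loss:
            # Improvement: remember the loss and reset the counter.
            best_loss = loss
            evals_without_improvement = 0
        else:
            evals_without_improvement += 1
            if evals_without_improvement >= patience:
                break  # no improvement for `patience` evaluations in a row
    return best_loss, evals_run


dev_losses = [0.90, 0.70, 0.72, 0.71, 0.69, 0.80]  # hypothetical values
best, steps = simulate_early_stopping(dev_losses, patience=2)
```

With patience=2 the loop stops after the fourth evaluation and never sees the later loss of 0.69; a larger patience trades training time for the chance to recover from a temporary plateau.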

logai.algorithms.nn_model.forecast_nn.cnn module

class logai.algorithms.nn_model.forecast_nn.cnn.CNN(config: CNNParams)

Bases: ForecastBasedNN

CNN based model for learning log representation through a self-supervised forecasting task over log sequences.

Parameters:

config – parameters for CNN log representation learning model.

forward(input_dict)

Forward method for cnn model.

Parameters:

input_dict – dict containing the session_idx, features, window_anomalies and window_labels as in ForecastNNVectorizedDataset object.

Returns:

dict containing loss and prediction tensor.

class logai.algorithms.nn_model.forecast_nn.cnn.CNNParams(model_name: str | None = None, metadata_filepath: str | None = None, output_dir: str | None = None, feature_type: str = '', label_type: str = '', eval_type: str = 'session', topk: int = 10, embedding_dim: int = 100, hidden_size: int = 100, freeze: bool = False, gpu: int | None = None, patience: int = 5, num_train_epochs: int = 100, batch_size: int = 1024, learning_rate: int = 0.0001, kernel_sizes: list = [2, 3, 4])

Bases: ForecastBasedNNParams

Config for CNN based log representation learning.

Parameters:

kernel_sizes – list of kernel sizes (default: [2, 3, 4]).

kernel_sizes: list
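What a list of kernel sizes means for a sequence can be shown with a small pure-Python sketch. It assumes one convolution per kernel size followed by max-over-time pooling, a common design for sequence CNNs; whether the CNN class combines its kernels exactly this way is not stated here. The input values and the all-ones kernels are made up for readability.

```python
def conv1d_valid(sequence, kernel):
    """Slide `kernel` over `sequence` without padding.

    The output has len(sequence) - len(kernel) + 1 entries.
    """
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    ]


sequence = [1.0, 2.0, 0.0, 3.0, 1.0]  # hypothetical 1-D stand-in for an embedded log window
kernel_sizes = [2, 3, 4]

lengths = []
pooled = []
for k in kernel_sizes:
    kernel = [1.0] * k                 # all-ones kernel, chosen only to keep the numbers simple
    feature_map = conv1d_valid(sequence, kernel)
    lengths.append(len(feature_map))   # shrinks as the kernel grows
    pooled.append(max(feature_map))    # max-over-time pooling: one value per kernel size
```

Each kernel size looks at a different number of consecutive log events, and pooling yields a fixed-size feature vector regardless of the window length.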

logai.algorithms.nn_model.forecast_nn.lstm module

class logai.algorithms.nn_model.forecast_nn.lstm.Attention(input_size, max_seq_len)

Bases: Module

Attention model for lstm based log representation learning.

Parameters:
  • input_size – input dimension.

  • max_seq_len – maximum sequence length.

forward(lstm_input)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
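As a rough illustration of attention over LSTM outputs, the sketch below pools a sequence of hidden states into one vector using softmax weights. It is a generic attention-pooling sketch, not the computation of this class: the scores are supplied directly here, whereas in a trained model they would come from learned parameters.

```python
import math


def attention_pool(hidden_states, scores):
    """Softmax-normalise one score per time step and return the weighted sum of hidden states."""
    max_score = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Three time steps with 2-dimensional hidden states (hypothetical values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]  # equal scores give uniform weights
weights, context = attention_pool(hidden_states, scores)
```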

glorot(tensor)

zeros(tensor)

class logai.algorithms.nn_model.forecast_nn.lstm.LSTM(config: LSTMParams)

Bases: ForecastBasedNN

LSTM based model for learning log representation through a self-supervised forecasting task over log sequences.

Parameters:

config – parameters for lstm based model.

forward(input_dict)

Forward method for lstm model.

Parameters:

input_dict – dict containing the session_idx, features, window_anomalies and window_labels as in ForecastNNVectorizedDataset object.

Returns:

dict containing loss and prediction tensor.

class logai.algorithms.nn_model.forecast_nn.lstm.LSTMParams(model_name: str | None = None, metadata_filepath: str | None = None, output_dir: str | None = None, feature_type: str = '', label_type: str = '', eval_type: str = 'session', topk: int = 10, embedding_dim: int = 100, hidden_size: int = 100, freeze: bool = False, gpu: int | None = None, patience: int = 5, num_train_epochs: int = 100, batch_size: int = 1024, learning_rate: int = 0.0001, num_directions: int = 2, num_layers: int = 1, max_token_len: int | None = None, use_attention: bool = False)

Bases: ForecastBasedNNParams

Config for lstm based log representation learning.

Parameters:
  • num_directions – number of directions of the model: 2 for bidirectional, 1 for unidirectional (left to right).

  • num_layers – number of hidden layers in the neural network.

  • max_token_len – maximum token length of the input.

  • use_attention – whether to use attention or not.

max_token_len: int
num_directions: int
num_layers: int
use_attention: bool
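One practical consequence of num_directions is the width of the per-step LSTM output: in PyTorch's LSTM, the output feature size is the hidden size multiplied by the number of directions. The helper below is a hypothetical illustration of that arithmetic and is not part of LogAI.

```python
def lstm_output_dim(hidden_size, num_directions):
    """Feature size of each LSTM output step: hidden_size per direction."""
    if num_directions not in (1, 2):
        raise ValueError("num_directions must be 1 (unidirectional) or 2 (bidirectional)")
    return hidden_size * num_directions


default_dim = lstm_output_dim(hidden_size=100, num_directions=2)   # the defaults listed above
unidirectional_dim = lstm_output_dim(hidden_size=100, num_directions=1)
```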

logai.algorithms.nn_model.forecast_nn.transformer module

class logai.algorithms.nn_model.forecast_nn.transformer.Transformer(config: TransformerParams)

Bases: ForecastBasedNN

Transformer based model for learning log representation through a self-supervised forecasting task over log sequences.

Parameters:

config – parameters for transformer based model.

forward(input_dict)

Forward method of transformer based model.

Parameters:

input_dict – dict containing the session_idx, features, window_anomalies and window_labels as in ForecastNNVectorizedDataset object.

Returns:

dict containing loss and prediction tensor.

class logai.algorithms.nn_model.forecast_nn.transformer.TransformerParams(model_name: str | None = None, metadata_filepath: str | None = None, output_dir: str | None = None, feature_type: str = '', label_type: str = '', eval_type: str = 'session', topk: int = 10, embedding_dim: int = 100, hidden_size: int = 100, freeze: bool = False, gpu: int | None = None, patience: int = 5, num_train_epochs: int = 100, batch_size: int = 1024, learning_rate: int = 0.0001, nhead: int = 4, num_layers: int = 1)

Bases: ForecastBasedNNParams

Config for transformer based log representation learning.

Parameters:
  • nhead – number of attention heads.

  • num_layers – number of hidden layers in the neural network.

nhead: int
num_layers: int
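When choosing nhead, note that PyTorch's multi-head attention requires the model dimension to be divisible by the number of heads. The check below is a minimal sketch; it assumes the model dimension equals the default embedding_dim of 100, which this page does not confirm, and the helper name is hypothetical.

```python
def head_dim(d_model, nhead):
    """Return the per-head dimension, or raise if the split is uneven."""
    if d_model % nhead != 0:
        raise ValueError(f"d_model={d_model} is not divisible by nhead={nhead}")
    return d_model // nhead


per_head = head_dim(d_model=100, nhead=4)  # the defaults listed above
```

A value such as nhead=3 would be rejected for a dimension of 100, so the two settings need to be changed together.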

logai.algorithms.nn_model.forecast_nn.utils module

logai.algorithms.nn_model.forecast_nn.utils.seed_everything(seed=1234)

Fix the random seeds throughout the Python environment.

Parameters:

seed – Seed value. Defaults to 1234.
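A seeding helper of this kind typically covers every random number generator in use. The sketch below is an assumption about what such a function might do, not the source of seed_everything; it seeds the standard library unconditionally and NumPy and PyTorch only if they are installed.

```python
import os
import random


def seed_everything_sketch(seed=1234):
    """Seed the random number generators available in this environment."""
    # Only affects hash randomisation in interpreters started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional in this sketch
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    except ImportError:
        pass  # PyTorch is optional in this sketch


seed_everything_sketch(1234)
first = [random.random() for _ in range(3)]
seed_everything_sketch(1234)
second = [random.random() for _ in range(3)]  # same draws as `first`
```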

logai.algorithms.nn_model.forecast_nn.utils.set_device(gpu: int | None = None)

Set device (cpu or gpu). Use -1 to specify cpu. If the device is not set manually, it is automatically set to gpu if a gpu is available; otherwise cpu is used.

Parameters:

gpu – device number of gpu (use -1 for cpu). Defaults to None.

Returns:

torch device type object.
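The selection rule can be sketched without PyTorch by returning a device string. In this sketch the cuda_available flag is a hypothetical stand-in for torch.cuda.is_available(), and falling back to cpu when a gpu is requested but unavailable is an assumption.

```python
def resolve_device(gpu=None, cuda_available=False):
    """Return a device string following the rule described above."""
    if gpu == -1:
        return "cpu"          # cpu explicitly requested
    if not cuda_available:
        return "cpu"          # assumed fallback when no gpu is present
    if gpu is None:
        return "cuda"         # not set manually: use the default gpu
    return f"cuda:{gpu}"      # specific device number


choices = [
    resolve_device(-1, cuda_available=True),
    resolve_device(None, cuda_available=True),
    resolve_device(1, cuda_available=True),
    resolve_device(0, cuda_available=False),
]
```

In real code the resulting string could be passed to torch.device to obtain the device object.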

logai.algorithms.nn_model.forecast_nn.utils.tensor2flatten_arr(tensor)

Convert tensor to flat numpy array.

Parameters:

tensor – tensor object

Returns:

flat numpy array

Module contents