logai.utils package
Submodules
logai.utils.constants module
logai.utils.dataset_utils module
- logai.utils.dataset_utils.split_train_dev_test_for_anomaly_detection(logrecord, training_type, test_data_frac_neg_class, test_data_frac_pos_class=None, shuffle=False)
Util method to split a logrecord object into train, dev and test splits, where the splitting fractions are based on the SPAN_ID field of the logrecord.
- Parameters:
logrecord – (LogRecordObject): input logrecord object to be split into train, dev and test splits
training_type – (str): ‘supervised’ or ‘unsupervised’
test_data_frac_neg_class – (float): fraction of the negative-class data to be used for test data.
test_data_frac_pos_class – (float, optional): when supervised mode is selected, fraction of the positive-class data to be used for test data (the fraction for dev data is fixed). For unsupervised mode this value is fixed to 1.0.
shuffle – (bool, optional): whether to shuffle the log data when splitting into train and test. If False, the chronological ordering is used: the first (chronologically earliest) split constitutes train data, the second dev data, and the third test data. Defaults to False.
- Returns:
logrecord_train: logrecord object containing train data.
logrecord_dev: logrecord object containing dev data.
logrecord_test: logrecord object containing test data.
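The chronological splitting behavior when shuffle=False can be sketched as follows. This is an illustration of the documented ordering, not the library implementation, and the fraction arguments are simplified stand-ins for the real parameters:

```python
# Hypothetical sketch: with shuffle=False, the earliest records become
# train data, the next slice dev data, and the remainder test data.
def chronological_split(records, train_frac=0.7, dev_frac=0.1):
    n = len(records)
    n_train = int(n * train_frac)
    n_dev = int(n * dev_frac)
    train = records[:n_train]
    dev = records[n_train:n_train + n_dev]
    test = records[n_train + n_dev:]
    return train, dev, test

train, dev, test = chronological_split(list(range(10)))
# 7 train records, 1 dev record, 2 test records
```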
logai.utils.evaluate module
- logai.utils.evaluate.get_accuracy_precision_recall(y: array, y_labels: array)
Evaluates anomaly detection results against the ground-truth labels.
- Parameters:
y – Model inference results.
y_labels – y labels.
- Returns:
Accuracy, precision, and recall.
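The metrics returned here can be computed as below for binary anomaly labels (1 = anomaly). This is a sketch of what such a function is expected to return, not the library's source:

```python
import numpy as np

# Illustrative accuracy/precision/recall for binary anomaly labels.
def accuracy_precision_recall(y_pred: np.ndarray, y_true: np.ndarray):
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    accuracy = np.mean(y_pred == y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

y_pred = np.array([1, 0, 1, 1, 0])
y_true = np.array([1, 0, 0, 1, 1])
acc, prec, rec = accuracy_precision_recall(y_pred, y_true)
```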
logai.utils.file_utils module
- logai.utils.file_utils.file_exists(path: str)
Util function to check whether a file exists.
- Parameters:
path – (str): path to file
- Returns:
bool: True if the file exists, False otherwise.
- logai.utils.file_utils.read_file(filepath: str)
Reads yaml, json, csv or pickle files.
- Parameters:
filepath – (str): path to file
- Returns:
data object containing file contents
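Dispatching on the file extension, as described above, can be sketched like this. Only the json branch is shown; the yaml, csv and pickle loaders would follow the same pattern. This is an illustration of the documented behavior, not the library's actual code:

```python
import json
import os
import tempfile

# Extension-based dispatch: pick a loader from the file suffix.
def read_file_sketch(filepath: str):
    ext = os.path.splitext(filepath)[1].lower()
    if ext == ".json":
        with open(filepath) as f:
            return json.load(f)
    raise ValueError(f"unsupported file extension: {ext}")

# Usage: write a small json file and read it back.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"level": "INFO"}, f)
    path = f.name
data = read_file_sketch(path)
os.remove(path)
```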
logai.utils.functions module
Module that includes common data manipulation functions to be applied by pandas dataframes.
- logai.utils.functions.get_parameter_list(row)
- logai.utils.functions.pad(x, max_len: array, padding_value: int = 0)
Method to trim or pad a 1-d numpy array to a given max length with the given padding value.
- Parameters:
x – (np.array): given 1-d numpy array to be padded/trimmed
max_len – (int): maximum length of padded/trimmed output
padding_value – (int, optional): padding value. Defaults to 0.
- Returns:
np.array: padded/trimmed numpy array
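The documented pad-or-trim behavior (right-pad shorter inputs with padding_value, truncate longer inputs to max_len) can be sketched as follows; this is an illustration, not the library code:

```python
import numpy as np

# Pad a short 1-d array on the right, or trim a long one, to max_len.
def pad_sketch(x: np.ndarray, max_len: int, padding_value: int = 0) -> np.ndarray:
    if len(x) >= max_len:
        return x[:max_len]
    return np.concatenate([x, np.full(max_len - len(x), padding_value)])

padded = pad_sketch(np.array([1, 2, 3]), 5)         # padded to length 5
trimmed = pad_sketch(np.array([1, 2, 3, 4, 5]), 3)  # trimmed to length 3
```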
- logai.utils.functions.pd_to_timeseries(log_features: Series)
Convert pandas.DataFrame to merlion.TimeSeries for log counter vectors.
- Parameters:
log_features – log feature dataframe must only contain two columns [‘timestamp’: datetime, constants.LOGLINE_COUNTS: int].
- Returns:
merlion.TimeSeries type.
logai.utils.tokenize module
Module that includes common tokenization functions to be applied by pandas dataframes.
- logai.utils.tokenize.replace_delimeters(logline, delimeter_regex)
Replace custom delimiters in the log line.
- logai.utils.tokenize.tokenize(logline, config)
Common tokenization of a log line, using spaces to separate tokens.
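The two utilities above can be sketched together: replace custom delimiters with spaces, then split on whitespace. The regex and function bodies here are illustrative assumptions, not the library's exact implementation:

```python
import re

# Replace any character matched by the delimiter regex with a space.
def replace_delimiters_sketch(logline: str, delimiter_regex: str) -> str:
    return re.sub(delimiter_regex, " ", logline)

# Tokenize by normalizing delimiters to spaces, then splitting.
def tokenize_sketch(logline: str) -> list:
    return replace_delimiters_sketch(logline, r"[,;=:]").split()

tokens = tokenize_sketch("user=alice,action=login")
# -> ["user", "alice", "action", "login"]
```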