logai.algorithms.parsing_algo package

Submodules

logai.algorithms.parsing_algo.ael module

This module wraps the logpai/logparser implementation of the AEL algorithm: https://github.com/logpai/logparser/blob/master/logparser/AEL/AEL.py.

class logai.algorithms.parsing_algo.ael.AEL(params: AELParams)

Bases: ParsingAlgo

categorize()

Categorizes templates bin by bin.

fit(loglines: DataFrame)

Fit method to train the log parser on the given log data. Since the AEL log parser does not require training, this method is a no-op.

has_diff(tokens1: list, tokens2: list)

Checks whether there is a significant difference between two given token sequences.

Parameters:
  • tokens1 – The first token sequence.

  • tokens2 – The second token sequence.

Returns:

0 if there is no significant difference between the given token sequences, else 1.

load_data(loglines: Series)

Loads log data (a pandas Series object) into a format compatible with parsing.

Parameters:

loglines – The log data to be parsed.

merge_event(e1, e2)

Method to merge two events.

Parameters:
  • e1 – The first event to merge (merged in-place).

  • e2 – The second event to merge.

Returns:

The merged event.

parse(loglines: Series) → Series

Parse method to run the log parser on the given log data.

Parameters:

loglines – The raw log data to be parsed.

Returns:

The parsed log data.

reconcile()

Merges events if a bin has too many events.

tokenize()

Puts logs into bins according to (# of ‘<*>’, # of tokens).

class logai.algorithms.parsing_algo.ael.AELParams(rex: str | None = None, minEventCount: int = 2, merge_percent: int = 1, keep_para: bool = True)

Bases: Config

Parameters for the AEL Log Parsing algorithm. For more details see https://github.com/logpai/logparser/tree/master/logparser/AEL.

Parameters:
  • rex – The regular expression string used for preprocessing.

  • minEventCount – The minimum event count.

  • merge_percent – The merge percentage.

  • keep_para – Whether to keep parameters.

keep_para: bool
merge_percent: int
minEventCount: int
rex: str
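
A minimal usage sketch based on the AEL and AELParams signatures above; the sample log lines and parameter values are illustrative, and the exact templates returned depend on the configuration.

    import pandas as pd

    from logai.algorithms.parsing_algo.ael import AEL, AELParams

    # Illustrative raw log messages; any pandas Series of log lines works here.
    loglines = pd.Series([
        "Connection from 10.0.0.1 closed",
        "Connection from 10.0.0.2 closed",
        "Failed password for root from 10.0.0.3",
    ])

    params = AELParams(minEventCount=2, merge_percent=1, keep_para=True)
    parser = AEL(params)

    # AEL needs no training (fit() is a no-op); parse() returns the extracted
    # log templates as a pandas Series aligned with the input.
    templates = parser.parse(loglines)
    print(templates)
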
class logai.algorithms.parsing_algo.ael.Event(logidx, Eventstr='')

Bases: object

Event class to wrap log events.

refresh_id()

Generates an ID for the log event using a hashing function.

logai.algorithms.parsing_algo.drain module

This module wraps the IBM/Drain3 implementation: https://github.com/IBM/Drain3/blob/master/drain3/drain.py.

class logai.algorithms.parsing_algo.drain.Drain(params: ~logai.algorithms.parsing_algo.drain.DrainParams, profiler=<logai.algorithms.parsing_algo.drain.NullProfiler object>)

Bases: ParsingAlgo

property clusters
fit(logline: Series)

Fits the parsing algorithm with the input log data.

Parameters:

logline – A pd.Series of log lines as input.

Returns:

pd.DataFrame

static has_numbers(s)
match(content: str)

Matches against an already existing cluster. The match must be exact (sim_th=1.0). No new cluster is created as a result of this call, and no existing clusters are modified.

Parameters:

content – The log message to match.

Returns:

The matched cluster, or None if no match is found.

parse(logline: Series) → Series

Parse method to run the log parser on the given log data.

Parameters:

logline – The raw log data to be parsed.

Returns:

The parsed log data.

class logai.algorithms.parsing_algo.drain.DrainParams(depth: int = 3, sim_th: float = 0.4, max_children: int = 100, max_clusters: int | None = None, extra_delimiters: tuple = (), param_str: str = '*')

Bases: Config

Parameters for Drain Log Parser. For more details on parameters see https://github.com/logpai/Drain3/blob/master/drain3/drain.py.

Parameters:
  • depth – The depth of tree.

  • sim_th – The similarity threshold.

  • max_children – The max number of children nodes.

  • max_clusters – The max number of clusters.

  • extra_delimiters – Extra delimiters.

  • param_str – The wildcard parameter string.

depth: int = 3
extra_delimiters: tuple = ()
classmethod from_dict(config_dict)

Loads a config from a config dict.

Parameters:

config_dict – The config parameters in a dict.

max_children: int = 100
max_clusters: int = None
param_str: str = '*'
sim_th: float = 0.4
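
A minimal usage sketch based on the Drain and DrainParams signatures above. The sample log lines and parameter values are illustrative; depending on the implementation, parse() may build the cluster tree itself, in which case the explicit fit() call is redundant.

    import pandas as pd

    from logai.algorithms.parsing_algo.drain import Drain, DrainParams

    loglines = pd.Series([
        "Took 10 seconds to create a VM",
        "Took 9 seconds to create a VM",
        "Failed to connect to host 10.0.0.5",
    ])

    params = DrainParams(depth=4, sim_th=0.4, max_children=100)
    # Equivalently, parameters can be loaded from a plain dict:
    # params = DrainParams.from_dict({"depth": 4, "sim_th": 0.4, "max_children": 100})

    parser = Drain(params)

    # fit() adds the log lines to the Drain cluster tree; parse() returns the
    # matched templates as a pandas Series aligned with the input.
    parser.fit(loglines)
    templates = parser.parse(loglines)

    # match() looks up an existing cluster for a single message with an exact
    # (sim_th=1.0) match and returns None if nothing matches.
    cluster = parser.match("Took 8 seconds to create a VM")
    print(templates)
    print(cluster)
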
class logai.algorithms.parsing_algo.drain.LogCluster(log_template_tokens: list, cluster_id: int)

Bases: object

cluster_id
get_template()
log_template_tokens
size
class logai.algorithms.parsing_algo.drain.LogClusterCache(maxsize, getsizeof=None)

Bases: LRUCache

A Least Recently Used (LRU) cache that allows callers to conditionally skip the cache eviction algorithm when accessing elements.

get(key)

Returns the value of the item with the specified key without updating the cache eviction algorithm.

class logai.algorithms.parsing_algo.drain.Node

Bases: object

cluster_ids: List[int]
key_to_child_node: Dict[str, Node]
class logai.algorithms.parsing_algo.drain.NullProfiler

Bases: Profiler

A no-op profiler. Use it instead of SimpleProfiler if you want to disable profiling.

end_section(section_name='')
report(period_sec=30)
start_section(section_name: str)
class logai.algorithms.parsing_algo.drain.Profiler

Bases: ABC

abstract end_section(section_name='')
abstract report(period_sec=30)
abstract start_section(section_name: str)
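
Profiler is an abstract base class and NullProfiler is the no-op default passed to Drain. Below is a minimal, illustrative sketch of a custom subclass, assuming that only the three abstract methods above need to be implemented; the timing and printing logic is not part of the library.

    import time

    from logai.algorithms.parsing_algo.drain import Drain, DrainParams, Profiler

    class PrintingProfiler(Profiler):
        """Illustrative profiler that prints how long each section takes."""

        def __init__(self):
            self._starts = {}
            self._last = ""

        def start_section(self, section_name: str):
            self._last = section_name
            self._starts[section_name] = time.perf_counter()

        def end_section(self, section_name=""):
            # Fall back to the most recently started section if no name is given.
            name = section_name or self._last
            start = self._starts.pop(name, None)
            if start is not None:
                print(f"{name}: {time.perf_counter() - start:.6f}s")

        def report(self, period_sec=30):
            # Nothing periodic to report in this simple sketch.
            pass

    # The profiler is passed to Drain at construction time instead of the
    # default NullProfiler.
    parser = Drain(DrainParams(), profiler=PrintingProfiler())
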

logai.algorithms.parsing_algo.iplom module

class logai.algorithms.parsing_algo.iplom.Event(eventStr)

Bases: object

Event class to wrap a log event.

class logai.algorithms.parsing_algo.iplom.IPLoM(params: IPLoMParams)

Bases: ParsingAlgo

IPLoM Log Parsing algorithm. For details see https://github.com/logpai/logparser/tree/master/logparser/IPLoM.

DetermineP1P2(partition)
Get_Mapping_Position(partition, uniqueTokensCountLS)
Get_Rank_Posistion(cardOfS, Lines_that_match_S, one_m)
PrintEventStats()
PrintPartitions()
fit(loglines: Series)

IPLoM does not support model fitting. Call parse() directly with the given input to get the log templates.

parse(loglines: Series) → Series

Parsing method to parse the raw log data.

Parameters:

loglines – The raw log data.

Returns:

The parsed log data.

class logai.algorithms.parsing_algo.iplom.IPLoMParams(rex: str | None = None, logformat: str | None = None, maxEventLen: int = 200, step2Support: float = 0, PST: float = 0, CT: float = 0, lowerBound: float = 0.25, upperBound: float = 0.9, keep_para: bool = True)

Bases: Config

Parameters for the IPLoM Log Parser. For more details on parameters see https://github.com/logpai/logparser/tree/master/logparser/IPLoM.

Parameters:
  • rex – The rex string.

  • logformat – The log format.

  • maxEventLen – The max event length.

  • step2Support – The support threshold for step 2 of the partitioning.

  • lowerBound – The lower bound threshold.

  • upperBound – The upper bound threshold.

  • keep_para – Whether to keep parameters.

CT: float
PST: float
keep_para: bool
logformat: str
lowerBound: float
maxEventLen: int
rex: str
step2Support: float
upperBound: float
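
A minimal usage sketch based on the IPLoM and IPLoMParams signatures above. Since IPLoM does not support model fitting, parse() is called directly; the sample log lines and parameter values are illustrative.

    import pandas as pd

    from logai.algorithms.parsing_algo.iplom import IPLoM, IPLoMParams

    loglines = pd.Series([
        "Receiving block blk_1 src: /10.0.0.1 dest: /10.0.0.2",
        "Receiving block blk_2 src: /10.0.0.3 dest: /10.0.0.4",
    ])

    params = IPLoMParams(maxEventLen=200, lowerBound=0.25, upperBound=0.9)
    parser = IPLoM(params)

    # There is no separate training step; parse() returns the extracted log
    # templates as a pandas Series.
    templates = parser.parse(loglines)
    print(templates)
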
class logai.algorithms.parsing_algo.iplom.Partition(stepNo, numOfLogs=0, lenOfLogs=0)

Bases: object

Wraps the logs and the step number.

Module contents

class logai.algorithms.parsing_algo.AEL(params: AELParams)

Bases: ParsingAlgo

categorize()

Categorizes templates bin by bin.

fit(loglines: DataFrame)

Fit method to train the log parser on the given log data. Since the AEL log parser does not require training, this method is a no-op.

has_diff(tokens1: list, tokens2: list)

Checks whether there is a significant difference between two given token sequences.

Parameters:
  • tokens1 – The first token sequence.

  • tokens2 – The second token sequence.

Returns:

0 if there is no significant difference between the given token sequences, else 1.

load_data(loglines: Series)

Loads log data (a pandas Series object) into a format compatible with parsing.

Parameters:

loglines – The log data to be parsed.

merge_event(e1, e2)

Method to merge two events.

Parameters:
  • e1 – The first event to merge (merged in-place).

  • e2 – The second event to merge.

Returns:

The merged event.

parse(loglines: Series) → Series

Parse method to run the log parser on the given log data.

Parameters:

loglines – The raw log data to be parsed.

Returns:

The parsed log data.

reconcile()

Merges events if a bin has too many events.

tokenize()

Puts logs into bins according to (# of ‘<*>’, # of tokens).

class logai.algorithms.parsing_algo.Drain(params: ~logai.algorithms.parsing_algo.drain.DrainParams, profiler=<logai.algorithms.parsing_algo.drain.NullProfiler object>)

Bases: ParsingAlgo

property clusters
fit(logline: Series)

Fits the parsing algorithm with the input log data.

Parameters:

logline – A pd.Series of log lines as input.

Returns:

pd.DataFrame

static has_numbers(s)
match(content: str)

Matches against an already existing cluster. The match must be exact (sim_th=1.0). No new cluster is created as a result of this call, and no existing clusters are modified.

Parameters:

content – The log message to match.

Returns:

The matched cluster, or None if no match is found.

parse(logline: Series) → Series

Parse method to run the log parser on the given log data.

Parameters:

logline – The raw log data to be parsed.

Returns:

The parsed log data.

class logai.algorithms.parsing_algo.IPLoM(params: IPLoMParams)

Bases: ParsingAlgo

IPLoM Log Parsing algorithm. For details see https://github.com/logpai/logparser/tree/master/logparser/IPLoM.

DetermineP1P2(partition)
Get_Mapping_Position(partition, uniqueTokensCountLS)
Get_Rank_Posistion(cardOfS, Lines_that_match_S, one_m)
PrintEventStats()
PrintPartitions()
fit(loglines: Series)

IPLoM does not support model fitting. Call parse() directly with the given input to get the log templates.

parse(loglines: Series) → Series

Parsing method to parse the raw log data.

Parameters:

loglines – The raw log data.

Returns:

The parsed log data.
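
AEL, Drain, and IPLoM are re-exported at the package level, so they can be imported directly from logai.algorithms.parsing_algo, while the parameter classes still live in their submodules. A minimal, illustrative sketch; the default parameter values are used purely for illustration.

    import pandas as pd

    from logai.algorithms.parsing_algo import AEL, Drain, IPLoM
    from logai.algorithms.parsing_algo.ael import AELParams
    from logai.algorithms.parsing_algo.drain import DrainParams
    from logai.algorithms.parsing_algo.iplom import IPLoMParams

    loglines = pd.Series(["User admin logged in", "User guest logged in"])

    # Each parser takes its own params object and exposes parse(loglines),
    # which accepts and returns a pandas Series.
    for parser in (AEL(AELParams()), Drain(DrainParams()), IPLoM(IPLoMParams())):
        print(parser.parse(loglines))
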