logai.algorithms.clustering_algo package
Submodules
logai.algorithms.clustering_algo.birch module
- class logai.algorithms.clustering_algo.birch.BirchAlgo(params: BirchParams)
Bases:
ClusteringAlgo
BIRCH algorithm for log clustering. This is a wrapper class for the Birch Clustering algorithm in scikit-learn https://scikit-learn.org/stable/modules/generated/sklearn.cluster.Birch.html.
- fit(log_features: DataFrame)
Trains a BIRCH model.
- Parameters:
log_features – The log features for training.
- predict(log_features: DataFrame) → Series
Predicts using the trained BIRCH model.
- Parameters:
log_features – The log features for inference.
- Returns:
A pandas series of log cluster labels.
- class logai.algorithms.clustering_algo.birch.BirchParams(branching_factor: int = 50, n_clusters: int | None = None, threshold: float = 1.5)
Bases:
Config
Parameters for the BIRCH clustering algorithm. For more details on the parameters, see https://scikit-learn.org/stable/modules/generated/sklearn.cluster.Birch.html.
- Parameters:
branching_factor – Maximum number of CF subclusters in each node.
n_clusters – Number of clusters after the final clustering step, which treats the subclusters from the leaves as new samples.
threshold – The radius of the subcluster obtained by merging a new sample with its closest subcluster must be less than this threshold; otherwise a new subcluster is started.
- branching_factor: int
- n_clusters: int
- threshold: float
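The threshold parameter described above can be illustrated with a small radius test. This is a minimal sketch in plain Python, not the scikit-learn or LogAI implementation: the helper names merged_radius and can_absorb are hypothetical, the radius is taken as the root mean squared distance of the points to their centroid, and the strict "less than" comparison simply follows the wording of the parameter description.

```python
import math


def merged_radius(points, new_point):
    """Radius of the subcluster formed by adding new_point to points.

    Assumption: radius = sqrt(mean squared distance to the centroid).
    """
    merged = points + [new_point]
    n = len(merged)
    dim = len(new_point)
    centroid = [sum(p[d] for p in merged) / n for d in range(dim)]
    mean_sq_dist = sum(
        sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in merged
    ) / n
    return math.sqrt(mean_sq_dist)


def can_absorb(points, new_point, threshold=1.5):
    """True if the merged subcluster stays under the threshold radius."""
    # Boundary handling (strict vs. non-strict) is an assumption here.
    return merged_radius(points, new_point) < threshold


subcluster = [[0.0, 0.0], [1.0, 0.0]]
near = can_absorb(subcluster, [0.5, 0.5])    # nearby sample: merge allowed
far = can_absorb(subcluster, [10.0, 10.0])   # distant sample: merge rejected
```

A smaller threshold therefore produces more, tighter subclusters, while a larger one lets each subcluster absorb more distant samples.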
logai.algorithms.clustering_algo.dbscan module
- class logai.algorithms.clustering_algo.dbscan.DbScanAlgo(params: DbScanParams)
Bases:
ClusteringAlgo
DBSCAN algorithm for log clustering. This is a wrapper class for the DBSCAN implementation in the scikit-learn library https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html.
- fit(log_features: DataFrame)
Trains a DBSCAN model.
- Parameters:
log_features – The log features as training data.
- predict(log_features: DataFrame) → Series
Predicts using the trained DBSCAN model.
- Parameters:
log_features – The log features for inference.
- Returns:
A pandas series of cluster labels.
- class logai.algorithms.clustering_algo.dbscan.DbScanParams(eps: float = 0.3, min_samples: int = 10, metric: str = 'euclidean', metric_params: object | None = None, algorithm: str = 'auto', leaf_size: int = 30, p: float | None = None, n_jobs: int | None = None)
Bases:
Config
Parameters for the DBSCAN-based clustering algorithm. For more details on the parameters, see https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html.
- Parameters:
eps – The maximum distance between two samples for one to be considered as in the neighborhood of the other.
min_samples – The number of samples (or total weight) in a neighborhood for a point to be considered as a core point.
metric – The metric to use when calculating distance between instances in a feature array.
metric_params – Additional keyword arguments for the metric function.
algorithm – The algorithm to be used by the NearestNeighbors module to compute pointwise distances and find nearest neighbors, one of {‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’}.
leaf_size – Leaf size passed to BallTree or cKDTree.
p – The power of the Minkowski metric to be used to calculate distance between points.
n_jobs – The number of parallel jobs to run.
- algorithm: str
- eps: float
- leaf_size: int
- metric: str
- metric_params: object
- min_samples: int
- n_jobs: int
- p: float
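The interaction between eps and min_samples described above can be sketched as a core-point test. This is an illustration in plain Python under stated assumptions, not the library's code: is_core_point is a hypothetical helper, the metric is fixed to Euclidean distance, a point is counted as part of its own neighborhood, and samples are unweighted.

```python
import math


def is_core_point(index, samples, eps=0.3, min_samples=10):
    """True if samples[index] has at least min_samples points within eps.

    Assumptions: Euclidean metric, the point counts as its own neighbor,
    and every sample has weight 1.
    """
    center = samples[index]
    neighbors = sum(1 for other in samples if math.dist(center, other) <= eps)
    return neighbors >= min_samples


samples = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
dense = is_core_point(0, samples, eps=0.3, min_samples=3)     # three points within eps
isolated = is_core_point(3, samples, eps=0.3, min_samples=3)  # only itself within eps
```

Raising eps or lowering min_samples makes more points qualify as core points, which tends to merge clusters and leave fewer samples unassigned.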
logai.algorithms.clustering_algo.kmeans module
- class logai.algorithms.clustering_algo.kmeans.KMeansAlgo(params: KMeansParams)
Bases:
ClusteringAlgo
K-means algorithm for log clustering. This is a wrapper class for the K-Means clustering method from the scikit-learn library https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html.
- fit(log_features: DataFrame)
Fits a K-means model.
- Parameters:
log_features – The log features for training.
- predict(log_features: DataFrame) → Series
Predicts using the trained K-means model.
- Parameters:
log_features – The log features for inference.
- Returns:
A pandas series of cluster labels.
- class logai.algorithms.clustering_algo.kmeans.KMeansParams(n_clusters: int = 8, init: str = 'k-means++', n_init: int = 10, max_iter: int = 300, tol: float = 0.0001, verbose: int = 0, random_state: int | None = None, copy_x: bool = True, algorithm: str = 'auto')
Bases:
Config
Parameters of the K-Means clustering algorithm. For more details on the parameters, see https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html.
- Parameters:
n_clusters – The number of clusters to form as well as the number of centroids to generate.
init – Method for initialization, one of {‘k-means++’, ‘random’}.
n_init – Number of times the k-means algorithm is run with different centroid seeds.
max_iter – Maximum number of iterations of the k-means algorithm for a single run.
tol – Relative tolerance with regard to the Frobenius norm of the difference in the cluster centers of two consecutive iterations to declare convergence.
verbose – Verbosity mode.
random_state – Determines random number generation for centroid initialization.
copy_x – If copy_x is True (default), then the original data is not modified. If False, the original data is modified, and put back before the function returns.
algorithm – K-means algorithm to use, one of {“lloyd”, “elkan”, “auto”, “full”}.
- algorithm: str
- copy_x: bool
- init: str
- max_iter: int
- n_clusters: int
- n_init: int
- random_state: int
- tol: float
- verbose: int
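The tol description above can be read as a stopping rule on how far the cluster centers moved between two consecutive iterations. The following is a minimal sketch in plain Python, not scikit-learn's or LogAI's code: center_shift and has_converged are hypothetical helpers, and the threshold is passed as an already-scaled absolute value because how the relative tolerance is scaled is not stated here.

```python
import math


def center_shift(old_centers, new_centers):
    """Frobenius norm of the difference between two sets of centers."""
    return math.sqrt(
        sum(
            (a - b) ** 2
            for old, new in zip(old_centers, new_centers)
            for a, b in zip(old, new)
        )
    )


def has_converged(old_centers, new_centers, scaled_tol):
    """True if the centers moved by no more than scaled_tol (assumed absolute)."""
    return center_shift(old_centers, new_centers) <= scaled_tol


previous = [[0.0, 0.0], [0.0, 0.0]]
current = [[3.0, 0.0], [0.0, 4.0]]
shift = center_shift(previous, current)  # sqrt(3**2 + 4**2)
```

In this reading, max_iter bounds the number of iterations when the shift never drops below the tolerance.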
Module contents
- class logai.algorithms.clustering_algo.BirchAlgo(params: BirchParams)
Bases:
ClusteringAlgo
BIRCH algorithm for log clustering. This is a wrapper class for the Birch Clustering algorithm in scikit-learn https://scikit-learn.org/stable/modules/generated/sklearn.cluster.Birch.html.
- fit(log_features: DataFrame)
Trains a BIRCH model.
- Parameters:
log_features – The log features for training.
- predict(log_features: DataFrame) → Series
Predicts using the trained BIRCH model.
- Parameters:
log_features – The log features for inference.
- Returns:
A pandas series of log cluster labels.
- class logai.algorithms.clustering_algo.DbScanAlgo(params: DbScanParams)
Bases:
ClusteringAlgo
DBSCAN algorithm for log clustering. This is a wrapper class for the DBSCAN implementation in the scikit-learn library https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html.
- fit(log_features: DataFrame)
Trains a DBSCAN model.
- Parameters:
log_features – The log features as training data.
- predict(log_features: DataFrame) → Series
Predicts using the trained DBSCAN model.
- Parameters:
log_features – The log features for inference.
- Returns:
A pandas series of cluster labels.
- class logai.algorithms.clustering_algo.KMeansAlgo(params: KMeansParams)
Bases:
ClusteringAlgo
K-means algorithm for log clustering. This is a wrapper class for the K-Means clustering method from the scikit-learn library https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html.
- fit(log_features: DataFrame)
Fits a K-means model.
- Parameters:
log_features – The log features for training.
- predict(log_features: DataFrame) → Series
Predicts using the trained K-means model.
- Parameters:
log_features – The log features for inference.
- Returns:
A pandas series of cluster labels.