pyrca.outliers package
pyrca.outliers.base module
Base classes for all outlier detectors.
- class pyrca.outliers.base.BaseDetector
Bases: BaseModel
Base class for outlier (anomaly) detectors. This class should not be used directly; use a derived class instead.
- config_class = None
- train(df, **kwargs)
Train method for fitting the data.
- Parameters
df (DataFrame) – The training dataset.
kwargs – Parameters needed for training.
- Returns
- predict(df, **kwargs)
Predict anomalies/outliers given the input data.
- Parameters
df (DataFrame) – The test dataset.
kwargs – Parameters needed for prediction.
- Returns
The detection results.
- update_config(d)
Updates the configurations (hyperparameters).
- Parameters
d (Dict) – The new parameters in a dict format.
- class pyrca.outliers.base.DetectorMixin
Bases: object
Mixin that checks data quality before training.
- check_data_and_train(df, **kwargs)
Data quality check and training.
- Parameters
df – The training dataset.
kwargs – Additional parameters.
- class pyrca.outliers.base.DetectionResults(anomalous_metrics=<factory>, anomaly_timestamps=<factory>, anomaly_labels=<factory>, anomaly_info=<factory>)
Bases: object
The class for storing anomaly detection results.
- anomalous_metrics: list
- anomaly_timestamps: dict
- anomaly_labels: dict
- anomaly_info: dict
- to_dict()
Converts the anomaly detection results into a dict.
- Return type
dict
- classmethod merge(results)
Merges multiple detection results.
- Parameters
results (list) – A list of DetectionResults objects.
- Returns
The merged DetectionResults object.
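As a rough illustration of how such a results container might behave, the sketch below defines a hypothetical stand-in class, `DetectionResultsSketch`, with the four documented fields. The merge semantics shown (concatenating the metric lists and taking the union of the dicts) are an assumption for illustration, not a statement of the library's actual behavior.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class DetectionResultsSketch:
    """Hypothetical stand-in mirroring the four documented fields."""

    anomalous_metrics: list = field(default_factory=list)
    anomaly_timestamps: dict = field(default_factory=dict)
    anomaly_labels: dict = field(default_factory=dict)
    anomaly_info: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        # Convert the stored results into a plain dict.
        return asdict(self)

    @classmethod
    def merge(cls, results: list) -> "DetectionResultsSketch":
        # Assumed semantics: concatenate the metric lists and union the
        # dicts; later results overwrite earlier ones on key collisions.
        merged = cls()
        for r in results:
            merged.anomalous_metrics += r.anomalous_metrics
            merged.anomaly_timestamps.update(r.anomaly_timestamps)
            merged.anomaly_labels.update(r.anomaly_labels)
            merged.anomaly_info.update(r.anomaly_info)
        return merged


# Illustrative data only; metric names and timestamps are made up.
a = DetectionResultsSketch(anomalous_metrics=["cpu"], anomaly_timestamps={"cpu": [3, 4]})
b = DetectionResultsSketch(anomalous_metrics=["latency"], anomaly_timestamps={"latency": [7]})
merged = DetectionResultsSketch.merge([a, b])
print(merged.to_dict())
```

Merging per-detector results this way lets several detectors run independently while downstream code consumes one combined object.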
pyrca.outliers.stats module
The statistical-based anomaly detector.
- class pyrca.outliers.stats.StatsDetectorConfig(default_sigma=4.0, thres_win_size=5, thres_reduce_func='mean', score_win_size=3, anomaly_threshold=0.5, sigmas=None, manual_thresholds=None, custom_win_sizes=None, custom_anomaly_thresholds=None)
Bases: BaseConfig
The configuration class for the stats anomaly detector.
- Parameters
default_sigma (float) – The default sigma value for computing the threshold, e.g., abs(x - mean) > sigma * std.
thres_win_size (int) – The size of the smoothing window for computing bounds.
thres_reduce_func (str) – The reduction function for bounds: "mean" uses the mean value and standard deviation, "median" uses the median value and median absolute deviation.
score_win_size (int) – The default window size for computing anomaly scores.
anomaly_threshold (float) – The default anomaly detection threshold, i.e., the fraction of points in a small window (of size score_win_size) that must violate abs(x - mean) <= sigma * std for the timestamp to be considered an anomaly.
sigmas (Optional[dict]) – Variable-specific sigmas that override the default for certain variables.
manual_thresholds (Optional[dict]) – Manually specified lower and upper thresholds, e.g., {"lower": 0, "upper": 10}.
custom_win_sizes (Optional[dict]) – Variable-specific window sizes that override the default for certain variables.
custom_anomaly_thresholds (Optional[dict]) – Variable-specific anomaly detection thresholds that override the default for certain variables.
- default_sigma: float = 4.0
- thres_win_size: int = 5
- thres_reduce_func: str = 'mean'
- score_win_size: int = 3
- anomaly_threshold: float = 0.5
- sigmas: dict = None
- manual_thresholds: dict = None
- custom_win_sizes: dict = None
- custom_anomaly_thresholds: dict = None
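To make the difference between the "mean" and "median" reduction functions concrete, here is a minimal standard-library sketch. The helper names `compute_bounds` and `resolve_override` are hypothetical, and details such as using the population standard deviation and an unscaled median absolute deviation are assumptions; the library's own computation may differ.

```python
import statistics


def resolve_override(metric, default, overrides=None):
    """Return the per-metric value if one is given, else the default."""
    return (overrides or {}).get(metric, default)


def compute_bounds(values, sigma=4.0, reduce_func="mean"):
    """Return (lower, upper) as center -/+ sigma * spread."""
    values = list(values)
    if reduce_func == "mean":
        center = statistics.fmean(values)
        spread = statistics.pstdev(values)  # population std (assumption)
    elif reduce_func == "median":
        center = statistics.median(values)
        # Median absolute deviation around the median, unscaled (assumption).
        spread = statistics.median([abs(v - center) for v in values])
    else:
        raise ValueError(f"unsupported reduce_func: {reduce_func!r}")
    return center - sigma * spread, center + sigma * spread


# Illustrative data with one extreme value.
values = [9, 10, 10, 11, 100]
sigma = resolve_override("latency", 4.0, {"cpu": 3.0})  # falls back to 4.0
mean_bounds = compute_bounds(values, sigma, "mean")
median_bounds = compute_bounds(values, sigma, "median")
print(mean_bounds, median_bounds)
```

In this toy data the extreme value inflates the mean and standard deviation enough that it stays inside the "mean" bounds, whereas the "median" bounds remain tight around the bulk of the data.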
- class pyrca.outliers.stats.StatsDetector(config)
Bases: BaseDetector, DetectorMixin
The statistics-based anomaly detector. During training, it will estimate the mean and std of the training time series. During prediction/detection, for each timestamp t, it will consider a small window around t, and compute the anomaly score based on the percentage of the points in this small window such that abs(x - mean) > sigma * std. If this percentage is greater than a certain threshold, the timestamp t is considered as an anomaly.
- config_class
alias of StatsDetectorConfig
- to_dict()
Converts a trained detector into a Python dictionary.
- Return type
Dict
- classmethod from_dict(d)
Creates a StatsDetector from a Python dictionary.
- Return type
- update_bounds(d)
Updates the bounds manually.
- Parameters
d (Dict) – The bounds of certain metrics, e.g., {"metric A": (0, 1)}.
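The detection rule described for StatsDetector (estimate mean and std during training, then score each timestamp by the fraction of nearby points with abs(x - mean) > sigma * std) can be sketched for a single metric with the standard library alone. The function names `fit_mean_std` and `detect_anomalies` are hypothetical, and the centered window truncated at the series edges is an assumption; this is not the library's implementation.

```python
import statistics


def fit_mean_std(train_values):
    """'Training': estimate the mean and std of the training series."""
    return statistics.fmean(train_values), statistics.pstdev(train_values)


def detect_anomalies(values, mean, std, sigma=4.0, score_win_size=3,
                     anomaly_threshold=0.5):
    """Label each timestamp using a small window centered on it."""
    # Point-wise check: does x violate abs(x - mean) <= sigma * std?
    violations = [abs(x - mean) > sigma * std for x in values]
    half = score_win_size // 2
    labels = []
    for t in range(len(values)):
        # Centered window, truncated at the series edges (assumption).
        window = violations[max(0, t - half): t + half + 1]
        score = sum(window) / len(window)  # fraction of violating points
        labels.append(score > anomaly_threshold)
    return labels


# Illustrative data only.
train_series = [10, 11, 9, 10, 10, 11, 9, 10]
test_series = [10, 10, 50, 55, 10, 10, 30, 10]
mean, std = fit_mean_std(train_series)
labels = detect_anomalies(test_series, mean, std)
print(labels)
```

With these settings the two consecutive spikes are flagged, while the isolated spike near the end is not, because only one of the three points in its window violates the bound. This shows how the window-based score suppresses single-point noise.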