pyrca.outliers package

base

Base classes for all outliers.

stats

The statistics-based anomaly detector.

pyrca.outliers.base module

Base classes for all outliers.

class pyrca.outliers.base.BaseDetector

Bases: BaseModel

Base class for outlier (anomaly) detectors. This class should not be used directly; use a derived class instead.

config_class = None

train(df, **kwargs)

Train method for fitting the data.

Parameters
  • df (DataFrame) – The training dataset.

  • kwargs – Parameters needed for training.

Returns

predict(df, **kwargs)

Predict anomalies/outliers given the input data.

Parameters
  • df (DataFrame) – The test dataset.

  • kwargs – Parameters needed for prediction.

Returns

The detection results.

update_config(d)

Updates the configurations (hyperparameters).

Parameters

d (Dict) – The new parameters in a dict format.

class pyrca.outliers.base.DetectorMixin

Bases: object

Mixin that checks data quality and then trains the detector.

check_data_and_train(df, **kwargs)

Data quality check and training.

Parameters
  • df – The training dataset.

  • kwargs – Additional parameters.

class pyrca.outliers.base.DetectionResults(anomalous_metrics=<factory>, anomaly_timestamps=<factory>, anomaly_labels=<factory>, anomaly_info=<factory>)

Bases: object

The class for storing anomaly detection results.

anomalous_metrics: list
anomaly_timestamps: dict
anomaly_labels: dict
anomaly_info: dict

to_dict()

Converts the anomaly detection results into a dict.

Return type

dict

classmethod merge(results)

Merges multiple detection results.

Parameters

results (list) – A list of DetectionResults objects.

Returns

The merged DetectionResults object.
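The following is a minimal sketch of how a container with these four fields could be merged. `ResultsSketch` is a hypothetical stand-in written for illustration, not the library's class, and the merge semantics shown (concatenating the metric lists and updating the dicts in order) are an assumption; the documentation above only states that multiple results are merged.

```python
from dataclasses import dataclass, field


@dataclass
class ResultsSketch:
    """Hypothetical stand-in mirroring the four documented fields."""

    anomalous_metrics: list = field(default_factory=list)
    anomaly_timestamps: dict = field(default_factory=dict)
    anomaly_labels: dict = field(default_factory=dict)
    anomaly_info: dict = field(default_factory=dict)

    def to_dict(self):
        # Plain-dict view of the four fields.
        return {
            "anomalous_metrics": self.anomalous_metrics,
            "anomaly_timestamps": self.anomaly_timestamps,
            "anomaly_labels": self.anomaly_labels,
            "anomaly_info": self.anomaly_info,
        }

    @classmethod
    def merge(cls, results):
        # Assumed semantics: concatenate lists, update dicts in input order.
        merged = cls()
        for r in results:
            merged.anomalous_metrics += r.anomalous_metrics
            merged.anomaly_timestamps.update(r.anomaly_timestamps)
            merged.anomaly_labels.update(r.anomaly_labels)
            merged.anomaly_info.update(r.anomaly_info)
        return merged


a = ResultsSketch(anomalous_metrics=["cpu"], anomaly_timestamps={"cpu": [3, 4]})
b = ResultsSketch(anomalous_metrics=["mem"], anomaly_timestamps={"mem": [7]})
merged = ResultsSketch.merge([a, b])
```

Under these assumptions, if two inputs report the same metric name, the later input's dict entries overwrite the earlier ones.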

pyrca.outliers.stats module

The statistics-based anomaly detector.

class pyrca.outliers.stats.StatsDetectorConfig(default_sigma=4.0, thres_win_size=5, thres_reduce_func='mean', score_win_size=3, anomaly_threshold=0.5, sigmas=None, manual_thresholds=None, custom_win_sizes=None, custom_anomaly_thresholds=None)

Bases: BaseConfig

The configuration class for the stats anomaly detector.

Parameters
  • default_sigma (float) – The default sigma value for computing the threshold, e.g., abs(x - mean) > sigma * std.

  • thres_win_size (int) – The size of the smoothing window for computing bounds.

  • thres_reduce_func (str) – The reduction function for bounds, i.e., “mean” uses the mean value and standard deviation, “median” uses the median value and median absolute deviation.

  • score_win_size (int) – The default window size for computing anomaly scores.

  • anomaly_threshold (float) – The default anomaly detection threshold, i.e., the fraction of points in a small window (of size score_win_size) that must violate abs(x - mean) <= sigma * std for the window to be considered an anomaly.

  • sigmas (Optional[dict]) – Variable-specific sigmas other than default for certain variables.

  • manual_thresholds (Optional[dict]) – Manually specified lower and upper thresholds, e.g., {“lower”: 0, “upper”: 10}.

  • custom_win_sizes (Optional[dict]) – Variable-specific window sizes other than default for certain variables.

  • custom_anomaly_thresholds (Optional[dict]) – Variable-specific anomaly detection thresholds other than default for certain variables.
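To make the two `thres_reduce_func` options concrete, here is a rough sketch of how lower and upper bounds might be derived from a series. The helper `compute_bounds` is hypothetical and not part of the library. It uses the population standard deviation and an unscaled median absolute deviation, and it skips the `thres_win_size` smoothing step entirely; all three are assumptions.

```python
import statistics


def compute_bounds(values, sigma=4.0, reduce_func="mean"):
    """Hypothetical helper: return (lower, upper) as center -/+ sigma * spread."""
    if reduce_func == "mean":
        center = statistics.fmean(values)
        spread = statistics.pstdev(values)  # assumption: population std
    elif reduce_func == "median":
        center = statistics.median(values)
        # Median absolute deviation, unscaled (assumption).
        spread = statistics.median([abs(v - center) for v in values])
    else:
        raise ValueError(f"unsupported reduce_func: {reduce_func!r}")
    return center - sigma * spread, center + sigma * spread


values = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_bounds = compute_bounds(values, sigma=2.0, reduce_func="mean")
median_bounds = compute_bounds(values, sigma=2.0, reduce_func="median")
```

The median variant is less sensitive to extreme values in the training data, which is the usual reason to prefer it when the training series may already contain anomalies.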

default_sigma: float = 4.0
thres_win_size: int = 5
thres_reduce_func: str = 'mean'
score_win_size: int = 3
anomaly_threshold: float = 0.5
sigmas: dict = None
manual_thresholds: dict = None
custom_win_sizes: dict = None
custom_anomaly_thresholds: dict = None

class pyrca.outliers.stats.StatsDetector(config)

Bases: BaseDetector, DetectorMixin

The statistics-based anomaly detector. During training, it estimates the mean and standard deviation (std) of the training time series. During prediction/detection, for each timestamp t, it considers a small window around t and computes an anomaly score as the fraction of points in this window for which abs(x - mean) > sigma * std. If this fraction is greater than the configured threshold, timestamp t is considered an anomaly.
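The windowed scoring described above can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the library's implementation: the function `detect` is hypothetical, and the choices of a centered window that is truncated at the series edges and of precomputed `mean` and `std` inputs are assumptions.

```python
def detect(values, mean, std, sigma=4.0, score_win_size=3, anomaly_threshold=0.5):
    """Hypothetical sketch: label each timestamp as anomalous or not."""
    # Mark each point that falls outside mean +/- sigma * std.
    violations = [abs(x - mean) > sigma * std for x in values]
    half = score_win_size // 2
    labels = []
    for t in range(len(values)):
        # Assumption: window centered on t, truncated at the edges.
        window = violations[max(0, t - half): t + half + 1]
        score = sum(window) / len(window)  # fraction of violating points
        labels.append(score > anomaly_threshold)
    return labels


# Two consecutive out-of-range points are flagged ...
sustained = detect([10, 10, 10, 50, 55, 10, 10], mean=10.0, std=1.0)
# ... while a single isolated spike is not.
isolated = detect([10, 10, 50, 10, 10], mean=10.0, std=1.0)
```

With a window of size 3 and a threshold of 0.5, at least two of the three points in a window must violate the bound, so isolated single-point spikes are suppressed in this sketch.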

config_class

alias of StatsDetectorConfig

to_dict()

Converts a trained detector into a python dictionary.

Return type

Dict

classmethod from_dict(d)

Creates a StatsDetector from a python dictionary.

Return type

StatsDetector

update_bounds(d)

Updates the bounds manually.

Parameters

d (Dict) – The bounds of certain metrics, e.g., {“metric A”: (0, 1)}.