merlion.post_process package
This package implements some simple rules to post-process the output of an
anomaly detection model. This includes rules for reshaping a sequence to follow
a standard normal distribution (merlion.post_process.calibrate), sparsifying
a sequence based on a threshold (merlion.post_process.threshold), and composing
together sequences of post-processing rules (merlion.post_process.sequence).
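merlion.post_process.base – Base class for post-processing rules in Merlion.
merlion.post_process.factory – Contains the PostRuleFactory.
merlion.post_process.sequence – Class to compose a sequence of post-rules into a single post-rule.
merlion.post_process.calibrate – Post-rule to transform anomaly scores to follow a standard normal distribution.
merlion.post_process.threshold – Rules that use a threshold to sparsify a sequence of anomaly scores.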
Submodules
merlion.post_process.base module
Base class for post-processing rules in Merlion.
- class merlion.post_process.base.PostRuleBase
Bases: object
Base class for post-processing rules in Merlion. These objects are primarily for post-processing the sequence of anomaly scores returned by anomaly detection models. All post-rules are callable objects, and they have a train() method which may accept additional implementation-specific keyword arguments.
- to_dict()
- classmethod from_dict(state_dict)
- abstract train(anomaly_scores)
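
To make the contract described above concrete, here is a minimal, standard-library-only sketch of a callable post-rule with `train()`, `to_dict()` and `from_dict()`. The names `SketchPostRule` and `ClipNegative` are hypothetical and are not part of Merlion. Plain Python lists stand in for the time series objects that Merlion post-rules operate on, and the dictionary layout is an assumption.

```python
from abc import ABC, abstractmethod


class SketchPostRule(ABC):
    """Hypothetical stand-in for a post-rule; not Merlion's PostRuleBase."""

    def to_dict(self):
        # Record the class name plus the instance's attributes.
        return {"name": type(self).__name__, **vars(self)}

    @classmethod
    def from_dict(cls, state_dict):
        # Drop the bookkeeping "name" key and pass the rest to __init__.
        params = {k: v for k, v in state_dict.items() if k != "name"}
        return cls(**params)

    @abstractmethod
    def train(self, anomaly_scores):
        """Fit any parameters of the rule to a sequence of scores."""

    @abstractmethod
    def __call__(self, anomaly_scores):
        """Apply the rule to a sequence of scores."""


class ClipNegative(SketchPostRule):
    """Toy rule: raise every score below `floor` up to `floor`."""

    def __init__(self, floor=0.0):
        self.floor = floor

    def train(self, anomaly_scores):
        # This toy rule has nothing to learn.
        return None

    def __call__(self, anomaly_scores):
        return [max(s, self.floor) for s in anomaly_scores]


rule = ClipNegative(floor=0.0)
rule.train([-1.0, 0.5, 2.0])
clipped = rule([-1.0, 0.5, 2.0])

# Round-trip the rule through its dictionary form.
restored = ClipNegative.from_dict(rule.to_dict())
```

The round trip through `to_dict()` and `from_dict()` is what would let a trained rule be saved alongside a model and rebuilt later.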
merlion.post_process.factory module
Contains the PostRuleFactory.
- class merlion.post_process.factory.PostRuleFactory
Bases: object
- classmethod get_post_rule_class(name)
- Return type
Type[PostRuleBase]
- classmethod create(name, **kwargs)
Uses the given kwargs to create a post-rule of the given name.
- Return type
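
The two class methods above follow a common pattern: look up a class by name, then instantiate it with keyword arguments. The sketch below illustrates that pattern with a plain dictionary registry. The registry, `ThresholdSketch` and `SketchPostRuleFactory` are hypothetical; how the real factory resolves names is not described here, so the lookup mechanism is an assumption.

```python
class ThresholdSketch:
    """Hypothetical post-rule used only to populate the registry."""

    def __init__(self, alm_threshold=None, abs_score=True):
        self.alm_threshold = alm_threshold
        self.abs_score = abs_score


class SketchPostRuleFactory:
    # Assumed name -> class mapping; the real lookup may work differently.
    _registry = {"Threshold": ThresholdSketch}

    @classmethod
    def get_post_rule_class(cls, name):
        try:
            return cls._registry[name]
        except KeyError:
            raise KeyError(f"Unknown post-rule name: {name!r}") from None

    @classmethod
    def create(cls, name, **kwargs):
        # Resolve the class, then forward the keyword arguments to it.
        return cls.get_post_rule_class(name)(**kwargs)


rule = SketchPostRuleFactory.create("Threshold", alm_threshold=3.0)
```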
merlion.post_process.sequence module
Class to compose a sequence of post-rules into a single post-rule.
- class merlion.post_process.sequence.PostRuleSequence(post_rules)
Bases: PostRuleBase
- train(anomaly_scores, **kwargs)
- Return type
- to_dict()
- classmethod from_dict(state_dict)
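
Composition of post-rules can be sketched as follows: each rule is trained on the output of the rules before it, and calling the sequence applies the rules in order. This is an illustration only. `SketchSequence`, `Scale` and `Cutoff` are hypothetical names, and forwarding only the keyword arguments that a rule's `train()` accepts is an assumption about how `**kwargs` might be handled.

```python
import inspect


class Scale:
    """Toy rule: multiply every score by a constant factor."""

    def __init__(self, factor=1.0):
        self.factor = factor

    def train(self, anomaly_scores):
        pass  # nothing to learn

    def __call__(self, anomaly_scores):
        return [s * self.factor for s in anomaly_scores]


class Cutoff:
    """Toy rule: zero every score whose magnitude is below a threshold."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold

    def train(self, anomaly_scores, quantile=None):
        if quantile is not None:
            ordered = sorted(abs(s) for s in anomaly_scores)
            idx = min(int(quantile * len(ordered)), len(ordered) - 1)
            self.threshold = ordered[idx]

    def __call__(self, anomaly_scores):
        return [s if abs(s) >= self.threshold else 0.0 for s in anomaly_scores]


class SketchSequence:
    def __init__(self, post_rules):
        self.post_rules = list(post_rules)

    def train(self, anomaly_scores, **kwargs):
        for rule in self.post_rules:
            # Forward only the keyword arguments this rule's train() accepts.
            accepted = inspect.signature(rule.train).parameters
            rule.train(anomaly_scores, **{k: v for k, v in kwargs.items() if k in accepted})
            # The next rule is trained on this rule's output.
            anomaly_scores = rule(anomaly_scores)

    def __call__(self, anomaly_scores):
        for rule in self.post_rules:
            anomaly_scores = rule(anomaly_scores)
        return anomaly_scores


seq = SketchSequence([Scale(2.0), Cutoff()])
seq.train([1.0, 2.0, 3.0, 4.0], quantile=0.5)
result = seq([1.0, 2.0, 3.0, 4.0])
```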
merlion.post_process.calibrate module
Post-rule to transform anomaly scores to follow a standard normal distribution.
- class merlion.post_process.calibrate.AnomScoreCalibrator(max_score, abs_score=True, anchors=None)
Bases: PostRuleBase
Learns a monotone function which reshapes an input sequence of anomaly scores to follow a standard normal distribution. This makes the anomaly scores from many diverse models interpretable as z-scores.
- Parameters
max_score (float) – the maximum possible uncalibrated score.
abs_score (bool) – whether to consider the absolute values of the anomaly scores, rather than the raw value.
anchors (Optional[List[Tuple[float, float]]]) – a sequence of (x, y) pairs mapping an uncalibrated anomaly score to a calibrated anomaly score. Optional, as this will be set by AnomScoreCalibrator.train.
- property anchors
- train(anomaly_scores, retrain_calibrator=False)
- Parameters
anomaly_scores (TimeSeries) – TimeSeries of raw anomaly scores that we will use to train the calibrator.
retrain_calibrator – Whether to re-train the calibrator on a new sequence of anomaly scores, if it has already been trained once. In practice, we find better results if this is False.
- Return type
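
The idea of learning (x, y) anchors and mapping scores through a monotone function can be sketched with the standard library alone. The version below is a simplification under stated assumptions: anchors are taken at a handful of empirical quantiles of the absolute scores, the target z-score for quantile q is the standard normal quantile at 0.5 + q/2 (because magnitudes are being calibrated), and values between anchors are linearly interpolated. The functions `fit_anchors` and `calibrate` are hypothetical, `max_score` is not modelled, and the interpolation scheme used by the library may differ.

```python
from bisect import bisect_right
from statistics import NormalDist


def fit_anchors(scores, quantiles=(0.25, 0.5, 0.75, 0.9, 0.99)):
    """Return (x, y) pairs mapping a raw |score| to a z-score magnitude."""
    ordered = sorted(abs(s) for s in scores)
    n = len(ordered)
    anchors = [(0.0, 0.0)]
    for q in quantiles:
        x = ordered[min(int(q * n), n - 1)]
        y = NormalDist().inv_cdf(0.5 + q / 2)
        # Keep x strictly increasing so the mapping stays monotone.
        if x > anchors[-1][0]:
            anchors.append((x, y))
    return anchors


def calibrate(score, anchors):
    """Map one raw score to a signed z-score by linear interpolation."""
    xs = [x for x, _ in anchors]
    mag = abs(score)
    if mag >= xs[-1]:
        # Beyond the last anchor: clamp to the last calibrated value.
        z = anchors[-1][1]
    else:
        i = bisect_right(xs, mag)  # xs[i - 1] <= mag < xs[i]
        (x0, y0), (x1, y1) = anchors[i - 1], anchors[i]
        z = y0 + (y1 - y0) * (mag - x0) / (x1 - x0)
    return z if score >= 0 else -z


raw = [float(k) for k in range(1, 101)]
anchors = fit_anchors(raw)
calibrated = [calibrate(s, anchors) for s in raw]
```

Because the mapping is monotone, the ranking of scores is preserved; only their scale changes.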
merlion.post_process.threshold module
Rules that use a threshold to sparsify a sequence of anomaly scores.
- class merlion.post_process.threshold.Threshold(alm_threshold=None, abs_score=True)
Bases: PostRuleBase
Zeroes all anomaly scores whose absolute value is less than the threshold.
- Parameters
alm_threshold (Optional[float]) – Float describing the anomaly threshold.
abs_score – If ‘True’, consider the absolute value instead of the raw value of score.
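
As a small illustration of the rule described above, the following sketch zeroes every score that falls below the threshold, with `abs_score` choosing between the absolute and the raw value. The function `apply_threshold` is hypothetical and works on a plain list rather than a TimeSeries.

```python
def apply_threshold(scores, alm_threshold, abs_score=True):
    """Zero every score whose (absolute) value is below the threshold."""
    out = []
    for s in scores:
        value = abs(s) if abs_score else s
        out.append(s if value >= alm_threshold else 0.0)
    return out


scores = [0.5, -3.0, 2.0, 4.0]
by_magnitude = apply_threshold(scores, 2.0)
by_raw_value = apply_threshold(scores, 2.0, abs_score=False)
```

With `abs_score=True` the large negative score survives; with `abs_score=False` it is zeroed because its raw value is below the threshold.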
- class TSADMetric(value)
Bases: Enum
Enumeration of evaluation metrics for time series anomaly detection. For each value, the name is the metric, and the value is a partial function of form f(ground_truth, predicted, **kwargs).
- MeanTimeToDetect = functools.partial(<function accumulate_tsad_score>, metric=<function TSADScoreAccumulator.mean_time_to_detect>)
- F1 = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.f1>, score_type=<ScoreType.RevisedPointAdjusted: 2>))
- Precision = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.precision>, score_type=<ScoreType.RevisedPointAdjusted: 2>))
- Recall = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.recall>, score_type=<ScoreType.RevisedPointAdjusted: 2>))
- PointwiseF1 = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.f1>, score_type=<ScoreType.Pointwise: 0>))
- PointwisePrecision = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.precision>, score_type=<ScoreType.Pointwise: 0>))
- PointwiseRecall = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.recall>, score_type=<ScoreType.Pointwise: 0>))
- PointAdjustedF1 = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.f1>, score_type=<ScoreType.PointAdjusted: 1>))
- PointAdjustedPrecision = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.precision>, score_type=<ScoreType.PointAdjusted: 1>))
- PointAdjustedRecall = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.recall>, score_type=<ScoreType.PointAdjusted: 1>))
- NABScore = functools.partial(<function accumulate_tsad_score>, metric=<function TSADScoreAccumulator.nab_score>)
- NABScoreLowFN = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.nab_score>, fn_weight=2.0))
- NABScoreLowFP = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.nab_score>, fp_weight=0.22))
- F2 = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.f_beta>, score_type=<ScoreType.RevisedPointAdjusted: 2>, beta=2.0))
- F5 = functools.partial(<function accumulate_tsad_score>, metric=functools.partial(<function TSADScoreAccumulator.f_beta>, score_type=<ScoreType.RevisedPointAdjusted: 2>, beta=5.0))
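
The members above are all built by fixing some arguments of a general scoring function with `functools.partial`. The sketch below shows that pattern for a simplified pointwise F-beta score on binary label lists. It is an illustration only: `pointwise_counts` and `f_beta` are hypothetical helpers, and they do not reproduce `accumulate_tsad_score` or the point-adjusted scoring variants.

```python
from functools import partial


def pointwise_counts(ground_truth, predicted):
    """Count true positives, false positives and false negatives point by point."""
    tp = sum(1 for g, p in zip(ground_truth, predicted) if g and p)
    fp = sum(1 for g, p in zip(ground_truth, predicted) if not g and p)
    fn = sum(1 for g, p in zip(ground_truth, predicted) if g and not p)
    return tp, fp, fn


def f_beta(ground_truth, predicted, beta=1.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    tp, fp, fn = pointwise_counts(ground_truth, predicted)
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denom if denom else 0.0


# Each metric is the same function with one argument fixed in advance.
f1_metric = partial(f_beta, beta=1.0)
f2_metric = partial(f_beta, beta=2.0)

ground_truth = [0, 1, 1, 0, 1, 1]
predicted = [0, 1, 0, 0, 1, 0]
f1 = f1_metric(ground_truth, predicted)
f2 = f2_metric(ground_truth, predicted)
```

In this example precision is perfect but half the anomalies are missed, so the recall-weighted F2 comes out lower than F1.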
- train(anomaly_scores, anomaly_labels=None, metric=None, unsup_quantile=None, max_early_sec=None, max_delay_sec=None, min_allowed_score=None)
If metric is available, generates candidate percentiles: [80, 90, 95, 98, 99, 99.5, 99.9]. Also considers the user-specified candidate percentile in unsup_quantile. Chooses the best percentile based on metric.
If metric is not provided, uses unsup_quantile to choose the threshold. If neither is provided, uses the default threshold specified in alm_threshold.
- Parameters
anomaly_scores (TimeSeries) – TimeSeries of anomaly scores returned by the model.
anomaly_labels (Optional[TimeSeries]) – TimeSeries of ground truth anomaly labels.
metric (Optional[TSADMetric]) – Metric used to evaluate the performance of candidate thresholds.
unsup_quantile (Optional[float]) – User-specified quantile to use as a candidate.
max_early_sec – Maximum allowed lead time (in seconds) from a detection to the start of an anomaly.
max_delay_sec – Maximum allowed delay (in seconds) from the start of an anomaly to a valid detection.
min_allowed_score – The minimum allowed value of the evaluation metric. If the best candidate threshold achieves a lower value of the metric, we retain the current (default) threshold.
- Return type
- to_simple_threshold()
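
The threshold selection logic described for `train()` can be sketched as below. This is a simplified illustration on plain lists, under several assumptions: `unsup_quantile` is treated as a fraction in [0, 1], candidates are scored with a pointwise F1 rather than a `TSADMetric`, and ties are broken in favour of the lowest percentile. The helpers `percentile`, `pointwise_f1` and `choose_threshold` are hypothetical, and the `max_early_sec` and `max_delay_sec` tolerances are not modelled.

```python
def percentile(values, pct):
    """Linearly interpolated percentile of a non-empty list (pct in [0, 100])."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def pointwise_f1(labels, flags):
    tp = sum(1 for l, f in zip(labels, flags) if l and f)
    fp = sum(1 for l, f in zip(labels, flags) if not l and f)
    fn = sum(1 for l, f in zip(labels, flags) if l and not f)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def choose_threshold(scores, labels=None, metric=None, unsup_quantile=None,
                     default=3.0, min_allowed_score=None):
    mags = [abs(s) for s in scores]
    if metric is not None and labels is not None:
        candidates = [80, 90, 95, 98, 99, 99.5, 99.9]
        if unsup_quantile is not None:
            candidates.append(100 * unsup_quantile)
        best_value, best_thres = None, default
        for pct in candidates:
            thres = percentile(mags, pct)
            value = metric(labels, [m >= thres for m in mags])
            if best_value is None or value > best_value:
                best_value, best_thres = value, thres
        # Keep the default if even the best candidate scores too low.
        if min_allowed_score is not None and best_value < min_allowed_score:
            return default
        return best_thres
    if unsup_quantile is not None:
        return percentile(mags, 100 * unsup_quantile)
    return default


scores = [0.1] * 90 + [5.0] * 10
labels = [0] * 90 + [1] * 10
supervised = choose_threshold(scores, labels, metric=pointwise_f1)
unsupervised = choose_threshold(scores, unsup_quantile=0.95)
fallback = choose_threshold(scores)
```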
- class merlion.post_process.threshold.AggregateAlarms(alm_threshold=None, abs_score=True, min_alm_in_window=2, alm_window_minutes=60, alm_suppress_minutes=120)
Bases: Threshold
Applies basic post-filtering to a time series of anomaly scores:
Determine which points are anomalies by comparing the absolute value of their anomaly score to alm_threshold.
Only fire an alarm when at least min_alm_in_window points (within a window of alm_window_minutes minutes) are labeled as anomalies.
If there is an alarm, then all alarms for the next alm_suppress_minutes minutes will be suppressed.
Return a time series of filtered anomaly scores, where the only non-zero values are the anomaly scores which were marked as alarms (and not suppressed).
- Parameters
alm_threshold (Optional[float]) – Float describing the anomaly threshold.
abs_score – If ‘True’, consider the absolute value instead of the raw value of score.
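
The filtering steps described for this class can be sketched as a single pass over timestamped scores. The function `aggregate_alarms` is hypothetical and makes several assumptions: timestamps are sorted and given in seconds, the window is a trailing window ending at the current point, and suppression is measured from the most recent alarm that actually fired.

```python
def aggregate_alarms(times, scores, alm_threshold, min_alm_in_window=2,
                     alm_window_minutes=60, alm_suppress_minutes=120):
    """Return filtered scores: non-zero only where an alarm fires."""
    window_secs = alm_window_minutes * 60
    suppress_secs = alm_suppress_minutes * 60
    recent = []        # timestamps of recent above-threshold points
    last_alarm = None  # timestamp of the most recent alarm that fired
    out = []
    for t, s in zip(times, scores):
        fired = False
        if abs(s) >= alm_threshold:
            # Keep only anomalous points inside the trailing window.
            recent = [r for r in recent if t - r <= window_secs] + [t]
            enough = len(recent) >= min_alm_in_window
            suppressed = last_alarm is not None and t - last_alarm < suppress_secs
            if enough and not suppressed:
                fired = True
                last_alarm = t
        out.append(s if fired else 0.0)
    return out


# One point every 30 minutes.
times = [1800 * k for k in range(9)]
scores = [5.0, 5.0, 0.0, 5.0, 5.0, 0.0, 0.0, 5.0, 5.0]
filtered = aggregate_alarms(times, scores, alm_threshold=3.0)
```

In this example the first alarm fires at the second point, the anomalous points that follow within two hours are suppressed, and the next alarm fires only once two anomalous points again fall inside one window.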
- property alm_threshold
- property abs_score
- property window_secs
- property suppress_secs
- filter(time_series)
- Return type
- train(anomaly_scores, anomaly_labels=None, metric=None, unsup_quantile=None, max_early_sec=None, max_delay_sec=None, min_allowed_score=None)
If metric is available, generates candidate percentiles: [80, 90, 95, 98, 99, 99.5, 99.9]. Also considers the user-specified candidate percentile in unsup_quantile. Chooses the best percentile based on metric.
If metric is not provided, uses unsup_quantile to choose the threshold. If neither is provided, uses the default threshold specified in alm_threshold.
- Parameters
anomaly_scores (TimeSeries) – TimeSeries of anomaly scores returned by the model.
anomaly_labels (Optional[TimeSeries]) – TimeSeries of ground truth anomaly labels.
metric (Optional[TSADMetric]) – Metric used to evaluate the performance of candidate thresholds.
unsup_quantile (Optional[float]) – User-specified quantile to use as a candidate.
max_early_sec – Maximum allowed lead time (in seconds) from a detection to the start of an anomaly.
max_delay_sec – Maximum allowed delay (in seconds) from the start of an anomaly to a valid detection.
min_allowed_score – The minimum allowed value of the evaluation metric. If the best candidate threshold achieves a lower value of the metric, we retain the current (default) threshold.
- Return type
- to_simple_threshold()
- merlion.post_process.threshold.get_adaptive_thres(x, hist_gap_thres=None, bin_sz=None)
Look for gaps in the histogram of anomaly scores (i.e. histogram bins with zero items inside them). Set the detection threshold to the average of the two bins on either side of a gap, such that the two bins have a gap of hist_gap_thres or more.
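
One possible reading of the histogram-gap idea is sketched below. The description leaves several details open, so everything here is an assumption: `bin_sz` is taken as the average number of points per bin, the gap is measured as the ratio between the edges of the two occupied bins that surround a run of empty bins, and the threshold is the midpoint of those two edges. The function `adaptive_threshold_sketch` is hypothetical and is not the library's implementation.

```python
def adaptive_threshold_sketch(scores, hist_gap_thres=1.2, bin_sz=10):
    """Return a threshold inside the first large histogram gap, or None."""
    mags = sorted(abs(s) for s in scores)
    n_bins = max(len(mags) // bin_sz, 1)
    lo, hi = mags[0], mags[-1]
    if hi == lo:
        return None  # all scores identical: no histogram to inspect
    width = (hi - lo) / n_bins

    # Build an equal-width histogram of the score magnitudes.
    counts = [0] * n_bins
    for m in mags:
        counts[min(int((m - lo) / width), n_bins - 1)] += 1

    occupied = [i for i, c in enumerate(counts) if c > 0]
    for left, right in zip(occupied, occupied[1:]):
        if right - left < 2:
            continue  # adjacent bins: no empty bin between them
        gap_start = lo + (left + 1) * width  # upper edge of the lower bin
        gap_end = lo + right * width         # lower edge of the upper bin
        if gap_start > 0 and gap_end / gap_start >= hist_gap_thres:
            return 0.5 * (gap_start + gap_end)
    return None


# Forty ordinary scores in [1, 2) plus two clear outliers.
scores = [1.0 + 0.025 * k for k in range(40)] + [9.5, 10.0]
threshold = adaptive_threshold_sketch(scores)
no_gap = adaptive_threshold_sketch([float(k) for k in range(1, 41)])
```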
- class merlion.post_process.threshold.AdaptiveThreshold(alm_threshold=None, abs_score=True, bin_sz=10, default_hist_gap_thres=1.2)
Bases: Threshold
Zeroes all anomaly scores whose absolute value is less than the threshold.
- Parameters
alm_threshold (Optional[float]) – Float describing the anomaly threshold.
abs_score – If ‘True’, consider the absolute value instead of the raw value of score.
- train(anomaly_scores, anomaly_labels=None, metric=None, unsup_quantile=None, max_early_sec=None, max_delay_sec=None, min_allowed_score=None)
If metric is available, generates candidate percentiles: [80, 90, 95, 98, 99, 99.5, 99.9]. Also considers the user-specified candidate percentile in unsup_quantile. Chooses the best percentile based on metric.
If metric is not provided, uses unsup_quantile to choose the threshold. If neither is provided, uses the default threshold specified in alm_threshold.
- Parameters
anomaly_scores (TimeSeries) – TimeSeries of anomaly scores returned by the model.
anomaly_labels (Optional[TimeSeries]) – TimeSeries of ground truth anomaly labels.
metric (Optional[TSADMetric]) – Metric used to evaluate the performance of candidate thresholds.
unsup_quantile (Optional[float]) – User-specified quantile to use as a candidate.
max_early_sec – Maximum allowed lead time (in seconds) from a detection to the start of an anomaly.
max_delay_sec – Maximum allowed delay (in seconds) from the start of an anomaly to a valid detection.
min_allowed_score – The minimum allowed value of the evaluation metric. If the best candidate threshold achieves a lower value of the metric, we retain the current (default) threshold.
- Return type
- class merlion.post_process.threshold.AdaptiveAggregateAlarms(alm_threshold=None, abs_score=True, min_alm_in_window=2, alm_window_minutes=60, alm_suppress_minutes=120, bin_sz=10, default_hist_gap_thres=1.2)
Bases: AggregateAlarms
Applies basic post-filtering to a time series of anomaly scores:
Determine which points are anomalies by comparing the absolute value of their anomaly score to alm_threshold.
Only fire an alarm when at least min_alm_in_window points (within a window of alm_window_minutes minutes) are labeled as anomalies.
If there is an alarm, then all alarms for the next alm_suppress_minutes minutes will be suppressed.
Return a time series of filtered anomaly scores, where the only non-zero values are the anomaly scores which were marked as alarms (and not suppressed).
- Parameters
alm_threshold (Optional[float]) – Float describing the anomaly threshold.
abs_score – If ‘True’, consider the absolute value instead of the raw value of score.
- threshold_class
alias of AdaptiveThreshold
- property bin_sz
- property default_hist_gap_thres