merlion.models.anomaly package
Contains all anomaly detection models. Forecaster-based anomaly detection models may be found in merlion.models.anomaly.forecast_based. Change-point detection models may be found in merlion.models.anomaly.change_point.
For anomaly detection, we define an abstract DetectorBase class which inherits from ModelBase and supports the following interface, in addition to model.save and DetectorClass.load defined for ModelBase:
model = DetectorClass(config)
  initialization with a model-specific config
  configs contain:
    a (potentially trainable) data pre-processing transform from merlion.transform; note that model.transform is a property which refers to model.config.transform
    a (potentially trainable) post-processing rule from merlion.post_process; note that model.post_rule is a property which refers to model.config.post_rule. In general, this post-rule will have two stages: calibration and thresholding.
    booleans enable_calibrator and enable_threshold (both defaulting to True) indicating whether to enable calibration and thresholding in the post-rule.
    model-specific hyperparameters
model.get_anomaly_score(time_series, time_series_prev=None)
  returns a time series of anomaly scores for each timestamp in time_series
  time_series_prev (optional): the most recent context, only used for some models. If not provided, the training data is used as the context instead.
model.get_anomaly_label(time_series, time_series_prev=None)
  returns a time series of post-processed anomaly scores for each timestamp in time_series. These scores are calibrated to correspond to z-scores if enable_calibrator is True, and they have also been filtered by a thresholding rule (model.threshold) if enable_threshold is True. threshold is specified manually in the config (though it may be modified by DetectorBase.train).
  time_series_prev (optional): the most recent context, only used for some models. If not provided, the training data is used as the context instead.
model.train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
  trains the model on the time series train_data
  anomaly_labels (optional): a time series aligned with train_data, which indicates whether each time stamp is anomalous
  train_config (optional): extra configuration describing how the model should be trained (e.g. learning rate for the LSTMDetector). Not used for all models. Class-level default provided for models which do use it.
  post_rule_train_config: extra configuration describing how to train the model’s post-rule. Class-level default is provided for all models.
  returns a time series of anomaly scores produced by the model on train_data.
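The two post-rule stages described above (calibration, then thresholding) can be illustrated with a small standard-library sketch. It is not Merlion's implementation: SketchCalibrator, sketch_threshold and sketch_post_rule are hypothetical names, the calibrator is assumed to map raw scores to z-scores through an empirical CDF, and the threshold is assumed to zero out scores whose magnitude falls below a cutoff.

```python
from bisect import bisect_right
from statistics import NormalDist


class SketchCalibrator:
    """Hypothetical stand-in for a calibrator: maps raw scores to z-scores."""

    def train(self, raw_scores):
        # Keep the sorted training scores as an empirical distribution.
        self.sorted_scores = sorted(raw_scores)
        return self

    def __call__(self, raw_score):
        n = len(self.sorted_scores)
        # Empirical CDF value, kept strictly inside (0, 1) so inv_cdf is defined.
        rank = bisect_right(self.sorted_scores, raw_score)
        quantile = (rank + 0.5) / (n + 1)
        return NormalDist().inv_cdf(quantile)


def sketch_threshold(z_score, alm_threshold=3.0):
    # Scores whose magnitude is below the cutoff are suppressed to 0.
    return z_score if abs(z_score) >= alm_threshold else 0.0


def sketch_post_rule(raw_scores, calibrator, enable_calibrator=True,
                     enable_threshold=True, alm_threshold=3.0):
    """Apply calibration, then thresholding, honouring the two enable flags."""
    out = []
    for s in raw_scores:
        if enable_calibrator:
            s = calibrator(s)
        if enable_threshold:
            s = sketch_threshold(s, alm_threshold)
        out.append(s)
    return out


# Usage: calibrate on 1000 raw training scores, then post-process new scores.
train_scores = [float(i) for i in range(1000)]
calibrator = SketchCalibrator().train(train_scores)
labels = sketch_post_rule([500.0, 999.0, 5000.0], calibrator)
```

In this sketch a typical raw score (500.0) is filtered to 0, while scores at or beyond the top of the training range survive thresholding. This mirrors the difference between the raw output of get_anomaly_score and the post-processed output of get_anomaly_label.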
Base class for anomaly detectors. 

Dynamic Baseline anomaly detection model for time series with daily, weekly or monthly trends. 

Window Statistics anomaly detection model for data with weekly seasonality. 

The classic isolation forest model for anomaly detection. 

Wrapper around AWS's Random Cut Forest anomaly detection model. 

Spectral Residual algorithm for anomaly detection 

Simple static thresholding model for anomaly detection. 

Multiple z-score model (static thresholding at multiple time scales).

The autoencoder-based anomaly detector for multivariate time series

Deep autoencoding Gaussian mixture model for anomaly detection (DAGMM)

The LSTM-encoder-decoder-based anomaly detector for multivariate time series

The VAE-based anomaly detector for multivariate time series

Deep Point Anomaly Detector algorithm. 
Subpackages
 merlion.models.anomaly.forecast_based package
 Submodules
 merlion.models.anomaly.forecast_based.base module
 merlion.models.anomaly.forecast_based.arima module
 merlion.models.anomaly.forecast_based.sarima module
 merlion.models.anomaly.forecast_based.ets module
 merlion.models.anomaly.forecast_based.prophet module
 merlion.models.anomaly.forecast_based.lstm module
 merlion.models.anomaly.forecast_based.mses module
 merlion.models.anomaly.change_point package
Submodules
merlion.models.anomaly.base module
Base class for anomaly detectors.
 class merlion.models.anomaly.base.DetectorConfig(max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, transform=None, normalize=None, **kwargs)
Bases:
Config
Config object used to define an anomaly detection model.
Base class of the object used to configure an anomaly detection model.
 Parameters
max_score (float) – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
normalize – Pre-trained normalization transformation (optional).
 calibrator: AnomScoreCalibrator = None
 property post_rule
 Returns
The full post-processing rule. Includes calibration if enable_calibrator is True, followed by thresholding if enable_threshold is True.
 classmethod from_dict(config_dict, return_unused_kwargs=False, calibrator=None, **kwargs)
Constructs a Config from a Python dictionary of parameters.
 Parameters
config_dict (Dict[str, Any]) – dict that will be used to instantiate this object.
return_unused_kwargs – whether to return any unused keyword args.
dim – the dimension of the time series. Handled as a special case.
kwargs – any additional parameters to set (overriding config_dict).
 Returns
Config object initialized from the dict.
 class merlion.models.anomaly.base.NoCalibrationDetectorConfig(enable_calibrator=False, max_score: float = 1000, threshold=None, enable_threshold=True, transform: TransformBase = None, **kwargs)
Bases:
DetectorConfig
Abstract config object for an anomaly detection model that should never perform anomaly score calibration.
Base class of the object used to configure an anomaly detection model.
 Parameters
enable_calibrator – False because this config assumes calibrated outputs from the model.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
 property calibrator
 Returns
None
 property enable_calibrator
 Returns
False
 class merlion.models.anomaly.base.DetectorBase(config)
Bases:
ModelBase
Base class for an anomaly detection model.
 Parameters
config (DetectorConfig) – model configuration
 config_class
alias of
DetectorConfig
 property threshold
 property calibrator
 property post_rule
 abstract train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 train_post_rule(anomaly_scores, anomaly_labels=None, post_rule_train_config=None)
 abstract get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores
 get_anomaly_label(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores, processed by any relevant post-rules (calibration and/or thresholding).
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores, filtered by the model’s post-rule
 get_figure(time_series, time_series_prev=None, *, filter_scores=True, plot_time_series_prev=False, fig=None)
 Parameters
time_series (TimeSeries) – The TimeSeries we wish to plot & predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series model. Otherwise, we assume that time_series immediately follows the training data.
filter_scores – whether to filter the anomaly scores by the post-rule before plotting them.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
fig (Optional[Figure]) – a Figure we might want to add anomaly scores onto.
 Return type
Figure
 Returns
a Figure of the model’s anomaly score predictions.
 plot_anomaly(time_series, time_series_prev=None, *, filter_scores=True, plot_time_series_prev=False, figsize=(1000, 600), ax=None)
Plots the time series in matplotlib as a line graph, with points in the series overlaid as points color-coded to indicate their severity as anomalies.
 Parameters
time_series (TimeSeries) – The TimeSeries we wish to plot & predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. Plotted as context if given.
filter_scores – whether to filter the anomaly scores by the post-rule before plotting them.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
figsize – figure size in pixels
ax – matplotlib axes to add this plot to
 Returns
matplotlib figure & axes
 plot_anomaly_plotly(time_series, time_series_prev=None, *, filter_scores=True, plot_time_series_prev=False, figsize=None)
Plots the time series in plotly as a line graph, with points in the series overlaid as points color-coded to indicate their severity as anomalies.
 Parameters
time_series (TimeSeries) – The TimeSeries we wish to plot & predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. Plotted as context if given.
filter_scores – whether to filter the anomaly scores by the post-rule before plotting them.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
figsize – figure size in pixels
 Returns
plotly figure
merlion.models.anomaly.dbl module
Dynamic Baseline anomaly detection model for time series with daily, weekly or monthly trends.
 class merlion.models.anomaly.dbl.DynamicBaselineConfig(fixed_period=None, train_window=None, wind_sz='1h', trends=None, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, transform=None, **kwargs)
Bases:
DetectorConfig
Configuration class for DynamicBaseline.
Base class of the object used to configure an anomaly detection model.
 Parameters
fixed_period (Optional[Tuple[str, str]]) – (t0, tf); Train the model on all datapoints occurring between t0 and tf (inclusive).
train_window (Optional[str]) – A string representing a duration of time to serve as the scope for a rolling dynamic baseline model.
wind_sz (str) – The window size in minutes to bucket times of day. This parameter is only applied if a daily trend is one of the trends used.
trends (Optional[List[str]]) – The list of trends to use. Supported trends are “daily”, “weekly” and “monthly”.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
 property fixed_period
 property trends
 determine_train_window()
 to_dict(_skipped_keys=None)
 Returns
dict with keyword arguments used to initialize the config class.
 class merlion.models.anomaly.dbl.DynamicBaseline(config)
Bases:
DetectorBase
Dynamic baseline-based anomaly detector.
Detects anomalies by comparing data to historical data that has occurred in the same window of time, as defined by any combination of time of day, day of week, or day of month.
A DBL model can have a fixed period or a dynamic rolling period. A fixed period model trains its baselines exclusively on datapoints occurring in the fixed period, while a rolling period model trains continually on the most recent datapoints within its train_window.
 Parameters
config (DynamicBaselineConfig) – model configuration
 config_class
alias of
DynamicBaselineConfig
 property train_window
 property fixed_period
 property has_fixed_period
 property data: UnivariateTimeSeries
 Return type
 get_relevant(data)
Returns the subset of the data that should be used for training or updating.
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
 Parameters
train_data (TimeSeries) – train_data[t] = (timestamp_t, value_t)
anomaly_labels (Optional[TimeSeries]) – anomaly_labels[i] = (timestamp_i, is_anom(timestamp_i))
train_config – unused
post_rule_train_config – config to train the post-rule
 Return type
TimeSeries
 Returns
anomaly scores of training data
 get_anomaly_score(time_series, time_series_prev=None)
 Parameters
time_series (TimeSeries) – a list of (timestamps, score) pairs
time_series_prev (Optional[TimeSeries]) – ignored
 Return type
TimeSeries
 get_baseline(time_stamps)
Returns the dynamic baselines corresponding to the time stamps.
 Parameters
time_stamps (List[float]) – a list of timestamps
 Return type
 check_dim(time_series)
 update(new_data)
 class merlion.models.anomaly.dbl.Trend(value)
Bases:
Enum
Enumeration of the supported trends.
 daily = 1
 weekly = 2
 monthly = 3
 class merlion.models.anomaly.dbl.Segment(key)
Bases:
object
Class representing a segment. The class maintains a mean (baseline) along with a variance so that a z-score can be computed.
 add(x)
 drop(x)
 score(x)
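As an illustration of how a segment can maintain a mean and a variance under add and drop and then score a value, here is a minimal sketch. SketchSegment is a hypothetical class, not the library's Segment. It assumes running sums of values and squared values, which is simple but less numerically stable than a Welford-style update.

```python
import math


class SketchSegment:
    """Hypothetical running-statistics bucket supporting add, drop and score."""

    def __init__(self, key):
        self.key = key
        self.n = 0
        self.total = 0.0     # sum of the values currently in the segment
        self.total_sq = 0.0  # sum of their squares

    def add(self, x):
        self.n += 1
        self.total += x
        self.total_sq += x * x

    def drop(self, x):
        # Remove a value that was added earlier (e.g. it left a rolling window).
        self.n -= 1
        self.total -= x
        self.total_sq -= x * x

    @property
    def mean(self):
        return self.total / self.n if self.n else float("nan")

    @property
    def std(self):
        if self.n < 2:
            return float("nan")
        var = (self.total_sq - self.total ** 2 / self.n) / (self.n - 1)
        return math.sqrt(max(var, 0.0))

    def score(self, x):
        # z-score of x against the baseline; NaN until there is enough data.
        std = self.std
        if math.isnan(std) or std == 0.0:
            return float("nan")
        return (x - self.mean) / std


# Usage: five observations for one bucket, then score a new value.
seg = SketchSegment(key=("monday", 18))
for v in [10.0, 12.0, 11.0, 9.0, 13.0]:
    seg.add(v)
z = seg.score(20.0)
```

Storing sums rather than the raw values keeps drop cheap, which matters for a rolling training window where old points must be removed as new ones arrive.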
 class merlion.models.anomaly.dbl.Segmenter(trends, wind_sz)
Bases:
object
Class for managing the segments that belong to a DynamicBaseline model.
 Parameters
trends (List[Trend]) – A list of trend types to create segments based on.
wind_sz (str) – The window size in minutes to bucket times of day. Only used if a daily trend is one of the trends used.
 day_delta = Timedelta('1 days 00:00:00')
 hour_delta = Timedelta('0 days 01:00:00')
 min_delta = Timedelta('0 days 00:01:00')
 zero_delta = Timedelta('0 days 00:00:00')
 reset()
 property wind_delta
 property trends
 property trend
 window_key(t)
 weekday_key(t)
 day_key(t)
 segment_key(timestamp)
 add(t, x)
 drop(t, x)
 score(t, x)
 get_baseline(t)
 Return type
Tuple[float, float]
merlion.models.anomaly.windstats module
Window Statistics anomaly detection model for data with weekly seasonality.
 class merlion.models.anomaly.windstats.WindStatsConfig(wind_sz=30, max_day=4, max_score: float = 1000, threshold=None, enable_calibrator=True, enable_threshold=True, transform: TransformBase = None, **kwargs)
Bases:
DetectorConfig
Config class for WindStats.
Base class of the object used to configure an anomaly detection model.
 Parameters
wind_sz – the window size in minutes, default is 30 minute window
max_day – maximum number of week days stored in memory (only mean and std of each window are stored). Here, the days are first bucketed by weekday and then by window id.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
 class merlion.models.anomaly.windstats.WindStats(config=None)
Bases:
DetectorBase
Sliding Window Statistics based Anomaly Detector. This detector assumes the time series comes with a weekly seasonality. It divides the week into buckets of the specified size (in minutes). For a given (t, v) it computes an anomaly score by comparing the current value v against the historical values (mean and standard deviation) for that window of time. Note that if multiple matches (specified by the parameter max_day) can be found in history with the same weekday and same time window, then the minimum of the scores is returned.
config.wind_sz: the window size in minutes; the default is a 30-minute window.
config.max_day: maximum number of week days stored in memory (only the mean and std of each window are stored). Here, the days are first bucketed by weekday and then by window id.
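The bucketing idea described above can be sketched with the standard library alone. This is an illustrative simplification, not the WindStats implementation: bucket_key, fit_buckets and window_score are hypothetical helpers, each (weekday, window id) bucket keeps a single mean and standard deviation, and the max_day logic that takes the minimum over several historical matches is left out.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, stdev


def bucket_key(ts, wind_sz=30):
    """Map a datetime to (weekday, window id) with windows of wind_sz minutes."""
    return ts.weekday(), (ts.hour * 60 + ts.minute) // wind_sz


def fit_buckets(history, wind_sz=30):
    """history: iterable of (datetime, value). Returns {key: (mean, std)}."""
    values = defaultdict(list)
    for ts, v in history:
        values[bucket_key(ts, wind_sz)].append(v)
    # A sample standard deviation needs at least two observations per bucket.
    return {k: (mean(vs), stdev(vs)) for k, vs in values.items() if len(vs) >= 2}


def window_score(stats, ts, v, wind_sz=30):
    """z-score of v against its bucket; 0.0 if the bucket is unseen or constant."""
    mu, sigma = stats.get(bucket_key(ts, wind_sz), (None, None))
    if mu is None or sigma == 0:
        return 0.0
    return (v - mu) / sigma


# Usage: four consecutive Mondays at 09:00, then score the fifth Monday at 09:10.
history = [
    (datetime(2024, 1, 1, 9, 0) + timedelta(weeks=i), v)
    for i, v in enumerate([10.0, 12.0, 11.0, 9.0])
]
stats = fit_buckets(history)
score = window_score(stats, datetime(2024, 1, 29, 9, 10), 20.0)
```

Both 09:00 and 09:10 fall in the same 30-minute window, so the new value is compared against the four historical Monday observations for that window.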
 config_class
alias of
WindStatsConfig
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
merlion.models.anomaly.isolation_forest module
The classic isolation forest model for anomaly detection.
 class merlion.models.anomaly.isolation_forest.IsolationForestConfig(max_n_samples=None, n_estimators=100, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, transform=None, **kwargs)
Bases:
DetectorConfig
Configuration class for IsolationForest.
Base class of the object used to configure an anomaly detection model.
 Parameters
max_n_samples (Optional[int]) – Maximum number of samples to allow the isolation forest to train on. Specify None to use all samples in the training data.
n_estimators (int) – number of trees in the isolation forest.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
 class merlion.models.anomaly.isolation_forest.IsolationForest(config)
Bases:
DetectorBase
The classic isolation forest algorithm, proposed in Liu et al. 2008.
 Parameters
config (IsolationForestConfig) – model configuration
 config_class
alias of
IsolationForestConfig
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores
merlion.models.anomaly.random_cut_forest module
Wrapper around AWS’s Random Cut Forest anomaly detection model.
 class merlion.models.anomaly.random_cut_forest.JVMSingleton
Bases:
object
 class merlion.models.anomaly.random_cut_forest.RandomCutForestConfig(n_estimators=100, parallel=False, seed=None, max_n_samples=512, thread_pool_size=1, online_updates=False, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, transform=None, **kwargs)
Bases:
DetectorConfig
Configuration class for RandomCutForest. Refer to https://github.com/aws/random-cut-forest-by-aws/tree/main/Java for further documentation and defaults of the Java class.
Base class of the object used to configure an anomaly detection model.
 Parameters
n_estimators (int) – The number of trees in this forest.
parallel (bool) – If true, then the forest will create an internal thread pool. Forest updates and traversals will be submitted to this thread pool, and individual trees will be updated or traversed in parallel. For larger shingle sizes, dimensions, and number of trees, parallelization may improve throughput. We recommend users benchmark against their target use case.
seed (Optional[int]) – the random seed
max_n_samples (int) – The number of samples retained by stream samplers in this forest.
thread_pool_size (int) – The number of threads to use in the internal thread pool.
online_updates (bool) – Whether to update the model while using it to evaluate new data.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
 property java_params
 class merlion.models.anomaly.random_cut_forest.RandomCutForest(config)
Bases:
DetectorBase
The random cut forest is a refinement of the classic isolation forest algorithm. It was proposed in Guha et al. 2016.
 Parameters
config (RandomCutForestConfig) – model configuration
 config_class
alias of
RandomCutForestConfig
 property online_updates: bool
 Return type
bool
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores
merlion.models.anomaly.spectral_residual module
Spectral Residual algorithm for anomaly detection
 class merlion.models.anomaly.spectral_residual.SpectralResidualConfig(local_wind_sz=21, q=3, estimated_points=5, predicting_points=5, target_seq_index=None, max_score: float = 1000, threshold=None, enable_calibrator=True, enable_threshold=True, transform: TransformBase = None, **kwargs)
Bases:
DetectorConfig
Config class for SpectralResidual anomaly detector.
Base class of the object used to configure an anomaly detection model.
 Parameters
local_wind_sz – Number of previous saliency points to consider when computing the anomaly score
q – Window size of local frequency average computations
estimated_points – Number of padding points to add to the timeseries for saliency map calculations.
predicting_points – Number of points to consider when computing gradient for padding points
target_seq_index – Index of the univariate whose anomalies we want to detect.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
The Saliency Map is computed as follows:
\[\begin{split}R(f) &= \log(A(\mathscr{F}(\textbf{x}))) - \left(\frac{1}{q}\right)_{1 \times q} * A(\mathscr{F}(\textbf{x})) \\ S_m &= \mathscr{F}^{-1} (R(f))\end{split}\]
where \(*\) is the convolution operator, and \(\mathscr{F}\) is the Fourier Transform. The anomaly scores then are computed as:
\[S(x) = \frac{S(x) - \overline{S(\textbf{x})}}{\overline{S(\textbf{x})}}\]
where \(\textbf{x}\) are the last local_wind_sz points in the timeseries.
The estimated_points and predicting_points parameters are used to pad the end of the timeseries with reasonable values. This is done so that the later points in the timeseries are in the middle of averaging windows rather than at the end.
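The saliency-map and scoring formulas can be made concrete with a dependency-free sketch. It rests on several assumptions that are not taken from the library: following Ren et al. (2019), the local average is applied to the log-amplitude spectrum and the residual is recombined with the original phase before inverting; the averaging window is truncated at the spectrum edges; a naive O(n^2) DFT replaces an FFT; and the padding controlled by estimated_points and predicting_points is omitted. All function names are hypothetical.

```python
import cmath
import math


def dft(xs, inverse=False):
    """Naive O(n^2) discrete Fourier transform (kept dependency-free)."""
    n = len(xs)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t, x in enumerate(xs))
        out.append(acc / n if inverse else acc)
    return out


def moving_average(values, q):
    """Centered average over up to q neighbours, truncated at the edges."""
    half = q // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def saliency_map(xs, q=3, eps=1e-8):
    spectrum = dft(xs)
    log_amp = [math.log(abs(c) + eps) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    # Spectral residual: log-amplitude minus its local average.
    residual = [l - a for l, a in zip(log_amp, moving_average(log_amp, q))]
    # Recombine the residual amplitude with the original phase and invert.
    recombined = [cmath.exp(r + 1j * p) for r, p in zip(residual, phase)]
    return [abs(c) for c in dft(recombined, inverse=True)]


def sr_scores(xs, q=3, local_wind_sz=21):
    """Relative deviation of each saliency value from the mean of the previous ones."""
    sal = saliency_map(xs, q)
    scores = []
    for i, s in enumerate(sal):
        prev = sal[max(0, i - local_wind_sz): i] or [s]
        local_mean = sum(prev) / len(prev)
        scores.append((s - local_mean) / local_mean)
    return scores


# Usage: a sine wave with one injected spike at index 40.
xs = [math.sin(2 * math.pi * t / 16) for t in range(64)]
xs[40] += 5.0
sal = saliency_map(xs)
scores = sr_scores(xs)
```

On this toy series the saliency map should peak at the injected spike, since the periodic component is largely removed by subtracting the locally averaged log spectrum.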
 class merlion.models.anomaly.spectral_residual.SpectralResidual(config=None)
Bases:
DetectorBase
Spectral Residual Algorithm for Anomaly Detection.
Spectral Residual Anomaly Detection algorithm based on the algorithm described by Ren et al. (2019). After taking the frequency spectrum, compute the log deviation from the mean. Use the inverse Fourier transform to obtain the saliency map. Anomaly scores for a point in the time series are obtained by comparing the saliency score of the point to the average of the previous points.
 Parameters
config (Optional[SpectralResidualConfig]) – model configuration
 config_class
alias of
SpectralResidualConfig
 property target_seq_index: int
 Return type
int
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
merlion.models.anomaly.stat_threshold module
Simple static thresholding model for anomaly detection.
 class merlion.models.anomaly.stat_threshold.StatThresholdConfig(max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, transform=None, normalize=None, **kwargs)
Bases:
DetectorConfig, NormalizingConfig
Config class for StatThreshold.
Base class of the object used to configure an anomaly detection model.
 Parameters
max_score (float) – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
normalize – Pre-trained normalization transformation (optional).
 class merlion.models.anomaly.stat_threshold.StatThreshold(config)
Bases:
DetectorBase
Anomaly detection based on a static threshold.
 Parameters
config (DetectorConfig) – model configuration
 config_class
alias of
StatThresholdConfig
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores
merlion.models.anomaly.zms module
Multiple z-score model (static thresholding at multiple time scales).
 class merlion.models.anomaly.zms.ZMSConfig(base=2, n_lags=None, lag_inflation=1.0, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, transform=None, normalize=None, **kwargs)
Bases: DetectorConfig, NormalizingConfig
Configuration class for the ZMS anomaly detection model. The transform of this config is actually a preprocessing step, followed by the desired number of lag transforms, and a final mean/variance normalization step. This full transform may be accessed as ZMSConfig.full_transform. Note that the normalization is inherited from NormalizingConfig.
Base class of the object used to configure an anomaly detection model.
 Parameters
base (int) – The base to use for computing exponentially distant lags.
n_lags (Optional[int]) – The number of lags to be used. If None, n_lags will be chosen later as the maximum number of lags possible for the initial training set.
lag_inflation (float) – See math below for the precise mathematical role of the lag inflation. Consider the lag inflation a measure of distrust toward higher lags. If lag_inflation > 1, the higher the lag inflation, the less likely the model is to select a higher lag’s z-score as the anomaly score.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when postprocessing anomaly scores
transform – Transformation to preprocess input time series.
normalize – Pretrained normalization transformation (optional).
\[\begin{split}\begin{align*} \text{Let } \space z_k(x_t) \text{ be the z-score of the } & k\text{-lag at } t, \space \Delta_k(x_t), \text{ and } p \text{ be the lag inflation} \\ & \\ \text{the anomaly score } z(x_t) & = z_{k^*}(x_t) \\ \text{where } k^* & = \text{argmax}_k \space \lvert z_k(x_t) \rvert / k^p \end{align*}\end{split}\]
 property full_transform
Returns the full transform, including the preprocessing step, lags, and final mean/variance normalization.
 to_dict(_skipped_keys=None)
 Returns
dict with keyword arguments used to initialize the config class.
 property n_lags
 class merlion.models.anomaly.zms.ZMS(config)
Bases: DetectorBase
Multiple Z-Score based Anomaly Detector.
ZMS is designed to detect spikes, dips, and sharp trend changes (up or down) relative to historical data. Anomaly scores capture not only magnitude but also direction. This lets one distinguish between positive (spike) and negative (dip) anomalies, for example.
The algorithm builds models of normalcy at multiple exponentially-growing time scales. The zeroth order model is just a model of the values seen recently. The kth order model is similar except that it models not values, but rather their k-lags, defined as x(t)-x(t-k), for k in 1, 2, 4, 8, 16, etc. The algorithm assigns the maximum absolute z-score of all the models of normalcy as the overall anomaly score.
\[\begin{split}\begin{align*} \text{Let } \space z_k(x_t) \text{ be the z-score of the } & k\text{-lag at } t, \space \Delta_k(x_t), \text{ and } p \text{ be the lag inflation} \\ & \\ \text{the anomaly score } z(x_t) & = z_{k^*}(x_t) \\ \text{where } k^* & = \text{argmax}_k \space \lvert z_k(x_t) \rvert / k^p \end{align*}\end{split}\]
 Parameters
config (DetectorConfig) – model configuration
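The scoring rule described for ZMS can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Merlion's implementation: the helpers k_lag and zms_score are hypothetical names, each lag's mean and standard deviation are estimated once from the training values, and only lags k >= 1 are scored (the zeroth-order model of raw values is left out).

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def k_lag(x: Sequence[float], k: int) -> List[float]:
    """Return the k-lag series x(t) - x(t-k) for every t >= k."""
    return [x[t] - x[t - k] for t in range(k, len(x))]


def zms_score(
    train: Sequence[float],
    new_value: float,
    lags: Sequence[int] = (1, 2, 4),
    p: float = 1.0,
) -> Tuple[float, int]:
    """Score one new observation that immediately follows `train`.

    Returns (z_{k*}, k*), where k* maximizes |z_k| / k**p and p plays
    the role of the lag inflation. Hypothetical helper, for illustration.
    """
    best_key, best_z, best_k = -1.0, 0.0, 0
    for k in lags:
        history = k_lag(train, k)            # k-lags seen during training
        mu = mean(history)
        sd = max(pstdev(history), 1e-8)      # guard against zero variance
        z = ((new_value - train[-k]) - mu) / sd
        key = abs(z) / k ** p                # higher lags are discounted
        if key > best_key:
            best_key, best_z, best_k = key, z, k
    return best_z, best_k
```

The returned z-score keeps its sign, so a spike gives a positive score and a dip a negative one. Raising p makes the selection favor short lags.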
 property n_lags
 property lag_scales: List[int]
 Return type
List[int]
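As a rough sketch of how exponentially distant lag scales could be derived from base and n_lags: the functions below are hypothetical, and the rule used to pick the largest feasible n_lags (every lag must be shorter than the training set) is an assumption rather than Merlion's documented behavior.

```python
from typing import List


def lag_scales(base: int, n_lags: int) -> List[int]:
    """Exponentially spaced lags: base**0, base**1, ..., base**(n_lags - 1)."""
    return [base ** i for i in range(n_lags)]


def max_n_lags(base: int, n_train: int) -> int:
    """Largest n_lags such that every lag is shorter than the training set.

    Assumption: a lag k is usable only when k < n_train.
    """
    if base < 2:
        raise ValueError("base must be at least 2")
    n = 0
    while base ** n < n_train:
        n += 1
    return n
```

With base=2 and five lags this gives the scales 1, 2, 4, 8, 16 mentioned in the ZMS description.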
 property lag_inflation
 property adjust_z_scores: bool
 Return type
bool
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores
merlion.models.anomaly.autoencoder module
The autoencoder-based anomaly detector for multivariate time series
 class merlion.models.anomaly.autoencoder.AutoEncoderConfig(hidden_size=5, layer_sizes=(25, 10, 5), sequence_len=1, lr=0.001, batch_size=512, num_epochs=50, **kwargs)
Bases: DetectorConfig, NormalizingConfig
Configuration class for AutoEncoder. The normalization is inherited from NormalizingConfig. The input data will be standardized automatically.
Base class of the object used to configure an anomaly detection model.
 Parameters
hidden_size (int) – The latent size
layer_sizes (Sequence[int]) – The hidden layer sizes for the MLP encoder and decoder, e.g., (25, 10, 5) for encoder and (5, 10, 25) for decoder
sequence_len (int) – The input series length, e.g., input = [x(t-sequence_len+1), …, x(t-1), x(t)]
lr (float) – The learning rate during training
batch_size (int) – The batch size during training
num_epochs (int) – The number of training epochs
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when postprocessing anomaly scores
transform – Transformation to preprocess input time series.
normalize – Pretrained normalization transformation (optional).
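The sequence_len parameter describes a sliding window over the series. A minimal sketch of that windowing follows; the helper name sliding_windows is hypothetical, and how Merlion batches or pads windows internally is not shown here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(x: Sequence[T], sequence_len: int) -> List[List[T]]:
    """Build one window [x(t-sequence_len+1), ..., x(t-1), x(t)] per valid t.

    Each element of `x` may be a scalar (univariate) or a vector
    of variables observed at that timestamp (multivariate).
    """
    if sequence_len < 1:
        raise ValueError("sequence_len must be at least 1")
    return [
        list(x[t - sequence_len + 1 : t + 1])
        for t in range(sequence_len - 1, len(x))
    ]
```

With sequence_len=1 (the default for several configs in this package) every window holds a single timestamp, so the model sees each point independently.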
 class merlion.models.anomaly.autoencoder.AutoEncoder(config)
Bases: DetectorBase
The autoencoder-based multivariate time series anomaly detector. This detector utilizes an autoencoder to infer the correlations between different time series and estimate the joint distribution of the variables for anomaly detection.
 Parameters
config (AutoEncoderConfig) – model configuration
 config_class
alias of AutoEncoderConfig
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Train a multivariate time series anomaly detector.
 Parameters
train_data (TimeSeries) – A TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – A TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
 Parameters
time_series (TimeSeries) – The TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – A TimeSeries immediately preceding time_series.
 Return type
TimeSeries
 Returns
A univariate TimeSeries of anomaly scores
merlion.models.anomaly.vae module
The VAE-based anomaly detector for multivariate time series
 class merlion.models.anomaly.vae.VAEConfig(encoder_hidden_sizes=(25, 10, 5), decoder_hidden_sizes=(5, 10, 25), latent_size=5, sequence_len=1, kld_weight=1.0, dropout_rate=0.0, num_eval_samples=10, lr=0.001, batch_size=1024, num_epochs=10, **kwargs)
Bases: DetectorConfig, NormalizingConfig
Configuration class for VAE. The normalization is inherited from NormalizingConfig. The input data will be standardized automatically.
Base class of the object used to configure an anomaly detection model.
 Parameters
encoder_hidden_sizes (Sequence[int]) – The hidden layer sizes of the MLP encoder
decoder_hidden_sizes (Sequence[int]) – The hidden layer sizes of the MLP decoder
latent_size (int) – The latent size
sequence_len (int) – The input series length, e.g., input = [x(t-sequence_len+1), …, x(t-1), x(t)]
kld_weight (float) – The regularization weight for the KL divergence term
dropout_rate (float) – The dropout rate for the encoder and decoder
num_eval_samples (int) – The number of sampled latent variables during prediction
lr (float) – The learning rate during training
batch_size (int) – The batch size during training
num_epochs (int) – The number of training epochs
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when postprocessing anomaly scores
transform – Transformation to preprocess input time series.
normalize – Pretrained normalization transformation (optional).
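To illustrate the role of kld_weight, the sketch below writes out a standard VAE objective: a reconstruction term plus a weighted KL divergence between a diagonal Gaussian posterior and a standard normal prior. The function names are hypothetical, and the use of mean squared error for the reconstruction term is an assumption; Merlion's exact loss may differ.

```python
import math
from typing import Sequence


def kl_to_standard_normal(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


def vae_loss(
    x: Sequence[float],
    x_recon: Sequence[float],
    mu: Sequence[float],
    log_var: Sequence[float],
    kld_weight: float = 1.0,
) -> float:
    """Reconstruction error (MSE, assumed) plus the weighted KL term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon)) / len(x)
    return recon + kld_weight * kl_to_standard_normal(mu, log_var)
```

A larger kld_weight pulls the latent distribution toward the prior at the cost of reconstruction accuracy; kld_weight=0 reduces the objective to a plain autoencoder loss.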
 class merlion.models.anomaly.vae.VAE(config)
Bases: DetectorBase
The VAE-based multivariate time series anomaly detector. This detector utilizes a variational autoencoder to infer the correlations between different time series and estimate the distribution of the reconstruction errors for anomaly detection.
 Parameters
config (VAEConfig) – model configuration
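One plausible reading of num_eval_samples is a Monte Carlo average: draw several latent samples, decode each, and average the reconstructions before measuring the error. The sketch below shows that idea with a caller-supplied decode function; all names are hypothetical and the averaging scheme is an assumption, not a statement of how Merlion's VAE is implemented.

```python
import math
import random
from typing import Callable, List, Sequence


def sample_latent(
    mu: Sequence[float], log_var: Sequence[float], rng: random.Random
) -> List[float]:
    """Reparameterized draw z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def mc_reconstruction(
    decode: Callable[[List[float]], List[float]],
    mu: Sequence[float],
    log_var: Sequence[float],
    num_eval_samples: int = 10,
    seed: int = 0,
) -> List[float]:
    """Average the decoder output over num_eval_samples latent draws."""
    rng = random.Random(seed)
    outputs = [
        decode(sample_latent(mu, log_var, rng)) for _ in range(num_eval_samples)
    ]
    return [sum(col) / num_eval_samples for col in zip(*outputs)]
```

More samples reduce the variance of the averaged reconstruction, and therefore of the resulting anomaly score, at a proportional cost in prediction time.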
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Train a multivariate time series anomaly detector.
 Parameters
train_data (TimeSeries) – A TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – A TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
 Parameters
time_series (TimeSeries) – The TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – A TimeSeries immediately preceding time_series.
 Return type
TimeSeries
 Returns
A univariate TimeSeries of anomaly scores
merlion.models.anomaly.dagmm module
Deep autoencoding Gaussian mixture model for anomaly detection (DAGMM)
 class merlion.models.anomaly.dagmm.DAGMMConfig(gmm_k=3, hidden_size=5, sequence_len=1, lambda_energy=0.1, lambda_cov_diag=0.005, lr=0.001, batch_size=256, num_epochs=10, **kwargs)
Bases: DetectorConfig, NormalizingConfig
Configuration class for DAGMM. The normalization is inherited from NormalizingConfig. The input data will be standardized automatically.
Base class of the object used to configure an anomaly detection model.
 Parameters
gmm_k (int) – The number of Gaussian distributions
hidden_size (int) – The hidden size of the autoencoder module in DAGMM
sequence_len (int) – The input series length, e.g., input = [x(t-sequence_len+1), …, x(t-1), x(t)]
lambda_energy (float) – The regularization weight for the energy term
lambda_cov_diag (float) – The regularization weight for the covariance diagonal entries
lr (float) – The learning rate during training
batch_size (int) – The batch size during training
num_epochs (int) – The number of training epochs
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when postprocessing anomaly scores
transform – Transformation to preprocess input time series.
normalize – Pretrained normalization transformation (optional).
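The two regularization weights can be read against the usual DAGMM training objective, which adds a weighted mean sample energy and a penalty on small covariance diagonal entries to the reconstruction error. The sketch below assumes that form; the function name dagmm_objective is hypothetical and the inputs are precomputed per-sample quantities rather than network outputs.

```python
from typing import Sequence


def dagmm_objective(
    recon_errors: Sequence[float],
    energies: Sequence[float],
    cov_diagonals: Sequence[Sequence[float]],
    lambda_energy: float = 0.1,
    lambda_cov_diag: float = 0.005,
) -> float:
    """Mean reconstruction error + lambda_energy * mean energy
    + lambda_cov_diag * sum of inverse covariance diagonal entries.

    cov_diagonals holds one sequence of diagonal entries per mixture component.
    """
    mean_recon = sum(recon_errors) / len(recon_errors)
    mean_energy = sum(energies) / len(energies)
    # Penalizing 1/sigma^2 discourages degenerate (near-singular) components.
    cov_penalty = sum(1.0 / d for diag in cov_diagonals for d in diag)
    return mean_recon + lambda_energy * mean_energy + lambda_cov_diag * cov_penalty
```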
 class merlion.models.anomaly.dagmm.DAGMM(config)
Bases: DetectorBase
Deep autoencoding Gaussian mixture model for anomaly detection (DAGMM). DAGMM combines an autoencoder with a Gaussian mixture model to model the distribution of the reconstruction errors. DAGMM jointly optimizes the parameters of the deep autoencoder and the mixture model in an end-to-end fashion.
 Parameters
config (DAGMMConfig) – model configuration
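A Gaussian mixture turns a low-dimensional representation into an anomaly score through its sample energy, the negative log-likelihood under the mixture. The sketch below is illustrative only: it assumes diagonal covariances for brevity (the general formulation uses full covariance matrices), and the function names are hypothetical.

```python
import math
from typing import Sequence


def log_gaussian_diag(
    z: Sequence[float], mean: Sequence[float], var: Sequence[float]
) -> float:
    """Log density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (a - m) ** 2 / v)
        for a, m, v in zip(z, mean, var)
    )


def sample_energy(
    z: Sequence[float],
    weights: Sequence[float],
    means: Sequence[Sequence[float]],
    variances: Sequence[Sequence[float]],
) -> float:
    """E(z) = -log sum_k weights[k] * N(z | means[k], diag(variances[k]))."""
    logs = [
        math.log(w) + log_gaussian_diag(z, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # log-sum-exp trick for numerical stability
    return -(top + math.log(sum(math.exp(l - top) for l in logs)))
```

Points far from every mixture component receive a high energy and are therefore the most anomalous.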
 config_class
alias of DAGMMConfig
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Train a multivariate time series anomaly detector.
 Parameters
train_data (TimeSeries) – A TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – A TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
 Parameters
time_series (TimeSeries) – The TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – A TimeSeries immediately preceding time_series.
 Return type
TimeSeries
 Returns
A univariate TimeSeries of anomaly scores
merlion.models.anomaly.lstm_ed module
The LSTM-encoder-decoder-based anomaly detector for multivariate time series
 class merlion.models.anomaly.lstm_ed.LSTMEDConfig(hidden_size=5, sequence_len=20, n_layers=(1, 1), dropout=(0, 0), lr=0.001, batch_size=256, num_epochs=10, **kwargs)
Bases: DetectorConfig, NormalizingConfig
Configuration class for LSTM-encoder-decoder. The normalization is inherited from NormalizingConfig. The input data will be standardized automatically.
Base class of the object used to configure an anomaly detection model.
 Parameters
hidden_size (int) – The hidden state size of the LSTM modules
sequence_len (int) – The input series length, e.g., input = [x(t-sequence_len+1), …, x(t-1), x(t)]
n_layers (Sequence[int]) – The number of layers for the LSTM encoder and decoder. n_layers has two values, i.e., n_layers[0] is the number of encoder layers and n_layers[1] is the number of decoder layers.
dropout (Sequence[int]) – The dropout rate for the LSTM encoder and decoder. dropout has two values, i.e., dropout[0] is the dropout rate for the encoder and dropout[1] is the dropout rate for the decoder.
lr (float) – The learning rate during training
batch_size (int) – The batch size during training
num_epochs (int) – The number of training epochs
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when postprocessing anomaly scores
transform – Transformation to preprocess input time series.
normalize – Pretrained normalization transformation (optional).
 class merlion.models.anomaly.lstm_ed.LSTMED(config)
Bases: DetectorBase
The LSTM-encoder-decoder-based multivariate time series anomaly detector. The time series representation is modeled by an encoder-decoder network where both encoder and decoder are LSTMs. The distribution of the reconstruction error is estimated for anomaly detection.
 Parameters
config (LSTMEDConfig) – model configuration
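One common way to turn an estimated reconstruction-error distribution into a score is to fit a Gaussian to the errors seen on training data and measure how far a new error lies from it. The sketch below shows the one-dimensional case under that assumption; the function names are hypothetical, and the multivariate case would use a covariance matrix in place of the scalar variance.

```python
from statistics import mean, pvariance
from typing import Sequence, Tuple


def fit_error_gaussian(train_errors: Sequence[float]) -> Tuple[float, float]:
    """Estimate the mean and variance of training reconstruction errors."""
    return mean(train_errors), max(pvariance(train_errors), 1e-12)


def error_score(error: float, mu: float, var: float) -> float:
    """Squared standardized distance of a new error from the fitted Gaussian."""
    return (error - mu) ** 2 / var
```

An error close to what the network produced during training gets a score near zero, while an unusually large (or unusually small) error gets a high score.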
 config_class
alias of LSTMEDConfig
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Train a multivariate time series anomaly detector.
 Parameters
train_data (TimeSeries) – A TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – A TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
 Parameters
time_series (TimeSeries) – The TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – A TimeSeries immediately preceding time_series.
 Return type
TimeSeries
 Returns
A univariate TimeSeries of anomaly scores
merlion.models.anomaly.deep_point_anomaly_detector module
Deep Point Anomaly Detector algorithm.
 class merlion.models.anomaly.deep_point_anomaly_detector.DeepPointAnomalyDetectorConfig(max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, transform=None, normalize=None, **kwargs)
Bases: DetectorConfig
Config object used to define an anomaly detection model.
Base class of the object used to configure an anomaly detection model.
 Parameters
max_score (float) – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when postprocessing anomaly scores
transform – Transformation to preprocess input time series.
normalize – Pretrained normalization transformation (optional).
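The two post-processing switches can be pictured as a pipeline: calibration maps raw scores onto a z-score scale, and thresholding then suppresses scores that are not extreme enough. The sketch below is an assumption-laden illustration, not Merlion's calibrator: it maps the empirical quantile of a score's magnitude through the inverse normal CDF, and the names fit_calibrator, apply_threshold and alarm_threshold are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist
from typing import Callable, Sequence


def fit_calibrator(train_scores: Sequence[float]) -> Callable[[float], float]:
    """Learn a map from raw scores to approximate z-scores."""
    ref = sorted(abs(s) for s in train_scores)
    n = len(ref)

    def calibrate(score: float) -> float:
        # Empirical quantile of |score|, kept strictly inside (0, 1).
        q = min(max(bisect_right(ref, abs(score)) / (n + 1), 1 / (n + 1)), n / (n + 1))
        # Two-sided: P(|Z| <= z) = q  implies  z = Phi^{-1}((1 + q) / 2).
        z = NormalDist().inv_cdf(0.5 + 0.5 * q)
        return z if score >= 0 else -z

    return calibrate


def apply_threshold(z: float, alarm_threshold: float = 3.0) -> float:
    """Keep the score only if its magnitude exceeds the threshold."""
    return z if abs(z) > alarm_threshold else 0.0
```

Disabling calibration would skip the first step and leave scores on the model's raw scale; disabling thresholding would return every calibrated score unchanged.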
 class merlion.models.anomaly.deep_point_anomaly_detector.DeepPointAnomalyDetector(config)
Bases: DetectorBase
Given a time series tuple (time, signal), this algorithm trains an MLP with each element in time and the corresponding signal as an input-target pair. Once the MLP has been trained for a few iterations, the loss value at each time is regarded as the anomaly score for the corresponding signal. The intuition is that DNNs learn global patterns before overfitting local details. Therefore, any point anomalies in the signal will have high MLP loss. These intuitions can be found in:
Arpit, Devansh, et al. “A closer look at memorization in deep networks.” ICML 2017
Rahaman, Nasim, et al. “On the spectral bias of neural networks.” ICML 2019
 Parameters
config (DeepPointAnomalyDetectorConfig) – model configuration
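The idea of using per-point training loss as an anomaly score can be shown with a deliberately simplified stand-in. In the sketch below a linear model fitted by a few gradient steps replaces the MLP so that the example stays dependency-free; the function name point_losses and its defaults are hypothetical, and this is not the detector's actual network or training loop.

```python
from typing import List, Sequence


def point_losses(
    times: Sequence[float],
    signal: Sequence[float],
    n_iters: int = 50,
    lr: float = 0.1,
) -> List[float]:
    """Fit signal ~ a * time + b for a few iterations, then return the
    squared error at each timestamp as its anomaly score.

    `times` is assumed to be scaled to [0, 1] so the step size is stable.
    """
    a, b = 0.0, 0.0
    n = len(times)
    for _ in range(n_iters):
        residuals = [a * t + b - y for t, y in zip(times, signal)]
        # Gradients of the mean squared error with respect to a and b.
        grad_a = 2.0 * sum(r * t for r, t in zip(residuals, times)) / n
        grad_b = 2.0 * sum(residuals) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return [(a * t + b - y) ** 2 for t, y in zip(times, signal)]
```

Because the model captures the global trend first, an isolated spike is left with a much larger loss than its neighbors.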
 config_class
alias of DeepPointAnomalyDetectorConfig
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores