merlion.models.anomaly.forecast_based package
Contains all forecaster-based anomaly detectors. These models support all functionality of both anomaly detectors (merlion.models.anomaly) and forecasters (merlion.models.forecast).
Forecasting-based anomaly detectors are instances of an abstract ForecastingDetectorBase class. Many forecasting models support anomaly detection variants, where the anomaly score is based on the difference between the predicted and true time series value, and optionally the model’s uncertainty in its own prediction.
base – Base class for anomaly detectors based on forecasting models.
arima – Classic ARIMA (AutoRegressive Integrated Moving Average) forecasting model, adapted for anomaly detection.
sarima – Seasonal ARIMA (SARIMA) forecasting model, adapted for anomaly detection.
ets – ETS (error, trend, seasonal) forecasting model, adapted for anomaly detection.
prophet – Adaptation of Facebook’s Prophet forecasting model to anomaly detection.
lstm – Adaptation of an LSTM neural net forecaster to the task of anomaly detection.
mses – MSES (Multi-Scale Exponential Smoother) forecasting model adapted for anomaly detection.
Submodules
merlion.models.anomaly.forecast_based.base module
Base class for anomaly detectors based on forecasting models.
 class merlion.models.anomaly.forecast_based.base.ForecastingDetectorBase(config)
Bases: ForecasterBase, DetectorBase
Base class for a forecast-based anomaly detector.
 Parameters
config (ForecasterConfig) – model configuration
 forecast_to_anom_score(time_series, forecast, stderr)
Compare a model’s forecast to a ground truth time series, in order to compute anomaly scores. By default, we compute a z-score if model uncertainty (stderr) is given, or the residuals if there is no model uncertainty.
 Parameters
time_series (TimeSeries) – the ground truth time series.
forecast (TimeSeries) – the model’s forecasted values for the time series.
stderr (Optional[TimeSeries]) – the standard errors of the model’s forecast.
 Return type
TimeSeries
 Returns
Anomaly scores based on the difference between the ground truth values of the time series and the model’s forecast.
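The default scoring rule described above can be illustrated with a minimal pure-Python sketch. This is not Merlion's implementation: the function name `forecast_to_anom_score_sketch`, the use of plain lists instead of TimeSeries objects, and the sign convention (actual minus forecast) are all assumptions made for illustration.

```python
from typing import List, Optional


def forecast_to_anom_score_sketch(
    actual: List[float],
    forecast: List[float],
    stderr: Optional[List[float]] = None,
) -> List[float]:
    """Hypothetical stand-in for the default scoring rule (not the library code)."""
    # Residual = ground truth minus forecast (sign convention is an assumption).
    residuals = [y - y_hat for y, y_hat in zip(actual, forecast)]
    if stderr is None:
        # No model uncertainty available: the score is the raw residual.
        return residuals
    # Model uncertainty available: scale each residual by its standard error,
    # which yields a z-score per timestamp.
    return [r / s for r, s in zip(residuals, stderr)]


actual = [10.0, 12.0, 30.0]
forecast = [10.5, 11.0, 12.0]
residual_scores = forecast_to_anom_score_sketch(actual, forecast)
z_scores = forecast_to_anom_score_sketch(actual, forecast, stderr=[0.5, 2.0, 3.0])
print(residual_scores)
print(z_scores)
```

The last point stands out under both rules, but the z-score also accounts for how confident the forecaster was at each timestamp.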
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the underlying forecaster (unsupervised) on the training data. Converts the forecast into anomaly scores, and then trains the post-rule for filtering anomaly scores (supervised, if labels are given) on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
 Return type
TimeSeries
 Returns
A TimeSeries of the model’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores
 get_figure(*, time_series=None, time_stamps=None, time_series_prev=None, plot_anomaly=True, filter_scores=True, plot_forecast=False, plot_forecast_uncertainty=False, plot_time_series_prev=False)
 Parameters
time_series (Optional[TimeSeries]) – the time series over whose timestamps we wish to make a forecast. Exactly one of time_series or time_stamps should be provided.
time_stamps (Optional[List[int]]) – a list of timestamps we wish to forecast for. Exactly one of time_series or time_stamps should be provided.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
plot_anomaly – whether to plot the model’s predicted anomaly scores.
filter_scores – whether to filter the anomaly scores by the post-rule before plotting them.
plot_forecast – whether to plot the model’s forecasted values.
plot_forecast_uncertainty – whether to plot uncertainty estimates (the interquartile range) for forecast values. Not supported for all models.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
 Return type
Figure
 Returns
a Figure of the model’s anomaly score predictions and/or forecast.
 plot_anomaly(time_series, time_series_prev=None, *, filter_scores=True, plot_forecast=False, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600), ax=None)
Plots the time series in matplotlib as a line graph, with points in the series overlaid as points color-coded to indicate their severity as anomalies. Optionally allows you to overlay the model’s forecast & the model’s uncertainty in its forecast (if applicable).
 Parameters
time_series (TimeSeries) – the time series we wish to plot, with color-coding to indicate anomalies.
time_series_prev (Optional[TimeSeries]) – a time series immediately preceding time_series, which is used to initialize the time series model. Otherwise, we assume time_series immediately follows the training data.
filter_scores – whether to filter the anomaly scores by the post-rule before plotting them.
plot_forecast – whether to plot the model’s forecast, in addition to the anomaly scores.
plot_forecast_uncertainty – whether to plot the model’s uncertainty in its own forecast, in addition to the forecast and anomaly scores. Only used if plot_forecast is True.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
figsize – figure size in pixels
ax – matplotlib axis to add this plot to
 Returns
matplotlib figure & axes
 plot_anomaly_plotly(time_series, time_series_prev=None, *, filter_scores=True, plot_forecast=False, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600))
Plots the time series in plotly as a line graph, with points in the series overlaid as points color-coded to indicate their severity as anomalies. Optionally allows you to overlay the model’s forecast & the model’s uncertainty in its forecast (if applicable).
 Parameters
time_series (TimeSeries) – the time series we wish to plot, with color-coding to indicate anomalies.
time_series_prev (Optional[TimeSeries]) – a time series immediately preceding time_series, which is used to initialize the time series model. Otherwise, we assume time_series immediately follows the training data.
filter_scores – whether to filter the anomaly scores by the post-rule before plotting them.
plot_forecast – whether to plot the model’s forecast, in addition to the anomaly scores.
plot_forecast_uncertainty – whether to plot the model’s uncertainty in its own forecast, in addition to the forecast and anomaly scores. Only used if plot_forecast is True.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
figsize – figure size in pixels
 Returns
plotly figure
 plot_forecast(*, time_series=None, time_stamps=None, time_series_prev=None, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600), ax=None)
Plots the forecast for the time series in matplotlib, optionally also plotting the uncertainty of the forecast, as well as the past values (both true and predicted) of the time series.
 Parameters
time_series (Optional[TimeSeries]) – the time series over whose timestamps we wish to make a forecast. Exactly one of time_series or time_stamps should be provided.
time_stamps (Optional[List[int]]) – a list of timestamps we wish to forecast for. Exactly one of time_series or time_stamps should be provided.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
plot_forecast_uncertainty – whether to plot uncertainty estimates (the interquartile range) for forecast values. Not supported for all models.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
figsize – figure size in pixels
ax – matplotlib axis to add this plot to
 Returns
(fig, ax): matplotlib figure & axes the figure was plotted on
 plot_forecast_plotly(*, time_series=None, time_stamps=None, time_series_prev=None, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600))
Plots the forecast for the time series in plotly, optionally also plotting the uncertainty of the forecast, as well as the past values (both true and predicted) of the time series.
 Parameters
time_series (Optional[TimeSeries]) – the time series over whose timestamps we wish to make a forecast. Exactly one of time_series or time_stamps should be provided.
time_stamps (Optional[List[int]]) – a list of timestamps we wish to forecast for. Exactly one of time_series or time_stamps should be provided.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
plot_forecast_uncertainty – whether to plot uncertainty estimates (the interquartile range) for forecast values. Not supported for all models.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
figsize – figure size in pixels
merlion.models.anomaly.forecast_based.arima module
Classic ARIMA (AutoRegressive Integrated Moving Average) forecasting model, adapted for anomaly detection.
 class merlion.models.anomaly.forecast_based.arima.ArimaDetectorConfig(order=(4, 1, 2), seasonal_order=(0, 0, 0, 0), max_forecast_steps: int = None, target_seq_index: int = None, transform: TransformBase = None, max_score: float = 1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)
Bases: ArimaConfig, DetectorConfig
Configuration class for Arima. Just a Sarima model with seasonal order (0, 0, 0, 0).
Base class of the object used to configure an anomaly detection model.
 Parameters
order – Order is (p, d, q) for an ARIMA(p, d, q) process. d must be an integer indicating the integration order of the process, while p and q must be integers indicating the AR and MA orders (so that all lags up to those orders are included).
seasonal_order – (0, 0, 0, 0) because ARIMA has no seasonal order.
max_forecast_steps – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to pre-process input time series.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
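To make the role of the integration order d in `order=(p, d, q)` concrete, here is a small sketch of d-fold differencing, the operation the "I" in ARIMA refers to. The helper name `difference` and the sample series are hypothetical; this only illustrates the concept and is not how the library fits the model.

```python
from typing import List


def difference(values: List[float], d: int = 1) -> List[float]:
    """Apply first-order differencing d times (illustrative helper)."""
    for _ in range(d):
        # Each pass replaces the series by the change between neighbours,
        # shortening it by one element.
        values = [nxt - cur for cur, nxt in zip(values, values[1:])]
    return values


series = [1.0, 4.0, 9.0, 16.0, 25.0]  # hypothetical series with a quadratic trend
once = difference(series, d=1)   # linear trend remains
twice = difference(series, d=2)  # trend removed, series is constant
print(once)
print(twice)
```

With the default `order=(4, 1, 2)`, the model would difference once (d=1) before fitting 4 autoregressive and 2 moving-average lags.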
 class merlion.models.anomaly.forecast_based.arima.ArimaDetector(config)
Bases: ForecastingDetectorBase, Arima
 config_class
alias of ArimaDetectorConfig
merlion.models.anomaly.forecast_based.sarima module
Seasonal ARIMA (SARIMA) forecasting model, adapted for anomaly detection.
 class merlion.models.anomaly.forecast_based.sarima.SarimaDetectorConfig(order=(4, 1, 2), seasonal_order=(2, 0, 1, 24), max_forecast_steps: int = None, target_seq_index: int = None, transform: TransformBase = None, max_score: float = 1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)
Bases: SarimaConfig, DetectorConfig
Config class for Sarima (Seasonal AutoRegressive Integrated Moving Average).
Base class of the object used to configure an anomaly detection model.
 Parameters
order – Order is (p, d, q) for an ARIMA(p, d, q) process. d must be an integer indicating the integration order of the process, while p and q must be integers indicating the AR and MA orders (so that all lags up to those orders are included).
seasonal_order – Seasonal order is (P, D, Q, S) for a seasonal ARIMA process, where S is the length of the seasonality cycle (e.g. S=24 for 24 hours on hourly granularity). P, D, Q are as for ARIMA.
max_forecast_steps – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to pre-process input time series.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
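The seasonal cycle length in `seasonal_order` can be illustrated by seasonal differencing, which subtracts the value observed one full cycle earlier. The sketch below assumes hourly data with a 24-step cycle; the helper name `seasonal_difference` and the synthetic series are hypothetical and only demonstrate the idea behind the seasonal integration order.

```python
from typing import List


def seasonal_difference(values: List[float], period: int) -> List[float]:
    """Subtract the value observed one seasonal cycle earlier (illustrative helper)."""
    return [values[i] - values[i - period] for i in range(period, len(values))]


# Three hypothetical days of hourly data: a repeating daily shape
# plus a slow upward drift of 0.5 per day.
hourly = [float(h % 24) + 0.5 * (h // 24) for h in range(72)]
deseasonalized = seasonal_difference(hourly, period=24)
print(len(deseasonalized), set(deseasonalized))
```

The daily shape cancels out and only the day-over-day drift remains, which is what a seasonal differencing step with a cycle length of 24 would leave for the remaining ARMA terms to model.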
 class merlion.models.anomaly.forecast_based.sarima.SarimaDetector(config)
Bases: ForecastingDetectorBase, Sarima
 config_class
alias of SarimaDetectorConfig
merlion.models.anomaly.forecast_based.ets module
ETS (error, trend, seasonal) forecasting model, adapted for anomaly detection.
 class merlion.models.anomaly.forecast_based.ets.ETSDetectorConfig(max_forecast_steps=None, target_seq_index=None, error='add', trend='add', damped_trend=True, seasonal='add', seasonal_periods=None, transform: TransformBase = None, enable_calibrator=False, max_score: float = 1000, threshold=None, enable_threshold=True, **kwargs)
Bases: ETSConfig, NoCalibrationDetectorConfig
Configuration class for the ETS model. The ETS model is built on an underlying state space model consisting of an error term (E), a trend component (T), a seasonal component (S), and a level component. Each component can take an additive (‘add’) or multiplicative (‘mul’) formulation. Refer to https://otexts.com/fpp2/taxonomy.html for more information about the ETS model.
Base class of the object used to configure an anomaly detection model.
 Parameters
max_forecast_steps – Number of steps we would like to forecast for.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
error – The error term. “add” or “mul”.
trend – The trend component. “add”, “mul” or None.
damped_trend – Whether or not an included trend component is damped.
seasonal – The seasonal component. “add”, “mul” or None.
seasonal_periods – The length of the seasonality cycle. None by default.
transform – Transformation to pre-process input time series.
enable_calibrator – False because this config assumes calibrated outputs from the model.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
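As a rough illustration of how the additive trend, damping, and additive seasonal settings combine, the sketch below computes the textbook point forecast for an additive ETS model from a given final state. It is not the library's fitting code: the function name, the state values, and the numeric damping factor are hypothetical (the config above only exposes a boolean `damped_trend`).

```python
from typing import List


def ets_additive_forecast(
    level: float,
    trend: float,
    seasonal: List[float],
    horizon: int,
    damping: float = 1.0,
) -> List[float]:
    """Point forecasts for additive trend + additive seasonality (illustrative).

    `seasonal` holds the last full cycle of seasonal states in time order.
    `damping` = 1.0 means an undamped trend; values below 1.0 flatten it.
    """
    period = len(seasonal)
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        # Damped trends contribute damping + damping**2 + ... + damping**h.
        damp_sum += damping ** h
        forecasts.append(level + damp_sum * trend + seasonal[(h - 1) % period])
    return forecasts


undamped = ets_additive_forecast(10.0, 2.0, [1.0, -1.0], horizon=3)
damped = ets_additive_forecast(10.0, 2.0, [1.0, -1.0], horizon=3, damping=0.5)
print(undamped)
print(damped)
```

The damped forecast grows more slowly with the horizon, which is the practical effect of setting `damped_trend=True`.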
 class merlion.models.anomaly.forecast_based.ets.ETSDetector(config)
Bases: ForecastingDetectorBase, ETS
 config_class
alias of ETSDetectorConfig
merlion.models.anomaly.forecast_based.prophet module
Adaptation of Facebook’s Prophet forecasting model to anomaly detection.
 class merlion.models.anomaly.forecast_based.prophet.ProphetDetectorConfig(max_forecast_steps=None, target_seq_index=None, yearly_seasonality='auto', weekly_seasonality='auto', daily_seasonality='auto', seasonality_mode='additive', holidays=None, uncertainty_samples=100, transform=None, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)
Bases: ProphetConfig, DetectorConfig
Configuration class for Facebook’s Prophet model, as described by Taylor & Letham, 2017.
Base class of the object used to configure an anomaly detection model.
 Parameters
max_forecast_steps (Optional[int]) – Max # of steps we would like to forecast for.
target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
yearly_seasonality (Union[bool, int]) – If bool, whether to enable yearly seasonality. By default, it is activated if there are >= 2 years of history, but deactivated otherwise. If int, this is the number of Fourier series components used to model the seasonality (default = 10).
weekly_seasonality (Union[bool, int]) – If bool, whether to enable weekly seasonality. By default, it is activated if there are >= 2 weeks of history, but deactivated otherwise. If int, this is the number of Fourier series components used to model the seasonality (default = 3).
daily_seasonality (Union[bool, int]) – If bool, whether to enable daily seasonality. By default, it is activated if there are >= 2 days of history, but deactivated otherwise. If int, this is the number of Fourier series components used to model the seasonality (default = 4).
seasonality_mode – ‘additive’ (default) or ‘multiplicative’.
holidays – pd.DataFrame with columns holiday (string) and ds (date type) and optionally columns lower_window and upper_window which specify a range of days around the date to be included as holidays. lower_window=-2 will include 2 days prior to the date as holidays. Also optionally can have a column prior_scale specifying the prior scale for that holiday. Can also be a dict corresponding to the desired pd.DataFrame.
uncertainty_samples (int) – The number of posterior samples to draw in order to calibrate the anomaly scores.
transform – Transformation to pre-process input time series.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
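The dict form of the `holidays` parameter can be sketched as a column-to-list mapping that mirrors the DataFrame layout described above. The holiday name, dates, window values, and the `check_holidays` helper are all hypothetical; the helper is only a sanity check for this example and is not part of Merlion or Prophet.

```python
from datetime import date
from typing import Dict, List

REQUIRED_COLUMNS = ("holiday", "ds")

# Hypothetical holidays table in dict form: one list per column.
holidays: Dict[str, List] = {
    "holiday": ["launch_day", "launch_day"],
    "ds": [date(2021, 3, 1), date(2022, 3, 1)],
    "lower_window": [-2, -2],  # also treat the 2 days before each date as holidays
    "upper_window": [0, 0],    # no extra days after each date
}


def check_holidays(table: Dict[str, List]) -> int:
    """Verify required columns exist and all columns align; return the row count."""
    missing = [col for col in REQUIRED_COLUMNS if col not in table]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    lengths = {len(values) for values in table.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same number of rows")
    return lengths.pop()


n_rows = check_holidays(holidays)
print(n_rows)
```

A dict shaped like this could then be passed as `holidays=` when constructing the config, assuming the dict-to-DataFrame conversion behaves as the parameter description states.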
 class merlion.models.anomaly.forecast_based.prophet.ProphetDetector(config)
Bases: ForecastingDetectorBase, Prophet
 config_class
alias of ProphetDetectorConfig
merlion.models.anomaly.forecast_based.lstm module
Adaptation of an LSTM neural net forecaster to the task of anomaly detection.
 class merlion.models.anomaly.forecast_based.lstm.LSTMDetectorConfig(max_forecast_steps, nhid=1024, model_strides=(1,), target_seq_index=None, transform=None, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)
Bases: LSTMConfig, DetectorConfig
Configuration class for LSTM.
Base class of the object used to configure an anomaly detection model.
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
nhid – hidden dimension of LSTM
model_strides – tuple indicating the stride(s) at which we would like to subsample the input data before giving it to the model.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to pre-process input time series.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
 class merlion.models.anomaly.forecast_based.lstm.LSTMDetector(config)
Bases: ForecastingDetectorBase, LSTM
 Parameters
config (LSTMConfig) – model configuration
 config_class
alias of LSTMDetectorConfig
merlion.models.anomaly.forecast_based.mses module
MSES (Multi-Scale Exponential Smoother) forecasting model adapted for anomaly detection.
 class merlion.models.anomaly.forecast_based.mses.MSESDetectorConfig(max_forecast_steps, online_updates=True, max_backstep=None, recency_weight=0.5, accel_weight=1.0, optimize_acc=True, eta=0.0, rho=0.0, phi=2.0, inflation=1.0, target_seq_index=None, transform=None, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)
Bases: MSESConfig, DetectorConfig
Configuration class for an MSES forecasting model adapted for anomaly detection.
Letting w be the recency weight, B the maximum backstep, x_t the last seen data point, and l_{s,t} the series of losses for scale s:
\[
\begin{aligned}
\hat{x}_{t+h} &= \sum_{b=0}^{B} p_b \cdot (x_{t-b} + v_{b+h,t} + a_{b+h,t}) \\
\text{where} \quad v_{b+h,t} &= \text{EMA}_w(\Delta_{b+h} x_t) \\
a_{b+h,t} &= \text{EMA}_w(\Delta_{b+h}^2 x_t) \\
\text{and} \quad p_b &= \sigma(z)_b \\
\text{if} \quad z_b &= (b+h)^\phi \cdot \text{EMA}_w(l_{b+h,t}) \cdot \text{RWSE}_w(l_{b+h,t})
\end{aligned}
\]
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
max_backstep – Max backstep to use in forecasting. If we train with x(0), …, x(t), then the b-th model MSES uses will forecast x(t+h) by anchoring at x(t-b) and predicting xhat(t+h) = x(t-b) + delta_hat(b+h).
recency_weight – The recency weight parameter to use when estimating delta_hat.
accel_weight – The weight to scale the acceleration by when computing delta_hat. Specifically, delta_hat(b+h) = velocity(b+h) + accel_weight * acceleration(b+h).
optimize_acc – If True, the acceleration correction will only be used at scales ranging from 1, …, (max_backstep + max_forecast_steps) / 2.
eta – The parameter used to control the rate at which recency_weight gets tuned when online updates are made to the model and losses can be computed.
rho – The parameter that determines what fraction of the overall error is due to velocity error, while the rest is due to the complement. The error at any scale will be determined as rho * velocity_error + (1 - rho) * loss_error.
phi – The parameter used to exponentially inflate the magnitude of loss error at different scales. Loss error for scale s will be increased by a factor of phi ** s.
inflation – The inflation exponent to use when computing the distribution p(b|h) over the models when forecasting at horizon h according to standard errors of the estimated velocities over the models; inflation=1 is equivalent to using the softmax function.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to pre-process input time series.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
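The way MSES blends its per-backstep models can be sketched in a few lines. The example below assumes the per-scale scores z_b, velocities, and accelerations have already been estimated and only shows the final step: softmax weights over the backsteps, then a weighted sum of anchored predictions x(t-b) + delta_hat(b+h). The function names and all numeric inputs are hypothetical, and this is not the library's implementation.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mses_combine(
    anchors: List[float],       # x(t-b) for b = 0..B
    velocities: List[float],    # velocity(b+h) for each backstep b
    accels: List[float],        # acceleration(b+h) for each backstep b
    scores: List[float],        # z_b, assumed to be computed elsewhere
    accel_weight: float = 1.0,
) -> float:
    """Weighted sum of anchored forecasts (illustrative helper)."""
    weights = softmax(scores)
    return sum(
        # delta_hat(b+h) = velocity(b+h) + accel_weight * acceleration(b+h)
        p * (x + v + accel_weight * a)
        for p, x, v, a in zip(weights, anchors, velocities, accels)
    )


weights = softmax([0.0, 0.0])
x_hat = mses_combine(
    anchors=[10.0, 8.0],
    velocities=[1.0, 3.0],
    accels=[0.0, 0.0],
    scores=[0.0, 0.0],
)
print(weights, x_hat)
```

With equal scores, both backsteps get the same weight and agree on the prediction; unequal scores would shift the forecast toward whichever anchored model the weighting favours.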
 class merlion.models.anomaly.forecast_based.mses.MSESDetector(config)
Bases: ForecastingDetectorBase, MSES
 Parameters
config (MSESConfig) – model configuration
 config_class
alias of MSESDetectorConfig
 property online_updates
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config – Additional training configs, if needed. Only required for some models.
 Return type
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores