merlion.models.anomaly.forecast_based package

Contains all forecaster-based anomaly detectors. These models support all functionality of both anomaly detectors (merlion.models.anomaly) and forecasters (merlion.models.forecast).

Forecasting-based anomaly detectors are instances of an abstract ForecastingDetectorBase class. Many forecasting models support anomaly detection variants, where the anomaly score is based on the difference between the predicted and true time series value, and optionally the model’s uncertainty in its own prediction.

base

Base class for anomaly detectors based on forecasting models.

arima

Classic ARIMA (AutoRegressive Integrated Moving Average) forecasting model, adapted for anomaly detection.

sarima

Seasonal ARIMA (SARIMA) forecasting model, adapted for anomaly detection.

ets

ETS (error, trend, seasonal) forecasting model, adapted for anomaly detection.

prophet

Adaptation of Facebook's Prophet forecasting model to anomaly detection.

lstm

Adaptation of an LSTM neural net forecaster to the task of anomaly detection.

mses

MSES (Multi-Scale Exponential Smoother) forecasting model adapted for anomaly detection.

Submodules

merlion.models.anomaly.forecast_based.base module

Base class for anomaly detectors based on forecasting models.

class merlion.models.anomaly.forecast_based.base.ForecastingDetectorBase(config)

Bases: ForecasterBase, DetectorBase

Base class for a forecast-based anomaly detector.

Parameters

config (ForecasterConfig) – model configuration

forecast_to_anom_score(time_series, forecast, stderr)

Compare a model’s forecast to a ground truth time series, in order to compute anomaly scores. By default, we compute a z-score if model uncertainty (stderr) is given, or the residuals if there is no model uncertainty.

Parameters
  • time_series (TimeSeries) – the ground truth time series.

  • forecast (TimeSeries) – the model’s forecasted values for the time series

  • stderr (Optional[TimeSeries]) – the standard errors of the model’s forecast

Return type

DataFrame

Returns

Anomaly scores based on the difference between the ground truth values and the model’s forecast.
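
The default scoring rule described above can be sketched in plain Python. This is only an illustration on lists of floats, not Merlion's implementation: the helper name `forecast_to_scores` is hypothetical, the standard errors are assumed to be strictly positive, and the real method operates on TimeSeries objects.

```python
from typing import List, Optional


def forecast_to_scores(
    truth: List[float],
    forecast: List[float],
    stderr: Optional[List[float]] = None,
) -> List[float]:
    """Hypothetical helper: residuals, or z-scores when stderr is given."""
    if len(truth) != len(forecast):
        raise ValueError("truth and forecast must have the same length")
    residuals = [y - yhat for y, yhat in zip(truth, forecast)]
    if stderr is None:
        # No uncertainty estimate: the score is the raw residual.
        return residuals
    if len(stderr) != len(residuals):
        raise ValueError("stderr must have the same length as the forecast")
    # With uncertainty: scale each residual by the forecast's standard error
    # (assumed strictly positive in this sketch).
    return [r / s for r, s in zip(residuals, stderr)]


truth = [10.0, 12.0, 30.0]
forecast = [10.0, 11.0, 12.0]
print(forecast_to_scores(truth, forecast))                   # residuals
print(forecast_to_scores(truth, forecast, [1.0, 2.0, 3.0]))  # z-scores
```

Scaling by the standard error means that a large miss counts for less when the model was already uncertain about that point.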

train(train_data, train_config=None, exog_data=None, anomaly_labels=None, post_rule_train_config=None)

Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on train data.

Parameters
  • train_data (TimeSeries) – a TimeSeries of metric values to train the model.

  • train_config – Additional training configs, if needed. Only required for some models.

  • anomaly_labels – a TimeSeries indicating which timestamps are anomalous. Optional.

  • post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.

Return type

TimeSeries

Returns

A TimeSeries of the model’s anomaly scores on the training data.

train_post_process(train_result, anomaly_labels=None, post_rule_train_config=None)

Converts the train result (anom scores on train data) into a TimeSeries object and trains the post-rule.

Parameters
  • train_result (Tuple[Union[TimeSeries, DataFrame], Union[TimeSeries, DataFrame, None]]) – Raw anomaly scores on the training data.

  • anomaly_labels – a TimeSeries indicating which timestamps are anomalous. Optional.

  • post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.

Return type

TimeSeries

get_anomaly_score(time_series, time_series_prev=None, exog_data=None)

Returns the model’s predicted sequence of anomaly scores.

Parameters
  • time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.

  • time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.

Return type

TimeSeries

Returns

a univariate TimeSeries of anomaly scores

get_anomaly_label(time_series, time_series_prev=None, exog_data=None)

Returns the model’s predicted sequence of anomaly scores, processed by any relevant post-rules (calibration and/or thresholding).

Parameters
  • time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.

  • time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.

Return type

TimeSeries

Returns

a univariate TimeSeries of anomaly scores, filtered by the model’s post-rule
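
As a rough illustration of what a thresholding post-rule does to a score sequence, the sketch below zeroes out every score whose magnitude falls below a cutoff. The function name `apply_threshold` and the cutoff of 3.0 are assumptions made for this example; the rule actually applied is whatever `threshold` is configured on the model, and calibration (if enabled) would run before it.

```python
from typing import List


def apply_threshold(scores: List[float], cutoff: float = 3.0) -> List[float]:
    """Hypothetical post-rule: keep scores with |score| >= cutoff, zero the rest."""
    return [s if abs(s) >= cutoff else 0.0 for s in scores]


# Scores assumed to be calibrated already (roughly z-scores).
calibrated = [0.4, -1.2, 3.5, -4.1, 2.9]
print(apply_threshold(calibrated))
```

Under this reading, non-zero entries in the filtered output mark the timestamps reported as anomalous.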

get_figure(*, time_series=None, time_stamps=None, time_series_prev=None, exog_data=None, plot_anomaly=True, filter_scores=True, plot_forecast=False, plot_forecast_uncertainty=False, plot_time_series_prev=False)

Parameters
  • time_series (Optional[TimeSeries]) – the time series over whose timestamps we wish to make a forecast. Exactly one of time_series or time_stamps should be provided.

  • time_stamps (Optional[List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for. Exactly one of time_series or time_stamps should be provided.

  • time_series_prev (Optional[TimeSeries]) – a time series immediately preceding time_series. If given, we use it to initialize the forecaster’s state. Otherwise, we assume that time_series immediately follows the training data.

  • exog_data (Optional[TimeSeries]) – A time series of exogenous variables. Exogenous variables are known a priori, and they are independent of the variable being forecasted. exog_data must include data for all of time_stamps; if time_series_prev is given, it must include data for all of time_series_prev.time_stamps as well. Optional. Only supported for models which inherit from ForecasterExogBase.

  • plot_anomaly – Whether to plot the model’s predicted anomaly scores.

  • filter_scores – whether to filter the anomaly scores by the post-rule before plotting them.

  • plot_forecast – Whether to plot the model’s forecasted values.

  • plot_forecast_uncertainty – whether to plot uncertainty estimates (the inter-quartile range) for forecast values. Not supported for all models.

  • plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.

Return type

Figure

Returns

a Figure of the model’s anomaly score predictions and/or forecast.

plot_anomaly(time_series, time_series_prev=None, exog_data=None, *, filter_scores=True, plot_forecast=False, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600), ax=None)

Plots the time series in matplotlib as a line graph, with points in the series overlaid as points color-coded to indicate their severity as anomalies. Optionally allows you to overlay the model’s forecast & the model’s uncertainty in its forecast (if applicable).

Parameters
  • time_series (TimeSeries) – the time series over whose timestamps we wish to make a forecast.

  • time_series_prev (Optional[TimeSeries]) – a time series immediately preceding time_series. If given, we use it to initialize the forecaster’s state. Otherwise, we assume that time_series immediately follows the training data.

  • exog_data (Optional[TimeSeries]) – A time series of exogenous variables. Exogenous variables are known a priori, and they are independent of the variable being forecasted. exog_data must include data for all of time_stamps; if time_series_prev is given, it must include data for all of time_series_prev.time_stamps as well. Optional. Only supported for models which inherit from ForecasterExogBase.

  • filter_scores – whether to filter the anomaly scores by the post-rule before plotting them.

  • plot_forecast – Whether to plot the model’s forecast, in addition to the anomaly scores.

  • plot_forecast_uncertainty – whether to plot uncertainty estimates (the inter-quartile range) for forecast values. Not supported for all models.

  • plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.

  • figsize – figure size in pixels

  • ax – matplotlib axis to add this plot to

Returns

matplotlib figure & axes

plot_anomaly_plotly(time_series, time_series_prev=None, exog_data=None, *, filter_scores=True, plot_forecast=False, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600))

Plots the time series in plotly as a line graph, with points in the series overlaid as points color-coded to indicate their severity as anomalies. Optionally allows you to overlay the model’s forecast & the model’s uncertainty in its forecast (if applicable).

Parameters
  • time_series (TimeSeries) – the time series over whose timestamps we wish to make a forecast.

  • time_series_prev (Optional[TimeSeries]) – a time series immediately preceding time_series. If given, we use it to initialize the forecaster’s state. Otherwise, we assume that time_series immediately follows the training data.

  • exog_data (Optional[TimeSeries]) – A time series of exogenous variables. Exogenous variables are known a priori, and they are independent of the variable being forecasted. exog_data must include data for all of time_stamps; if time_series_prev is given, it must include data for all of time_series_prev.time_stamps as well. Optional. Only supported for models which inherit from ForecasterExogBase.

  • filter_scores – whether to filter the anomaly scores by the post-rule before plotting them.

  • plot_forecast – Whether to plot the model’s forecast, in addition to the anomaly scores.

  • plot_forecast_uncertainty – whether to plot uncertainty estimates (the inter-quartile range) for forecast values. Not supported for all models.

  • plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.

  • figsize – figure size in pixels

Returns

plotly figure

plot_forecast(*, time_series=None, time_stamps=None, time_series_prev=None, exog_data=None, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600), ax=None)

Plots the forecast for the time series in matplotlib, optionally also plotting the uncertainty of the forecast, as well as the past values (both true and predicted) of the time series.

Parameters
  • time_series (Optional[TimeSeries]) – the time series over whose timestamps we wish to make a forecast. Exactly one of time_series or time_stamps should be provided.

  • time_stamps (Optional[List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for. Exactly one of time_series or time_stamps should be provided.

  • time_series_prev (Optional[TimeSeries]) – a time series immediately preceding time_series. If given, we use it to initialize the forecaster’s state. Otherwise, we assume that time_series immediately follows the training data.

  • exog_data (Optional[TimeSeries]) – A time series of exogenous variables. Exogenous variables are known a priori, and they are independent of the variable being forecasted. exog_data must include data for all of time_stamps; if time_series_prev is given, it must include data for all of time_series_prev.time_stamps as well. Optional. Only supported for models which inherit from ForecasterExogBase.

  • plot_forecast_uncertainty – whether to plot uncertainty estimates (the inter-quartile range) for forecast values. Not supported for all models.

  • plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.

  • figsize – figure size in pixels

  • ax – matplotlib axis to add this plot to

Returns

(fig, ax): matplotlib figure & axes the figure was plotted on

plot_forecast_plotly(*, time_series=None, time_stamps=None, time_series_prev=None, exog_data=None, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600))

Plots the forecast for the time series in plotly, optionally also plotting the uncertainty of the forecast, as well as the past values (both true and predicted) of the time series.

Parameters
  • time_series (Optional[TimeSeries]) – the time series over whose timestamps we wish to make a forecast. Exactly one of time_series or time_stamps should be provided.

  • time_stamps (Optional[List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for. Exactly one of time_series or time_stamps should be provided.

  • time_series_prev (Optional[TimeSeries]) – a time series immediately preceding time_series. If given, we use it to initialize the forecaster’s state. Otherwise, we assume that time_series immediately follows the training data.

  • exog_data (Optional[TimeSeries]) – A time series of exogenous variables. Exogenous variables are known a priori, and they are independent of the variable being forecasted. exog_data must include data for all of time_stamps; if time_series_prev is given, it must include data for all of time_series_prev.time_stamps as well. Optional. Only supported for models which inherit from ForecasterExogBase.

  • plot_forecast_uncertainty – whether to plot uncertainty estimates (the inter-quartile range) for forecast values. Not supported for all models.

  • plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.

  • figsize – figure size in pixels

merlion.models.anomaly.forecast_based.arima module

Classic ARIMA (AutoRegressive Integrated Moving Average) forecasting model, adapted for anomaly detection.

class merlion.models.anomaly.forecast_based.arima.ArimaDetectorConfig(order=(4, 1, 2), seasonal_order=(0, 0, 0, 0), exog_transform: TransformBase = None, exog_aggregation_policy: Union[AggregationPolicy, str] = 'Mean', exog_missing_value_policy: Union[MissingValuePolicy, str] = 'ZFill', max_forecast_steps: int = None, target_seq_index: int = None, invert_transform=None, transform: TransformBase = None, max_score: float = 1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)

Bases: ArimaConfig, DetectorConfig

Configuration class for Arima. Just a Sarima model with seasonal order (0, 0, 0, 0).

Base class of the object used to configure an anomaly detection model.

Parameters
  • order – Order is (p, d, q) for an ARIMA(p, d, q) process. d must be an integer indicating the integration order of the process, while p and q must be integers indicating the AR and MA orders (so that all lags up to those orders are included).

  • seasonal_order – (0, 0, 0, 0) because ARIMA has no seasonal order.

  • exog_transform – The pre-processing transform for exogenous data. Note: resampling is handled separately.

  • exog_aggregation_policy – The policy to use for aggregating values in exogenous data, to ensure it is sampled at the same timestamps as the endogenous data.

  • exog_missing_value_policy – The policy to use for imputing missing values in exogenous data, to ensure it is sampled at the same timestamps as the endogenous data.

  • max_forecast_steps – Max # of steps we would like to forecast for. Required for some models like MSES.

  • target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.

  • invert_transform – Whether to automatically invert the transform before returning a forecast. By default, we will invert the transform for all base forecasters if it supports a proper inversion, but we will not invert it for forecaster-based anomaly detectors or transforms without proper inversions.

  • transform – Transformation to pre-process input time series.

  • max_score – maximum possible uncalibrated anomaly score

  • threshold – the rule to use for thresholding anomaly scores

  • enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).

  • enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores

class merlion.models.anomaly.forecast_based.arima.ArimaDetector(config)

Bases: ForecastingDetectorBase, Arima

config_class

alias of ArimaDetectorConfig

merlion.models.anomaly.forecast_based.sarima module

Seasonal ARIMA (SARIMA) forecasting model, adapted for anomaly detection.

class merlion.models.anomaly.forecast_based.sarima.SarimaDetectorConfig(order=(4, 1, 2), seasonal_order=(2, 0, 1, 24), exog_transform: TransformBase = None, exog_aggregation_policy: Union[AggregationPolicy, str] = 'Mean', exog_missing_value_policy: Union[MissingValuePolicy, str] = 'ZFill', max_forecast_steps: int = None, target_seq_index: int = None, invert_transform=None, transform: TransformBase = None, max_score: float = 1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)

Bases: SarimaConfig, DetectorConfig

Config class for Sarima (Seasonal AutoRegressive Integrated Moving Average).

Base class of the object used to configure an anomaly detection model.

Parameters
  • order – Order is (p, d, q) for an ARIMA(p, d, q) process. d must be an integer indicating the integration order of the process, while p and q must be integers indicating the AR and MA orders (so that all lags up to those orders are included).

  • seasonal_order – Seasonal order is (P, D, Q, S) for a seasonal ARIMA process, where S is the length of the seasonality cycle (e.g. S=24 for 24 hours on hourly granularity). P, D, Q are as for ARIMA.

  • exog_transform – The pre-processing transform for exogenous data. Note: resampling is handled separately.

  • exog_aggregation_policy – The policy to use for aggregating values in exogenous data, to ensure it is sampled at the same timestamps as the endogenous data.

  • exog_missing_value_policy – The policy to use for imputing missing values in exogenous data, to ensure it is sampled at the same timestamps as the endogenous data.

  • max_forecast_steps – Max # of steps we would like to forecast for. Required for some models like MSES.

  • target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.

  • invert_transform – Whether to automatically invert the transform before returning a forecast. By default, we will invert the transform for all base forecasters if it supports a proper inversion, but we will not invert it for forecaster-based anomaly detectors or transforms without proper inversions.

  • transform – Transformation to pre-process input time series.

  • max_score – maximum possible uncalibrated anomaly score

  • threshold – the rule to use for thresholding anomaly scores

  • enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).

  • enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
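
To make the roles of d in order and of D and S in seasonal_order concrete, the sketch below applies the differencing they describe to a toy series. It is a plain-Python illustration of the standard differencing operator, not the code Merlion runs; the helper name `difference` and the toy period of 4 are assumptions for the example.

```python
from typing import List


def difference(series: List[float], lag: int = 1, times: int = 1) -> List[float]:
    """Apply the lag-`lag` difference operator `times` times."""
    if lag < 1 or times < 0:
        raise ValueError("lag must be >= 1 and times must be >= 0")
    out = list(series)
    for _ in range(times):
        out = [out[i] - out[i - lag] for i in range(lag, len(out))]
    return out


# Toy series: a linear trend plus a pattern that repeats every 4 steps.
pattern = [0.0, 5.0, 2.0, 7.0]
series = [float(t) + pattern[t % 4] for t in range(12)]

# Seasonal differencing (D=1, S=4) cancels the repeating pattern ...
seasonal_diffed = difference(series, lag=4, times=1)
# ... and ordinary differencing (d=1) then removes what is left of the trend.
fully_diffed = difference(seasonal_diffed, lag=1, times=1)
print(seasonal_diffed)
print(fully_diffed)
```

Each round of differencing shortens the series by `lag` points, which is one reason large values of D with a long cycle S need a correspondingly long training series.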

class merlion.models.anomaly.forecast_based.sarima.SarimaDetector(config)

Bases: ForecastingDetectorBase, Sarima

config_class

alias of SarimaDetectorConfig

merlion.models.anomaly.forecast_based.ets module

ETS (error, trend, seasonal) forecasting model, adapted for anomaly detection.

class merlion.models.anomaly.forecast_based.ets.ETSDetectorConfig(max_forecast_steps=None, target_seq_index=None, error='add', trend='add', damped_trend=True, seasonal='add', seasonal_periods=None, pred_interval_strategy='exact', refit=True, invert_transform=None, transform: TransformBase = None, enable_calibrator=False, max_score: float = 1000, threshold=None, enable_threshold=True, **kwargs)

Bases: ETSConfig, NoCalibrationDetectorConfig

Configuration class for the ETS model. ETS is a state space model consisting of an error term (E), a trend component (T), a seasonal component (S), and a level component. Each component can take an additive (‘add’) or multiplicative (‘mul’) form. Refer to https://otexts.com/fpp2/taxonomy.html for more information about ETS models.

Base class of the object used to configure an anomaly detection model.

Parameters
  • max_forecast_steps – Number of steps we would like to forecast for.

  • target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.

  • error – The error term. “add” or “mul”.

  • trend – The trend component. “add”, “mul” or None.

  • damped_trend – Whether or not an included trend component is damped.

  • seasonal – The seasonal component. “add”, “mul” or None.

  • seasonal_periods – The length of the seasonality cycle. None by default.

  • pred_interval_strategy – Strategy to compute prediction intervals. “exact” or “simulated”. Note that the “simulated” setting supports more variants of the ETS model.

  • refit – if True, refit the full ETS model when time_series_prev is given to the forecast method (slower). If False, simply perform exponential smoothing (faster).

  • invert_transform – Whether to automatically invert the transform before returning a forecast. By default, we will invert the transform for all base forecasters if it supports a proper inversion, but we will not invert it for forecaster-based anomaly detectors or transforms without proper inversions.

  • transform – Transformation to pre-process input time series.

  • enable_calibrator – False because this config assumes calibrated outputs from the model.

  • max_score – maximum possible uncalibrated anomaly score

  • threshold – the rule to use for thresholding anomaly scores

  • enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
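
For intuition about the default trend='add' with damped_trend=True, here is a minimal sketch of the textbook damped additive-trend smoothing recursions, with no seasonal component. It is not Merlion's implementation: the function name `damped_holt_forecast`, the fixed smoothing parameters and the simple initial state are all assumptions, whereas a fitted ETS model estimates these from the data.

```python
from typing import List


def damped_holt_forecast(
    y: List[float],
    horizon: int,
    alpha: float = 0.5,  # level smoothing (assumed, not fitted)
    beta: float = 0.3,   # trend smoothing (assumed, not fitted)
    phi: float = 0.9,    # damping factor; phi = 1.0 means no damping
) -> List[float]:
    """Point forecasts from an additive damped-trend smoother, no seasonality."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    # Assumed initial state: first value as level, first difference as trend.
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts, damp = [], 0.0
    for h in range(1, horizon + 1):
        damp += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp * trend)
    return forecasts


print(damped_holt_forecast([0.0, 2.0, 4.0, 6.0], horizon=3))
```

With phi below 1, each further step ahead adds a smaller increment, so long-horizon forecasts level off instead of extrapolating the trend indefinitely.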

class merlion.models.anomaly.forecast_based.ets.ETSDetector(config)

Bases: ForecastingDetectorBase, ETS

config_class

alias of ETSDetectorConfig

merlion.models.anomaly.forecast_based.prophet module

Adaptation of Facebook’s Prophet forecasting model to anomaly detection.

class merlion.models.anomaly.forecast_based.prophet.ProphetDetectorConfig(max_forecast_steps=None, target_seq_index=None, yearly_seasonality='auto', weekly_seasonality='auto', daily_seasonality='auto', seasonality_mode='additive', holidays=None, uncertainty_samples=100, exog_transform=None, exog_aggregation_policy='Mean', exog_missing_value_policy='ZFill', invert_transform=None, transform=None, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)

Bases: ProphetConfig, DetectorConfig

Configuration class for Facebook’s Prophet model, as described by Taylor & Letham, 2017.

Base class of the object used to configure an anomaly detection model.

Parameters
  • max_forecast_steps (Optional[int]) – Max # of steps we would like to forecast for.

  • target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.

  • yearly_seasonality (Union[bool, int]) – If bool, whether to enable yearly seasonality. By default, it is activated if there are >= 2 years of history, but deactivated otherwise. If int, this is the number of Fourier series components used to model the seasonality (default = 10).

  • weekly_seasonality (Union[bool, int]) – If bool, whether to enable weekly seasonality. By default, it is activated if there are >= 2 weeks of history, but deactivated otherwise. If int, this is the number of Fourier series components used to model the seasonality (default = 3).

  • daily_seasonality (Union[bool, int]) – If bool, whether to enable daily seasonality. By default, it is activated if there are >= 2 days of history, but deactivated otherwise. If int, this is the number of Fourier series components used to model the seasonality (default = 4).

  • seasonality_mode – ‘additive’ (default) or ‘multiplicative’.

  • holidays – pd.DataFrame with columns holiday (string) and ds (date type) and optionally columns lower_window and upper_window which specify a range of days around the date to be included as holidays. lower_window=-2 will include 2 days prior to the date as holidays. Also optionally can have a column prior_scale specifying the prior scale for that holiday. Can also be a dict corresponding to the desired pd.DataFrame.

  • uncertainty_samples (int) – The number of posterior samples to draw in order to calibrate the anomaly scores.

  • exog_transform – The pre-processing transform for exogenous data. Note: resampling is handled separately.

  • exog_aggregation_policy – The policy to use for aggregating values in exogenous data, to ensure it is sampled at the same timestamps as the endogenous data.

  • exog_missing_value_policy – The policy to use for imputing missing values in exogenous data, to ensure it is sampled at the same timestamps as the endogenous data.

  • invert_transform – Whether to automatically invert the transform before returning a forecast. By default, we will invert the transform for all base forecasters if it supports a proper inversion, but we will not invert it for forecaster-based anomaly detectors or transforms without proper inversions.

  • transform – Transformation to pre-process input time series.

  • max_score – maximum possible uncalibrated anomaly score

  • threshold – the rule to use for thresholding anomaly scores

  • enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).

  • enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
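
The "number of Fourier series components" in the seasonality parameters refers to the order of a truncated Fourier expansion. The sketch below builds the sine and cosine features for one timestamp; the helper name `fourier_features` and the choice of measuring time in days are assumptions for illustration, and this is not Prophet's or Merlion's internal code.

```python
import math
from typing import List


def fourier_features(t: float, period: float, order: int) -> List[float]:
    """Return [sin(2*pi*n*t/P), cos(2*pi*n*t/P)] for n = 1..order, flattened."""
    feats: List[float] = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


# Weekly seasonality with 3 components, evaluated 2 days into the series.
weekly = fourier_features(t=2.0, period=7.0, order=3)
print(len(weekly))  # 2 features per component
```

A higher order lets the seasonal curve change more quickly within one cycle, at the cost of more coefficients to fit.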

class merlion.models.anomaly.forecast_based.prophet.ProphetDetector(config)

Bases: ForecastingDetectorBase, Prophet

config_class

alias of ProphetDetectorConfig

merlion.models.anomaly.forecast_based.lstm module

Adaptation of an LSTM neural net forecaster to the task of anomaly detection.

class merlion.models.anomaly.forecast_based.lstm.LSTMDetectorConfig(max_forecast_steps, nhid=1024, model_strides=(1,), target_seq_index=None, invert_transform=None, transform=None, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)

Bases: LSTMConfig, DetectorConfig

Configuration class for LSTM.

Base class of the object used to configure an anomaly detection model.

Parameters
  • max_forecast_steps (int) – Max # of steps we would like to forecast for. Required for some models like MSES.

  • nhid – hidden dimension of LSTM

  • model_strides – tuple indicating the stride(s) at which we would like to subsample the input data before giving it to the model.

  • target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.

  • invert_transform – Whether to automatically invert the transform before returning a forecast. By default, we will invert the transform for all base forecasters if it supports a proper inversion, but we will not invert it for forecaster-based anomaly detectors or transforms without proper inversions.

  • transform – Transformation to pre-process input time series.

  • max_score – maximum possible uncalibrated anomaly score

  • threshold – the rule to use for thresholding anomaly scores

  • enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).

  • enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores

class merlion.models.anomaly.forecast_based.lstm.LSTMDetector(config)

Bases: ForecastingDetectorBase, LSTM

Parameters

config (LSTMConfig) – model configuration

config_class

alias of LSTMDetectorConfig

merlion.models.anomaly.forecast_based.mses module

MSES (Multi-Scale Exponential Smoother) forecasting model adapted for anomaly detection.

class merlion.models.anomaly.forecast_based.mses.MSESDetectorConfig(max_forecast_steps, online_updates=True, max_backstep=None, recency_weight=0.5, accel_weight=1.0, optimize_acc=True, eta=0.0, rho=0.0, phi=2.0, inflation=1.0, target_seq_index=None, invert_transform=None, transform=None, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)

Bases: MSESConfig, DetectorConfig

Configuration class for an MSES forecasting model adapted for anomaly detection.

Let w be the recency weight, B the maximum backstep, x_t the last seen data point, and l_{s,t} the series of losses for scale s. The forecast at horizon h is then:

\[
\begin{aligned}
\hat{x}_{t+h} &= \sum_{b=0}^{B} p_b \cdot (x_{t-b} + v_{b+h,t} + a_{b+h,t}) \\
\text{where} \quad v_{b+h,t} &= \text{EMA}_w(\Delta_{b+h} x_t) \\
a_{b+h,t} &= \text{EMA}_w(\Delta_{b+h}^2 x_t) \\
\text{and} \quad p_b &= \sigma(z)_b \quad \text{where} \quad z_b = (b+h)^\phi \cdot \text{EMA}_w(l_{b+h,t}) \cdot \text{RWSE}_w(l_{b+h,t})
\end{aligned}
\]

Parameters
  • max_forecast_steps (int) – Max # of steps we would like to forecast for. Required for some models like MSES.

  • max_backstep – Max backstep to use in forecasting. If we train with x(0),…,x(t), then the b-th model that MSES uses will forecast x(t+h) by anchoring at x(t-b) and predicting xhat(t+h) = x(t-b) + delta_hat(b+h).

  • recency_weight – The recency weight parameter to use when estimating delta_hat.

  • accel_weight – The weight to scale the acceleration by when computing delta_hat. Specifically, delta_hat(b+h) = velocity(b+h) + accel_weight * acceleration(b+h).

  • optimize_acc – If True, the acceleration correction will only be used at scales ranging from 1, …, (max_backstep + max_forecast_steps) / 2.

  • eta – The parameter used to control the rate at which recency_weight gets tuned when online updates are made to the model and losses can be computed.

  • rho – The parameter that determines what fraction of the overall error is due to velocity error, while the rest is due to loss error. The error at any scale will be determined as rho * velocity_error + (1-rho) * loss_error.

  • phi – The parameter used to exponentially inflate the magnitude of loss error at different scales. Loss error for scale s will be increased by a factor of phi ** s.

  • inflation – The inflation exponent to use when computing the distribution p(b|h) over the models when forecasting at horizon h according to standard errors of the estimated velocities over the models; inflation=1 is equivalent to using the softmax function.

  • target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.

  • invert_transform – Whether to automatically invert the transform before returning a forecast. By default, we will invert the transform for all base forecasters if it supports a proper inversion, but we will not invert it for forecaster-based anomaly detectors or transforms without proper inversions.

  • transform – Transformation to pre-process input time series.

  • max_score – maximum possible uncalibrated anomaly score

  • threshold – the rule to use for thresholding anomaly scores

  • enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).

  • enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
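
The weighted combination in the forecast formula for this class can be sketched as follows. The sketch takes the per-backstep anchors x_{t-b}, velocities, accelerations and logits z_b as given inputs, since how the EMA and RWSE terms are computed is not spelled out here. The names `softmax` and `combine_forecasts` are hypothetical, and the softmax is applied to the logits exactly as written in the formula (the case inflation=1).

```python
import math
from typing import List


def softmax(z: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def combine_forecasts(
    anchors: List[float],     # x_{t-b} for b = 0..B
    velocities: List[float],  # v_{b+h,t} for b = 0..B
    accels: List[float],      # a_{b+h,t} for b = 0..B
    logits: List[float],      # z_b for b = 0..B
    accel_weight: float = 1.0,
) -> float:
    """Hypothetical helper: softmax-weighted sum of per-backstep forecasts."""
    candidates = [
        x + v + accel_weight * a
        for x, v, a in zip(anchors, velocities, accels)
    ]
    weights = softmax(logits)
    return sum(p * c for p, c in zip(weights, candidates))


# Two backsteps whose individual forecasts disagree (11.0 vs 20.0).
print(combine_forecasts([10.0, 8.0], [1.0, 12.0], [0.0, 0.0], [0.0, 0.0]))
```

With equal logits every backstep contributes equally; as one logit grows relative to the others, the combined forecast moves toward that backstep's candidate.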

class merlion.models.anomaly.forecast_based.mses.MSESDetector(config)

Bases: ForecastingDetectorBase, MSES

Parameters

config (MSESConfig) – model configuration

config_class

alias of MSESDetectorConfig

property online_updates

get_anomaly_score(time_series, time_series_prev=None, exog_data=None)

Returns the model’s predicted sequence of anomaly scores.

Parameters
  • time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.

  • time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.

Return type

TimeSeries

Returns

a univariate TimeSeries of anomaly scores