merlion.models.forecast package
Contains all forecasting models.
For forecasting, we define an abstract base class ForecasterBase, which inherits from ModelBase and supports the following interface, in addition to model.save() and ForecasterClass.load defined for ModelBase:
model = ForecasterClass(config)
  initialization with a model-specific config (which inherits from ForecasterConfig)
  configs contain:
    a (potentially trainable) data pre-processing transform from merlion.transform; note that model.transform is a property which refers to model.config.transform
    model-specific hyperparameters
    optionally, a maximum number of steps the model can forecast for
model.forecast(time_stamps, time_series_prev=None)
  returns the forecast (TimeSeries) for future values at the time stamps specified by time_stamps, as well as the standard error of that forecast (TimeSeries, may be None)
  if time_series_prev is specified, it is used as the most recent context. Otherwise, the training data is used
model.train(train_data, train_config=None)
  trains the model on the TimeSeries train_data
  train_config (optional): extra configuration describing how the model should be trained (e.g. learning rate for LSTM). Not used for all models. Class-level default provided for models which do use it.
  returns the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
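To make the call pattern above concrete, here is a minimal, self-contained sketch of a forecaster that follows the same contract. It does not use Merlion: ToyConfig and MeanForecaster are hypothetical stand-ins, time series are plain lists of (timestamp, value) pairs rather than TimeSeries objects, and the "model" simply predicts the mean of its context.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Stand-in for a TimeSeries: a list of (timestamp, value) pairs.
Series = List[Tuple[int, float]]


@dataclass
class ToyConfig:
    """Hypothetical config; mirrors the optional forecast-length limit."""
    max_forecast_steps: Optional[int] = None


class MeanForecaster:
    """Hypothetical forecaster that always predicts the mean of its context."""

    def __init__(self, config: ToyConfig):
        self.config = config
        self.train_values: List[float] = []

    def _predict(self, time_stamps: List[int], context: List[float]) -> Tuple[Series, Series]:
        mean = sum(context) / len(context)
        var = sum((v - mean) ** 2 for v in context) / max(len(context) - 1, 1)
        stderr = math.sqrt(var)
        return [(t, mean) for t in time_stamps], [(t, stderr) for t in time_stamps]

    def train(self, train_data: Series, train_config=None) -> Tuple[Series, Series]:
        self.train_values = [v for _, v in train_data]
        # Return the prediction on the training timestamps, in forecast() format.
        return self._predict([t for t, _ in train_data], self.train_values)

    def forecast(self, time_stamps: List[int],
                 time_series_prev: Optional[Series] = None) -> Tuple[Series, Series]:
        limit = self.config.max_forecast_steps
        if limit is not None and len(time_stamps) > limit:
            raise ValueError(f"can forecast at most {limit} steps")
        # Use time_series_prev as the most recent context if given,
        # otherwise fall back to the training data.
        if time_series_prev:
            context = [v for _, v in time_series_prev]
        else:
            context = self.train_values
        return self._predict(time_stamps, context)
```

Calling train and then forecast on this toy class returns a (forecast, stderr) pair in both cases, which is the shape of result the interface above describes.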
Base class for forecasting models. 

The classic statistical forecasting model ARIMA (AutoRegressive Integrated Moving Average). 

A variant of ARIMA with a user-specified Seasonality. 

ETS (Error, Trend, Seasonal) forecasting model. 

Wrapper around Facebook's popular Prophet model for time series forecasting. 

Multi-Scale Exponential Smoother for univariate time series forecasting. 

Vector AutoRegressive model for multivariate time series forecasting. 

Bagging Tree-based models for multivariate time series forecasting. 

Boosting Tree-based models for multivariate time series forecasting. 

A forecaster based on an LSTM neural net. 
Submodules
merlion.models.forecast.base module
Base class for forecasting models.
 class merlion.models.forecast.base.ForecasterConfig(max_forecast_steps=None, target_seq_index=None, transform=None, **kwargs)
Bases: Config
Config object used to define a forecaster model.
 Parameters
max_forecast_steps (Optional[int]) – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to pre-process input time series.
 max_forecast_steps: Optional[int] = None
 target_seq_index: Optional[int] = None
 class merlion.models.forecast.base.ForecasterBase(config)
Bases: ModelBase
Base class for a forecaster model.
Note
If your model depends on an evenly spaced time series, make sure to:
  Call ForecasterBase.train_pre_process in ForecasterBase.train.
  Call ForecasterBase.resample_time_stamps at the start of ForecasterBase.forecast to get a set of resampled time stamps, and call time_series.align(reference=time_stamps) to align the forecast with the original time stamps.
 config_class
alias of ForecasterConfig
 target_name = None
The name of the target univariate to forecast.
 property max_forecast_steps
 property target_seq_index: int
 Return type
int
 Returns
the index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
 resample_time_stamps(time_stamps, time_series_prev=None)
 train_pre_process(train_data, require_even_sampling, require_univariate)
Applies preprocessing steps common for training most models.
 Parameters
train_data (TimeSeries) – the original time series of training data
require_even_sampling (bool) – whether the model assumes that training data is sampled at a fixed frequency
require_univariate (bool) – whether the model only works with univariate time series
 Return type
 Returns
the training data, after any necessary preprocessing has been applied
 abstract train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config – Additional training configs, if needed. Only required for some models.
 Return type
Tuple[TimeSeries, Optional[TimeSeries]]
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 abstract forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr (bool) – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Return type
Union[Tuple[TimeSeries, Optional[TimeSeries]], Tuple[TimeSeries, TimeSeries, TimeSeries]]
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
  forecast: the forecast for the timestamps given
  forecast_stderr: the standard error of each forecast value. May be None.
  forecast_lb: 25th percentile of forecast values for each timestamp
  forecast_ub: 75th percentile of forecast values for each timestamp
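When only a forecast and its standard error are available, one common way to approximate the 25th and 75th percentiles is to assume Gaussian forecast errors. The sketch below does this with the standard library. The function name iqr_from_stderr is hypothetical, and the Gaussian assumption is an illustration only; this page does not state how Merlion itself computes forecast_lb and forecast_ub.

```python
from statistics import NormalDist
from typing import List, Tuple


def iqr_from_stderr(forecast: List[float],
                    stderr: List[float]) -> Tuple[List[float], List[float]]:
    """Approximate 25th/75th percentiles, assuming Gaussian forecast errors."""
    # 75th percentile of the standard normal distribution (about 0.6745).
    z = NormalDist().inv_cdf(0.75)
    lower = [f - z * s for f, s in zip(forecast, stderr)]
    upper = [f + z * s for f, s in zip(forecast, stderr)]
    return lower, upper
```

Under this assumption the band is symmetric around the forecast, with half-width of roughly 0.67 standard errors.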
 batch_forecast(time_stamps_list, time_series_prev_list, return_iqr=False, return_prev=False)
Returns the model’s forecast on a batch of timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps_list (List[List[int]]) – a list of lists of timestamps we wish to forecast for
time_series_prev_list (List[TimeSeries]) – a list of TimeSeries immediately preceding the time stamps in time_stamps_list
return_iqr (bool) – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Return type
Tuple[Union[Tuple[List[TimeSeries], List[Optional[TimeSeries]]], Tuple[List[TimeSeries], List[TimeSeries], List[TimeSeries]]]]
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
  forecast: the forecast for the timestamps given
  forecast_stderr: the standard error of each forecast value. May be None.
  forecast_lb: 25th percentile of forecast values for each timestamp
  forecast_ub: 75th percentile of forecast values for each timestamp
 invert_transform(forecast, time_series_prev=None)
 get_figure(*, time_series=None, time_stamps=None, time_series_prev=None, plot_forecast_uncertainty=False, plot_time_series_prev=False)
 Parameters
time_series (Optional[TimeSeries]) – the time series over whose timestamps we wish to make a forecast. Exactly one of time_series or time_stamps should be provided.
time_stamps (Optional[List[int]]) – a list of timestamps we wish to forecast for. Exactly one of time_series or time_stamps should be provided.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
plot_forecast_uncertainty – whether to plot uncertainty estimates (the interquartile range) for forecast values. Not supported for all models.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
 Return type
 Returns
a Figure of the model’s forecast.
 plot_forecast(*, time_series=None, time_stamps=None, time_series_prev=None, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600), ax=None)
Plots the forecast for the time series in matplotlib, optionally also plotting the uncertainty of the forecast, as well as the past values (both true and predicted) of the time series.
 Parameters
time_series (Optional[TimeSeries]) – the time series over whose timestamps we wish to make a forecast. Exactly one of time_series or time_stamps should be provided.
time_stamps (Optional[List[int]]) – a list of timestamps we wish to forecast for. Exactly one of time_series or time_stamps should be provided.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
plot_forecast_uncertainty – whether to plot uncertainty estimates (the interquartile range) for forecast values. Not supported for all models.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
figsize – figure size in pixels
ax – matplotlib axis to add this plot to
 Returns
(fig, ax): matplotlib figure & axes the figure was plotted on
 plot_forecast_plotly(*, time_series=None, time_stamps=None, time_series_prev=None, plot_forecast_uncertainty=False, plot_time_series_prev=False, figsize=(1000, 600))
Plots the forecast for the time series in plotly, optionally also plotting the uncertainty of the forecast, as well as the past values (both true and predicted) of the time series.
 Parameters
time_series (Optional[TimeSeries]) – the time series over whose timestamps we wish to make a forecast. Exactly one of time_series or time_stamps should be provided.
time_stamps (Optional[List[int]]) – a list of timestamps we wish to forecast for. Exactly one of time_series or time_stamps should be provided.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
plot_forecast_uncertainty – whether to plot uncertainty estimates (the interquartile range) for forecast values. Not supported for all models.
plot_time_series_prev – whether to plot time_series_prev (and the model’s fit for it). Only used if time_series_prev is given.
figsize – figure size in pixels
merlion.models.forecast.arima module
The classic statistical forecasting model ARIMA (AutoRegressive Integrated Moving Average).
 class merlion.models.forecast.arima.ArimaConfig(order=(4, 1, 2), seasonal_order=(0, 0, 0, 0), max_forecast_steps: int = None, target_seq_index: int = None, transform: TransformBase = None, max_score: float = 1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)
Bases: SarimaConfig
Configuration class for Arima. Just a Sarima model with seasonal order (0, 0, 0, 0).
Base class of the object used to configure an anomaly detection model.
 Parameters
order – Order is (p, d, q) for an ARIMA(p, d, q) process. d must be an integer indicating the integration order of the process, while p and q must be integers indicating the AR and MA orders (so that all lags up to those orders are included).
seasonal_order – (0, 0, 0, 0) because ARIMA has no seasonal order.
max_forecast_steps – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to pre-process input time series.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
 property seasonal_order: Tuple[int, int, int, int]
 Return type
Tuple[int, int, int, int]
 Returns
(0, 0, 0, 0) because ARIMA has no seasonal order.
 class merlion.models.forecast.arima.Arima(config)
Bases: Sarima
Implementation of the classic statistical model ARIMA (AutoRegressive Integrated Moving Average) for forecasting.
 config_class
alias of ArimaConfig
merlion.models.forecast.sarima module
A variant of ARIMA with a user-specified Seasonality.
 class merlion.models.forecast.sarima.SarimaConfig(order=(4, 1, 2), seasonal_order=(2, 0, 1, 24), max_forecast_steps: int = None, target_seq_index: int = None, transform: TransformBase = None, max_score: float = 1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)
Bases: ForecasterConfig
Config class for Sarima (Seasonal AutoRegressive Integrated Moving Average).
Base class of the object used to configure an anomaly detection model.
 Parameters
order – Order is (p, d, q) for an ARIMA(p, d, q) process. d must be an integer indicating the integration order of the process, while p and q must be integers indicating the AR and MA orders (so that all lags up to those orders are included).
seasonal_order – Seasonal order is (P, D, Q, S) for a seasonal ARIMA process, where S is the length of the seasonality cycle (e.g. S=24 for 24 hours on hourly granularity). P, D, Q are as for ARIMA.
max_forecast_steps – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to pre-process input time series.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
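The d and D entries of order and seasonal_order say how many times the series is differenced before the AR and MA terms are fitted. As a rough illustration in plain Python (not Merlion code; the difference helper and the toy series are hypothetical), ordinary differencing subtracts the previous value, and seasonal differencing subtracts the value one seasonal cycle back:

```python
from typing import List


def difference(values: List[float], lag: int = 1, times: int = 1) -> List[float]:
    """Apply x[t] - x[t - lag] to the series `times` times."""
    for _ in range(times):
        values = [values[i] - values[i - lag] for i in range(lag, len(values))]
    return values


# Toy series: a linear trend plus a seasonal pattern with cycle length 4.
pattern = [0.0, 5.0, 0.0, -5.0]
series = [t + pattern[t % 4] for t in range(12)]

# D=1 with cycle length 4: removes the repeating seasonal pattern.
seasonal_diff = difference(series, lag=4, times=1)
# d=1: removes what is left of the linear trend.
fully_diff = difference(seasonal_diff, lag=1, times=1)
```

On this toy series, seasonal differencing leaves a constant and the additional ordinary difference leaves zeros, which is the kind of stationary residual the AR and MA terms are then fitted to.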
 class merlion.models.forecast.sarima.Sarima(config)
Bases: ForecasterBase, SeasonalityModel
Implementation of the classic statistical model SARIMA (Seasonal AutoRegressive Integrated Moving Average) for forecasting.
 config_class
alias of SarimaConfig
 property order: Tuple[int, int, int]
 Return type
Tuple[int, int, int]
 Returns
the order (p, d, q) of the model, where p is the AR order, d is the integration order, and q is the MA order.
 property seasonal_order: Tuple[int, int, int, int]
 Return type
Tuple[int, int, int, int]
 Returns
the seasonal order (P, D, Q, S) for the seasonal ARIMA process, where P is the AR order, D is the integration order, Q is the MA order, and S is the length of the seasonality cycle.
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config – Additional training configs, if needed. Only required for some models.
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Return type
Union[Tuple[TimeSeries, TimeSeries], Tuple[TimeSeries, TimeSeries, TimeSeries]]
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
  forecast: the forecast for the timestamps given
  forecast_stderr: the standard error of each forecast value. May be None.
  forecast_lb: 25th percentile of forecast values for each timestamp
  forecast_ub: 75th percentile of forecast values for each timestamp
 set_seasonality(theta, train_data)
Implement this method to do any model-specific adjustments on the seasonality that was provided by SeasonalityLayer.
 Parameters
theta – Seasonality processed by SeasonalityLayer.
train_data (UnivariateTimeSeries) – Training data (or numpy array representing the target univariate) for any model-specific adjustments you might want to make.
merlion.models.forecast.ets module
ETS (Error, Trend, Seasonal) forecasting model.
 class merlion.models.forecast.ets.ETSConfig(max_forecast_steps=None, target_seq_index=None, error='add', trend='add', damped_trend=True, seasonal='add', seasonal_periods=None, transform: TransformBase = None, enable_calibrator=False, max_score: float = 1000, threshold=None, enable_threshold=True, **kwargs)
Bases: ForecasterConfig
Configuration class for the ETS model. The ETS model is built on an underlying state space model consisting of an error term (E), a trend component (T), a seasonal component (S), and a level component. Each component can take an additive (‘add’) or multiplicative (‘mul’) form. Refer to https://otexts.com/fpp2/taxonomy.html for more information about the ETS model.
Base class of the object used to configure an anomaly detection model.
 Parameters
max_forecast_steps – Number of steps we would like to forecast for.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
error – The error term. “add” or “mul”.
trend – The trend component. “add”, “mul” or None.
damped_trend – Whether or not an included trend component is damped.
seasonal – The seasonal component. “add”, “mul” or None.
seasonal_periods – The length of the seasonality cycle. None by default.
transform – Transformation to pre-process input time series.
enable_calibrator – False because this config assumes calibrated outputs from the model.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
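To illustrate how the level, trend, damping and seasonal components combine in the additive case, the sketch below evaluates the textbook damped additive point forecast from the fpp2 taxonomy referenced above: level + (phi + phi^2 + ... + phi^h) * trend + seasonal term. It is not Merlion's implementation. The function name and arguments are hypothetical, and the component values are assumed to have been fitted already.

```python
from typing import List, Optional


def damped_additive_forecast(level: float, trend: float, phi: float,
                             seasonal: Optional[List[float]],
                             horizon: int) -> List[float]:
    """Point forecasts for steps 1..horizon of an additive damped-trend model.

    `seasonal` holds the seasonal states for the next cycle, in order;
    pass None for a non-seasonal model.
    """
    forecasts = []
    damp = 0.0
    for h in range(1, horizon + 1):
        damp += phi ** h  # running sum phi + phi^2 + ... + phi^h
        season = seasonal[(h - 1) % len(seasonal)] if seasonal else 0.0
        forecasts.append(level + damp * trend + season)
    return forecasts
```

With phi below 1 the trend contribution flattens out as the horizon grows, which is the effect of a damped trend; phi equal to 1 recovers an undamped linear trend.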
 class merlion.models.forecast.ets.ETS(config)
Bases: SeasonalityModel, ForecasterBase
Implementation of the classic local statistical model ETS (Error, Trend, Seasonal) for forecasting.
 property error
 property trend
 property damped_trend
 property seasonal
 property seasonal_periods
 set_seasonality(theta, train_data)
Implement this method to do any model-specific adjustments on the seasonality that was provided by SeasonalityLayer.
 Parameters
theta – Seasonality processed by SeasonalityLayer.
train_data (UnivariateTimeSeries) – Training data (or numpy array representing the target univariate) for any model-specific adjustments you might want to make.
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config – Additional training configs, if needed. Only required for some models.
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False, refit=True)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Return type
Union[Tuple[TimeSeries, TimeSeries], Tuple[TimeSeries, TimeSeries, TimeSeries]]
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
  forecast: the forecast for the timestamps given
  forecast_stderr: the standard error of each forecast value. May be None.
  forecast_lb: 25th percentile of forecast values for each timestamp
  forecast_ub: 75th percentile of forecast values for each timestamp
merlion.models.forecast.prophet module
Wrapper around Facebook’s popular Prophet model for time series forecasting.
 class merlion.models.forecast.prophet.ProphetConfig(max_forecast_steps=None, target_seq_index=None, yearly_seasonality='auto', weekly_seasonality='auto', daily_seasonality='auto', seasonality_mode='additive', holidays=None, uncertainty_samples=100, transform=None, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)
Bases: ForecasterConfig
Configuration class for Facebook’s Prophet model, as described by Taylor & Letham, 2017.
Base class of the object used to configure an anomaly detection model.
 Parameters
max_forecast_steps (Optional[int]) – Max # of steps we would like to forecast for.
target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
yearly_seasonality (Union[bool, int]) – If bool, whether to enable yearly seasonality. By default, it is activated if there are >= 2 years of history, but deactivated otherwise. If int, this is the number of Fourier series components used to model the seasonality (default = 10).
weekly_seasonality (Union[bool, int]) – If bool, whether to enable weekly seasonality. By default, it is activated if there are >= 2 weeks of history, but deactivated otherwise. If int, this is the number of Fourier series components used to model the seasonality (default = 3).
daily_seasonality (Union[bool, int]) – If bool, whether to enable daily seasonality. By default, it is activated if there are >= 2 days of history, but deactivated otherwise. If int, this is the number of Fourier series components used to model the seasonality (default = 4).
seasonality_mode – ‘additive’ (default) or ‘multiplicative’.
holidays – pd.DataFrame with columns holiday (string) and ds (date type) and optionally columns lower_window and upper_window which specify a range of days around the date to be included as holidays. lower_window=-2 will include 2 days prior to the date as holidays. Also optionally can have a column prior_scale specifying the prior scale for that holiday.
uncertainty_samples (int) – The number of posterior samples to draw in order to calibrate the anomaly scores.
transform – Transformation to pre-process input time series.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
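When one of the seasonality parameters above is an int, it sets the number of Fourier series components. In the formulation of Taylor & Letham, a seasonality with a given period and N components is modeled with the features sin(2πnt/period) and cos(2πnt/period) for n = 1, ..., N. The sketch below builds those features for a single time value; the function name is hypothetical and this is not Prophet's internal code.

```python
import math
from typing import List


def fourier_features(t: float, period: float, order: int) -> List[float]:
    """Return [sin_1, cos_1, ..., sin_N, cos_N] for time t, with N = order."""
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        features.extend([math.sin(angle), math.cos(angle)])
    return features
```

A larger number of components gives 2N features per time step, so the seasonal curve can change more quickly within one cycle, at the cost of a higher risk of overfitting.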
 class merlion.models.forecast.prophet.Prophet(config)
Bases: SeasonalityModel, ForecasterBase
Facebook’s model for time series forecasting. See docs for ProphetConfig and Taylor & Letham, 2017 for more details.
 config_class
alias of ProphetConfig
 property yearly_seasonality
 property weekly_seasonality
 property daily_seasonality
 property add_seasonality
 property seasonality_mode
 property holidays
 property uncertainty_samples
 set_seasonality(theta, train_data)
Implement this method to do any model-specific adjustments on the seasonality that was provided by SeasonalityLayer.
 Parameters
theta – Seasonality processed by SeasonalityLayer.
train_data (UnivariateTimeSeries) – Training data (or numpy array representing the target univariate) for any model-specific adjustments you might want to make.
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config – Additional training configs, if needed. Only required for some models.
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Return type
Union[Tuple[TimeSeries, TimeSeries], Tuple[TimeSeries, TimeSeries, TimeSeries]]
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
  forecast: the forecast for the timestamps given
  forecast_stderr: the standard error of each forecast value. May be None.
  forecast_lb: 25th percentile of forecast values for each timestamp
  forecast_ub: 75th percentile of forecast values for each timestamp
merlion.models.forecast.smoother module
Multi-Scale Exponential Smoother for univariate time series forecasting.
 class merlion.models.forecast.smoother.MSESConfig(max_forecast_steps, max_backstep=None, recency_weight=0.5, accel_weight=1.0, optimize_acc=True, eta=0.0, rho=0.0, phi=2.0, inflation=1.0, target_seq_index=None, transform=None, **kwargs)
Bases: ForecasterConfig
Configuration class for an MSES forecasting model.
Let w be the recency weight, B the maximum backstep, x_t the last seen data point, and l_{s,t} the series of losses for scale s. Then:
\[\begin{aligned}
\hat{x}_{t+h} &= \sum_{b=0}^{B} p_b \cdot (x_{t-b} + v_{b+h,t} + a_{b+h,t}) \\
\text{where} \quad v_{b+h,t} &= \text{EMA}_w(\Delta_{b+h} x_t) \\
a_{b+h,t} &= \text{EMA}_w(\Delta_{b+h}^2 x_t) \\
\text{and} \quad p_b &= \sigma(z)_b \\
\text{if} \quad z_b &= (b+h)^\phi \cdot \text{EMA}_w(l_{b+h,t}) \cdot \text{RWSE}_w(l_{b+h,t})
\end{aligned}\]
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
max_backstep (Optional[int]) – Max backstep to use in forecasting. If we train with x(0), …, x(t), then the b-th model MSES uses will forecast x(t+h) by anchoring at x(t-b) and predicting xhat(t+h) = x(t-b) + delta_hat(b+h).
recency_weight (float) – The recency weight parameter to use when estimating delta_hat.
accel_weight (float) – The weight to scale the acceleration by when computing delta_hat. Specifically, delta_hat(b+h) = velocity(b+h) + accel_weight * acceleration(b+h).
optimize_acc (bool) – If True, the acceleration correction will only be used at scales ranging from 1, …, (max_backstep + max_forecast_steps) / 2.
eta (float) – The parameter used to control the rate at which recency_weight gets tuned when online updates are made to the model and losses can be computed.
rho (float) – The parameter that determines what fraction of the overall error is due to velocity error, while the rest is due to the complement. The error at any scale will be determined as rho * velocity_error + (1 - rho) * loss_error.
phi (float) – The parameter used to exponentially inflate the magnitude of loss error at different scales. Loss error for scale s will be increased by a factor of phi ** s.
inflation (float) – The inflation exponent to use when computing the distribution p(b|h) over the models when forecasting at horizon h according to standard errors of the estimated velocities over the models; inflation=1 is equivalent to using the softmax function.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to pre-process input time series.
 property max_scale
 property backsteps
 class merlion.models.forecast.smoother.MSESTrainConfig(incremental=True, process_losses=True, tune_recency_weights=False, init_batch_sz=2, train_cadence=None)
Bases: object
MSES training configuration.
 Parameters
incremental (bool) – If True, train the MSES model incrementally with the initial training data at the given train_cadence. This allows MSES to return a forecast for the training data.
process_losses – If True, track the losses encountered during incremental initial training.
tune_recency_weights – If True, tune recency weights during incremental initial training.
init_batch_sz (int) – The size of the initial training batch for MSES. This is necessary because MSES cannot predict the past, but needs to start with some data. This should be very small. 2 is the minimum, and is recommended because 2 will result in the most representative train forecast.
train_cadence (Optional[int]) – The frequency at which the training forecasts will be generated during incremental training.
 class merlion.models.forecast.smoother.MSES(config)
Bases:
ForecasterBase
Multiscale Exponential Smoother (MSES) is a forecasting algorithm modeled heavily after classical mechanical concepts, namely, velocity and acceleration.
Having seen data points of a time series up to time t, MSES forecasts x(t+h) by anchoring at a value b steps back from the last known value, x(tb), and estimating the delta between x(tb) and x(t+h). The delta over these b+h timesteps, delta(b+h), also known as the delta at scale b+h, is predicted by estimating the velocity over these timesteps as well as the change in the velocity, acceleration. Specifically,
xhat(t+h) = x(tb) + velocity_hat(b+h) + acceleration_hat(b+h)
This estimation is done for each b, known as a backstep, from 0, which anchors at x(t), 1,… up to a maximum backstep configurable by the user. The algorithm then takes the seperate forecasts of x(t+h), indexed by which backstep was used, xhat_b(t+h), and determines a final forecast: p(bh) dot xhat_b, where p(bh) is a distribution over the xhat_b’s that is determined according to the lowest standard errors of the recencyweighted velocity estimates.
Letting w be the recency weight, B the maximum backstep, x_t the last seen data point, and l_{s,t} the series of losses for scale s:
\[\begin{aligned}
\hat{x}_{t+h} & = \sum_{b=0}^B p_{b} \cdot (x_{t-b} + v_{b+h,t} + a_{b+h,t}) \\
\text{where} \quad & v_{b+h,t} = \text{EMA}_w(\Delta_{b+h} x_t) \\
& a_{b+h,t} = \text{EMA}_w(\Delta_{b+h}^2 x_t) \\
\text{and} \quad & p_b = \sigma(z)_b \\
\text{if} \quad & z_b = (b+h)^\phi \cdot \text{EMA}_w(l_{b+h,t}) \cdot \text{RWSE}_w(l_{b+h,t})
\end{aligned}\]
 config_class
alias of
MSESConfig
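The per-backstep forecast described for MSES can be illustrated with a small, self-contained sketch. This is not Merlion's implementation: the helper names (`ema`, `deltas`, `backstep_forecasts`, `combine`) are hypothetical, the second-order delta is assumed to be the scale-s difference applied twice, and the scores z_b are passed in directly rather than derived from tracked losses.

```python
import math

def ema(values, w):
    """Exponential moving average; w is the recency weight in (0, 1]."""
    avg = values[0]
    for v in values[1:]:
        avg = w * v + (1 - w) * avg
    return avg

def deltas(x, scale):
    """Differences x[t] - x[t - scale] for every valid t."""
    return [x[t] - x[t - scale] for t in range(scale, len(x))]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def backstep_forecasts(x, horizon, max_backstep, w=0.5):
    """One forecast of x(t+h) per backstep b: x(t-b) + velocity + acceleration.

    Assumes len(x) > 2 * (max_backstep + horizon) so every delta list is non-empty.
    """
    out = []
    for b in range(max_backstep + 1):
        scale = b + horizon
        d1 = deltas(x, scale)    # first-order deltas at this scale
        d2 = deltas(d1, scale)   # assumed form of the second-order delta
        out.append(x[-1 - b] + ema(d1, w) + ema(d2, w))
    return out

def combine(xhats, z):
    """Final forecast: softmax(z)-weighted sum of the per-backstep forecasts."""
    return sum(p * xh for p, xh in zip(softmax(z), xhats))

# On a perfectly linear series every backstep agrees on the next value.
x = [float(i) for i in range(10)]
xhats = backstep_forecasts(x, horizon=1, max_backstep=2)
final = combine(xhats, [0.0, 0.0, 0.0])  # equal scores give uniform weights
```

With equal scores the weights are uniform; in the formula above the scores depend on the tracked losses at each scale, which this sketch leaves out.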
 property rho
 property backsteps
 property max_horizon
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config (Optional[MSESTrainConfig]) – Additional training configs, if needed. Only required for some models.
 Return type
Tuple[Optional[TimeSeries], None]
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 update(new_data, tune_recency_weights=True, train_cadence=None)
Updates the MSES model with new data that has been acquired since the model’s initial training.
 Parameters
new_data (TimeSeries) – New data that has occurred since the last training time.
tune_recency_weights (bool) – If True, the model will first forecast the values at the new_data’s timestamps, calculate the associated losses, and use these losses to make updates to the recency weight.
train_cadence – The frequency at which the training forecasts will be generated during incremental training.
 Return type
Tuple[TimeSeries, TimeSeries]
 forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr (bool) – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Return type
Tuple[TimeSeries, None]
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp
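Because the forecast is expressed in transformed units, it has to be mapped back before it can be compared with the raw data. The sketch below shows the idea with a hypothetical standardizing transform written in plain Python; `StandardizeSketch` and the stand-in forecast are illustrative assumptions and do not use Merlion's transform classes.

```python
from statistics import mean, pstdev

class StandardizeSketch:
    """Hypothetical invertible transform: z = (x - mean) / std."""

    def train(self, values):
        self.mu = mean(values)
        self.sigma = pstdev(values) or 1.0  # guard against a constant series

    def __call__(self, values):
        return [(v - self.mu) / self.sigma for v in values]

    def invert(self, values):
        return [v * self.sigma + self.mu for v in values]

raw = [10.0, 12.0, 14.0, 16.0]
transform = StandardizeSketch()
transform.train(raw)
transformed = transform(raw)

# Stand-in for a model's forecast in transformed units: repeat the last value twice.
forecast_transformed = [transformed[-1]] * 2

# Map the forecast back to the original units before reporting it.
forecast_original = transform.invert(forecast_transformed)
```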
 xhat_h(horizon)
Returns the forecasts for the input horizon at every backstep.
 Return type
List[Optional[float]]
 marginalize_xhat_h(horizon, xhat_h)
Given a list of forecasted values produced by delta estimators at different backsteps, compute a weighted average of these values. The weights are assigned based on the standard errors of the velocities, where the b’th estimate will be given more weight if its velocity has a lower standard error relative to the other estimates.
 Parameters
horizon (int) – the horizon at which we want to predict
xhat_h (List[Optional[float]]) – the forecasted values at this horizon, using each of the possible backsteps
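A minimal sketch of such a marginalization follows. The exact weighting used by MSES is not reproduced here: the sketch assumes weights given by a softmax over the negated velocity standard errors, and the function name `marginalize` and its `vel_errs` argument are hypothetical.

```python
import math
from typing import List, Optional

def marginalize(xhat_h: List[Optional[float]], vel_errs: List[float]) -> Optional[float]:
    """Weighted average of per-backstep forecasts, skipping missing (None) entries.

    Assumed weighting: softmax of the negated standard errors, so a lower
    error yields a larger weight.
    """
    pairs = [(x, e) for x, e in zip(xhat_h, vel_errs) if x is not None]
    if not pairs:
        return None
    z = [-e for _, e in pairs]
    m = max(z)
    weights = [math.exp(v - m) for v in z]
    total = sum(weights)
    return sum(w * x for w, (x, _) in zip(weights, pairs)) / total

equal = marginalize([1.0, None, 3.0], [0.0, 5.0, 0.0])  # the None entry is ignored
skewed = marginalize([0.0, 10.0], [0.0, 2.0])           # pulled toward the low-error estimate
```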
 class merlion.models.forecast.smoother.DeltaStats(scale, recency_weight)
Bases:
object
A wrapper around the statistics used to estimate deltas at a given scale.
 Parameters
scale (int) – The scale associated with the statistics
recency_weight (float) – The recency weight parameter that the incremental velocity, acceleration and standard error statistics should use.
 property lag
 update_velocity(vels)
 update_acceleration(accs)
 update_loss(losses)
 tune(losses, eta)
Tunes the recency weight according to recent forecast losses.
 Parameters
losses (List[float]) – List of recent losses.
eta (float) – Constant by which to scale the update to the recency weight. A bigger eta means more aggressive updates to the recency_weight.
 class merlion.models.forecast.smoother.DeltaEstimator(max_scale, recency_weight, accel_weight, optimize_acc, eta, phi, data=None, stats=None)
Bases:
object
Class for estimating the delta for MSES.
 Parameters
max_scale (int) – Delta Estimator can estimate delta over multiple scales, or time steps, ranging from 1, 2, …, max_scale.
recency_weight (float) – The recency weight parameter to use when estimating delta_hat.
accel_weight (float) – The weight to scale the acceleration by when computing delta_hat. Specifically, delta_hat(b+h) = velocity(b+h) + accel_weight * acceleration(b+h).
optimize_acc (bool) – If True, the acceleration correction will only be used at scales ranging from 1, …, max_scale/2.
eta (float) – The parameter used to control the rate at which recency_weight gets tuned when online updates are made to the model and losses can be computed.
data (Optional[UnivariateTimeSeries]) – The data to initialize the delta estimator with.
stats (Optional[Dict[int, DeltaStats]]) – Dictionary mapping scales to DeltaStats objects to be used for delta estimation.
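The interaction between accel_weight and optimize_acc can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: velocities and accelerations are supplied as plain dictionaries keyed by scale, the function name `delta_hat_sketch` is hypothetical, and max_scale/2 is assumed to use integer division.

```python
def delta_hat_sketch(scale, velocity, acceleration, accel_weight, max_scale, optimize_acc):
    """delta_hat(scale) = velocity(scale) + accel_weight * acceleration(scale).

    When optimize_acc is True, the acceleration term is applied only at
    scales 1..max_scale // 2 (integer division is an assumption).
    """
    acc_max_scale = max_scale // 2 if optimize_acc else max_scale
    delta = velocity[scale]
    if scale <= acc_max_scale:
        delta += accel_weight * acceleration[scale]
    return delta

velocity = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}
acceleration = {1: 0.5, 2: 0.5, 3: 0.5, 4: 0.5}

with_acc = delta_hat_sketch(2, velocity, acceleration, 2.0, max_scale=4, optimize_acc=True)
without_acc = delta_hat_sketch(3, velocity, acceleration, 2.0, max_scale=4, optimize_acc=True)
always_acc = delta_hat_sketch(3, velocity, acceleration, 2.0, max_scale=4, optimize_acc=False)
```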
 property acc_max_scale
 property max_scale
 property data
 property x
 train(new_data)
Updates the delta statistics: velocity, acceleration and velocity standard error at each scale using new data.
 Parameters
new_data (UnivariateTimeSeries) – new datapoints in the time series.
 process_losses(scale_losses, tune_recency_weights=False)
Uses recent forecast errors to improve the delta estimator. This is done by updating the recency_weight that is used by delta stats at particular scales.
 Parameters
scale_losses (Dict[int, List[float]]) – A dictionary mapping a scale to a list of forecasting errors associated with that scale.
 velocity(scale)
 Return type
float
 acceleration(scale)
 Return type
float
 vel_err(scale)
 Return type
float
 pos_err(scale)
 Return type
float
 neg_err(scale)
 Return type
float
 loss_err(scale)
 Return type
float
 delta_hat(scale)
 Return type
float
merlion.models.forecast.vector_ar module
Vector AutoRegressive model for multivariate time series forecasting.
 class merlion.models.forecast.vector_ar.VectorARConfig(max_forecast_steps, maxlags, target_seq_index=None, transform=None, **kwargs)
Bases:
ForecasterConfig
Config object for VectorAR forecaster.
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for.
maxlags (int) – Max # of lags for AR
target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to preprocess input time series.
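For readers unfamiliar with vector autoregression, the sketch below shows how a VAR model with maxlags lags rolls a forecast forward: each new vector is an intercept plus a sum of coefficient matrices applied to the most recent lagged vectors. The coefficients here are made-up illustrative values (a fitted model would estimate them), and `var_forecast` is a hypothetical helper, not part of Merlion.

```python
def var_forecast(history, coefs, intercept, steps):
    """Roll a VAR(p) model forward.

    history:   list of observation vectors, oldest first (at least p of them)
    coefs:     list of p square matrices A_1..A_p, where A_i multiplies x_{t-i}
    intercept: constant vector c
    """
    hist = [list(v) for v in history]
    out = []
    for _ in range(steps):
        nxt = list(intercept)
        for lag, matrix in enumerate(coefs, start=1):
            lagged = hist[-lag]
            for row_idx, row in enumerate(matrix):
                nxt[row_idx] += sum(a * x for a, x in zip(row, lagged))
        hist.append(nxt)  # feed the prediction back in as the newest observation
        out.append(nxt)
    return out

# Two univariates, one lag, illustrative coefficients.
forecast = var_forecast(
    history=[[1.0, 2.0]],
    coefs=[[[0.5, 0.0], [0.0, 2.0]]],
    intercept=[0.0, 0.0],
    steps=2,
)

# target_seq_index selects which univariate is reported; index 1 here.
target = [vec[1] for vec in forecast]
```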
 class merlion.models.forecast.vector_ar.VectorAR(config)
Bases:
ForecasterBase
Vector AutoRegressive model for multivariate time series forecasting.
 config_class
alias of
VectorARConfig
 property maxlags: int
 Return type
int
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config – Additional training configs, if needed. Only required for some models.
 Return type
Tuple[TimeSeries, TimeSeries]
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Return type
Union[Tuple[TimeSeries, TimeSeries], Tuple[TimeSeries, TimeSeries, TimeSeries]]
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp
 set_data_already_transformed()
 reset_data_already_transformed()
merlion.models.forecast.baggingtrees module
Bagging Tree-based models for multivariate time series forecasting.
Random Forest
ExtraTreesRegressor
 class merlion.models.forecast.baggingtrees.BaggingTreeForecasterConfig(max_forecast_steps, maxlags, target_seq_index=None, sampling_mode='normal', prediction_stride=1, n_estimators=100, random_state=None, max_depth=None, min_samples_split=2, transform=None, **kwargs)
Bases:
ForecasterConfig
Configuration class for bagging Tree-based forecaster model.
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for.
maxlags (int) – Max # of lags for forecasting
target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
sampling_mode (str) – how to process time series data for the tree model. If “normal”, then concatenate all sequences over the window. If “stats”, then give statistics measures over the window. Note: “stats” mode is a statistical summary for a multivariate dataset, mainly to reduce the computation cost for high-dimensional time series. For univariate data, it is not necessary to use “stats” instead of the sequence itself as the input. Therefore, for univariate data, the model will automatically adopt “normal” mode.
prediction_stride (int) – the prediction step for training and forecasting
If univariate: the sequence target of the length of prediction_stride will be utilized, forecasting will be done by means of autoregression with the stride unit of prediction_stride
If multivariate:
if = 1: the autoregression with the stride unit of 1
if > 1: only support sequence mode, and the model will set prediction_stride = max_forecast_steps
n_estimators (int) – number of base estimators for the tree ensemble
random_state – random seed for bagging
max_depth – max depth of base estimators
min_samples_split – min split for tree leaves
transform – Transformation to preprocess input time series.
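The difference between the two sampling modes can be sketched with a small feature builder. This is an illustration only: the choice of statistics (mean, standard deviation, min, max) and computing them per univariate are assumptions, and `window_features` is a hypothetical helper rather than a Merlion function.

```python
from statistics import mean, pstdev

def window_features(series, maxlags, end, mode="normal"):
    """Build tree-model input features from the window [end - maxlags, end).

    series: list of univariates, each a list of floats of equal length.
    """
    windows = [u[end - maxlags:end] for u in series]
    # Univariate input always uses the raw sequence, as described above.
    if mode == "normal" or len(series) == 1:
        return [v for w in windows for v in w]
    if mode == "stats":
        feats = []
        for w in windows:
            feats.extend([mean(w), pstdev(w), min(w), max(w)])  # assumed statistics
        return feats
    raise ValueError(f"unknown sampling mode: {mode}")

multi = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
normal_feats = window_features(multi, maxlags=2, end=4, mode="normal")
stats_feats = window_features(multi, maxlags=2, end=4, mode="stats")
uni_feats = window_features([multi[0]], maxlags=2, end=4, mode="stats")
```

In "normal" mode the feature count grows with maxlags times the number of univariates, whereas a summary of this kind has a fixed size per univariate regardless of window length.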
 class merlion.models.forecast.baggingtrees.BaggingTreeForecaster(config)
Bases:
ForecasterBase, MultiVariateAutoRegressionMixin
Tree model for multivariate time series forecasting.
 config_class
alias of
BaggingTreeForecasterConfig
 model = None
 property maxlags: int
 Return type
int
 property sampling_mode: str
 Return type
str
 property prediction_stride: int
 Return type
int
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config – Additional training configs, if needed. Only required for some models.
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (List[int]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp
 set_data_already_transformed()
 reset_data_already_transformed()
 class merlion.models.forecast.baggingtrees.RandomForestForecasterConfig(max_forecast_steps, maxlags, target_seq_index=None, sampling_mode='normal', prediction_stride=1, n_estimators=100, random_state=None, max_depth=None, min_samples_split=2, transform=None, **kwargs)
Bases:
BaggingTreeForecasterConfig
Config class for RandomForestForecaster.
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for.
maxlags (int) – Max # of lags for forecasting
target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
sampling_mode (str) – how to process time series data for the tree model. If “normal”, then concatenate all sequences over the window. If “stats”, then give statistics measures over the window. Note: “stats” mode is a statistical summary for a multivariate dataset, mainly to reduce the computation cost for high-dimensional time series. For univariate data, it is not necessary to use “stats” instead of the sequence itself as the input. Therefore, for univariate data, the model will automatically adopt “normal” mode.
prediction_stride (int) – the prediction step for training and forecasting
If univariate: the sequence target of the length of prediction_stride will be utilized, forecasting will be done by means of autoregression with the stride unit of prediction_stride
If multivariate:
if = 1: the autoregression with the stride unit of 1
if > 1: only support sequence mode, and the model will set prediction_stride = max_forecast_steps
n_estimators (int) – number of base estimators for the tree ensemble
random_state – random seed for bagging
max_depth – max depth of base estimators
min_samples_split – min split for tree leaves
transform – Transformation to preprocess input time series.
 class merlion.models.forecast.baggingtrees.RandomForestForecaster(config)
Bases:
BaggingTreeForecaster
Random Forest Regressor for time series forecasting
Random Forest is a meta estimator that fits a number of decision trees on various subsamples of the dataset, and uses averaging to improve the predictive accuracy and control overfitting.
 config_class
alias of
RandomForestForecasterConfig
 class merlion.models.forecast.baggingtrees.ExtraTreesForecasterConfig(max_forecast_steps, maxlags, target_seq_index=None, sampling_mode='normal', prediction_stride=1, n_estimators=100, random_state=None, max_depth=None, min_samples_split=2, transform=None, **kwargs)
Bases:
BaggingTreeForecasterConfig
Config class for ExtraTreesForecaster.
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for.
maxlags (int) – Max # of lags for forecasting
target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
sampling_mode (str) – how to process time series data for the tree model. If “normal”, then concatenate all sequences over the window. If “stats”, then give statistics measures over the window. Note: “stats” mode is a statistical summary for a multivariate dataset, mainly to reduce the computation cost for high-dimensional time series. For univariate data, it is not necessary to use “stats” instead of the sequence itself as the input. Therefore, for univariate data, the model will automatically adopt “normal” mode.
prediction_stride (int) – the prediction step for training and forecasting
If univariate: the sequence target of the length of prediction_stride will be utilized, forecasting will be done by means of autoregression with the stride unit of prediction_stride
If multivariate:
if = 1: the autoregression with the stride unit of 1
if > 1: only support sequence mode, and the model will set prediction_stride = max_forecast_steps
n_estimators (int) – number of base estimators for the tree ensemble
random_state – random seed for bagging
max_depth – max depth of base estimators
min_samples_split – min split for tree leaves
transform – Transformation to preprocess input time series.
 class merlion.models.forecast.baggingtrees.ExtraTreesForecaster(config)
Bases:
BaggingTreeForecaster
Extra Trees Regressor for time series forecasting
Extra Trees Regressor implements a meta estimator that fits a number of randomized decision trees (a.k.a. extra-trees) on various subsamples of the dataset and uses averaging to improve the predictive accuracy and control overfitting.
 config_class
alias of
ExtraTreesForecasterConfig
merlion.models.forecast.boostingtrees module
Boosting Tree-based models for multivariate time series forecasting.
LightGBM
 class merlion.models.forecast.boostingtrees.BoostingTreeForecasterConfig(max_forecast_steps, maxlags, target_seq_index=None, sampling_mode='normal', prediction_stride=1, learning_rate=0.1, n_estimators=100, random_state=None, max_depth=None, n_jobs=-1, transform=None, **kwargs)
Bases:
ForecasterConfig
Configuration class for boosting Tree-based forecaster model.
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for.
maxlags (int) – Max # of lags for forecasting
target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
sampling_mode (str) – how to process time series data for the tree model. If “normal”, then concatenate all sequences over the window. If “stats”, then give statistics measures over the window. Note: “stats” mode is a statistical summary for a multivariate dataset, mainly to reduce the computation cost for high-dimensional time series. For univariate data, it is not necessary to use “stats” instead of the sequence itself as the input. Therefore, for univariate data, the model will automatically adopt “normal” mode.
prediction_stride (int) – the prediction step for training and forecasting
If univariate: the sequence target of the length of prediction_stride will be utilized, forecasting will be done by means of autoregression with the stride unit of prediction_stride
If multivariate:
if = 1: the autoregression with the stride unit of 1
if > 1: only support sequence mode, and the model will set prediction_stride = max_forecast_steps
learning_rate – learning rate for boosting
n_estimators – number of base estimators for the tree ensemble
random_state – random seed for boosting
max_depth – max depth of base estimators
n_jobs – number of threads; -1 or 0 indicates the device default, a positive int indicates the number of threads
transform – Transformation to preprocess input time series.
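The autoregression with a stride unit of prediction_stride mentioned above can be sketched as a loop that predicts a block of values, appends it to the history, and repeats until enough steps are produced. The block predictor here is a hypothetical linear extrapolation standing in for a trained tree ensemble; `autoregressive_forecast` and `linear_block` are illustrative names.

```python
def autoregressive_forecast(history, predict_block, maxlags, stride, steps):
    """Forecast `steps` values, `stride` values at a time.

    predict_block(window) must return the next `stride` values given the
    most recent `maxlags` observations.
    """
    hist = list(history)
    out = []
    while len(out) < steps:
        block = predict_block(hist[-maxlags:])
        if len(block) != stride:
            raise ValueError("predictor must return exactly `stride` values")
        hist.extend(block)  # predictions become inputs for the next block
        out.extend(block)
    return out[:steps]      # drop any overshoot from the final block

def linear_block(window, stride):
    """Hypothetical stand-in model: continue the last observed slope."""
    step = window[-1] - window[-2]
    return [window[-1] + step * (k + 1) for k in range(stride)]

forecast = autoregressive_forecast(
    [1.0, 2.0, 3.0], lambda w: linear_block(w, 2), maxlags=2, stride=2, steps=5
)
```

A larger stride means fewer model calls and less accumulation of fed-back predictions, at the cost of asking the model to predict further ahead in one shot.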
 class merlion.models.forecast.boostingtrees.BoostingTreeForecaster(config)
Bases:
ForecasterBase, MultiVariateAutoRegressionMixin
Tree model for multivariate time series forecasting.
 config_class
alias of
BoostingTreeForecasterConfig
 model = None
 property maxlags: int
 Return type
int
 property sampling_mode: str
 Return type
str
 property prediction_stride: int
 Return type
int
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config – Additional training configs, if needed. Only required for some models.
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (List[int]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp
 set_data_already_transformed()
 reset_data_already_transformed()
 class merlion.models.forecast.boostingtrees.LGBMForecasterConfig(max_forecast_steps, maxlags, target_seq_index=None, sampling_mode='normal', prediction_stride=1, learning_rate=0.1, n_estimators=100, random_state=None, max_depth=None, n_jobs=-1, transform=None, **kwargs)
Bases:
BoostingTreeForecasterConfig
Config class for LGBMForecaster.
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for.
maxlags (int) – Max # of lags for forecasting
target_seq_index (Optional[int]) – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
sampling_mode (str) – how to process time series data for the tree model. If “normal”, then concatenate all sequences over the window. If “stats”, then give statistics measures over the window. Note: “stats” mode is a statistical summary for a multivariate dataset, mainly to reduce the computation cost for high-dimensional time series. For univariate data, it is not necessary to use “stats” instead of the sequence itself as the input. Therefore, for univariate data, the model will automatically adopt “normal” mode.
prediction_stride (int) – the prediction step for training and forecasting
If univariate: the sequence target of the length of prediction_stride will be utilized, forecasting will be done by means of autoregression with the stride unit of prediction_stride
If multivariate:
if = 1: the autoregression with the stride unit of 1
if > 1: only support sequence mode, and the model will set prediction_stride = max_forecast_steps
learning_rate – learning rate for boosting
n_estimators – number of base estimators for the tree ensemble
random_state – random seed for boosting
max_depth – max depth of base estimators
n_jobs – number of threads; -1 or 0 indicates the device default, a positive int indicates the number of threads
transform – Transformation to preprocess input time series.
 class merlion.models.forecast.boostingtrees.LGBMForecaster(config)
Bases:
BoostingTreeForecaster
Light gradient boosting (LGBM) regressor for time series forecasting
LightGBM is a lightweight and fast gradient boosting framework that uses tree-based learning algorithms. For more details, please refer to https://lightgbm.readthedocs.io/en/latest/Features.html
 config_class
alias of
LGBMForecasterConfig
merlion.models.forecast.lstm module
A forecaster based on an LSTM neural net.
 class merlion.models.forecast.lstm.LSTMConfig(max_forecast_steps, nhid=1024, model_strides=(1,), target_seq_index=None, transform=None, max_score=1000, threshold=None, enable_calibrator=True, enable_threshold=True, **kwargs)
Bases:
ForecasterConfig
Configuration class for LSTM. Base class of the object used to configure an anomaly detection model.
 Parameters
max_forecast_steps (int) – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
nhid – hidden dimension of LSTM
model_strides – tuple indicating the stride(s) at which we would like to subsample the input data before giving it to the model.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to preprocess input time series.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_calibrator – whether to enable a calibrator which automatically transforms all raw anomaly scores to be z-scores (i.e. distributed as N(0, 1)).
enable_threshold – whether to enable the thresholding rule when postprocessing anomaly scores
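To make the model_strides parameter concrete, the sketch below subsamples one sequence at several strides. How the LSTM consumes these views is not described here, so this only illustrates the subsampling step; aligning every view to end at the most recent point is an assumption, and `subsample` is a hypothetical helper.

```python
def subsample(values, strides=(1,)):
    """Return one subsampled view of `values` per stride.

    Each view is aligned so that it ends at the most recent observation
    (an assumption made for this sketch).
    """
    last = len(values) - 1
    return {s: values[last % s::s] for s in strides}

views = subsample(list(range(10)), strides=(1, 3))
short = subsample(list(range(8)), strides=(3,))
```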
 class merlion.models.forecast.lstm.LSTMTrainConfig(lr=1e-05, batch_size=128, epochs=128, seq_len=256, data_stride=1, valid_split=0.2, checkpoint_file='checkpoint.pt')
Bases:
object
LSTM training configuration.
 class merlion.models.forecast.lstm.LSTM(config)
Bases:
ForecasterBase
LSTM forecaster: this assumes the input time series has equal intervals across all its values, so that we can use sequence modeling to make forecasts.
 config_class
alias of
LSTMConfig
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config (Optional[LSTMTrainConfig]) – Additional training configs, if needed. Only required for some models.
 Return type
Tuple[TimeSeries, None]
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Return type
Tuple[TimeSeries, None]
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp