merlion.models.ensemble package
Ensembles of models and automated model selection.
Base class for ensembles of models. 

Rules for combining the outputs of multiple time series models. 

Ensembles of anomaly detectors. 

Ensembles of forecasters. 

Mixture of Expert forecasters. 
Submodules
merlion.models.ensemble.base module
Base class for ensembles of models.
 class merlion.models.ensemble.base.EnsembleConfig(models=None, combiner=None, transform=None, **kwargs)
Bases:
Config
An ensemble config contains each individual model in the ensemble, as well as the Combiner object used to combine those models’ outputs. The rationale behind placing the model objects in the EnsembleConfig (rather than in the Ensemble itself) is discussed in more detail in the documentation for LayeredModel.
 Parameters
models (Optional[List[Union[ModelBase, Dict]]]) – A list of models or dicts representing them.
combiner (Optional[CombinerBase]) – The CombinerBase object to combine the outputs of the models in the ensemble.
transform – Transformation to preprocess input time series.
kwargs – Any additional kwargs for Config
 to_dict(_skipped_keys=None)
 Returns
dict with keyword arguments used to initialize the config class.
 class merlion.models.ensemble.base.EnsembleTrainConfig(valid_frac, per_model_train_configs=None)
Bases:
object
Config object describing how to train an ensemble.
 Parameters
valid_frac – fraction of training data to use for validation.
per_model_train_configs – list of train configs to use for individual models, one per model. None means that you use the default for all models. Specifying None for an individual model means that you use the default for that model.
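The two parameters above can be illustrated with a minimal sketch in plain Python. It is not Merlion's implementation: holding out the tail of the series for validation, the helper names `split_train_valid` and `resolve_train_configs`, and the `"epochs"` config key are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def split_train_valid(values: Sequence[float], valid_frac: float) -> Tuple[list, list]:
    """Hold out the last `valid_frac` of the points for validation (assumed chronological split)."""
    if not 0.0 <= valid_frac < 1.0:
        raise ValueError("valid_frac must be in [0, 1)")
    n_valid = int(round(len(values) * valid_frac))
    n_train = len(values) - n_valid
    return list(values[:n_train]), list(values[n_train:])


def resolve_train_configs(
    per_model_train_configs: Optional[List[Optional[dict]]],
    n_models: int,
    default: dict,
) -> List[dict]:
    """None for the whole list, or for one entry, falls back to the default config."""
    if per_model_train_configs is None:
        return [dict(default) for _ in range(n_models)]
    if len(per_model_train_configs) != n_models:
        raise ValueError("expected one train config per model")
    return [dict(default) if cfg is None else cfg for cfg in per_model_train_configs]


train, valid = split_train_valid(list(range(10)), valid_frac=0.2)
# "epochs" is a hypothetical config key used only for this example.
configs = resolve_train_configs([None, {"epochs": 5}], n_models=2, default={"epochs": 1})
```

Here the first model falls back to the default config because its entry is None, while the second keeps its own.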
 class merlion.models.ensemble.base.EnsembleBase(config=None, models=None)
Bases:
ModelBase
An abstract class representing an ensemble of multiple models.
 Parameters
config (Optional[EnsembleConfig]) – The ensemble’s config
models (Optional[List[ModelBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.models.
 config_class
alias of
EnsembleConfig
 property models
 property combiner: CombinerBase
 Return type
CombinerBase
 Returns
the object used to combine model outputs.
 reset()
Resets the model’s internal state.
 property models_used
 train_valid_split(transformed_train_data, train_config)
 Return type
Tuple[TimeSeries, TimeSeries]
 get_max_common_horizon()
 truncate_valid_data(transformed_valid_data)
 train_combiner(all_model_outs, target)
 Return type
 save(dirname, save_only_used_models=False, **save_config)
Saves the ensemble of models.
 Parameters
dirname (str) – directory to save the ensemble to
save_only_used_models – whether to save only the models that are actually used by the ensemble.
save_config – additional save config arguments
 to_bytes(save_only_used_models=False, **save_config)
Converts the entire model state and configuration to a single byte object.
 Parameters
save_only_used_models – whether to save only the models that are actually used by the ensemble.
save_config – additional configurations (if needed)
merlion.models.ensemble.combine module
Rules for combining the outputs of multiple time series models.
 class merlion.models.ensemble.combine.CombinerBase(abs_score=False)
Bases:
object
Abstract base class for combining the outputs of multiple models. Subclasses should implement the abstract method _combine_univariates. All combiners are callable objects.
 __call__(all_model_outs, target, _check_dim=True)
Applies the model combination rule to combine multiple model outputs.
 Parameters
all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (TimeSeries) – a target time series (e.g. labels)
 Return type
TimeSeries
 Returns
a single time series of combined model outputs on this training data.
 Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
 property requires_training
 to_dict(_skipped_keys=None)
 classmethod from_dict(state)
 property models_used: List[bool]
 Return type
List[bool]
 Returns
which models are actually used to make predictions.
 train(all_model_outs, target=None)
Trains the model combination rule.
 Parameters
all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (Optional[TimeSeries]) – a target time series (e.g. labels)
 Return type
TimeSeries
 Returns
a single time series of combined model outputs on this training data.
 class merlion.models.ensemble.combine.Mean(abs_score=False)
Bases:
CombinerBase
Combines multiple models by taking their mean prediction.
 Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
 property weights: ndarray
 Return type
ndarray
 class merlion.models.ensemble.combine.Median(abs_score=False)
Bases:
CombinerBase
Combines multiple models by taking their median prediction.
 Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
 class merlion.models.ensemble.combine.Max(abs_score=False)
Bases:
CombinerBase
Combines multiple models by taking their max prediction.
 Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
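The three rules above (Mean, Median, Max) can be sketched in plain Python as a per-timestamp reduction across model outputs. This is an illustrative simplification, not the library's code: the `combine` helper and the `RULES` table are hypothetical names, model outputs are plain lists rather than TimeSeries objects, and `abs_score` is modelled simply as taking absolute values before reducing, which may differ from how Merlion handles signs.

```python
from statistics import mean, median
from typing import Callable, Dict, List, Sequence

# Hypothetical lookup table mapping a rule name to its reduction function.
RULES: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": mean,
    "median": median,
    "max": max,
}


def combine(all_model_outs: List[List[float]], rule: str = "mean", abs_score: bool = False) -> List[float]:
    """Reduce aligned model outputs to one value per timestamp."""
    reduce_fn = RULES[rule]
    combined = []
    for values in zip(*all_model_outs):  # one tuple of model outputs per timestamp
        if abs_score:
            values = tuple(abs(v) for v in values)
        combined.append(reduce_fn(values))
    return combined


outs = [[1.0, -4.0], [2.0, 1.0], [6.0, 0.0]]  # three models, two timestamps
mean_out = combine(outs, "mean")                    # [3.0, -1.0]
median_out = combine(outs, "median")                # [2.0, 0.0]
max_abs_out = combine(outs, "max", abs_score=True)  # [6.0, 4.0]
```

With `abs_score=True`, the large negative score at the second timestamp dominates the max, which is the behaviour one usually wants for anomaly scores.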
 class merlion.models.ensemble.combine.ModelSelector(metric, abs_score=False)
Bases:
Mean
Takes the mean of the best models, where the models are ranked according to the value of an evaluation metric.
 Parameters
metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
 property invert
 property requires_training
 to_dict(_skipped_keys=None)
 classmethod from_dict(state)
 property models_used: List[bool]
 Return type
List[bool]
 Returns
which models are actually used to make predictions.
 train(all_model_outs, target=None, **kwargs)
Trains the model combination rule.
 Parameters
all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (Optional[TimeSeries]) – a target time series (e.g. labels)
 Return type
TimeSeries
 Returns
a single time series of combined model outputs on this training data.
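As a rough illustration of the selection rule described for ModelSelector, the sketch below ranks models by a metric value, keeps every model that ties for the best value, and averages their outputs. The helpers `select_best` and `mean_of_selected` are hypothetical, outputs are plain lists, and treating `invert=True` as "lower is better" is an assumption, since the reference above does not document the `invert` property.

```python
from typing import List


def select_best(metric_values: List[float], invert: bool = False) -> List[bool]:
    """Flag every model that ties for the best metric value.

    invert=True is assumed here to mean that lower metric values are better.
    """
    best = min(metric_values) if invert else max(metric_values)
    return [v == best for v in metric_values]


def mean_of_selected(all_model_outs: List[List[float]], models_used: List[bool]) -> List[float]:
    """Average only the outputs of the models flagged as used."""
    used = [out for out, keep in zip(all_model_outs, models_used) if keep]
    return [sum(vals) / len(vals) for vals in zip(*used)]


outs = [[1.0, 2.0], [3.0, 4.0], [5.0, 8.0]]     # three models, two timestamps
rmse = [0.5, 0.2, 0.2]                          # made-up validation errors
models_used = select_best(rmse, invert=True)    # [False, True, True]
combined = mean_of_selected(outs, models_used)  # [4.0, 6.0]
```

The boolean mask plays the same role as the `models_used` property: it records which models contribute to the final prediction.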
 class merlion.models.ensemble.combine.MetricWeightedMean(metric, abs_score=False)
Bases:
ModelSelector
Computes a weighted average of model outputs with weights proportional to the metric values (or their inverses).
 Parameters
metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
 property models_used: List[bool]
 Return type
List[bool]
 Returns
which models are actually used to make predictions.
 property weights: ndarray
 Return type
ndarray
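A weighted average with weights proportional to metric values (or their inverses) can be sketched as follows. This is only one plausible reading of the description of MetricWeightedMean; the helper names and the normalization of the weights so that they sum to 1 are assumptions.

```python
from typing import List


def metric_weights(metric_values: List[float], invert: bool = False) -> List[float]:
    """Weights proportional to the metric values, or to their inverses if invert=True.

    With invert=True a metric value of exactly 0 would divide by zero; a real
    implementation would need to handle that case.
    """
    raw = [1.0 / v for v in metric_values] if invert else list(metric_values)
    total = sum(raw)
    return [w / total for w in raw]


def weighted_mean(all_model_outs: List[List[float]], weights: List[float]) -> List[float]:
    """Per-timestamp weighted average of aligned model outputs."""
    return [sum(w * v for w, v in zip(weights, vals)) for vals in zip(*all_model_outs)]


weights = metric_weights([1.0, 3.0])                         # [0.25, 0.75]
inv_weights = metric_weights([1.0, 4.0], invert=True)        # approximately [0.8, 0.2]
combined = weighted_mean([[0.0, 4.0], [4.0, 8.0]], weights)  # [3.0, 7.0]
```

Inverse weighting suits error metrics, where a smaller value should earn a larger weight.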
merlion.models.ensemble.anomaly module
Ensembles of anomaly detectors.
 class merlion.models.ensemble.anomaly.DetectorEnsembleConfig(enable_calibrator=False, max_score: float = 1000, threshold=None, enable_threshold=True, transform: TransformBase = None, normalize: Rescale = None, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, **kwargs)
Bases:
DetectorConfig, EnsembleConfig
Config class for an ensemble of anomaly detectors.
Base class of the object used to configure an anomaly detection model.
 Parameters
enable_calibrator – Whether to enable calibration of the ensemble anomaly score. False by default.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_threshold – whether to enable the thresholding rule when postprocessing anomaly scores
transform – Transformation to preprocess input time series.
normalize – Pretrained normalization transformation (optional).
models – A list of models or dicts representing them.
combiner – The CombinerBase object to combine the outputs of the models in the ensemble.
kwargs – Any additional kwargs for EnsembleConfig or DetectorConfig
 property per_model_threshold
 Returns
whether to apply the thresholding rules of each individual model, before combining their outputs. Only done if doing model selection.
 class merlion.models.ensemble.anomaly.DetectorEnsemble(config=None, models=None)
Bases:
EnsembleBase, DetectorBase
Class representing an ensemble of multiple anomaly detection models.
 Parameters
config (Optional[DetectorEnsembleConfig]) – model configuration
 config_class
alias of
DetectorEnsembleConfig
 property per_model_threshold
 Returns
whether to apply the threshold rule of each individual model before aggregating their anomaly scores.
 train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None, per_model_post_rule_train_configs=None)
Trains each anomaly detector in the ensemble unsupervised, and each of their post-rules supervised (if labels are given).
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config (Optional[EnsembleTrainConfig]) – config for ensemble training. Not recommended.
post_rule_train_config – the post-rule train config to use for the ensemble-level post-rule.
per_model_post_rule_train_configs – the post-rule train configs to use for each of the individual models. Must be equal in length to the number of models, if given.
 Return type
TimeSeries
 Returns
A TimeSeries of the ensemble’s anomaly scores on the training data.
 get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
 Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
 Return type
TimeSeries
 Returns
a univariate TimeSeries of anomaly scores
merlion.models.ensemble.forecast module
Ensembles of forecasters.
 class merlion.models.ensemble.forecast.ForecasterEnsembleConfig(max_forecast_steps=None, verbose=False, target_seq_index: int = None, transform: TransformBase = None, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, **kwargs)
Bases:
ForecasterConfig, EnsembleConfig
Config class for an ensemble of forecasters.
 Parameters
max_forecast_steps – Maximum number of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to preprocess input time series.
models – A list of models or dicts representing them.
combiner – The CombinerBase object to combine the outputs of the models in the ensemble.
kwargs – Any additional kwargs for Config
 class merlion.models.ensemble.forecast.ForecasterEnsemble(config=None, models=None)
Bases:
EnsembleBase, ForecasterBase
Class representing an ensemble of multiple forecasting models.
 Parameters
config (Optional[ForecasterEnsembleConfig]) – The ensemble’s config
models (Optional[List[ForecasterBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.models.
 config_class
alias of
ForecasterEnsembleConfig
 train_pre_process(train_data, require_even_sampling, require_univariate)
Applies preprocessing steps common for training most models.
 Parameters
train_data (TimeSeries) – the original time series of training data
require_even_sampling (bool) – whether the model assumes that training data is sampled at a fixed frequency
require_univariate (bool) – whether the model only works with univariate time series
 Return type
TimeSeries
 Returns
the training data, after any necessary preprocessing has been applied
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config (Optional[EnsembleTrainConfig]) – Additional training configs, if needed. Only required for some models.
 Return type
Tuple[Optional[TimeSeries], None]
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr (bool) – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Return type
Union[Tuple[TimeSeries, Optional[TimeSeries]], Tuple[TimeSeries, TimeSeries, TimeSeries]]
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp
merlion.models.ensemble.MoE_forecast module
Mixture of Expert forecasters.
 class merlion.models.ensemble.MoE_forecast.myDataset(data, lookback, forecast=1, target_seq_index=0, include_ts=False)
Bases:
Dataset
Creates a pytorch dataset.
 Parameters
data – TimeSeries object
lookback – number of past time steps to look back at in order to forecast
forecast – number of steps to forecast in the future
target_seq_index – dimension of the timeseries that will be forecasted
include_ts – Bool. If True, __getitem__ also returns a TimeSeries version of the data, excludes it otherwise
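The `lookback` and `forecast` parameters describe a sliding window over the series. The sketch below shows that windowing on a plain Python list; it is not the actual `myDataset` code. The `make_windows` helper is hypothetical, the input is a univariate list standing in for the `target_seq_index` dimension of a TimeSeries, and no tensors are involved.

```python
from typing import List, Tuple


def make_windows(series: List[float], lookback: int, forecast: int = 1) -> List[Tuple[List[float], List[float]]]:
    """Pair each `lookback`-long input window with the `forecast` values that follow it."""
    n_samples = len(series) - lookback - forecast + 1
    return [
        (series[i : i + lookback], series[i + lookback : i + lookback + forecast])
        for i in range(max(n_samples, 0))
    ]


windows = make_windows([0.0, 1.0, 2.0, 3.0, 4.0], lookback=3, forecast=1)
# [([0.0, 1.0, 2.0], [3.0]), ([1.0, 2.0, 3.0], [4.0])]
```

A series shorter than `lookback + forecast` yields no samples, which is why the sample count is clamped at zero.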
 class merlion.models.ensemble.MoE_forecast.MoE_ForecasterEnsembleConfig(batch_size=128, lr=0.0001, warmup_steps=100, epoch_max=100, nfree_experts=0, lookback_len=10, max_forecast_steps=3, target_seq_index=0, use_gpu=True, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, transform: TransformBase = None, normalize: Rescale = None, **kwargs)
Bases:
EnsembleConfig, ForecasterConfig, NormalizingConfig
Config class for MoE (mixture of experts) forecaster.
 Parameters
batch_size – batch size used for training. Needed because MoE uses gradient-descent-based learning, and training happens over multiple epochs.
lr – learning rate of the Adam optimizer used in MoE training
warmup_steps – number of iterations used to reach lr
epoch_max – number of epochs to train the MoE model
nfree_experts – number of free expert forecast values that are trained using gradient descent
lookback_len – number of past time steps to look at in order to make future forecasts
max_forecast_steps – number of future steps to forecast
target_seq_index – index of time series to forecast. Integer value.
use_gpu – Bool. Use True if GPU available for faster speed.
models – A list of models or dicts representing them.
combiner – The CombinerBase object to combine the outputs of the models in the ensemble.
transform – Transformation to preprocess input time series.
normalize – Pretrained normalization transformation (optional).
kwargs – Any additional kwargs for Config
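The `warmup_steps` parameter is described only as the number of iterations used to reach `lr`. One common way to do this is a linear ramp, sketched below. Whether the MoE training loop uses a linear schedule is an assumption, and `warmup_lr` is a hypothetical helper, not part of the library.

```python
def warmup_lr(step: int, lr: float = 1e-4, warmup_steps: int = 100) -> float:
    """Linearly ramp the learning rate up to `lr` over `warmup_steps` iterations (assumed schedule)."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return lr
    return lr * (step + 1) / warmup_steps


# Small values chosen for readability; the rate rises for 4 steps, then stays at 0.1.
schedule = [warmup_lr(step, lr=0.1, warmup_steps=4) for step in range(6)]
```

After the warmup phase the sketch simply holds the rate constant; any decay applied afterwards is outside what the parameter description states.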
 class merlion.models.ensemble.MoE_forecast.MoE_ForecasterEnsemble(config=None, models=None, moe_model=None)
Bases:
EnsembleBase, ForecasterBase
Model-based mixture of experts for forecasting.
The main class functions useful for users are:
train: used for training the MoE model (includes training the external experts and training the MoE model parameters)
finetune: assuming train() has been called once, finetune can be called if the user wants to train the MoE model parameters again (e.g. with different optimization hyperparameters).
forecast: given a time series, returns the forecast values and standard error as a tuple of TimeSeries objects
batch_forecast: same as forecast, but can operate on a batch of input data and outputs a list of TimeSeries objects
_forecast: given a time series, returns the forecast values and confidence of all experts
_batch_forecast: same as _forecast, but can operate on a batch of input data
expert_prediction: this function operates on the output of _batch_forecast to compute a single forecast value per input by combining expert predictions using the user-specified strategy (see the expert_prediction function for details)
evaluate: mainly for development purposes. This function performs sMAPE evaluation for a given time series.
 Parameters
models (Optional[List[ForecasterBase]]) – list of external expert models (e.g. Sarima, Arima). Can be an empty list if nfree_experts > 0 is specified.
moe_model – pytorch model that takes a torch.tensor input of size (B x lookback_len x input_dim) and outputs a tuple of 2 variables. The first variable is the logit (pre-softmax) of size (B x nexperts x max_forecast_steps). The second variable is None if nfree_experts=0; otherwise it has size (nfree_experts x max_forecast_steps) and holds the values forecasted by the nfree_experts free experts.
 config_class
alias of
MoE_ForecasterEnsembleConfig
 property moe_model
 property nexperts
 property batch_size: int
 Return type
int
 property lr: int
 Return type
int
 property warmup_steps: int
 Return type
int
 property epoch_max: int
 Return type
int
 property nfree_experts: int
 Return type
int
 property use_gpu: int
 Return type
int
 property lookback_len: int
 Return type
int
 train(train_data, train_config=None)
Trains the forecaster on the input time series.
 Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config (Optional[EnsembleTrainConfig]) – Additional training configs, if needed. Only required for some models.
 Return type
Tuple[Optional[TimeSeries], Optional[TimeSeries]]
 Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
 finetune(train_data, train_config=None)
This function expects the external experts to be already trained. This function extracts the predictions of external experts (if any) and stores them. It then uses them along with the training data to train the MoE model to perform expert selection and forecasting. This function is called internally by the train function.
 forecast(time_stamps, time_series_prev=None, apply_transform=True, return_iqr=False, return_prev=False, expert_idx=None, mode='max', use_gpu=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
 Parameters
time_stamps (List[int]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr (bool) – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
 Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp
 batch_forecast(time_stamps_list, time_series_prev_list, return_iqr=False, return_prev=False, apply_transform=True, expert_idx=None, mode='max', use_gpu=False)
Returns the ensemble’s forecast on a batch of timestamps given. Note that inverse transforms are applied to the forecasts returned by this function.
 Parameters
time_stamps_list (List[List[int]]) – a list of lists of timestamps we wish to forecast for
time_series_prev_list (List[TimeSeries]) – a list of TimeSeries immediately preceding the time stamps in time_stamps_list
return_iqr (bool) – whether to return the interquartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
apply_transform – bool. Whether or not to apply transform to the inputs. Use False if transform has already been applied.
 Return type
Tuple[List[TimeSeries], List[Optional[TimeSeries]]]
 Returns
(List of TimeSeries of forecasts, List of TimeSeries of standard errors)
forecasts (np array): the forecast for the timestamps given, of size (B x nexperts x max_forecast_steps)
probs (np array): the expert probabilities for each forecast made, of size (B x nexperts x max_forecast_steps), sum of probs is 1 along dim 1
 expert_prediction(expert_preds, probs, mode='max', use_gpu=False)
This function can take the outputs provided by batch_forecast or forecast of this class to get the final forecast value, and allows the user to choose which strategy to use to combine different experts.
 Parameters
expert_preds – (B x nexperts x max_forecast_steps) np array
probs – (B x nexperts x max_forecast_steps) np array
mode – either mean or max. Max picks the expert with the highest confidence; mean computes the weighted average
use_gpu – set True if GPU available for faster speed
 Returns
y_pred: B x max_forecast_steps
std: B x max_forecast_steps
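The two combination modes can be sketched on nested Python lists shaped (B x nexperts x max_forecast_steps). This is a simplified illustration rather than the library routine: `combine_experts` is a hypothetical name, it returns only the point forecast (the `std` output is omitted), and it uses lists where the real function takes numpy arrays.

```python
from typing import List

Batch = List[List[List[float]]]  # shape: (B, nexperts, max_forecast_steps)


def combine_experts(expert_preds: Batch, probs: Batch, mode: str = "max") -> List[List[float]]:
    """Combine expert forecasts into one value per sample and forecast step."""
    y_pred = []
    for sample_preds, sample_probs in zip(expert_preds, probs):
        n_steps = len(sample_preds[0])
        row = []
        for t in range(n_steps):
            preds_t = [expert[t] for expert in sample_preds]
            probs_t = [expert[t] for expert in sample_probs]
            if mode == "max":
                # Pick the forecast of the expert with the highest probability.
                best = max(range(len(probs_t)), key=probs_t.__getitem__)
                row.append(preds_t[best])
            elif mode == "mean":
                # Probability-weighted average over experts.
                row.append(sum(p * v for p, v in zip(probs_t, preds_t)))
            else:
                raise ValueError("mode must be 'max' or 'mean'")
        y_pred.append(row)
    return y_pred


preds = [[[10.0, 20.0], [30.0, 40.0]]]  # 1 sample, 2 experts, 2 steps
probs = [[[0.75, 0.25], [0.25, 0.75]]]  # sums to 1 across experts at each step
max_pred = combine_experts(preds, probs, mode="max")    # [[10.0, 40.0]]
mean_pred = combine_experts(preds, probs, mode="mean")  # [[15.0, 35.0]]
```

In max mode the chosen expert can change from one forecast step to the next, as it does in this example.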
 evaluate(data, mode='mean', expert_idx=None, use_gpu=True, use_batch_forecast=True, bs=64, confidence_thres=0.1)
This function takes time series data and performs an overall evaluation on it using the sMAPE metric. It uses many if-else branches to satisfy the use_gpu and use_batch_forecast conditions specified by the user.
 Parameters
data – TimeSeries object
mode – either mean or max. Max picks the expert with the highest confidence; mean computes the weighted average.
expert_idx – if None, MoE uses all the experts provided and uses the ‘mode’ strategy specified below to forecast. If value is int (E.g. 0), MoE only uses the external expert at the corresponding index of the expert models provided to MoE to make forecasts.
use_gpu – set True if GPU available for faster speed
use_batch_forecast – set True for higher speed
bs – batch size used to go through the data in chunks
confidence_thres – threshold used to determine whether the MoE output is considered confident on a sample. MoE confidence is calculated as forecast-standard-deviation / abs(forecast value), where forecast-standard-deviation is the standard deviation of the forecasts made by all the experts.
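The confidence ratio and the sMAPE metric mentioned above can be sketched as follows. Several details are assumptions not stated in the reference: the use of the population standard deviation, the rule that a sample counts as confident when the ratio is at or below `confidence_thres`, and the particular sMAPE definition (the mean of 200 * |y - y_hat| / (|y| + |y_hat|)). Both helper names are hypothetical.

```python
from statistics import pstdev
from typing import List


def confidence_ratio(expert_forecasts: List[float], forecast_value: float) -> float:
    """Spread of the expert forecasts relative to the magnitude of the final forecast."""
    return pstdev(expert_forecasts) / abs(forecast_value)


def smape(y_true: List[float], y_pred: List[float]) -> float:
    """Symmetric MAPE in percent; undefined when a true/predicted pair is both zero."""
    terms = [abs(p - t) / (abs(t) + abs(p)) for t, p in zip(y_true, y_pred)]
    return 200.0 * sum(terms) / len(terms)


confidence_thres = 0.1
ratio = confidence_ratio([9.5, 10.5], forecast_value=10.0)  # spread 0.5 around a forecast of 10
is_confident = ratio <= confidence_thres  # assumed direction of the comparison
error = smape([100.0], [110.0])
```

A small ratio means the experts largely agree with each other, which is the intuition behind treating it as a confidence signal.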
 save(dirname, **save_config)
 Parameters
dirname (str) – directory to save the model
save_config – additional configurations (if needed)
 classmethod load(dirname, **kwargs)
Note: if a user-specified model was used while saving the MoE ensemble, pass the pytorch model that was used in the original MoE ensemble as the moe_model argument when calling the load function. If moe_model is not specified, it will be assumed that the default Pytorch network was used. Any discrepancy between the saved model state and the model used here will raise an error.
 Parameters
dirname (str) – directory to load the model from