merlion.models.ensemble package

Ensembles of models and automated model selection.

`base`	Base class for ensembles of models.
`combine`	Rules for combining the outputs of multiple time series models.
`anomaly`	Ensembles of anomaly detectors.
`forecast`	Ensembles of forecasters.
`MoE_forecast`	Mixture of Expert forecasters.

Submodules

merlion.models.ensemble.base module

Base class for ensembles of models.

class merlion.models.ensemble.base.EnsembleConfig(models=None, combiner=None, transform=None, **kwargs)

Bases: Config

An ensemble config contains each individual model in the ensemble, as well as the Combiner object used to combine those models’ outputs. The rationale behind placing the model objects in the EnsembleConfig (rather than in the Ensemble itself) is discussed in more detail in the documentation for LayeredModel.

Parameters

models (Optional[List[Union[ModelBase, Dict]]]) – A list of models or dicts representing them.
combiner (Optional[CombinerBase]) – The CombinerBase object to combine the outputs of the models in the ensemble.
transform – Transformation to pre-process input time series.
kwargs – Any additional kwargs for Config

models: List[ModelBase]

to_dict(_skipped_keys=None)

Returns: dict with keyword arguments used to initialize the config class.

class merlion.models.ensemble.base.EnsembleTrainConfig(valid_frac, per_model_train_configs=None)

Bases: object

Config object describing how to train an ensemble.

Parameters

valid_frac – fraction of training data to use for validation.
per_model_train_configs – list of train configs to use for individual models, one per model. None means that you use the default for all models. Specifying None for an individual model means that you use the default for that model.
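
As a rough illustration of what valid_frac controls, the sketch below holds out the final fraction of a plain Python list for validation. It assumes the held-out portion is taken from the end of the series; the helper name chronological_split is hypothetical and this is not Merlion's own train_valid_split, which operates on TimeSeries objects.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    values: Sequence[float], valid_frac: float
) -> Tuple[List[float], List[float]]:
    """Hold out the final `valid_frac` of the points for validation.

    Hypothetical helper: assumes the validation data is the tail of the series.
    """
    if not 0.0 <= valid_frac < 1.0:
        raise ValueError("valid_frac must be in [0, 1)")
    n_valid = int(round(len(values) * valid_frac))
    split = len(values) - n_valid
    return list(values[:split]), list(values[split:])


# Ten observations, with the last 20% kept aside for validation.
train, valid = chronological_split(list(range(10)), valid_frac=0.2)
```

Splitting by position rather than at random keeps the validation data strictly after the training data, which is the usual requirement for time series.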

class merlion.models.ensemble.base.EnsembleBase(config=None, models=None)

Bases: ModelBase

An abstract class representing an ensemble of multiple models.

Parameters

config (Optional[EnsembleConfig]) – The ensemble’s config
models (Optional[List[ModelBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.models.

config_class: alias of EnsembleConfig

property models

property combiner: CombinerBase

Return type: CombinerBase
Returns: the object used to combine model outputs.

reset(): Resets the model’s internal state.

property models_used

train_valid_split(transformed_train_data, train_config)

Return type: Tuple[TimeSeries, TimeSeries]

get_max_common_horizon()

truncate_valid_data(transformed_valid_data)

train_combiner(all_model_outs, target)

Return type: TimeSeries

save(dirname, save_only_used_models=False, **save_config)

Saves the ensemble of models.

Parameters

dirname (str) – directory to save the ensemble to
save_only_used_models – whether to save only the models that are actually used by the ensemble.
save_config – additional save config arguments

to_bytes(save_only_used_models=False, **save_config)

Converts the entire model state and configuration to a single byte object.

Parameters

save_only_used_models – whether to save only the models that are actually used by the ensemble.
save_config – additional configurations (if needed)

merlion.models.ensemble.combine module

Rules for combining the outputs of multiple time series models.

class merlion.models.ensemble.combine.CombinerBase(abs_score=False)

Bases: object

Abstract base class for combining the outputs of multiple models. Subclasses should implement the abstract method _combine_univariates. All combiners are callable objects.

__call__(all_model_outs, target, _check_dim=True)

Applies the model combination rule to combine multiple model outputs.

Parameters

all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (TimeSeries) – a target time series (e.g. labels)

Return type

TimeSeries

Returns

a single time series of combined model outputs on this training data.

Parameters: abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.

property requires_training

to_dict(_skipped_keys=None)

classmethod from_dict(state)

property models_used: List[bool]

Return type: List[bool]
Returns: which models are actually used to make predictions.

train(all_model_outs, target=None)

Trains the model combination rule.

Parameters

all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (Optional[TimeSeries]) – a target time series (e.g. labels)

Return type

TimeSeries

Returns

a single time series of combined model outputs on this training data.

class merlion.models.ensemble.combine.Mean(abs_score=False)

Bases: CombinerBase

Combines multiple models by taking their mean prediction.

Parameters: abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.

property weights: ndarray

Return type: ndarray

class merlion.models.ensemble.combine.Median(abs_score=False)

Bases: CombinerBase

Combines multiple models by taking their median prediction.

Parameters: abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.

class merlion.models.ensemble.combine.Max(abs_score=False)

Bases: CombinerBase

Combines multiple models by taking their max prediction.

Parameters: abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
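
The Mean, Median and Max rules can be pictured as a per-timestamp reduction over the aligned outputs of all models. The sketch below works on plain lists and uses a hypothetical combine helper; its handling of abs_score (taking absolute values before reducing) is a simplification and does not try to reproduce how Merlion treats the sign of the combined score.

```python
from statistics import mean, median
from typing import Callable, List, Sequence


def combine(
    all_model_outs: Sequence[Sequence[float]],
    rule: Callable[[List[float]], float],
    abs_score: bool = False,
) -> List[float]:
    """Combine aligned model outputs one timestamp at a time.

    Hypothetical helper: each inner sequence is one model's output.
    """
    combined = []
    for values_at_t in zip(*all_model_outs):
        vals = [abs(v) for v in values_at_t] if abs_score else list(values_at_t)
        combined.append(rule(vals))
    return combined


# Three models, three timestamps each.
outs = [[1.0, -4.0, 2.0], [3.0, 0.0, 2.0], [2.0, 1.0, -4.0]]
mean_out = combine(outs, mean)
median_out = combine(outs, median)
max_abs_out = combine(outs, max, abs_score=True)
```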

class merlion.models.ensemble.combine.ModelSelector(metric, abs_score=False)

Bases: Mean

Takes the mean of the best models, where the models are ranked according to the value of an evaluation metric.

Parameters

metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
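
A minimal sketch of the idea behind model selection, on plain lists: score every model against the target, keep the model or models with the best score, and average what they predict. The metric (mean absolute error), the lower_is_better flag and the helper names are all assumptions made for illustration; Merlion's own metrics and its invert property are not used here.

```python
from statistics import mean
from typing import Callable, List, Sequence


def mean_abs_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Illustrative metric; any function of (prediction, target) would do."""
    return mean(abs(p - t) for p, t in zip(pred, target))


def select_best(
    all_model_outs: Sequence[Sequence[float]],
    target: Sequence[float],
    metric: Callable[[Sequence[float], Sequence[float]], float] = mean_abs_error,
    lower_is_better: bool = True,
) -> List[bool]:
    """Return a mask marking the model(s) that achieve the best metric value."""
    scores = [metric(out, target) for out in all_model_outs]
    best = min(scores) if lower_is_better else max(scores)
    return [s == best for s in scores]


def mean_of_selected(
    all_model_outs: Sequence[Sequence[float]], used: Sequence[bool]
) -> List[float]:
    """Average only the outputs of the selected models."""
    chosen = [out for out, u in zip(all_model_outs, used) if u]
    return [mean(vals) for vals in zip(*chosen)]


target = [1.0, 2.0, 3.0]
outs = [[1.0, 2.0, 4.0], [0.0, 2.0, 3.0], [3.0, 3.0, 3.0]]
used = select_best(outs, target)          # the first two models tie for best
combined = mean_of_selected(outs, used)
```

The boolean mask plays the same role as the models_used property documented below: it records which models actually contribute to predictions.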

property invert

property requires_training

to_dict(_skipped_keys=None)

classmethod from_dict(state)

property models_used: List[bool]

Return type: List[bool]
Returns: which models are actually used to make predictions.

train(all_model_outs, target=None, **kwargs)

Trains the model combination rule.

Parameters

all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (Optional[TimeSeries]) – a target time series (e.g. labels)

Return type

TimeSeries

Returns

a single time series of combined model outputs on this training data.

class merlion.models.ensemble.combine.MetricWeightedMean(metric, abs_score=False)

Bases: ModelSelector

Computes a weighted average of model outputs with weights proportional to the metric values (or their inverses).

Parameters

metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
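
The weighting described above can be sketched as follows, assuming the weights are normalized to sum to one and that inverses are used for metrics where lower is better (such as an error). The helper names are hypothetical and the code works on plain lists rather than TimeSeries objects.

```python
from typing import List, Sequence


def metric_weights(metric_values: Sequence[float], invert: bool = False) -> List[float]:
    """Weights proportional to the metric values, or to their inverses."""
    if invert:
        if any(v == 0 for v in metric_values):
            raise ValueError("cannot invert a metric value of zero")
        raw = [1.0 / v for v in metric_values]
    else:
        raw = list(metric_values)
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(
    all_model_outs: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of the model outputs at each timestamp."""
    return [
        sum(w * v for w, v in zip(weights, vals)) for vals in zip(*all_model_outs)
    ]


# Errors of three models; smaller error should mean larger weight.
errors = [1.0, 1.0, 2.0]
weights = metric_weights(errors, invert=True)   # approximately [0.4, 0.4, 0.2]
outs = [[10.0, 0.0], [20.0, 0.0], [30.0, 5.0]]
combined = weighted_mean(outs, weights)         # approximately [18.0, 1.0]
```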

property models_used: List[bool]

Return type: List[bool]
Returns: which models are actually used to make predictions.

property weights: ndarray

Return type: ndarray

class merlion.models.ensemble.combine.CombinerFactory

Bases: object

Factory object for creating combiner objects.

classmethod create(name, **kwargs)

Return type: CombinerBase
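
A factory with a create(name, **kwargs) signature is commonly built on a name-to-class registry. The sketch below shows that general pattern with two hypothetical stand-in classes (MeanRule, MaxRule); it is not Merlion's CombinerFactory, and the names it accepts are assumptions.

```python
from typing import Dict, Type


class MeanRule:
    """Hypothetical stand-in for a combiner class."""

    def __init__(self, abs_score: bool = False):
        self.abs_score = abs_score


class MaxRule:
    """Hypothetical stand-in for a combiner class."""

    def __init__(self, abs_score: bool = False):
        self.abs_score = abs_score


# Maps the public name of each rule to the class that implements it.
_REGISTRY: Dict[str, Type] = {"Mean": MeanRule, "Max": MaxRule}


def create(name: str, **kwargs):
    """Look up a class by name and instantiate it with the given kwargs."""
    try:
        cls = _REGISTRY[name]
    except KeyError:
        raise KeyError(
            f"Unknown combiner {name!r}; expected one of {sorted(_REGISTRY)}"
        ) from None
    return cls(**kwargs)


rule = create("Max", abs_score=True)
```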

merlion.models.ensemble.anomaly module

Ensembles of anomaly detectors.

class merlion.models.ensemble.anomaly.DetectorEnsembleConfig(enable_calibrator=False, max_score: float = 1000, threshold=None, enable_threshold=True, transform: TransformBase = None, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, **kwargs)

Bases: DetectorConfig, EnsembleConfig

Config class for an ensemble of anomaly detectors.

Base class of the object used to configure an anomaly detection model.

Parameters

enable_calibrator – Whether to enable calibration of the ensemble anomaly score. False by default.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
models – A list of models or dicts representing them.
combiner – The CombinerBase object to combine the outputs of the models in the ensemble.
kwargs – Any additional kwargs for EnsembleConfig or DetectorConfig

property per_model_threshold

Returns: whether to apply the thresholding rules of each individual model, before combining their outputs. Only done if doing model selection.

class merlion.models.ensemble.anomaly.DetectorEnsemble(config=None, models=None)

Bases: EnsembleBase, DetectorBase

Class representing an ensemble of multiple anomaly detection models.

Parameters: config (Optional[DetectorEnsembleConfig]) – model configuration

config_class: alias of DetectorEnsembleConfig

property per_model_threshold

Returns: whether to apply the threshold rule of each individual model before aggregating their anomaly scores.

train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None, per_model_post_rule_train_configs=None)

Trains each anomaly detector in the ensemble unsupervised, and each of their post-rules supervised (if labels are given).

Parameters

train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config (Optional[EnsembleTrainConfig]) – config for ensemble training. Not recommended.
post_rule_train_config – the post-rule train config to use for the ensemble-level post-rule.
per_model_post_rule_train_configs – the post-rule train configs to use for each of the individual models. Must be equal in length to the number of models, if given.

Return type

TimeSeries

Returns

A TimeSeries of the ensemble’s anomaly scores on the training data.

get_anomaly_score(time_series, time_series_prev=None)

Returns the model’s predicted sequence of anomaly scores.

Parameters

time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.

Return type

TimeSeries

Returns

a univariate TimeSeries of anomaly scores

merlion.models.ensemble.forecast module

Ensembles of forecasters.

class merlion.models.ensemble.forecast.ForecasterEnsembleConfig(max_forecast_steps=None, verbose=False, target_seq_index: int = None, transform: TransformBase = None, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, **kwargs)

Bases: ForecasterConfig, EnsembleConfig

Config class for an ensemble of forecasters.

Parameters

max_forecast_steps – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
transform – Transformation to pre-process input time series.
models – A list of models or dicts representing them.
combiner – The CombinerBase object to combine the outputs of the models in the ensemble.
kwargs – Any additional kwargs for Config

class merlion.models.ensemble.forecast.ForecasterEnsemble(config=None, models=None)

Bases: EnsembleBase, ForecasterBase

Class representing an ensemble of multiple forecasting models.

Parameters

config (Optional[ForecasterEnsembleConfig]) – The ensemble’s config
models (Optional[List[ForecasterBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.models.

config_class: alias of ForecasterEnsembleConfig

train_pre_process(train_data, require_even_sampling, require_univariate)

Applies pre-processing steps common for training most models.

Parameters

train_data (TimeSeries) – the original time series of training data
require_even_sampling (bool) – whether the model assumes that training data is sampled at a fixed frequency
require_univariate (bool) – whether the model only works with univariate time series

Return type

TimeSeries

Returns

the training data, after any necessary pre-processing has been applied

train(train_data, train_config=None)

Trains the forecaster on the input time series.

Parameters

train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config (Optional[EnsembleTrainConfig]) – Additional training configs, if needed. Only required for some models.

Return type

Tuple[Optional[TimeSeries], None]

Returns

the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data

forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)

Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.

Parameters

time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.

Return type

Union[Tuple[TimeSeries, Optional[TimeSeries]], Tuple[TimeSeries, TimeSeries, TimeSeries]]

Returns

(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.

forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp
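
The note on forecast about transformed values can be illustrated with a toy example: if the model is fit on log-transformed data, its forecast is in log space and has to be mapped back by hand. The transform, the naive forecaster and all helper names below are hypothetical; Merlion's own transform classes are not used.

```python
import math
from typing import List, Sequence


def log_transform(values: Sequence[float]) -> List[float]:
    """Forward transform applied before modelling."""
    return [math.log(v) for v in values]


def invert_log_transform(values: Sequence[float]) -> List[float]:
    """Inverse transform applied to the forecast to recover original units."""
    return [math.exp(v) for v in values]


history = [10.0, 20.0, 40.0]
transformed = log_transform(history)

# Naive forecast in transformed space: repeat the last step-to-step change.
step = transformed[-1] - transformed[-2]
forecast_transformed = [transformed[-1] + step * (i + 1) for i in range(2)]

# The values above are logs; invert them to get a forecast in original units.
forecast = invert_log_transform(forecast_transformed)  # close to [80.0, 160.0]
```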

merlion.models.ensemble.MoE_forecast module

Mixture of Expert forecasters.

class merlion.models.ensemble.MoE_forecast.myDataset(data, lookback, forecast=1, target_seq_index=0, include_ts=False)

Bases: Dataset

Creates a pytorch dataset.

Parameters

data – TimeSeries object
lookback – number of time steps to look back in order to forecast
forecast – number of steps to forecast in the future
target_seq_index – dimension of the time series that will be forecasted
include_ts – Bool. If True, __getitem__ also returns a TimeSeries version of the data; otherwise it is excluded.
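
The roles of lookback, forecast and target_seq_index can be sketched with a sliding window over a plain list of rows. This is an illustration under assumptions, not the myDataset implementation: the helper name make_windows is hypothetical, and it returns lists rather than tensors.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[Sequence[float]],
    lookback: int,
    forecast: int = 1,
    target_seq_index: int = 0,
) -> List[Tuple[List[List[float]], List[float]]]:
    """Build (past window, future target values) pairs.

    Each row of `series` holds all variables at one time step. The past window
    keeps every variable; the future part keeps only the target variable.
    """
    samples = []
    last_start = len(series) - lookback - forecast
    for start in range(last_start + 1):
        past = [list(row) for row in series[start:start + lookback]]
        future = [
            row[target_seq_index]
            for row in series[start + lookback:start + lookback + forecast]
        ]
        samples.append((past, future))
    return samples


# Six time steps of a two-variable series; forecast the second variable.
series = [[float(t), float(10 * t)] for t in range(6)]
samples = make_windows(series, lookback=3, forecast=2, target_seq_index=1)
```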

class merlion.models.ensemble.MoE_forecast.MoE_ForecasterEnsembleConfig(batch_size=128, lr=0.0001, warmup_steps=100, epoch_max=100, nfree_experts=0, lookback_len=10, max_forecast_steps=3, target_seq_index=0, use_gpu=True, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, transform: TransformBase = None, normalize: Rescale = None, **kwargs)

Bases: EnsembleConfig, ForecasterConfig, NormalizingConfig

Config class for MoE (mixture of experts) forecaster.

Parameters

batch_size – batch size used for training. Needed because MoE uses gradient-descent-based learning, with training happening over multiple epochs.
lr – learning rate of the Adam optimizer used in MoE training
warmup_steps – number of iterations used to reach lr
epoch_max – number of epochs to train the MoE model
nfree_experts – number of free expert forecast values that are trained using gradient descent
lookback_len – number of past time steps to look at in order to make future forecasts
max_forecast_steps – number of future steps to forecast
target_seq_index – index of time series to forecast. Integer value.
use_gpu – Bool. Use True if GPU available for faster speed.
models – A list of models or dicts representing them.
combiner – The CombinerBase object to combine the outputs of the models in the ensemble.
transform – Transformation to pre-process input time series.
normalize – Pre-trained normalization transformation (optional).
kwargs – Any additional kwargs for Config
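
One common way to read warmup_steps is as a linear ramp from a small learning rate up to lr. The sketch below shows that reading only; the shape of the schedule is an assumption, and the helper name warmup_lr is hypothetical.

```python
def warmup_lr(step: int, lr: float = 1e-4, warmup_steps: int = 100) -> float:
    """Learning rate at a given 0-indexed optimizer step.

    Assumes a linear ramp that reaches `lr` after `warmup_steps` steps and
    stays constant afterwards.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        return lr
    return lr * (step + 1) / warmup_steps


# The rate grows during warmup and is flat once warmup is over.
schedule = [warmup_lr(s) for s in (0, 49, 99, 100, 500)]
```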

models: List[ModelBase]

class merlion.models.ensemble.MoE_forecast.MoE_ForecasterEnsemble(config=None, models=None, moe_model=None)

Bases: EnsembleBase, ForecasterBase

Model-based mixture of experts for forecasting.

The main class functions useful for users are:

train: used for training the MoE model (includes training external expert and training MoE model parameters)

finetune: assuming train() has been called once, finetune can be called to train the MoE model parameters again (e.g. with different optimization hyper-parameters).

forecast: given a time series, returns the forecast values and standard error as a tuple of TimeSeries objects

batch_forecast: same as forecast, but can operate on a batch of input data and outputs list of TimeSeries objects

_forecast: given a time series, returns the forecast values and confidence of all experts

_batch_forecast: same as _forecast, but can operate on a batch of input data

expert_prediction: this function operates on the output of _batch_forecast to compute a single forecast value per input by combining expert predictions using the user specified strategy (see expert_prediction function for details)

evaluate: mainly for development purposes. This function performs sMAPE evaluation on the given time series data.

Parameters

models (Optional[List[ForecasterBase]]) – list of external expert models (e.g. Sarima, Arima). Can be an empty list if nfree_experts>0 is specified.
moe_model – pytorch model that takes torch.tensor input of size (B x lookback_len x input_dim) and outputs a tuple of 2 variables. The first variable is the logit (pre-softmax) of size (B x nexperts x max_forecast_steps). The second variable is None if nfree_experts=0, else has size (nfree_experts x max_forecast_steps) which is the forecasted values by nfree_experts number of experts.
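
To make the shapes above concrete, the sketch below turns logits of shape (B x nexperts x max_forecast_steps) into expert probabilities by applying a softmax across the expert dimension, separately for each sample and forecast step. It uses nested Python lists instead of torch tensors, and the helper name softmax_over_experts is hypothetical.

```python
import math
from typing import List


def softmax_over_experts(logits: List[List[List[float]]]) -> List[List[List[float]]]:
    """Softmax along the expert axis of a (B, nexperts, steps) nested list."""
    probs = []
    for sample in logits:
        nexperts, nsteps = len(sample), len(sample[0])
        out = [[0.0] * nsteps for _ in range(nexperts)]
        for s in range(nsteps):
            col = [sample[e][s] for e in range(nexperts)]
            m = max(col)  # subtract the max for numerical stability
            exps = [math.exp(c - m) for c in col]
            z = sum(exps)
            for e in range(nexperts):
                out[e][s] = exps[e] / z
        probs.append(out)
    return probs


# One sample (B=1), two experts, two forecast steps.
logits = [[[0.0, 2.0], [0.0, 0.0]]]
probs = softmax_over_experts(logits)
```

For every sample and forecast step, the resulting probabilities sum to one across experts.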

config_class: alias of MoE_ForecasterEnsembleConfig

property moe_model

property nexperts

property batch_size: int

Return type: int

property lr: int

Return type: int

property warmup_steps: int

Return type: int

property epoch_max: int

Return type: int

property nfree_experts: int

Return type: int

property use_gpu: int

Return type: int

property lookback_len: int

Return type: int

train(train_data, train_config=None)

Trains the forecaster on the input time series.

Parameters

train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config (Optional[EnsembleTrainConfig]) – Additional training configs, if needed. Only required for some models.

Return type

Tuple[Optional[TimeSeries], Optional[TimeSeries]]

Returns

the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data

finetune(train_data, train_config=None): This function expects the external experts to be already trained. This function extracts the predictions of external experts (if any) and stores them. It then uses them along with the training data to train the MoE model to perform expert selection and forecasting. This function is called internally by the train function.

forecast(time_stamps, time_series_prev=None, apply_transform=True, return_iqr=False, return_prev=False, expert_idx=None, mode='max', use_gpu=False)

Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.

Parameters

time_stamps (List[int]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.
return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.

Returns

(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.

forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp

batch_forecast(time_stamps_list, time_series_prev_list, return_iqr=False, return_prev=False, apply_transform=True, expert_idx=None, mode='max', use_gpu=False)

Returns the ensemble’s forecast on a batch of timestamps. Note that inverse transforms are applied to the forecasts returned by this function.

Parameters

time_stamps_list (List[List[int]]) – a list of lists of timestamps we wish to forecast for
time_series_prev_list (List[TimeSeries]) – a list of TimeSeries immediately preceding the time stamps in time_stamps_list
return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
apply_transform – bool. Whether or not to apply transform to the inputs. Use False if transform has already been applied.

Return type

Tuple[List[TimeSeries], List[Optional[TimeSeries]]]

Returns

(List of TimeSeries of forecasts, List of TimeSeries of standard errors)

forecasts (np array): the forecast for the timestamps given, of size (B x nexperts x max_forecast_steps)
probs (np array): the expert probabilities for each forecast made, of size (B x nexperts x max_forecast_steps), sum of probs is 1 along dim 1

expert_prediction(expert_preds, probs, mode='max', use_gpu=False)

This function can take the outputs provided by batch_forecast or forecast of this class to get the final forecast value and allows the user to choose which strategy to use to combine different experts.

Parameters

expert_preds – (B x nexperts x max_forecast_steps) np array
probs – (B x nexperts x max_forecast_steps) np array
mode – either mean or max. Max picks the expert with the highest confidence; mean computes the weighted average
use_gpu – set True if GPU available for faster speed

Returns

y_pred: B x max_forecast_steps
std: B x max_forecast_steps
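
The two combination modes can be sketched on nested lists as follows. In max mode the prediction of the most probable expert is taken at each forecast step; in mean mode the predictions are averaged with the probabilities as weights. The helper name combine_experts is hypothetical, the std output is omitted, and this is an illustration rather than the library's implementation.

```python
from typing import List


def combine_experts(
    expert_preds: List[List[List[float]]],
    probs: List[List[List[float]]],
    mode: str = "max",
) -> List[List[float]]:
    """Reduce (B, nexperts, steps) predictions to (B, steps)."""
    if mode not in ("max", "mean"):
        raise ValueError("mode must be 'max' or 'mean'")
    y_pred = []
    for preds_b, probs_b in zip(expert_preds, probs):
        nsteps = len(preds_b[0])
        row = []
        for s in range(nsteps):
            p = [expert[s] for expert in probs_b]
            v = [expert[s] for expert in preds_b]
            if mode == "max":
                # Take the forecast of the expert with the highest probability.
                row.append(v[p.index(max(p))])
            else:
                # Probability-weighted average of all expert forecasts.
                row.append(sum(pi * vi for pi, vi in zip(p, v)) / sum(p))
        y_pred.append(row)
    return y_pred


# One sample, two experts, two forecast steps.
expert_preds = [[[10.0, 20.0], [30.0, 40.0]]]
probs = [[[0.75, 0.25], [0.25, 0.75]]]
picked = combine_experts(expert_preds, probs, mode="max")
averaged = combine_experts(expert_preds, probs, mode="mean")
```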

evaluate(data, mode='mean', expert_idx=None, use_gpu=True, use_batch_forecast=True, bs=64, confidence_thres=0.1)

Takes time series data and performs an overall evaluation on it using the sMAPE metric. Internally, this function branches on the use_gpu and use_batch_forecast options specified by the user.

Parameters

data – TimeSeries object
mode – either mean or max. Max picks the expert with the highest confidence; mean computes the weighted average.
expert_idx – if None, MoE uses all the experts provided and combines them using the strategy given by mode. If the value is an int (e.g. 0), MoE only uses the external expert at the corresponding index of the expert models provided to MoE to make forecasts.
use_gpu – set True if GPU available for faster speed
use_batch_forecast – set True for higher speed
bs – batch size used to go through the data in chunks
confidence_thres – threshold used to determine whether the MoE output on a sample is considered confident. MoE confidence is calculated as forecast-standard-deviation/abs(forecast value), where forecast-standard-deviation is the standard deviation of the forecasts made by all the experts.
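
The two quantities mentioned above can be sketched in plain Python. The sMAPE variant shown (scaled to the range 0 to 200) and the rule that a ratio at or below the threshold counts as confident are both assumptions, as is the use of the population standard deviation; the helper names are hypothetical.

```python
from statistics import pstdev
from typing import Sequence


def smape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, on a 0 to 200 scale."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        if denom > 0:  # a pair of zeros contributes no error
            total += abs(a - p) / denom
    return 200.0 * total / len(actual)


def is_confident(
    expert_forecasts: Sequence[float],
    forecast_value: float,
    confidence_thres: float = 0.1,
) -> bool:
    """Compare std(expert forecasts) / abs(forecast value) to the threshold."""
    if forecast_value == 0:
        return False  # the ratio is undefined; treat as not confident
    ratio = pstdev(expert_forecasts) / abs(forecast_value)
    return ratio <= confidence_thres


score = smape([1.0, 3.0], [1.0, 1.0])
agree = is_confident([9.0, 10.0, 11.0], forecast_value=10.0)
disagree = is_confident([5.0, 10.0, 15.0], forecast_value=10.0)
```

When the experts broadly agree, the ratio is small and the sample is treated as confident; widely spread expert forecasts push the ratio above the threshold.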

save(dirname, **save_config)

Parameters

dirname (str) – directory to save the model
save_config – additional configurations (if needed)

classmethod load(dirname, **kwargs)

Note: if a user-specified model was used when saving the MoE ensemble, pass the pytorch model that was used in the original MoE ensemble as the moe_model argument when calling load. If moe_model is not specified, it is assumed that the default pytorch network was used. Any discrepancy between the saved model state and the model used here will raise an error.

Parameters: dirname (str) – directory to load the model from