merlion.models.ensemble package

Ensembles of models and automated model selection.

base

Base class for ensembles of models.

combine

Rules for combining the outputs of multiple time series models.

anomaly

Ensembles of anomaly detectors.

forecast

Ensembles of forecasters.

MoE_forecast

Mixture of Expert forecasters.

Submodules

merlion.models.ensemble.base module

Base class for ensembles of models.

class merlion.models.ensemble.base.EnsembleConfig(models=None, combiner=None, transform=None, **kwargs)

Bases: Config

An ensemble config contains each individual model in the ensemble, as well as the Combiner object used to combine those models’ outputs. The rationale behind placing the model objects in the EnsembleConfig (rather than in the Ensemble itself) is discussed in more detail in the documentation for LayeredModel.

Parameters
  • models (Optional[List[Union[ModelBase, Dict]]]) – A list of models or dicts representing them.

  • combiner (Optional[CombinerBase]) – The CombinerBase object to combine the outputs of the models in the ensemble.

  • transform – Transformation to pre-process input time series.

  • kwargs – Any additional kwargs for Config

models: List[ModelBase]
to_dict(_skipped_keys=None)
Returns

dict with keyword arguments used to initialize the config class.

class merlion.models.ensemble.base.EnsembleTrainConfig(valid_frac, per_model_train_configs=None)

Bases: object

Config object describing how to train an ensemble.

Parameters
  • valid_frac – fraction of training data to use for validation.

  • per_model_train_configs – list of train configs to use for individual models, one per model. None means that you use the default for all models. Specifying None for an individual model means that you use the default for that model.
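To make the role of valid_frac concrete, here is a minimal standalone sketch that holds out the final fraction of a chronologically ordered series for validation. The helper name split_train_valid and the choice to take the validation block from the end of the series are assumptions for illustration; this is not Merlion's implementation.

```python
from typing import List, Tuple


def split_train_valid(values: List[float], valid_frac: float) -> Tuple[List[float], List[float]]:
    """Hold out the last `valid_frac` of a chronologically ordered series (hypothetical helper)."""
    if not 0.0 <= valid_frac < 1.0:
        raise ValueError("valid_frac must be in [0, 1)")
    n_valid = int(round(len(values) * valid_frac))
    n_train = len(values) - n_valid
    # Keep time order: the earlier points train the models, the later points validate them.
    return values[:n_train], values[n_train:]


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
train, valid = split_train_valid(series, valid_frac=0.2)
```

With ten points and valid_frac=0.2, the first eight points are used for training and the last two for validation.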

class merlion.models.ensemble.base.EnsembleBase(config=None, models=None)

Bases: ModelBase

An abstract class representing an ensemble of multiple models.

Parameters
  • config (Optional[EnsembleConfig]) – The ensemble’s config

  • models (Optional[List[ModelBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.models.

config_class

alias of EnsembleConfig

property models
property combiner: CombinerBase
Return type

CombinerBase

Returns

the object used to combine model outputs.

reset()

Resets the model’s internal state.

property models_used
train_valid_split(transformed_train_data, train_config)
Return type

Tuple[TimeSeries, TimeSeries]

get_max_common_horizon()
truncate_valid_data(transformed_valid_data)
train_combiner(all_model_outs, target)
Return type

TimeSeries

save(dirname, save_only_used_models=False, **save_config)

Saves the ensemble of models.

Parameters
  • dirname (str) – directory to save the ensemble to

  • save_only_used_models – whether to save only the models that are actually used by the ensemble.

  • save_config – additional save config arguments

to_bytes(save_only_used_models=False, **save_config)

Converts the entire model state and configuration to a single byte object.

Parameters
  • save_only_used_models – whether to save only the models that are actually used by the ensemble.

  • save_config – additional configurations (if needed)

merlion.models.ensemble.combine module

Rules for combining the outputs of multiple time series models.

class merlion.models.ensemble.combine.CombinerBase(abs_score=False)

Bases: object

Abstract base class for combining the outputs of multiple models. Subclasses should implement the abstract method _combine_univariates. All combiners are callable objects.

__call__(all_model_outs, target, _check_dim=True)

Applies the model combination rule to combine multiple model outputs.

Parameters
  • all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.

  • target (TimeSeries) – a target time series (e.g. labels)

Return type

TimeSeries

Returns

a single time series of combined model outputs on this training data.

Parameters

abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.

property requires_training
to_dict(_skipped_keys=None)
classmethod from_dict(state)
property models_used: List[bool]
Return type

List[bool]

Returns

which models are actually used to make predictions.

train(all_model_outs, target=None)

Trains the model combination rule.

Parameters
  • all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.

  • target (Optional[TimeSeries]) – a target time series (e.g. labels)

Return type

TimeSeries

Returns

a single time series of combined model outputs on this training data.

class merlion.models.ensemble.combine.Mean(abs_score=False)

Bases: CombinerBase

Combines multiple models by taking their mean prediction.

Parameters

abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.

property weights: ndarray
Return type

ndarray

class merlion.models.ensemble.combine.Median(abs_score=False)

Bases: CombinerBase

Combines multiple models by taking their median prediction.

Parameters

abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.

class merlion.models.ensemble.combine.Max(abs_score=False)

Bases: CombinerBase

Combines multiple models by taking their max prediction.

Parameters

abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
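The Mean, Median and Max rules above, together with the abs_score option, can be sketched in plain Python as follows. This is only an illustration of the idea on lists of floats; the function name combine and the list-based representation are assumptions, and the real combiners operate on TimeSeries objects.

```python
from statistics import mean, median
from typing import Callable, List, Sequence


def combine(all_model_outs: List[Sequence[float]], rule: Callable, abs_score: bool = False) -> List[float]:
    """Apply `rule` across models at each timestamp (hypothetical helper)."""
    combined = []
    for values_at_t in zip(*all_model_outs):  # one tuple of model outputs per timestamp
        if abs_score:
            # Optionally compare magnitudes only, as is common for anomaly scores.
            values_at_t = [abs(v) for v in values_at_t]
        combined.append(rule(values_at_t))
    return combined


# Three models, three timestamps each.
outs = [[1.0, -4.0, 2.0], [3.0, 1.0, 2.0], [2.0, 0.0, -4.0]]
mean_out = combine(outs, mean)
median_out = combine(outs, median)
max_out = combine(outs, max)
max_abs_out = combine(outs, max, abs_score=True)
```

Note how abs_score changes the result of the max rule at the second and third timestamps, where one model produces a large negative value.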

class merlion.models.ensemble.combine.ModelSelector(metric, abs_score=False)

Bases: Mean

Takes the mean of the best models, where the models are ranked according to the value of an evaluation metric.

Parameters
  • metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use

  • abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.

property invert
property requires_training
to_dict(_skipped_keys=None)
classmethod from_dict(state)
property models_used: List[bool]
Return type

List[bool]

Returns

which models are actually used to make predictions.

train(all_model_outs, target=None, **kwargs)

Trains the model combination rule.

Parameters
  • all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.

  • target (Optional[TimeSeries]) – a target time series (e.g. labels)

Return type

TimeSeries

Returns

a single time series of combined model outputs on this training data.
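The model selection rule described for ModelSelector (rank models by a metric, then average the best ones) can be sketched as below. The helper names, the lower_is_better flag, and the tie-handling (all models sharing the best value are kept) are assumptions made for this sketch rather than a description of the library's internals.

```python
from typing import List, Sequence


def select_best(metric_values: Sequence[float], lower_is_better: bool) -> List[bool]:
    """Return a mask marking the model(s) with the best metric value (hypothetical helper)."""
    best = min(metric_values) if lower_is_better else max(metric_values)
    return [v == best for v in metric_values]


def mean_of_selected(all_model_outs: List[Sequence[float]], used: List[bool]) -> List[float]:
    """Average the outputs of the selected models at each timestamp."""
    selected = [out for out, keep in zip(all_model_outs, used) if keep]
    return [sum(vals) / len(vals) for vals in zip(*selected)]


# An error metric such as RMSE: lower is better, and models 1 and 2 are tied.
rmse_per_model = [0.5, 0.2, 0.2]
used = select_best(rmse_per_model, lower_is_better=True)
outs = [[10.0, 10.0], [1.0, 2.0], [3.0, 4.0]]
combined = mean_of_selected(outs, used)
```

The used mask plays the same role as the models_used property: it records which models actually contribute to the prediction.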

class merlion.models.ensemble.combine.MetricWeightedMean(metric, abs_score=False)

Bases: ModelSelector

Computes a weighted average of model outputs with weights proportional to the metric values (or their inverses).

Parameters
  • metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use

  • abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.

property models_used: List[bool]
Return type

List[bool]

Returns

which models are actually used to make predictions.

property weights: ndarray
Return type

ndarray
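As a rough sketch of weights proportional to metric values (or their inverses), the following standalone example normalizes the weights to sum to 1 and applies them at each timestamp. The function names and the invert flag are illustrative assumptions; inverting is meant for metrics where lower is better, and this sketch assumes no metric value is zero in that case.

```python
from typing import List, Sequence


def metric_weights(metric_values: Sequence[float], invert: bool) -> List[float]:
    """Normalized weights proportional to the metric values, or to their inverses (hypothetical helper)."""
    # Inverting assumes every metric value is non-zero.
    raw = [1.0 / v for v in metric_values] if invert else list(metric_values)
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(all_model_outs: List[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted average of the model outputs at each timestamp."""
    return [sum(w * v for w, v in zip(weights, vals)) for vals in zip(*all_model_outs)]


# Error metric (lower is better), so weights are proportional to 1 / error.
weights = metric_weights([0.5, 0.25, 0.25], invert=True)
combined = weighted_mean([[10.0], [20.0], [30.0]], weights)
```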

class merlion.models.ensemble.combine.CombinerFactory

Bases: object

Factory object for creating combiner objects.

classmethod create(name, **kwargs)
Return type

CombinerBase

merlion.models.ensemble.anomaly module

Ensembles of anomaly detectors.

class merlion.models.ensemble.anomaly.DetectorEnsembleConfig(enable_calibrator=False, max_score: float = 1000, threshold=None, enable_threshold=True, transform: TransformBase = None, normalize: Rescale = None, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, **kwargs)

Bases: DetectorConfig, EnsembleConfig

Config class for an ensemble of anomaly detectors.

Base class of the object used to configure an anomaly detection model.

Parameters
  • enable_calibrator – Whether to enable calibration of the ensemble anomaly score. False by default.

  • max_score – maximum possible uncalibrated anomaly score

  • threshold – the rule to use for thresholding anomaly scores

  • enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores

  • transform – Transformation to pre-process input time series.

  • normalize – Pre-trained normalization transformation (optional).

  • models – A list of models or dicts representing them.

  • combiner – The CombinerBase object to combine the outputs of the models in the ensemble.

  • kwargs – Any additional kwargs for EnsembleConfig or DetectorConfig

property per_model_threshold
Returns

whether to apply the thresholding rules of each individual model, before combining their outputs. Only done if doing model selection.

class merlion.models.ensemble.anomaly.DetectorEnsemble(config=None, models=None)

Bases: EnsembleBase, DetectorBase

Class representing an ensemble of multiple anomaly detection models.

Parameters

config (Optional[DetectorEnsembleConfig]) – model configuration

config_class

alias of DetectorEnsembleConfig

property per_model_threshold
Returns

whether to apply the threshold rule of each individual model before aggregating their anomaly scores.

train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None, per_model_post_rule_train_configs=None)

Trains each anomaly detector in the ensemble unsupervised, and each of their post-rules supervised (if labels are given).

Parameters
  • train_data (TimeSeries) – a TimeSeries of metric values to train the model.

  • anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.

  • train_config (Optional[EnsembleTrainConfig]) – config for ensemble training. Not recommended.

  • post_rule_train_config – the post-rule train config to use for the ensemble-level post-rule.

  • per_model_post_rule_train_configs – the post-rule train configs to use for each of the individual models. Must be equal in length to the number of models, if given.

Return type

TimeSeries

Returns

A TimeSeries of the ensemble’s anomaly scores on the training data.

get_anomaly_score(time_series, time_series_prev=None)

Returns the model’s predicted sequence of anomaly scores.

Parameters
  • time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.

  • time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.

Return type

TimeSeries

Returns

a univariate TimeSeries of anomaly scores

merlion.models.ensemble.forecast module

Ensembles of forecasters.

class merlion.models.ensemble.forecast.ForecasterEnsembleConfig(max_forecast_steps=None, verbose=False, target_seq_index: int = None, transform: TransformBase = None, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, **kwargs)

Bases: ForecasterConfig, EnsembleConfig

Config class for an ensemble of forecasters.

Parameters
  • max_forecast_steps – Max # of steps we would like to forecast for. Required for some models like MSES and LGBMForecaster.

  • target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.

  • transform – Transformation to pre-process input time series.

  • models – A list of models or dicts representing them.

  • combiner – The CombinerBase object to combine the outputs of the models in the ensemble.

  • kwargs – Any additional kwargs for Config

class merlion.models.ensemble.forecast.ForecasterEnsemble(config=None, models=None)

Bases: EnsembleBase, ForecasterBase

Class representing an ensemble of multiple forecasting models.

Parameters
  • config (Optional[ForecasterEnsembleConfig]) – The ensemble’s config

  • models (Optional[List[ForecasterBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.models.

config_class

alias of ForecasterEnsembleConfig

train_pre_process(train_data, require_even_sampling, require_univariate)

Applies pre-processing steps common for training most models.

Parameters
  • train_data (TimeSeries) – the original time series of training data

  • require_even_sampling (bool) – whether the model assumes that training data is sampled at a fixed frequency

  • require_univariate (bool) – whether the model only works with univariate time series

Return type

TimeSeries

Returns

the training data, after any necessary pre-processing has been applied

train(train_data, train_config=None)

Trains the forecaster on the input time series.

Parameters
  • train_data (TimeSeries) – a TimeSeries of metric values to train the model.

  • train_config (Optional[EnsembleTrainConfig]) – Additional training configs, if needed. Only required for some models.

Return type

Tuple[Optional[TimeSeries], None]

Returns

the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data

forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)

Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.

Parameters
  • time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.

  • time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.

  • return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.

  • return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.

Return type

Union[Tuple[TimeSeries, Optional[TimeSeries]], Tuple[TimeSeries, TimeSeries, TimeSeries]]

Returns

(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.

  • forecast: the forecast for the timestamps given

  • forecast_stderr: the standard error of each forecast value.

    May be None.

  • forecast_lb: 25th percentile of forecast values for each timestamp

  • forecast_ub: 75th percentile of forecast values for each timestamp

merlion.models.ensemble.MoE_forecast module

Mixture of Expert forecasters.

class merlion.models.ensemble.MoE_forecast.myDataset(data, lookback, forecast=1, target_seq_index=0, include_ts=False)

Bases: Dataset

Creates a pytorch dataset.

Parameters
  • data – TimeSeries object

  • lookback – number of time steps to look back in order to forecast

  • forecast – number of steps to forecast in the future

  • target_seq_index – dimension of the timeseries that will be forecasted

  • include_ts – Bool. If True, __getitem__ also returns a TimeSeries version of the data, excludes it otherwise
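The lookback and forecast parameters describe a sliding-window view of the series. The sketch below shows that windowing on a plain list, without PyTorch; the helper name make_windows and the univariate list input are assumptions for illustration and do not reproduce the actual dataset class.

```python
from typing import List, Sequence, Tuple


def make_windows(values: Sequence[float], lookback: int, forecast: int = 1) -> List[Tuple[List[float], List[float]]]:
    """Slice a univariate series into (past window, future target) pairs (hypothetical helper)."""
    n_samples = len(values) - lookback - forecast + 1
    return [
        (
            list(values[i : i + lookback]),                        # model input
            list(values[i + lookback : i + lookback + forecast]),  # values to predict
        )
        for i in range(max(n_samples, 0))
    ]


samples = make_windows([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], lookback=3, forecast=2)
```

A series of length 6 with lookback=3 and forecast=2 yields two samples; a series shorter than lookback + forecast yields none.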

class merlion.models.ensemble.MoE_forecast.MoE_ForecasterEnsembleConfig(batch_size=128, lr=0.0001, warmup_steps=100, epoch_max=100, nfree_experts=0, lookback_len=10, max_forecast_steps=3, target_seq_index=0, use_gpu=True, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, transform: TransformBase = None, normalize: Rescale = None, **kwargs)

Bases: EnsembleConfig, ForecasterConfig, NormalizingConfig

Config class for MoE (mixture of experts) forecaster.

Parameters
  • batch_size – batch size used for training. It is needed because MoE uses gradient-descent-based learning, with training running over multiple epochs.

  • lr – learning rate of the Adam optimizer used in MoE training

  • warmup_steps – number of iterations used to reach lr

  • epoch_max – number of epochs to train the MoE model

  • nfree_experts – number of free expert forecast values that are trained using gradient descent

  • lookback_len – number of past time steps to look at in order to make future forecasts

  • max_forecast_steps – number of future steps to forecast

  • target_seq_index – index of time series to forecast. Integer value.

  • use_gpu – Bool. Use True if GPU available for faster speed.

  • models – A list of models or dicts representing them.

  • combiner – The CombinerBase object to combine the outputs of the models in the ensemble.

  • transform – Transformation to pre-process input time series.

  • normalize – Pre-trained normalization transformation (optional).

  • kwargs – Any additional kwargs for Config
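One common way to read warmup_steps ("number of iterations used to reach lr") is as a linear ramp of the learning rate. The sketch below assumes that linear shape and a constant rate afterwards; both are assumptions for illustration, and the schedule actually used during MoE training may differ.

```python
def warmup_lr(step: int, lr: float = 1e-4, warmup_steps: int = 100) -> float:
    """Linearly ramp the learning rate from 0 to `lr` over `warmup_steps` iterations (assumed schedule)."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return lr
    return lr * step / warmup_steps


# Learning rate at the start, halfway through warmup, at the end of warmup, and later on.
schedule = [warmup_lr(s) for s in (0, 50, 100, 500)]
```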

models: List[ModelBase]
class merlion.models.ensemble.MoE_forecast.MoE_ForecasterEnsemble(config=None, models=None, moe_model=None)

Bases: EnsembleBase, ForecasterBase

Model-based mixture of experts for forecasting.

The main class functions useful for users are:

  • train: used for training the MoE model (includes training the external experts and training the MoE model parameters)

  • finetune: assuming train() has already been called once, finetune can be called to train the MoE model parameters again (e.g. with different optimization hyper-parameters).

  • forecast: given a time series, returns the forecast values and standard error as a tuple of TimeSeries objects

  • batch_forecast: same as forecast, but can operate on a batch of input data and outputs list of TimeSeries objects

  • _forecast: given a time series, returns the forecast values and confidence of all experts

  • _batch_forecast: same as _forecast, but can operate on a batch of input data

  • expert_prediction: this function operates on the output of _batch_forecast to compute a single forecast value per input by combining expert predictions using the user specified strategy (see expert_prediction function for details)

  • evaluate: mainly for development purposes. This function performs sMAPE evaluation on a given time series.

Parameters
  • models (Optional[List[ForecasterBase]]) – list of external expert models (e.g. Sarima, Arima). Can be an empty list if nfree_experts>0 is specified.

  • moe_model – pytorch model that takes torch.tensor input of size (B x lookback_len x input_dim) and outputs a tuple of 2 variables. The first variable is the logit (pre-softmax) of size (B x nexperts x max_forecast_steps). The second variable is None if nfree_experts=0, else has size (nfree_experts x max_forecast_steps) which is the forecasted values by nfree_experts number of experts.
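The logit output described for moe_model has size (B x nexperts x max_forecast_steps) and is turned into expert probabilities by a softmax over the expert dimension. The following pure-Python sketch shows that normalization on nested lists; the function name is hypothetical and a real implementation would use a tensor library instead.

```python
import math
from typing import List


def softmax_over_experts(logits: List[List[List[float]]]) -> List[List[List[float]]]:
    """Softmax along dim 1 of a (B x nexperts x max_forecast_steps) nested list (hypothetical helper)."""
    probs = []
    for sample in logits:  # sample has shape nexperts x max_forecast_steps
        n_steps = len(sample[0])
        out = [[0.0] * n_steps for _ in sample]
        for t in range(n_steps):
            column = [expert[t] for expert in sample]
            peak = max(column)  # subtract the max for numerical stability
            exps = [math.exp(v - peak) for v in column]
            total = sum(exps)
            for e, val in enumerate(exps):
                out[e][t] = val / total
        probs.append(out)
    return probs


logits = [[[0.0, 2.0], [0.0, 0.0]]]  # B=1, nexperts=2, max_forecast_steps=2
probs = softmax_over_experts(logits)
```

For every sample and forecast step, the resulting probabilities sum to 1 across the experts.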

config_class

alias of MoE_ForecasterEnsembleConfig

property moe_model
property nexperts
property batch_size: int
Return type

int

property lr: int
Return type

int

property warmup_steps: int
Return type

int

property epoch_max: int
Return type

int

property nfree_experts: int
Return type

int

property use_gpu: int
Return type

int

property lookback_len: int
Return type

int

train(train_data, train_config=None)

Trains the forecaster on the input time series.

Parameters
  • train_data (TimeSeries) – a TimeSeries of metric values to train the model.

  • train_config (Optional[EnsembleTrainConfig]) – Additional training configs, if needed. Only required for some models.

Return type

Tuple[Optional[TimeSeries], Optional[TimeSeries]]

Returns

the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data

finetune(train_data, train_config=None)

This function expects the external experts to be already trained. This function extracts the predictions of external experts (if any) and stores them. It then uses them along with the training data to train the MoE model to perform expert selection and forecasting. This function is called internally by the train function.

forecast(time_stamps, time_series_prev=None, apply_transform=True, return_iqr=False, return_prev=False, expert_idx=None, mode='max', use_gpu=False)

Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.

Parameters
  • time_stamps (List[int]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.

  • time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_stamps. If given, we use it to initialize the time series model. Otherwise, we assume that time_stamps immediately follows the training data.

  • return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.

  • return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.

Returns

(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.

  • forecast: the forecast for the timestamps given

  • forecast_stderr: the standard error of each forecast value.

    May be None.

  • forecast_lb: 25th percentile of forecast values for each timestamp

  • forecast_ub: 75th percentile of forecast values for each timestamp

batch_forecast(time_stamps_list, time_series_prev_list, return_iqr=False, return_prev=False, apply_transform=True, expert_idx=None, mode='max', use_gpu=False)

Returns the ensemble’s forecast on the given batch of timestamps. Note that inverse transforms are applied to the forecasts returned by this function.

Parameters
  • time_stamps_list (List[List[int]]) – a list of lists of timestamps we wish to forecast for

  • time_series_prev_list (List[TimeSeries]) – a list of TimeSeries immediately preceding the time stamps in time_stamps_list

  • return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.

  • return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.

  • apply_transform – bool. Whether or not to apply transform to the inputs. Use False if transform has already been applied.

Return type

Tuple[List[TimeSeries], List[Optional[TimeSeries]]]

Returns

(List of TimeSeries of forecasts, List of TimeSeries of standard errors)

  • forecasts (np array): the forecast for the timestamps given, of size (B x nexperts x max_forecast_steps)

  • probs (np array): the expert probabilities for each forecast made, of size (B x nexperts x max_forecast_steps), sum of probs is 1 along dim 1

expert_prediction(expert_preds, probs, mode='max', use_gpu=False)

This function can take the outputs provided by batch_forecast or forecast of this class to get the final forecast value and allows the user to choose which strategy to use to combine different experts.

Parameters
  • expert_preds – np array of size (B x nexperts x max_forecast_steps)

  • probs – np array of size (B x nexperts x max_forecast_steps)

  • mode – either mean or max. Max picks the expert with the highest confidence; mean computes the weighted average.

  • use_gpu – set True if GPU available for faster speed

Returns

y_pred of size (B x max_forecast_steps) and std of size (B x max_forecast_steps)
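The two combination strategies can be illustrated with a small standalone sketch on nested lists of shape (B x nexperts x max_forecast_steps). The function name combine_experts is hypothetical, ties in max mode are resolved in favour of the first expert as an arbitrary choice, and the std output is omitted; none of this is taken from the library's implementation.

```python
from typing import List


def combine_experts(
    expert_preds: List[List[List[float]]],
    probs: List[List[List[float]]],
    mode: str = "max",
) -> List[List[float]]:
    """Combine expert forecasts into one value per sample and step (hypothetical helper)."""
    if mode not in ("max", "mean"):
        raise ValueError("mode must be 'max' or 'mean'")
    y_pred = []
    for preds_b, probs_b in zip(expert_preds, probs):
        row = []
        for t in range(len(preds_b[0])):
            p_t = [expert[t] for expert in probs_b]  # expert probabilities at step t
            y_t = [expert[t] for expert in preds_b]  # expert forecasts at step t
            if mode == "max":
                # Use the forecast of the most confident expert.
                best = max(range(len(p_t)), key=p_t.__getitem__)
                row.append(y_t[best])
            else:
                # Probability-weighted average of all expert forecasts.
                row.append(sum(p * y for p, y in zip(p_t, y_t)))
        y_pred.append(row)
    return y_pred


expert_preds = [[[1.0, 10.0], [3.0, 20.0]]]  # B=1, nexperts=2, max_forecast_steps=2
probs = [[[0.75, 0.5], [0.25, 0.5]]]
y_max = combine_experts(expert_preds, probs, mode="max")
y_mean = combine_experts(expert_preds, probs, mode="mean")
```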

evaluate(data, mode='mean', expert_idx=None, use_gpu=True, use_batch_forecast=True, bs=64, confidence_thres=0.1)

This function takes a time series and performs an overall evaluation on it using the sMAPE metric. It uses many if-else branches to satisfy the use_gpu and use_batch_forecast conditions specified by the user.

Parameters
  • data – TimeSeries object

  • mode – either mean or max. Max picks the expert with the highest confidence; mean computes the weighted average.

  • expert_idx – if None, MoE uses all the experts provided and uses the strategy specified by mode to forecast. If the value is an int (e.g. 0), MoE only uses the external expert at the corresponding index of the expert models provided to MoE to make forecasts.

  • use_gpu – set True if GPU available for faster speed

  • use_batch_forecast – set True for higher speed

  • bs – batch size used to go through the data in chunks

  • confidence_thres – threshold used to determine whether the MoE output is considered confident on a sample. MoE confidence is calculated as forecast-standard-deviation/abs(forecast value), where forecast-standard-deviation is the standard deviation of the forecasts made by all the experts.
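The confidence ratio and the sMAPE metric mentioned above can be sketched as follows. Several details are assumptions for illustration only: the use of the population standard deviation, treating a ratio at or below confidence_thres as confident, and the 200 * |y - yhat| / (|y| + |yhat|) form of sMAPE, which is one common convention. The sketch also assumes non-zero forecasts and targets.

```python
from statistics import pstdev
from typing import Sequence


def is_confident(expert_forecasts: Sequence[float], forecast: float, confidence_thres: float = 0.1) -> bool:
    """Treat a forecast as confident when the experts' relative spread is small (assumed rule)."""
    spread = pstdev(expert_forecasts)  # standard deviation of the forecasts made by all experts
    return spread / abs(forecast) <= confidence_thres


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """sMAPE in percent, using the 200 * |y - yhat| / (|y| + |yhat|) convention."""
    terms = [200.0 * abs(y - yh) / (abs(y) + abs(yh)) for y, yh in zip(y_true, y_pred)]
    return sum(terms) / len(terms)


agree = is_confident([9.5, 10.5], forecast=10.0)     # ratio 0.05
disagree = is_confident([5.0, 15.0], forecast=10.0)  # ratio 0.5
score = smape([1.0, 3.0], [3.0, 1.0])
```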

save(dirname, **save_config)
Parameters
  • dirname (str) – directory to save the model

  • save_config – additional configurations (if needed)

classmethod load(dirname, **kwargs)

Note: if a user specified model was used while saving the MoE ensemble, specify argument moe_model when calling the load function with the pytorch model that was used in the original MoE ensemble. If moe_model is not specified, it will be assumed that the default Pytorch network was used. Any discrepancy between the saved model state and model used here will raise an error.

Parameters

dirname (str) – directory to load the model from