merlion.models.ensemble package
Ensembles of models and automated model selection.
- base: Base class for ensembles of models.
- combine: Rules for combining the outputs of multiple time series models.
- anomaly: Ensembles of anomaly detectors.
- forecast: Ensembles of forecasters.
- MoE_forecast: Mixture of Expert forecasters.
Submodules
merlion.models.ensemble.base module
Base class for ensembles of models.
- class merlion.models.ensemble.base.EnsembleConfig(model_configs=None, combiner=None, **kwargs)
Bases:
Config
An ensemble config contains the configs of each individual model in the ensemble, as well as the combiner object to combine those models’ outputs.
- Parameters
model_configs (Optional[List[Tuple[str, Union[Config, Dict]]]]) – A list of (class_name, config) tuples, where class_name is the name of the model’s class (as you would provide to the ModelFactory), and config is its config or a dict. Note that model_configs is not serialized by EnsembleConfig.to_dict! The individual models are handled by EnsembleBase.save. If model_configs is not provided, you are expected to provide the models directly when initializing the EnsembleBase.
combiner (Optional[CombinerBase]) – The combiner object to combine the outputs of the models in the ensemble.
kwargs – Any additional kwargs for Config
- to_dict(_skipped_keys=None)
- Returns
dict with keyword arguments used to initialize the config class.
- class merlion.models.ensemble.base.EnsembleTrainConfig(valid_frac, per_model_train_configs=None)
Bases:
object
Config object describing how to train an ensemble.
- Parameters
valid_frac – fraction of training data to use for validation.
per_model_train_configs – list of train configs to use for individual models, one per model. None means that you use the default for all models. Specifying None for an individual model means that you use the default for that model.
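As a rough illustration of what valid_frac controls, the sketch below holds out a fraction of a series for validation. The helper name split_train_valid and the choice to hold out the chronological tail are assumptions made for this example; the documentation above only states that valid_frac is the fraction of training data used for validation.

```python
from typing import List, Tuple


def split_train_valid(values: List[float], valid_frac: float) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: hold out the last `valid_frac` of a series for validation."""
    if not 0.0 <= valid_frac < 1.0:
        raise ValueError("valid_frac must be in [0, 1)")
    # Assumption: the validation set is the most recent part of the series.
    n_valid = int(round(len(values) * valid_frac))
    split = len(values) - n_valid
    return values[:split], values[split:]


train, valid = split_train_valid([float(i) for i in range(10)], valid_frac=0.2)
print(len(train), len(valid))
```

With valid_frac=0.2 and ten points, the first eight points are kept for training and the last two are held out.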
- class merlion.models.ensemble.base.EnsembleBase(config=None, models=None)
Bases: ModelBase, ABC
An abstract class representing an ensemble of multiple models.
Initializes the ensemble according to the specified config.
- Parameters
config (Optional[EnsembleConfig]) – The ensemble’s config
models (Optional[List[ModelBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.model_configs, and you want to initialize an ensemble from models that have already been constructed.
- config_class
alias of
EnsembleConfig
- train_valid_split(transformed_train_data, train_config)
- Return type
Tuple[TimeSeries, TimeSeries]
- get_max_common_horizon()
- truncate_valid_data(transformed_valid_data)
- train_combiner(all_model_outs, target)
- Return type
- property combiner: CombinerBase
- Return type
- Returns
the object used to combine model outputs.
- reset()
Resets the model’s internal state.
- property models_used
- save(dirname, save_only_used_models=False, **save_config)
Saves the ensemble of models.
- Parameters
dirname (str) – directory to save the ensemble to
save_only_used_models – whether to save only the models that are actually used by the ensemble.
save_config – additional save config arguments
- classmethod load(dirname, **kwargs)
- Parameters
dirname (str) – directory to load model (and config) from
kwargs – config params to override manually
- Returns
ModelBase object loaded from file
- to_bytes(save_only_used_models=False, **save_config)
Converts the entire ensemble to a single byte object.
- Parameters
save_only_used_models – whether to save only the models that are actually used by the ensemble.
save_config – additional save config arguments
- Returns
bytes object representing the model.
- classmethod from_bytes(obj, **kwargs)
Creates a fully specified model from a byte object
- Parameters
obj – byte object to convert into a model
- Returns
EnsembleBase object loaded from obj
merlion.models.ensemble.combine module
Rules for combining the outputs of multiple time series models.
- class merlion.models.ensemble.combine.CombinerBase(abs_score=False)
Bases:
object
Abstract base class for combining the outputs of multiple models. Subclasses should implement the abstract method _combine_univariates. All combiners are callable objects.
- __call__(all_model_outs, target, _check_dim=True)
Applies the model combination rule to combine multiple model outputs.
- Parameters
all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (TimeSeries) – a target time series (e.g. labels)
- Return type
- Returns
a single time series of combined model outputs on this training data.
- Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- property requires_training
- to_dict(_skipped_keys=None)
- classmethod from_dict(state)
- property models_used: List[bool]
- Return type
List[bool]
- Returns
which models are actually used to make predictions.
- train(all_model_outs, target=None)
Trains the model combination rule.
- Parameters
all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (Optional[TimeSeries]) – a target time series (e.g. labels)
- Return type
- Returns
a single time series of combined model outputs on this training data.
- class merlion.models.ensemble.combine.Mean(abs_score=False)
Bases:
CombinerBase
Combines multiple models by taking their mean prediction.
- Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- property weights: ndarray
- Return type
ndarray
- class merlion.models.ensemble.combine.Median(abs_score=False)
Bases:
CombinerBase
Combines multiple models by taking their median prediction.
- Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- class merlion.models.ensemble.combine.Max(abs_score=False)
Bases:
CombinerBase
Combines multiple models by taking their max prediction.
- Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
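The Mean, Median and Max rules above can be pictured as a pointwise reduction across model outputs. The sketch below uses plain lists of floats instead of TimeSeries objects, and the function name combine is hypothetical; it is an illustration of the idea, not the library's implementation.

```python
from statistics import mean, median
from typing import Callable, List, Sequence


def combine(all_model_outs: List[List[float]],
            rule: Callable[[Sequence[float]], float],
            abs_score: bool = False) -> List[float]:
    """Apply `rule` at each timestamp across the outputs of all models."""
    combined = []
    for values in zip(*all_model_outs):  # one tuple of model outputs per timestamp
        if abs_score:
            # Mirrors the documented abs_score option: use absolute values first.
            values = [abs(v) for v in values]
        combined.append(rule(values))
    return combined


# Three models, three timestamps each (made-up numbers).
outs = [[1.0, -4.0, 2.0], [3.0, 1.0, 2.0], [2.0, 0.0, -4.0]]
mean_out = combine(outs, mean)
median_out = combine(outs, median)
max_out = combine(outs, max, abs_score=True)
print(mean_out, median_out, max_out)
```

Taking absolute values before the maximum is the case where abs_score matters most: a strongly negative anomaly score is otherwise hidden by any positive one.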
- class merlion.models.ensemble.combine.ModelSelector(metric, abs_score=False)
Bases:
Mean
Takes the mean of the best models, where the models are ranked according to the value of an evaluation metric.
- Parameters
metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- property invert
- property requires_training
- to_dict(_skipped_keys=None)
- classmethod from_dict(state)
- property models_used: List[bool]
- Return type
List[bool]
- Returns
which models are actually used to make predictions.
- train(all_model_outs, target=None, **kwargs)
Trains the model combination rule.
- Parameters
all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (Optional[TimeSeries]) – a target time series (e.g. labels)
- Return type
- Returns
a single time series of combined model outputs on this training data.
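A minimal sketch of the model-selection idea follows, under several assumptions: mean absolute error stands in for the evaluation metric, an invert flag is taken to mean "lower is better", and ties for the best score are all kept. The helper names are hypothetical and the library's own ranking logic may differ.

```python
from typing import List


def mean_abs_error(pred: List[float], target: List[float]) -> float:
    """Stand-in evaluation metric for this sketch (lower is better)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def select_models(all_model_outs: List[List[float]], target: List[float],
                  invert: bool = True) -> List[bool]:
    """Flag the models whose metric value ties for best (assumed semantics)."""
    scores = [mean_abs_error(out, target) for out in all_model_outs]
    best = min(scores) if invert else max(scores)
    return [score == best for score in scores]


def combine_selected(all_model_outs: List[List[float]], models_used: List[bool]) -> List[float]:
    """Average only the outputs of the selected models, timestamp by timestamp."""
    selected = [out for out, used in zip(all_model_outs, models_used) if used]
    return [sum(vals) / len(vals) for vals in zip(*selected)]


target = [1.0, 2.0, 3.0]
outs = [[1.0, 2.0, 4.0], [3.0, 3.0, 3.0], [1.0, 3.0, 3.0]]
models_used = select_models(outs, target)
combined = combine_selected(outs, models_used)
print(models_used, combined)
```

Here the first and third models tie for the lowest error, so the combined output is their mean and the second model is unused.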
- class merlion.models.ensemble.combine.MetricWeightedMean(metric, abs_score=False)
Bases:
ModelSelector
Computes a weighted average of model outputs with weights proportional to the metric values (or their inverses).
- Parameters
metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- property models_used: List[bool]
- Return type
List[bool]
- Returns
which models are actually used to make predictions.
- property weights: ndarray
- Return type
ndarray
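To make "weights proportional to the metric values (or their inverses)" concrete, here is a small sketch. The function names and the normalization to a sum of 1 are assumptions for illustration; inverses are the natural choice when the metric is an error, so that smaller errors earn larger weights.

```python
from typing import List


def metric_weights(metric_values: List[float], invert: bool = False) -> List[float]:
    """Weights proportional to the metric values, or to their inverses if `invert`."""
    raw = [1.0 / m for m in metric_values] if invert else list(metric_values)
    total = sum(raw)
    return [r / total for r in raw]  # assumption: normalize so the weights sum to 1


def weighted_mean(all_model_outs: List[List[float]], weights: List[float]) -> List[float]:
    """Weighted average across models at each timestamp."""
    return [sum(w * v for w, v in zip(weights, vals)) for vals in zip(*all_model_outs)]


errors = [1.0, 2.0, 4.0]  # made-up error values for three models
weights = metric_weights(errors, invert=True)
combined = weighted_mean([[7.0, 0.0], [14.0, 0.0], [28.0, 0.0]], weights)
print(weights, combined)
```

The model with the smallest error receives weight 4/7, the next 2/7 and the worst 1/7.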
merlion.models.ensemble.anomaly module
Ensembles of anomaly detectors.
- class merlion.models.ensemble.anomaly.DetectorEnsembleConfig(enable_calibrator=False, **kwargs)
Bases: DetectorConfig, EnsembleConfig
Config class for an ensemble of anomaly detectors.
- Parameters
enable_calibrator – Whether to enable calibration of the ensemble anomaly score. False by default.
kwargs – Any additional kwargs for EnsembleConfig or DetectorConfig
- property per_model_threshold
- Returns
whether to apply the thresholding rules of each individual model, before combining their outputs. Only done if doing model selection.
- class merlion.models.ensemble.anomaly.DetectorEnsemble(config=None, models=None)
Bases: EnsembleBase, DetectorBase
Class representing an ensemble of multiple anomaly detection models.
- Parameters
config (Optional[DetectorEnsembleConfig]) – model configuration
- models: List[DetectorBase]
- config_class
alias of
DetectorEnsembleConfig
- property per_model_threshold
- Returns
whether to apply the threshold rule of each individual model before aggregating their anomaly scores.
- train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None, per_model_post_rule_train_configs=None)
Trains each anomaly detector in the ensemble unsupervised, and each of their post-rules supervised (if labels are given).
- Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config (Optional[EnsembleTrainConfig]) – config for ensemble training. Not recommended.
post_rule_train_config – the post-rule train config to use for the ensemble-level post-rule.
per_model_post_rule_train_configs – the post-rule train configs to use for each of the individual models. Must be equal in length to the number of models, if given.
- Return type
- Returns
A TimeSeries of the ensemble’s anomaly scores on the training data.
- get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
- Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
- Return type
- Returns
a univariate TimeSeries of anomaly scores
merlion.models.ensemble.forecast module
Ensembles of forecasters.
- class merlion.models.ensemble.forecast.ForecasterEnsembleConfig(max_forecast_steps=None, **kwargs)
Bases: ForecasterConfig, EnsembleConfig
Config class for an ensemble of forecasters.
- Parameters
max_forecast_steps – Max # of steps we would like to forecast for. Required for some models which pre-compute a forecast, like ARIMA, SARIMA, and LSTM.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
- class merlion.models.ensemble.forecast.ForecasterEnsemble(config=None, models=None)
Bases: EnsembleBase, ForecasterBase
Class representing an ensemble of multiple forecasting models.
Initializes the ensemble according to the specified config.
- Parameters
config (Optional[ForecasterEnsembleConfig]) – The ensemble’s config
models (Optional[List[ForecasterBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.model_configs, and you want to initialize an ensemble from models that have already been constructed.
- models: List[ForecasterBase]
- config_class
alias of
ForecasterEnsembleConfig
- train_pre_process(train_data, require_even_sampling, require_univariate)
Applies pre-processing steps common for training most models.
- Parameters
train_data (TimeSeries) – the original time series of training data
require_even_sampling (bool) – whether the model assumes that training data is sampled at a fixed frequency
require_univariate (bool) – whether the model only works with univariate time series
- Return type
- Returns
the training data, after any necessary pre-processing has been applied
- train(train_data, train_config=None)
Trains the forecaster on the input time series.
- Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config (Optional[EnsembleTrainConfig]) – Additional training configs, if needed. Only required for some models.
- Return type
Tuple[Optional[TimeSeries], None]
- Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
- forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
- Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_series. If given, we use it to initialize the time series model. Otherwise, we assume that time_series immediately follows the training data.
return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
- Return type
Union[Tuple[TimeSeries, Optional[TimeSeries]], Tuple[TimeSeries, TimeSeries, TimeSeries]]
- Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp
merlion.models.ensemble.MoE_forecast module
Mixture of Expert forecasters.
- class merlion.models.ensemble.MoE_forecast.myDataset(data, lookback, forecast=1, target_seq_index=0, include_ts=False)
Bases:
Dataset
Creates a pytorch dataset.
- Parameters
data – TimeSeries object
lookback – number of time steps to lookback in order to forecast
forecast – number of steps to forecast in the future
target_seq_index – dimension of the timeseries that will be forecasted
include_ts – Bool. If True, __getitem__ also returns a TimeSeries version of the data; otherwise it is excluded.
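The lookback and forecast parameters describe a sliding-window view of the series. The sketch below shows that windowing on a plain univariate list; the function make_windows is hypothetical, and it ignores target_seq_index, include_ts and any PyTorch tensors that the real dataset would handle.

```python
from typing import List, Tuple


def make_windows(series: List[float], lookback: int,
                 forecast: int = 1) -> List[Tuple[List[float], List[float]]]:
    """Pair each `lookback`-long history with the `forecast` values that follow it."""
    samples = []
    for start in range(len(series) - lookback - forecast + 1):
        past = series[start:start + lookback]
        future = series[start + lookback:start + lookback + forecast]
        samples.append((past, future))
    return samples


samples = make_windows([0, 1, 2, 3, 4, 5], lookback=3, forecast=2)
for past, future in samples:
    print(past, "->", future)
```

A series of length 6 with lookback=3 and forecast=2 yields two samples, since each sample needs five consecutive points.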
- class merlion.models.ensemble.MoE_forecast.MoE_ForecasterEnsembleConfig(batch_size=128, lr=0.0001, warmup_steps=100, epoch_max=100, nfree_experts=0, lookback_len=10, max_forecast_steps=3, target_seq_index=0, use_gpu=True, **kwargs)
Bases: ForecasterConfig, EnsembleConfig, NormalizingConfig
Config class for MoE (mixture of experts) forecaster.
- Parameters
batch_size – batch size used for training. It is needed because MoE uses gradient-descent-based learning, and training happens over multiple epochs.
lr – learning rate of the Adam optimizer used in MoE training
warmup_steps – number of iterations used to reach lr
epoch_max – number of epochs to train the MoE model
nfree_experts – number of free expert forecast values that are trained using gradient descent
lookback_len – number of past time steps to look at in order to make future forecasts
max_forecast_steps – number of future steps to forecast
target_seq_index – index of time series to forecast. Integer value.
use_gpu – Bool. Use True if GPU available for faster speed.
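The warmup_steps parameter says only that the learning rate reaches lr after that many iterations. One common way to do this is a linear ramp, sketched below; the linear shape and the function name warmup_lr are assumptions, not something this documentation specifies. The defaults mirror the config defaults listed above.

```python
def warmup_lr(step: int, lr: float = 1e-4, warmup_steps: int = 100) -> float:
    """Learning rate at a 0-based iteration `step`, assuming a linear warmup."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return lr  # warmup finished: use the target learning rate
    # Ramp linearly so that the last warmup step reaches `lr`.
    return lr * (step + 1) / warmup_steps


schedule = [warmup_lr(step) for step in (0, 49, 99, 500)]
print(schedule)
```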
- class merlion.models.ensemble.MoE_forecast.MoE_ForecasterEnsemble(config=None, models=None, moe_model=None)
Bases: EnsembleBase, ForecasterBase
Model-based mixture of experts for forecasting.
The main class functions useful for users are:
train: used for training the MoE model (includes training external expert and training MoE model parameters)
finetune: assuming the train() function has been called once, finetune can be called if the user wants to train the MoE model params again for some reason (E.g. different optimization hyper-parameters).
forecast: given a time series, returns the forecast values and standard error as a tuple of TimeSeries objects
batch_forecast: same as forecast, but can operate on a batch of input data and outputs list of TimeSeries objects
_forecast: given a time series, returns the forecast values and confidence of all experts
_batch_forecast: same as _forecast, but can operate on a batch of input data
expert_prediction: this function operates on the output of _batch_forecast to compute a single forecast value per input by combining expert predictions using the user specified strategy (see expert_prediction function for details)
evaluate: mainly for development purpose. This function performs sMAPE evaluation for a given time series data
- Parameters
models (Optional[List[ForecasterBase]]) – list of external expert models (E.g. Sarima, Arima). Can be an empty list if nfree_experts>0 is specified.
moe_model – pytorch model that takes torch.tensor input of size (B x lookback_len x input_dim) and outputs a tuple of 2 variables. The first variable is the logit (pre-softmax) of size (B x nexperts x max_forecast_steps). The second variable is None if nfree_experts=0, else has size (nfree_experts x max_forecast_steps) which is the forecasted values by nfree_experts number of experts.
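Since moe_model returns pre-softmax logits per expert and forecast step, expert probabilities come from a softmax across the expert dimension. The sketch below does this for a single sample (shape nexperts x max_forecast_steps) with nested lists rather than tensors; the function name is hypothetical and the real model would use PyTorch operations.

```python
import math
from typing import List


def softmax_over_experts(logits: List[List[float]]) -> List[List[float]]:
    """logits[e][t] is the logit of expert e at step t; returns probabilities of the same shape."""
    n_steps = len(logits[0])
    probs = [[0.0] * n_steps for _ in logits]
    for t in range(n_steps):
        column = [row[t] for row in logits]
        peak = max(column)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in column]
        total = sum(exps)
        for e, value in enumerate(exps):
            probs[e][t] = value / total
    return probs


logits = [[2.0, 0.0], [0.0, 0.0], [-1.0, 3.0]]  # 3 experts, 2 forecast steps
probs = softmax_over_experts(logits)
print(probs)
```

At every forecast step the probabilities across experts sum to 1, which matches the "sum of probs is 1 along dim 1" convention used for batched outputs.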
- models: List[ForecasterBase]
- config_class
alias of
MoE_ForecasterEnsembleConfig
- property batch_size: int
- Return type
int
- property lr: int
- Return type
int
- property warmup_steps: int
- Return type
int
- property epoch_max: int
- Return type
int
- property nfree_experts: int
- Return type
int
- property use_gpu: int
- Return type
int
- property lookback_len: int
- Return type
int
- train(train_data, train_config=None)
Trains the forecaster on the input time series.
- Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config (Optional[EnsembleTrainConfig]) – Additional training configs, if needed. Only required for some models.
- Return type
Tuple[Optional[TimeSeries], Optional[TimeSeries]]
- Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
- finetune(train_data, train_config=None)
This function expects the external experts to be already trained. This function extracts the predictions of external experts (if any) and stores them. It then uses them along with the training data to train the MoE model to perform expert selection and forecasting. This function is called internally by the train function.
- forecast(time_stamps, time_series_prev=None, apply_transform=True, return_iqr=False, return_prev=False, expert_idx=None, mode='max', use_gpu=False)
Returns the model’s forecast on the timestamps given. Note that if self.transform is specified in the config, the forecast is a forecast of transformed values! It is up to you to manually invert the transform if desired.
- Parameters
time_stamps (List[int]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_series. If given, we use it to initialize the time series model. Otherwise, we assume that time_series immediately follows the training data.
return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
- Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
forecast: the forecast for the timestamps given
forecast_stderr: the standard error of each forecast value. May be None.
forecast_lb: 25th percentile of forecast values for each timestamp
forecast_ub: 75th percentile of forecast values for each timestamp
- batch_forecast(time_stamps_list, time_series_prev_list, return_iqr=False, return_prev=False, apply_transform=True, expert_idx=None, mode='max', use_gpu=False)
Returns the ensemble’s forecast on a batch of timestamps given. Note that inverse transforms are applied to the forecasts returned by this function.
- Parameters
time_stamps_list (List[List[int]]) – a list of lists of timestamps we wish to forecast for
time_series_prev_list (List[TimeSeries]) – a list of TimeSeries immediately preceding the time stamps in time_stamps_list
return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
apply_transform – bool. Whether or not to apply transform to the inputs. Use False if transform has already been applied.
- Return type
Tuple[List[TimeSeries], List[Optional[TimeSeries]]]
- Returns
(List of TimeSeries of forecasts, List of TimeSeries of standard errors)
forecasts (np array): the forecast for the timestamps given, of size (B x nexperts x max_forecast_steps)
probs (np array): the expert probabilities for each forecast made, of size (B x nexperts x max_forecast_steps), sum of probs is 1 along dim 1
- expert_prediction(expert_preds, probs, mode='max', use_gpu=False)
This function can take the outputs provided by batch_forecast or forecast of this class to get the final forecast value and allows the user to choose which strategy to use to combine different experts.
expert_preds: (B x nexperts x max_forecast_steps) np array
probs: (B x nexperts x max_forecast_steps) np array
mode: either mean or max. Max picks the expert with the highest confidence; mean computes the weighted average
use_gpu: set True if GPU available for faster speed
Returns:
y_pred: B x max_forecast_steps
std: B x max_forecast_steps
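The two combination modes can be sketched for a single sample as follows. The function combine_experts is a hypothetical stand-in that works on nested lists of shape nexperts x max_forecast_steps; it returns only the combined forecast and leaves out the std output and the batch dimension that the method above handles.

```python
from typing import List


def combine_experts(expert_preds: List[List[float]], probs: List[List[float]],
                    mode: str = "max") -> List[float]:
    """Combine expert forecasts for one sample, step by step."""
    n_experts, n_steps = len(expert_preds), len(expert_preds[0])
    y_pred = []
    for t in range(n_steps):
        step_probs = [probs[e][t] for e in range(n_experts)]
        step_preds = [expert_preds[e][t] for e in range(n_experts)]
        if mode == "max":
            # Use the forecast of the most confident expert at this step.
            best = max(range(n_experts), key=lambda e: step_probs[e])
            y_pred.append(step_preds[best])
        elif mode == "mean":
            # Probability-weighted average of all expert forecasts.
            y_pred.append(sum(p * v for p, v in zip(step_probs, step_preds)))
        else:
            raise ValueError("mode must be 'max' or 'mean'")
    return y_pred


preds = [[10.0, 20.0], [14.0, 10.0]]  # 2 experts, 2 forecast steps
probs = [[0.75, 0.25], [0.25, 0.75]]  # sums to 1 across experts at each step
y_max = combine_experts(preds, probs, mode="max")
y_mean = combine_experts(preds, probs, mode="mean")
print(y_max, y_mean)
```

In max mode the first step follows expert 0 and the second follows expert 1, while mean mode blends both experts at every step.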
- evaluate(data, mode='mean', expert_idx=None, use_gpu=True, use_batch_forecast=True, bs=64, confidence_thres=0.1)
Takes time series data and performs an overall evaluation on it using the sMAPE metric. The implementation branches on the use_gpu and use_batch_forecast options specified by the user.
- Parameters
data – TimeSeries object
mode – either mean or max. Max picks the expert with the highest confidence; mean computes the weighted average.
expert_idx – if None, MoE uses all the experts provided and uses the specified ‘mode’ strategy to forecast. If the value is an int (E.g. 0), MoE only uses the external expert at the corresponding index of the expert models provided to MoE to make forecasts.
use_gpu – set True if GPU available for faster speed
use_batch_forecast – set True for higher speed
bs – batch size used to go through the data in chunks
confidence_thres – threshold used to determine whether the MoE output is considered confident on a sample. MoE confidence is calculated as forecast-standard-deviation/abs(forecast value), where forecast-standard-deviation is the standard deviation of the forecasts made by all the experts.
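The two quantities mentioned for evaluate can be sketched as below. The sMAPE formula is the common symmetric definition scaled to a 0 to 200 range, which is an assumption about the exact variant used. The confidence check follows the ratio described above, with two further assumptions: a population standard deviation, and "confident" meaning the ratio does not exceed confidence_thres. Both helper names are hypothetical, and inputs are assumed to avoid zero denominators.

```python
from statistics import pstdev
from typing import List


def smape(actual: List[float], predicted: List[float]) -> float:
    """Symmetric mean absolute percentage error, in the range [0, 200]."""
    terms = [abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted)]
    return 200.0 * sum(terms) / len(terms)


def is_confident(expert_forecasts: List[float], forecast: float,
                 confidence_thres: float = 0.1) -> bool:
    """Spread of the expert forecasts relative to the magnitude of the final forecast."""
    ratio = pstdev(expert_forecasts) / abs(forecast)
    return ratio <= confidence_thres  # assumed direction: small spread means confident


score = smape([1.0, 1.0], [1.0, 3.0])
agree = is_confident([9.5, 10.5], forecast=10.0)
disagree = is_confident([5.0, 15.0], forecast=10.0)
print(score, agree, disagree)
```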
- save(dirname, **save_config)
- Parameters
dirname (str) – directory to save the model
save_config – additional configurations (if needed)
- classmethod load(dirname, **kwargs)
Note: if a user-specified model was used while saving the MoE ensemble, specify the argument moe_model when calling the load function with the pytorch model that was used in the original MoE ensemble. If moe_model is not specified, it will be assumed that the default Pytorch network was used. Any discrepancy between the saved model state and the model used here will raise an error.
- Parameters
dirname (str) – directory to load the model from