ensemble
Ensembles of models and automated model selection.
ensemble.base – Base class for ensembles of models.
ensemble.combine – Rules for combining the outputs of multiple time series models.
ensemble.anomaly – Ensembles of anomaly detectors.
ensemble.forecast – Ensembles of forecasters.
ensemble.base
Base class for ensembles of models.
- class merlion.models.ensemble.base.EnsembleConfig(models=None, combiner=None, transform=None, **kwargs)
Bases: Config
An ensemble config contains each individual model in the ensemble, as well as the Combiner object used to combine those models’ outputs. The rationale behind placing the model objects in the EnsembleConfig (rather than in the Ensemble itself) is discussed in more detail in the documentation for LayeredModel.
- Parameters
models (Optional[List[Union[ModelBase, Dict]]]) – A list of models or dicts representing them.
combiner (Optional[CombinerBase]) – The CombinerBase object to combine the outputs of the models in the ensemble.
transform – Transformation to pre-process input time series.
kwargs – Any additional kwargs for Config
- to_dict(_skipped_keys=None)
- Returns
dict with keyword arguments used to initialize the config class.
- class merlion.models.ensemble.base.EnsembleTrainConfig(valid_frac, per_model_train_configs=None)
Bases: object
Config object describing how to train an ensemble.
- Parameters
valid_frac – fraction of training data to use for validation.
per_model_train_configs – list of train configs to use for individual models, one per model. None means that you use the default for all models. Specifying None for an individual model means that you use the default for that model.
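The two parameters above can be illustrated with a minimal pure-Python sketch. It is not Merlion's implementation: the helper names split_train_valid and resolve_train_configs are hypothetical, plain lists stand in for TimeSeries objects, and holding out the most recent points for validation is an assumption.

```python
from typing import Any, List, Optional, Sequence, Tuple


def split_train_valid(
    series: Sequence[float], valid_frac: float
) -> Tuple[List[float], Optional[List[float]]]:
    """Hold out the trailing `valid_frac` of `series` for validation.

    Hypothetical helper: assumes a chronological split where the most
    recent points form the validation set.
    """
    if not 0.0 <= valid_frac < 1.0:
        raise ValueError("valid_frac must be in [0, 1)")
    n_valid = int(len(series) * valid_frac)
    if n_valid == 0:
        # Nothing is held out, so there is no validation data.
        return list(series), None
    return list(series[:-n_valid]), list(series[-n_valid:])


def resolve_train_configs(
    per_model_train_configs: Optional[Sequence[Any]],
    n_models: int,
    default: Any = None,
) -> List[Any]:
    """Expand `per_model_train_configs` into one entry per model.

    None for the whole argument means "default for every model";
    None for a single entry means "default for that model".
    """
    if per_model_train_configs is None:
        return [default] * n_models
    if len(per_model_train_configs) != n_models:
        raise ValueError("expected one train config per model")
    return [default if cfg is None else cfg for cfg in per_model_train_configs]


train, valid = split_train_valid(list(range(10)), valid_frac=0.2)
configs = resolve_train_configs([None, {"epochs": 5}], n_models=2, default={})
```

With valid_frac=0.0 the sketch returns None for the validation part, which matches the Optional second element in the return type of train_valid_split.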
- class merlion.models.ensemble.base.EnsembleBase(config=None, models=None)
Bases: ModelBase
An abstract class representing an ensemble of multiple models.
- Parameters
config (Optional[EnsembleConfig]) – The ensemble’s config
models (Optional[List[ModelBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.models.
- config_class
alias of EnsembleConfig
- property models
- property combiner: CombinerBase
- Returns
the object used to combine model outputs.
- reset()
Resets the model’s internal state.
- property models_used
- train_valid_split(transformed_train_data, train_config)
- Return type
Tuple[TimeSeries, Optional[TimeSeries]]
- get_max_common_horizon(train_data=None)
- train_combiner(all_model_outs, target, **kwargs)
- Return type
- save(dirname, save_only_used_models=False, **save_config)
Saves the ensemble of models.
- Parameters
dirname (str) – directory to save the ensemble to
save_only_used_models – whether to save only the models that are actually used by the ensemble.
save_config – additional save config arguments
- to_bytes(save_only_used_models=False, **save_config)
Converts the entire model state and configuration to a single byte object.
- Parameters
save_only_used_models – whether to save only the models that are actually used by the ensemble.
save_config – additional configurations (if needed)
ensemble.combine
Rules for combining the outputs of multiple time series models.
- class merlion.models.ensemble.combine.CombinerBase(abs_score=False)
Bases: object
Abstract base class for combining the outputs of multiple models. Subclasses should implement the abstract method _combine_univariates. All combiners are callable objects.
- __call__(all_model_outs, target, _check_dim=True)
Applies the model combination rule to combine multiple model outputs.
- Parameters
all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (TimeSeries) – a target time series (e.g. labels)
- Return type
TimeSeries
- Returns
a single time series of combined model outputs on this training data.
- Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- reset()
- property requires_training
- to_dict(_skipped_keys=None)
- classmethod from_dict(state)
- set_model_used(i, used)
- get_model_used(i)
- property models_used: List[bool]
- Returns
which models are actually used to make predictions.
- train(all_model_outs, target=None, **kwargs)
Trains the model combination rule.
- Parameters
all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (Optional[TimeSeries]) – a target time series (e.g. labels)
- Return type
TimeSeries
- Returns
a single time series of combined model outputs on this training data.
- class merlion.models.ensemble.combine.Mean(abs_score=False)
Bases: CombinerBase
Combines multiple models by taking their mean prediction.
- Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- property weights: ndarray
- class merlion.models.ensemble.combine.Median(abs_score=False)
Bases: CombinerBase
Combines multiple models by taking their median prediction.
- Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- class merlion.models.ensemble.combine.Max(abs_score=False)
Bases: CombinerBase
Combines multiple models by taking their max prediction.
- Parameters
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
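The Mean, Median and Max rules above all reduce the models' outputs at each timestamp to a single value. The following is a minimal pure-Python sketch of that idea, not Merlion's code: the function combine is hypothetical, plain lists of floats stand in for aligned TimeSeries, and applying abs_score to each model output before combining is one simple reading of the parameter description.

```python
from statistics import mean, median
from typing import Callable, List, Sequence


def combine(
    all_model_outs: Sequence[Sequence[float]],
    rule: Callable[[Sequence[float]], float],
    abs_score: bool = False,
) -> List[float]:
    """Combine aligned model outputs one timestamp at a time.

    `rule` reduces the values the models produced at one timestamp to a
    single number (for example mean, median or max).
    """
    if len({len(out) for out in all_model_outs}) != 1:
        raise ValueError("all model outputs must have the same length")
    combined = []
    for values in zip(*all_model_outs):
        if abs_score:
            # Assumption: absolute values are taken before combining.
            values = [abs(v) for v in values]
        combined.append(rule(values))
    return combined


# Three hypothetical models scoring the same three timestamps.
outs = [
    [1.0, -4.0, 2.0],
    [3.0, 2.0, -6.0],
    [2.0, -1.0, 1.0],
]
mean_out = combine(outs, mean)
median_out = combine(outs, median)
max_out = combine(outs, max)
abs_max_out = combine(outs, max, abs_score=True)
```

Taking absolute values matters for anomaly scores, where a large negative score is as notable as a large positive one: at the third timestamp the plain max is 2.0, while the abs_score variant reports 6.0.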
- class merlion.models.ensemble.combine.ModelSelector(metric, abs_score=False)
Bases: Mean
Takes the mean of the best models, where the models are ranked according to the value of an evaluation metric.
- Parameters
metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- property invert
- property requires_training
- to_dict(_skipped_keys=None)
- classmethod from_dict(state)
- train(all_model_outs, target=None, **kwargs)
Trains the model combination rule.
- Parameters
all_model_outs (List[TimeSeries]) – a list of time series, with each time series representing the output of a single model.
target (Optional[TimeSeries]) – a target time series (e.g. labels)
- Return type
TimeSeries
- Returns
a single time series of combined model outputs on this training data.
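As a rough illustration of model selection as described for ModelSelector, the sketch below scores each model against the target, keeps the models tied for the best score, and averages their outputs. It is an assumption-laden sketch rather than Merlion's implementation: rmse, select_models and mean_of_used are hypothetical helpers, plain lists stand in for TimeSeries, and the lower_is_better flag is only a guess at the role of the invert property.

```python
from math import sqrt
from typing import Callable, List, Sequence


def rmse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Root mean squared error between a prediction and the target."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def select_models(
    all_model_outs: Sequence[Sequence[float]],
    target: Sequence[float],
    metric: Callable[[Sequence[float], Sequence[float]], float] = rmse,
    lower_is_better: bool = True,
) -> List[bool]:
    """Return a mask marking the models tied for the best metric value."""
    scores = [metric(out, target) for out in all_model_outs]
    best = min(scores) if lower_is_better else max(scores)
    return [score == best for score in scores]


def mean_of_used(
    all_model_outs: Sequence[Sequence[float]], models_used: Sequence[bool]
) -> List[float]:
    """Average the outputs of the selected models at each timestamp."""
    used = [out for out, keep in zip(all_model_outs, models_used) if keep]
    return [sum(values) / len(values) for values in zip(*used)]


target = [1.0, 2.0, 3.0]
outs = [
    [1.0, 2.0, 3.0],  # perfect forecast
    [2.0, 3.0, 4.0],  # off by one everywhere
    [1.0, 2.0, 3.0],  # also perfect, so it ties with the first model
]
models_used = select_models(outs, target)
combined = mean_of_used(outs, models_used)
```

The boolean mask plays the same role as the models_used property documented on CombinerBase: it records which models contribute to the combined prediction.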
- class merlion.models.ensemble.combine.MetricWeightedMean(metric, abs_score=False)
Bases: ModelSelector
Computes a weighted average of model outputs with weights proportional to the metric values (or their inverses).
- Parameters
metric (Union[str, TSADMetric, ForecastMetric]) – the evaluation metric to use
abs_score – whether to take the absolute value of the model outputs. Useful for anomaly detection.
- property weights: ndarray
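A small sketch of weights "proportional to the metric values (or their inverses)" follows. It is illustrative only: metric_weights and weighted_mean are hypothetical helpers, the invert flag is assumed to mean that lower metric values are better (as with an error metric), and normalizing the weights to sum to 1 is an assumption.

```python
from typing import List, Sequence


def metric_weights(metric_values: Sequence[float], invert: bool) -> List[float]:
    """Turn per-model metric values into weights that sum to 1.

    With invert=True the weights are proportional to 1 / metric, so a
    model with a smaller error receives a larger weight.
    """
    if invert and any(v == 0 for v in metric_values):
        raise ValueError("cannot invert a metric value of zero")
    raw = [1.0 / v if invert else v for v in metric_values]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(
    all_model_outs: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of the model outputs at each timestamp."""
    return [
        sum(w * v for w, v in zip(weights, values))
        for values in zip(*all_model_outs)
    ]


# Score-like metric (higher is better): weights follow the metric directly.
direct = metric_weights([1.0, 3.0], invert=False)
# Error-like metric (lower is better): weights follow the inverse.
inverted = metric_weights([1.0, 3.0], invert=True)
combined = weighted_mean([[0.0, 4.0], [4.0, 8.0]], direct)
```

Unlike model selection, every model contributes here; a weaker model is only down-weighted rather than dropped.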
ensemble.anomaly
Ensembles of anomaly detectors.
- class merlion.models.ensemble.anomaly.DetectorEnsembleConfig(enable_calibrator=False, max_score: float = 1000, threshold=None, enable_threshold=True, transform: TransformBase = None, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, **kwargs)
Bases: DetectorConfig, EnsembleConfig
Config class for an ensemble of anomaly detectors.
Base class of the object used to configure an anomaly detection model.
- Parameters
enable_calibrator – Whether to enable calibration of the ensemble anomaly score. False by default.
max_score – maximum possible uncalibrated anomaly score
threshold – the rule to use for thresholding anomaly scores
enable_threshold – whether to enable the thresholding rule when post-processing anomaly scores
transform – Transformation to pre-process input time series.
models – A list of models or dicts representing them.
combiner – The CombinerBase object to combine the outputs of the models in the ensemble.
kwargs – Any additional kwargs for EnsembleConfig or DetectorConfig
- property per_model_threshold
- Returns
whether to apply the thresholding rules of each individual model, before combining their outputs. Only done if doing model selection.
- class merlion.models.ensemble.anomaly.DetectorEnsembleTrainConfig(valid_frac=0.0, per_model_train_configs=None, per_model_post_rule_train_configs=None)
Bases: EnsembleTrainConfig
Config object describing how to train an ensemble of anomaly detectors.
- Parameters
valid_frac – fraction of training data to use for validation.
per_model_train_configs – list of train configs to use for individual models, one per model. None means that you use the default for all models. Specifying None for an individual model means that you use the default for that model.
per_model_post_rule_train_configs – list of post-rule train configs to use for individual models, one per model. None means that you use the default for all models. Specifying None for an individual model means that you use the default for that model.
- class merlion.models.ensemble.anomaly.DetectorEnsemble(config=None, models=None)
Bases: EnsembleBase, DetectorBase
Class representing an ensemble of multiple anomaly detection models.
- Parameters
config (Optional[DetectorEnsembleConfig]) – model configuration
- config_class
alias of DetectorEnsembleConfig
- property require_even_sampling: bool
Whether the model assumes that training data is sampled at a fixed frequency
- property require_univariate: bool
Whether the model only works with univariate time series.
- property per_model_threshold
- Returns
whether to apply the threshold rule of each individual model before aggregating their anomaly scores.
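The effect of per_model_threshold can be sketched in plain Python. This is a hedged illustration, not Merlion's post-processing: apply_threshold and combine_scores are hypothetical helpers, thresholding is assumed to zero out scores whose magnitude is below a cutoff, and the largest absolute score is used as a stand-in combination rule.

```python
from typing import List, Sequence


def apply_threshold(scores: Sequence[float], threshold: float) -> List[float]:
    """Zero out anomaly scores whose magnitude is below `threshold`."""
    return [s if abs(s) >= threshold else 0.0 for s in scores]


def combine_scores(
    all_scores: Sequence[Sequence[float]],
    thresholds: Sequence[float],
    per_model_threshold: bool,
) -> List[float]:
    """Combine anomaly scores, optionally thresholding each model first."""
    if per_model_threshold:
        all_scores = [
            apply_threshold(scores, t) for scores, t in zip(all_scores, thresholds)
        ]
    # Stand-in combination rule: largest absolute score per timestamp.
    return [max(abs(v) for v in values) for values in zip(*all_scores)]


# Two hypothetical detectors with different alarm thresholds.
scores = [[0.5, 3.0, -4.0], [2.5, 0.2, 1.0]]
thresholds = [2.0, 3.0]
thresholded_first = combine_scores(scores, thresholds, per_model_threshold=True)
combined_raw = combine_scores(scores, thresholds, per_model_threshold=False)
```

At the first timestamp the second detector's score of 2.5 is below its own threshold of 3.0, so it is suppressed when thresholds are applied per model but survives when the raw scores are combined.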
ensemble.forecast
Ensembles of forecasters.
- class merlion.models.ensemble.forecast.ForecasterEnsembleConfig(max_forecast_steps=None, target_seq_index=None, verbose=False, exog_transform: TransformBase = None, exog_aggregation_policy: Union[AggregationPolicy, str] = 'Mean', exog_missing_value_policy: Union[MissingValuePolicy, str] = 'ZFill', invert_transform=None, transform: TransformBase = None, models: List[Union[ModelBase, Dict]] = None, combiner: CombinerBase = None, **kwargs)
Bases: ForecasterExogConfig, EnsembleConfig
Config class for an ensemble of forecasters.
- Parameters
max_forecast_steps – Maximum number of steps we would like to forecast for. Required for some models like MSES.
target_seq_index – The index of the univariate (amongst all univariates in a general multivariate time series) whose value we would like to forecast.
exog_transform – The pre-processing transform for exogenous data. Note: resampling is handled separately.
exog_aggregation_policy – The policy to use for aggregating values in exogenous data, to ensure it is sampled at the same timestamps as the endogenous data.
exog_missing_value_policy – The policy to use for imputing missing values in exogenous data, to ensure it is sampled at the same timestamps as the endogenous data.
invert_transform – Whether to automatically invert the transform before returning a forecast. By default, we will invert the transform for all base forecasters if it supports a proper inversion, but we will not invert it for forecaster-based anomaly detectors or transforms without proper inversions.
transform – Transformation to pre-process input time series.
models – A list of models or dicts representing them.
combiner – The CombinerBase object to combine the outputs of the models in the ensemble.
kwargs – Any additional kwargs for Config
- property target_seq_index
- class merlion.models.ensemble.forecast.ForecasterEnsemble(config=None, models=None)
Bases: EnsembleBase, ForecasterExogBase
Class representing an ensemble of multiple forecasting models.
- Parameters
config (Optional[ForecasterEnsembleConfig]) – The ensemble’s config
models (Optional[List[ForecasterBase]]) – The models in the ensemble. Only provide this argument if you did not specify config.models.
- config_class
alias of ForecasterEnsembleConfig
- property require_even_sampling: bool
Whether the model assumes that training data is sampled at a fixed frequency
- train_pre_process(train_data, exog_data=None, return_exog=None)
Applies pre-processing steps common for training most models.
- Parameters
train_data (TimeSeries) – the original time series of training data
- Return type
Union[TimeSeries, Tuple[TimeSeries, Optional[TimeSeries]]]
- Returns
the training data, after any necessary pre-processing has been applied
- resample_time_stamps(time_stamps, time_series_prev=None)
- train_combiner(all_model_outs, target, **kwargs)
- Return type