merlion.models package
Broadly, Merlion contains two types of models: anomaly detection (merlion.models.anomaly) and forecasting (merlion.models.forecast). Note that there is a distinct subset of anomaly detection models that use forecasting models at their core (merlion.models.anomaly.forecast_based).
We implement an abstract ModelBase class which provides the following functionality for all models:
- model = ModelClass(config)
  - initialization with a model-specific config (which inherits from Config)
  - configs contain:
    - a (potentially trainable) data pre-processing transform from merlion.transform; note that model.transform is a property which refers to model.config.transform
    - model-specific hyperparameters
- model.save(dirname, save_config=None)
  - saves the model to the specified directory. The model’s configuration is saved to <dirname>/config.json, while the model’s binary data is (by default) saved to <dirname>/model.pkl. Note that if you edit the saved <dirname>/config.json on disk, the changes will be loaded when you call ModelClass.load(dirname)!
  - this method heavily exploits the fact that many objects in Merlion are JSON-serializable
- ModelClass.load(dirname, **kwargs)
  - this class method initializes an instance of ModelClass from the config file saved in <dirname>/config.json (overriding any parameters of the config with kwargs where relevant), loads the remaining binary data into the model object, and returns the fully initialized model.
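The save/load contract described above (a JSON config next to a pickled state, with keyword overrides applied at load time) can be illustrated with a small standalone sketch. This is not Merlion's implementation: SketchConfig, SketchModel and the state attribute are hypothetical stand-ins, and only the file names config.json and model.pkl are taken from the text.

```python
import json
import os
import pickle
import tempfile


class SketchConfig:
    """Hypothetical JSON-serializable config (stand-in for a Config subclass)."""

    def __init__(self, wind_sz=30):
        self.wind_sz = wind_sz

    def to_dict(self):
        return {"wind_sz": self.wind_sz}


class SketchModel:
    """Hypothetical model; `state` stands in for trained binary data."""

    def __init__(self, config):
        self.config = config
        self.state = None

    def save(self, dirname):
        os.makedirs(dirname, exist_ok=True)
        # Human-readable hyperparameters go to config.json ...
        with open(os.path.join(dirname, "config.json"), "w") as f:
            json.dump(self.config.to_dict(), f)
        # ... while the remaining state is pickled to model.pkl.
        with open(os.path.join(dirname, "model.pkl"), "wb") as f:
            pickle.dump(self.state, f)

    @classmethod
    def load(cls, dirname, **kwargs):
        with open(os.path.join(dirname, "config.json")) as f:
            config_dict = json.load(f)
        # Keyword arguments override whatever was saved on disk.
        config_dict.update(kwargs)
        model = cls(SketchConfig(**config_dict))
        with open(os.path.join(dirname, "model.pkl"), "rb") as f:
            model.state = pickle.load(f)
        return model


with tempfile.TemporaryDirectory() as tmp:
    original = SketchModel(SketchConfig(wind_sz=60))
    original.state = [1.0, 2.0, 3.0]
    original.save(tmp)
    same = SketchModel.load(tmp)                    # config read from disk
    overridden = SketchModel.load(tmp, wind_sz=10)  # config overridden by kwargs
```

Because the config is stored as plain JSON, editing config.json by hand changes what the next load returns, which is the behavior the note above warns about.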
For users who aren’t familiar with the specific details of various models, we provide default models for anomaly detection and forecasting in merlion.models.defaults.
We also provide a ModelFactory which can be used to conveniently instantiate models from their name and a set of keyword arguments, or to load them directly from disk. For example, we may have the following workflow:
from merlion.models.factory import ModelFactory
from merlion.models.anomaly.windstats import WindStats, WindStatsConfig
# creates the same kind of model in 2 equivalent ways
model1a = WindStats(WindStatsConfig(wind_sz=60))
model1b = ModelFactory.create("WindStats", wind_sz=60)
# save the model & load it in 2 equivalent ways
model1a.save("tmp")
model2a = WindStats.load("tmp")
model2b = ModelFactory.load("tmp")
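To make the idea behind the workflow above concrete, here is a minimal sketch of a name-based factory. It is only an illustration of the general registry pattern and makes no claim about how ModelFactory is implemented; SketchFactory, ToyModel and ToyConfig are hypothetical names, and the sketch assumes each model class exposes a config_class attribute.

```python
class SketchFactory:
    """Hypothetical name-to-class registry; not Merlion's implementation."""

    _registry = {}

    @classmethod
    def register(cls, model_cls):
        # Index each model class by its class name.
        cls._registry[model_cls.__name__] = model_cls
        return model_cls

    @classmethod
    def create(cls, name, **kwargs):
        if name not in cls._registry:
            raise KeyError(f"unknown model name: {name}")
        model_cls = cls._registry[name]
        # Keyword arguments are forwarded to the model's config class.
        return model_cls(model_cls.config_class(**kwargs))


class ToyConfig:
    def __init__(self, wind_sz=30):
        self.wind_sz = wind_sz


@SketchFactory.register
class ToyModel:
    config_class = ToyConfig

    def __init__(self, config):
        self.config = config


# Two equivalent ways of building the same kind of model.
direct = ToyModel(ToyConfig(wind_sz=60))
by_name = SketchFactory.create("ToyModel", wind_sz=60)
```

The benefit of this design is that a caller only needs a string and a dict of hyperparameters, which is convenient when model choices come from a config file.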
Finally, we support ensembles of models in merlion.models.ensemble.
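merlion.models.base | Contains the base classes for all models.
merlion.models.factory | Contains the ModelFactory.
merlion.models.defaults | Default models for anomaly detection & forecasting that balance speed and performance.
merlion.models.anomaly | Contains all anomaly detection models.
merlion.models.anomaly.forecast_based | Contains all forecaster-based anomaly detectors.
merlion.models.forecast | Contains all forecasting models.
merlion.models.ensemble | Ensembles of models and automated model selection.
merlion.models.automl | Contains all AutoML layers.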
Subpackages
- merlion.models.anomaly package
- Subpackages
- Submodules
- merlion.models.anomaly.base module
- merlion.models.anomaly.dbl module
- merlion.models.anomaly.windstats module
- merlion.models.anomaly.isolation_forest module
- merlion.models.anomaly.random_cut_forest module
- merlion.models.anomaly.spectral_residual module
- merlion.models.anomaly.stat_threshold module
- merlion.models.anomaly.zms module
- merlion.models.anomaly.autoencoder module
- merlion.models.anomaly.vae module
- merlion.models.anomaly.dagmm module
- merlion.models.anomaly.lstm_ed module
- merlion.models.anomaly.deep_point_anomaly_detector module
- merlion.models.anomaly.forecast_based package
- Submodules
- merlion.models.anomaly.forecast_based.base module
- merlion.models.anomaly.forecast_based.arima module
- merlion.models.anomaly.forecast_based.sarima module
- merlion.models.anomaly.forecast_based.ets module
- merlion.models.anomaly.forecast_based.prophet module
- merlion.models.anomaly.forecast_based.lstm module
- merlion.models.anomaly.forecast_based.mses module
- merlion.models.forecast package
- Submodules
- merlion.models.forecast.base module
- merlion.models.forecast.arima module
- merlion.models.forecast.sarima module
- merlion.models.forecast.prophet module
- merlion.models.forecast.smoother module
- merlion.models.forecast.vector_ar module
- merlion.models.forecast.baggingtrees module
- merlion.models.forecast.boostingtrees module
- merlion.models.forecast.lstm module
- merlion.models.ensemble package
- merlion.models.automl package
Submodules
merlion.models.base module
Contains the base classes for all models.
- class merlion.models.base.Config(transform=None, **kwargs)
Bases: object
Abstract class which defines a model config.
- Parameters
transform (Optional[TransformBase]) – Transformation to pre-process input time series.
- filename = 'config.json'
- to_dict(_skipped_keys=None)
- Returns
dict with keyword arguments used to initialize the config class.
- classmethod from_dict(config_dict, return_unused_kwargs=False, **kwargs)
Constructs a Config from a Python dictionary of parameters.
- Parameters
config_dict (Dict[str, Any]) – dict that will be used to instantiate this object.
return_unused_kwargs – whether to return any unused keyword args.
kwargs – any additional parameters to set (overriding config_dict).
- Returns
Config object initialized from the dict.
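The interplay between config_dict, the overriding kwargs and return_unused_kwargs can be sketched as follows. This is a hedged illustration only: SketchConfig is hypothetical, and the choice to treat "unused" as "not accepted by the constructor" is an assumption rather than a statement about Merlion's behavior.

```python
import inspect


class SketchConfig:
    """Hypothetical config class used only for illustration."""

    def __init__(self, transform=None, wind_sz=30):
        self.transform = transform
        self.wind_sz = wind_sz

    def to_dict(self):
        return {"transform": self.transform, "wind_sz": self.wind_sz}

    @classmethod
    def from_dict(cls, config_dict, return_unused_kwargs=False, **kwargs):
        # Explicit keyword arguments take priority over config_dict.
        merged = {**config_dict, **kwargs}
        # Assumption: a key is "unused" if the constructor does not accept it.
        accepted = set(inspect.signature(cls.__init__).parameters) - {"self"}
        used = {k: v for k, v in merged.items() if k in accepted}
        unused = {k: v for k, v in merged.items() if k not in accepted}
        config = cls(**used)
        return (config, unused) if return_unused_kwargs else config


config, unused = SketchConfig.from_dict(
    {"wind_sz": 60, "not_a_param": 1}, return_unused_kwargs=True, wind_sz=10
)
round_trip = SketchConfig.from_dict(config.to_dict())
```

Here wind_sz=10 wins over the value in the dictionary, and the unrecognized key is handed back to the caller instead of raising an error.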
- class merlion.models.base.NormalizingConfig(normalize=None, **kwargs)
Bases: Config
Model config where the transform must return normalized values. Applies additional normalization after the initial data pre-processing transform.
- Parameters
normalize (Optional[Rescale]) – Pre-trained normalization transformation (optional).
- property full_transform
Returns the full transform, including the pre-processing step, lags, and final mean/variance normalization.
- property transform
- class merlion.models.base.ModelBase(config)
Bases: object
Abstract base class for models.
- filename = 'model.pkl'
- reset()
Resets the model’s internal state.
- property transform
- Returns
The data pre-processing transform to apply on any time series, before giving it to the model.
- train_pre_process(train_data, require_even_sampling, require_univariate)
Applies pre-processing steps common for training most models.
- Parameters
train_data (TimeSeries) – the original time series of training data
require_even_sampling (bool) – whether the model assumes that training data is sampled at a fixed frequency
require_univariate (bool) – whether the model only works with univariate time series
- Return type
- Returns
the training data, after any necessary pre-processing has been applied
- transform_time_series(time_series, time_series_prev=None)
Applies the model’s pre-processing transform to time_series and time_series_prev.
- Parameters
time_series (TimeSeries) – The time series
time_series_prev (Optional[TimeSeries]) – A time series of context, immediately preceding time_series. Optional.
- Return type
Tuple[TimeSeries, Optional[TimeSeries]]
- Returns
The transformed time_series.
- abstract train(train_data, train_config=None)
Trains the model on the specified time series, optionally with some additional implementation-specific config options train_config.
- Parameters
train_data (TimeSeries) – a TimeSeries to use as a training set
train_config – additional configurations (if needed)
- save(dirname, **save_config)
- Parameters
dirname (str) – directory to save the model & its config
save_config – additional configurations (if needed)
- classmethod load(dirname, **kwargs)
- Parameters
dirname (str) – directory to load model (and config) from
kwargs – config params to override manually
- Returns
ModelBase object loaded from file
- to_bytes(**save_config)
Converts the entire model state and configuration to a single byte object.
- Returns
bytes object representing the model.
- classmethod from_bytes(obj, **kwargs)
Creates a fully specified model from a byte object
- Parameters
obj – byte object to convert into a model
- Returns
ModelBase object loaded from obj
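The to_bytes / from_bytes pair amounts to a round trip of config plus state through a single bytes object. The sketch below shows one way to do this with pickle. It is an assumption-laden illustration (SketchModel, its dict-based config and the state attribute are hypothetical) and does not describe the byte layout Merlion actually uses.

```python
import pickle


class SketchModel:
    """Hypothetical model whose config is a plain dict."""

    def __init__(self, config):
        self.config = dict(config)
        self.state = None

    def to_bytes(self):
        # Bundle config and state into one payload and serialize it.
        return pickle.dumps({"config": self.config, "state": self.state})

    @classmethod
    def from_bytes(cls, obj, **kwargs):
        payload = pickle.loads(obj)
        # Keyword arguments override the serialized config values.
        model = cls({**payload["config"], **kwargs})
        model.state = payload["state"]
        return model


model = SketchModel({"wind_sz": 60})
model.state = [0.5, 1.5]
blob = model.to_bytes()
restored = SketchModel.from_bytes(blob)
tweaked = SketchModel.from_bytes(blob, wind_sz=5)
```

As with any pickle-based format, bytes should only be deserialized when they come from a trusted source.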
- class merlion.models.base.ModelWrapper(config, model=None)
Bases: ModelBase
Abstract class implementing a model that wraps around another internal model.
- filename = 'model'
- save(dirname, **save_config)
- Parameters
dirname (str) – directory to save the model & its config
save_config – additional configurations (if needed)
- classmethod load(dirname, **kwargs)
- Parameters
dirname (str) – directory to load model (and config) from
kwargs – config params to override manually
- Returns
ModelBase object loaded from file
- to_bytes(**save_config)
Converts the entire model state and configuration to a single byte object.
- Returns
bytes object representing the model.
- classmethod from_bytes(obj, **kwargs)
Creates a fully specified model from a byte object
- Parameters
obj – byte object to convert into a model
- Returns
ModelBase object loaded from obj
merlion.models.factory module
Contains the ModelFactory.
- class merlion.models.factory.ModelFactory
Bases: object
merlion.models.defaults module
Default models for anomaly detection & forecasting that balance speed and performance.
- class merlion.models.defaults.DefaultModelConfig(granularity=None, **kwargs)
Bases: Config
- Parameters
transform – Transformation to pre-process input time series.
- to_dict(_skipped_keys=None)
- Returns
dict with keyword arguments used to initialize the config class.
- class merlion.models.defaults.DefaultDetectorConfig(granularity=None, threshold=None, n_threads=1, **kwargs)
Bases: DetectorConfig, DefaultModelConfig
Config object for default anomaly detection model.
- Parameters
granularity – the granularity at which the input time series should be sampled, e.g. “5min”, “1h”, “1d”, etc.
threshold – Threshold object setting a default anomaly detection threshold in units of z-score.
n_threads (int) – the number of parallel threads to use for relevant models
- class merlion.models.defaults.DefaultDetector(config, model=None)
Bases: ModelWrapper, DetectorBase
Default anomaly detection model that balances efficiency with performance.
- Parameters
config (Config) – model configuration
- config_class
alias of DefaultDetectorConfig
- property granularity
- train(train_data, anomaly_labels=None, train_config=None, post_rule_train_config=None)
Trains the anomaly detector (unsupervised) and its post-rule (supervised, if labels are given) on the input time series.
- Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
anomaly_labels (Optional[TimeSeries]) – a TimeSeries indicating which timestamps are anomalous. Optional.
train_config – Additional training configs, if needed. Only required for some models.
post_rule_train_config – The config to use for training the model’s post-rule. The model’s default post-rule train config is used if none is supplied here.
- Return type
- Returns
A TimeSeries of the model’s anomaly scores on the training data.
- get_anomaly_score(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores.
- Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
- Return type
- Returns
a univariate TimeSeries of anomaly scores
- get_anomaly_label(time_series, time_series_prev=None)
Returns the model’s predicted sequence of anomaly scores, processed by any relevant post-rules (calibration and/or thresholding).
- Parameters
time_series (TimeSeries) – the TimeSeries we wish to predict anomaly scores for.
time_series_prev (Optional[TimeSeries]) – a TimeSeries immediately preceding time_series. If given, we use it to initialize the time series anomaly detection model. Otherwise, we assume that time_series immediately follows the training data.
- Return type
- Returns
a univariate TimeSeries of anomaly scores, filtered by the model’s post-rule
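The difference between get_anomaly_score and get_anomaly_label is that the latter passes the raw scores through a post-rule. A much simplified sketch of such a post-rule is shown below: raw scores are standardized to z-scores and anything below a threshold is zeroed out. The helper names and the threshold argument are hypothetical, and Merlion's actual calibration and thresholding may work differently.

```python
from statistics import mean, pstdev


def to_z_scores(scores):
    """Standardize raw scores (a simplified stand-in for calibration)."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return [0.0] * len(scores)
    return [(s - mu) / sigma for s in scores]


def apply_threshold(z_scores, threshold=2.0):
    """Keep scores whose magnitude reaches the threshold; zero out the rest."""
    return [z if abs(z) >= threshold else 0.0 for z in z_scores]


# Nine ordinary scores followed by one clear outlier.
raw_scores = [1.0] * 9 + [10.0]
labels = apply_threshold(to_z_scores(raw_scores), threshold=2.0)
```

Only the final point survives the filter, so downstream code can treat any non-zero value as an alarm.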
- class merlion.models.defaults.DefaultForecasterConfig(granularity=None, max_forecast_steps=100, target_seq_index=None, **kwargs)
Bases: ForecasterConfig, DefaultModelConfig
Config object for default forecasting model.
- Parameters
granularity – the granularity at which the input time series should be sampled, e.g. “5min”, “1h”, “1d”, etc.
max_forecast_steps – Max # of steps we would like to forecast for.
target_seq_index – If doing multivariate forecasting, the index of the univariate whose value you wish to forecast.
- class merlion.models.defaults.DefaultForecaster(config, model=None)
Bases: ModelWrapper, ForecasterBase
Default forecasting model that balances efficiency with performance.
- config_class
alias of DefaultForecasterConfig
- property granularity
- train(train_data, train_config=None)
Trains the forecaster on the input time series.
- Parameters
train_data (TimeSeries) – a TimeSeries of metric values to train the model.
train_config – Additional training configs, if needed. Only required for some models.
- Return type
Tuple[TimeSeries, Optional[TimeSeries]]
- Returns
the model’s prediction on train_data, in the same format as if you called ForecasterBase.forecast on the time stamps of train_data
- forecast(time_stamps, time_series_prev=None, return_iqr=False, return_prev=False)
Returns the model’s forecast on the timestamps given.
- Parameters
time_stamps (Union[int, List[int]]) – Either a list of timestamps we wish to forecast for, or the number of steps (int) we wish to forecast for.
time_series_prev (Optional[TimeSeries]) – a list of (timestamp, value) pairs immediately preceding time_series. If given, we use it to initialize the time series model. Otherwise, we assume that time_series immediately follows the training data.
return_iqr (bool) – whether to return the inter-quartile range for the forecast. Note that not all models support this option.
return_prev (bool) – whether to return the forecast for time_series_prev (and its stderr or IQR if relevant), in addition to the forecast for time_stamps. Only used if time_series_prev is provided.
- Return type
Union[Tuple[TimeSeries, Optional[TimeSeries]], Tuple[TimeSeries, TimeSeries, TimeSeries]]
- Returns
(forecast, forecast_stderr) if return_iqr is false, (forecast, forecast_lb, forecast_ub) otherwise.
- forecast: the forecast for the timestamps given
- forecast_stderr: the standard error of each forecast value. May be None.
- forecast_lb: 25th percentile of forecast values for each timestamp
- forecast_ub: 75th percentile of forecast values for each timestamp
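To relate the two return formats, the following sketch derives 25th and 75th percentile bounds from a point forecast and its standard error. It assumes Gaussian forecast errors, which is an assumption of this example and not a claim about how any Merlion model computes its inter-quartile range; iqr_from_stderr is a hypothetical helper operating on plain lists rather than TimeSeries objects.

```python
from statistics import NormalDist


def iqr_from_stderr(forecast, stderr):
    """Return (lower, upper) quartile bounds assuming normal forecast errors."""
    # z-value of the 75th percentile of a standard normal, about 0.6745.
    z75 = NormalDist().inv_cdf(0.75)
    lower = [f - z75 * s for f, s in zip(forecast, stderr)]
    upper = [f + z75 * s for f, s in zip(forecast, stderr)]
    return lower, upper


forecast = [10.0, 12.0]
forecast_stderr = [2.0, 0.0]
forecast_lb, forecast_ub = iqr_from_stderr(forecast, forecast_stderr)
```

A zero standard error collapses both bounds onto the point forecast, while larger errors widen the band symmetrically around it.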