automl
Contains all AutoML model variants & some utilities.
Base classes:
| automl.base | Base class/mixin for AutoML hyperparameter search. |
Models:
| automl.autoets | Automatic hyperparameter selection for ETS. |
| automl.autoprophet | Automatic hyperparameter selection for Facebook's Prophet. |
| automl.autosarima | Automatic hyperparameter selection for SARIMA. |
Utilities:
| automl.seasonality | Automatic seasonality detection. |
| automl.search | Abstractions for hyperparameter search. |
Base classes
automl.base
Base class/mixin for AutoML hyperparameter search.
- class merlion.models.automl.base.AutoMLMixIn(config=None, model=None, **kwargs)
- Bases: LayeredModel
- Abstract base class which converts LayeredModel into an AutoML model.
 - abstract generate_theta(train_data)
- Parameters
- train_data (TimeSeries) – Pre-processed training data to use for generation of hyperparameters \(\theta\)
- Return type
- Iterator
 - Returns an iterator of hyperparameter candidates for consideration with the underlying model.
 - abstract evaluate_theta(thetas, train_data, train_config=None, exog_data=None)
- Parameters
- thetas (Iterator) – Iterator of the hyperparameter candidates
- train_data (TimeSeries) – Pre-processed training data
- train_config – Training configuration 
 
- Return type
- Tuple[Any, Optional[ModelBase], Optional[Tuple[TimeSeries, Optional[TimeSeries]]]]
 - Return the optimal hyperparameter, as well as optionally a model and result of the training procedure. 
 - abstract set_theta(model, theta, train_data=None)
- Parameters
- model – Underlying base model to which the new theta is applied 
- theta – Hyperparameter to apply 
- train_data (Optional[TimeSeries]) – Pre-processed training data (optional)

 - Sets the hyperparameter on the provided model. This is used to apply \(\theta\) to the model, since this behavior is custom to every model. Oftentimes in internal implementations, model is the optimal model.
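The three abstract methods above form a generate / evaluate / apply loop. The following is a minimal, library-independent sketch of that contract. `ToyModel`, `ToyAutoML`, and the moving-average scoring rule are hypothetical stand-ins chosen for illustration; they are not Merlion classes, and the real `evaluate_theta` also accepts `train_config` and `exog_data`.

```python
from typing import Any, Iterator, List, Optional, Tuple


class ToyModel:
    """Hypothetical model with one hyperparameter: a moving-average window."""

    def __init__(self, window: int = 1):
        self.window = window

    def train(self, data: List[float]) -> float:
        # Mean squared one-step-ahead error of a moving-average forecast.
        errors = []
        for t in range(self.window, len(data)):
            forecast = sum(data[t - self.window:t]) / self.window
            errors.append((data[t] - forecast) ** 2)
        return sum(errors) / len(errors)


class ToyAutoML:
    """Hypothetical wrapper that mirrors the three-method contract."""

    def __init__(self, model: ToyModel):
        self.model = model

    def generate_theta(self, train_data: List[float]) -> Iterator[int]:
        # Candidate hyperparameters: window sizes 1 through 4.
        return iter(range(1, 5))

    def evaluate_theta(
        self, thetas: Iterator[int], train_data: List[float]
    ) -> Tuple[Any, Optional[ToyModel], Optional[float]]:
        # Train one candidate model per theta and keep the lowest error.
        best = None
        for theta in thetas:
            candidate = ToyModel()
            self.set_theta(candidate, theta, train_data)
            score = candidate.train(train_data)
            if best is None or score < best[2]:
                best = (theta, candidate, score)
        return best

    def set_theta(self, model: ToyModel, theta: int, train_data=None) -> None:
        # Model-specific step: here theta is simply the window size.
        model.window = theta

    def train(self, train_data: List[float]) -> float:
        thetas = self.generate_theta(train_data)
        theta, _, score = self.evaluate_theta(thetas, train_data)
        self.set_theta(self.model, theta, train_data)
        return score
```

In this sketch, `set_theta` is the only place that knows how a hyperparameter maps onto the model, which is the reason the documentation describes it as custom to every model.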
 
- class merlion.models.automl.base.InformationCriterion(value)
- Bases: Enum
- An enumeration.
 - AIC = 1
- Akaike information criterion. Computed as \[\mathrm{AIC} = 2k - 2\ln(L)\] where k is the number of parameters, and L is the model’s likelihood.
 - BIC = 2
- Bayesian information criterion. Computed as \[\mathrm{BIC} = k \ln(n) - 2 \ln(L)\] where n is the sample size, k is the number of parameters, and L is the model’s likelihood.
 - AICc = 3
- Akaike information criterion with correction for small sample size. Computed as \[\mathrm{AICc} = \mathrm{AIC} + \frac{2k^2 + 2k}{n - k - 1}\] where n is the sample size, and k is the number of parameters.
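The three formulas can be checked numerically with a short sketch. The function names below are illustrative only and are not part of the library; each takes the log-likelihood \(\ln(L)\) directly, which is how likelihoods are usually reported. Lower values indicate a better trade-off between fit and complexity.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """BIC = k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_likelihood


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """AICc = AIC + (2k^2 + 2k) / (n - k - 1)."""
    if n - k - 1 <= 0:
        # The correction term is undefined when n <= k + 1.
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + (2 * k**2 + 2 * k) / (n - k - 1)


# Example: log-likelihood -10 with k = 3 parameters and n = 10 samples.
print(aic(-10.0, 3))       # 26.0
print(aicc(-10.0, 3, 10))  # 30.0
```

The correction term shrinks toward zero as n grows, so AICc and AIC agree for large samples.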
 
- class merlion.models.automl.base.ICConfig(information_criterion=InformationCriterion.AIC, transform=None, **kwargs)
- Bases: Config
- Mix-in to add an information criterion parameter to a model config.
- Parameters
- information_criterion (InformationCriterion) – Information criterion used to select the best model.
- transform – Transformation to pre-process input time series. 
 
 - property information_criterion
 
- class merlion.models.automl.base.ICAutoMLForecaster(config=None, model=None, **kwargs)
- Bases: AutoMLMixIn, ForecasterBase
- AutoML model which uses an information criterion to determine which model parameters are best.
 - property information_criterion
 - abstract get_ic(model, train_data, train_result)
- Returns the information criterion of the model based on the given training data & the model’s train result.
- Parameters
- model – One of the models being tried. Must be trained.
- train_data (DataFrame) – The target sequence of the training data as a pandas.DataFrame.
- train_result (Tuple[DataFrame, Optional[DataFrame]]) – The result of calling model._train().
 
- Return type
- float
- Returns
- The information criterion evaluating the model’s goodness of fit. 
 
 - evaluate_theta(thetas, train_data, train_config=None, exog_data=None)
- Parameters
- thetas (Iterator) – Iterator of the hyperparameter candidates
- train_data (TimeSeries) – Pre-processed training data
- train_config – Training configuration 
 
- Return type
- Tuple[Any, ModelBase, Tuple[TimeSeries, Optional[TimeSeries]]]
 - Return the optimal hyperparameter, as well as optionally a model and result of the training procedure. 
 
Models
automl.autoets
Automatic hyperparameter selection for ETS.
- class merlion.models.automl.autoets.AutoETSConfig(model=None, auto_seasonality=True, auto_error=True, auto_trend=True, auto_seasonal=True, auto_damped=True, periodicity_strategy=PeriodicityStrategy.ACF, information_criterion=InformationCriterion.AIC, additive_only=False, allow_multiplicative_trend=False, restrict=True, pval=0.05, max_lag=None, model_kwargs=None, transform=None, **kwargs)
- Bases: SeasonalityConfig, ICConfig
- Configuration class for AutoETS. Acts as a wrapper around an ETS model, which automatically detects the hyperparameters seasonal_periods, error, trend, damped_trend and seasonal.
- Parameters
- model (Union[ETS, dict, None]) – The model being wrapped, or a dict representing it.
- auto_seasonality (bool) – Whether to automatically detect the seasonality.
- auto_error (bool) – Whether to automatically detect the error components.
- auto_trend (bool) – Whether to automatically detect the trend components.
- auto_seasonal (bool) – Whether to automatically detect the seasonal components.
- auto_damped (bool) – Whether to automatically detect the damped trend components.
- periodicity_strategy (PeriodicityStrategy) – Strategy to choose the seasonality if multiple candidates are detected.
- information_criterion (InformationCriterion) – Information criterion used to select the best model.
- additive_only (bool) – If True, the search space will only consider additive models.
- allow_multiplicative_trend (bool) – If True, models with multiplicative trend are allowed in the search space.
- restrict (bool) – If True, models with infinite variance are excluded from the search space.
- pval – p-value for deciding whether a detected seasonality is statistically significant.
- max_lag – Maximum lag considered for seasonality detection.
- model_kwargs – Keyword arguments used specifically to initialize the underlying model. Only used if model is a dict. Will override keys in the model dict if specified.
- transform – Transformation to pre-process input time series.
- kwargs – Any other keyword arguments (e.g. for initializing a base class). If model is a dict, we will also try to pass these arguments when creating the actual underlying model. However, they will not override arguments in either the model dict or model_kwargs dict.
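The precedence among the model dict, model_kwargs, and kwargs described above can be illustrated with plain dictionaries. This is a sketch of the documented ordering only; `resolve_model_args` is a hypothetical helper, not a library function, and it ignores the step where arguments the underlying model does not accept would be dropped.

```python
from typing import Any, Dict, Optional


def resolve_model_args(
    model: Dict[str, Any],
    model_kwargs: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Merge arguments so that model_kwargs > model dict > kwargs."""
    resolved = dict(kwargs)               # lowest priority
    resolved.update(model)                # the model dict overrides kwargs
    resolved.update(model_kwargs or {})   # model_kwargs overrides both
    return resolved


args = resolve_model_args(
    {"name": "ETS", "trend": "add"},
    model_kwargs={"trend": "mul"},
    trend=None,
    seasonal="add",
)
print(args)  # {'trend': 'mul', 'seasonal': 'add', 'name': 'ETS'}
```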
 
 
- class merlion.models.automl.autoets.AutoETS(config)
- Bases: ICAutoMLForecaster, SeasonalityLayer
- Wrapper around an ETS model, which automatically detects the hyperparameters seasonal_periods, error, trend, damped_trend and seasonal.
- config_class
- alias of AutoETSConfig
 - generate_theta(train_data)
- Generates [theta], where theta is a list of parameter combinations of the form [error, trend, damped_trend, seasonal].
- Return type
- Iterator
 
 - set_theta(model, theta, train_data=None)
- Parameters
- model – Underlying base model to which the new theta is applied 
- theta – Hyperparameter to apply 
- train_data (Optional[TimeSeries]) – Pre-processed training data (optional)

 - Sets the hyperparameter on the provided model. This is used to apply \(\theta\) to the model, since this behavior is custom to every model. Oftentimes in internal implementations, model is the optimal model.
 - get_ic(model, train_data, train_result)
- Returns the information criterion of the model based on the given training data & the model’s train result.
- Parameters
- model – One of the models being tried. Must be trained.
- train_data (DataFrame) – The target sequence of the training data as a pandas.DataFrame.
- train_result (Tuple[DataFrame, DataFrame]) – The result of calling model._train().
 
- Return type
- float
- Returns
- The information criterion evaluating the model’s goodness of fit. 
 
 
automl.autoprophet
Automatic hyperparameter selection for Facebook’s Prophet.
- class merlion.models.automl.autoprophet.AutoProphetConfig(model=None, periodicity_strategy=PeriodicityStrategy.All, information_criterion=InformationCriterion.AIC, pval=0.05, max_lag=None, model_kwargs=None, transform=None, **kwargs)
- Bases: SeasonalityConfig, ICConfig
- Config class for Prophet with automatic seasonality detection & other hyperparameter selection.
- Parameters
- model (Union[Prophet, dict, None]) – The model being wrapped, or a dict representing it.
- periodicity_strategy (Union[PeriodicityStrategy, str]) – Strategy to choose the seasonality if multiple candidates are detected.
- information_criterion (InformationCriterion) – Information criterion used to select the best model.
- pval – p-value for deciding whether a detected seasonality is statistically significant.
- max_lag – Maximum lag considered for seasonality detection.
- model_kwargs – Keyword arguments used specifically to initialize the underlying model. Only used if model is a dict. Will override keys in the model dict if specified.
- transform – Transformation to pre-process input time series.
- kwargs – Any other keyword arguments (e.g. for initializing a base class). If model is a dict, we will also try to pass these arguments when creating the actual underlying model. However, they will not override arguments in either the model dict or model_kwargs dict.
 
 - property multi_seasonality
- Returns
- True because Prophet supports multiple seasonality.
 
 
- class merlion.models.automl.autoprophet.AutoProphet(config=None, model=None, **kwargs)
- Bases: ICAutoMLForecaster, SeasonalityLayer
- Prophet with automatic seasonality detection. Automatically detects and adds additional seasonalities that the existing Prophet may not detect (e.g. hourly). Also automatically chooses other hyperparameters.
- config_class
- alias of AutoProphetConfig
 - property supports_exog
- Whether the model supports exogenous regressors. 
 - generate_theta(train_data)
- Parameters
- train_data (TimeSeries) – Pre-processed training data to use for generation of hyperparameters \(\theta\)
- Return type
- Iterator
 - Returns an iterator of hyperparameter candidates for consideration with the underlying model.
 - set_theta(model, theta, train_data=None)
- Parameters
- model – Underlying base model to which the new theta is applied 
- theta – Hyperparameter to apply 
- train_data (Optional[TimeSeries]) – Pre-processed training data (optional)

 - Sets the hyperparameter on the provided model. This is used to apply \(\theta\) to the model, since this behavior is custom to every model. Oftentimes in internal implementations, model is the optimal model.
 - get_ic(model, train_data, train_result)
- Returns the information criterion of the model based on the given training data & the model’s train result.
- Parameters
- model – One of the models being tried. Must be trained.
- train_data (DataFrame) – The target sequence of the training data as a pandas.DataFrame.
- train_result (Tuple[DataFrame, DataFrame]) – The result of calling model._train().
 
- Return type
- float
- Returns
- The information criterion evaluating the model’s goodness of fit. 
 
 
automl.autosarima
Automatic hyperparameter selection for SARIMA.
- class merlion.models.automl.autosarima.AutoSarimaConfig(model=None, auto_seasonality=True, periodicity_strategy=PeriodicityStrategy.ACF, auto_pqPQ=True, auto_d=True, auto_D=True, maxiter=None, max_k=100, max_dur=3600, approximation=None, approx_iter=None, pval=0.05, max_lag=None, model_kwargs=None, transform=None, **kwargs)
- Bases: SeasonalityConfig
- Configuration class for AutoSarima. Acts as a wrapper around a Sarima model, which automatically detects the seasonality, (seasonal) differencing order, and (seasonal) AR/MA orders. If a non-numeric value is specified for any of the relevant parameters in the order or seasonal order, we assume that the user wishes to detect that parameter automatically.
- Note: The automatic selection of AR, MA, seasonal AR, and seasonal MA parameters is implemented in a coupled way. The user must specify all of these parameters explicitly to avoid automatic selection.
- Parameters
- model (Union[Sarima, dict, None]) – The model being wrapped, or a dict representing it.
- auto_seasonality (bool) – Whether to automatically detect the seasonality.
- periodicity_strategy (PeriodicityStrategy) – Strategy to choose the seasonality if multiple candidates are detected.
- auto_pqPQ (bool) – Whether to automatically choose AR/MA orders p, q and seasonal AR/MA orders P, Q.
- auto_d (bool) – Whether to automatically choose the difference order d.
- auto_D (bool) – Whether to automatically choose the seasonal difference order D.
- maxiter (Optional[int]) – The maximum number of iterations to perform
- max_k (int) – Maximum number of models considered in the stepwise search
- max_dur (float) – Maximum training time considered in the stepwise search
- approximation (Optional[bool]) – Whether to use approx_iter iterations (instead of maxiter) to speed up computation. If None, we use approximation mode when the training data is too long (length > 150), or when the length of the period is too high (periodicity > 12).
- approx_iter (Optional[int]) – The number of iterations to perform in approximation mode
- pval – p-value for deciding whether a detected seasonality is statistically significant.
- max_lag – Maximum lag considered for seasonality detection.
- model_kwargs – Keyword arguments used specifically to initialize the underlying model. Only used if model is a dict. Will override keys in the model dict if specified.
- transform – Transformation to pre-process input time series.
- kwargs – Any other keyword arguments (e.g. for initializing a base class). If model is a dict, we will also try to pass these arguments when creating the actual underlying model. However, they will not override arguments in either the model dict or model_kwargs dict.
 
 - property order
 - property seasonal_order
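The default behavior of the approximation parameter of AutoSarimaConfig reduces to a small decision rule. The sketch below restates that rule using the two thresholds given in the parameter description (150 and 12); `use_approximation` is a hypothetical helper written for illustration, and treating "too long" as the number of training points is an assumption.

```python
from typing import Optional


def use_approximation(
    approximation: Optional[bool], n_train: int, periodicity: int
) -> bool:
    """Decide whether to run approx_iter iterations instead of maxiter."""
    if approximation is not None:
        # An explicit user setting always takes precedence.
        return approximation
    # Default rule: long training data or a long seasonal period.
    return n_train > 150 or periodicity > 12


print(use_approximation(None, n_train=100, periodicity=7))    # False
print(use_approximation(None, n_train=100, periodicity=24))   # True
print(use_approximation(False, n_train=1000, periodicity=24)) # False
```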
 
- class merlion.models.automl.autosarima.AutoSarima(config=None, model=None, **kwargs)
- Bases: SeasonalityLayer
- config_class
- alias of AutoSarimaConfig
 - property supports_exog
- Whether the model supports exogenous regressors. 
 - generate_theta(train_data)
- Generates [action, theta]. action indicates whether to run a stepwise search (stepwise) over the p, q, P, Q, and trend parameters, or to use a predefined parameter combination (pqPQ). theta is a list of parameter combinations of the form [order, seasonal_order, trend].
- Return type
- Iterator
 
 - evaluate_theta(thetas, train_data, train_config=None, exog_data=None)
- Parameters
- thetas (Iterator) – Iterator of the hyperparameter candidates
- train_data (TimeSeries) – Pre-processed training data
- train_config – Training configuration 
 
- Return type
- Tuple[Any, Optional[Sarima], Optional[Tuple[TimeSeries, Optional[TimeSeries]]]]
 - Return the optimal hyperparameter, as well as optionally a model and result of the training procedure. 
 - set_theta(model, theta, train_data=None)
- Parameters
- model – Underlying base model to which the new theta is applied 
- theta – Hyperparameter to apply 
- train_data (Optional[TimeSeries]) – Pre-processed training data (optional)

 - Sets the hyperparameter on the provided model. This is used to apply \(\theta\) to the model, since this behavior is custom to every model. Oftentimes in internal implementations, model is the optimal model.
 
Utilities
automl.seasonality
Automatic seasonality detection.
Note that the static method merlion.models.automl.seasonality.SeasonalityLayer.detect_seasonality()
can be used to find the seasonality of an arbitrary numpy.array, without needing to initialize a model.
- class merlion.models.automl.seasonality.PeriodicityStrategy(value)
- Bases: Enum
- Strategy to choose the seasonality if multiple candidates are detected.
 - ACF = 1
- Select the seasonality value with the highest autocorrelation. 
 - Min = 2
- Select the minimum seasonality. 
 - Max = 3
- Select the maximum seasonality. 
 - All = 4
- Use all seasonalities. Only valid for models which support multiple seasonalities. 
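The four strategies can be read as selection rules over a set of candidate periods. The sketch below assumes the candidates arrive as a mapping from period to autocorrelation at that lag; both the `Strategy` enum (a stand-in that mirrors the member names above) and `select_periods` are hypothetical and do not reproduce the library's internals.

```python
from enum import Enum
from typing import Dict, List


class Strategy(Enum):
    """Hypothetical stand-in mirroring the PeriodicityStrategy members."""
    ACF = 1
    Min = 2
    Max = 3
    All = 4


def select_periods(candidates: Dict[int, float], strategy: Strategy) -> List[int]:
    """Pick period(s) from candidates, a map of period -> autocorrelation."""
    if not candidates:
        return []
    if strategy is Strategy.ACF:
        # Period whose lag has the highest autocorrelation.
        return [max(candidates, key=candidates.get)]
    if strategy is Strategy.Min:
        return [min(candidates)]
    if strategy is Strategy.Max:
        return [max(candidates)]
    # Strategy.All: every candidate, in increasing order.
    return sorted(candidates)


candidates = {24: 0.8, 168: 0.9, 12: 0.3}
print(select_periods(candidates, Strategy.ACF))  # [168]
print(select_periods(candidates, Strategy.All))  # [12, 24, 168]
```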
 
- class merlion.models.automl.seasonality.SeasonalityModel
- Bases: object
- Provides a simple implementation to set the seasonality in a model. Extend this class to implement custom behavior for seasonality processing.
 - abstract set_seasonality(theta, train_data)
- Implement this method to do any model-specific adjustments on the seasonality that was provided by SeasonalityLayer.
- Parameters
- theta – Seasonality processed by SeasonalityLayer.
- train_data (UnivariateTimeSeries) – Training data (or numpy array representing the target univariate) for any model-specific adjustments you might want to make.
 
 
 
- class merlion.models.automl.seasonality.SeasonalityConfig(model, periodicity_strategy=PeriodicityStrategy.ACF, pval=0.05, max_lag=None, model_kwargs=None, transform=None, **kwargs)
- Bases: LayeredModelConfig
- Config object for an automatic seasonality detection layer.
- Parameters
- model – The model being wrapped, or a dict representing it.
- periodicity_strategy – Strategy to choose the seasonality if multiple candidates are detected.
- pval (float) – p-value for deciding whether a detected seasonality is statistically significant.
- max_lag (Optional[int]) – Maximum lag considered for seasonality detection.
- model_kwargs – Keyword arguments used specifically to initialize the underlying model. Only used if model is a dict. Will override keys in the model dict if specified.
- transform – Transformation to pre-process input time series.
- kwargs – Any other keyword arguments (e.g. for initializing a base class). If model is a dict, we will also try to pass these arguments when creating the actual underlying model. However, they will not override arguments in either the model dict or model_kwargs dict.
 
 - property multi_seasonality
- Returns
- Whether the model supports multiple seasonalities. False unless explicitly overridden.
 
 - property periodicity_strategy: PeriodicityStrategy
- Returns
- Strategy to choose the seasonality if multiple candidates are detected. 
 
 
- class merlion.models.automl.seasonality.SeasonalityLayer(config=None, model=None, **kwargs)
- Bases: AutoMLMixIn
- Seasonality layer that automatically determines the seasonality of your data. Can be used directly on any model that implements the SeasonalityModel class. The algorithmic idea is from the theta method. We find a set of multiple candidate seasonalities, and we return the best one(s) based on the PeriodicityStrategy.
- config_class
- alias of SeasonalityConfig
 - property require_even_sampling: bool
- Whether the model assumes that training data is sampled at a fixed frequency 
 - property require_univariate
- Whether the model only works with univariate time series. 
 - property multi_seasonality
- Returns
- Whether the model supports multiple seasonalities. 
 
 - property periodicity_strategy
- Returns
- Strategy to choose the seasonality if multiple candidates are detected. 
 
 - property pval
- Returns
- p-value for deciding whether a detected seasonality is statistically significant. 
 
 - property max_lag
- Returns
- max_lag for seasonality detection 
 
 - static detect_seasonality(x, max_lag=None, pval=0.05, periodicity_strategy=PeriodicityStrategy.ACF)
- Helper method to detect the seasonality of a time series.
- Parameters
- x (array) – The numpy array of values whose seasonality we want to detect. Must be univariate & flattened.
- periodicity_strategy (PeriodicityStrategy) – Strategy to choose the seasonality if multiple candidates are detected.
- pval (float) – p-value for deciding whether a detected seasonality is statistically significant.
- max_lag (Optional[int]) – Maximum lag considered for seasonality detection.
 
- Return type
- List[int]
 
 - set_theta(model, theta, train_data=None)
- Parameters
- model – Underlying base model to which the new theta is applied 
- theta – Hyperparameter to apply 
- train_data (Optional[TimeSeries]) – Pre-processed training data (optional)

 - Sets the hyperparameter on the provided model. This is used to apply \(\theta\) to the model, since this behavior is custom to every model. Oftentimes in internal implementations, model is the optimal model.
 - evaluate_theta(thetas, train_data, train_config=None, exog_data=None)
- Parameters
- thetas (Iterator) – Iterator of the hyperparameter candidates
- train_data (TimeSeries) – Pre-processed training data
- train_config – Training configuration 
 
- Return type
- Tuple[Any, Optional[ModelBase], Optional[Tuple[TimeSeries, Optional[TimeSeries]]]]
 - Return the optimal hyperparameter, as well as optionally a model and result of the training procedure. 
 - generate_theta(train_data)
- Parameters
- train_data (TimeSeries) – Pre-processed training data to use for generation of hyperparameters \(\theta\)
- Return type
- Iterator
 - Returns an iterator of hyperparameter candidates for consideration with the underlying model.
 
automl.search
Abstractions for hyperparameter search.
- class merlion.models.automl.search.GridSearch(param_values, restrictions=None)
- Bases: object
- Iterator over a grid of parameter values, skipping any restricted combinations of values.
- Parameters
- param_values (Dict[str, List]) – a dict mapping a set of parameter names to lists of values they can take on.
- restrictions (Optional[List[Dict[str, Any]]]) – a list of dicts indicating inadmissible combinations of parameter values. For example, an ETS model has parameters error (add/mul), trend (add/mul/none), seasonal (add/mul), and damped_trend (True/False). If we are only considering additive models, we would impose the restrictions [{"error": "mul"}, {"trend": "mul"}, {"seasonal": "mul"}]. Since a damped trend is only possible if the model has a trend, we would add the restriction {"trend": None, "damped_trend": True}.
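The restriction semantics described above can be sketched with `itertools.product`: a combination is skipped when it matches every key/value pair of at least one restriction. The generator below is an independent illustration under that reading, not the library's implementation, and `grid_search` is a hypothetical name.

```python
from itertools import product
from typing import Any, Dict, Iterator, List, Optional


def grid_search(
    param_values: Dict[str, List],
    restrictions: Optional[List[Dict[str, Any]]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield every admissible combination of parameter values."""
    restrictions = restrictions or []
    names = list(param_values)
    for combo in product(*(param_values[name] for name in names)):
        candidate = dict(zip(names, combo))
        # A restriction applies only if all of its pairs match the candidate.
        violated = any(
            all(k in candidate and candidate[k] == v for k, v in r.items())
            for r in restrictions
        )
        if not violated:
            yield candidate


# The additive-only ETS example from the parameter description.
param_values = {
    "error": ["add", "mul"],
    "trend": ["add", "mul", None],
    "seasonal": ["add", "mul"],
    "damped_trend": [True, False],
}
restrictions = [
    {"error": "mul"},
    {"trend": "mul"},
    {"seasonal": "mul"},
    {"trend": None, "damped_trend": True},
]
for theta in grid_search(param_values, restrictions):
    print(theta)
```

Of the 24 raw combinations in this example, three survive: additive error and seasonal components with either an additive trend (damped or not) or no trend (undamped).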