Welcome to Merlion’s documentation!
Merlion is a Python library for time series intelligence. It features a unified interface for many commonly used models and datasets for anomaly detection and forecasting on both univariate and multivariate time series, along with standard pre-processing and post-processing layers. It has several modules to improve ease of use, including visualization, anomaly score calibration to improve interpretability, AutoML for hyperparameter tuning and model selection, and model ensembling. Merlion also provides a unique evaluation framework that simulates the live deployment and re-training of a model in production. This library aims to provide engineers and researchers with a one-stop solution to rapidly develop models for their specific time series needs and benchmark them across multiple time series datasets.
Installation
Merlion consists of two sub-packages: merlion implements the library’s core time series intelligence features,
and ts_datasets provides standardized data loaders for multiple time series datasets. These loaders load
time series as pandas.DataFrames with accompanying metadata.
You can install merlion from PyPI by calling pip install salesforce-merlion. You may install from source by
cloning the Merlion repo, navigating to the root directory, and calling pip install . (or pip install -e .
to install in editable mode). You may install additional dependencies for plotting & visualization via
pip install salesforce-merlion[plot], or by calling pip install ".[plot]" from the root directory of the repo
if installing from source.
To install the data loading package ts_datasets, clone the Merlion repo and call pip install -e ts_datasets/
from its root directory. This package must be installed in editable mode (i.e. with the -e flag)
if you don’t want to manually specify the root directory of every dataset when initializing its data loader.
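After running the install commands above, a quick check can confirm that both sub-packages are visible to the current Python environment. The following is a minimal sketch using only the standard library; the helper name installed_packages is hypothetical, and it assumes the import names match the sub-package names given above (merlion and ts_datasets). It only tests whether each package can be located, not whether its optional dependencies are present.

```python
import importlib.util

# Import names taken from the sub-package names in the text above.
PACKAGES = ("merlion", "ts_datasets")


def installed_packages(names=PACKAGES):
    """Map each top-level package name to True if it is found on the import path."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in installed_packages().items():
        print(f"{name}: {'installed' if found else 'not found'}")
```

If ts_datasets is reported as not found, the likely cause is that pip install -e ts_datasets/ was run in a different environment than the one executing the check.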
Note the following external dependencies:
Some of our forecasting models depend on OpenMP. If using conda, please conda install -c conda-forge lightgbm before installing our package. This will ensure that OpenMP is configured to work with the lightgbm package (one of our dependencies) in your conda environment. If using Mac, please install Homebrew and call brew install libomp so that the OpenMP library is available for the model. This is relevant for the LGBMForecaster, which is also used as a part of the DefaultForecaster.
Some of our anomaly detection models depend on having the Java Development Kit (JDK) installed. For Ubuntu, call sudo apt-get install openjdk-11-jdk. For Mac OS, install Homebrew and call brew tap adoptopenjdk/openjdk && brew install --cask adoptopenjdk11. This is relevant for the RandomCutForest, which is also used as a part of the DefaultDetector.
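To see whether these external dependencies appear to be present, one option is a small heuristic check with the standard library. This is a sketch under stated assumptions, not part of Merlion: the helper name check_external_dependencies is hypothetical, looking for a java executable on the PATH only suggests that a Java runtime is available (it does not prove a full JDK is installed), and the OpenMP lookup tries the common library names omp and gomp, which may miss a copy bundled inside a conda environment.

```python
import ctypes.util
import shutil


def check_external_dependencies():
    """Return the detected location or name of each dependency, or None if not found."""
    # Path to a `java` executable on the PATH, if any (heuristic for a Java install).
    java = shutil.which("java")
    # Name or path of an OpenMP runtime library: LLVM's libomp first, then GNU's libgomp.
    openmp = ctypes.util.find_library("omp") or ctypes.util.find_library("gomp")
    return {"java": java, "openmp": openmp}


if __name__ == "__main__":
    for name, location in check_external_dependencies().items():
        print(f"{name}: {location if location else 'not found'}")
```

A result of "not found" is a hint to revisit the install steps above rather than a definitive diagnosis, since both lookups depend on how the system's search paths are configured.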
Getting Started
To get started, we recommend the linked tutorials on anomaly detection and forecasting. After that, you should read in more detail about Merlion’s main data structure for representing time series here.