Welcome to Merlion’s documentation!
Merlion is a Python library for time series intelligence. It features a unified interface for many commonly used models and datasets for forecasting, anomaly detection, and change point detection on both univariate and multivariate time series, along with standard pre-processing and post-processing layers. It has several modules to improve ease-of-use, including visualization, anomaly score calibration to improve interpretability, AutoML for hyperparameter tuning and model selection, and model ensembling. Merlion also provides a unique evaluation framework that simulates the live deployment and re-training of a model in production. This library aims to provide engineers and researchers with a one-stop solution to rapidly develop models for their specific time series needs, and benchmark them across multiple time series datasets.
Merlion consists of two sub-packages: merlion implements the library’s core time series intelligence features,
and ts_datasets provides standardized data loaders for multiple time series datasets. These loaders load
time series as
pandas.DataFrame objects with accompanying metadata.
You can install
merlion from PyPI by calling
pip install salesforce-merlion. You may install from source by
cloning the Merlion repo and calling
pip install Merlion/, or
pip install -e Merlion/ to install in editable mode. You may install additional optional dependencies via
pip install salesforce-merlion[all], or by calling
pip install "Merlion/[all]" if installing from source.
Individually, the optional dependencies include
dashboard for a GUI dashboard,
spark for a distributed computation backend with PySpark, and
deep-learning for all deep learning models.
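Several extras can be requested at once by separating them with commas inside the brackets, which is standard pip syntax. As a minimal sketch, the helper below builds the requirement string to pass to pip install for a chosen set of extras. The function name merlion_requirement and the KNOWN_EXTRAS tuple are hypothetical and are not part of Merlion; the extras names are the ones listed above.

```python
# Hypothetical helper, not part of Merlion: builds a pip requirement string.
KNOWN_EXTRAS = ("dashboard", "spark", "deep-learning", "all")


def merlion_requirement(extras=(), source_dir=None):
    """Return the requirement string to pass to `pip install`.

    extras: names of the optional dependency groups listed above.
    source_dir: path to a local clone (e.g. "Merlion/") when installing
        from source; None means install the PyPI package.
    """
    unknown = [e for e in extras if e not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError(f"unknown extras: {unknown}")
    base = source_dir if source_dir is not None else "salesforce-merlion"
    suffix = f"[{','.join(extras)}]" if extras else ""
    return base + suffix


print(merlion_requirement(["dashboard", "spark"]))
print(merlion_requirement(["all"], source_dir="Merlion/"))
```

Some shells (zsh, for example) treat square brackets as glob characters, so it is safest to wrap the resulting string in quotes on the command line.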
To install the data loading package
ts_datasets, clone the Merlion
repo and call
pip install -e Merlion/ts_datasets/. This package must be
installed in editable mode (i.e. with the
-e flag) if you don’t want to manually specify the root directory of
every dataset when initializing its data loader.
Note the following external dependencies:
Some of our forecasting models depend on OpenMP. If using conda, please call
conda install -c conda-forge lightgbm before installing our package. This will ensure that OpenMP is configured to work with the
lightgbm package (one of our dependencies) in your
conda environment. If using Mac, please install Homebrew and call
brew install libomp so that the OpenMP library is available for the model. This is relevant for the
LGBMForecaster, which is also used as a part of the
Some of our anomaly detection models depend on having the Java Development Kit (JDK) installed. For Ubuntu, call
sudo apt-get install openjdk-11-jdk. For Mac OS, install Homebrew and call
brew tap adoptopenjdk/openjdk && brew install --cask adoptopenjdk11. Also ensure that
java can be found on your
PATH, and that the
JAVA_HOME environment variable is set. This is relevant for the
RandomCutForest, which is also used as a part of the
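The two Java requirements above (java on your PATH, and JAVA_HOME set) can be checked before running any JDK-dependent model. The following is a minimal sketch using only the Python standard library; check_java is a hypothetical helper name, not a Merlion function, and it only checks that the variable is non-empty, not that it points to a valid JDK.

```python
import os
import shutil


def check_java(env=None):
    """Return a list of human-readable problems; empty if both checks pass.

    env: mapping of environment variables; defaults to os.environ.
    """
    env = os.environ if env is None else env
    problems = []
    # shutil.which searches the directories listed in PATH for an executable.
    if shutil.which("java", path=env.get("PATH", "")) is None:
        problems.append("java was not found on PATH")
    # Only checks that JAVA_HOME is non-empty, not that it is a valid JDK.
    if not env.get("JAVA_HOME"):
        problems.append("JAVA_HOME is not set")
    return problems


if __name__ == "__main__":
    for message in check_java() or ["java and JAVA_HOME look fine"]:
        print(message)
```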
The easiest way to get started is to use the web-based GUI dashboard.
This dashboard provides a great way to quickly experiment with many models on your own custom datasets.
To use it, install Merlion with the optional
dashboard dependency (i.e.
pip install salesforce-merlion[dashboard]), and call
python -m merlion.dashboard from the command line.
You can view the dashboard at http://localhost:8050.
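If the page does not load, it can help to confirm that something is actually listening on port 8050. The sketch below uses only the standard library; dashboard_command and port_is_open are hypothetical helper names introduced for illustration, and the host and port are taken from the URL above.

```python
import socket
import sys

DASHBOARD_HOST = "localhost"
DASHBOARD_PORT = 8050  # port taken from the URL above


def dashboard_command():
    """Command equivalent to `python -m merlion.dashboard`, using this interpreter."""
    return [sys.executable, "-m", "merlion.dashboard"]


def port_is_open(host=DASHBOARD_HOST, port=DASHBOARD_PORT, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("launch with:", " ".join(dashboard_command()))
    print("dashboard reachable:", port_is_open())
```

Using sys.executable rather than a bare python ensures the dashboard is launched with the same interpreter (and therefore the same environment) in which Merlion was installed.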
For code resources, we recommend the linked tutorials on anomaly detection and forecasting. After that, you should read in more detail about Merlion’s main data structure for representing time series here.
Finally, developers should look at the architecture document to better understand how Merlion’s key components interact with each other.
- merlion: Time Series Intelligence
- ts_datasets: Easy Data Loading
- Tutorials & Example Code
- Merlion Architecture