Multivariate Time Series Anomaly Detection

Multivariate time series anomaly detection works in largely the same way as univariate time series anomaly detection (covered here and here). To begin, we will load the multivariate MSL dataset for time series anomaly detection.

from merlion.utils import TimeSeries
from ts_datasets.anomaly import MSL

time_series, metadata = MSL()[0]
train_data = TimeSeries.from_pd(time_series[metadata.trainval])
test_data = TimeSeries.from_pd(time_series[~metadata.trainval])
test_labels = TimeSeries.from_pd(metadata.anomaly[~metadata.trainval])

print(f"Time series is {train_data.dim}-dimensional")
Time series is 55-dimensional
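The `trainval` attribute in the metadata is simply a boolean mask over the rows, so the split above is ordinary pandas boolean indexing. A minimal sketch of the idea, using a toy 2-dimensional frame standing in for the 55-dimensional MSL data (the names `mask`, `m0`, `m1` are illustrative, not part of the dataset):

```python
import numpy as np
import pandas as pd

# Toy 2-dimensional time series standing in for the 55-dimensional MSL data
index = pd.date_range("2021-01-01", periods=8, freq="h")
df = pd.DataFrame(np.arange(16).reshape(8, 2), index=index, columns=["m0", "m1"])

# Like metadata.trainval: True marks the train/validation rows,
# so ~mask selects the held-out test rows
mask = pd.Series([True] * 5 + [False] * 3, index=index)
train_df, test_df = df[mask], df[~mask]
print(len(train_df), len(test_df))  # 5 3
```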

Model Initialization and Training

For the purposes of this tutorial, we will be using 3 models:

  1. DefaultDetector (which automatically detects whether the input time series is univariate or multivariate);

  2. IsolationForest (a classic algorithm); and

  3. A DetectorEnsemble which takes the maximum anomaly score returned by either model.

Note that while all multivariate anomaly detection models can be used on univariate time series, some Merlion models (e.g. WindStats, ZMS, StatThreshold) are specific to univariate time series. In either case, the API is identical to that of univariate anomaly detection models.

# We initialize models using the model factory in this tutorial
# We manually set the detection threshold to 2 (in standard deviation units) for all models
from merlion.models.factory import ModelFactory
from merlion.post_process.threshold import AggregateAlarms

model1 = ModelFactory.create("DefaultDetector",
                             threshold=AggregateAlarms(alm_threshold=2))

model2 = ModelFactory.create("IsolationForest",
                             threshold=AggregateAlarms(alm_threshold=2))

# Here, we create a _max ensemble_ that takes the maximal anomaly score
# returned by any individual model (rather than the mean).
model3 = ModelFactory.create("DetectorEnsemble", models=[model1, model2],
                             combiner={"name": "Max"})

for model in [model1, model2, model3]:
    print(f"Training {type(model).__name__}...")
    train_scores = model.train(train_data)
Training DefaultDetector...
 |████████████████████████████████████████| 100.0% Complete, Loss 0.5998
Training IsolationForest...
Training DetectorEnsemble...
 |████████████████████████████████████████| 100.0% Complete, Loss 0.6001
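The `Max` combiner used for `model3` keeps, at each timestamp, the largest anomaly score assigned by any ensemble member, so the ensemble alarms whenever any single model would. A minimal sketch of that combination rule with hypothetical per-timestamp scores:

```python
# Hypothetical anomaly scores from two detectors at four timestamps
scores_model1 = [0.1, 0.8, 0.2, 0.5]
scores_model2 = [0.3, 0.4, 0.9, 0.1]

# A "Max" combiner takes the elementwise maximum across members,
# rather than (say) the mean
combined = [max(a, b) for a, b in zip(scores_model1, scores_model2)]
print(combined)  # [0.3, 0.8, 0.9, 0.5]
```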

Model Inference and Quantitative Evaluation

Like univariate models, we may call get_anomaly_label() to get a sequence of post-processed (calibrated and thresholded) anomaly scores on the test data. We can then use these labels to evaluate each model's performance.

from merlion.evaluate.anomaly import TSADMetric

for model in [model1, model2, model3]:
    labels = model.get_anomaly_label(test_data)
    precision = TSADMetric.PointAdjustedPrecision.value(ground_truth=test_labels, predict=labels)
    recall = TSADMetric.PointAdjustedRecall.value(ground_truth=test_labels, predict=labels)
    f1 = TSADMetric.PointAdjustedF1.value(ground_truth=test_labels, predict=labels)
    mttd = TSADMetric.MeanTimeToDetect.value(ground_truth=test_labels, predict=labels)
    print(f"Precision: {precision:.4f}")
    print(f"Recall:    {recall:.4f}")
    print(f"F1:        {f1:.4f}")
    print(f"MTTD:      {mttd}")
Precision: 0.9659
Recall:    0.8312
F1:        0.8935
MTTD:      0 days 01:21:15

Precision: 0.9638
Recall:    0.8192
F1:        0.8856
MTTD:      0 days 01:40:57

Precision: 0.9620
Recall:    0.8184
F1:        0.8844
MTTD:      0 days 01:34:51
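The point-adjusted metrics above follow the common convention that flagging any point inside a ground-truth anomalous segment counts the entire segment as detected. A rough pure-Python sketch of that adjustment step (simplified; Merlion's actual implementation works on timestamped series and also computes detection delays):

```python
def point_adjust(labels, preds):
    """If any point in a contiguous ground-truth anomalous segment is
    flagged, mark every point of that segment as flagged."""
    adjusted = list(preds)
    i = 0
    while i < len(labels):
        if labels[i]:
            # Find the end of this contiguous anomalous segment
            j = i
            while j < len(labels) and labels[j]:
                j += 1
            # If any prediction falls inside it, credit the whole segment
            if any(preds[i:j]):
                for k in range(i, j):
                    adjusted[k] = 1
            i = j
        else:
            i += 1
    return adjusted

labels = [0, 1, 1, 1, 0, 1, 1]
preds  = [0, 0, 1, 0, 0, 0, 0]
# The first segment is credited in full; the second was never detected
print(point_adjust(labels, preds))  # [0, 1, 1, 1, 0, 0, 0]
```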

We can also use a TSADEvaluator to evaluate a model in a manner that simulates live deployment. Here, we train an initial model on the training data, and we obtain its predictions on the test data using a sliding window of 1 week (cadence="1w"). However, we only retrain the model every 4 weeks (retrain_freq="4w").

from merlion.evaluate.anomaly import TSADEvaluator, TSADEvaluatorConfig
for model in [model1, model2, model3]:
    print(f"{type(model).__name__} Sliding Window Evaluation")
    evaluator = TSADEvaluator(model=model, config=TSADEvaluatorConfig(
        cadence="1w", retrain_freq="4w"))
    train_result, test_pred = evaluator.get_predict(train_vals=train_data, test_vals=test_data)
    precision = evaluator.evaluate(ground_truth=test_labels, predict=test_pred,
                                   metric=TSADMetric.PointAdjustedPrecision)
    recall = evaluator.evaluate(ground_truth=test_labels, predict=test_pred,
                                metric=TSADMetric.PointAdjustedRecall)
    f1 = evaluator.evaluate(ground_truth=test_labels, predict=test_pred,
                            metric=TSADMetric.PointAdjustedF1)
    mttd = evaluator.evaluate(ground_truth=test_labels, predict=test_pred,
                              metric=TSADMetric.MeanTimeToDetect)
    print(f"Precision: {precision:.4f}")
    print(f"Recall:    {recall:.4f}")
    print(f"F1:        {f1:.4f}")
    print(f"MTTD:      {mttd}")
DefaultDetector Sliding Window Evaluation
 |████████████████████████████████████████| 100.0% Complete, Loss 0.5998
TSADEvaluator:  55%|█████▍    | 2419200/4423680 [00:23<00:19, 101454.83it/s]
 |████████████████████████████████████████| 100.0% Complete, Loss 0.6667
TSADEvaluator: 100%|██████████| 4423680/4423680 [02:09<00:00, 34157.67it/s]
Precision: 0.9522
Recall:    0.8027
F1:        0.8711
MTTD:      0 days 01:22:46

IsolationForest Sliding Window Evaluation
TSADEvaluator: 100%|██████████| 4423680/4423680 [00:27<00:00, 160149.84it/s]
Precision: 0.9666
Recall:    0.8321
F1:        0.8943
MTTD:      0 days 01:40:42

DetectorEnsemble Sliding Window Evaluation
 |████████████████████████████████████████| 100.0% Complete, Loss 0.6002
TSADEvaluator:  55%|█████▍    | 2419200/4423680 [00:28<00:24, 83276.94it/s]
 |████████████████████████████████████████| 100.0% Complete, Loss 0.6532
TSADEvaluator: 100%|██████████| 4423680/4423680 [02:37<00:00, 28002.66it/s]
Precision: 0.9453
Recall:    0.8209
F1:        0.8787
MTTD:      0 days 01:32:18
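Conceptually, the evaluator steps through the test data one cadence window at a time, but only refits the model at the coarser retrain frequency, which is why the retraining progress bars above appear only twice per model. A simplified sketch of that schedule (plain week counters standing in for "1w"/"4w"; this is not Merlion's actual internal loop):

```python
cadence, retrain_freq = 1, 4   # in weeks, mirroring cadence="1w", retrain_freq="4w"
horizon = 12                   # hypothetical total weeks of test data

retrain_points = []
predict_points = []
for week in range(0, horizon, cadence):
    if week % retrain_freq == 0:
        retrain_points.append(week)   # here: model.train(all data seen so far)
    predict_points.append(week)       # here: model.get_anomaly_label(next window)

print(retrain_points)       # [0, 4, 8] -- the model is refit 3 times
print(len(predict_points))  # 12 -- but predictions are made every week
```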