Quickstart

This short guide shows how to run one of the Orion pipelines on one of the signals from the Demo Dataset.

1. Load the data

In the first step we will load the S-1 signal from the Demo Dataset.

We will do so in two parts: the first part is used to fit the pipeline and the second one to detect anomalies.

To load the first part, we need to import the orion.data.load_signal function and call it, passing 'S-1-train' as the signal name.

In [1]: from orion.data import load_signal

In [2]: train_data = load_signal('S-1-train')

In [3]: train_data.head()
Out[3]: 
    timestamp     value
0  1222819200 -0.366359
1  1222840800 -0.394108
2  1222862400  0.403625
3  1222884000 -0.362759
4  1222905600 -0.370746

The output will be a table that contains two columns: timestamp and value.
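
The timestamp values look like Unix epoch seconds; this is an assumption based on their magnitude, since the text above does not state the unit. Under that assumption, the following standard-library sketch converts the five rows shown above into UTC datetimes and checks the sampling interval. The `rows` list is copied by hand from the output above rather than loaded through Orion.

```python
from datetime import datetime, timezone

# First five (timestamp, value) rows, copied from the output above.
rows = [
    (1222819200, -0.366359),
    (1222840800, -0.394108),
    (1222862400, 0.403625),
    (1222884000, -0.362759),
    (1222905600, -0.370746),
]

# Assumption: timestamps are Unix epoch seconds.
datetimes = [datetime.fromtimestamp(ts, tz=timezone.utc) for ts, _ in rows]

# Distinct gaps between consecutive timestamps, in seconds.
intervals = {b[0] - a[0] for a, b in zip(rows, rows[1:])}

print(datetimes[0].isoformat())
print(intervals)
```

Read this way, the sample rows start on 2008-10-01 and are spaced 21600 seconds (six hours) apart.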

2. Detect anomalies using Orion

Once we have the data, let us try to use an Orion pipeline to analyze it and search for anomalies.

In order to do so, we will have to create an instance of the orion.Orion class.

In [4]: from orion import Orion

In [5]: orion = Orion()

Optionally, we might want to select a pipeline other than the default one or alter the hyperparameters of the underlying MLBlocks pipeline.

For example, let's select the lstm_dynamic_threshold pipeline and set some hyperparameters (in this case, the number of training epochs to 5).

In [6]: hyperparameters = {
   ...:     'keras.Sequential.LSTMTimeSeriesRegressor#1': {
   ...:         'epochs': 5
   ...:     }
   ...: }
   ...: 

Then, we simply pass the hyperparameters alongside the pipeline.

In [7]: orion = Orion(
   ...:     pipeline='lstm_dynamic_threshold',
   ...:     hyperparameters=hyperparameters
   ...: )
   ...: 
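
The hyperparameters dictionary used above is nested: the outer key names a block of the pipeline, and the inner dictionary holds the values to override for that block. The `#1` suffix reads as a counter that distinguishes repeated uses of the same primitive within a pipeline; that reading follows MLBlocks naming conventions and is stated here as an assumption. The helper `describe_overrides` below is hypothetical and not part of Orion; it only flattens the dictionary so that each override can be inspected.

```python
hyperparameters = {
    'keras.Sequential.LSTMTimeSeriesRegressor#1': {
        'epochs': 5
    }
}


def describe_overrides(hyperparameters):
    """Hypothetical helper (not part of Orion): flatten the nested dict.

    Returns a list of (primitive, counter, name, value) tuples.
    """
    described = []
    for block_name, overrides in hyperparameters.items():
        # Assumption: block names follow '<primitive>#<counter>'.
        primitive, _, counter = block_name.rpartition('#')
        for name, value in overrides.items():
            described.append((primitive, int(counter), name, value))
    return described


for entry in describe_overrides(hyperparameters):
    print(entry)
```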

Once the pipeline is ready, we can proceed to fit it to our data:

In [8]: orion.fit(train_data)

Once it is fitted, we are ready to use it to detect anomalies in incoming data:

In [9]: new_data = load_signal('S-1-new')

In [10]: anomalies = orion.detect(new_data)

Warning

Depending on your system and the exact versions that you have installed, some warnings may be printed. These can be safely ignored, as they do not interfere with the proper behavior of the pipeline.

The output of the detect call will be a pandas.DataFrame containing a table in the detected anomalies output format described above:

In [11]: anomalies
Out[11]: 
        start         end  severity
0  1398427200  1400198400  0.624933
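
To make the detected interval easier to read, the start and end values can be converted into dates. The sketch below assumes that both columns hold Unix epoch seconds, which the text above does not state explicitly, and copies the single row by hand instead of reading it from the `anomalies` table.

```python
from datetime import datetime, timedelta, timezone

# The single detected interval, copied from the output above.
start, end, severity = 1398427200, 1400198400, 0.624933

# Assumption: start and end are Unix epoch seconds.
start_dt = datetime.fromtimestamp(start, tz=timezone.utc)
end_dt = datetime.fromtimestamp(end, tz=timezone.utc)

# Length of the detected anomalous interval.
duration = end_dt - start_dt

print(start_dt.isoformat(), end_dt.isoformat(), duration)
```

Under this assumption the interval spans 20 days and 12 hours.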

3. Evaluate the performance of your pipeline

In this next step we will load some already known anomalous intervals and evaluate how well our anomaly detection performed by comparing them with our detected intervals.

For this, we will first load the known anomalies for the signal that we are using:

In [12]: from orion.data import load_anomalies

In [13]: ground_truth = load_anomalies('S-1')

In [14]: ground_truth
Out[14]: 
        start         end
0  1398168000  1407823200

The output will be a table with the same start and end columns as the anomalies one, but without the severity column.

Afterwards, we can call the orion.evaluate method, passing both the data in which to detect anomalies and the ground truth:

In [15]: scores = orion.evaluate(new_data, ground_truth)

In [16]: scores
Out[16]: 
accuracy     0.950205
f1           0.310019
recall       0.183445
precision    1.000000
dtype: float64

The output will be a pandas.Series containing a collection of scores indicating how good the predictions were.
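
One way to read the precision, recall and f1 values is as time-overlap metrics between the detected and the known intervals. The sketch below is an assumption about how such scores can be computed, not Orion's actual implementation, and `overlap_scores` is a hypothetical helper. Accuracy is left out because it also depends on the full time span of the signal, which is not shown here.

```python
def overlap_scores(detected, truth):
    """Hypothetical helper: time-overlap precision, recall and f1.

    Both arguments are lists of (start, end) pairs. Intervals within
    each list are assumed not to overlap one another.
    """
    detected_time = sum(end - start for start, end in detected)
    truth_time = sum(end - start for start, end in truth)

    # Total time covered by both a detected and a known interval.
    shared_time = sum(
        max(0, min(d_end, t_end) - max(d_start, t_start))
        for d_start, d_end in detected
        for t_start, t_end in truth
    )

    precision = shared_time / detected_time if detected_time else 0.0
    recall = shared_time / truth_time if truth_time else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Intervals copied from the outputs above.
detected = [(1398427200, 1400198400)]
truth = [(1398168000, 1407823200)]

precision, recall, f1 = overlap_scores(detected, truth)
print(round(precision, 6), round(recall, 6), round(f1, 6))
```

For the intervals above, the detected interval lies entirely inside the known one, so this sketch gives a precision of 1.0, a recall of about 0.1834 and an f1 of about 0.3100, in line with the scores shown. Agreement with Orion in other cases is not guaranteed.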