Orion
The following steps are a short guide to running one of the Orion pipelines on one of the signals from the Demo Dataset.
In the first step we will load the S-1 signal from the Demo Dataset.
We will load it in two parts: the first part is used to fit the pipeline and the second one to detect anomalies.
To load the first part, we import the orion.data.load_signal function and call it, passing 'S-1-train' as the signal name.
In [1]: from orion.data import load_signal

In [2]: train_data = load_signal('S-1-train')

In [3]: train_data.head()
Out[3]:
    timestamp     value
0  1222819200 -0.366359
1  1222840800 -0.394108
2  1222862400  0.403625
3  1222884000 -0.362759
4  1222905600 -0.370746
The output will be a table that contains two columns: timestamp and value.
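To make the two columns easier to read, the sketch below rebuilds the five rows shown above by hand and converts the timestamps to dates. It assumes the timestamp column holds Unix epoch seconds, which the source does not state explicitly; the `sample` and `step` names are only illustrative.

```python
import pandas as pd

# First rows of the S-1-train signal, copied from the output above.
sample = pd.DataFrame({
    "timestamp": [1222819200, 1222840800, 1222862400, 1222884000, 1222905600],
    "value": [-0.366359, -0.394108, 0.403625, -0.362759, -0.370746],
})

# Assumption: timestamps are Unix epoch seconds.
sample["datetime"] = pd.to_datetime(sample["timestamp"], unit="s")

# Spacing between consecutive samples, in seconds.
step = sample["timestamp"].diff().dropna().unique()

print(sample)
print("sampling step (s):", step)
```

Under that assumption, the rows shown are spaced 21600 seconds (6 hours) apart.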
Once we have the data, let us try to use an Orion pipeline to analyze it and search for anomalies.
In order to do so, we will have to create an instance of the orion.Orion class.
In [4]: from orion import Orion

In [5]: orion = Orion()
Optionally, we might want to select a pipeline other than the default one or alter the hyperparameters used by the underlying MLBlocks pipeline.
For example, let's select the lstm_dynamic_threshold pipeline and set some hyperparameters (in this case, the number of training epochs to 5).
In [6]: hyperparameters = {
   ...:     'keras.Sequential.LSTMTimeSeriesRegressor#1': {
   ...:         'epochs': 5
   ...:     }
   ...: }
Then, we simply pass the hyperparameters alongside the pipeline.
In [7]: orion = Orion(
   ...:     pipeline='lstm_dynamic_threshold',
   ...:     hyperparameters=hyperparameters
   ...: )
Once the pipeline is ready, we can proceed to fit it to our data:
In [8]: orion.fit(train_data)
Once it is fitted, we are ready to use it to detect anomalies in incoming data:
In [9]: new_data = load_signal('S-1-new')

In [10]: anomalies = orion.detect(new_data)
Warning
Depending on your system and the exact versions that you have installed, some warnings may be printed. These can be safely ignored, as they do not interfere with the proper behavior of the pipeline.
The output of the previous command will be a pandas.DataFrame containing the detected anomalies, in the output format described above:
In [11]: anomalies
Out[11]:
        start         end  severity
0  1398427200  1400198400  0.624933
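The start and end values are easier to interpret as dates. The following sketch rebuilds the single detected interval shown above by hand, rather than calling Orion, and again assumes that the values are Unix epoch seconds; the `detected_table` and `readable` names are illustrative.

```python
import pandas as pd

# The detected anomaly copied from the output above.
detected_table = pd.DataFrame({
    "start": [1398427200],
    "end": [1400198400],
    "severity": [0.624933],
})

# Assumption: start and end are Unix epoch seconds.
readable = detected_table.assign(
    start=pd.to_datetime(detected_table["start"], unit="s"),
    end=pd.to_datetime(detected_table["end"], unit="s"),
)

# Length of each detected interval as a Timedelta.
readable["duration"] = readable["end"] - readable["start"]

print(readable)
```

Under that assumption, the detected interval spans 20 days and 12 hours.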
In this next step we will load some already known anomalous intervals and evaluate how good our anomaly detection was by comparing those with our detected intervals.
For this, we will first load the known anomalies for the signal that we are using:
In [12]: from orion.data import load_anomalies

In [13]: ground_truth = load_anomalies('S-1')

In [14]: ground_truth
Out[14]:
        start         end
0  1398168000  1407823200
The output will be a table in the same format as the anomalies one.
Afterwards, we can call the orion.evaluate method, passing both the data in which to detect anomalies and the ground truth:
In [15]: scores = orion.evaluate(new_data, ground_truth)

In [16]: scores
Out[16]:
accuracy     0.950205
f1           0.310019
recall       0.183445
precision    1.000000
dtype: float64
The output will be a pandas.Series containing a collection of scores that indicate how good the predictions were.
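To build intuition for the precision, recall and f1 values, the sketch below scores the detected interval against the known one by the amount of time they share. This is a simplified illustration and not necessarily how orion.evaluate computes its scores; the helper functions `total_length` and `overlap_length` are hypothetical, and the intervals are copied by hand from the outputs above. It assumes the intervals within each list do not overlap one another.

```python
def total_length(intervals):
    """Sum of the lengths of (start, end) intervals."""
    return sum(end - start for start, end in intervals)


def overlap_length(detected, known):
    """Total time shared between detected and known intervals."""
    total = 0
    for d_start, d_end in detected:
        for k_start, k_end in known:
            # Overlap of two intervals; zero when they are disjoint.
            total += max(0, min(d_end, k_end) - max(d_start, k_start))
    return total


# Intervals copied from the outputs above.
detected = [(1398427200, 1400198400)]
known = [(1398168000, 1407823200)]

overlap = overlap_length(detected, known)
precision = overlap / total_length(detected)  # share of detected time that is truly anomalous
recall = overlap / total_length(known)        # share of anomalous time that was detected
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.6f} recall={recall:.6f} f1={f1:.6f}")
```

With these inputs the detected interval lies entirely inside the known one, so precision is 1.0, while recall is roughly 0.18 because only a small part of the known anomaly was covered. Accuracy is left out because it would also need the total span of the signal.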