This function detects data drift (deviation) by comparing two rolling averages of the signal, one over a short interval and
one over a long interval. The deviation between the short-term and long-term averages is considered significant if it
exceeds a given threshold multiplied by the rolling standard deviation of the long-term average.
Parameters:
data – Time series.
long_interval – Long length.
Length of long term time interval.
short_interval – Short length.
Length of short term time interval.
std_threshold – Threshold.
Parameter that determines if the signal has changed significantly enough to be considered drift. The threshold
is multiplied by the long term rolling standard deviation to take into account the recent condition of the
signal.
detect – Type.
Parameter to determine if the model should detect significant decreases, increases or both. Options are:
“decrease”, “increase”, or “both”. Defaults to “both”.
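The comparison described above can be sketched with pandas rolling windows. This is a minimal illustration, not the library's implementation: the function name `detect_drift` is hypothetical, the intervals are assumed to be integer numbers of samples rather than time intervals, and returning a 0/1 series is an assumption.

```python
import pandas as pd


def detect_drift(data, long_interval, short_interval, std_threshold, detect="both"):
    """Flag samples where the short-term mean departs from the long-term mean.

    Assumption: intervals are given as integer sample counts.
    """
    if detect not in ("increase", "decrease", "both"):
        raise ValueError("detect must be 'increase', 'decrease' or 'both'")

    long_mean = data.rolling(long_interval, min_periods=1).mean()
    short_mean = data.rolling(short_interval, min_periods=1).mean()
    # Rolling standard deviation of the long-term average
    long_std = long_mean.rolling(long_interval, min_periods=2).std()

    deviation = short_mean - long_mean
    limit = std_threshold * long_std  # NaN limits compare as False below

    increase = deviation > limit
    decrease = deviation < -limit
    if detect == "increase":
        flagged = increase
    elif detect == "decrease":
        flagged = decrease
    else:
        flagged = increase | decrease
    return flagged.astype(int)


# A flat signal followed by a step up
signal = pd.Series([0.0] * 100 + [10.0] * 100)
drift = detect_drift(signal, long_interval=50, short_interval=5, std_threshold=3.0)
```

Because the limit scales with the recent variability of the long-term average, a perfectly flat history makes the detector very sensitive to the first deviation.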
This function identifies if a signal contains one or more oscillatory components. It is based on the paper by Sharma et al. [1].
The method uses Linear Predictive Coding (LPC) and is implemented as a 3 step process:
Estimate the LPC coefficients from the prediction polynomial. These are used to estimate a fit to the
data.
Estimate the roots of the LPC coefficients.
Estimate the distance of each root to the unit circle in the complex plane.
If any root lies close to the unit circle (at a distance of less than 0.2), the signal is considered to have an
oscillatory component.
Parameters:
data – Time series
order – Polynomial order.
Order of the prediction polynomial. Defaults to 4.
threshold – Threshold.
Maximum distance of a root to the unit circle for which the signal is considered to have an oscillatory
component. Defaults to 0.2
Returns:
Oscillation region.
Regions where oscillations were detected. Oscillations detected = 1, no detection = 0.
Return type:
pd.Series
Warning
Large variations in sampling time may degrade the performance of the algorithm. The algorithm works best on time series with
a uniform sampling frequency. If the data is non-uniformly sampled, you can use a resampling method to fill in the missing
data.
Raises:
RuntimeError – Length of interpolated data does not match predicted data.
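The three steps of the LPC method can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions: the helper names `lpc_coefficients` and `has_oscillation` are hypothetical, the coefficients are estimated with the autocorrelation (Yule-Walker) method, the data is assumed to be uniformly sampled, and the sketch returns a single flag for the whole signal rather than the oscillation regions the function returns.

```python
import numpy as np


def lpc_coefficients(x, order):
    """Estimate LPC coefficients by solving the Yule-Walker equations R a = r."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Autocorrelation for lags 0..order
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    # Symmetric Toeplitz autocorrelation matrix
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Predictor: x[t] is approximated by sum_k a[k-1] * x[t-k]
    return np.linalg.solve(R, r[1:])


def has_oscillation(x, order=4, threshold=0.2):
    """Return True if any root of the prediction polynomial is near the unit circle."""
    a = lpc_coefficients(x, order)
    # Prediction polynomial: z^p - a1 z^(p-1) - ... - ap
    roots = np.roots(np.concatenate(([1.0], -a)))
    distance = np.abs(1.0 - np.abs(roots))
    return bool(np.any(distance < threshold))


rng = np.random.default_rng(0)
samples = np.arange(2000)
oscillating = np.sin(2 * np.pi * samples / 8) + 0.01 * rng.standard_normal(2000)
print(has_oscillation(oscillating))
```

A nearly pure sinusoid places a conjugate pair of roots almost on the unit circle, while white noise yields small coefficients and roots close to the origin.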
This function detects change points in a time series. The time series is split into “statistically homogeneous” segments using the
ED Pelt change point detection algorithm while observing the minimum distance argument.
Parameters:
data – Time series
min_distance – Minimum distance.
Specifies the minimum point-wise distance for each segment that will be considered in the Change
Point Detection algorithm.
This technique calculates the cumulative sum of positive and negative changes ($g_t^+$ and $g_t^-$) in the data ($x$) and compares them to a threshold.
When this threshold is exceeded, a change is detected ($t_{alarm}$), and the cumulative sum restarts from zero.
To avoid detecting a change when there is no actual change or only a slow drift, the algorithm also takes a drift parameter for drift correction.
Remove extreme standalone outliers before using this technique to get a better result.
Typical uses of this function:
Set the type of series to return to “mean_data” to visualize the smoothed data. Leave the rest of the parameters at their default values.
Adjust the alpha parameter to get the desired smoothing for the data.
Set the type of series to return to “positive_cumulative_sum” or “negative_cumulative_sum” to visualize the cumulative sum of the positive or negative changes.
Adjust the threshold and drift accordingly to get the desired number of change points.
Set the type of series to return to “cusum_binary_result” to visualize the detected changes.
Parameters:
data – Time series.
threshold – Amplitude threshold.
Cumulative changes are compared to this threshold. Defaults to None.
When this is exceeded a change is detected and the cumulative sum restarts from 0.
If the threshold is not provided, it is assigned to 5 * standard_deviation of the data.
drift – Drift term.
Prevents the detection of a change when no change has occurred. Defaults to None.
To reduce the number of false alarms, try increasing the drift.
If the drift is not provided, it is set to (2 * data_standard_deviation - data_mean) / 2.
detect – Type of changes to detect.
Options are:
* “both” for detecting both increasing and decreasing changes in the data (default)
* “increase” for detecting increasing changes in the data
* “decrease” for detecting decreasing changes in the data
predict_ending – Predict end point.
Prolongs the change until the predicted end point. Defaults to True.
If false, single change points are detected.
alpha – Smoothing factor.
Value must satisfy 0 < alpha <= 1. Defaults to 0.05.
return_series_type –
Type of series to return.
Defaults to “cusum_binary_result”.
This option allows the user to visualize the intermediate steps of the algorithm.
Options are:
“cusum_binary_result” returns the cusum results as a binary time series. Change detected = 1, No change detected = 0.
“mean_data” returns the smoothed data.
“positive_cumulative_sum” returns the positive cumulative sum.
“negative_cumulative_sum” returns the negative cumulative sum.
Returns:
Time series.
Specified in the return_series_type parameter.
Return type:
pd.Series
Raises:
UserTypeError – If a time series with the wrong index is provided.
UserValueError – If an empty time series is passed into the function.
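The core CUSUM recursion described above can be sketched as follows. This is a minimal illustration of the classic two-sided formulation, not the library's implementation: the function name `cusum` and its return values are hypothetical, the changes are taken as differences between consecutive samples, and the smoothing step (alpha), the default threshold and drift values, and the end point prediction are all omitted.

```python
import numpy as np


def cusum(x, threshold, drift, detect="both"):
    """Two-sided CUSUM on sample-to-sample changes.

    Returns the alarm indices and the positive and negative cumulative sums.
    """
    x = np.asarray(x, dtype=float)
    g_pos = np.zeros(len(x))
    g_neg = np.zeros(len(x))
    alarms = []
    for t in range(1, len(x)):
        change = x[t] - x[t - 1]
        # Accumulate changes, subtract the drift term, and clip at zero
        g_pos[t] = max(g_pos[t - 1] + change - drift, 0.0)
        g_neg[t] = max(g_neg[t - 1] - change - drift, 0.0)
        up = detect in ("both", "increase") and g_pos[t] > threshold
        down = detect in ("both", "decrease") and g_neg[t] > threshold
        if up or down:
            alarms.append(t)
            # Restart both cumulative sums after an alarm
            g_pos[t] = 0.0
            g_neg[t] = 0.0
    return alarms, g_pos, g_neg


step = [0.0] * 10 + [5.0] * 10
alarms, g_pos, g_neg = cusum(step, threshold=2.0, drift=0.5)
print(alarms)
```

A slow ramp whose increments stay below the drift term never accumulates, which is how the drift parameter suppresses alarms for gradual changes.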
Detect steady state periods in a time series based on a change point detection algorithm. The time series is split
into “statistically homogeneous” segments using the ED Pelt change point detection algorithm. Then each segment is tested with regard
to a normalized standard deviation and the slope of the line of best fit to determine if the segment can be
considered a steady or transient region.
Parameters:
data – Time series.
min_distance – Minimum distance.
Specifies the minimum point-wise distance for each segment that will be considered in the Change
Point Detection algorithm.
var_threshold – Variance threshold.
Specifies the variance threshold. If the normalized variance calculated for a given segment is greater than
the threshold, the segment will be labeled as transient (value = 0).
slope_threshold – Slope threshold.
Specifies the slope threshold. If the slope of a line fitted to the data of a given segment is greater than
10 to the power of the threshold value, the segment will be labeled as transient (value = 0).
Returns:
Binary time series.
Steady state = 1, Transient = 0.
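The per-segment test can be sketched as follows, assuming the segment boundaries have already been found by the change point detection step (which is not reproduced here). The function name `is_steady_segment` is hypothetical, and two details are assumptions rather than documented behavior: the standard deviation is normalized by the absolute mean of the segment, and the slope is compared by magnitude.

```python
import numpy as np


def is_steady_segment(values, var_threshold, slope_threshold):
    """Classify one segment as steady (True) or transient (False)."""
    values = np.asarray(values, dtype=float)
    index = np.arange(len(values))

    # Slope of the least-squares line of best fit
    slope = np.polyfit(index, values, 1)[0]

    # Assumption: normalize the spread by the absolute mean of the segment
    scale = max(abs(values.mean()), np.finfo(float).eps)
    normalized_std = values.std() / scale

    too_variable = normalized_std > var_threshold
    too_steep = abs(slope) > 10.0 ** slope_threshold
    return not (too_variable or too_steep)


flat = 10.0 + 0.01 * np.sin(np.arange(100))
ramp = 10.0 + 0.5 * np.arange(100)
print(is_steady_segment(flat, var_threshold=0.01, slope_threshold=-3))
print(is_steady_segment(ramp, var_threshold=0.01, slope_threshold=-3))
```

Expressing the slope limit as a power of 10 makes it convenient to sweep the threshold across several orders of magnitude.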
The steady state detector is based on the ratio of two variances estimated from the same signal [2]. The algorithm first
filters the data using the factor “Alpha 1” and calculates two variances (long and short term) based on the
parameters “Alpha 2” and “Alpha 3”. The first variance is an exponentially weighted moving variance based on the
difference between the data and the average. The second is also an exponentially weighted moving “variance” but
based on sequential data differences. Larger Alpha values imply that fewer data points are involved in the analysis,
which has the benefit of reducing the time for the identifier to detect a process change (average run length, ARL)
but has the undesired effect of increasing the variability of the results, broadening the distribution and
confounding interpretation. Lower Alpha values undesirably increase the average run length to detection but increase
precision (minimizing Type-I and Type-II statistical errors) by reducing the variability of the distributions
and increasing the signal-to-noise ratio when the signal moves from a transient state (TS) to a steady state (SS).
Parameters:
data – Time series.
ratio_lim – Threshold.
Specifies the variance ratio threshold used to decide whether the signal is in steady state. A variance ratio greater than the
threshold labels the state as transient.
alpha1 – Alpha 1.
Filter factor for the mean. Value should be between 0 and 1. Recommended value is 0.2.
Defaults to 0.2.
alpha2 – Alpha 2.
Filter factor for variance 1. Value should be between 0 and 1. Recommended value is 0.1.
Defaults to 0.1.
alpha3 – Alpha 3.
Filter factor for variance 2. Value should be between 0 and 1. Recommended value is 0.1.
Defaults to 0.1.
Returns:
Binary time series.
Steady state = 0, transient = 1.
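The variance ratio can be sketched with three exponentially weighted filters. This is a minimal illustration under stated assumptions, not the library's implementation: the function names `variance_ratio` and `ssid_labels` are hypothetical, the filters are initialized from the first sample, and the scaling factor (2 - alpha1) follows the commonly cited filter-based formulation of this method, so that the ratio is close to 1 for a steady signal with independent noise.

```python
import numpy as np


def variance_ratio(x, alpha1=0.2, alpha2=0.1, alpha3=0.1):
    """Ratio of two exponentially weighted variance estimates of the same signal."""
    x = np.asarray(x, dtype=float)
    ratio = np.zeros(len(x))
    mean_f = x[0]   # filtered mean
    var_mean = 0.0  # variance from deviations around the filtered mean
    var_diff = 0.0  # "variance" from sequential data differences
    for i in range(1, len(x)):
        # Uses the filtered mean from the previous step
        var_mean = alpha2 * (x[i] - mean_f) ** 2 + (1 - alpha2) * var_mean
        mean_f = alpha1 * x[i] + (1 - alpha1) * mean_f
        var_diff = alpha3 * (x[i] - x[i - 1]) ** 2 + (1 - alpha3) * var_diff
        if var_diff > 0:
            ratio[i] = (2 - alpha1) * var_mean / var_diff
    return ratio


def ssid_labels(x, ratio_lim, alpha1=0.2, alpha2=0.1, alpha3=0.1):
    """Label samples as steady state (0) or transient (1)."""
    ratio = variance_ratio(x, alpha1, alpha2, alpha3)
    return (ratio > ratio_lim).astype(int)


rng = np.random.default_rng(0)
steady = rng.standard_normal(300)
ramp = 2.0 * np.arange(100) + rng.standard_normal(100)
labels = ssid_labels(np.concatenate([steady, ramp]), ratio_lim=2.5)
```

During the ramp, the filtered mean lags the data, so the first variance grows much faster than the difference-based one and the ratio rises well above the threshold.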
This moving average is designed to become flat (constant value) when the data
within the lookup window does not vary significantly. It can also be used as a steady state detector. The calculation is based on
the variability of the signal in a lookup window.
Parameters:
series – Time series.
window_length – Lookup window.
Window length in data points used to estimate the variability of the signal.
Returns:
Moving average.
If the result has the same value as the previous moving average result, the signal can be considered to
be in steady state.