Filter

Status Filter

indsl.filter.status_flag_filter(data: Series, filter_by: Series, int_to_keep: int = 0, align_timesteps: bool = False)

Status flag filter

Filters a data series by a second series of integers that denote different states. A typical example of such a series is a series of 0s and 1s, where 1 indicates the presence of an anomaly. The status flag filter retrieves the indices where the flag equals int_to_keep and matches these to the data series.

Parameters
  • data – Time series.

  • filter_by – Status flag time series. Time series values are expected to be integers. If not, values are cast to integer automatically.

  • int_to_keep – Value. Integer value to keep when filtering. Points of data are returned where filter_by equals this value. Default is 0.

  • align_timesteps – Auto-align. Automatically align the timestamps of the input time series. Default is False.

Returns

Filtered time series

Return type

pandas.Series

Raises

UserRuntimeError – The filtered time series contains no data. This can happen when data or filter_by does not contain sufficient data, or when filter_by contains no values equal to int_to_keep.
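The filtering idea described above can be sketched with plain pandas. This is a minimal illustration, not the InDSL implementation: the helper name keep_where_flag_equals and the sample data are hypothetical, a plain RuntimeError stands in for UserRuntimeError, and timestamp alignment is not covered.

```python
import pandas as pd


def keep_where_flag_equals(data, filter_by, int_to_keep=0):
    """Hypothetical helper: keep points of `data` whose flag equals `int_to_keep`."""
    # Cast flags to integers, then look them up at the timestamps of `data`.
    flags = filter_by.astype(int).reindex(data.index)
    # Timestamps missing from `filter_by` become NaN and never match.
    result = data[flags == int_to_keep]
    if result.empty:
        # Stand-in for the UserRuntimeError raised by the library.
        raise RuntimeError("Filtering returned no data")
    return result


index = pd.date_range("2023-01-01", periods=6, freq="min")
data = pd.Series([10.0, 11.0, 50.0, 12.0, 49.0, 13.0], index=index)
flags = pd.Series([0, 0, 1, 0, 1, 0], index=index)  # 1 marks an anomaly

normal = keep_where_flag_equals(data, flags, int_to_keep=0)
anomalies = keep_where_flag_equals(data, flags, int_to_keep=1)
```

Here normal holds the four points flagged 0 and anomalies holds the two points flagged 1. Asking for a flag value that never occurs raises the error.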

Wavelet Filter

indsl.filter.wavelet_filter(data: Series, level: int = 2, wavelet: WaveletType = WaveletType.DAUBECHIES_8) → Series

Wavelet de-noising

The wavelet approach to filtering industrial data can be very powerful because it uses a dual frequency-time representation of the original signal, which allows noise frequencies to be separated from valuable signal frequencies. For more on wavelet filters and other applications, see https://en.wikipedia.org/wiki/Wavelet

Parameters
  • data – Time series. The data to be filtered. The series must have a pandas.DatetimeIndex.

  • level – Level. The number of wavelet decomposition levels (typically 1 through 6) to use.

  • wavelet – Type. The default is a Daubechies wavelet of order 8 (db8). For other types of wavelets, consult the PyWavelets package. The thresholding methods assume an orthogonal wavelet transform and may not choose the threshold appropriately for biorthogonal wavelets. Orthogonal wavelets are desirable because white noise in the input remains white noise in the sub-bands. Therefore, one should choose one of the db[1-20], sym[2-20] or coif[1-5] type wavelet filters.

Returns

Filtered time series

Return type

pandas.Series
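To make the decompose, threshold and reconstruct idea concrete, here is a small NumPy-only sketch using the Haar wavelet (db1), the simplest orthogonal wavelet. It is not the InDSL implementation. The function names are hypothetical, and the noise estimate (median absolute deviation of the finest detail coefficients) and the universal threshold with soft thresholding are assumptions chosen for illustration.

```python
import numpy as np


def haar_dwt(x):
    """One level of the orthonormal Haar transform (even-length input)."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


def soft_threshold(coeffs, threshold):
    """Shrink coefficients towards zero by `threshold`."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)


def haar_denoise(x, level=2):
    """Hypothetical helper: multi-level Haar de-noising."""
    x = np.asarray(x, dtype=float)
    if x.size % (2 ** level) != 0:
        raise ValueError("length must be divisible by 2**level")
    # Decompose: keep the detail coefficients of every level.
    approx, details = x, []
    for _ in range(level):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    # Assumption: estimate the noise level from the finest details (MAD)
    # and apply the universal threshold sigma * sqrt(2 * log(n)).
    sigma = np.median(np.abs(details[0])) / 0.6745
    threshold = sigma * np.sqrt(2.0 * np.log(x.size))
    # Reconstruct from the coarsest level back to the original resolution.
    for detail in reversed(details):
        approx = haar_idwt(approx, soft_threshold(detail, threshold))
    return approx


rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
clean = np.sin(2.0 * np.pi * 2.0 * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)
denoised = haar_denoise(noisy, level=2)
```

Because the transform is orthogonal, white noise stays white in the detail coefficients, which is what makes a single noise-based threshold reasonable. A smoother wavelet such as db8 would distort a smooth signal less than Haar does.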

Trend Filter

indsl.filter.trend.trend_extraction_hilbert_transform(series: Series, sift_thresh: float = 1e-08, max_num_imfs: Optional[int] = None, error_tolerance: float = 0.05, return_trend: bool = True) → Series

Trend / De-trend signal

Robust method to determine the trend of any non-linear and non-stationary time series based on the Hilbert-Huang Transform and empirical mode decomposition (EMD). This is a mathematically complex method and not easy to document in detail. If you are interested in knowing more about it, see the following list of resources:

The EMD method decomposes a time series into a finite number of oscillatory components, each with a well-defined frequency and amplitude. These are called intrinsic mode functions (IMFs). The process to identify IMFs is called sifting (i.e. filtering). The sift works by iteratively extracting oscillatory components from a signal, starting from the fastest and continuing through to the slowest, until the average envelope of the components is less than the sifting threshold.

The number of components used to build the main trend is selected using the cross energy ratio between IMFs, with the Hilbert-Huang transform used to estimate the spectra. If the ratio is below a given energy tolerance threshold, the process stops and the selected IMFs are added together. The result is the main trend.

As an output, it is possible to select either the trend of the main signal or the de-trended signal.

Parameters
  • series – Time series

  • sift_thresh – Sifting threshold. Threshold to stop the sifting process. This threshold is based on the Cauchy convergence test and represents the residue between two consecutive oscillatory components (IMFs). A small threshold (close to zero) will result in more components extracted. Typically, a few IMFs are enough to build the main trend. Choosing a high threshold might not affect the outcome. Defaults to 1e-8.

  • max_num_imfs – Maximum number of components. Maximum number of oscillatory components (or IMFs) used to estimate the main trend. If no value (None) is defined, the process continues until the sifting threshold is reached. Defaults to None.

  • error_tolerance – Energy tolerance. Threshold for the cross energy ratio validation used for choosing oscillatory components (IMFs). Defaults to 0.05.

  • return_trend – Output trend. Output the trend if True. Remove the trend from the time series if False. Defaults to True.

Returns

Time series

Return type

pandas.Series
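The sifting step described above can be illustrated with a simplified NumPy sketch that extracts only the fastest oscillatory component. It is not the InDSL implementation. The function names are hypothetical, the envelopes use linear interpolation where EMD implementations typically use cubic splines, and the stopping rule (normalised squared change between consecutive sifts) is one common Cauchy-type criterion assumed here for illustration. The selection of IMFs by cross energy ratio is not covered.

```python
import numpy as np


def mean_envelope(h):
    """Mean of upper and lower envelopes, or None if extrema are too few."""
    i = np.arange(1, h.size - 1)
    maxima = i[(h[i] > h[i - 1]) & (h[i] >= h[i + 1])]
    minima = i[(h[i] < h[i - 1]) & (h[i] <= h[i + 1])]
    if maxima.size < 2 or minima.size < 2:
        return None
    samples = np.arange(h.size)
    # Simplification: linear envelopes, held constant beyond the outer extrema.
    upper = np.interp(samples, maxima, h[maxima])
    lower = np.interp(samples, minima, h[minima])
    return 0.5 * (upper + lower)


def sift_one_imf(x, sift_thresh=1e-8, max_iter=20):
    """Hypothetical helper: extract the fastest oscillatory component of x."""
    h = np.asarray(x, dtype=float).copy()
    for _ in range(max_iter):
        mean = mean_envelope(h)
        if mean is None:
            break
        h_new = h - mean
        # Assumed Cauchy-type criterion: relative change between two sifts.
        change = np.sum((h - h_new) ** 2) / np.sum(h ** 2)
        h = h_new
        if change < sift_thresh:
            break
    return h


t = np.linspace(0.0, 1.0, 1000)
fast = np.sin(2.0 * np.pi * 40.0 * t)  # fast oscillation
slow = 2.0 * t                         # slow underlying trend
signal = fast + slow

imf = sift_one_imf(signal)
residual = signal - imf  # what is left after removing the fastest component
```

In this example imf follows the fast oscillation and residual follows the slow trend. A full decomposition would repeat the sift on the residual to obtain progressively slower components.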