Smooth

Moving Averages

Arnaud Legoux Moving Average

indsl.smooth.alma(data: Series, window: int = 10, sigma: float = 6, offset_factor: float = 0.75)

Arnaud Legoux moving average

A moving average typically used in the financial industry. It aims to strike a good balance between smoothness and responsiveness (i.e. to capture a general smoothed trend without introducing significant lag). It can be interpreted as a Gaussian weighted moving average with an offset, where the offset, spread, and window size are user-defined.

Parameters
  • data – Time series.

  • window – Window size. Defaults to 10 data points, or 10 time steps for uniformly sampled time series.

  • sigma – Sigma. Parameter that controls the width of the Gaussian filter. Defaults to 6.

  • offset_factor – Offset factor. Parameter that controls the magnitude of the weights for each past observation within the window. Defaults to 0.75.

Returns

Smoothed data.

Return type

pandas.Series
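
The Gaussian-with-offset interpretation above can be sketched as follows. This is a minimal illustration of the commonly published ALMA weighting, not necessarily how indsl computes it. The helper names alma_weights and alma_sketch are hypothetical, and the mappings center = offset_factor * (window - 1) and spread = window / sigma are assumptions taken from the usual ALMA definition.

```python
import numpy as np
import pandas as pd


def alma_weights(window: int = 10, sigma: float = 6, offset_factor: float = 0.75) -> np.ndarray:
    """Gaussian weights over the window, oldest observation first (assumed formulation)."""
    center = offset_factor * (window - 1)  # position of the Gaussian peak inside the window
    spread = window / sigma                # larger sigma gives a narrower Gaussian
    idx = np.arange(window)
    weights = np.exp(-((idx - center) ** 2) / (2 * spread ** 2))
    return weights / weights.sum()         # normalize so the weights sum to 1


def alma_sketch(data: pd.Series, window: int = 10, sigma: float = 6,
                offset_factor: float = 0.75) -> pd.Series:
    """Weighted rolling mean; hypothetical helper, not the indsl implementation."""
    weights = alma_weights(window, sigma, offset_factor)
    # raw=True passes each full window as a NumPy array, oldest value first
    return data.rolling(window).apply(lambda w: np.dot(w, weights), raw=True)


series = pd.Series(np.linspace(0.0, 1.0, 50))
smoothed = alma_sketch(series)
```

With the defaults, the peak weight sits closer to the newest observation than to the oldest, which is what keeps the lag small. An offset_factor nearer 1 moves the peak further towards the newest point.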

Autoregressive moving average

indsl.smooth.arma(data: Series, ar_order: int = 2, ma_order: int = 2)

Autoregressive moving average

The autoregressive moving average (ARMA) is a popular model used in forecasting. It uses an autoregression (AR) analysis to characterize the effect of past values on current values, and a moving average (MA) to quantify the effect of the error (variation) of previous time steps.

Parameters
  • data – Time series.

  • ar_order – AR order. Number of past data points to include in the AR model. Defaults to 2.

  • ma_order – MA order. Number of terms in the MA model. Defaults to 2.

Returns

Smoothed data.

Return type

pandas.Series
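
To make the roles of ar_order and ma_order concrete, the sketch below writes out the ARMA recursion for fixed, hand-picked coefficients. It only illustrates the model structure. It does not fit coefficients to data, and the function name arma_recursion is hypothetical rather than part of indsl.

```python
import numpy as np


def arma_recursion(noise, ar_coefs, ma_coefs):
    """x_t = sum_i ar_coefs[i] * x_{t-1-i} + e_t + sum_j ma_coefs[j] * e_{t-1-j}."""
    noise = np.asarray(noise, dtype=float)
    x = np.zeros_like(noise)
    for t in range(len(noise)):
        # AR part: weighted sum of the previous len(ar_coefs) values of the series
        ar_part = sum(c * x[t - 1 - i] for i, c in enumerate(ar_coefs) if t - 1 - i >= 0)
        # MA part: weighted sum of the previous len(ma_coefs) error terms
        ma_part = sum(c * noise[t - 1 - j] for j, c in enumerate(ma_coefs) if t - 1 - j >= 0)
        x[t] = ar_part + noise[t] + ma_part
    return x


# Response to a single unit shock with one AR term and one MA term
impulse = arma_recursion([1.0, 0.0, 0.0, 0.0], ar_coefs=[0.5], ma_coefs=[0.4])
# impulse is approximately [1.0, 0.9, 0.45, 0.225]
```

Here len(ar_coefs) plays the role of ar_order and len(ma_coefs) the role of ma_order.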

Exponential weighted moving average

indsl.smooth.ewma(data: Series, time_window: str = '60min', min_periods: int = 1, adjust: bool = True, max_pt: int = 200, resample: bool = False, resample_window: str = '60min') Series

Exp. weighted moving average

The exponential moving average gives more weight to the more recent observations. The weights fall exponentially as the data point gets older. It reacts more strongly to recent movements than the simple moving average. The moving average value is calculated following the definition y_t = (1−α)*y_{t−1} + α*x_t if adjust = False, or y_t = (x_t + (1−α)*x_{t−1} + (1−α)^2*x_{t−2} + … + (1−α)^t*x_0) / (1 + (1−α) + (1−α)^2 + … + (1−α)^t) if adjust = True.

Parameters
  • data – Time series. Data with a pd.DatetimeIndex.

  • time_window – Time window. Defines how important the current observation is in the calculation of the EWMA. The longer the period, the more slowly it adjusts to reflect current trends. Defaults to ‘60min’. If the user gives a number without a unit (such as ‘60’), it will be considered as the number of minutes. Accepted string format: ‘3w’, ‘10d’, ‘5h’, ‘30min’, ‘10s’. The time window is converted to the number of points for each of the windows. Each time window may have a different number of points if the time series is not regular. The number of points specifies the decay of the exponential weights in terms of span: α = 2/(span+1), for span ≥ 1.

  • min_periods – Minimum number of data points. Minimum number of data points inside a time window required to have a value (otherwise result is NA). Defaults to 1. If min_periods > 1 and adjust is False, the SMA is computed for the first observation.

  • adjust – Adjust. If true, the exponential function is calculated using weights w_i = (1−α)^i. If false, the exponential function is calculated recursively using y_t = (1−α)*y_{t−1} + α*x_t. Defaults to True.

  • max_pt – Maximum number of data points. Sets the maximum number of points to consider in a window if adjust = True. A high number of points will require more time to execute. Defaults to 200.

  • resample – Resample. If true, resamples the calculated exponential moving average series. Defaults to False.

  • resample_window – Resampling window. Time window used to resample the calculated exponential moving average series. Defaults to ‘60min’.

Returns

Smoothed time series.

Return type

pandas.Series
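
The two definitions selected by adjust can be checked numerically. The sketch below computes both by hand for a fixed span and compares them with pandas' Series.ewm. It assumes a plain point-based span and leaves out the time-window-to-points conversion described above, so it is an illustration of the formulas rather than of the full indsl function.

```python
import numpy as np
import pandas as pd

x = pd.Series([1.0, 2.0, 4.0, 8.0])
span = 3
alpha = 2 / (span + 1)  # alpha = 0.5 for span = 3

# adjust=False: recursive form, seeded with the first observation
recursive = [x.iloc[0]]
for value in x.iloc[1:]:
    recursive.append((1 - alpha) * recursive[-1] + alpha * value)

# adjust=True: weighted mean with weights (1 - alpha)^i, where i = 0 is the newest point
weighted = []
for t in range(len(x)):
    w = (1 - alpha) ** np.arange(t + 1)       # weights for x_t, x_{t-1}, ..., x_0
    past = x.iloc[: t + 1].to_numpy()[::-1]   # newest first, to line up with w
    weighted.append(np.dot(w, past) / w.sum())

# The same quantities computed by pandas
pandas_recursive = x.ewm(span=span, adjust=False).mean()
pandas_weighted = x.ewm(span=span, adjust=True).mean()
```

The two variants differ most at the start of the series, where few observations are available, and converge as more points enter the average.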

Simple moving average

indsl.smooth.sma(data: Series, time_window: str = '60min', min_periods: int = 1) Series

Simple moving average

A plain average: the sum of the values of the observations in a time_window divided by the number of observations in that time_window. SMA time series are much less noisy than the original time series. However, SMA time series lag the original time series, which means that changes in the trend are only seen with a delay (lag) of time_window/2.

Parameters
  • data – Time series.

  • time_window – Window. Length of the time period to compute the average. Defaults to ‘60min’. Accepted string format: ‘3w’, ‘10d’, ‘5h’, ‘30min’, ‘10s’. If the user gives a number without a unit (such as ‘60’), it will be considered as the number of minutes.

  • min_periods – Minimum samples. Minimum number of observations in window required to have a value (otherwise result is NA). Defaults to 1.

Returns

Smoothed time series.

Return type

pandas.Series
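
The effect of time_window and min_periods can be illustrated with pandas' time-based rolling window. This is a sketch of the technique. Whether indsl calls pandas in exactly this way is an assumption.

```python
import pandas as pd

index = pd.date_range("2024-01-01 00:00", periods=4, freq="30min")
data = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)

# Each window covers the 60 minutes ending at (and including) the current timestamp
sma_60min = data.rolling("60min", min_periods=1).mean()
# values: 1.0, 1.5, 2.5, 3.5

# Requiring two observations leaves the first point undefined (NA)
sma_strict = data.rolling("60min", min_periods=2).mean()
```

With 30-minute sampling, a 60-minute window holds at most two points, so each output value is the mean of the current and the previous observation.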

Linear weighted moving average

indsl.smooth.lwma(data: Series, time_window: str = '60min', min_periods: int = 1, resample: bool = False, resample_window: str = '60min') Series

Linear weighted moving average

The linear weighted moving average gives more weight to the more recent observations and gradually less to the older ones.

Parameters
  • data – Time series.

  • time_window – Time window. Length of the time period to compute the rolling mean. Defaults to ‘60min’. If the user gives a number without a unit (such as ‘60’), it will be considered as the number of minutes. Accepted string format: ‘3w’, ‘10d’, ‘5h’, ‘30min’, ‘10s’.

  • min_periods – Minimum samples. Minimum number of observations in the time window required to estimate a value (otherwise result is NA). Defaults to 1.

  • resample – Resample. If true, resamples the calculated linear weighted moving average series. Defaults to False.

  • resample_window – Resampling window. Time window used to resample the calculated linear weighted moving average series. Defaults to ‘60min’.

Returns

Smoothed time series.

Return type

pandas.Series
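
A linear weighting scheme can be sketched with a fixed number of points per window. The helper lwma_sketch is hypothetical and uses a point count instead of a time window, so it shows the weighting idea only, not the indsl implementation.

```python
import numpy as np
import pandas as pd


def lwma_sketch(data: pd.Series, n_points: int) -> pd.Series:
    """Rolling mean with weights 1, 2, ..., n_points (newest point weighted most)."""
    weights = np.arange(1, n_points + 1, dtype=float)
    # raw=True passes each full window as a NumPy array, oldest value first
    return data.rolling(n_points).apply(
        lambda w: np.dot(w, weights) / weights.sum(), raw=True
    )


values = pd.Series([1.0, 2.0, 3.0, 10.0])
lwma = lwma_sketch(values, n_points=3)
sma = values.rolling(3).mean()
```

At the final point, where the series jumps to 10, the linearly weighted average (38/6) is larger than the simple average (5) because the newest observation carries the largest weight.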

Frequency Based (low-pass filters)

Butterworth

indsl.smooth.butterworth(data: Series, N: int = 50, Wn: float = 0.1, btype: Literal['lowpass', 'highpass'] = 'lowpass')

Butterworth

This is a signal processing filter designed to have a frequency response as flat as possible in the passband and to roll off towards zero in the stopband. In other words, it is designed to leave the signal in the passband largely unmodified and to attenuate the signal in the stopband as much as possible. At the moment, only low-pass and high-pass filtering are supported.

Parameters
  • data – Time series.

  • N – Order. Defaults to 50.

  • Wn – Critical frequency. Number between 0 and 1, with 1 representing one-half of the sampling rate (Nyquist frequency). Defaults to 0.1.

  • btype – Filter type. The options are: “lowpass” and “highpass”. Defaults to “lowpass”.

Returns

Filtered signal.

Return type

pandas.Series
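
The passband and stopband behaviour can be illustrated with SciPy's Butterworth design. The sketch below is an assumption about how such a filter might be applied, not the indsl implementation. It uses a low order (4) and zero-phase filtering with second-order sections, both chosen here for illustration.

```python
import numpy as np
import pandas as pd
from scipy import signal

n = np.arange(1000)
slow = np.sin(2 * np.pi * n / 200)        # 0.01 of Nyquist: inside the passband
fast = 0.5 * np.sin(2 * np.pi * n / 4)    # 0.5 of Nyquist: inside the stopband
data = pd.Series(slow + fast)

# Low-pass design with critical frequency Wn = 0.1 (fraction of the Nyquist frequency)
sos = signal.butter(4, 0.1, btype="lowpass", output="sos")

# Forward-backward filtering avoids a phase shift in the output
filtered = pd.Series(signal.sosfiltfilt(sos, data.to_numpy()), index=data.index)
```

Away from the edges of the series, the filtered signal follows the slow component closely while the fast component is almost entirely removed.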

Chebyshev

indsl.smooth.chebyshev(data: Series, filter_type: int = 1, N: int = 10, rp: float = 0.1, Wn: float = 0.1, btype: str = 'lowpass')

Chebyshev (I, II)

Chebyshev filters are analog or digital filters having a steeper roll-off than Butterworth filters, and have passband ripple (type I) or stopband ripple (type II). Chebyshev filters have the property that they minimize the error between the idealized and the actual filter characteristic over the range of the filter but with ripples in the passband (Wikipedia).

Parameters
  • data – Time series.

  • filter_type – Filter type. Options are 1 or 2. Defaults to 1.

  • N – Order. Defaults to 10.

  • rp – Maximum ripple. Maximum ripple allowed below unity gain in the passband. Defaults to 0.1.

  • Wn – Critical frequency. Number between 0 and 1, with 1 representing one-half of the sampling rate (Nyquist frequency). Defaults to 0.1.

  • btype – Filter type. The options are: “lowpass” and “highpass”. Defaults to “lowpass”.

Returns

Filtered signal.

Return type

pandas.Series
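
The meaning of the ripple parameter can be checked on a type I design with SciPy. In scipy.signal.cheby1 the ripple rp is given in decibels. Treating the rp of this function the same way is an assumption, as are the low order (4) and the second-order-section form used here.

```python
import numpy as np
from scipy import signal

rp = 0.1   # maximum passband ripple below unity gain, in dB (SciPy convention)
wn = 0.1   # critical frequency as a fraction of the Nyquist frequency

sos = signal.cheby1(4, rp, wn, btype="lowpass", output="sos")

# Frequency response: w is in rad/sample on [0, pi)
w, h = signal.sosfreqz(sos, worN=2048)
gain_db = 20 * np.log10(np.abs(h))

passband = (w / np.pi) < wn
worst_passband_dip = gain_db[passband].min()    # stays within rp dB of unity gain
highest_passband_gain = gain_db[passband].max() # does not exceed unity gain (0 dB)
```

Inside the passband the gain oscillates between 0 dB and -rp dB, which is the type I ripple described above. Beyond Wn the gain falls off steeply.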

Savitzky-Golay

indsl.smooth.sg(data: Series, window_length: int = None, polyorder: int = 1) Series

Savitzky-Golay

Use this filter for smoothing data without distorting the data tendency. The method is independent of the sampling frequency, hence simple and robust to apply on data with non-uniform sampling. If working with high-frequency data (e.g. sampling frequency above roughly 1 Hz), we recommend providing the filter window length and polynomial order parameters to suit the requirements. Otherwise, if no parameters are provided, the function will estimate and set the parameters based on the characteristics of the input time series (e.g. sampling frequency).

Parameters
  • data – Time series.

  • window_length – Window. Point-wise length of the filter window (i.e. number of data points). A large window results in a stronger smoothing effect and vice versa. If window_length is not defined by the user, a length of about 1/5 of the length of the time series is set.

  • polyorder – Polynomial order. Order of the polynomial used to fit the samples. Must be less than filter window length. Defaults to 1. Hint: A small polyorder (e.g. polyorder = 1) results in a stronger data smoothing effect, representing the dominating trend and attenuating data fluctuations.

Returns

Smoothed time series.

Return type

pandas.Series

Raises
  • UserValueError – The window length must be a positive odd integer.

  • UserValueError – The window length must be less than or equal to the number of data points in your time series.

  • UserValueError – The polynomial order must be less than the window length.
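
The interplay of window_length and polyorder can be sketched with scipy.signal.savgol_filter. The default window rule below (about 1/5 of the series length, bumped to the next odd integer) is a hypothetical reading of the description above, not the exact rule used by indsl.

```python
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
t = np.arange(200)
trend = 0.05 * t
data = pd.Series(trend + rng.normal(scale=0.5, size=t.size))

# Hypothetical default: about 1/5 of the series length, made odd if necessary
window_length = len(data) // 5
if window_length % 2 == 0:
    window_length += 1

# polyorder=1 fits a straight line in each window, giving strong smoothing
smoothed = pd.Series(
    savgol_filter(data.to_numpy(), window_length, polyorder=1),
    index=data.index,
)
```

Because each window is fitted with a first-order polynomial, a purely linear trend passes through unchanged while the noise around it is averaged out.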