Smooth

Moving Averages

Arnaud Legoux Moving Average

indsl.smooth.alma(data: Series, window: int = 10, sigma: float = 6, offset_factor: float = 0.75) → Series

Arnaud Legoux moving average.

Moving average typically used in the financial industry, which aims to strike a good balance between smoothness and responsiveness (i.e., capture a general smoothed trend without allowing for significant lag). It can be interpreted as a Gaussian weighted moving average with an offset, where the offset, spread, and window size are user-defined.

Parameters:
  • data – Time series.

  • window – Window size. Defaults to 10 data points, or 10 time steps for uniformly sampled time series.

  • sigma – Sigma. Parameter that controls the width of the Gaussian filter. Defaults to 6.

  • offset_factor – Offset factor. Parameter that controls the magnitude of the weights for each past observation within the window. Defaults to 0.75.

Returns:

Smoothed data.

Return type:

pandas.Series
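
The Gaussian weighting described above can be sketched in plain Python. This is a minimal illustration of a commonly cited ALMA formulation, not the library's implementation: the helper names alma_weights and alma_sketch are hypothetical, and the peak position m = offset_factor * (window - 1) and spread s = window / sigma are assumptions about how the parameters map to the Gaussian.

```python
import math

def alma_weights(window=10, sigma=6.0, offset_factor=0.75):
    """Normalized Gaussian weights, oldest observation first (hypothetical helper)."""
    m = offset_factor * (window - 1)  # assumed position of the Gaussian peak
    s = window / sigma                # assumed spread of the Gaussian
    raw = [math.exp(-((i - m) ** 2) / (2 * s ** 2)) for i in range(window)]
    total = sum(raw)
    return [w / total for w in raw]

def alma_sketch(values, window=10, sigma=6.0, offset_factor=0.75):
    """Weighted average over every full trailing window; the first window-1 points are skipped."""
    weights = alma_weights(window, sigma, offset_factor)
    return [
        sum(w * x for w, x in zip(weights, values[end - window:end]))
        for end in range(window, len(values) + 1)
    ]

weights = alma_weights()
smoothed = alma_sketch([float(i % 5) for i in range(30)])
```

With the default offset_factor of 0.75, the largest weight falls on a recent but not the most recent observation, which is how the method trades responsiveness against smoothness.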

Autoregressive moving average

indsl.smooth.arma(data: Series, ar_order: int = 2, ma_order: int = 2) → Series

Autoregressive moving average.

The autoregressive moving average (ARMA) is a popular model used in forecasting. It uses an autoregressive (AR) component to characterize the effect of past values on current values and a moving average (MA) component to quantify the effect of past errors (random variation).

Parameters:
  • data – Time series.

  • ar_order – AR order. Number of past data points to include in the AR model. Defaults to 2.

  • ma_order – MA order. Number of terms in the MA model. Defaults to 2.

Returns:

Smoothed data.

Return type:

pandas.Series
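
For reference, the textbook ARMA(p, q) model can be written as below. Reading p as ar_order and q as ma_order is an assumption based on the parameter descriptions; the symbols c (constant), φ_i (AR coefficients), θ_j (MA coefficients), and ε_t (error term) are standard notation and do not appear in the original text.

```latex
x_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i \, x_{t-i} + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
```

With the defaults ar_order = 2 and ma_order = 2, each sum has two terms.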

Exponential weighted moving average

indsl.smooth.ewma(data: Series, time_window: Timedelta = Timedelta('0 days 01:00:00'), min_periods: int = 1, adjust: bool = True, max_pt: int = 200, resample: bool = False, resample_window: Timedelta = Timedelta('0 days 01:00:00')) → Series

Exponential weighted moving average.

The exponential moving average gives more weight to the more recent observations. The weights fall exponentially as the data point gets older. It reacts more strongly to recent movements than the simple moving average. The moving average value is calculated following the definition y_t = (1−α) y_{t−1} + α x_t if adjust = False, or y_t = (x_t + (1−α) x_{t−1} + (1−α)^2 x_{t−2} + … + (1−α)^t x_0) / (1 + (1−α) + (1−α)^2 + … + (1−α)^t) if adjust = True.

Parameters:
  • data – Time series. Data with a pd.DatetimeIndex.

  • time_window – Time window. Defines how important the current observation is in the calculation of the EWMA. The longer the period, the more slowly it adjusts to reflect current trends. Defaults to ‘60min’. Accepted string format: ‘3w’, ‘10d’, ‘5h’, ‘30min’, ‘10s’. The time window is converted to the number of points in each window. Each time window may have a different number of points if the time series is not regular. The number of points specifies the decay of the exponential weights in terms of span: α = 2/(span+1), for span ≥ 1.

  • min_periods – Minimum number of data points. Minimum number of data points inside a time window required to have a value (otherwise result is NA). Defaults to 1. If min_periods > 1 and adjust is False, the SMA is computed for the first observation.

  • adjust – Adjust. If true, the exponential function is calculated using weights w_i = (1−α)^i. If false, the exponential function is calculated recursively using y_t = (1−α) y_{t−1} + α x_t. Defaults to True.

  • max_pt – Maximum number of data points. Sets the maximum number of points to consider in a window if adjust = True. A high number of points will require more time to execute. Defaults to 200.

  • resample – Resample. If true, resamples the calculated exponential moving average series. Defaults to False.

  • resample_window – Resampling window. Time window used to resample the calculated exponential moving average series. Defaults to ‘60min’.

Returns:

Smoothed time series.

Return type:

pandas.Series
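
The two definitions above (adjust = True and adjust = False) can be compared with a small pure-Python sketch. It is an illustration only, not the library's code: the function name ewma_sketch is hypothetical, it takes a point-based span directly instead of converting a time_window, it ignores max_pt, and it seeds the recursive form with the first observation.

```python
def ewma_sketch(values, span, adjust=True):
    """Exponentially weighted moving average for a point-based span >= 1."""
    if not values:
        return []
    alpha = 2.0 / (span + 1.0)  # decay expressed in terms of span
    out = []
    if adjust:
        # Running numerator and denominator of the weighted average
        # with weights w_i = (1 - alpha)^i.
        num, den = 0.0, 0.0
        for x in values:
            num = x + (1.0 - alpha) * num
            den = 1.0 + (1.0 - alpha) * den
            out.append(num / den)
    else:
        # Recursive form y_t = (1 - alpha) * y_{t-1} + alpha * x_t,
        # seeded with y_0 = x_0 (an assumption of this sketch).
        y = values[0]
        out.append(y)
        for x in values[1:]:
            y = (1.0 - alpha) * y + alpha * x
            out.append(y)
    return out

adjusted = ewma_sketch([1.0, 2.0, 3.0], span=3, adjust=True)
recursive = ewma_sketch([1.0, 2.0, 3.0], span=3, adjust=False)  # [1.0, 1.5, 2.25]
```

With span = 3, α = 0.5. The adjusted form tracks the ramp more closely in the first few points because it normalizes by the sum of the weights seen so far.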

Simple moving average

indsl.smooth.sma(data: Series, time_window: Timedelta = Timedelta('0 days 01:00:00'), min_periods: int = 1) → Series

Simple moving average (SMA).

Plain simple average that computes the sum of the values of the observations in a time_window divided by the number of observations in the time_window. SMA time series are much less noisy than the original time series. However, SMA time series lag the original time series, which means that changes in the trend are only seen with a delay (lag) of time_window/2.

Parameters:
  • data – Time series.

  • time_window – Window. Length of the time period to compute the average. Defaults to ‘60min’. Accepted string format: ‘3w’, ‘10d’, ‘5h’, ‘30min’, ‘10s’.

  • min_periods – Minimum samples. Minimum number of observations in window required to have a value (otherwise result is NA). Defaults to 1.

Returns:

Smoothed time series.

Return type:

pandas.Series
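
A time-windowed simple average can be sketched with the standard library alone. This is a hedged illustration, not the library's implementation: the function name sma_sketch is hypothetical, timestamps are assumed to be sorted, and the trailing window is assumed to be open on the left and closed on the right, i.e. (t − time_window, t].

```python
from datetime import datetime, timedelta

def sma_sketch(timestamps, values, time_window=timedelta(hours=1), min_periods=1):
    """Trailing mean over (t - time_window, t]; None where too few points."""
    out = []
    start = 0       # index of the oldest point still inside the window
    running = 0.0   # sum of the values currently inside the window
    for i, (t, x) in enumerate(zip(timestamps, values)):
        running += x
        # Drop points that have fallen out of the window.
        while timestamps[start] <= t - time_window:
            running -= values[start]
            start += 1
        count = i - start + 1
        out.append(running / count if count >= min_periods else None)
    return out

t0 = datetime(2023, 1, 1)
stamps = [t0 + timedelta(minutes=20 * k) for k in range(5)]
averaged = sma_sketch(stamps, [1.0, 2.0, 3.0, 4.0, 5.0])
```

Because the window is defined in time, the number of observations per window varies when the series is irregularly sampled; min_periods guards against averages built from too few points.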

Linear weighted moving average

indsl.smooth.lwma(data: Series, time_window: Timedelta = Timedelta('0 days 01:00:00'), min_periods: int = 1, resample: bool = False, resample_window: Timedelta = Timedelta('0 days 01:00:00')) → Series

Linear weighted moving average.

The linear weighted moving average gives more weight to the more recent observations and gradually less to the older ones.

Parameters:
  • data – Time series.

  • time_window – Time window. Length of the time period to compute the rolling mean. Defaults to ‘60min’. A number given without a unit (such as ‘60’) is interpreted as a number of minutes. Accepted string format: ‘3w’, ‘10d’, ‘5h’, ‘30min’, ‘10s’.

  • min_periods – Minimum samples. Minimum number of observations in the time window required to estimate a value (otherwise result is NA). Defaults to 1.

  • resample – Resample. If true, resamples the calculated linear weighted moving average series. Defaults to False.

  • resample_window – Resampling window. Time window used to resample the calculated linear weighted moving average series. Defaults to ‘60min’.

Returns:

Smoothed time series.

Return type:

pandas.Series
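
The linear weighting can be illustrated with a short sketch. It is a simplification under stated assumptions, not the library's code: the function name lwma_sketch is hypothetical, the window is a number of points instead of a time period, and the weights are assumed to be 1, 2, …, n from the oldest to the newest observation in the window.

```python
def lwma_sketch(values, window):
    """Linearly weighted trailing mean over at most `window` points."""
    out = []
    for end in range(1, len(values) + 1):
        chunk = values[max(0, end - window):end]
        weights = range(1, len(chunk) + 1)  # oldest gets 1, newest gets len(chunk)
        out.append(sum(w * x for w, x in zip(weights, chunk)) / sum(weights))
    return out

weighted = lwma_sketch([1.0, 2.0, 3.0], window=3)
```

For the last point the result is (1·1 + 2·2 + 3·3) / (1 + 2 + 3) = 14/6, which lies closer to the newest value than the plain mean of 2.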

Frequency Based (low-pass filters)

Butterworth

indsl.smooth.butterworth(data: Series, N: int = 50, Wn: float = 0.1, btype: Literal['lowpass', 'highpass'] = 'lowpass') → Series

Butterworth.

This signal processing filter is designed to have a frequency response that is as flat as possible in the passband and rolls off towards zero in the stopband. In other words, the filter is designed to leave the signal in the passband largely unchanged and to attenuate the signal in the stopband as much as possible. At the moment, only low-pass and high-pass filtering are supported.

Parameters:
  • data – Time series.

  • N – Order. Defaults to 50.

  • Wn – Critical frequency. Number between 0 and 1, with 1 representing one-half of the sampling rate (Nyquist frequency). Defaults to 0.1.

  • btype – Filter type. The options are: “lowpass” and “highpass”. Defaults to “lowpass”.

Returns:

Filtered signal.

Return type:

pandas.Series
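
The "flat passband, roll-off in the stopband" behaviour can be seen from the magnitude response of the textbook analog Butterworth low-pass prototype, |H(ω)| = 1 / sqrt(1 + (ω/ω_c)^(2N)). The sketch below only evaluates that formula; the function name is hypothetical, and the digital filter applied by the library to a normalized Wn will differ in detail.

```python
import math

def butterworth_gain(freq, cutoff, order):
    """Magnitude response of the analog Butterworth low-pass prototype."""
    return 1.0 / math.sqrt(1.0 + (freq / cutoff) ** (2 * order))

# Gain at a quarter of, exactly at, and at twice the cutoff for two orders.
for order in (2, 8):
    gains = [butterworth_gain(f, 1.0, order) for f in (0.25, 1.0, 2.0)]
    print(order, [round(g, 4) for g in gains])
```

At the cutoff the gain is always 1/sqrt(2) (about −3 dB) regardless of order; a higher order keeps the passband closer to 1 and makes the stopband fall off faster.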

Chebyshev

indsl.smooth.chebyshev(data: Series, filter_type: int = 1, N: int = 10, rp: float = 0.1, Wn: float = 0.1, btype: str = 'lowpass') → Series

Chebyshev (I, II).

Chebyshev filters are analog or digital filters having a steeper roll-off than Butterworth filters, and have passband ripple (type I) or stopband ripple (type II). Chebyshev filters have the property that they minimize the error between the idealized and the actual filter characteristic over the range of the filter but with ripples in the passband (Wikipedia).

Parameters:
  • data – Time series.

  • filter_type – Filter type. Options are 1 or 2. Defaults to 1.

  • N – Order. Defaults to 10.

  • rp – Maximum ripple. Maximum ripple allowed below unity gain in the passband. Defaults to 0.1.

  • Wn – Critical frequency. Number between 0 and 1, with 1 representing one-half of the sampling rate (Nyquist frequency). Defaults to 0.1.

  • btype – Filter type. The options are: “lowpass” and “highpass”. Defaults to “lowpass”.

Returns:

Filtered signal.

Return type:

pandas.Series
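
The passband ripple of a type I filter can be illustrated with the textbook analog prototype, |H(ω)| = 1 / sqrt(1 + ε² T_N²(ω/ω_c)), where T_N is the Chebyshev polynomial of the first kind. The sketch assumes that rp is expressed in decibels, as in SciPy's cheby1, so that ε = sqrt(10^(rp/10) − 1); the text above does not state the unit, and the function names are hypothetical.

```python
import math

def chebyshev_poly(n, x):
    """Chebyshev polynomial of the first kind, T_n(x), for x >= 0."""
    if x <= 1.0:
        return math.cos(n * math.acos(x))
    return math.cosh(n * math.acosh(x))

def cheby1_gain(ratio, order, rp_db):
    """Magnitude response of the analog type I low-pass prototype.

    ratio is frequency / cutoff; rp_db is the passband ripple in dB (assumed unit).
    """
    eps = math.sqrt(10.0 ** (rp_db / 10.0) - 1.0)
    return 1.0 / math.sqrt(1.0 + (eps * chebyshev_poly(order, ratio)) ** 2)

ripple_floor = 10.0 ** (-0.1 / 20.0)  # lowest passband gain for rp = 0.1 dB
passband = [cheby1_gain(r / 100.0, 10, 0.1) for r in range(101)]
```

Inside the passband the gain oscillates between 1 and the ripple floor; beyond the cutoff it decreases monotonically.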

Savitzky-Golay

indsl.smooth.sg(data: Series, window_length: int | None = None, polyorder: int = 1) → Series

Savitzky-Golay.

Use this filter to smooth data without distorting the underlying trend. The method is independent of the sampling frequency. Hence, it is simple and robust to apply to data with non-uniform sampling. If you work with high-frequency data (e.g., a sampling frequency above roughly 1 Hz), we recommend that you provide the filter window length and polynomial order parameters to suit the requirements. Otherwise, if no parameters are provided, the function will estimate and set the parameters based on the characteristics of the input time series (e.g., sampling frequency).

Parameters:
  • data – Time series.

  • window_length – Window. Point-wise length of the filter window (i.e., number of data points). A large window results in a stronger smoothing effect and vice-versa. If the filter window_length is not defined by the user, a length of about 1/5 of the length of time series is set.

  • polyorder – Polynomial order. Order of the polynomial used to fit the samples. Must be less than filter window length. Defaults to 1. Hint: A small polyorder (e.g., polyorder = 1) results in a stronger data smoothing effect, representing the dominating trend and attenuating data fluctuations.

Returns:

Smoothed time series.

Return type:

pandas.Series

Raises:
  • UserValueError – The window length must be a positive odd integer

  • UserValueError – The window length must be less than or equal to the number of data points in your time series

  • UserValueError – The polynomial order must be less than the window length
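
The idea behind the filter, fitting a low-order polynomial to each window and keeping the fitted value, can be sketched for the default polyorder = 1. This is an illustration only, not the library's implementation: the function names are hypothetical, points are treated as equally spaced, edge points reuse the fit of the nearest full window, and plain ValueError stands in for the UserValueError cases listed above.

```python
def fit_line(chunk):
    """Least-squares line through an odd-length chunk, centred on its middle point."""
    half = len(chunk) // 2
    offsets = range(-half, half + 1)
    intercept = sum(chunk) / len(chunk)  # fitted value at the centre
    slope = sum(k * y for k, y in zip(offsets, chunk)) / sum(k * k for k in offsets)
    return intercept, slope

def savgol_linear_sketch(values, window_length):
    """Savitzky-Golay style smoothing with a first-order polynomial."""
    if window_length < 3 or window_length % 2 == 0:
        raise ValueError("window_length must be an odd integer of at least 3")
    if window_length > len(values):
        raise ValueError("window_length must not exceed the number of data points")
    half = window_length // 2
    n = len(values)
    out = []
    for i in range(n):
        # Clamp the window centre so that edge points use the nearest full window.
        centre = min(max(i, half), n - 1 - half)
        intercept, slope = fit_line(values[centre - half:centre + half + 1])
        out.append(intercept + slope * (i - centre))
    return out

spike = savgol_linear_sketch([0.0, 0.0, 3.0, 0.0, 0.0], window_length=3)
ramp = savgol_linear_sketch([2.0 * i + 1.0 for i in range(9)], window_length=5)
```

A straight line passes through unchanged, while an isolated spike is spread over its neighbours, which is why a low polyorder keeps the dominant trend and attenuates fluctuations.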