Resample

Interpolate and Resample

indsl.resample.interpolate(data: Series, method: Literal['linear', 'ffill', 'stepwise', 'zero', 'slinear', 'quadratic', 'cubic'] = 'linear', kind: Literal['pointwise', 'average'] = 'pointwise', granularity: Timedelta = Timedelta('0 days 00:00:01'), bounded: bool = True) DataFrame | Series

Interpolation.

This function interpolates and resamples data with a uniform sampling frequency.

Parameters:
  • data – Time series.

  • method

    Method. Specifies the interpolation method. Defaults to “linear”. Possible inputs are:

    • ’linear’: linear interpolation.

    • ’ffill’: forward filling.

    • ’stepwise’: yields same result as ffill.

    • ’zero’, ’slinear’, ’quadratic’, ’cubic’: spline interpolation of zeroth, first, second or third order, respectively.

  • kind

    Kind. Specifies the kind of returned data points. Defaults to “pointwise”. Possible inputs are:

    • ’pointwise’: returns the pointwise value of the interpolated function for each timestamp.

    • ’average’: returns the average of the interpolated function within each time period.

  • granularity – Frequency. Sampling frequency or granularity of the output (e.g. ‘1s’ or ‘2h’). Defaults to “1s”.

  • bounded

    Bounded. Specifies behaviour for requested points outside of the data range. Defaults to True.

    • True: Extrapolate for requested points outside of the data range.

    • False: Ignore points outside of the data range.

Returns:

Interpolated time series.

Return type:

pandas.Series

Raises:
  • UserTypeError – data is not a time series

  • Warning – Empty data time series

  • Warning – All data in timeseries is nan
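The idea of interpolating onto a uniform grid can be sketched with plain pandas. This is only an illustration of the "linear" and "ffill" methods with pointwise output, not the InDSL implementation; the helper name `to_uniform_grid` and the sample data are assumptions made for this example.

```python
import pandas as pd


def to_uniform_grid(data, granularity, method="linear"):
    """Resample `data` onto an evenly spaced grid (hypothetical helper, not InDSL code)."""
    # Evenly spaced timestamps covering the range of the input data.
    grid = pd.date_range(data.index[0], data.index[-1], freq=granularity)
    # Put original samples and grid points on one index; new grid points start as NaN.
    combined = data.reindex(data.index.union(grid))
    if method == "ffill":
        # Carry the last known value forward.
        filled = combined.ffill()
    else:
        # Linear interpolation weighted by the actual time between samples.
        filled = combined.interpolate(method="time")
    # Keep only the grid points (pointwise values).
    return filled.reindex(grid)


start = pd.Timestamp("2024-01-01")
raw = pd.Series(
    [0.0, 8.0, 2.0],
    index=[start, start + pd.Timedelta(seconds=4), start + pd.Timedelta(seconds=10)],
)
linear = to_uniform_grid(raw, pd.Timedelta(seconds=2))
stepped = to_uniform_grid(raw, pd.Timedelta(seconds=2), method="ffill")
```

With a 2-second granularity, the three irregular samples become six evenly spaced points. The "average" kind is not covered by this sketch.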

Resampling: Fourier, Polynomial, Linear, min, max, sum, count

indsl.resample.resample(data: Series, method: Literal['fourier', 'polyphase', 'interpolate', 'min', 'max', 'sum', 'count', 'mean'] = 'fourier', granularity_current: Timedelta | None = None, granularity_next: Timedelta | None = Timedelta('0 days 00:00:01'), num: int | None = None, downsampling_factor: int | None = None, interpolate_resolution: Timedelta | None = None, ffill_resolution: Timedelta | None = None) Series

Resample.

This method offers robust filling of missing data points and resampling of the data to a given sampling frequency. Multiple data resampling options are available:

  • Fourier

  • Polyphase filtering

  • Linear interpolation (for up-sampling)

  • Min, max, sum, count, mean (for down-sampling)

Parameters:
  • data – Time series.

  • method

    Method. Resampling method. Valid options are:

    • ”fourier” for Fourier method (default)

    • ”polyphase” for polyphase filtering

    • ”interpolate” for linear interpolation when upsampling

    • ”min”, “max”, “sum”, “count”, “mean” when downsampling

  • granularity_current – Current temporal resolution. Temporal resolution of uniform time series, before resampling. Defaults to None. If not specified, the frequency will be inferred, which only works if no data is missing. Follows Pandas DateTime convention.

  • granularity_next – Final temporal resolution. Temporal resolution of uniform time series, after resampling. Defaults to “1s”. Either “Number of Samples” or “Final temporal resolution” must be provided.

  • num – Number of Samples. The number of samples in the resampled signal. If this is set, the time deltas will be inferred. Defaults to None. Either “Number of Samples” or “Final temporal resolution” must be provided.

  • downsampling_factor – Down-sampling factor. The down-sampling factor is required for the polyphase filtering. Defaults to None.

  • interpolate_resolution – Interpolation threshold. Gaps smaller than this threshold will be interpolated; larger gaps will be filled with noise. Defaults to None.

  • ffill_resolution – Forward fill threshold. Gaps smaller than this threshold will be forward filled. Defaults to None.

Returns:

Interpolated time series. Uniform, resampled time series with the specified number of data points.

Return type:

pandas.Series

Raises:
  • UserTypeError – data is not a time series

  • UserTypeError – Either num or granularity_next has to be set

  • UserTypeError – If specified, outside_fill must be either ‘None’ or ‘extrapolate’.

  • UserTypeError – Method has to be in ‘fourier’, ‘polyphase’, ‘interpolate’, ‘min’, ‘max’, ‘sum’, ‘count’, ‘mean’

  • UserTypeError – Empty data time series

  • UserTypeError – All values in the time series are NaN

  • Warning – Can’t infer time series resolution with missing data. Please provide resolution
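As a rough illustration of what resampling in the Fourier domain means, the sketch below changes the number of samples of a uniformly sampled signal by zero-padding or truncating its spectrum with NumPy. It is one common way to do this, not necessarily what InDSL does internally; the function name `fourier_resample` is an assumption, and the sketch deliberately ignores the special handling of the Nyquist bin that a production implementation would need.

```python
import numpy as np


def fourier_resample(values, num):
    """Resample uniformly spaced `values` to `num` samples (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Spectrum of the real-valued input signal.
    spectrum = np.fft.rfft(values)
    # Target spectrum: zero-padded when up-sampling, truncated when down-sampling.
    target = np.zeros(num // 2 + 1, dtype=complex)
    keep = min(len(spectrum), len(target))
    target[:keep] = spectrum[:keep]
    # Back to the time domain; rescale so amplitudes are preserved.
    return np.fft.irfft(target, n=num) * (num / n)


# Two full sine cycles sampled at 16 points, up-sampled to 32 points.
coarse = np.sin(2 * np.pi * 2 * np.arange(16) / 16)
fine = fourier_resample(coarse, 32)
expected = np.sin(2 * np.pi * 2 * np.arange(32) / 32)
```

Because the method treats the signal as periodic, it works best on signals without gaps, which is why missing points need to be filled before resampling.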

Resampling to granularity (default)

indsl.resample.resample_to_granularity(series: Series, granularity: Timedelta = Timedelta('0 days 01:00:00'), aggregate: Literal['mean', 'interpolation', 'stepInterpolation', 'max', 'min', 'count', 'sum'] = 'mean') Series

Resample to granularity.

Resample a time series to a given fixed granularity (time delta) and aggregation type.

Parameters:
  • series – Time series.

  • granularity – Granularity. Granularity defines the time range that each aggregate is calculated from. It consists of a time unit and a size. Valid time units are day or d, hour or h, minute or m, and second or s. For example, 2h means that each time range should be 2 hours wide, 3m means 3 minutes, and so on.

  • aggregate

    Aggregate. Type of aggregation to use when resampling. Valid options are:

    • ”interpolation” for linear interpolation when upsampling

    • ”stepInterpolation” for stepwise interpolation when upsampling

    • ”min”, “max”, “sum”, “count”, “mean” when downsampling

Returns:

Resampled time series.

Return type:

pandas.Series
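The down-sampling aggregates can be illustrated with pandas' own `Series.resample`. This is a sketch of the concept under assumed sample data, not the InDSL code path; bin edges and labels in InDSL may follow different conventions.

```python
import pandas as pd

# Six samples, 30 minutes apart (assumed example data).
index = pd.date_range("2024-01-01", periods=6, freq=pd.Timedelta(minutes=30))
series = pd.Series([1.0, 3.0, 5.0, 7.0, 9.0, 11.0], index=index)

# Group the samples into 1-hour wide time ranges.
hourly = series.resample(pd.Timedelta(hours=1))

hourly_mean = hourly.mean()    # average of the samples in each range
hourly_max = hourly.max()      # largest sample in each range
hourly_count = hourly.count()  # number of samples in each range
```

Each one-hour range here contains two of the original samples, so three aggregated points are produced.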

Group by region

indsl.resample.group_by_region(data: Series, filter_by: Series, int_to_keep: int = 1, aggregate: Literal['Mean', 'Median', 'Standard deviation', 'Count', 'Min', 'Max'] = 'Mean', timestamp: Literal['Region center', 'Region start', 'Region end', 'Entire region'] = 'Region center') Series

Group by region.

This function groups any given data series by a series with integers denoting different states. A typical example of such a series is a series of 0 and 1 where 1 would indicate the presence of steady process conditions.

Parameters:
  • data (pd.Series) – Time series.

  • filter_by (pd.Series) – Region flag time series. Time series values are expected to be integers. If not, values are cast to integer automatically.

  • int_to_keep (int, optional) – Value. Value that identifies the region of interest.

  • aggregate (str, optional) – Aggregate. Indicates the aggregation to be performed for each identified region.

  • timestamp (str, optional) – Timestamp. Indicates the location of the timestamps for the aggregated data.

Returns:

Grouped time series

Return type:

pd.Series

Raises:
  • UserRuntimeError – Time series returns no data. This could be due to insufficient data in either data or filter_by, or filter_by series contains no values of int_to_keep.

  • ValueError – The provided aggregate or timestamp inputs are not valid options.
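A minimal pandas sketch of the grouping idea, assuming `data` and the flag series share the same timestamps. It covers only the 'Mean' aggregate with 'Region center' timestamps; the helper name `mean_by_region` and the sample data are assumptions for illustration, not InDSL code.

```python
import pandas as pd


def mean_by_region(data, flags, keep=1):
    """Average `data` over each contiguous run where `flags == keep` (hypothetical helper)."""
    mask = flags.astype(int) == keep
    # A new region id starts wherever the mask switches between True and False.
    region_id = mask.astype(int).diff().ne(0).cumsum()
    rows = {}
    for _, chunk in data[mask].groupby(region_id[mask]):
        start, end = chunk.index[0], chunk.index[-1]
        # Place the aggregated value at the centre of the region.
        rows[start + (end - start) / 2] = chunk.mean()
    return pd.Series(rows)


t0 = pd.Timestamp("2024-01-01")
index = pd.date_range(t0, periods=10, freq=pd.Timedelta(minutes=1))
values = pd.Series([float(i) for i in range(10)], index=index)
steady = pd.Series([0, 1, 1, 1, 0, 0, 1, 1, 0, 0], index=index)
grouped = mean_by_region(values, steady)
```

The flag series contains two runs of 1, so the result has two points: one per steady region.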

Reindex

indsl.resample.reindex(data1: Series, data2: Series, method: Literal['zero', 'next', 'slinear', 'quadratic', 'cubic'] = 'slinear', kind: Literal['pointwise', 'average'] = 'pointwise', bounded: bool = False) Series

Reindex.

This method reindexes the data onto a common index and fills missing data points. If bounded is false, the common index is the union of the input time series indices. If bounded is true, the common index is restricted to the period where the time series overlap. All not-a-number (NaN) values are removed in the output time series.

Parameters:
  • data1 – First time series.

  • data2 – Second time series.

  • method

    Method. Specifies the interpolation method. Defaults to “slinear”. Possible inputs are:

    • ’zero’: zero order spline interpolation with forward filling mode, i.e., the previous known value of any point is used.

    • ’next’: zero order spline interpolation with backward filling mode, i.e., the next known value of any point is used.

    • ’slinear’: linear order spline interpolation.

    • ’quadratic’: quadratic order spline interpolation.

    • ’cubic’: cubic order spline interpolation.

  • kind

    Kind. Specifies the kind of returned data points. Defaults to “pointwise”. Possible inputs are:

    • ’pointwise’: returns the pointwise value of the interpolated function for each timestamp.

    • ’average’: returns the average of the interpolated function within each time period.

  • bounded

    Bounded. Specifies if the output should be bounded to avoid extrapolation. Defaults to False. Possible inputs are:

    • True: Return the intersection of the time periods of the input time series.

    • False: Return the union of the time periods of the input time series. Extrapolate points outside of the data range.

Returns:

First reindexed time series

Reindexed time series with common indices.

Return type:

pandas.Series

Raises:

UserValueError – All time series must have at least two values
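The bounded case can be sketched with plain pandas: build the union of both indices, restrict it to the overlapping period, and linearly interpolate each series onto it. This is an illustration under assumed sample data, not the InDSL implementation; the helper name `reindex_bounded` is hypothetical, and the unbounded case with extrapolation is not covered.

```python
import pandas as pd


def reindex_bounded(a, b):
    """Put `a` and `b` on a common index limited to their overlap (hypothetical helper)."""
    full = a.index.union(b.index)
    # Overlap: from the later of the two starts to the earlier of the two ends.
    start = max(a.index[0], b.index[0])
    end = min(a.index[-1], b.index[-1])
    common = full[(full >= start) & (full <= end)]

    def onto_common(series):
        # Interpolate only between known samples, never beyond them.
        filled = series.reindex(full).interpolate(method="time", limit_area="inside")
        return filled.reindex(common).dropna()

    return onto_common(a), onto_common(b)


t0 = pd.Timestamp("2024-01-01")
two_min = pd.Timedelta(minutes=2)
a = pd.Series([0.0, 2.0, 4.0], index=pd.date_range(t0, periods=3, freq=two_min))
b = pd.Series(
    [10.0, 30.0, 50.0],
    index=pd.date_range(t0 + pd.Timedelta(minutes=1), periods=3, freq=two_min),
)
a_common, b_common = reindex_bounded(a, b)
```

Here the two series are offset by one minute, so the common index runs from minute 1 to minute 4 in one-minute steps.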

Reindex scatterplot

indsl.resample.reindex_scatter(signal_x: Series, signal_y: Series, align_timesteps: bool = False) Series

Reindex scatterplot.

It returns the values from signal_y with the values from signal_x as timestamps, where the timestamps have been scaled to the range of timestamps from signal_x. The timestamps are sorted in ascending order, and the values are sorted with the same sort index.

This is a way of creating a scatter plot inside a chart.

Parameters:
  • signal_x – x-value. The time series where the values are used as the x-value

  • signal_y – y-value. The time series where the values are used as the y-value

  • align_timesteps (bool) – Auto-align. Automatically align the timestamps of the input time series. Default is False.

Returns:

Scatter plot

Return type:

pandas.Series
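One way to read the description above is as a linear mapping of the x-values onto the time range of signal_x, followed by a sort. The sketch below follows that reading with plain pandas; it is an interpretation, not the InDSL code. The helper name `scatter_index` is hypothetical, and both series are assumed to share the same timestamps and signal_x is assumed to be non-constant.

```python
import pandas as pd


def scatter_index(signal_x, signal_y):
    """Use scaled x-values as timestamps for the y-values (hypothetical helper)."""
    t_first, t_last = signal_x.index[0], signal_x.index[-1]
    x = signal_x.to_numpy(dtype=float)
    # Scale the x-values to the interval [0, 1]; assumes max(x) > min(x).
    fraction = (x - x.min()) / (x.max() - x.min())
    # Map that interval onto the time range covered by signal_x.
    span_seconds = (t_last - t_first).total_seconds()
    new_index = t_first + pd.to_timedelta(fraction * span_seconds, unit="s")
    # Sort by the new timestamps; the values follow the same order.
    return pd.Series(signal_y.to_numpy(), index=new_index).sort_index()


t0 = pd.Timestamp("2024-01-01")
index = pd.date_range(t0, periods=4, freq=pd.Timedelta(minutes=1))
signal_x = pd.Series([3.0, 1.0, 4.0, 2.0], index=index)
signal_y = pd.Series([30.0, 10.0, 40.0, 20.0], index=index)
scatter = scatter_index(signal_x, signal_y)
```

Plotted as an ordinary time series, `scatter` shows signal_y against signal_x, with the chart's time axis standing in for the x-axis.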

Reindex scatterplot x-values

indsl.resample.reindex_scatter_x(signal_x: Series) Series

Reindex scatterplot x-values.

It returns the values from signal_x with the values from signal_x as timestamps, where the timestamps have been scaled to the range of timestamps from signal_x. The timestamps are sorted in ascending order, and the values are sorted with the same sort index. In effect, this is a straight line going from the minimum to the maximum value of signal_x over the time range.

Parameters:

signal_x – x-value. The time series where the values are used as the x-value

Returns:

Scatter plot

Return type:

pandas.Series