preprocessing
boxcox(method='mle')
Applies the Box-Cox transformation to numeric columns in a panel DataFrame.
Parameters:
Name  Type  Description  Default 

method 
str

The method used to determine the lambda parameter of the Box-Cox transformation. Supported methods:

'mle'
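
As a rough illustration of what the transformation does once lambda is known, here is a minimal sketch in plain Python. It is not the library's implementation: the function name `boxcox_transform` and the `lmbda` argument are hypothetical, and lambda is passed in as a fixed value rather than estimated by maximum likelihood.

```python
import math


def boxcox_transform(values, lmbda):
    """Apply the Box-Cox transform for a fixed lambda (hypothetical helper).

    Box-Cox is only defined for strictly positive inputs.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lmbda == 0:
        # The lambda -> 0 limit of (v**lambda - 1) / lambda is log(v).
        return [math.log(v) for v in values]
    return [(v ** lmbda - 1.0) / lmbda for v in values]


# lambda = 1 only shifts the data by one; lambda = 0 is a log transform.
shifted = boxcox_transform([1.0, 2.0, 3.0], 1.0)
logged = boxcox_transform([1.0, math.e], 0.0)
```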

coerce_dtypes(schema)
Coerces the column datatypes of a DataFrame using the provided schema.
Parameters:
Name  Type  Description  Default 

schema 
Mapping[str, DataType]

A dictionary-like object mapping column names to the desired data types. 
required 
deseasonalize_fourier(sp, K, robust=False)
Removes seasonality via residualized regression with Fourier terms.
Parameters:
Name  Type  Description  Default 

sp 
int

Seasonal period. 
required 
K 
int

Maximum order(s) of Fourier terms.
Must be less than 
required 
Note 

required 
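
The Fourier terms that such a regression uses as features can be sketched as below. This is an assumption-laden illustration in plain Python, not the library's code: the helper name `fourier_terms` and its `n` argument are hypothetical, and the regression step itself (fitting the terms and keeping the residuals) is not shown.

```python
import math


def fourier_terms(n, sp, K):
    """Build sin/cos pairs for harmonics k = 1..K of seasonal period sp.

    Returns n rows; each row holds [sin_1, cos_1, ..., sin_K, cos_K]
    evaluated at time index t = 0..n-1. Hypothetical helper.
    """
    rows = []
    for t in range(n):
        row = []
        for k in range(1, K + 1):
            angle = 2.0 * math.pi * k * t / sp
            row.extend([math.sin(angle), math.cos(angle)])
        rows.append(row)
    return rows


# Two harmonics of a period-4 pattern over 8 time steps: 8 rows, 4 columns.
features = fourier_terms(n=8, sp=4, K=2)
```

Each row repeats every `sp` steps, which is what lets a regression on these columns capture a seasonal pattern.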
detrend(freq, method='linear')
Removes mean or linear trend from numeric columns in a panel DataFrame.
Parameters:
Name  Type  Description  Default 

freq 
str

Offset alias supported by Polars. 
required 
method 
str

If 
'linear'
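
For intuition, linear detrending of a single evenly spaced series can be sketched as an ordinary least-squares fit against the time index followed by subtraction of the fitted line. The helper `detrend_linear` below is hypothetical and ignores the `freq` argument and the panel structure; it is only meant to show the idea.

```python
def detrend_linear(y):
    """Subtract the least-squares line fitted on the index 0..n-1.

    Hypothetical helper for one evenly spaced series with n >= 2.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in enumerate(y)]


# A perfectly linear series leaves residuals that are numerically zero.
residuals = detrend_linear([1.0, 3.0, 5.0, 7.0])
```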

diff(order, sp=1, fill_strategy=None)
Differences the time series in panel data by the given order and seasonal period.
Parameters:
Name  Type  Description  Default 

order 
int

The order to difference. 
required 
sp 
int

Seasonal periodicity. 
1

fill_strategy 
Optional[str]

Strategy to fill nulls by. Nulls are not filled if None. Supported strategies include: ["backward", "forward", "mean", "zero"]. 
None
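
One plausible reading of `order` and `sp`, sketched in plain Python for a single series: the seasonal difference `y[t] - y[t - sp]` is applied `order` times, and the leading positions that have no predecessor become nulls (which a fill strategy could then replace). The helper `difference` is hypothetical and is not the library's implementation.

```python
def difference(values, order=1, sp=1):
    """Apply the lag-sp difference `order` times (hypothetical helper).

    Positions without a valid predecessor are returned as None.
    """
    out = list(values)
    for _ in range(order):
        head = [None] * min(sp, len(out))
        tail = [
            None if out[i] is None or out[i - sp] is None else out[i] - out[i - sp]
            for i in range(sp, len(out))
        ]
        out = head + tail
    return out


squares = [1, 4, 9, 16, 25]
print(difference(squares, order=1))  # [None, 3, 5, 7, 9]
print(difference(squares, order=2))  # [None, None, 2, 2, 2]
```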

fractional_diff(d, min_weight=None, window_size=None)
Compute the fractional differential of a time series.
This particular functionality is referenced in Advances in Financial Machine Learning by Marcos Lopez de Prado (2018).
For feature creation purposes, it is suggested to use the minimum value of d that makes the time series stationary. This can be found by running the Augmented Dickey-Fuller test on the time series for different values of d and selecting the smallest value for which the series is stationary.
Parameters:
Name  Type  Description  Default 

d 
float

The fractional order of the differencing operator. 
required 
min_weight 
float

The minimum weight to use for calculations. If specified, the window size is computed from this value and does not need to be provided. 
None

window_size 
int

The window size of the fractional differencing operator. If specified, the minimum weight does not need to be provided. 
None
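
The relationship between `d`, `min_weight`, and `window_size` can be illustrated with the standard binomial-series weights of the fractional differencing operator, where w_0 = 1 and w_k = -w_{k-1} (d - k + 1) / k. The sketch below is a plain-Python illustration under that assumption; the helper names `frac_diff_weights` and `frac_diff` are hypothetical and the library may compute the window differently.

```python
def frac_diff_weights(d, window_size=None, min_weight=None):
    """Weights of the fractional differencing operator (hypothetical helper).

    Exactly one of window_size or min_weight must be given: either a fixed
    number of weights, or stop once |w_k| drops below min_weight.
    """
    if (window_size is None) == (min_weight is None):
        raise ValueError("specify exactly one of window_size or min_weight")
    if min_weight is not None and min_weight <= 0:
        raise ValueError("min_weight must be positive")
    weights = [1.0]
    k = 1
    while window_size is None or len(weights) < window_size:
        w = -weights[-1] * (d - k + 1) / k
        if min_weight is not None and abs(w) < min_weight:
            break
        weights.append(w)
        k += 1
    return weights


def frac_diff(values, weights):
    """Convolve a series with the weights; leading positions are None."""
    width = len(weights)
    out = [None] * min(width - 1, len(values))
    for i in range(width - 1, len(values)):
        out.append(sum(w * values[i - k] for k, w in enumerate(weights)))
    return out


print(frac_diff_weights(0.5, window_size=4))  # [1.0, -0.5, -0.125, -0.0625]
```

With `d=1` and two weights the operator reduces to an ordinary first difference, which is a quick sanity check for the recursion.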

impute(method)
Performs missing value imputation on numeric columns of a DataFrame grouped by entity.
Parameters:
Name  Type  Description  Default 

method 
Union[str, int, float]

The imputation method to use. Supported methods are:

required 
lag(lags, is_sorted=False)
Applies lag transformation to a LazyFrame. The time series is assumed to have no null values.
Parameters:
Name  Type  Description  Default 

lags 
List[int]

A list of lag values to apply. 
required 
is_sorted 
bool

If the data is already sorted by the entity and time columns, sorting is skipped, which can save some time. 
False
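
Conceptually, each lag shifts a series down by that many steps. A minimal plain-Python sketch for a single, already sorted series follows; the helper `add_lags` and the `lag_<k>` naming are assumptions made for illustration, not the library's output schema.

```python
def add_lags(series, lags):
    """Return one shifted copy of the series per lag (hypothetical helper).

    The first k positions of a lag-k column have no source value and are None.
    """
    series = list(series)
    n = len(series)
    return {
        f"lag_{k}": [None] * min(k, n) + series[: max(n - k, 0)]
        for k in lags
    }


print(add_lags([10, 20, 30, 40], lags=[1, 2]))
# {'lag_1': [None, 10, 20, 30], 'lag_2': [None, None, 10, 20]}
```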

one_hot_encode(drop_first=False)
Encode categorical features as a one-hot numeric array.
Parameters:
Name  Type  Description  Default 

drop_first 
bool

Drop the first one-hot feature. 
False

Raises:
Type  Description 

ValueError

if X passed into 
reindex(drop_duplicates=False)
Reindexes the entity and time columns to have every possible combination of (entity, time).
Parameters:
Name  Type  Description  Default 

drop_duplicates 
bool

Defaults to False. If True, duplicates are dropped before reindexing. 
False
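
The idea of "every possible combination of (entity, time)" can be sketched as a cross product of the distinct entities and distinct timestamps, with missing observations left as nulls. The following plain-Python helper (`reindex_panel`, a hypothetical name operating on `(entity, time, value)` tuples) is only an illustration of that idea.

```python
from itertools import product


def reindex_panel(rows):
    """Expand (entity, time, value) rows to the full entity x time grid.

    Combinations absent from the input get a value of None.
    Hypothetical helper; assumes (entity, time) pairs are unique.
    """
    lookup = {(entity, time): value for entity, time, value in rows}
    entities = sorted({entity for entity, _, _ in rows})
    times = sorted({time for _, time, _ in rows})
    return [(e, t, lookup.get((e, t))) for e, t in product(entities, times)]


rows = [("a", 1, 1.0), ("a", 2, 2.0), ("b", 2, 5.0)]
print(reindex_panel(rows))
# [('a', 1, 1.0), ('a', 2, 2.0), ('b', 1, None), ('b', 2, 5.0)]
```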

resample(freq, agg_method, impute_method)
Resamples and transforms a DataFrame using the specified frequency, aggregation method, and imputation method.
Parameters:
Name  Type  Description  Default 

freq 
str

Offset alias supported by Polars. 
required 
agg_method 
str

The aggregation method to use for resampling. Supported values are 'sum', 'mean', and 'median'. 
required 
impute_method 
Union[str, int, float]

The method used for imputing missing values. If a string, supported values are 'ffill' (forward fill) and 'bfill' (backward fill). If an int or float, missing values will be filled with the provided value. 
required 
roll(window_sizes, stats, freq, fill_strategy=None)
Performs rolling window calculations on specified columns of a DataFrame.
Parameters:
Name  Type  Description  Default 

window_sizes 
List[int]

A list of integers representing the window sizes for the rolling calculations. 
required 
stats 
List[Literal['mean', 'min', 'max', 'mlm', 'sum', 'std', 'cv']]

A list of statistical measures to calculate for each rolling window. Supported values are:

required 
freq 
str

Offset alias supported by Polars. 
required 
fill_strategy 
Optional[str]

Strategy to fill nulls by. Nulls are not filled if None. Supported strategies include: ["backward", "forward", "mean", "zero"]. 
None
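
As a small illustration of one rolling statistic, the sketch below computes a trailing mean over a single series in plain Python. The helper `rolling_mean` is hypothetical; whether the library's windows include the current observation or are shifted is not stated here, so the trailing-inclusive window is an assumption.

```python
def rolling_mean(values, window_size):
    """Trailing mean over the last `window_size` observations.

    Positions with fewer than `window_size` observations are None,
    which is where a fill strategy would apply. Hypothetical helper.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window_size:
            out.append(None)
        else:
            window = values[i + 1 - window_size : i + 1]
            out.append(sum(window) / window_size)
    return out


print(rolling_mean([1, 2, 3, 4], window_size=2))  # [None, 1.5, 2.5, 3.5]
```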

scale(use_mean=True, use_std=True, rescale_bool=False)
Performs scaling and rescaling operations on the numeric columns of a DataFrame.
Parameters:
Name  Type  Description  Default 

use_mean 
bool

Whether to subtract the mean from the numeric columns. Defaults to True. 
True

use_std 
bool

Whether to divide the numeric columns by the standard deviation. Defaults to True. 
True

rescale_bool 
bool

Whether to rescale boolean columns to the range [-1, 1]. Defaults to False. 
False
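
The effect of `use_mean` and `use_std` on a single numeric column can be sketched with the standard library. The helper `scale_values` is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption rather than a statement about the library.

```python
import statistics


def scale_values(values, use_mean=True, use_std=True):
    """Optionally center by the mean and divide by the sample std.

    Hypothetical helper; raises ZeroDivisionError for a constant column
    when use_std is True.
    """
    mean = statistics.fmean(values) if use_mean else 0.0
    std = statistics.stdev(values) if use_std else 1.0
    return [(v - mean) / std for v in values]


print(scale_values([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```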

time_to_arange(eager=False)
Coerces time column into arange per entity.
Assumes evenly spaced time series and homogeneous start dates.
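
In other words, each entity's timestamps are replaced by a running integer index. A minimal sketch in plain Python, assuming a panel stored as a dict of time-sorted `(time, value)` lists; the helper name `to_arange` and that layout are hypothetical.

```python
def to_arange(panel):
    """Replace each entity's time values with 0, 1, 2, ... in order.

    Hypothetical helper; assumes every entity's rows are sorted by time.
    """
    return {
        entity: [(i, value) for i, (_, value) in enumerate(rows)]
        for entity, rows in panel.items()
    }


panel = {"a": [("2020-01", 1.0), ("2020-02", 2.0)], "b": [("2020-01", 5.0)]}
print(to_arange(panel))  # {'a': [(0, 1.0), (1, 2.0)], 'b': [(0, 5.0)]}
```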
trim(direction='both')
Trims the time series in a panel so that they share the same start or end dates as the shortest time series.
Parameters:
Name  Type  Description  Default 

direction 
Literal['both', 'left', 'right']

Defaults to "both". If "left", trims from the start date of the shortest time series; if "right", trims up to the end date of the shortest time series; otherwise, "both" trims between the start and end dates of the shortest time series. 
'both'
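
One way to picture the three directions is to compute the latest start and the earliest end across entities and keep only observations inside that window. The sketch below takes that interpretation as an assumption; `trim_panel` and the dict-of-lists panel layout are hypothetical and not the library's API.

```python
def trim_panel(panel, direction="both"):
    """Trim every series to a common window (hypothetical helper).

    panel maps entity -> non-empty list of (time, value) sorted by time.
    "left" aligns start dates, "right" aligns end dates, "both" does both.
    """
    if direction not in ("both", "left", "right"):
        raise ValueError(f"unknown direction: {direction!r}")
    latest_start = max(rows[0][0] for rows in panel.values())
    earliest_end = min(rows[-1][0] for rows in panel.values())
    lo = latest_start if direction in ("both", "left") else None
    hi = earliest_end if direction in ("both", "right") else None
    return {
        entity: [
            (t, v)
            for t, v in rows
            if (lo is None or t >= lo) and (hi is None or t <= hi)
        ]
        for entity, rows in panel.items()
    }


panel = {
    "a": [(1, 1.0), (2, 2.0), (3, 3.0), (4, 4.0)],
    "b": [(2, 5.0), (3, 6.0)],
}
print(trim_panel(panel, "both"))
# {'a': [(2, 2.0), (3, 3.0)], 'b': [(2, 5.0), (3, 6.0)]}
```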

yeojohnson(brack=(-2, 2))
Applies the Yeo-Johnson transformation to numeric columns in a panel DataFrame.
Parameters:
Name  Type  Description  Default 

brack 
2-tuple

The starting interval for a downhill bracket search with optimize.brent. Note that this is in most cases not critical; the final result is allowed to be outside this bracket. 
(-2, 2)
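
For reference, the Yeo-Johnson transform for a fixed lambda can be written out in plain Python as below. This sketch does not perform the bracket search for lambda described above; `yeojohnson_transform` and `lmbda` are hypothetical names and the value of lambda is simply passed in.

```python
import math


def yeojohnson_transform(values, lmbda):
    """Apply the Yeo-Johnson transform for a fixed lambda.

    Unlike Box-Cox, negative and zero inputs are allowed.
    Hypothetical helper, not the library's implementation.
    """
    out = []
    for y in values:
        if y >= 0:
            if lmbda == 0:
                out.append(math.log1p(y))
            else:
                out.append(((y + 1.0) ** lmbda - 1.0) / lmbda)
        else:
            if lmbda == 2:
                out.append(-math.log1p(-y))
            else:
                out.append(-(((1.0 - y) ** (2.0 - lmbda) - 1.0) / (2.0 - lmbda)))
    return out


# lambda = 1 leaves the data unchanged, a convenient sanity check.
unchanged = yeojohnson_transform([-2.0, 0.0, 3.0], 1.0)
```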
