API Documentation¶
- class experiment.dataset_processor.AsFreqPreprocessor(freq)¶
- fit(data)¶
Learn any parameters needed for transformation from the data.
- transform(data)¶
Apply transformations to the data.
- class experiment.dataset_processor.BasePreprocessor¶
- abstract fit(data)¶
Learn any parameters needed for transformation from the data.
- fit_transform(data)¶
Fit to data, then transform it.
- abstract transform(data)¶
Apply transformations to the data.
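The fit/transform contract shared by the preprocessors above can be sketched as follows. This is a minimal illustration, not the package's code: `PreprocessorSketch` and `MeanCenterSketch` are hypothetical names, and the assumption that `fit_transform` simply calls `fit` and then `transform` on the same data is inferred from its one-line description.

```python
from abc import ABC, abstractmethod


class PreprocessorSketch(ABC):
    """Hypothetical stand-in for the fit/transform contract described above."""

    @abstractmethod
    def fit(self, data):
        """Learn any parameters needed for the transformation."""

    @abstractmethod
    def transform(self, data):
        """Apply the transformation using the learned parameters."""

    def fit_transform(self, data):
        # Assumption: fit on the data, then transform that same data.
        self.fit(data)
        return self.transform(data)


class MeanCenterSketch(PreprocessorSketch):
    """Toy subclass: subtracts the mean learned during fit."""

    def fit(self, data):
        self.mean_ = sum(data) / len(data)
        return self

    def transform(self, data):
        return [x - self.mean_ for x in data]


centered = MeanCenterSketch().fit_transform([1.0, 2.0, 3.0])
```

Because `fit` and `transform` are abstract, instantiating `PreprocessorSketch` directly raises `TypeError`; only subclasses that implement both methods can be created.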
- class experiment.dataset_processor.DatePreprocessor¶
- fit(data)¶
Learn any parameters needed for transformation from the data.
- transform(data)¶
Apply transformations to the data.
- class experiment.dataset_processor.FeatureEngineeringPreprocessor(transformations)¶
- fit(data)¶
Learn any parameters needed for transformation from the data.
- transform(data)¶
Apply transformations to the data.
- class experiment.dataset_processor.LagProcessor(num_lags)¶
- fit(data)¶
Learn any parameters needed for transformation from the data.
- transform(data)¶
Apply transformations to the data.
- class experiment.dataset_processor.MissingValuePreprocessor(method='ffill')¶
- fit(data)¶
Learn any parameters needed for transformation from the data.
- transform(data)¶
Apply transformations to the data.
- class experiment.dataset_processor.ScalerPreprocessor(method='standard', test_size=0)¶
- fit(data)¶
Fits the scaler on the first train_size fraction of the provided data.
- transform(data)¶
Transforms the provided data using the fitted scaler.
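A rough sketch of fitting a scaler on only the leading part of a series is shown below. It assumes that the training fraction is `1 - test_size` and that `method='standard'` means subtracting the mean and dividing by the standard deviation; neither is confirmed by the description above, and `FirstFractionScalerSketch` is a hypothetical name.

```python
from statistics import fmean, pstdev


class FirstFractionScalerSketch:
    """Hypothetical standard scaler fitted on the first part of the data."""

    def __init__(self, test_size=0.0):
        self.test_size = test_size

    def fit(self, data):
        # Assumption: the training part is the first (1 - test_size) fraction.
        n_train = int(len(data) * (1 - self.test_size))
        train = data[:n_train]
        self.mean_ = fmean(train)
        # Fall back to 1.0 so that constant data does not divide by zero.
        self.std_ = pstdev(train) or 1.0
        return self

    def transform(self, data):
        # Every value, including the held-out tail, uses the training statistics.
        return [(x - self.mean_) / self.std_ for x in data]


scaler = FirstFractionScalerSketch(test_size=0.25).fit([1.0, 2.0, 3.0, 100.0])
scaled = scaler.transform([1.0, 2.0, 3.0, 100.0])
```

Fitting on the leading fraction keeps statistics of the held-out tail (here the outlier `100.0`) out of the scaling parameters.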
- experiment.dataset_processor.df_train_test_split(data, target=None, test_size=0.25, shuffle=True)¶
Split pandas DataFrames into random train and test subsets, optionally shuffling the data before splitting.
- Parameters:
data (DataFrame) – The DataFrame containing the features to be split.
target (DataFrame or Series, optional) – The target variable(s). If provided, it is split in sync with data. (default: None)
test_size (float or int) – If float, must be between 0.0 and 1.0 and gives the proportion of the dataset to include in the test split. If int, gives the absolute number of test samples. (default: 0.25)
shuffle (bool, optional) – Whether to shuffle the data before splitting. (default: True)
- Returns:
X_train, X_test, y_train, y_test (DataFrame or Series) – The split datasets. y_train and y_test are returned only if target is provided.
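The behaviour described for this function can be approximated with plain pandas as below. This is a sketch under assumptions rather than the package's implementation: `split_frames_sketch` is a hypothetical name, the `seed` argument is added here only for reproducibility, and rounding a fractional `test_size` up is a guess.

```python
import math

import numpy as np
import pandas as pd


def split_frames_sketch(data, target=None, test_size=0.25, shuffle=True, seed=None):
    """Hypothetical split of a DataFrame (and optional target) by row position."""
    n = len(data)
    # A float is a proportion (rounded up, by assumption); an int is a row count.
    n_test = math.ceil(n * test_size) if isinstance(test_size, float) else int(test_size)
    positions = np.arange(n)
    if shuffle:
        positions = np.random.default_rng(seed).permutation(n)
    train_pos, test_pos = positions[: n - n_test], positions[n - n_test:]
    X_train, X_test = data.iloc[train_pos], data.iloc[test_pos]
    if target is None:
        return X_train, X_test
    # The same positions are reused so features and target stay aligned.
    return X_train, X_test, target.iloc[train_pos], target.iloc[test_pos]


X = pd.DataFrame({"a": range(8)})
y = pd.Series(range(8))
X_train, X_test, y_train, y_test = split_frames_sketch(X, y, shuffle=False)
```

For time series it is usually safer to pass `shuffle=False`, so that the test rows are the most recent ones.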
- experiment.evaluation.calculate_crps(actual, prediction, sigma=1.0)¶
Calculate the Continuous Ranked Probability Score (CRPS) for Gaussian forecasts. This is a simplified version that assumes the predictions are normally distributed.
- Parameters:
sigma – Standard deviation of the forecast distribution. (default: 1.0)
- Returns:
The CRPS.
- Return type:
float
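For a Gaussian forecast with mean mu and standard deviation sigma, the CRPS has a well-known closed form: with z = (y - mu) / sigma, it equals sigma * (z * (2 * Phi(z) - 1) + 2 * phi(z) - 1 / sqrt(pi)), where Phi and phi are the standard normal CDF and PDF. The sketch below evaluates that formula. `crps_gaussian_sketch` is a hypothetical name, and averaging over all samples to obtain a single float is an assumption about how the function above aggregates.

```python
import math


def crps_gaussian_sketch(actual, prediction, sigma=1.0):
    """Mean closed-form CRPS of N(prediction, sigma^2) forecasts (illustrative)."""

    def one(y, mu):
        z = (y - mu) / sigma
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

    scores = [one(y, mu) for y, mu in zip(actual, prediction)]
    # Assumption: a single float is obtained by averaging over the samples.
    return sum(scores) / len(scores)


perfect_mean = crps_gaussian_sketch([0.0], [0.0])
```

Even when the forecast mean matches the observation exactly, the score stays positive because the forecast still spreads probability mass around it; lower values indicate better forecasts.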
- class experiment.layers.__init__.AttentionLayer(attention, d_model, n_heads, d_keys=None, d_values=None)¶
- forward(queries, keys, values, attn_mask, tau=None, delta=None)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class experiment.layers.__init__.DataEmbeddingInverted(c_in, d_model, dropout=0.1)¶
- forward(x, x_mark)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class experiment.layers.__init__.Encoder(attn_layers, conv_layers=None, norm_layer=None)¶
- forward(x, attn_mask=None, tau=None, delta=None)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class experiment.layers.__init__.EncoderLayer(attention, d_model, d_ff=None, dropout=0.1, activation='relu')¶
- forward(x, attn_mask=None, tau=None, delta=None)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class experiment.layers.__init__.FullAttention(mask_flag=True, factor=5, scale=None, attention_dropout=0.1, output_attention=False)¶
- forward(queries, keys, values, attn_mask, tau=None, delta=None)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- pipeline_script.tools.utils.seed_everything(seed=1234, is_cuda=False)¶
set the seed wherever randomness is involved
parameters: seed: an integer; is_cuda: whether CUDA is used for the experiments
return:
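A seeding helper of this kind typically looks something like the sketch below. Which libraries the real function seeds is not stated above, so the NumPy and PyTorch branches are assumptions, and `seed_everything_sketch` is a hypothetical name.

```python
import os
import random


def seed_everything_sketch(seed=1234, is_cuda=False):
    """Hypothetical helper that seeds the common sources of randomness."""
    random.seed(seed)
    # Affects hash randomisation only in interpreters started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np  # assumption: NumPy may be in use

        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # assumption: PyTorch may be in use

        torch.manual_seed(seed)
        if is_cuda:
            torch.cuda.manual_seed_all(seed)
            # Prefer reproducible cuDNN kernels over the fastest ones.
            torch.backends.cudnn.deterministic = True
            torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


seed_everything_sketch(1234)
first_draw = random.random()
seed_everything_sketch(1234)
second_draw = random.random()
```

Calling the helper again with the same seed restarts the random streams, so `first_draw` and `second_draw` are equal.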
- class pipeline_script.dataset.dataset.ForecastingDataset(target: ndarray, input_length: int, ouput_length: int, shift: int = 1, covariates: ndarray | None = None)¶
class of forecasting dataset
- to_numpy() → ndarray¶
convert the input and output data to numpy ndarrays
return: input_data: the part of the dataset that is fed to the model as input; output_data: the part of the dataset that is to be forecast
- class pipeline_script.models.KaplanMeier.KaplanMeier(options, results)¶
class for the Kaplan-Meier survival analysis model, including methods such as preprocess_data, fit_model and visualize
- fit_model()¶
fit the processed data to the model
return: kmf: fitted model
- preprocess_data()¶
preprocess the datasets required for survival analysis using the failure data, final failures, failure times, and failure indices
return: cox_data: processed dataframe with ‘Value’, ‘Load’, ‘Age’, and ‘Status’ columns
- visualize()¶
visualize the final survival curve and save it in the output directory (stored in options["output_dir"])
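For orientation, the Kaplan-Meier (product-limit) estimator that this class is named after can be written in a few lines. The sketch below is independent of the class: `kaplan_meier_sketch`, `durations` and `events` are hypothetical names, and it says nothing about how `fit_model` is actually implemented.

```python
def kaplan_meier_sketch(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed time for each subject.
    events: 1 if the failure was observed at that time, 0 if censored.
    """
    survival = 1.0
    curve = []
    event_times = sorted({d for d, e in zip(durations, events) if e})
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Observed failures exactly at time t (censored subjects excluded).
        failed = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1.0 - failed / at_risk
        curve.append((t, survival))
    return curve


curve = kaplan_meier_sketch([1, 2, 2, 3], [1, 1, 0, 1])
```

In this toy input the subject censored at time 2 still counts as at risk at that time but does not count as a failure, which is the usual convention.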
- class load_forecasting.dataset.ForecastingDataset(target: ndarray, input_length: int, ouput_length: int, shift: int = 1, covariates: ndarray | None = None, resolution: str | None = None, datetime_index: DatetimeIndex | None = None, future_covariates: ndarray | None = None)¶
Dataset for load forecasting.
- Parameters:
target (np.ndarray) – The target variable to forecast.
input_length (int) – The length of the input sequence.
ouput_length (int) – The length of the output sequence.
shift (int, optional) – The number of steps to shift the output sequence. (default: 1)
covariates (np.ndarray, optional) – The covariates to use in the forecasting model. (default: None)
resolution (str, optional) – The resolution of the datetime index. (default: None)
datetime_index (pd.DatetimeIndex, optional) – The datetime index of the target variable. If resolution is not None, this argument must be provided. (default: None)
future_covariates (np.ndarray, optional) – The covariates to use in the forecasting model for the future. (default: None)
- Raises:
AssertionError – If shift + input_length <= ouput_length.
AssertionError – If the length of the target and the covariates are not the same.
AssertionError – If the resolution is not None and the datetime index is None.
AssertionError – If the length of the datetime index and the target are not the same.
- Returns:
The input data and the future target, each an np.ndarray.
- Return type:
np.ndarray
- to_numpy()¶
Returns the dataset as a numpy array.
- __len__()¶
Returns the length of the dataset.
- __getitem__()¶
Returns the input data and the future target for a given index.
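How `input_length`, `ouput_length` and `shift` might interact can be illustrated with a small windowing sketch. The exact semantics are not spelled out above, so this assumes a common convention: each window spans `input_length + shift` steps, the input is its first `input_length` steps, and the output is its last `ouput_length` steps. `make_windows_sketch` is a hypothetical helper, not part of the package.

```python
import numpy as np


def make_windows_sketch(target, input_length, ouput_length, shift=1):
    """Hypothetical sliding-window split of a 1-D series into inputs and outputs."""
    total = input_length + shift
    # Mirrors the first documented assertion on the constructor arguments.
    assert total > ouput_length, "shift + input_length must exceed ouput_length"
    n_windows = len(target) - total + 1
    if n_windows <= 0:
        raise ValueError("series is too short for the requested window")
    label_start = total - ouput_length
    inputs = np.stack([target[i:i + input_length] for i in range(n_windows)])
    outputs = np.stack([target[i + label_start:i + total] for i in range(n_windows)])
    return inputs, outputs


inputs, outputs = make_windows_sketch(np.arange(6), input_length=3, ouput_length=1)
```

With the default `shift=1` and a single-step output, each output value is the observation immediately after its input window.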
- class load_forecasting.neural_basis.loader.LoadDatset(mode, in_size, out_size, data_path)¶
- class load_forecasting.neural_basis.models.time_operator.TimeOp(num_basis_in, num_basis_out, emb_dim=16)¶
- forward(coeffs_in, dayofweek, dayofmonth, month)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.