PCA

Bases: BaseReduction

Principal Component Analysis (PCA) class.

Attributes:
  • n_components (int or float) –

    The number of components or the explained variance ratio.

  • is_incremental (bool) –

    Indicates whether Incremental PCA is used.

  • _pca (PCA or IncrementalPCA) –

    The PCA or Incremental PCA model.

  • is_fitted (bool) –

    Indicates whether the PCA model has been fitted.

  • _data (Dataset) –

    The original dataset.

  • _stacked_data_matrix (ndarray) –

    The stacked data matrix.

  • _standarized_stacked_data_matrix (ndarray) –

    The standardized stacked data matrix.

  • scaler (StandardScaler) –

    The scaler used for standardizing the data.

  • vars_to_stack (list of str) –

    The list of variables to stack.

  • coords_to_stack (list of str) –

    The list of coordinates to stack.

  • pca_dim_for_rows (str) –

    The dimension for rows in PCA.

  • window_in_pca_dim_for_rows (list of int) –

    The window in PCA dimension for rows.

  • value_to_replace_nans (float) –

    The value to replace NaNs in the dataset.

  • num_cols_for_vars (int) –

    The number of columns for variables.

Methods:

Name – Description
  • fit(data: xr.Dataset, vars_to_stack: List[str], coords_to_stack: List[str], pca_dim_for_rows: str, window_in_pca_dim_for_rows: List[int] = [0], value_to_replace_nans: float = None) -> None –

    Fit PCA model to data.

  • transform(data) –

    Transform data using the fitted PCA model.

  • fit_transform(data: xr.Dataset, vars_to_stack: List[str], coords_to_stack: List[str], pca_dim_for_rows: str, window_in_pca_dim_for_rows: List[int] = [0], value_to_replace_nans: float = None) -> xr.Dataset –

    Fit and transform data using PCA model.

  • inverse_transform(PCs) –

    Inverse transform data using the fitted PCA model.

__init__(n_components=0.98, is_incremental=False)

Initialize the PCA class.

Parameters:
  • n_components (int or float, default: 0.98 ) –

    Number of components to keep. If 0 < n_components < 1, it is the proportion of variance that the selected components must explain. If n_components >= 1, it is the number of components to keep and must be an integer. Default is 0.98.

  • is_incremental (bool, default: False ) –

    If True, use Incremental PCA which is useful for large datasets. Default is False.

Raises:
  • ValueError

    If n_components is less than or equal to 0.

  • TypeError

    If n_components is not an integer when it is greater than or equal to 1.
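
The rules above for n_components can be sketched in plain Python. This is only an illustration, not the class's actual code: check_n_components and resolve_n_components are hypothetical helpers, and the "cumulative ratio strictly exceeds the target" rule is an assumption modelled on how scikit-learn's PCA documents fractional n_components.

```python
from itertools import accumulate
from typing import Sequence, Union


def check_n_components(n_components: Union[int, float]) -> Union[int, float]:
    """Hypothetical validator mirroring the documented Raises section."""
    if n_components <= 0:
        raise ValueError("n_components must be greater than 0")
    if n_components >= 1 and not isinstance(n_components, int):
        raise TypeError("n_components must be an integer when it is >= 1")
    return n_components


def resolve_n_components(
    n_components: Union[int, float],
    explained_variance_ratio: Sequence[float],
) -> int:
    """Return how many components would be kept for a given setting."""
    check_n_components(n_components)
    if n_components >= 1:
        # Integer case: keep that many components (capped at what exists)
        return min(int(n_components), len(explained_variance_ratio))
    # Fractional case (assumption): keep the fewest leading components
    # whose cumulative explained variance ratio exceeds the target
    cumulative_ratios = accumulate(explained_variance_ratio)
    for count, cumulative in enumerate(cumulative_ratios, start=1):
        if cumulative > n_components:
            return count
    return len(explained_variance_ratio)


ratios = [0.7, 0.2, 0.08, 0.02]  # made-up ratios, for illustration only
kept_by_fraction = resolve_n_components(0.95, ratios)
kept_by_count = resolve_n_components(2, ratios)
```

With these made-up ratios, a target of 0.95 needs three components, because the first two only reach a cumulative ratio of about 0.9.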

fit(data, vars_to_stack, coords_to_stack, pca_dim_for_rows, window_in_pca_dim_for_rows=[0], value_to_replace_nans=None)

Fit PCA model to data.

Parameters:
  • data (Dataset) –

    The data to fit the PCA model to.

  • vars_to_stack (list of str) –

    The variables to stack.

  • coords_to_stack (list of str) –

    The coordinates to stack.

  • pca_dim_for_rows (str) –

    The dimension to keep as rows of the PCA matrix (usually time).

  • window_in_pca_dim_for_rows (list of int, default: [0] ) –

    The window steps to roll the pca_dim_for_rows. Default is [0].

  • value_to_replace_nans (float, default: None ) –

    The value to replace NaNs. Default is None.
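
To make the fit parameters more concrete, the sketch below shows one plausible way to build a standardized stacked matrix with NumPy alone. It is a guess at the idea, not the class's implementation: stack_for_pca is a hypothetical helper, the use of np.roll for window steps (which wraps around at the edges), the column order, and replacing NaNs before standardizing are all assumptions. The class itself works on an xarray Dataset and a StandardScaler.

```python
import numpy as np


def stack_for_pca(variables, windows=(0,), value_to_replace_nans=None):
    """Stack arrays shaped (rows, ...) into a standardized 2-D matrix.

    Hypothetical helper: `variables` maps a name to an array whose first
    axis plays the role of pca_dim_for_rows (for example, time).
    """
    columns = []
    for values in variables.values():
        # Flatten every non-row axis into columns
        flat = values.reshape(values.shape[0], -1)
        for step in windows:
            # Assumption: a window step rolls the data along the row axis
            columns.append(np.roll(flat, step, axis=0))
    stacked = np.hstack(columns)
    if value_to_replace_nans is not None:
        stacked = np.where(np.isnan(stacked), value_to_replace_nans, stacked)
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (stacked - mean) / std


rng = np.random.default_rng(0)
example = {
    "a": rng.normal(size=(10, 2, 3)),  # e.g. (time, lat, lon)
    "b": rng.normal(size=(10, 4)),     # e.g. (time, station)
}
matrix = stack_for_pca(example, windows=(0, 1))
```

Here each variable contributes its flattened columns once per window step, so the example yields (2 * 3 + 4) * 2 = 20 columns for 10 rows.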

fit_transform(data, vars_to_stack, coords_to_stack, pca_dim_for_rows, window_in_pca_dim_for_rows=[0], value_to_replace_nans=None)

Fit and transform data using PCA model.

Parameters:
  • data (Dataset) –

    The data to fit the PCA model to and then transform.

  • vars_to_stack (list of str) –

    The variables to stack.

  • coords_to_stack (list of str) –

    The coordinates to stack.

  • pca_dim_for_rows (str) –

    The dimension to keep as rows of the PCA matrix (usually time).

  • window_in_pca_dim_for_rows (list of int, default: [0] ) –

    The window steps to roll the pca_dim_for_rows. Default is [0].

  • value_to_replace_nans (float, default: None ) –

    The value to replace NaNs. Default is None.

Returns:
  • Dataset

    The transformed data.

inverse_transform(PCs)

Inverse transform data using the fitted PCA model.

Parameters:
  • PCs (ndarray or Dataset) –

    The principal components to inverse transform.

Returns:
  • data_transformed (Dataset) –

    The inverse transformed data.
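
The relationship between transform and inverse_transform can be illustrated with a small NumPy round trip. This sketch swaps in a plain SVD for the scikit-learn model the class wraps, works on a bare array rather than a Dataset, and skips standardization; fit_pca, transform, and inverse_transform here are hypothetical stand-alone functions, not the class's methods.

```python
import numpy as np


def fit_pca(X, k):
    """Return column means, the first k principal axes, and all singular values."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance
    _, singular_values, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k], singular_values


def transform(X, mean, components):
    """Project centered data onto the principal axes."""
    return (X - mean) @ components.T


def inverse_transform(pcs, mean, components):
    """Map principal components back to the original feature space."""
    return pcs @ components + mean


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
mean, components, singular_values = fit_pca(X, k=3)
pcs = transform(X, mean, components)
X_back = inverse_transform(pcs, mean, components)

# Squared reconstruction error equals the energy in the discarded components
squared_error = ((X - X_back) ** 2).sum()
discarded_energy = (singular_values[3:] ** 2).sum()
```

When fewer components are kept than there are features, the inverse transform returns an approximation of the original data, and the part that is lost corresponds to the discarded components.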

transform(data)

Transform data using the fitted PCA model.

Parameters:
  • data (Dataset) –

    The data to transform.

Returns:
  • Dataset

    The transformed data.

PCAError

Bases: Exception

Custom exception for PCA class.