RBF

Bases: BaseInterpolation

Radial Basis Function (RBF) interpolation model.

Attributes:
  • sigma_min (float) –

    The minimum value for the sigma parameter. This value might change in the optimization process.

  • sigma_max (float) –

    The maximum value for the sigma parameter. This value might change in the optimization process.

  • sigma_diff (float) –

    The tolerance on the distance between the optimal bounded sigma and the sigma bounds (sigma_min and sigma_max). If the optimal sigma lies closer than this value to either bound, the optimization process continues.

  • kernel (str) –

    The kernel to use for the RBF model. The available kernels are:

    - gaussian              : ``exp(-1/2 * (r / const)**2)``
    - multiquadratic        : ``sqrt(1 + (r / const)**2)``
    - inverse               : ``1 / sqrt(1 + (r / const)**2)``
    - cubic                 : ``r**3``
    - thin_plate            : ``r**2 * log(r / const)``
    
  • kernel_func (function) –

    The kernel function to use for the RBF model.

  • smooth (float) –

    The smoothness parameter.

  • subset_data (DataFrame) –

    The subset data used to fit the model.

  • normalized_subset_data (DataFrame) –

    The normalized subset data used to fit the model.

  • target_data (DataFrame) –

    The target data used to fit the model.

  • normalized_target_data (DataFrame) –

    The normalized target data used to fit the model. This attribute is only set if normalize_target_data is True in the fit method.

  • subset_directional_variables (List[str]) –

    The subset directional variables.

  • target_directional_variables (List[str]) –

    The target directional variables.

  • subset_processed_variables (List[str]) –

    The subset processed variables.

  • target_processed_variables (List[str]) –

    The target processed variables.

  • subset_custom_scale_factor (dict) –

    The custom scale factor for the subset data.

  • target_custom_scale_factor (dict) –

    The custom scale factor for the target data.

  • subset_scale_factor (dict) –

    The scale factor for the subset data.

  • target_scale_factor (dict) –

    The scale factor for the target data.

  • rbf_coeffs (DataFrame) –

    The RBF coefficients for the target variables.

  • opt_sigmas (dict) –

    The optimal sigmas for the target variables.
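
The kernel formulas listed under the kernel attribute can be written out with NumPy as shown here. This is a minimal sketch, not the library's own kernel_func implementations: the function names and the KERNELS lookup table are hypothetical, and returning 0 at r = 0 for the thin-plate kernel (its limit value) is an assumption.

```python
import numpy as np


def gaussian(r, const):
    # exp(-1/2 * (r / const)**2)
    return np.exp(-0.5 * (r / const) ** 2)


def multiquadratic(r, const):
    # sqrt(1 + (r / const)**2)
    return np.sqrt(1 + (r / const) ** 2)


def inverse(r, const):
    # 1 / sqrt(1 + (r / const)**2)
    return 1 / np.sqrt(1 + (r / const) ** 2)


def cubic(r, const):
    # r**3; const is accepted only to keep a uniform signature
    return np.asarray(r, dtype=float) ** 3


def thin_plate(r, const):
    # r**2 * log(r / const), with the value at r = 0 set to 0 (assumption)
    r = np.asarray(r, dtype=float)
    safe_r = np.where(r > 0, r, 1.0)  # avoid log(0) warnings
    return np.where(r > 0, r**2 * np.log(safe_r / const), 0.0)


# Hypothetical lookup table from kernel name to kernel function.
KERNELS = {
    "gaussian": gaussian,
    "multiquadratic": multiquadratic,
    "inverse": inverse,
    "cubic": cubic,
    "thin_plate": thin_plate,
}
```

All five functions share the signature (r, const), so a kernel can be selected by name and applied to a whole distance matrix at once.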

Methods:

  • fit – Fits the model to the data.

  • predict – Predicts the data for the provided dataset.

  • fit_predict – Fits the model to the subset and predicts the interpolated dataset.

Notes

TODO: For the moment, this class only supports optimization for one-parameter kernels, so sigma is the only parameter that is optimized. The name comes from the sigma parameter of the Gaussian kernel, but the same parameter is used for all kernels.

Main reference for sigma optimization: https://link.springer.com/article/10.1023/A:1018975909870
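
One common way to choose a single kernel parameter is to minimize a leave-one-out cross-validation cost inside the bounds and to widen the bounds whenever the optimum lands within sigma_diff of one of them. The sketch below illustrates that idea only. Whether RBF uses exactly this cost, this search, or this bound update is not stated here, so loocv_cost, optimize_sigma, the grid search, and the halving and doubling of the bounds are all assumptions.

```python
import numpy as np


def gaussian_kernel(r, const):
    return np.exp(-0.5 * (r / const) ** 2)


def loocv_cost(sigma, x, y, smooth=1e-5):
    """Norm of the leave-one-out errors, computed from a single matrix inverse."""
    r = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    a_inv = np.linalg.inv(gaussian_kernel(r, sigma) + smooth * np.eye(len(x)))
    coeffs = a_inv @ y
    # Leave-one-out error at point k is coeffs[k] / a_inv[k, k].
    return float(np.linalg.norm(coeffs / np.diag(a_inv)))


def optimize_sigma(x, y, sigma_min=0.001, sigma_max=0.1, sigma_diff=0.0001,
                   max_rounds=10):
    """Grid search inside the bounds; widen a bound if the optimum sits on it."""
    for _ in range(max_rounds):
        candidates = np.linspace(sigma_min, sigma_max, 50)
        costs = [loocv_cost(s, x, y) for s in candidates]
        best = float(candidates[int(np.argmin(costs))])
        if best - sigma_min < sigma_diff:
            sigma_min = sigma_min / 2  # hypothetical update rule
        elif sigma_max - best < sigma_diff:
            sigma_max = sigma_max * 2  # hypothetical update rule
        else:
            break
    return best, sigma_min, sigma_max


rng = np.random.default_rng(0)
x = rng.random((40, 2))                 # normalized inputs in [0, 1]
y = np.sin(3 * x[:, 0]) + x[:, 1]       # synthetic target
best_sigma, lo, hi = optimize_sigma(x, y)
```

The returned bounds may differ from the initial ones, which matches the attribute notes saying that sigma_min and sigma_max might change during optimization.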

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> from bluemath_tk.interpolation.rbf import RBF
>>> dataset = pd.DataFrame(
...     {
...         "Hs": np.random.rand(1000) * 7,
...         "Tp": np.random.rand(1000) * 20,
...         "Dir": np.random.rand(1000) * 360,
...     }
... )
>>> subset = dataset.sample(frac=0.25)
>>> target = pd.DataFrame(
...     {
...         "HsPred": subset["Hs"] * 2 + subset["Tp"] * 3,
...         "DirPred": - subset["Dir"],
...     }
... )
>>> rbf = RBF()
>>> predictions = rbf.fit_predict(
...     subset_data=subset,
...     subset_directional_variables=["Dir"],
...     target_data=target,
...     target_directional_variables=["DirPred"],
...     normalize_target_data=True,
...     dataset=dataset,
...     num_threads=4,
...     iteratively_update_sigma=True,
... )

__init__(sigma_min=0.001, sigma_max=0.1, sigma_diff=0.0001, sigma_opt=None, kernel='gaussian', smooth=1e-05)

Raises:
  • ValueError

    - If sigma_min is not a positive float.
    - If sigma_max is not a positive float greater than sigma_min.
    - If sigma_diff is not a positive float.
    - If kernel is not a string naming one of the available kernels.
    - If smooth is not a positive float.
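
The conditions under Raises can be read as a set of argument checks. A minimal sketch of such checks follows; validate_rbf_params and AVAILABLE_KERNELS are hypothetical names used for illustration, and the error messages are not taken from the library.

```python
AVAILABLE_KERNELS = ("gaussian", "multiquadratic", "inverse", "cubic", "thin_plate")


def validate_rbf_params(sigma_min=0.001, sigma_max=0.1, sigma_diff=0.0001,
                        kernel="gaussian", smooth=1e-05):
    """Raise ValueError for any argument that violates the documented rules."""
    if not isinstance(sigma_min, float) or sigma_min <= 0:
        raise ValueError("sigma_min must be a positive float")
    if not isinstance(sigma_max, float) or sigma_max <= sigma_min:
        raise ValueError("sigma_max must be a positive float greater than sigma_min")
    if not isinstance(sigma_diff, float) or sigma_diff <= 0:
        raise ValueError("sigma_diff must be a positive float")
    if not isinstance(kernel, str) or kernel not in AVAILABLE_KERNELS:
        raise ValueError(f"kernel must be one of {AVAILABLE_KERNELS}")
    if not isinstance(smooth, float) or smooth <= 0:
        raise ValueError("smooth must be a positive float")


validate_rbf_params()  # the documented defaults pass every check
```

Because sigma_min is checked first, requiring sigma_max to exceed it also guarantees that sigma_max is positive.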

fit(subset_data, target_data, subset_directional_variables=[], target_directional_variables=[], subset_custom_scale_factor={}, normalize_target_data=True, target_custom_scale_factor={}, num_threads=None, iteratively_update_sigma=False)

Fits the model to the data.

Parameters:
  • subset_data (DataFrame) –

    The subset data used to fit the model.

  • target_data (DataFrame) –

    The target data used to fit the model.

  • subset_directional_variables (List[str], default: [] ) –

    The subset directional variables. Default is [].

  • target_directional_variables (List[str], default: [] ) –

    The target directional variables. Default is [].

  • subset_custom_scale_factor (dict, default: {} ) –

    The custom scale factor for the subset data. Default is {}.

  • normalize_target_data (bool, default: True ) –

    Whether to normalize the target data. Default is True.

  • target_custom_scale_factor (dict, default: {} ) –

    The custom scale factor for the target data. Default is {}.

  • num_threads (int, default: None ) –

    The number of threads to use for the optimization. Default is None.

  • iteratively_update_sigma (bool, default: False ) –

    Whether to iteratively update the sigma parameter. Default is False.

Notes
  • This function fits the RBF model to the data by:
    1. Preprocessing the subset and target data.
    2. Calculating the optimal sigma for the target variables.
    3. Storing the RBF coefficients and optimal sigmas.
  • The number of threads used for the optimization can be set with num_threads.
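
The preprocessing step can be pictured as follows. This sketch makes several assumptions that this page does not confirm: directional variables (in degrees) are split into cosine and sine components, every processed variable is min-max scaled, and a custom scale factor is a (min, max) pair per variable. The preprocess function and the _u and _v suffixes are hypothetical, and plain dictionaries of arrays stand in for DataFrames to keep the example dependency-light.

```python
import numpy as np


def preprocess(data, directional_variables=(), custom_scale_factor=None):
    """Return (normalized variables, scale factors) for a dict of 1-D arrays."""
    custom_scale_factor = custom_scale_factor or {}

    # Split each directional variable into two components (assumption).
    processed = {}
    for name, values in data.items():
        values = np.asarray(values, dtype=float)
        if name in directional_variables:
            radians = np.deg2rad(values)
            processed[f"{name}_u"] = np.cos(radians)
            processed[f"{name}_v"] = np.sin(radians)
        else:
            processed[name] = values

    # Min-max scale every processed variable; a custom (min, max) pair wins.
    # A constant variable (min == max) would need special handling.
    normalized, scale_factor = {}, {}
    for name, values in processed.items():
        lo, hi = custom_scale_factor.get(name, (values.min(), values.max()))
        scale_factor[name] = (lo, hi)
        normalized[name] = (values - lo) / (hi - lo)
    return normalized, scale_factor


subset = {"Hs": [0.0, 3.5, 7.0], "Dir": [0.0, 90.0, 180.0]}
normalized_subset, subset_scale = preprocess(subset, directional_variables=["Dir"])
```

Keeping the scale factors alongside the normalized data is what lets a later prediction step map new inputs into the same range and map outputs back to physical units.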

fit_predict(subset_data, target_data, dataset, subset_directional_variables=[], target_directional_variables=[], subset_custom_scale_factor={}, normalize_target_data=True, target_custom_scale_factor={}, num_threads=None, iteratively_update_sigma=False)

Fits the model to the subset and predicts the interpolated dataset.

Parameters:
  • subset_data (DataFrame) –

    The subset data used to fit the model.

  • target_data (DataFrame) –

    The target data used to fit the model.

  • dataset (DataFrame) –

    The dataset to predict (must have the same variables as the subset).

  • subset_directional_variables (List[str], default: [] ) –

    The subset directional variables. Default is [].

  • target_directional_variables (List[str], default: [] ) –

    The target directional variables. Default is [].

  • subset_custom_scale_factor (dict, default: {} ) –

    The custom scale factor for the subset data. Default is {}.

  • normalize_target_data (bool, default: True ) –

    Whether to normalize the target data. Default is True.

  • target_custom_scale_factor (dict, default: {} ) –

    The custom scale factor for the target data. Default is {}.

  • num_threads (int, default: None ) –

    The number of threads to use for the optimization. Default is None.

  • iteratively_update_sigma (bool, default: False ) –

    Whether to iteratively update the sigma parameter. Default is False.

Returns:
  • DataFrame

    The interpolated dataset.

Notes
  • This function fits the model to the subset and predicts the interpolated dataset.

predict(dataset)

Predicts the data for the provided dataset.

Parameters:
  • dataset (DataFrame) –

    The dataset to predict (must have the same variables as the subset).

Returns:
  • DataFrame

    The interpolated dataset.

Raises:
  • ValueError

    If the model is not fitted.

Notes
  • This function predicts the data by:
    1. Reconstructing the data using the fitted coefficients.
    2. Denormalizing the target data if normalize_target_data is True.
    3. Calculating the degrees for the target directional variables.

RBFError

Bases: Exception

Custom exception for RBF class.

gaussian_kernel(r, const)

This function calculates the Gaussian kernel for the given distance and constant.

Parameters:
  • r (float) –

    The distance.

  • const (float) –

    The constant (usually called sigma for the Gaussian kernel).

Returns:
  • float

    The Gaussian kernel value.

Notes
  • The Gaussian kernel is defined as: K(r) = exp(-r^2 / (2 * const^2)) (https://en.wikipedia.org/wiki/Gaussian_function)
  • Here, we are assuming the mean is 0.