bluemath_tk.distributions package

Submodules

bluemath_tk.distributions.copula module

bluemath_tk.distributions.extreme_correction module

class bluemath_tk.distributions.extreme_correction.ExtremeCorrection(corr_config: dict, pot_config: dict, method: str = 'pot', conf_level: float = 0.95, debug: bool = False)[source]

Bases: BlueMathModel

Extreme Correction class

correlations() dict[source]

Rank based correlations between sampled and corrected sampled data

Returns:

Dictionary with the Spearman, Kendall and Pearson correlation coefficients. Keys:

  • “Spearman” – Spearman correlation coefficient

  • “Kendall” – Kendall correlation coefficient

  • “Pearson” – Pearson correlation coefficient

Return type:

dict
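
As an illustration of what such a dictionary contains, the following standard-library sketch computes the three coefficients for two paired samples. It is not the library's implementation: the helper names, the tau-a variant of Kendall's coefficient and the sample values are assumptions made for this example.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def kendall(a, b):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(a)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (a[i] - a[j]) * (b[i] - b[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical sampled values and a rank-preserving corrected version.
sampled = [1.2, 2.5, 3.1, 4.8, 6.0]
corrected = [1.0, 2.9, 3.0, 5.5, 9.0]
result = {
    "Spearman": spearman(sampled, corrected),
    "Kendall": kendall(sampled, corrected),
    "Pearson": pearson(sampled, corrected),
}
```

Because the hypothetical correction preserves the ordering of the values, the two rank-based coefficients equal 1 while the Pearson coefficient stays below 1.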

fit(data_hist: Dataset, plot_diagnostic: bool = False) None[source]

Fit a GEV or GPD distribution to the historical data

Parameters:
  • data_hist (xr.Dataset) – Dataset with historical data

  • bmus (list[bool, str], default=[False, ""]) – Whether to apply the correction by BMUS; if so, the name of the BMUS variable must also be given

  • plot_diagnostic (bool, default=False) – Whether to plot the diagnostic plot of the fitted distribution

fit_transform(data_hist: Dataset, data_sim: Dataset, bmus: list[bool, str] = [False, ''], prob: str = 'unif', plot_diagnostic: bool = False, random_state: int = 0) Dataset[source]

Fit and apply the correction procedure

See fit and transform for more information

Parameters:
  • data_hist (xr.Dataset) – Dataset with historical data

  • data_sim (xr.Dataset) – Dataset with synthetic data

  • bmus (list[bool, str], default=[False, ""]) – Whether to apply the correction by BMUS; if so, the name of the BMUS variable must also be given

  • prob (str, default="unif") – Type of probabilities used to randomly correct the AM. If “unif”, sorted random uniform probabilities are used. If “ecdf”, the ECDF is used.

  • plot_diagnostic (bool, default=False) – Whether to plot the diagnostic plot of the fitted distribution

  • random_state (int, default=0) – Random state to generate the probabilities

Returns:

sim_pit_data_corrected – Point-in-time corrected data

Return type:

xr.Dataset

hist_retper_plot() tuple[Figure, Axes][source]

Historical Return Period plot

Returns:

  • fig – plt.Figure

  • ax – plt.Axes

plot() tuple[list[Figure], list[Axes]][source]

Plot return periods

sim_retper_plot() tuple[Figure, Axes][source]

Corrected Sampled and Sampled Return Period plot

Returns:

  • fig – plt.Figure

  • ax – plt.Axes

test() dict[source]

Cramér-von Mises test to check the goodness of fit (GOF) of the fitted distribution

Test to check the goodness of fit of the historical fitted distribution with the synthetic data. Null hypothesis: the sampled AM come from the fitted extreme distribution.

Returns:

Statistic and p-value of the Cramér-von Mises test

Return type:

dict

Notes

The test is applied to the AM because the correction procedure is applied to the AM
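
For orientation, the one-sample Cramér-von Mises statistic can be sketched with the standard library alone. This is a minimal illustration, not the method's implementation: the Gumbel CDF stands in for the fitted extreme distribution, the sample values are synthetic, and no p-value is computed.

```python
import math


def gumbel_cdf(x, loc=0.0, scale=1.0):
    """CDF of the Gumbel distribution (GEV with zero shape)."""
    return math.exp(-math.exp(-(x - loc) / scale))


def cramer_von_mises_statistic(sample, cdf):
    """One-sample statistic W^2 = 1/(12n) + sum((F(x_(i)) - (2i-1)/(2n))^2)."""
    xs = sorted(sample)
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        total += (cdf(x) - (2 * i - 1) / (2.0 * n)) ** 2
    return total


# Synthetic annual maxima placed exactly at the Gumbel quantiles of the
# positions (2i - 1) / (2n): the statistic then takes its minimum 1/(12n).
n = 20
perfect_am = [-math.log(-math.log((2 * i - 1) / (2.0 * n))) for i in range(1, n + 1)]
w2_good = cramer_von_mises_statistic(perfect_am, gumbel_cdf)

# Shifting the same values by two units makes them disagree with the CDF.
shifted_am = [x + 2.0 for x in perfect_am]
w2_bad = cramer_von_mises_statistic(shifted_am, gumbel_cdf)
```

A small statistic supports the null hypothesis; the shifted sample yields a much larger value.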

transform(data_sim: Dataset, prob: str = 'unif', random_state: int = 0, siglevel: float = 0.05) Dataset[source]

Apply the correction in the synthetic dataset

Parameters:
  • data_sim (xr.Dataset) – Dataset with synthetic data

  • prob (str, default="unif") – Type of probabilities used to randomly correct the AM. If “unif”, sorted random uniform probabilities are used. If “ecdf”, the ECDF is used.

  • random_state (int, default=0) – Random state to generate the probabilities

  • siglevel (float, default=0.05) – Significance level

Returns:

sim_pit_data_corrected – Point-in-time corrected data

Return type:

xr.Dataset
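
The two prob options can be pictured with the sketch below. It rests on several assumptions that the documentation above does not confirm: AM is read as annual maxima, the “ecdf” probabilities are taken as the plotting positions i / (n + 1), a Gumbel quantile function stands in for the fitted distribution, and all function names are hypothetical.

```python
import math
import random


def correction_probabilities(n, prob="unif", random_state=0):
    """Return n increasing probabilities in (0, 1)."""
    if prob == "unif":
        rng = random.Random(random_state)
        # Clamp away from 0 so the quantile function stays finite.
        return sorted(max(rng.random(), 1e-12) for _ in range(n))
    if prob == "ecdf":
        return [i / (n + 1) for i in range(1, n + 1)]  # assumed plotting positions
    raise ValueError("prob must be 'unif' or 'ecdf'")


def gumbel_qf(p, loc, scale):
    """Quantile function of the Gumbel distribution."""
    return loc - scale * math.log(-math.log(p))


def correct_annual_maxima(am_sim, loc, scale, prob="ecdf", random_state=0):
    """Replace each maximum by a fitted quantile while keeping its rank."""
    probs = correction_probabilities(len(am_sim), prob, random_state)
    order = sorted(range(len(am_sim)), key=lambda i: am_sim[i])
    corrected = [0.0] * len(am_sim)
    for rank, idx in enumerate(order):
        corrected[idx] = gumbel_qf(probs[rank], loc, scale)
    return corrected


am_sim = [3.1, 2.4, 4.0, 2.9, 3.6]  # hypothetical simulated maxima
am_ecdf = correct_annual_maxima(am_sim, loc=3.0, scale=0.5, prob="ecdf")
am_unif = correct_annual_maxima(am_sim, loc=3.0, scale=0.5, prob="unif", random_state=0)
```

In both cases the largest simulated maximum receives the largest corrected value, so the ordering of the series is preserved.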

bluemath_tk.distributions.gev module

class bluemath_tk.distributions.gev.GEV[source]

Bases: BaseDistribution

Generalized Extreme Value (GEV) distribution class.

This class contains all the methods associated with the GEV distribution.

name[source]

The complete name of the distribution (GEV).

Type:

str

nparams[source]

Number of GEV parameters.

Type:

int

param_names[source]

Names of GEV parameters (location, scale, shape).

Type:

List[str]

pdf(x, loc, scale, shape)[source]

Probability density function.

cdf(x, loc, scale, shape)[source]

Cumulative distribution function

qf(p, loc, scale, shape)[source]

Quantile function

sf(x, loc, scale, shape)[source]

Survival function

nll(data, loc, scale, shape)[source]

Negative Log-Likelihood function

fit(data)[source]

Fit distribution to data (NOT IMPLEMENTED).

random(size, loc, scale, shape)[source]

Generates random values from GEV distribution.

mean(loc, scale, shape)[source]

Mean of GEV distribution.

median(loc, scale, shape)[source]

Median of GEV distribution.

variance(loc, scale, shape)[source]

Variance of GEV distribution.

std(loc, scale, shape)[source]

Standard deviation of GEV distribution.

stats(loc, scale, shape)[source]

Summary statistics of GEV distribution.

Notes

  • This class is designed to obtain all the properties associated with the GEV distribution.

Examples

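>>> import numpy as np
>>> from bluemath_tk.distributions.gev import GEV
>>> x = np.linspace(-2, 5, 100)  # values at which to evaluate the pdf and cdf
>>> p = np.linspace(0.01, 0.99, 99)  # probabilities strictly inside (0, 1)
>>> gev_pdf = GEV.pdf(x, loc=0, scale=1, shape=0.1)
>>> gev_cdf = GEV.cdf(x, loc=0, scale=1, shape=0.1)
>>> gev_qf = GEV.qf(p, loc=0, scale=1, shape=0.1)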
static cdf(x: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Cumulative distribution function

Parameters:
  • x (np.ndarray) – Values to compute their probability

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

p – Probability

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static fit(data: ndarray, **kwargs) FitResult[source]

Fit GEV distribution

Parameters:
  • data (np.ndarray) – Data to fit the GEV distribution

  • **kwargs (dict, optional) – Additional keyword arguments for the fitting function. These can include options like method, bounds, etc. See fit_dist for more details. If not provided, default fitting options will be used.

Returns:

Result of the fit containing the parameters loc, scale, shape, success status, and negative log-likelihood value.

Return type:

FitResult

static mean(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Mean

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

mean – Mean value of GEV with the given parameters

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static median(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Median

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

median – Median value of GEV with the given parameters

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static name() str[source]
static nll(data: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Negative Log-Likelihood function

Parameters:
  • data (np.ndarray) – Data to compute the Negative Log-Likelihood value

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

nll – Negative Log-Likelihood value

Return type:

float

static nparams() int[source]

Number of parameters of GEV

static param_names() List[str][source]

Name of parameters of GEV

static pdf(x: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Probability density function

Parameters:
  • x (np.ndarray) – Values to compute the probability density value

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

pdf – Probability density function values

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static qf(p: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Quantile function (Inverse of Cumulative Distribution Function)

Parameters:
  • p (np.ndarray) – Probabilities to compute their quantile

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

q – Quantile value

Return type:

np.ndarray

Raises:
  • ValueError – If probabilities are not in the range (0, 1).

  • ValueError – If scale is not greater than 0.
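
The quantile function is what turns a return period into a return level. The sketch below is a scalar illustration under the same assumed sign convention (positive shape, heavy tail); the names gev_qf and return_level are hypothetical and not part of the library.

```python
import math


def gev_qf(p, loc=0.0, scale=1.0, shape=0.0):
    """GEV quantile for a scalar probability p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be in the range (0, 1)")
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    y = -math.log(p)
    if shape == 0.0:
        return loc - scale * math.log(y)  # Gumbel limit
    return loc + scale * (y ** (-shape) - 1.0) / shape


def return_level(period, loc=0.0, scale=1.0, shape=0.0):
    """Level exceeded on average once every `period` blocks (e.g. years)."""
    return gev_qf(1.0 - 1.0 / period, loc, scale, shape)


level_100 = return_level(100, loc=0.0, scale=1.0, shape=0.0)
```

For the standard Gumbel case the 100-block return level is about 4.6, and it grows when the shape is positive.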

static random(size: int, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0, random_state: int = None) ndarray[source]

Generates random values from GEV distribution

Parameters:
  • size (int) – Number of random values to generate

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

  • random_state (np.random.RandomState, optional) – Random state for reproducibility. If None, no random state is used.

Returns:

x – Random values from GEV distribution

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.
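
One standard way to draw such values is inverse-transform sampling: uniform draws are passed through the quantile function. The sketch below shows that technique with the standard library; whether the library samples this way is not stated here, and the function name and seed handling are assumptions.

```python
import math
import random


def gev_random(size, loc=0.0, scale=1.0, shape=0.0, seed=None):
    """Draw `size` GEV values by inverse-transform sampling."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    rng = random.Random(seed)
    draws = []
    for _ in range(size):
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() never returns 1.0
            u = rng.random()
        y = -math.log(u)
        if shape == 0.0:
            draws.append(loc - scale * math.log(y))
        else:
            draws.append(loc + scale * (y ** (-shape) - 1.0) / shape)
    return draws


sample = gev_random(5000, loc=0.0, scale=1.0, shape=0.0, seed=42)
sample_median = sorted(sample)[len(sample) // 2]
```

With a fixed seed the draws are reproducible, and the sample median should lie close to the theoretical Gumbel median -ln(ln 2).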

static sf(x: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Survival function (1-Cumulative Distribution Function)

Parameters:
  • x (np.ndarray) – Values to compute their survival function value

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

sp – Survival function value

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static stats(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) Dict[str, float][source]

Summary statistics

Return summary statistics including mean, std, variance, etc.

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

stats – Summary statistics of GEV distribution with the given parameters

Return type:

dict

Raises:

ValueError – If scale is not greater than 0.

static std(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Standard deviation

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

std – Standard Deviation of GEV with the given parameters

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static variance(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Variance

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

var – Variance of GEV with the given parameters

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.
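
The closed-form expressions behind mean, median, variance and std can be collected in one sketch. It assumes the positive-shape, heavy-tail convention and returns infinity where a moment does not exist; the library's behaviour in those cases is not documented on this page, so treat that choice, and the function name, as assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gev_moments(loc=0.0, scale=1.0, shape=0.0):
    """Mean, median, variance and std of a GEV distribution."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    if shape == 0.0:  # Gumbel limit
        mean = loc + scale * EULER_GAMMA
        median = loc - scale * math.log(math.log(2.0))
        variance = scale ** 2 * math.pi ** 2 / 6.0
    else:
        median = loc + scale * (math.log(2.0) ** (-shape) - 1.0) / shape
        if shape < 1.0:
            g1 = math.gamma(1.0 - shape)
            mean = loc + scale * (g1 - 1.0) / shape
        else:
            mean = math.inf  # the mean does not exist for shape >= 1
        if shape < 0.5:
            g2 = math.gamma(1.0 - 2.0 * shape)
            variance = scale ** 2 * (g2 - g1 ** 2) / shape ** 2
        else:
            variance = math.inf  # the variance does not exist for shape >= 1/2
    return {"mean": mean, "median": median, "variance": variance,
            "std": math.sqrt(variance)}


gumbel_stats = gev_moments(loc=0.0, scale=1.0, shape=0.0)
heavy_stats = gev_moments(loc=0.0, scale=1.0, shape=0.1)
```

The median exists for every shape, whereas the mean requires shape < 1 and the variance requires shape < 1/2.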

bluemath_tk.distributions.gpd module

class bluemath_tk.distributions.gpd.GPD[source]

Bases: BaseDistribution

Generalized Pareto Distribution (GPD) class.

This class contains all the methods associated with the GPD distribution.

name[source]

The complete name of the distribution (GPD).

Type:

str

nparams[source]

Number of GPD parameters.

Type:

int

param_names[source]

Names of the GPD parameters (threshold, scale, shape).

Type:

List[str]

pdf(x, threshold, scale, shape)[source]

Probability density function.

cdf(x, threshold, scale, shape)[source]

Cumulative distribution function

qf(p, threshold, scale, shape)[source]

Quantile function

sf(x, threshold, scale, shape)[source]

Survival function

nll(data, threshold, scale, shape)[source]

Negative Log-Likelihood function

fit(data)[source]

Fit distribution to data (NOT IMPLEMENTED).

random(size, threshold, scale, shape)[source]

Generates random values from GPD distribution.

mean(threshold, scale, shape)[source]

Mean of GPD distribution.

median(threshold, scale, shape)[source]

Median of GPD distribution.

variance(threshold, scale, shape)[source]

Variance of GPD distribution.

std(threshold, scale, shape)[source]

Standard deviation of GPD distribution.

stats(threshold, scale, shape)[source]

Summary statistics of GPD distribution.

Notes

  • This class is designed to obtain all the properties associated with the GPD distribution.

Examples

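>>> import numpy as np
>>> from bluemath_tk.distributions.gpd import GPD
>>> x = np.linspace(0, 5, 100)  # values at or above the threshold
>>> p = np.linspace(0.01, 0.99, 99)  # probabilities strictly inside (0, 1)
>>> gpd_pdf = GPD.pdf(x, threshold=0, scale=1, shape=0.1)
>>> gpd_cdf = GPD.cdf(x, threshold=0, scale=1, shape=0.1)
>>> gpd_qf = GPD.qf(p, threshold=0, scale=1, shape=0.1)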
static cdf(x: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Cumulative distribution function

Parameters:
  • x (np.ndarray) – Values to compute their probability

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

p – Probability

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static fit(data: ndarray, threshold: float, **kwargs) FitResult[source]

Fit GPD distribution

Parameters:
  • data (np.ndarray) – Data to fit the GPD distribution

  • threshold (float) – Threshold parameter

  • **kwargs (dict, optional) – Additional keyword arguments for the fitting function. These can include options like method, bounds, etc. See fit_dist for more details. If not provided, default fitting options will be used.

Returns:

Result of the fit containing the parameters loc, scale, shape, success status, and negative log-likelihood value.

Return type:

FitResult

static mean(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Mean

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

mean – Mean value of GPD with the given parameters

Return type:

np.ndarray

Raises:
  • ValueError – If scale is not greater than 0.

  • Warning – If shape is greater than or equal to 1, mean is not defined. In this case, it returns infinity.

static median(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Median

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

median – Median value of GPD with the given parameters

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static name() str[source]
static nll(data: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Negative Log-Likelihood function

Parameters:
  • data (np.ndarray) – Data to compute the Negative Log-Likelihood value

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

nll – Negative Log-Likelihood value

Return type:

float

static nparams() int[source]

Number of parameters of GPD

static param_names() List[str][source]

Name of parameters of GPD

static pdf(x: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Probability density function

Parameters:
  • x (np.ndarray) – Values to compute the probability density value

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

pdf – Probability density function values

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static qf(p: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Quantile function (Inverse of Cumulative Distribution Function)

Parameters:
  • p (np.ndarray) – Probabilities to compute their quantile

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

q – Quantile value

Return type:

np.ndarray

Raises:
  • ValueError – If probabilities are not in the range (0, 1).

  • ValueError – If scale is not greater than 0.
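
The relationship between cdf and qf can be checked with a round trip. The following scalar sketch uses the usual GPD formulas with the standard library; the function names are hypothetical and the library's vectorised implementation may differ in detail.

```python
import math


def gpd_cdf(x, threshold=0.0, scale=1.0, shape=0.0):
    """GPD CDF at a scalar x."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    z = (x - threshold) / scale
    if z <= 0.0:
        return 0.0  # no probability mass below the threshold
    if shape == 0.0:
        return 1.0 - math.exp(-z)  # exponential limit
    arg = 1.0 + shape * z
    if arg <= 0.0:
        return 1.0  # above the upper end point (only when shape < 0)
    return 1.0 - arg ** (-1.0 / shape)


def gpd_qf(p, threshold=0.0, scale=1.0, shape=0.0):
    """GPD quantile for a scalar probability p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be in the range (0, 1)")
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    if shape == 0.0:
        return threshold - scale * math.log(1.0 - p)
    return threshold + scale * ((1.0 - p) ** (-shape) - 1.0) / shape


q90 = gpd_qf(0.9, threshold=10.0, scale=2.0, shape=0.2)
p_back = gpd_cdf(q90, threshold=10.0, scale=2.0, shape=0.2)
```

Evaluating the CDF at the returned quantile recovers the original probability, here 0.9.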

static random(size: int, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0, random_state: int = None) ndarray[source]

Generates random values from GPD distribution

Parameters:
  • size (int) – Number of random values to generate

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

  • random_state (np.random.RandomState, optional) – Random state for reproducibility. If None, no random state is used.

Returns:

x – Random values from GPD distribution

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static sf(x: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Survival function (1-Cumulative Distribution Function)

Parameters:
  • x (np.ndarray) – Values to compute their survival function value

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

sp – Survival function value

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static stats(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) Dict[str, float][source]

Summary statistics

Return summary statistics including mean, std, variance, etc.

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

stats – Summary statistics of GPD distribution with the given parameters

Return type:

dict

Raises:

ValueError – If scale is not greater than 0.

static std(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Standard deviation

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

std – Standard Deviation of GPD with the given parameters

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static variance(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Variance

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

var – Variance of GPD with the given parameters

Return type:

np.ndarray

Raises:
  • ValueError – If scale is not greater than 0.

  • Warning – If shape is greater than or equal to 1/2, variance is not defined. In this case, it returns infinity.
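
The existence conditions for the GPD moments follow directly from their closed forms: the mean is threshold + scale / (1 - shape) for shape < 1, and the variance is scale² / ((1 - shape)² (1 - 2·shape)) for shape < 1/2. A minimal sketch with hypothetical function names, returning infinity outside those ranges:

```python
import math


def gpd_mean(threshold=0.0, scale=1.0, shape=0.0):
    """Mean of the GPD; infinite when shape >= 1."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    if shape >= 1.0:
        return math.inf
    return threshold + scale / (1.0 - shape)


def gpd_variance(threshold=0.0, scale=1.0, shape=0.0):
    """Variance of the GPD; infinite when shape >= 1/2."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    if shape >= 0.5:
        return math.inf
    return scale ** 2 / ((1.0 - shape) ** 2 * (1.0 - 2.0 * shape))


mean_quarter = gpd_mean(threshold=0.0, scale=1.0, shape=0.25)
var_quarter = gpd_variance(threshold=0.0, scale=1.0, shape=0.25)
```

The threshold shifts the mean but leaves the variance unchanged, which is why gpd_variance does not use it in the formula.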

bluemath_tk.distributions.nonstat_gev module

class bluemath_tk.distributions.nonstat_gev.NonStatGEV(xt: ndarray, t: ndarray, covariates: ndarray | DataFrame | None = None, harms: bool = True, trends: bool = False, kt: ndarray | None = None, quanval: float = 0.95, var_name: str = 'x')[source]

Bases: BlueMathModel

Non-Stationary Generalized Extreme Value Distribution

Class to implement the Non-Stationary GEV including trends and/or covariates in the location, scale and shape parameters. This methodology selects the covariates and trends that minimize the Akaike Information Criterion (AIC).

This class is based on the work of R. Mínguez et al. 2010. “Pseudooptimal parameter selection of non-stationary generalized extreme value models for environmental variables”. Environ. Model. Softw. 25, 1592-1607.

Parameters:
  • xt (np.ndarray) – Data to fit Non Stationary GEV.

  • t (np.ndarray, default=None.) – Time associated to the data.

  • covariates (np.ndarray | pd.DataFrame, default=None.) – Covariates to include for location, scale and shape parameters.

  • kt (np.ndarray, default=None.) – Frequency of block maxima, if None, it is assumed to be 1.

  • trends (bool, default=False.) – Whether trends should be included, if so, t must be passed.

  • quanval (float, default=0.95.) – Confidence interval value

  • var_name (str, default="x") – Name of the variable to be used in the model. Used for plotting purposes.

fit:

Fit the Non-Stationary GEV with desired Trends and Covariates.

auto_adjust:

Automatically selects the best covariates and trends based on AIC.

Examples

>>> from bluemath_tk.distributions.nonstat_gev import NonStatGEV
>>> # x, t and covariates are user-supplied arrays (data, times, covariates)
>>> nonstat_gev = NonStatGEV(x, t, covariates, trends=True)
>>> fit_result = nonstat_gev.auto_adjust()
>>> fit_result = nonstat_gev.fit(
...     nmu=2, npsi=2, ngamma=2, ntrend_loc=1, list_loc="all",
...     ntrend_sc=1, list_sc="all", ntrend_sh=1, list_sh="all",
... )
PPplot(save=False)[source]

PP plot

Parameters:

save (bool, default=False) – If True, save the plot in “Figures/”

QQplot(save: bool = False)[source]

QQ plot

Parameters:

save (bool, default=False) – If True, save the plot in “Figures/”

auto_adjust(max_iter: int = 1000, plot: bool = False, stationary_shape: bool = False) dict[source]

This method automatically selects and calculates the parameters that minimize the AIC of the Non-Stationary GEV distribution. It uses the Maximum Likelihood method within an iterative scheme, including one parameter at a time based on a perturbation criterion. The process is repeated until no further improvement in the objective function is achieved.

Parameters:
  • max_iter (int, default=1000) – Number of iterations in the optimization process.

  • plot (bool, default=False) – Whether to plot the adjusted distribution

  • stationary_shape (bool, default=False) – If True, the shape parameter remains stationary

Returns:

fit_result – Dictionary with the optimal parameters and values related to the Non-Stationary GEV distribution. The keys of the dictionary are:

  • x – Optimal solution

  • beta0, beta, betaT, beta_cov – Location parameters (intercept, harmonic, trend, covariates)

  • alpha0, alpha, alphaT, alpha_cov – Scale parameters (intercept, harmonic, trend, covariates)

  • gamma0, gamma, gammaT, gamma_cov – Shape parameters (intercept, harmonic, trend, covariates)

  • negloglikelihood – Negative log-likelihood value at the optimal solution

  • loglikelihood – Log-likelihood value at the optimal solution

  • grad – Gradient of the log-likelihood function at the optimal solution

  • hessian – Hessian matrix of the log-likelihood function at the optimal solution

  • AIC – Akaike Information Criterion value at the optimal solution

  • invI0 – Fisher information matrix at the optimal solution

  • std_param – Standard deviation of parameters at the optimal solution

Return type:

dict
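
The selection criterion itself is simple to state: AIC = 2k + 2·NLL, where k is the number of fitted parameters and NLL the negative log-likelihood, and the candidate with the smallest AIC is preferred. The sketch below applies it to made-up candidate fits; the model names and the numbers are purely hypothetical and do not come from the library.

```python
def aic(negloglikelihood, n_params):
    """Akaike Information Criterion from a negative log-likelihood."""
    return 2.0 * n_params + 2.0 * negloglikelihood


# Hypothetical candidates: (negative log-likelihood, number of parameters).
candidates = {
    "stationary": (152.4, 3),
    "one harmonic in location": (140.1, 5),
    "one harmonic and trend in location": (139.9, 6),
}
scores = {name: aic(nll, k) for name, (nll, k) in candidates.items()}
best = min(scores, key=scores.get)
```

Here the extra trend parameter lowers the negative log-likelihood by only 0.2, which does not offset the penalty of 2 per parameter, so the model with one harmonic is retained.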

fit(nmu: int = 0, npsi: int = 0, ngamma: int = 0, ntrend_loc: int = 1, list_loc: list | str | None = 'all', ntrend_sc: int = 1, list_sc: list | str | None = 'all', ntrend_sh: int = 1, list_sh: list | str | None = 'all', options: dict = None, plot: bool = False) dict[source]

Function to determine the optimal parameters of Non-Stationary GEV for given covariates, trends and harmonics.

By default the method fits a Non-Stationary GEV including trends in all the parameters, all possible covariates and no harmonics

Parameters:
  • nmu (int) – Number of harmonics to be included in the location parameter

  • npsi (int) – Number of harmonics to be included in the scale parameter

  • ngamma (int) – Number of harmonics to be included in the shape parameter

  • ntrend_loc (int, default=1) – If trends in location are included.

  • list_loc (list or str, default="all") – List of indices of covariates to be included in the location parameter. If None, no covariates are included in the location parameter.

  • ntrend_sc (int, default=1) – If trends in scale are included.

  • list_sc (list or str, default="all") – List of indices of covariates to be included in the scale parameter. If None, no covariates are included in the scale parameter.

  • ntrend_sh (int, default=1) – If trends in shape are included.

  • list_sh (list or str, default="all") – List of indices of covariates to be included in the shape parameter. If None, no covariates are included in the shape parameter.

  • plot (bool, default=False) – If True, plot the diagnostic plots

Returns:

fit_result – Dictionary with the optimal parameters and other information about the fit. The keys of the dictionary are:

  • beta0, beta, betaT, beta_cov – Location parameters (intercept, harmonic, trend, covariates)

  • alpha0, alpha, alphaT, alpha_cov – Scale parameters (intercept, harmonic, trend, covariates)

  • gamma0, gamma, gammaT, gamma_cov – Shape parameters (intercept, harmonic, trend, covariates)

  • negloglikelihood – Negative log-likelihood value at the optimal solution

  • hessian – Hessian matrix of the log-likelihood function at the optimal solution

  • invI0 – Inverse of Fisher information matrix

Return type:

dict

static parametro(beta0: float | None = None, beta: ndarray | None = None, betaT: float | None = None, beta_cov: ndarray | None = None, covariates: ndarray | None = None, indicesint: ndarray | None = None, times: ndarray | None = None, x: ndarray | None = None, ntend: int | None = None) ndarray[source]

This function computes the location, scale or shape parameter from the given coefficients, following expressions (2)-(3) in the paper.

Parameters:
  • beta0 (float, optional) – Value of the intercept

  • beta (np.ndarray, optional) – Value of the harmonics terms

  • betaT (float, optional) – Trend parameter

  • beta_cov (np.ndarray, optional) – Covariate parameters

  • covariates (np.ndarray, optional) – Covariate matrix, where each column corresponds to a covariate and each row to a time point

  • indicesint (np.ndarray, optional) – Covariate mean values in the integral interval

  • times (np.ndarray, optional) – Times when covariates are known, used to find the nearest value

  • t (np.ndarray, optional) – Specific time point to evaluate the parameters at, if None, uses the times given

Returns:

y – Values of the parameter

Return type:

np.ndarray
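
A time-dependent parameter built from an intercept, harmonics, a linear trend and covariates can be sketched as follows. This is an assumed reading of the structure described above, not a transcription of the paper or of the library code: it supposes time is measured in years (so the first harmonic has an annual period), that harmonic coefficients are stored as cosine/sine pairs, and that covariates enter linearly. The function name is hypothetical.

```python
import math


def parameter_at(t, intercept, harmonics=(), trend=0.0, cov_coeffs=(), cov_values=()):
    """Evaluate intercept + harmonics + trend + covariate terms at time t."""
    value = intercept
    # Assumed layout: (cos_1, sin_1, cos_2, sin_2, ...) for harmonics 1, 2, ...
    for k in range(len(harmonics) // 2):
        angle = 2.0 * math.pi * (k + 1) * t
        value += harmonics[2 * k] * math.cos(angle)
        value += harmonics[2 * k + 1] * math.sin(angle)
    value += trend * t  # linear trend in time
    value += sum(c * v for c, v in zip(cov_coeffs, cov_values))
    return value


# A quarter of a year in, the cosine term vanishes and the sine term is 1.
loc_quarter = parameter_at(0.25, intercept=1.0, harmonics=(0.5, 0.2), trend=0.1)
loc_with_cov = parameter_at(0.25, intercept=1.0, harmonics=(0.5, 0.2), trend=0.1,
                            cov_coeffs=(2.0,), cov_values=(3.0,))
```

Under these assumptions the same function would serve the location, scale and shape parameters, each with its own set of coefficients.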

paramplot(save: bool = False)[source]

Create a heatmap of parameter sensitivities for location, scale and shape.

Parameters:

save (bool, default=False)

plot(return_plot: bool = False, save: bool = False, init_year: int = 0)[source]

Plot the location, scale and shape parameters, also the PP plot and QQ plot

Return period plot is plotted if and only if no covariates and trends are included

Parameters:
  • return_plot (bool, default=False) – If True, return period plot is plotted

  • save (bool, default=False) – If True, save all the figures in “Figures/”

  • init_year (int, default=0) – Initial year for plotting purposes

returnperiod_plot(annualplot: bool = True, conf_int: bool = False, monthly_plot: bool = False, save: bool = False)[source]

Function to plot the aggregated return period plot for each month and the annual return period

Parameters:
  • annualplot (bool, default=True) – Whether to plot the annual return period plot

  • conf_int (bool, default=False) – Whether to plot the confidence bands for annual return periods. Computationally expensive.

  • monthly_plot (bool, default=False) – Whether to plot the return periods grouped by months

  • save (bool, default=False) – Whether to save the plot

summary()[source]

Print a summary of the fitted model, including parameter estimates, standard errors and fit statistics.

bluemath_tk.distributions.nonstat_gev.bsimp(fun, a, b, n=None, epsilon=1e-08, trace=0)[source]

BSIMP Numerically evaluate integral, low order method.

I = BSIMP(‘F’,A,B) approximates the integral of F(X) from A to B within a relative error of 1e-3 using an iterative Simpson’s rule. ‘F’ is a string containing the name of the function. Function F must return a vector of output values if given a vector of input values.

I = BSIMP(‘F’,A,B,EPSILON) integrates to a total error of EPSILON.

I = BSIMP(‘F’,A,B,N,EPSILON,TRACE,TRACETOL) integrates to a relative error of EPSILON, beginning with N subdivisions of the interval [A,B]; for non-zero TRACE, traces the function evaluations with a point plot.

[I,cnt] = BSIMP(F,a,b,epsilon) also returns a function evaluation count.

Roberto Minguez Solana. Copyright (c) 2001 by Universidad de Cantabria
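
The idea of an iterative Simpson's rule can be sketched as follows: apply the composite rule, double the number of subdivisions, and stop once successive estimates agree to the requested relative tolerance. This is an independent illustration and not the bsimp source; the function names, the stopping rule and the max_doublings guard are assumptions.

```python
import math


def simpson(fun, a, b, n):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = fun(a) + fun(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * fun(a + i * h)
    return total * h / 3.0


def iterative_simpson(fun, a, b, n=2, epsilon=1e-8, max_doublings=20):
    """Double the subdivisions until two estimates agree to `epsilon`."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    previous = simpson(fun, a, b, n)
    current = previous
    for _ in range(max_doublings):
        n *= 2
        current = simpson(fun, a, b, n)
        if abs(current - previous) <= epsilon * max(abs(current), 1e-300):
            break
        previous = current
    return current


area_sin = iterative_simpson(math.sin, 0.0, math.pi)
area_square = iterative_simpson(lambda x: x * x, 0.0, 3.0)
```

The integral of sin over [0, π] is 2, and Simpson's rule integrates the quadratic exactly, giving 9.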

bluemath_tk.distributions.pareto_poisson module

class bluemath_tk.distributions.pareto_poisson.GPDPoiss[source]

Bases: BaseDistribution

cdf(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0, poisson: float = 1.0) ndarray[source]

Cumulative distribution function

Parameters:
  • x (np.ndarray) – Values to compute their probability

  • **kwargs – Distribution specific parameters as keyword arguments. Common parameters include: - loc: Location parameter - scale: Scale parameter - shape: Shape parameter (for some distributions)

Returns:

p – Probability

Return type:

np.ndarray

property name: str
property nparams: int

Number of parameters of GPD-Poisson model

property param_names: list[str]

Name of parameters of GPD-Poisson

static pdf(x: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0, poisson: float = 1.0) ndarray[source]

Probability density function

Parameters:
  • x (np.ndarray) – Values to compute the probability density value

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

  • poisson (float, default = 1.0) – Poisson parameter. Must be greater than 0.

Returns:

pdf – Probability density function values

Return type:

np.ndarray

Raises:
  • ValueError – If scale is not greater than 0.

  • ValueError – If poisson is not greater than 0.

static qf(p: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0, poisson: float = 1.0) ndarray[source]

Quantile function (Inverse of Cumulative Distribution Function)

Parameters:
  • p (np.ndarray) – Probabilities to compute their quantile

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

  • poisson (float, default = 1.0) – Poisson parameter. Must be greater than 0.

Returns:

q – Quantile value

Return type:

np.ndarray

Raises:
  • ValueError – If probabilities are not in the range (0, 1).

  • ValueError – If scale is not greater than 0.
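
In the usual Pareto-Poisson formulation, the maximum over a period with Poisson rate λ of GPD exceedances has CDF exp(-λ (1 + ξ (x - u) / σ)^(-1/ξ)) for x at or above the threshold u. The sketch below assumes that formulation, which this page does not spell out, and inverts it for the quantile; the function names are hypothetical.

```python
import math


def gpd_poisson_cdf(x, threshold=0.0, scale=1.0, shape=0.0, poisson=1.0):
    """CDF of the period maximum under the assumed GPD-Poisson model."""
    if scale <= 0 or poisson <= 0:
        raise ValueError("scale and poisson must be greater than 0")
    z = (x - threshold) / scale
    if shape == 0.0:
        tail = math.exp(-z)
    else:
        arg = 1.0 + shape * z
        if arg <= 0.0:
            return 0.0 if shape > 0 else 1.0  # outside the support
        tail = arg ** (-1.0 / shape)
    return math.exp(-poisson * tail)


def gpd_poisson_qf(p, threshold=0.0, scale=1.0, shape=0.0, poisson=1.0):
    """Quantile of the period maximum for a scalar p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be in the range (0, 1)")
    if scale <= 0 or poisson <= 0:
        raise ValueError("scale and poisson must be greater than 0")
    y = -math.log(p) / poisson
    if shape == 0.0:
        return threshold - scale * math.log(y)
    return threshold + scale * (y ** (-shape) - 1.0) / shape


q = gpd_poisson_qf(0.9, threshold=2.0, scale=1.5, shape=0.2, poisson=3.0)
p_back = gpd_poisson_cdf(q, threshold=2.0, scale=1.5, shape=0.2, poisson=3.0)
```

With poisson = 1 the expression reduces to a GEV whose location equals the threshold, so the quantile at probability exp(-1) is the threshold itself.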

bluemath_tk.distributions.poisson module

bluemath_tk.distributions.pot module

class bluemath_tk.distributions.pot.OptimalThreshold(data, threshold: float = 0.0, n0: int = 10, min_peak_distance: int = 2, sig_level: float = 0.05, method: str = 'studentized', plot: bool = False, folder: str = None, display_flag: bool = False)[source]

Bases: BlueMathModel

Class to compute the optimal threshold using different algorithms.

studentized_residuals :

Function to compute the threshold using the studentized residuals

Notes

The methods implemented to select the optimal threshold are:

  • Studentized residuals method, Mínguez (2025) [1].

[1] Mínguez, R. (2025). Automatic Threshold Selection for Generalized Pareto and Pareto–Poisson Distributions in Rainfall Analysis: A Case Study Using the NOAA NCDC Daily Rainfall Database. Atmosphere, 16(1), 61. https://doi.org/10.3390/atmos16010061

fit()[source]

Obtain the optimal threshold and POTs given the selected method

Returns:

  • threshold (float) – Optimal threshold

  • pks (np.ndarray) – Peaks over threshold (POT)

  • pks_idx (np.ndarray) – Indices of the POT

potplot(time: ndarray = None, ax: Axes = None, figsize: tuple = (8, 5))[source]

Auxiliary function that calls the generic potplot to plot the POT using the optimal threshold obtained

Parameters:
  • time (np.ndarray, default=None) – Time of data

  • ax (plt.Axes, default=None) – Axes

  • figsize (tuple, default=(8,5)) – Figure size, by default (8, 5)

Returns:

  • fig (plt.Figure) – Figure

  • ax (plt.Axes) – Axes

studentized_residuals(pks_unicos_valid: ndarray, exceedances_mean_valid: ndarray, exceedances_weight_valid: ndarray, sig_level: float = 0.05, plot: bool = False, folder: str = None, display_flag: bool = False)[source]

Function to compute the optimal threshold based on Chi-squared and studentized residuals. Optionally plots the results if plot is True and displays messages if display_flag is True.

Parameters:
  • pks_unicos_valid (np.ndarray(n,)) – Vector of unique peaks (potential thresholds)

  • exceedances_mean_valid (np.ndarray(n,)) – Vector of exceedance means

  • exceedances_weight_valid (np.ndarray(n,)) – Vector of exceedance weights

  • sig_level (float, default=0.05) – Significance level for Chi-squared test

  • plot (bool, default=False) – Boolean flag, true to plot the graphs, false otherwise

  • folder (str, default=None) – Path and name for making graphs

  • display_flag (bool, default=False) – Boolean flag, true to display messages, false otherwise

Returns:

  • threshold – Optimal threshold found

  • beta (np.ndarray) – Optimal regression coefficients

  • fobj – Optimal objective function (weighted least squares)

  • r (np.ndarray) – Optimal residuals
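The regression ingredients named above (coefficients, a weighted least-squares objective and residuals) can be illustrated with a minimal sketch. This is not the library's implementation: the helper name weighted_line_fit is hypothetical, the fit is a plain straight line of mean exceedance against threshold, and the Chi-squared test that drives the threshold search is left out.

```python
import math


def weighted_line_fit(u, e, w):
    """Fit e ~ b0 + b1 * u by weighted least squares (hypothetical helper).

    Returns the coefficients, the weighted sum of squared residuals and the
    internally studentized residuals.
    """
    sw = sum(w)
    su = sum(wi * ui for wi, ui in zip(w, u))
    se = sum(wi * ei for wi, ei in zip(w, e))
    suu = sum(wi * ui * ui for wi, ui in zip(w, u))
    sue = sum(wi * ui * ei for wi, ui, ei in zip(w, u, e))
    det = sw * suu - su * su  # determinant of X^T W X
    b1 = (sw * sue - su * se) / det
    b0 = (se - b1 * su) / sw
    resid = [ei - (b0 + b1 * ui) for ui, ei in zip(u, e)]
    fobj = sum(wi * ri * ri for wi, ri in zip(w, resid))
    s2 = fobj / (len(u) - 2)  # residual variance, two fitted coefficients
    # Leverage h_ii = w_i * [1, u_i] (X^T W X)^-1 [1, u_i]^T
    lev = [wi * (suu - 2.0 * ui * su + ui * ui * sw) / det for wi, ui in zip(w, u)]
    r = [
        math.sqrt(wi) * ri / math.sqrt(s2 * (1.0 - hi))
        for wi, ri, hi in zip(w, resid, lev)
    ]
    return (b0, b1), fobj, r


beta, fobj, r = weighted_line_fit(
    [0.0, 1.0, 2.0, 3.0], [1.0, 2.1, 2.9, 4.0], [1.0] * 4
)
print(beta, fobj)
```

Large studentized residuals flag candidate thresholds where the linear relation breaks down, which is the intuition behind using them for threshold selection.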

bluemath_tk.distributions.pot.block_maxima(x: ndarray, block_size: int | float = 365.25, min_sep: int = 2) tuple[ndarray, ndarray][source]

Function to obtain the Block Maxima (BM) for a given block size, enforcing a minimum separation between maxima so that they can be treated as independent

Parameters:
  • x (np.ndarray) – Data used to compute the Block Maxima

  • block_size (int | float, default=365.25) – Size of BM in index units (if daily data, number of days), by default 365.25

  • min_sep (int, optional) – Minimum separation between maxima in index units, by default 2

Returns:

  • idx (np.ndarray) – Indices of BM

  • bmaxs (np.ndarray) – BM values

Raises:

ValueError – Minimum separation must be smaller than (block_size+1)/2
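As a rough illustration of the block maxima idea, the sketch below splits a series into consecutive blocks and keeps the largest value of each. It is a simplified stand-in, not the library routine: the name simple_block_maxima is hypothetical and the min_sep independence check is deliberately omitted.

```python
import math


def simple_block_maxima(x, block_size):
    """Return (indices, values) of the maximum in each consecutive block."""
    size = int(math.ceil(block_size))
    idx, vals = [], []
    for start in range(0, len(x), size):
        block = x[start:start + size]
        # Position of the largest value inside this block
        k = max(range(len(block)), key=block.__getitem__)
        idx.append(start + k)
        vals.append(block[k])
    return idx, vals


x = [1.0, 3.0, 2.0, 0.5, 4.0, 1.5, 2.5, 2.0, 0.1, 0.2]
idx, bmaxs = simple_block_maxima(x, block_size=5)
print(idx, bmaxs)
```

Two maxima from adjacent blocks can fall next to each other at a block boundary, which is the situation a minimum separation is meant to handle.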

Example

>>> import numpy as np
>>> from bluemath_tk.distributions.pot import block_maxima
>>> # 1 year of daily values
>>> x = np.random.lognormal(1, 1.2, size=365)
>>> # 5-day Block Maxima with 72 h of independence
>>> idx, bmaxs = block_maxima(
...     x,
...     block_size=5,
...     min_sep=3
... )
bluemath_tk.distributions.pot.mrlp(data: ndarray, threshold: float = None, conf_level: float = 0.95, ax: Axes = None, figsize: tuple = (8, 5)) Axes[source]

Plot mean residual life for given threshold value.

The mean residual life plot should be approximately linear above a threshold for which the Generalized Pareto Distribution model is valid. The strategy is to select the smallest threshold value immediately above which the plot is approximately linear.

Parameters:
  • data (np.ndarray) – Time series of the signal.

  • threshold (float, optional) – Thresholds for which the mean residual life plot is plotted. If None (default), thresholds start at the 90th quantile of the data

  • conf_level (float, default=0.95) – Confidence interval width in the range (0, 1), by default it is 0.95. If None, then confidence interval is not shown.

  • ax (matplotlib.axes._axes.Axes, optional) – If provided, then the plot is drawn on this axes. If None (default), new figure and axes are created

  • figsize (tuple, optional) – Figure size in inches in format (width, height). By default it is (8, 5).

Returns:

Axes object.

Return type:

matplotlib.axes._axes.Axes
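The quantity behind a mean residual life plot is the mean excess e(u) = mean(x - u | x > u). For a GPD with shape below 1, e(u) is linear in u, which is why linearity is used as a diagnostic. The sketch below computes e(u) for a few thresholds; the helper name mean_excess is hypothetical and no confidence bands are computed.

```python
def mean_excess(data, threshold):
    """Mean of the exceedances (x - threshold) over all x > threshold."""
    exc = [v - threshold for v in data if v > threshold]
    if not exc:
        raise ValueError("no exceedances above the threshold")
    return sum(exc) / len(exc)


data = [0.5, 1.0, 2.0, 3.0, 5.0, 8.0]
for u in (1.0, 2.0, 3.0):
    print(u, mean_excess(data, u))
```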

bluemath_tk.distributions.pot.pot(data: ndarray, threshold: float = 0.0, n0: int = 10, min_peak_distance: int = 2, sig_level: float = 0.05)[source]

Function to identify peaks over threshold (POT). It identifies peaks in a dataset that exceed a specified threshold and computes statistics such as mean exceedances, variances, and weights for valid unique peaks. Peaks are considered independent if they are separated by a minimum distance.

Parameters:
  • data (np.ndarray(n,)) – Input time series or data array

  • threshold (float, default=0) – Threshold above which peaks are extracted

  • n0 (int, default=10) – Minimum number of exceedances required for valid computation

  • min_peak_distance (int, default = 2) – Minimum distance between two peaks (in data points)

  • sig_level (float, default=0.05) – Significance level for Chi-squared test

Returns:

  • pks_unicos_valid (np.ndarray) – Valid unique peaks after removing NaN values

  • excedencias_mean_valid (np.ndarray) – Mean exceedances for valid peaks

  • excedencias_weight_valid (np.ndarray) – Weights based on exceedance variance for valid peaks

  • pks (np.ndarray) – All detected peaks

  • locs (np.ndarray) – Indices of the detected peaks in the data

  • autocorrelations (np.ndarray(n, 3)) – Lags, correlations and pvalues to check the independence assumption
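A minimal sketch of peak extraction with a minimum distance is given below. It is an assumption-laden illustration, not the algorithm used by pot: the name simple_pot is hypothetical, peaks are local maxima strictly above the threshold, and conflicts are resolved greedily in favour of the higher peak.

```python
def simple_pot(data, threshold, min_peak_distance=2):
    """Return (peak values, peak indices) for local maxima above threshold."""
    # Candidate peaks: exceedances that are local maxima
    cand = [
        i
        for i in range(len(data))
        if data[i] > threshold
        and (i == 0 or data[i] >= data[i - 1])
        and (i == len(data) - 1 or data[i] > data[i + 1])
    ]
    kept = []
    # Visit candidates from highest to lowest and drop those too close
    # to a peak that was already accepted
    for i in sorted(cand, key=lambda j: data[j], reverse=True):
        if all(abs(i - j) >= min_peak_distance for j in kept):
            kept.append(i)
    locs = sorted(kept)
    return [data[i] for i in locs], locs


pks, locs = simple_pot([0, 3, 1, 4, 1, 0, 5, 6, 2, 0], threshold=2, min_peak_distance=3)
print(pks, locs)
```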

bluemath_tk.distributions.pot.potplot(data: ndarray, threshold: float = 0.0, time: ndarray = None, n0: int = 10, min_peak_distance: int = 2, sig_level: float = 0.05, ax: Axes = None, figsize: tuple = (8, 5))[source]

Plot the POT for data given a threshold.

Parameters:
  • data (np.ndarray) – Data

  • threshold (float, default=0.0) – Threshold used to plot the peaks

  • time (np.ndarray, default=None) – Time of data

  • n0 (int, default=10) – Minimum number of data to compute the POT given a threshold

  • min_peak_distance (int, default=2) – Minimum peak distance between POT (in index size)

  • sig_level (float, default=0.05) – Significance level for Chi-squared test

  • ax (plt.Axes, default=None) – Axes

  • figsize (tuple, default=(8,5)) – Figure size

Returns:

  • fig (plt.Figure) – Figure

  • ax (plt.Axes) – Axes

Module contents

Project: BlueMath_tk
Sub-Module: distributions
Author: GeoOcean Research Group, Universidad de Cantabria
Repository: https://github.com/GeoOcean/BlueMath_tk.git
Status: Under development (Working)

class bluemath_tk.distributions.GEV[source]

Bases: BaseDistribution

Generalized Extreme Value (GEV) distribution class.

This class contains all the methods associated with the GEV distribution.

name[source]

The complete name of the distribution (GEV).

Type:

str

nparams[source]

Number of GEV parameters.

Type:

int

param_names[source]

Names of GEV parameters (location, scale, shape).

Type:

List[str]

pdf(x, loc, scale, shape)[source]

Probability density function.

cdf(x, loc, scale, shape)[source]

Cumulative distribution function

qf(p, loc, scale, shape)[source]

Quantile function

sf(x, loc, scale, shape)[source]

Survival function

nll(data, loc, scale, shape)[source]

Negative Log-Likelihood function

fit(data)[source]

Fit distribution to data.

random(size, loc, scale, shape)[source]

Generates random values from GEV distribution.

mean(loc, scale, shape)[source]

Mean of GEV distribution.

median(loc, scale, shape)[source]

Median of GEV distribution.

variance(loc, scale, shape)[source]

Variance of GEV distribution.

std(loc, scale, shape)[source]

Standard deviation of GEV distribution.

stats(loc, scale, shape)[source]

Summary statistics of GEV distribution.

Notes

  • This class is designed to obtain all the properties associated with the GEV distribution.

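Examples

>>> import numpy as np
>>> from bluemath_tk.distributions.gev import GEV
>>> x = np.linspace(-2.0, 5.0, 50)  # evaluation points
>>> p = np.array([0.5, 0.9, 0.99])  # probabilities in (0, 1)
>>> gev_pdf = GEV.pdf(x, loc=0, scale=1, shape=0.1)
>>> gev_cdf = GEV.cdf(x, loc=0, scale=1, shape=0.1)
>>> gev_qf = GEV.qf(p, loc=0, scale=1, shape=0.1)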
static cdf(x: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Cumulative distribution function

Parameters:
  • x (np.ndarray) – Values to compute their probability

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

p – Probability

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.
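For reference, the GEV cumulative distribution function can be written out directly. The sketch below is independent of the library and assumes the common sign convention in which a positive shape gives a heavy upper tail; whether GEV.cdf uses the same convention is not stated here. The function name gev_cdf is hypothetical and works on scalars.

```python
import math


def gev_cdf(x, loc=0.0, scale=1.0, shape=0.0):
    """GEV CDF for a scalar x: exp(-(1 + shape * z) ** (-1 / shape))."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    z = (x - loc) / scale
    if abs(shape) < 1e-12:
        # Gumbel limit as shape -> 0
        return math.exp(-math.exp(-z))
    t = 1.0 + shape * z
    if t <= 0:
        # Outside the support: below the lower bound (shape > 0)
        # or above the upper bound (shape < 0)
        return 0.0 if shape > 0 else 1.0
    return math.exp(-t ** (-1.0 / shape))


print(gev_cdf(0.0), gev_cdf(2.0, shape=0.1))
```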

static fit(data: ndarray, **kwargs) FitResult[source]

Fit GEV distribution

Parameters:
  • data (np.ndarray) – Data to fit the GEV distribution

  • **kwargs (dict, optional) – Additional keyword arguments for the fitting function. These can include options like method, bounds, etc. See fit_dist for more details. If not provided, default fitting options will be used.

Returns:

Result of the fit containing the parameters loc, scale, shape, success status, and negative log-likelihood value.

Return type:

FitResult

static mean(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Mean

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

mean – Mean value of GEV with the given parameters

Return type:

float

Raises:

ValueError – If scale is not greater than 0.

static median(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Median

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

median – Median value of GEV with the given parameters

Return type:

float

Raises:

ValueError – If scale is not greater than 0.

static name() str[source]
static nll(data: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Negative Log-Likelihood function

Parameters:
  • data (np.ndarray) – Data to compute the Negative Log-Likelihood value

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

nll – Negative Log-Likelihood value

Return type:

float
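The negative log-likelihood of an i.i.d. sample under the GEV can be sketched as follows. This is a plain-Python illustration under the same sign convention as above (positive shape, heavy upper tail), not the library code; the name gev_nll is hypothetical and returning infinity for infeasible parameters is a choice made for this sketch.

```python
import math


def gev_nll(data, loc=0.0, scale=1.0, shape=0.0):
    """Negative log-likelihood of data under a GEV(loc, scale, shape)."""
    if scale <= 0:
        return math.inf
    total = len(data) * math.log(scale)
    for x in data:
        z = (x - loc) / scale
        if abs(shape) < 1e-12:
            # Gumbel limit
            total += z + math.exp(-z)
        else:
            t = 1.0 + shape * z
            if t <= 0:
                return math.inf  # observation outside the support
            total += (1.0 + 1.0 / shape) * math.log(t) + t ** (-1.0 / shape)
    return total


print(gev_nll([0.5, 1.0, 2.0], loc=0.0, scale=1.0, shape=0.1))
```

Minimizing this function over (loc, scale, shape) gives the maximum likelihood estimates.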

static nparams() int[source]

Number of parameters of GEV

static param_names() List[str][source]

Name of parameters of GEV

static pdf(x: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Probability density function

Parameters:
  • x (np.ndarray) – Values to compute the probability density value

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

pdf – Probability density function values

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static qf(p: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Quantile function (Inverse of Cumulative Distribution Function)

Parameters:
  • p (np.ndarray) – Probabilities to compute their quantile

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

q – Quantile value

Return type:

np.ndarray

Raises:
  • ValueError – If probabilities are not in the range (0, 1).

  • ValueError – If scale is not greater than 0.
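The quantile function has a closed form, and return levels follow from it by evaluating at p = 1 - 1/T. The sketch below is independent of the library, handles scalars only and assumes the convention where a positive shape gives a heavy upper tail; gev_qf and return_level are hypothetical names.

```python
import math


def gev_qf(p, loc=0.0, scale=1.0, shape=0.0):
    """GEV quantile for a scalar probability p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probabilities must be in the range (0, 1)")
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    y = -math.log(p)
    if abs(shape) < 1e-12:
        # Gumbel limit
        return loc - scale * math.log(y)
    return loc + scale * (y ** (-shape) - 1.0) / shape


def return_level(period, loc=0.0, scale=1.0, shape=0.0):
    """Value exceeded on average once every `period` blocks."""
    return gev_qf(1.0 - 1.0 / period, loc, scale, shape)


print(gev_qf(0.5), return_level(100, shape=0.1))
```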

static random(size: int, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0, random_state: int = None) ndarray[source]

Generates random values from GEV distribution

Parameters:
  • size (int) – Number of random values to generate

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

  • random_state (int, optional) – Random state for reproducibility. If None, no random state is set.

Returns:

x – Random values from GEV distribution

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.
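Random GEV values can be generated by inverse transform sampling: draw uniform probabilities and pass them through the quantile function. The sketch below uses only the standard library and is not the library's generator; gev_random is a hypothetical name and the seed handling via random.Random is an assumption of this example.

```python
import math
import random


def gev_random(size, loc=0.0, scale=1.0, shape=0.0, random_state=None):
    """Draw `size` GEV values by inverse transform sampling."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    rng = random.Random(random_state)
    out = []
    for _ in range(size):
        u = rng.random()
        while u == 0.0:  # avoid log(0)
            u = rng.random()
        y = -math.log(u)
        if abs(shape) < 1e-12:
            out.append(loc - scale * math.log(y))
        else:
            out.append(loc + scale * (y ** (-shape) - 1.0) / shape)
    return out


sample = gev_random(5, loc=0.0, scale=1.0, shape=0.1, random_state=42)
print(sample)
```

Passing the same random_state twice reproduces the same sample.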

static sf(x: ndarray, loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Survival function (1-Cumulative Distribution Function)

Parameters:
  • x (np.ndarray) – Values to compute their survival function value

  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

sp – Survival function value

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static stats(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) Dict[str, float][source]

Summary statistics

Return summary statistics including mean, std, variance, etc.

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

stats – Summary statistics of GEV distribution with the given parameters

Return type:

dict

Raises:

ValueError – If scale is not greater than 0.

static std(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Standard deviation

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

std – Standard Deviation of GEV with the given parameters

Return type:

float

Raises:

ValueError – If scale is not greater than 0.

static variance(loc: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Variance

Parameters:
  • loc (float, default=0.0) – Location parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

var – Variance of GEV with the given parameters

Return type:

float

Raises:

ValueError – If scale is not greater than 0.
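The closed-form GEV moments can be sketched with the gamma function. The formulas below assume the convention where a positive shape gives a heavy upper tail; under that convention the mean is finite for shape below 1 and the variance for shape below 1/2. Returning infinity beyond those limits is a choice of this sketch, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gev_mean(loc=0.0, scale=1.0, shape=0.0):
    if abs(shape) < 1e-12:
        return loc + scale * EULER_GAMMA  # Gumbel case
    if shape >= 1.0:
        return math.inf
    return loc + scale * (math.gamma(1.0 - shape) - 1.0) / shape


def gev_median(loc=0.0, scale=1.0, shape=0.0):
    if abs(shape) < 1e-12:
        return loc - scale * math.log(math.log(2.0))
    return loc + scale * (math.log(2.0) ** (-shape) - 1.0) / shape


def gev_variance(loc=0.0, scale=1.0, shape=0.0):
    if abs(shape) < 1e-12:
        return scale**2 * math.pi**2 / 6.0  # Gumbel case
    if shape >= 0.5:
        return math.inf
    g1 = math.gamma(1.0 - shape)
    g2 = math.gamma(1.0 - 2.0 * shape)
    return scale**2 * (g2 - g1**2) / shape**2


print(gev_mean(shape=0.1), gev_median(shape=0.1), gev_variance(shape=0.1))
```

The standard deviation is the square root of the variance.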

class bluemath_tk.distributions.GPD[source]

Bases: BaseDistribution

Generalized Pareto Distribution (GPD) class.

This class contains all the methods associated with the GPD distribution.

name[source]

The complete name of the distribution (GPD).

Type:

str

nparams[source]

Number of GPD parameters.

Type:

int

param_names[source]

Names of the GPD parameters (threshold, scale, shape).

Type:

List[str]

pdf(x, threshold, scale, shape)[source]

Probability density function.

cdf(x, threshold, scale, shape)[source]

Cumulative distribution function

qf(p, threshold, scale, shape)[source]

Quantile function

sf(x, threshold, scale, shape)[source]

Survival function

nll(data, threshold, scale, shape)[source]

Negative Log-Likelihood function

fit(data)[source]

Fit distribution to data.

random(size, threshold, scale, shape)[source]

Generates random values from GPD distribution.

mean(threshold, scale, shape)[source]

Mean of GPD distribution.

median(threshold, scale, shape)[source]

Median of GPD distribution.

variance(threshold, scale, shape)[source]

Variance of GPD distribution.

std(threshold, scale, shape)[source]

Standard deviation of GPD distribution.

stats(threshold, scale, shape)[source]

Summary statistics of GPD distribution.

Notes

  • This class is designed to obtain all the properties associated with the GPD distribution.

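Examples

>>> import numpy as np
>>> from bluemath_tk.distributions.gpd import GPD
>>> x = np.linspace(0.0, 5.0, 50)  # evaluation points above the threshold
>>> p = np.array([0.5, 0.9, 0.99])  # probabilities in (0, 1)
>>> gpd_pdf = GPD.pdf(x, threshold=0, scale=1, shape=0.1)
>>> gpd_cdf = GPD.cdf(x, threshold=0, scale=1, shape=0.1)
>>> gpd_qf = GPD.qf(p, threshold=0, scale=1, shape=0.1)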
static cdf(x: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Cumulative distribution function

Parameters:
  • x (np.ndarray) – Values to compute their probability

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

p – Probability

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.
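The GPD cumulative distribution function can be written out in a few lines. The sketch below is independent of the library, works on scalars and assumes the usual convention where a positive shape gives a heavy tail and a negative shape a bounded one; gpd_cdf is a hypothetical name.

```python
import math


def gpd_cdf(x, threshold=0.0, scale=1.0, shape=0.0):
    """GPD CDF for a scalar x: 1 - (1 + shape * z) ** (-1 / shape)."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    z = (x - threshold) / scale
    if z <= 0:
        return 0.0  # at or below the threshold
    if abs(shape) < 1e-12:
        # Exponential limit as shape -> 0
        return 1.0 - math.exp(-z)
    t = 1.0 + shape * z
    if t <= 0:
        return 1.0  # above the upper end point (shape < 0)
    return 1.0 - t ** (-1.0 / shape)


print(gpd_cdf(1.0), gpd_cdf(1.0, shape=-0.5))
```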

static fit(data: ndarray, threshold: float, **kwargs) FitResult[source]

Fit GPD distribution

Parameters:
  • data (np.ndarray) – Data to fit the GPD distribution

  • threshold (float) – Threshold parameter

  • **kwargs (dict, optional) – Additional keyword arguments for the fitting function. These can include options like method, bounds, etc. See fit_dist for more details. If not provided, default fitting options will be used.

Returns:

Result of the fit containing the parameters loc, scale, shape, success status, and negative log-likelihood value.

Return type:

FitResult

static mean(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Mean

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

mean – Mean value of GPD with the given parameters

Return type:

float

Raises:
  • ValueError – If scale is not greater than 0.

  • Warning – If shape is greater than or equal to 1, mean is not defined. In this case, it returns infinity.

static median(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Median

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

median – Median value of GPD with the given parameters

Return type:

float

Raises:

ValueError – If scale is not greater than 0.

static name() str[source]
static nll(data: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Negative Log-Likelihood function

Parameters:
  • data (np.ndarray) – Data to compute the Negative Log-Likelihood value

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

nll – Negative Log-Likelihood value

Return type:

float

static nparams() int[source]

Number of parameters of GPD

static param_names() List[str][source]

Name of parameters of GPD

static pdf(x: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Probability density function

Parameters:
  • x (np.ndarray) – Values to compute the probability density value

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

pdf – Probability density function values

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static qf(p: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Quantile function (Inverse of Cumulative Distribution Function)

Parameters:
  • p (np.ndarray) – Probabilities to compute their quantile

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

q – Quantile value

Return type:

np.ndarray

Raises:
  • ValueError – If probabilities are not in the range (0, 1).

  • ValueError – If scale is not greater than 0.
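The GPD quantile function inverts the CDF in closed form. The sketch below is a scalar, library-independent illustration under the usual sign convention for the shape; gpd_qf is a hypothetical name.

```python
import math


def gpd_qf(p, threshold=0.0, scale=1.0, shape=0.0):
    """GPD quantile for a scalar probability p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probabilities must be in the range (0, 1)")
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    if abs(shape) < 1e-12:
        # Exponential limit
        return threshold - scale * math.log(1.0 - p)
    return threshold + scale * ((1.0 - p) ** (-shape) - 1.0) / shape


print(gpd_qf(0.5), gpd_qf(0.75, shape=-0.5))
```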

static random(size: int, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0, random_state: int = None) ndarray[source]

Generates random values from GPD distribution

Parameters:
  • size (int) – Number of random values to generate

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

  • random_state (int, optional) – Random state for reproducibility. If None, no random state is set.

Returns:

x – Random values from GPD distribution

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static sf(x: ndarray, threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) ndarray[source]

Survival function (1-Cumulative Distribution Function)

Parameters:
  • x (np.ndarray) – Values to compute their survival function value

  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

sp – Survival function value

Return type:

np.ndarray

Raises:

ValueError – If scale is not greater than 0.

static stats(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) Dict[str, float][source]

Summary statistics

Return summary statistics including mean, std, variance, etc.

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

stats – Summary statistics of GPD distribution with the given parameters

Return type:

dict

Raises:

ValueError – If scale is not greater than 0.

static std(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Standard deviation

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

std – Standard Deviation of GPD with the given parameters

Return type:

float

Raises:

ValueError – If scale is not greater than 0.

static variance(threshold: float = 0.0, scale: float = 1.0, shape: float = 0.0) float[source]

Variance

Parameters:
  • threshold (float, default=0.0) – Threshold parameter

  • scale (float, default = 1.0) – Scale parameter. Must be greater than 0.

  • shape (float, default = 0.0) – Shape parameter.

Returns:

var – Variance of GPD with the given parameters

Return type:

float

Raises:
  • ValueError – If scale is not greater than 0.

  • Warning – If shape is greater than or equal to 1/2, variance is not defined. In this case, it returns infinity.
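The GPD mean and variance have simple closed forms: the mean is threshold + scale / (1 - shape) for shape below 1, and the variance is scale² / ((1 - shape)² (1 - 2·shape)) for shape below 1/2. The sketch below encodes these with hypothetical function names and returns infinity outside those ranges, mirroring the behaviour described above.

```python
import math


def gpd_mean(threshold=0.0, scale=1.0, shape=0.0):
    """Mean of the GPD; infinite when shape >= 1."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    if shape >= 1.0:
        return math.inf
    return threshold + scale / (1.0 - shape)


def gpd_variance(threshold=0.0, scale=1.0, shape=0.0):
    """Variance of the GPD; infinite when shape >= 1/2."""
    if scale <= 0:
        raise ValueError("scale must be greater than 0")
    if shape >= 0.5:
        return math.inf
    return scale**2 / ((1.0 - shape) ** 2 * (1.0 - 2.0 * shape))


print(gpd_mean(shape=0.5), gpd_variance(shape=0.25))
```

The variance does not depend on the threshold; the argument is kept only so both helpers share the same signature.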

class bluemath_tk.distributions.NonStatGEV(xt: ndarray, t: ndarray, covariates: ndarray | DataFrame | None = None, harms: bool = True, trends: bool = False, kt: ndarray | None = None, quanval: float = 0.95, var_name: str = 'x')[source]

Bases: BlueMathModel

Non-Stationary Generalized Extreme Value Distribution

Class to implement the Non-Stationary GEV including trends and/or covariates in the location, scale and shape parameters. This methodology selects the covariates and trends that minimize the Akaike Information Criterion (AIC).

This class is based on the work of R. Mínguez et al. 2010. “Pseudooptimal parameter selection of non-stationary generalized extreme value models for environmental variables”. Environ. Model. Softw. 25, 1592-1607.

Parameters:
  • xt (np.ndarray) – Data to fit Non Stationary GEV.

  • t (np.ndarray) – Time associated with the data.

  • covariates (np.ndarray | pd.DataFrame, default=None) – Covariates to include for location, scale and shape parameters.

  • kt (np.ndarray, default=None) – Frequency of block maxima, if None, it is assumed to be 1.

  • trends (bool, default=False) – Whether trends should be included, if so, t must be passed.

  • quanval (float, default=0.95) – Confidence interval value

  • var_name (str, default="x") – Name of the variable to be used in the model. Used for plotting purposes.

fit:

Fit the Non-Stationary GEV with desired Trends and Covariates.

auto_adjust:

Automatically selects the best covariates and trends based on AIC.

Examples

>>> from bluemath_tk.distributions.nonstat_gev import NonStatGEV
>>> nonstat_gev = NonStatGEV(x, t, covariates, trends=True)
>>> fit_result = nonstat_gev.auto_adjust()
>>> fit_result = nonstat_gev.fit(nmu=2,npsi=2,ngamma=2,ntrend_loc=1,list_loc="all",ntrend_sc=1,list_sc="all",ntrend_sh=1,list_sh="all")
PPplot(save=False)[source]

PP plot

Parameters:

save (bool, default=False) – If True, save the plot in “Figures/”

QQplot(save: bool = False)[source]

QQ plot

Parameters:

save (bool, default=False) – If True, save the plot in “Figures/”

auto_adjust(max_iter: int = 1000, plot: bool = False, stationary_shape: bool = False) dict[source]

This method automatically selects and estimates the parameters that minimize the AIC of the Non-Stationary GEV distribution. It uses the Maximum Likelihood method within an iterative scheme, including one parameter at a time based on a perturbation criterion. The process is repeated until no further improvement in the objective function is achieved.

Parameters:
  • max_iter (int, default=1000) – Maximum number of iterations in the optimization process.

  • plot (bool, default=False) – Whether to plot the fitted distribution

  • stationary_shape (bool, default=False) – If True, the shape parameter remains stationary

Returns:

fit_result – Dictionary with the optimal parameters and values related to the Non-Stationary GEV distribution. The keys of the dictionary are:

  • x: Optimal solution

  • beta0, beta, betaT, beta_cov: Location parameters (intercept, harmonic, trend, covariates)

  • alpha0, alpha, alphaT, alpha_cov: Scale parameters (intercept, harmonic, trend, covariates)

  • gamma0, gamma, gammaT, gamma_cov: Shape parameters (intercept, harmonic, trend, covariates)

  • negloglikelihood: Negative log-likelihood value at the optimal solution

  • loglikelihood: Log-likelihood value at the optimal solution

  • grad: Gradient of the log-likelihood function at the optimal solution

  • hessian: Hessian matrix of the log-likelihood function at the optimal solution

  • AIC: Akaike Information Criterion value at the optimal solution

  • invI0: Inverse of the Fisher information matrix at the optimal solution

  • std_param: Standard deviation of parameters at the optimal solution

Return type:

dict
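The selection idea described above (add one term at a time and keep it only if the AIC improves) can be sketched as a greedy forward search. This is a simplified illustration, not the library's scheme: the perturbation criterion is not reproduced, fit_nll is a hypothetical callable that returns the negative log-likelihood of a model with the given terms, and each term is assumed to add one parameter.

```python
def aic(nll, n_params):
    """Akaike Information Criterion: 2 * k + 2 * negative log-likelihood."""
    return 2.0 * n_params + 2.0 * nll


def forward_select(candidates, fit_nll, base_params=3):
    """Greedily add the term that lowers the AIC most, until none does."""
    selected = []
    best = aic(fit_nll(selected), base_params)
    improved = True
    while improved:
        improved = False
        best_term = None
        for term in candidates:
            if term in selected:
                continue
            trial = selected + [term]
            score = aic(fit_nll(trial), base_params + len(trial))
            if score < best:
                best, best_term, improved = score, term, True
        if improved:
            selected.append(best_term)
    return selected, best


# Toy stand-in for a model fit: each term lowers the NLL by a fixed amount
gains = {"harmonic": 5.0, "trend": 0.5, "cov": 2.0}


def toy_nll(terms):
    return 100.0 - sum(gains[t] for t in terms)


print(forward_select(list(gains), toy_nll))
```

In the toy run the trend term is rejected because its likelihood gain does not offset the penalty of one extra parameter.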

fit(nmu: int = 0, npsi: int = 0, ngamma: int = 0, ntrend_loc: int = 1, list_loc: list | str | None = 'all', ntrend_sc: int = 1, list_sc: list | str | None = 'all', ntrend_sh: int = 1, list_sh: list | str | None = 'all', options: dict = None, plot: bool = False) dict[source]

Function to determine the optimal parameters of Non-Stationary GEV for given covariates, trends and harmonics.

By default the method fits a Non-Stationary GEV including trends in all the parameters, all possible covariates and no harmonics

Parameters:
  • nmu (int) – Number of harmonics to be included in the location parameter

  • npsi (int) – Number of harmonics to be included in the scale parameter

  • ngamma (int) – Number of harmonics to be included in the shape parameter

  • ntrend_loc (int, default=1) – If trends in location are included.

  • list_loc (list or str, default="all") – List of indices of covariates to be included in the location parameter. If None, no covariates are included in the location parameter.

  • ntrend_sc (int, default=1) – If trends in scale are included.

  • list_sc (list or str, default="all") – List of indices of covariates to be included in the scale parameter. If None, no covariates are included in the scale parameter.

  • ntrend_sh (int, default=1) – If trends in shape are included.

  • list_sh (list or str, default="all") – List of indices of covariates to be included in the shape parameter. If None, no covariates are included in the shape parameter.

  • plot (bool, default=False) – If True, plot the diagnostic plots

Returns:

fit_result – Dictionary with the optimal parameters and other information about the fit. The keys of the dictionary are:

  • beta0, beta, betaT, beta_cov: Location parameters (intercept, harmonic, trend, covariates)

  • alpha0, alpha, alphaT, alpha_cov: Scale parameters (intercept, harmonic, trend, covariates)

  • gamma0, gamma, gammaT, gamma_cov: Shape parameters (intercept, harmonic, trend, covariates)

  • negloglikelihood: Negative log-likelihood value at the optimal solution

  • hessian: Hessian matrix of the log-likelihood function at the optimal solution

  • invI0: Inverse of Fisher information matrix

Return type:

dict

static parametro(beta0: float | None = None, beta: ndarray | None = None, betaT: float | None = None, beta_cov: ndarray | None = None, covariates: ndarray | None = None, indicesint: ndarray | None = None, times: ndarray | None = None, x: ndarray | None = None, ntend: int | None = None) ndarray[source]

This function computes the time-dependent location, scale or shape parameter from the given coefficients, following expressions (2)-(3) in the paper

Parameters:
  • beta0 (float, optional) – Value of the intercept

  • beta (np.ndarray, optional) – Value of the harmonics terms

  • betaT (float, optional) – Trend parameter

  • beta_cov (np.ndarray, optional) – Covariate parameters

  • covariates (np.ndarray, optional) – Covariate matrix, where each column corresponds to a covariate and each row to a time point

  • indicesint (np.ndarray, optional) – Covariate mean values in the integral interval

  • times (np.ndarray, optional) – Times when covariates are known, used to find the nearest value

  • t (np.ndarray, optional) – Specific time point to evaluate the parameters at, if None, uses the times given

Returns:

y – Values of the parameter

Return type:

np.ndarray
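A common way to build such a time-dependent parameter is an intercept plus harmonic terms, a linear trend and a linear combination of covariates. The sketch below assumes that form, with time in years and an annual fundamental period; it is an illustration only, parameter_at is a hypothetical name, and the exact expressions used by the class should be taken from the referenced paper.

```python
import math


def parameter_at(t, beta0=0.0, beta=(), betaT=0.0, beta_cov=(), cov_t=()):
    """Evaluate intercept + harmonics + trend + covariates at time t (years)."""
    y = beta0
    # beta holds (cosine, sine) coefficient pairs, one pair per harmonic
    for i in range(len(beta) // 2):
        w = 2.0 * math.pi * (i + 1) * t
        y += beta[2 * i] * math.cos(w) + beta[2 * i + 1] * math.sin(w)
    y += betaT * t  # linear trend
    y += sum(b * c for b, c in zip(beta_cov, cov_t))  # covariate effects
    return y


print(parameter_at(0.25, beta0=1.0, beta=(0.5, 0.2), betaT=0.1))
```

The scale is often modelled through its logarithm so that it stays positive; whether this class does so is not stated here.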

paramplot(save: bool = False)[source]

Create a heatmap of parameter sensitivities for location, scale and shape.

Parameters:

save (bool, default=False) – Whether to save the plot

plot(return_plot: bool = False, save: bool = False, init_year: int = 0)[source]

Plot the location, scale and shape parameters, also the PP plot and QQ plot

Return period plot is plotted if and only if no covariates and trends are included

Parameters:
  • return_plot (bool, default=False) – If True, return period plot is plotted

  • save (bool, default=False) – If True, save all the figures in “Figures/”

  • init_year (int, default=0) – Initial year for plotting purposes

returnperiod_plot(annualplot: bool = True, conf_int: bool = False, monthly_plot: bool = False, save: bool = False)[source]

Function to plot the aggregated return period plot for each month and the annual return period

Parameters:
  • annualplot (bool, default=True) – Whether to plot the annual return period plot

  • conf_int (bool, default=False) – Whether to plot the confidence bands for annual return periods. This is computationally expensive.

  • monthly_plot (bool, default=False) – Whether to plot the return periods grouped by months

  • save (bool, default=False) – Whether to save the plot

summary()[source]

Print a summary of the fitted model, including parameter estimates, standard errors and fit statistics.