bluemath_tk.downloaders package
Subpackages
- bluemath_tk.downloaders.copernicus package
- Submodules
- bluemath_tk.downloaders.copernicus.copernicus_downloader module
CopernicusDownloader
CopernicusDownloader.product
CopernicusDownloader.product_config
CopernicusDownloader.client
CopernicusDownloader.download_data()
CopernicusDownloader.download_data_era5()
CopernicusDownloader.list_datasets()
CopernicusDownloader.list_variables()
CopernicusDownloader.num_workers
CopernicusDownloader.products_configs
CopernicusDownloader.show_markdown_table()
- bluemath_tk.downloaders.copernicus.copernicus_marine_downloader module
- Module contents
- bluemath_tk.downloaders.ecmwf package
- bluemath_tk.downloaders.noaa package
- Submodules
- bluemath_tk.downloaders.noaa.noaa_downloader module
NOAADownloader
NOAADownloader.config
NOAADownloader.base_path_to_download
NOAADownloader.debug
NOAADownloader.data_types
NOAADownloader.datasets
NOAADownloader.download_data()
NOAADownloader.list_data_types()
NOAADownloader.list_datasets()
NOAADownloader.num_workers
NOAADownloader.read_bulk_parameters()
NOAADownloader.read_directional_spectra()
NOAADownloader.read_wave_spectra()
NOAADownloader.show_markdown_table()
- Module contents
Module contents
Project: BlueMath_tk
Sub-Module: downloaders
Author: GeoOcean Research Group, Universidad de Cantabria
Repository: https://github.com/GeoOcean/BlueMath_tk.git
Status: Under development (Working)
- class bluemath_tk.downloaders.CopernicusDownloader(product: str, base_path_to_download: str, token: str = None, debug: bool = True, check: bool = True)[source]
Bases:
BaseDownloader
This is the main class to download data from the Copernicus Climate Data Store.
- product
The product to download data from. Currently only ERA5 is supported.
- Type:
str
- product_config
The configuration for the product to download data from.
- Type:
dict
- client
The client to interact with the Copernicus Climate Data Store API.
- Type:
cdsapi.Client
Examples
from bluemath_tk.downloaders.copernicus.copernicus_downloader import CopernicusDownloader

copernicus_downloader = CopernicusDownloader(
    product="ERA5",
    base_path_to_download="/path/to/Copernicus/",  # Will be created if not available
    token=None,
    check=True,
)
result = copernicus_downloader.download_data_era5(
    variables=["swh"],
    years=["2020"],
    months=["01", "03"],
)
print(result)

Fully downloaded files:
Not fully downloaded files:
/path/to/Copernicus/ERA5/reanalysis-era5-single-levels/ocean/reanalysis/significant_height_of_combined_wind_waves_and_swell/swh_2020_01_03.nc
Error files:
- property client: Client
- download_data(*args, **kwargs) str [source]
Downloads the data for the product.
- Parameters:
*args – The arguments to pass to the download function.
**kwargs – The keyword arguments to pass to the download function.
- Returns:
A message listing the fully downloaded files and the files that were not fully downloaded.
- Return type:
str
- Raises:
ValueError – If the product is not supported.
- download_data_era5(variables: List[str], years: List[str], months: List[str], days: List[str] = None, times: List[str] = None, area: List[float] = None, product_type: str = 'reanalysis', data_format: str = 'netcdf', download_format: str = 'unarchived', force: bool = False) str [source]
Downloads the data for the ERA5 product.
- Parameters:
variables (List[str]) – The variables to download. If not provided, all variables in self.product_config will be downloaded.
years (List[str]) – The years to download. Years are downloaded one by one.
months (List[str]) – The months to download. Months are downloaded together.
days (List[str], optional) – The days to download. If None, all days in the month will be downloaded. Default is None.
times (List[str], optional) – The times to download. If None, all times in the day will be downloaded. Default is None.
area (List[float], optional) – The area to download. If None, the whole globe will be downloaded. Default is None.
product_type (str, optional) – The product type to download. Default is “reanalysis”.
data_format (str, optional) – The data format to download. Default is “netcdf”.
download_format (str, optional) – The download format to use. Default is “unarchived”.
force (bool, optional) – Whether to force the download. Default is False.
- Returns:
A message listing the fully downloaded files, the files that were not fully downloaded, and the files that raised errors.
- Return type:
str
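The parameters above correspond to the request fields listed for reanalysis-era5-single-levels in products_configs (mandatory: product_type, variable, year, month, data_format, download_format; optional: day, time, area). The following is a minimal sketch of how a single-year request dictionary could be assembled from them. build_era5_request is a hypothetical helper written for illustration, not part of bluemath_tk, and it makes no claim about how the library builds its requests internally. In particular, the sketch simply leaves optional fields out when they are None; how the downloader expands None into "all days" or "all times" is not shown here.

```python
from typing import Dict, List, Optional

# Field names copied from the 'reanalysis-era5-single-levels' entry of products_configs.
MANDATORY_FIELDS = [
    "product_type", "variable", "year", "month", "data_format", "download_format",
]


def build_era5_request(
    cds_variable: str,
    year: str,
    months: List[str],
    days: Optional[List[str]] = None,
    times: Optional[List[str]] = None,
    area: Optional[List[float]] = None,
    product_type: str = "reanalysis",
    data_format: str = "netcdf",
    download_format: str = "unarchived",
) -> Dict[str, object]:
    """Hypothetical helper: assemble a request dictionary for one year."""
    request: Dict[str, object] = {
        "product_type": [product_type],
        "variable": [cds_variable],
        "year": [year],          # one year per request
        "month": months,         # all requested months go in the same request
        "data_format": data_format,
        "download_format": download_format,
    }
    # Optional fields are only added when a value was supplied (assumption of this sketch).
    for key, value in (("day", days), ("time", times), ("area", area)):
        if value is not None:
            request[key] = value
    # Guard against a mandatory field being dropped by a later edit of this helper.
    missing = [field for field in MANDATORY_FIELDS if field not in request]
    if missing:
        raise ValueError(f"Missing mandatory fields: {missing}")
    return request


request = build_era5_request(
    cds_variable="significant_height_of_combined_wind_waves_and_swell",
    year="2020",
    months=["01", "03"],
)
print(request)
```

The variable is passed by its cds_name (here the one listed for swh), since that is the name stored under 'variable' in the dataset template.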
- list_datasets() List[str] [source]
Lists the datasets available for the product.
- Returns:
The list of datasets available for the product.
- Return type:
List[str]
- list_variables(type: str = None) List[str] [source]
Lists the variables available for the product, filtered by type if provided.
- Parameters:
type (str, optional) – The type of variables to list. Default is None.
- Returns:
The list of variables available for the product.
- Return type:
List[str]
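As an illustration of the filtering described above, here is a small sketch that works on an excerpt of the 'variables' table from products_configs. The function name list_variables_sketch and the argument name var_type are assumptions made for this example (the documented method uses type); the actual implementation may differ.

```python
from typing import Dict, List, Optional

# Excerpt of products_configs["ERA5"]["variables"]; only three entries are reproduced.
VARIABLES: Dict[str, Dict[str, str]] = {
    "swh": {
        "cds_name": "significant_height_of_combined_wind_waves_and_swell",
        "type": "ocean",
        "units": "m",
    },
    "mwp": {"cds_name": "mean_wave_period", "type": "ocean", "units": "s"},
    "pp1d": {"cds_name": "peak_wave_period", "type": "ocean", "units": "s"},
}


def list_variables_sketch(
    variables: Dict[str, Dict[str, str]], var_type: Optional[str] = None
) -> List[str]:
    """Return the short names, keeping only those of var_type when it is given."""
    return [
        name
        for name, info in variables.items()
        if var_type is None or info["type"] == var_type
    ]


print(list_variables_sketch(VARIABLES))                # every variable in the excerpt
print(list_variables_sketch(VARIABLES, "ocean"))       # all three are ocean variables
print(list_variables_sketch(VARIABLES, "atmosphere"))  # none in this excerpt
```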
- property product: str
- property product_config: dict
- products_configs = {'ERA5': {'datasets': {'reanalysis-era5-complete': {'description': 'ERA5 Complete Dataset', 'mandatory_fields': ['class', 'date', 'direction', 'domain', 'expver', 'frequency', 'param', 'stream', 'time', 'type', 'area', 'format', 'grid'], 'optional_fields': [], 'template': {'class': 'ea', 'direction': '1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24', 'domain': 'g', 'expver': '1', 'format': 'netcdf', 'frequency': '1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24/25/26/27/28/29/30', 'grid': '0.5/0.5', 'param': '251.140', 'stream': 'wave', 'time': '00:00:00/01:00:00/02:00:00/03:00:00/04:00:00/05:00:00/06:00:00/07:00:00/08:00:00/09:00:00/10:00:00/11:00:00/12:00:00/13:00:00/14:00:00/15:00:00/16:00:00/17:00:00/18:00:00/19:00:00/20:00:00/21:00:00/22:00:00/23:00:00', 'type': 'an'}, 'types': ['ocean'], 'url': 'https://cds.climate.copernicus.eu/datasets/reanalysis-era5-complete?tab=overview'}, 'reanalysis-era5-pressure-levels': {'description': 'ERA5 Pressure Levels', 'mandatory_fields': ['product_type', 'variable', 'year', 'month', 'pressure_level', 'data_format', 'download_format'], 'optional_fields': ['day', 'time', 'area'], 'template': {'data_format': 'netcdf', 'day': ['01'], 'download_format': 'unarchived', 'month': ['01'], 'pressure_level': ['500'], 'product_type': ['reanalysis'], 'time': ['00:00'], 'variable': ['geopotential'], 'year': ['2019']}, 'types': ['pressure'], 'url': 'https://cds.climate.copernicus.eu/datasets/reanalysis-era5-pressure-levels?tab=overview'}, 'reanalysis-era5-single-levels': {'description': 'ERA5 Single Levels', 'mandatory_fields': ['product_type', 'variable', 'year', 'month', 'data_format', 'download_format'], 'optional_fields': ['day', 'time', 'area'], 'template': {'data_format': 'netcdf', 'day': ['01'], 'download_format': 'unarchived', 'month': ['01'], 'product_type': ['reanalysis'], 'time': ['00:00'], 'variable': ['significant_wave_height_of_first_swell_partition'], 'year': ['2019']}, 
'types': ['atmosphere', 'ocean'], 'url': 'https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=form'}}, 'variables': {'dwww': {'cds_name': 'wave_spectral_directional_width_for_wind_waves', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Wave spectral directional width for wind waves', 'nc_name': 'dwww', 'type': 'ocean', 'units': 'degrees'}, 'mdww': {'cds_name': 'mean_direction_of_wind_waves', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean direction of wind waves', 'nc_name': 'mdww', 'type': 'ocean', 'units': 'degrees'}, 'mpww': {'cds_name': 'mean_period_of_wind_waves', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean period of wind waves', 'nc_name': 'mpww', 'type': 'ocean', 'units': 's'}, 'mwd': {'cds_name': 'mean_wave_direction', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean wave direction', 'nc_name': 'mwd', 'type': 'ocean', 'units': 'degrees'}, 'mwp': {'cds_name': 'mean_wave_period', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean wave period', 'nc_name': 'mwp', 'type': 'ocean', 'units': 's'}, 'p140121': {'cds_name': 'significant_wave_height_of_first_swell_partition', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Significant wave height of first swell partition', 'nc_name': 'p140121', 'type': 'ocean', 'units': 'm'}, 'p140122': {'cds_name': 'mean_wave_direction_of_first_swell_partition', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean wave direction of first swell partition', 'nc_name': 'p140122', 'type': 'ocean', 'units': 'degrees'}, 'p140123': {'cds_name': 'mean_wave_period_of_first_swell_partition', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean wave period of first swell partition', 'nc_name': 'p140123', 'type': 'ocean', 'units': 's'}, 'p140124': {'cds_name': 'significant_wave_height_of_second_swell_partition', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Significant wave height of second swell 
partition', 'nc_name': 'p140124', 'type': 'ocean', 'units': 'm'}, 'p140125': {'cds_name': 'mean_wave_direction_of_second_swell_partition', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean wave direction of second swell partition', 'nc_name': 'p140125', 'type': 'ocean', 'units': 'degrees'}, 'p140126': {'cds_name': 'mean_wave_period_of_second_swell_partition', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean wave period of second swell partition', 'nc_name': 'p140126', 'type': 'ocean', 'units': 's'}, 'p140127': {'cds_name': 'significant_wave_height_of_third_swell_partition', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Significant wave height of third swell partition', 'nc_name': 'p140127', 'type': 'ocean', 'units': 'm'}, 'p140128': {'cds_name': 'mean_wave_direction_of_third_swell_partition', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean wave direction of third swell partition', 'nc_name': 'p140128', 'type': 'ocean', 'units': 'degrees'}, 'p140129': {'cds_name': 'mean_wave_period_of_third_swell_partition', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Mean wave period of third swell partition', 'nc_name': 'p140129', 'type': 'ocean', 'units': 's'}, 'pp1d': {'cds_name': 'peak_wave_period', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Peak wave period', 'nc_name': 'pp1d', 'type': 'ocean', 'units': 's'}, 'shww': {'cds_name': 'significant_height_of_wind_waves', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Significant height of wind waves', 'nc_name': 'shww', 'type': 'ocean', 'units': 'm'}, 'spectra': {'cds_name': 'full_wave_spectra', 'dataset': 'reanalysis-era5-complete', 'long_name': 'Full wave spectra', 'nc_name': 'spectra', 'type': 'ocean', 'units': 'm^2/Hz·radian'}, 'swh': {'cds_name': 'significant_height_of_combined_wind_waves_and_swell', 'dataset': 'reanalysis-era5-single-levels', 'long_name': 'Significant height of combined wind waves and swell', 'nc_name': 'swh', 'type': 
'ocean', 'units': 'm'}}}}
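Each variable entry in products_configs records both its cds_name and the dataset it belongs to, so short names such as swh or spectra can be resolved before a request is made. The sketch below groups a list of short names by dataset using an excerpt of the table above. group_by_dataset is a hypothetical helper for illustration only, and whether the library performs this grouping in the same way is an assumption.

```python
from collections import defaultdict
from typing import Dict, List

# Excerpt of products_configs["ERA5"]["variables"] (cds_name and dataset only).
VARIABLES: Dict[str, Dict[str, str]] = {
    "swh": {
        "cds_name": "significant_height_of_combined_wind_waves_and_swell",
        "dataset": "reanalysis-era5-single-levels",
    },
    "pp1d": {
        "cds_name": "peak_wave_period",
        "dataset": "reanalysis-era5-single-levels",
    },
    "spectra": {
        "cds_name": "full_wave_spectra",
        "dataset": "reanalysis-era5-complete",
    },
}


def group_by_dataset(short_names: List[str]) -> Dict[str, List[str]]:
    """Map each dataset to the CDS names of the requested variables it provides."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name in short_names:
        if name not in VARIABLES:
            raise KeyError(f"Unknown variable: {name!r}")
        info = VARIABLES[name]
        grouped[info["dataset"]].append(info["cds_name"])
    return dict(grouped)


grouped = group_by_dataset(["swh", "spectra", "pp1d"])
for dataset, cds_names in grouped.items():
    print(dataset, cds_names)
```

In this excerpt only spectra maps to reanalysis-era5-complete, which uses a different set of mandatory fields from the single-levels dataset.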
- class bluemath_tk.downloaders.NOAADownloader(base_path_to_download: str, debug: bool = True, check: bool = False)[source]
Bases:
BaseDownloader
This is the main class to download and read data from NOAA.
- config
The configuration for NOAA data sources loaded from JSON file.
- Type:
dict
- base_path_to_download
Base path where the data is stored.
- Type:
Path
- debug
Whether to run in debug mode.
- Type:
bool
Examples
from bluemath_tk.downloaders.noaa.noaa_downloader import NOAADownloader

noaa_downloader = NOAADownloader(
    base_path_to_download="/path/to/NOAA/",  # Will be created if not available
    debug=True,
    check=True,
)

# Download buoy bulk parameters and load DataFrame
result = noaa_downloader.download_data(
    data_type="bulk_parameters",
    buoy_id="41001",
    years=[2020, 2021, 2022],
    load_df=True,
)
print(result)
None
- config = {'data_types': {'bulk_parameters': {'columns': ['YYYY', 'MM', 'DD', 'hh', 'mm', 'WD', 'WSPD', 'GST', 'WVHT', 'DPD', 'APD', 'MWD', 'BAR', 'ATMP', 'WTMP', 'DEWP', 'VIS', 'TIDE'], 'dataset': 'buoy_data', 'description': 'Wind, wave, temperature, and pressure measurements', 'fallback_urls': ['view_text_file.php?filename={buoy_id}h{year}.txt.gz&dir=data/historical/stdmet/', 'stdmet/{year}/{buoy_id}h{year}.txt.gz'], 'file_format': 'txt.gz', 'long_name': 'Standard Meteorological Data', 'name': 'bulk_parameters', 'url_pattern': 'historical/stdmet/{buoy_id}h{year}.txt.gz'}, 'directional_spectra': {'coefficients': {'d': {'name': 'alpha1', 'url_pattern': 'historical/swdir/{buoy_id}d{year}.txt.gz'}, 'i': {'name': 'alpha2', 'url_pattern': 'historical/swdir2/{buoy_id}i{year}.txt.gz'}, 'j': {'name': 'r1', 'url_pattern': 'historical/swr1/{buoy_id}j{year}.txt.gz'}, 'k': {'name': 'r2', 'url_pattern': 'historical/swr2/{buoy_id}k{year}.txt.gz'}, 'w': {'name': 'c11', 'url_pattern': 'historical/swden/{buoy_id}w{year}.txt.gz'}}, 'dataset': 'buoy_data', 'description': 'Fourier coefficients for directional wave spectra', 'file_format': 'txt.gz', 'long_name': 'Directional Wave Spectra Coefficients', 'name': 'directional_spectra'}, 'wave_spectra': {'dataset': 'buoy_data', 'description': 'Wave energy density spectra', 'file_format': 'txt.gz', 'long_name': 'Wave Spectral Density', 'name': 'wave_spectra', 'url_pattern': 'historical/swden/{buoy_id}w{year}.txt.gz'}, 'wind_forecast': {'dataset': 'forecast_data', 'description': 'Wind speed and direction forecast from GFS model', 'file_format': 'netcdf', 'long_name': 'GFS Wind Forecast', 'name': 'wind_forecast', 'output_variables': {'u10': 'ugrd10m', 'v10': 'vgrd10m'}, 'variables': ['ugrd10m', 'vgrd10m']}}, 'datasets': {'buoy_data': {'base_url': 'https://www.ndbc.noaa.gov/data', 'description': 'Historical buoy measurements from NDBC', 'mandatory_fields': ['buoy_id', 'year'], 'name': 'NOAA Buoy Data', 'template': {'buoy_id': None, 
'data_type': 'bulk_parameters', 'year': None}}, 'forecast_data': {'base_url': 'https://nomads.ncep.noaa.gov/dods/gfs_0p25_1hr', 'description': 'GFS 0.25 degree forecast data', 'mandatory_fields': ['date'], 'name': 'NOAA GFS Forecast Data', 'template': {'date': None}}}}
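The config above pairs a base_url for each dataset with a url_pattern for each data type, where {buoy_id} and {year} are placeholders. A minimal sketch of how a download URL could be formed from these two pieces follows. Joining them with a single "/" is an assumption of this example, build_url is a hypothetical helper, and the resulting URLs have not been checked against the NDBC server (the config also lists fallback_urls for bulk parameters, which are not used here).

```python
from typing import Dict

# Values copied from config["datasets"]["buoy_data"] and config["data_types"].
BASE_URL = "https://www.ndbc.noaa.gov/data"
BULK_PATTERN = "historical/stdmet/{buoy_id}h{year}.txt.gz"
COEFFICIENT_PATTERNS: Dict[str, str] = {
    "alpha1": "historical/swdir/{buoy_id}d{year}.txt.gz",
    "alpha2": "historical/swdir2/{buoy_id}i{year}.txt.gz",
    "r1": "historical/swr1/{buoy_id}j{year}.txt.gz",
    "r2": "historical/swr2/{buoy_id}k{year}.txt.gz",
    "c11": "historical/swden/{buoy_id}w{year}.txt.gz",
}


def build_url(pattern: str, buoy_id: str, year: int) -> str:
    """Fill the placeholders of a url_pattern and prepend the dataset base URL."""
    return f"{BASE_URL}/{pattern.format(buoy_id=buoy_id, year=year)}"


bulk_url = build_url(BULK_PATTERN, "41001", 2020)
coefficient_urls = {
    name: build_url(pattern, "41001", 2020)
    for name, pattern in COEFFICIENT_PATTERNS.items()
}

print(bulk_url)
for name, url in coefficient_urls.items():
    print(name, url)
```

A directional-spectra request involves five files per buoy and year, one per Fourier coefficient, which is why the coefficient patterns are kept in their own mapping.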
- property data_types: dict
- property datasets: dict
- download_data(data_type: str, load_df: bool = False, **kwargs) DataFrame | Dataset | str [source]
Downloads the data for the specified data type.
- Parameters:
data_type (str) – The data type to download. One of ‘bulk_parameters’, ‘wave_spectra’, ‘directional_spectra’, or ‘wind_forecast’.
load_df (bool, optional) – Whether to load and return the DataFrame after downloading. Default is False. If True and multiple years are specified, all years will be combined into a single DataFrame.
**kwargs – Additional keyword arguments specific to each data type.
- Returns:
Downloaded data or status message.
- Return type:
Union[pd.DataFrame, xr.Dataset, str]
- Raises:
ValueError – If the data type is not supported.
- list_data_types() List[str] [source]
Lists the available data types.
- Returns:
The list of available data types.
- Return type:
List[str]
- list_datasets() List[str] [source]
Lists the available datasets.
- Returns:
The list of available datasets.
- Return type:
List[str]
- read_bulk_parameters(buoy_id: str, years: int | List[int]) DataFrame | None [source]
Read bulk parameters for a specific buoy and year(s).
- Parameters:
buoy_id (str) – The buoy ID.
years (Union[int, List[int]]) – The year(s) to read data for. Can be a single year or a list of years.
- Returns:
DataFrame containing the bulk parameters, or None if data not found.
- Return type:
Optional[pd.DataFrame]
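To make the file layout concrete, the sketch below reads a gzip-compressed, whitespace-delimited text file using the 'columns' list from the bulk_parameters config. It uses only the standard library and returns a list of dictionaries rather than a DataFrame, so it is not how read_bulk_parameters itself is implemented. parse_stdmet is a hypothetical helper, the sample row is made up for illustration, and skipping any line whose fields are not all numeric is an assumption about how header lines should be handled.

```python
import gzip
import io
from typing import Dict, List

# Column names copied from config["data_types"]["bulk_parameters"]["columns"].
COLUMNS = [
    "YYYY", "MM", "DD", "hh", "mm", "WD", "WSPD", "GST", "WVHT", "DPD",
    "APD", "MWD", "BAR", "ATMP", "WTMP", "DEWP", "VIS", "TIDE",
]


def parse_stdmet(raw_gz: bytes) -> List[Dict[str, float]]:
    """Parse gzip-compressed, whitespace-delimited text into one dict per record."""
    rows: List[Dict[str, float]] = []
    with gzip.open(io.BytesIO(raw_gz), mode="rt") as handle:
        for line in handle:
            values = line.split()
            if len(values) != len(COLUMNS):
                continue  # blank or malformed line
            try:
                numbers = [float(value) for value in values]
            except ValueError:
                continue  # header or units line (non-numeric fields)
            rows.append(dict(zip(COLUMNS, numbers)))
    return rows


# Synthetic content: a header line followed by one invented record.
sample_text = (
    " ".join(COLUMNS) + "\n"
    "2020 01 01 00 00 250 7.5 9.1 1.80 8.33 6.10 240 1015.2 18.4 21.0 12.3 99.0 99.00\n"
)
rows = parse_stdmet(gzip.compress(sample_text.encode("utf-8")))
print(len(rows), rows[0]["WVHT"], rows[0]["BAR"])
```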
- read_directional_spectra(buoy_id: str, years: int | List[int]) Tuple[DataFrame | None, ...] [source]
Read directional spectra data for a specific buoy and year(s).
- Parameters:
buoy_id (str) – The buoy ID.
years (Union[int, List[int]]) – The year(s) to read data for. Can be a single year or a list of years.
- Returns:
Tuple containing DataFrames for alpha1, alpha2, r1, r2, and c11, with None in place of any coefficient whose data was not found.
- Return type:
Tuple[Optional[pd.DataFrame], …]
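Because any element of the returned tuple may be None, callers typically need to check which coefficients are present before using them. The sketch below labels the elements by name, assuming the tuple order matches the order given in the Returns description (alpha1, alpha2, r1, r2, c11). label_coefficients is a hypothetical helper, and plain strings stand in for the DataFrames so the example runs without the library.

```python
from typing import Dict, List, Optional, Tuple

# Order assumed from the Returns description above.
COEFFICIENT_NAMES = ("alpha1", "alpha2", "r1", "r2", "c11")


def label_coefficients(result: Tuple[Optional[object], ...]) -> Dict[str, object]:
    """Return a name -> data mapping containing only the coefficients that were found."""
    if len(result) != len(COEFFICIENT_NAMES):
        raise ValueError(
            f"Expected {len(COEFFICIENT_NAMES)} elements, got {len(result)}"
        )
    return {
        name: value
        for name, value in zip(COEFFICIENT_NAMES, result)
        if value is not None
    }


# Stand-in for the tuple returned by read_directional_spectra; alpha2 is missing.
fake_result = ("alpha1-frame", None, "r1-frame", "r2-frame", "c11-frame")
available = label_coefficients(fake_result)
missing: List[str] = [name for name in COEFFICIENT_NAMES if name not in available]

print("available:", list(available))
print("missing:", missing)
```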
- read_wave_spectra(buoy_id: str, years: int | List[int]) DataFrame | None [source]
Read wave spectra data for a specific buoy and year(s).
- Parameters:
buoy_id (str) – The buoy ID.
years (Union[int, List[int]]) – The year(s) to read data for. Can be a single year or a list of years.
- Returns:
DataFrame containing the wave spectra, or None if data not found.
- Return type:
Optional[pd.DataFrame]