Processing (s2stools.process)
Data processing.
Functions
- s2stools.process.add_model_cycle_ecmwf(ds)
Add a coordinate cycle to a dataset that denotes the ECMWF model cycle.
- Parameters:
ds (xr.Dataset) – ECMWF S2S forecast data
- Returns:
ds – dataset with new coordinate
- Return type:
xr.Dataset
- s2stools.process.add_validtime(da)
Given a DataArray/Dataset with dimensions (‘reftime’, ‘hc_year’, ‘leadtime’), add a coordinate validtime that indicates the target date of the forecast. Example: reftime=“2000-01-01”, hc_year=-1, leadtime=+3D corresponds to validtime “1999-01-04”.
- Parameters:
da (xr.DataArray or xr.Dataset) – Input data, requires dimensions (‘reftime’, ‘hc_year’, ‘leadtime’).
- Returns:
Same dataset as input, but with coordinate validtime.
- Return type:
xr.DataArray or xr.Dataset
Notes
Validtime is of type np.datetime64 and it will not be a dimension.
Warning
Only dimension leadtime is supported, not days_since_init.
Warning
Only makes sense for ECMWF data.
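The date arithmetic behind validtime can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper names shift_year and validtime are hypothetical, and both the order of operations (shift the year first, then add the leadtime) and the handling of 29 February are assumptions.

```python
from datetime import date, timedelta


def shift_year(day: date, years: int) -> date:
    """Shift a date by whole years (hypothetical helper).

    Assumption: 29 February falls back to 28 February when the
    target year is not a leap year.
    """
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        # Only reached for 29 February in a non-leap target year.
        return day.replace(year=day.year + years, day=28)


def validtime(reftime: date, hc_year: int, leadtime_days: int) -> date:
    """Target date: year-shifted initialization date plus the leadtime."""
    return shift_year(reftime, hc_year) + timedelta(days=leadtime_days)


# First hindcast year at leadtime 0, and the realtime forecast at day 46.
print(validtime(date(2017, 11, 16), hc_year=-20, leadtime_days=0))  # 1997-11-16
print(validtime(date(2017, 11, 16), hc_year=0, leadtime_days=46))   # 2018-01-01
```

Because every combination of reftime, hc_year and leadtime maps to one date, the resulting coordinate depends on all three dimensions rather than being a dimension itself.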
- s2stools.process.combine_s2s_and_reanalysis(s2s, reanalysis, ensfc=True)
Project reanalysis time series on S2S forecast data. Resulting object will have dimensions of s2s dataset.
- Parameters:
s2s (xr.Dataset | xr.DataArray)
reanalysis (xr.Dataset | xr.DataArray)
ensfc (bool) – If True, stack resulting forecasts to ensemble forecasts ([reftime, hc_year] -> fc)
- Returns:
combined_data
- Return type:
xr.Dataset
Examples
>>> ds_s2s
<xarray.Dataset>
Dimensions:    (leadtime: 47, longitude: 2, latitude: 1, number: 51, reftime: 2, hc_year: 21)
Coordinates:
  * leadtime   (leadtime) timedelta64[ns] 0 days 1 days ... 45 days 46 days
  * longitude  (longitude) float32 -180.0 -177.5
  * latitude   (latitude) float32 60.0
  * number     (number) int64 0 1 2 3 4 5 6 7 8 9 ... 42 43 44 45 46 47 48 49 50
  * reftime    (reftime) datetime64[ns] 2017-11-16 2017-11-20
  * hc_year    (hc_year) int64 -20 -19 -18 -17 -16 -15 -14 ... -5 -4 -3 -2 -1 0
    validtime  (reftime, leadtime, hc_year) datetime64[ns] 1997-11-16 ... 201...
Data variables:
    u          (reftime, latitude, longitude, leadtime, hc_year, number) float32 dask.array<chunksize=(1, 1, 2, 47, 20, 1), meta=np.ndarray>
>>> ds_reanalysis
<xarray.Dataset>
Dimensions:    (time: 30, latitude: 1, longitude: 2)
Coordinates:
  * time       (time) datetime64[ns] 2017-11-01 2017-11-02 ... 2017-11-30
  * longitude  (longitude) float32 -180.0 -177.5
  * latitude   (latitude) float32 60.0
Data variables:
    u          (time, latitude, longitude) float32 dask.array<chunksize=(30, 1, 2), meta=np.ndarray>
>>> import s2stools.process
>>> s2stools.process.combine_s2s_and_reanalysis(ds_s2s, ds_reanalysis)
<xarray.Dataset>
Dimensions:    (leadtime: 47, longitude: 2, latitude: 1, number: 51, reftime: 2, hc_year: 21)
Coordinates:
  * leadtime   (leadtime) timedelta64[ns] 0 days 1 days ... 45 days 46 days
  * longitude  (longitude) float32 -180.0 -177.5
  * latitude   (latitude) float32 60.0
  * number     (number) int64 0 1 2 3 4 5 6 7 8 9 ... 42 43 44 45 46 47 48 49 50
  * reftime    (reftime) datetime64[ns] 2017-11-16 2017-11-20
  * hc_year    (hc_year) int64 -20 -19 -18 -17 -16 -15 -14 ... -5 -4 -3 -2 -1 0
    validtime  (reftime, leadtime, hc_year) datetime64[ns] 1997-11-16 ... 201...
Data variables:
    u          (reftime, latitude, longitude, leadtime, hc_year, number) float32 dask.array<chunksize=(1, 1, 2, 47, 20, 1), meta=np.ndarray>
    u_verif    (reftime, leadtime, hc_year, latitude, longitude) float32 dask.array<chunksize=(2, 47, 21, 1, 2), meta=np.ndarray>
- s2stools.process.concat_era5_before_s2s(s2s: DataArray, era5: DataArray, max_neg_leadtime_days: int = 46) → DataArray
Concatenate ERA5 data before the start of the forecasts. The ERA5 days are indicated by negative leadtimes.
- Parameters:
s2s (xr.DataArray)
era5 (xr.DataArray) – requires dimension time
max_neg_leadtime_days (int) – maximum negative leadtime (i.e. number of ERA5 days to append)
- Returns:
da – dataset with s2s and era5 combined
- Return type:
xr.DataArray
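To make the negative-leadtime convention concrete, the following plain-Python sketch maps each negative leadtime to the calendar day it would refer to for a single forecast start. The function era5_dates_by_leadtime is hypothetical and not part of s2stools. It assumes that the ERA5 days run from -max_neg_leadtime_days up to -1, which the documentation above does not state explicitly.

```python
from datetime import date, timedelta


def era5_dates_by_leadtime(init: date, max_neg_leadtime_days: int) -> dict[int, date]:
    """Map each negative leadtime (in days) to the date it refers to.

    Hypothetical helper; assumes leadtimes -max_neg_leadtime_days ... -1.
    """
    return {
        -n: init - timedelta(days=n)
        for n in range(max_neg_leadtime_days, 0, -1)
    }


# Three ERA5 days ahead of a forecast initialized on 2017-11-16.
lookup = era5_dates_by_leadtime(date(2017, 11, 16), max_neg_leadtime_days=3)
for leadtime, day in lookup.items():
    print(f"{leadtime:+d} days -> {day}")
```

With the default of 46 days, the combined leadtime axis would be about twice as long as that of a 47-step forecast.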
- s2stools.process.reft_hc_year_to_fc_init_date(s2s_data)
Go from dimensions (reftime, hc_year) to dimension fc_init_date.
- Parameters:
s2s_data (xr.DataArray | xr.Dataset)
- Returns:
data
- Return type:
xr.DataArray | xr.Dataset
- s2stools.process.s2sparser(ds)
Will create dimensions reftime, hc_year, leadtime. Coordinate validtime is automatically added. Files need to have the forecast realtime date somewhere in the filename, e.g., s2s_something_2017-11-16.nc.
- Parameters:
ds (xr.Dataset) – dataset
- Return type:
xr.Dataset
Warning
Realtime and hindcast forecasts are combined in a single dataset. If they have different ensemble sizes, then the resulting dataset is larger than necessary as coordinates span full dimension space, e.g., ensemble members 12-51 are padded with NaN. For a more efficient solution consider using xarray-datatree.
Examples
>>> # Use in the following form:
>>> import xarray as xr
>>> import s2stools.process
>>> ds = xr.open_mfdataset("/some/path/filename_2017*.nc", preprocess=s2stools.process.s2sparser)
>>> ds
<xarray.Dataset>
Dimensions:    (leadtime: 47, longitude: 2, latitude: 1, number: 51, reftime: 2, hc_year: 21)
Coordinates:
  * leadtime   (leadtime) timedelta64[ns] 0 days 1 days ... 45 days 46 days
  * longitude  (longitude) float32 -180.0 -177.5
  * latitude   (latitude) float32 60.0
  * number     (number) int64 0 1 2 3 4 5 6 7 8 9 ... 42 43 44 45 46 47 48 49 50
  * reftime    (reftime) datetime64[ns] 2017-11-16 2017-11-20
  * hc_year    (hc_year) int64 -20 -19 -18 -17 -16 -15 -14 ... -5 -4 -3 -2 -1 0
    validtime  (reftime, leadtime, hc_year) datetime64[ns] 1997-11-16 ... 201...
Data variables:
    u          (reftime, latitude, longitude, leadtime, hc_year, number) float32 dask.array<chunksize=(1, 1, 2, 47, 20, 1), meta=np.ndarray>
- s2stools.process.save_one_file_per_reftime(data: Dataset, path: str, create_subdirectory=None)
Save S2S Dataset with one file per reftime.
- Parameters:
data (xr.Dataset) – Dataset to save.
path (str) – Target path including filename. _REFTIME.nc will be added. E.g.: /home/foo/s2s_somefilename
create_subdirectory (str or None) – Name of a subdirectory. If the subdirectory already exists, an error is raised; otherwise it is created and the files are saved into it. Defaults to None, in which case no subdirectory is created and the files are saved directly to ‘path’.
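The naming scheme can be illustrated without touching the file system. The sketch below only builds the target file names for a given path. The helper target_files is hypothetical, and writing the reftime as an ISO date (YYYY-MM-DD) is an assumption.

```python
from datetime import date
from pathlib import Path


def target_files(path: str, reftimes: list[date]) -> list[Path]:
    """Build one target file name per reftime (hypothetical helper).

    Assumption: the _REFTIME suffix is the ISO-formatted date.
    """
    base = Path(path)
    return [base.with_name(f"{base.name}_{rt.isoformat()}.nc") for rt in reftimes]


for target in target_files("/home/foo/s2s_somefilename",
                           [date(2017, 11, 16), date(2017, 11, 20)]):
    print(target.name)
```

Names built this way still contain the realtime date, which is what s2sparser requires when the files are read back.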
- s2stools.process.sel_fc_around_dates(s2s, dates, tolerance_days)
Select forecasts around specific dates.
- Parameters:
s2s (xr.Dataset) – Dataset with dimensions (‘reftime’, ‘hc_year’, ‘leadtime’)
dates (list) – List of dates to select around.
tolerance_days (int) – Tolerance in days to select around the date. E.g., if tolerance_days=3, then include forecasts from 3 days before and after the date.
- Return type:
xr.Dataset
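The tolerance rule can be sketched in plain Python as a filter on forecast start dates. The helper is_near is hypothetical. It assumes that the forecast initialization date is the date being compared and that the tolerance bounds are inclusive.

```python
from datetime import date, timedelta


def is_near(init: date, dates: list[date], tolerance_days: int) -> bool:
    """True if init lies within tolerance_days of any target date (inclusive)."""
    tolerance = timedelta(days=tolerance_days)
    return any(abs(init - d) <= tolerance for d in dates)


inits = [date(2017, 11, 13), date(2017, 11, 16), date(2017, 11, 20), date(2017, 11, 23)]
events = [date(2017, 11, 18)]

# Keep only the forecasts started at most 3 days before or after an event.
selected = [d for d in inits if is_near(d, events, tolerance_days=3)]
print(selected)
```

Here only the forecasts started on 2017-11-16 and 2017-11-20 are kept, since both lie two days from the event, while the other two lie five days away.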
- s2stools.process.stack_ensfc(d, reset_index=True)
Go from dimensions (reftime, hc_year) to dimension fc.
- Parameters:
d (xr.DataArray | xr.Dataset)
reset_index (bool) – If True, drop multiindex and flatten around new index fc.
- Returns:
data
- Return type:
xr.DataArray | xr.Dataset
- s2stools.process.stack_fc(d, reset_index=True)
Go from dimensions (reftime, hc_year, number) to dimension fc.
- Parameters:
d (xr.DataArray | xr.Dataset)
reset_index (bool) – If True, drop multiindex and flatten around new index fc.
- Returns:
data
- Return type:
xr.DataArray | xr.Dataset
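What stacking means for the index can be shown with a small plain-Python sketch: each entry along fc corresponds to one combination of the stacked dimensions. This illustrates the idea only. The ordering of the combinations (last dimension varying fastest) is an assumption and may differ from what stack_ensfc and stack_fc produce.

```python
from itertools import product

reftimes = ["2017-11-16", "2017-11-20"]
hc_years = [-1, 0]
numbers = [0, 1, 2]

# stack_ensfc-like: one entry per (reftime, hc_year) pair.
ensfc_index = list(product(reftimes, hc_years))

# stack_fc-like: one entry per (reftime, hc_year, number) triple.
fc_index = list(product(reftimes, hc_years, numbers))

print(len(ensfc_index))  # 4
print(len(fc_index))     # 12
print(fc_index[4])       # ('2017-11-16', 0, 1)
```

The length of fc is the product of the stacked dimension sizes. For the example dataset shown earlier (2 reftimes, 21 hindcast years, 51 members), stacking all three would give 2 × 21 × 51 = 2142 entries.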