7  Meteorology Inputs

We download, subset, and process forecast meteorological drivers for each NEON site. Currently, output from NOAA’s Global Ensemble Forecasting System (GEFS) V12 is available for each NEON site at both the native time resolution and a 1 hr time resolution.

The following are important considerations when using the NOAA GEFS forecasts

The meteorological variables included in each product are listed in the stage descriptions below.

The weather forecasts are available through an S3 bucket (see NOAA Global Ensemble Forecasting System below), and the neon4cast package provides R functions for downloading all the ensemble members for a particular NEON site:

remotes::install_github("eco4cast/neon4cast")

7.1 NOAA Global Ensemble Forecasting System

7.1.1 Stage 1

At each site, 31 ensemble member forecasts are provided at 3 hr intervals for the first 10 days, and 6 hr intervals for up to 35 days (840 hr horizon).
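As a rough illustration of the time grid described above, the following sketch builds the sequence of forecast horizons. It assumes the 3 hr steps cover hours 0 through 240 and the 6 hr steps continue from hour 246 through hour 840; the exact handling of the boundary at day 10 is an assumption, not something stated here.

```r
# Sketch of the Stage 1 horizon grid (hours ahead of the forecast start).
# Assumption: 3 hr steps cover hours 0-240 and 6 hr steps cover 246-840.
horizon_3hr <- seq(0, 240, by = 3)
horizon_6hr <- seq(246, 840, by = 6)
horizons <- c(horizon_3hr, horizon_6hr)

n_steps <- length(horizons)   # time steps per ensemble member
n_members <- 31               # ensemble members per site
n_steps * n_members           # values per variable, site, and forecast
```

Under these assumptions, a single variable at a single site has a few thousand values per forecast, which is why filtering before collect() matters once many sites, variables, and forecast dates are involved.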

Forecasts include the following variables:

  • TMP: temperature (C)
  • RH: Relative humidity (%)
  • PRES: Atmospheric pressure (Pa)
  • UGRD: U-component of wind speed (m/s)
  • VGRD: V-component of wind speed (m/s)
  • APCP: Total precipitation in interval (kg/m^2)
  • DSWRF: Downward shortwave radiation flux in interval (W/m^2)
  • DLWRF: Downward longwave radiation flux in interval (W/m^2)

Each variable is reported at 2 m above ground, at the surface, or at 10 m above ground, as indicated in the height column. See https://www.nco.ncep.noaa.gov/pmb/products/gens/ for more details on GEFS variables and intervals.

The full dataset is large, so filter it before running collect(). Columns commonly used for filtering include reference_datetime, site_id, variable, and horizon.

weather <- neon4cast::noaa_stage1()
# 5.7M rows of data:
weather |>
  dplyr::filter(start_date == "2022-04-01") |>
  dplyr::collect()

Stage 1 has the following columns:

site_id: string : NEON site ID
prediction: double : forecasted value
variable: string : weather variable
horizon: double : number of hours in the future
family: string : class of uncertainty (ensemble)
ensemble: int32 : ensemble member number
reference_datetime: timestamp[us, tz=UTC] : datetime of horizon 0
forecast_valid: string : period of time (in hours) that the predicted value applies to
datetime: timestamp[us, tz=UTC] : datetime of forecast
cycle: string : hour of day that forecast was started
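Because each row holds one ensemble member's prediction, a common first step after collect() is to summarize across members. The sketch below uses a small made-up data frame as a stand-in for collected Stage 1 output (only a subset of the columns above is included, and the values are hypothetical) and computes the ensemble mean and standard deviation at each horizon with base R.

```r
# Hypothetical stand-in for a collected Stage 1 table: one site, one
# variable, two horizons, three ensemble members (values are made up).
stage1 <- data.frame(
  site_id    = "BART",
  variable   = "TMP",
  horizon    = rep(c(0, 3), each = 3),
  ensemble   = rep(1:3, times = 2),
  prediction = c(4.0, 5.0, 6.0, 6.5, 7.0, 9.0)
)

# Ensemble mean and standard deviation at each horizon.
ens_mean <- aggregate(prediction ~ site_id + variable + horizon,
                      data = stage1, FUN = mean)
ens_sd <- aggregate(prediction ~ site_id + variable + horizon,
                    data = stage1, FUN = sd)

# Combine both summaries into one table.
ens_summary <- merge(ens_mean, ens_sd,
                     by = c("site_id", "variable", "horizon"),
                     suffixes = c("_mean", "_sd"))
ens_summary
```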

7.1.2 Stage 2

Stage 2 is a processed version of Stage 1. It applies the following transformations, which may be useful for some modeling approaches:

  • Fluxes are standardized to per second rates
  • Fluxes and states are interpolated to 1 hour intervals
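The two transformations above can be sketched in base R. This is only an illustration with made-up numbers: it assumes a precipitation total accumulated over a 3 hr interval and uses simple linear interpolation for a state variable, which may differ from the method actually used to produce Stage 2.

```r
# Assumption: APCP is a total over a 3 hr interval (kg/m^2); dividing by the
# interval length in seconds gives a mean rate in kg/(m^2 s).
apcp_3hr <- 5.4                     # made-up 3 hr precipitation total
interval_seconds <- 3 * 60 * 60
precipitation_flux <- apcp_3hr / interval_seconds

# Assumption: a state such as temperature is linearly interpolated from the
# 3 hr forecast times onto a 1 hr grid (made-up values).
horizon_3hr <- c(0, 3, 6)
tmp_3hr <- c(10, 13, 10)
hourly <- approx(x = horizon_3hr, y = tmp_3hr, xout = 0:6)
hourly$y
```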

Variables are renamed to match CF conventions:

  • TMP -> air_temperature (K)
  • PRES -> air_pressure (Pa)
  • RH -> relative_humidity (proportion)
  • DLWRF -> surface_downwelling_longwave_flux_in_air (W/m^2)
  • DSWRF -> surface_downwelling_shortwave_flux_in_air (W/m^2) 
  • APCP -> precipitation_flux (kg/(m^2 s))
  • UGRD -> eastward_wind (m/s)
  • VGRD -> northward_wind (m/s)

weather_1hr <- neon4cast::noaa_stage2()
weather_1hr |>
  dplyr::filter(start_date == "2022-04-01" & site_id == "BART") |>
  dplyr::collect()

Stage 2 has the following columns:

site_id: string : NEON site ID
prediction: double : forecasted value
variable: string : weather variable
horizon: double : number of hours in the future
family: string : class of uncertainty (ensemble)
parameter: int32 : ensemble member number
reference_datetime: timestamp[us, tz=UTC] : datetime of horizon 0
datetime: timestamp[us, tz=UTC] : datetime of forecast
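Since Stage 2 is stored in long format (one row per variable), models that need a derived quantity first have to line up the relevant variables. As a minimal sketch, the example below computes wind speed from the two wind components for each ensemble member. The data frame is a made-up stand-in for collected Stage 2 output with only a subset of the columns above.

```r
# Hypothetical stand-in for a collected Stage 2 table (values are made up).
stage2 <- data.frame(
  site_id    = "BART",
  parameter  = rep(1:2, each = 2),
  horizon    = 0,
  variable   = rep(c("eastward_wind", "northward_wind"), times = 2),
  prediction = c(3, 4, -6, 8)
)

# Put the two wind components side by side, one row per ensemble member.
keys <- c("site_id", "parameter", "horizon")
east <- stage2[stage2$variable == "eastward_wind", c(keys, "prediction")]
north <- stage2[stage2$variable == "northward_wind", c(keys, "prediction")]
wind <- merge(east, north, by = keys, suffixes = c("_east", "_north"))

# Wind speed is the magnitude of the horizontal wind vector (m/s).
wind$wind_speed <- sqrt(wind$prediction_east^2 + wind$prediction_north^2)
wind
```

With real data, datetime would normally be included in the merge keys as well so that each time step is matched separately.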

7.1.3 Stage 3

Stage 3 can be viewed as the “historical” weather for a site as simulated by NOAA GEFS. Stage 3 is useful for model training because it ensures that the magnitude and variability of the weather data used to train your model are similar to those of the NOAA GEFS weather forecasts you may use as inputs to your forecast.

Stage 3 uses CF variable names and a 1 hr interval:

  • air_temperature (K)
  • air_pressure (Pa)
  • relative_humidity (proportion)
  • surface_downwelling_longwave_flux_in_air (W/m^2)
  • surface_downwelling_shortwave_flux_in_air (W/m^2) 
  • precipitation_flux (kg/(m^2 s))
  • eastward_wind (m/s)
  • northward_wind (m/s)

weather_stage3 <- neon4cast::noaa_stage3()
weather_stage3 |>
  dplyr::filter(site_id == "BART") |>
  dplyr::collect()

Stage 3 has the following columns:

site_id: string : NEON site ID
prediction: double : forecasted value
variable: string : weather variable
family: string : class of uncertainty (ensemble)
parameter: int32 : ensemble member number
reference_datetime: timestamp[us, tz=UTC] : always NA in Stage 3
datetime: timestamp[us, tz=UTC] : datetime of forecast
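When Stage 3 is used for model training, the hourly values may need to be aggregated to the time step of the model. The following sketch assumes a daily time step and averages hourly air temperature to daily means for each ensemble member. The data frame is a made-up stand-in for collected Stage 3 output, and the choice of a daily mean is only an example.

```r
# Hypothetical stand-in for a collected Stage 3 table: hourly air temperature
# for one ensemble member over two days (values are made up).
stage3 <- data.frame(
  site_id    = "BART",
  variable   = "air_temperature",
  parameter  = 1L,
  datetime   = seq(as.POSIXct("2022-04-01 00:00", tz = "UTC"),
                   by = "1 hour", length.out = 48),
  prediction = rep(c(280, 284), each = 24)
)

# Aggregate hourly values to daily means for each ensemble member.
stage3$date <- as.Date(stage3$datetime, tz = "UTC")
daily <- aggregate(prediction ~ site_id + variable + parameter + date,
                   data = stage3, FUN = mean)
daily
```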

7.2 Downloading of raw NOAA GEFS Forecasts

The documentation above describes how to access NOAA GEFS forecasts that we have subsetted for NEON sites. We recommend using this approach.

The code used to download and process the raw NOAA GEFS forecasts from the Amazon Web Services Registry of Open Data is available in the gefs4cast package at github.com/eco4cast/gefs4cast.