Recent improvements in the E-OBS gridded data set for daily mean wind speed over Europe in the period 1980–2021

In this work, we present the most recent updates in the E-OBS gridded data set for daily mean wind speed over Europe. The data set is provided as an ensemble of 20 equally likely realisations. The main improvements of this data set are the use of forward selection linear regression for the monthly background field, as well as a method to ensure the reliability of the ensemble dispersion. In addition, we present a preliminary study of possible causes of the observed terrestrial wind stilling effect, such as local changes in surface roughness length.


Introduction
Gridded climate data sets are popular products of climate data services because they are easy to handle and cover the complete domain homogeneously. Such gridded data sets can be exploited in studies into climate change mitigation and adaptation. For Europe, E-OBS provides a gridded data set for a family of variables (van den Besselaar et al., 2011; Cornes et al., 2018); a recent addition to the E-OBS family is the present gridded data set for daily mean wind speed (de Baar et al., 2023). The E-OBS in situ based gridded data set is part of the Copernicus Climate Change Service and is based on a large number of European stations, mostly from National Meteorological and Hydrological Services. Currently, the variables mean temperature, minimum temperature, maximum temperature, precipitation amount, sea level pressure, surface shortwave downwelling radiation, relative humidity and wind speed can be downloaded from https://cds.climate.copernicus.eu/ (last access: 14 August 2023) or from https://surfobs.climate.copernicus.eu/ (last access: 14 August 2023).

Data
For this gridded data set, we use validated in situ station data for wind speed at h = 10 m, which are supplied by the European National Meteorological and Hydrological Services (NMHSs) and other data holding institutes to the European Climate Assessment & Dataset (ECA&D) (Klein Tank et al., 2002; Klok and Klein Tank, 2009; ECA&D Team, 2012). For the areas in Europe where these data are missing, we use the in situ daily mean wind speed compiled by the Global Summary Of the Day (GSOD) (Smith et al., 2011).
Inspired by Brinckmann et al. (2016), we use a set of covariates as predictors for the daily mean wind speed: (1) latitude and longitude coordinates (from the ECA&D geographical extent); (2) multi-year ERA5 monthly average 800 hPa wind speed (Hersbach et al., 2020); (3) terrain altitude, slope and positioning index (from USGS GTOPO30); (4) distance to coast; and (5) terrain roughness length (obtained from Copernicus Climate Data Store). Because wind speed is a spatially high-resolution variable, we have maintained the spatial representation of the covariates and only used piecewise linear downsampling to match the 0.1° × 0.1° E-OBS grid resolution.
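As an illustration of piecewise linear resampling of a covariate onto a 0.1° grid, the following is a minimal sketch and not the E-OBS implementation. The function name `resample_bilinear`, the synthetic altitude field, the source spacing of 1/120° and the cell-centre coordinates of the target grid are all assumptions made for this example.

```python
import numpy as np

def resample_bilinear(lat, lon, field, lat_new, lon_new):
    """Piecewise linear (bilinear) resampling of a regular lat/lon field.

    lat and lon must be strictly increasing; field has shape (len(lat), len(lon)).
    """
    # First pass: interpolate every source row along longitude.
    along_lon = np.array([np.interp(lon_new, lon, row) for row in field])
    # Second pass: interpolate every resulting column along latitude.
    return np.array([np.interp(lat_new, lat, col) for col in along_lon.T]).T

# Hypothetical high-resolution source grid (spacing 1/120 degree) with a
# synthetic altitude field; real covariates would be read from file instead.
lat_src = np.linspace(50.0, 51.0, 121)
lon_src = np.linspace(4.0, 5.0, 121)
altitude = 100.0 + 20.0 * np.sin(lat_src)[:, None] * np.cos(lon_src)[None, :]

# Assumed 0.1 degree target grid, specified by its cell centres.
lat_eobs = np.arange(50.05, 51.0, 0.1)
lon_eobs = np.arange(4.05, 5.0, 0.1)
altitude_eobs = resample_bilinear(lat_src, lon_src, altitude, lat_eobs, lon_eobs)
```

Because the interpolation is linear in each direction, a covariate that varies linearly in latitude and longitude is reproduced exactly on the target grid.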

Methodology
A detailed description of the methodology is provided in de Baar et al. (2023); here we provide a short description. We consider the log-transformed daily mean wind speed. Our procedure is to first compute the monthly mean station values, and then exploit the covariates to learn a background field for each year and month. For the background field we allow for higher-order terms and interaction terms of the covariates, which is an improvement over more standard methods. Because such a model contains many potential basis functions and coefficients and is therefore susceptible to overfitting, we use forward selection linear regression (James et al., 2013) to select the most important basis functions. Instead of using the root mean squared (rms) residual as a selection criterion, we use the 10-fold cross-validation rms error (James et al., 2013), which is a further safeguard against overfitting of the monthly background field.
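The selection procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the E-OBS code: the function names, the encoding of basis functions as tuples of covariate powers, the maximum total power of two, the stopping rule (stop when the cross-validation error no longer decreases) and the synthetic data are all hypothetical.

```python
import itertools
import numpy as np

def build_basis(X, exponents):
    """One column per basis function: prod_j X[:, j] ** e_j for each tuple e."""
    return np.column_stack([np.prod(X ** np.asarray(e), axis=1) for e in exponents])

def cv_rmse(A, y, k=10, seed=0):
    """k-fold cross-validation rms error of an ordinary least-squares fit."""
    idx = np.random.default_rng(seed).permutation(len(y))
    sq_err = 0.0
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        sq_err += np.sum((A[test] @ coef - y[test]) ** 2)
    return np.sqrt(sq_err / len(y))

def forward_selection(X, y, max_power=2, k=10):
    """Greedily add the basis function that lowers the CV rms error the most."""
    n_cov = X.shape[1]
    candidates = [e for e in itertools.product(range(max_power + 1), repeat=n_cov)
                  if 0 < sum(e) <= max_power]
    selected = [(0,) * n_cov]  # start from the constant term
    best = cv_rmse(build_basis(X, selected), y, k)
    while candidates:
        scores = [cv_rmse(build_basis(X, selected + [c]), y, k) for c in candidates]
        i = int(np.argmin(scores))
        if scores[i] >= best:  # assumed stopping rule: no further improvement
            break
        best = scores[i]
        selected.append(candidates.pop(i))
    return selected, best

# Synthetic stand-in for log-transformed monthly mean station values.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 0] * X[:, 1] + 0.01 * rng.normal(size=200)
selected, err = forward_selection(X, y)
```

The same folds are reused for every candidate, so the candidates are compared on identical data splits.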
After finding the background fields for each year and month, we use Gaussian process regression (Wikle and Berliner, 2007) to regress the daily station anomalies. We use an exponentially decaying covariance function, with a relative measurement uncertainty level ("nugget") of 1 %. We use a maximum likelihood estimate to tune the correlation length of the covariance function, and use a search range (i.e. mask) of 150 km to the nearest station. The gridded anomaly is then added to the background grid.
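The anomaly regression can be illustrated with the following minimal sketch. It is not the E-OBS implementation and rests on several assumptions: coordinates are given in km on a plane, the anomalies are treated as a zero-mean process, the 1 % nugget is interpreted as 0.01 relative to the process variance, and the correlation length is tuned by a simple search over a hypothetical list of candidate values rather than by a numerical optimiser.

```python
import numpy as np

def exp_cov(d, length):
    """Exponentially decaying correlation as a function of distance d."""
    return np.exp(-d / length)

def pairwise_dist(a, b):
    """Euclidean distances between two sets of planar coordinates (km)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def neg_log_likelihood(length, d, y, nugget):
    """Concentrated negative log-likelihood (process variance profiled out)."""
    n = len(y)
    R = exp_cov(d, length) + nugget * np.eye(n)
    L = np.linalg.cholesky(R)
    alpha = np.linalg.solve(L, y)
    sigma2 = alpha @ alpha / n
    return 0.5 * n * np.log(sigma2) + np.sum(np.log(np.diag(L)))

def fit_predict(xy_sta, anom, xy_grid, lengths, nugget=0.01, max_dist=150.0):
    """Tune the correlation length, then predict anomalies on the grid."""
    d = pairwise_dist(xy_sta, xy_sta)
    best = min(lengths, key=lambda l: neg_log_likelihood(l, d, anom, nugget))
    R = exp_cov(d, best) + nugget * np.eye(len(anom))
    d_grid = pairwise_dist(xy_grid, xy_sta)
    pred = exp_cov(d_grid, best) @ np.linalg.solve(R, anom)
    # Mask grid points whose nearest station is beyond the search range.
    pred[d_grid.min(axis=1) > max_dist] = np.nan
    return pred, best
```

A possible call is `fit_predict(xy_sta, anom, xy_grid, lengths=[50.0, 100.0, 200.0])`; the predicted anomaly would then be added to the background grid.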
We provide the gridded data set as an ensemble of 20 equally likely realisations as a means to quantify the uncertainty. The initial ensemble is created through random bootstrapping (James et al., 2013) of the monthly background grid. This implies that each ensemble member of the background grid can have different basis functions and coefficients. Then, we make another improvement by applying the newly developed Ensemble Dispersion Improvement auto-Tune (EDIT) to adjust the ensemble spread during postprocessing to reflect realistic uncertainty levels in the data set (de Baar et al., 2022, 2023).
Figure 1 illustrates the E-OBS gridded data set for daily mean wind speed, for storm "Kyrill" (2007) passing over Europe. In Fig. 1, the top row shows the ensemble mean, while the bottom row shows the ensemble standard deviation. It can be seen that the uncertainty is larger in areas with high mean wind speeds, areas with low station density (like Eastern Europe) and areas with complicated topography (like the coast of Norway).
It should be noted that ERA5-Land does not assimilate wind speed, although wind speed does enter in an indirect way because ERA5-Land is forced by the atmospheric component of ERA5, which assimilates information from the boundary layer, e.g. ground-based wind profiles or balloon observations (Muñoz-Sabater et al., 2021). Although the general features of E-OBS and ERA5-Land are quite similar, we note that E-OBS shows more detail (i.e. less smoothness) than ERA5-Land. Also, we see some differences along the coast of Norway and of Portugal. A more detailed comparison between E-OBS and ERA5-Land is provided in de Baar et al. (2023), which confirms that E-OBS preserves the extremes in wind speed somewhat better and generally shows more structure in the wind field.
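The bootstrapping step that creates the initial 20-member ensemble of the background grid can be sketched as follows. This is a simplified illustration with hypothetical names: the basis functions are kept fixed for all members, whereas in the procedure described above each member can select its own basis functions, and the EDIT adjustment of the ensemble spread is not included.

```python
import numpy as np

def bootstrap_background(A_sta, y_sta, A_grid, n_members=20, seed=0):
    """Refit the background regression on resampled stations.

    A_sta:  basis functions evaluated at the stations, shape (n_sta, n_basis)
    y_sta:  monthly mean station values, shape (n_sta,)
    A_grid: basis functions evaluated at the grid points, shape (n_grid, n_basis)
    Returns an array of shape (n_members, n_grid).
    """
    rng = np.random.default_rng(seed)
    n = len(y_sta)
    members = np.empty((n_members, A_grid.shape[0]))
    for m in range(n_members):
        pick = rng.integers(0, n, size=n)  # draw stations with replacement
        coef, *_ = np.linalg.lstsq(A_sta[pick], y_sta[pick], rcond=None)
        members[m] = A_grid @ coef
    return members
```

The ensemble mean and standard deviation, as shown in the rows of Fig. 1, would then follow from `members.mean(axis=0)` and `members.std(axis=0, ddof=1)`.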

Towards understanding terrestrial wind stilling
The new E-OBS daily mean wind speed gridded data set over Europe might be helpful to better understand an interesting and elusive phenomenon: terrestrial wind stilling. According to different studies, the yearly mean wind speed over land has seen a downward trend, in particular in the period 1980–2010 (e.g. Vautard et al., 2010). We extracted the daily mean wind speed during the selected period to better explore this hypothesis. Figure 3a illustrates the reduction in yearly mean wind speed spatially averaged over the area covered by E-OBS, which is in line with what previous research has identified. However, when this one-dimensional trend is mapped out in geographic space, we find some spatial patterns that might not conform with previous findings. Figure 3b shows that the trend depends strongly on the location, with the strongest decreases in wind speed over Sweden. Recently, this decrease in wind speed over Sweden was also discussed in Minola et al. (2023), and change of surface roughness was mentioned as one of the possible causes.
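A local trend map of the kind discussed above can be obtained from a least-squares slope per grid cell. The sketch below is an assumption about how such a map could be computed, not a description of how Fig. 3b was produced; the function name and array layout are hypothetical, and missing values are not handled.

```python
import numpy as np

def linear_trend(years, field):
    """Least-squares slope per grid cell.

    years: shape (n_years,)
    field: yearly mean wind speed, shape (n_years, n_lat, n_lon)
    Returns the trend in field units per year, shape (n_lat, n_lon).
    """
    t = years - years.mean()
    anomaly = field - field.mean(axis=0)
    return np.tensordot(t, anomaly, axes=(0, 0)) / np.sum(t ** 2)
```

Applying the same function to the spatially averaged series, reshaped to `(n_years, 1, 1)`, gives a single domain-wide trend.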
In order to better understand the cause of this wind stilling effect, we use forward selection linear regression (James et al., 2013) to learn the local wind stilling trend shown in Fig. 3b as a function of local covariates. The resulting model is illustrated in Fig. 3c, where the basis functions (vertical axis) have been ordered from high to low by magnitude of their effect. In the top panel of Fig. 3c we show the basis functions, where the color indicates the power of each covariate in the product that forms the basis function: an open circle indicates power zero, an orange circle power one and a purple circle power two. For example, basis function "2" is linear in topographic position index (TPI), while basis function "3" is a constant. A basis function with multiple circles indicates an interaction term. In the bottom panel, we show the corresponding coefficients. We observe an important effect from the local trend in the logarithm of the surface roughness length, which suggests that land-use changes drive wind stilling. However, the indicated r² value is low, indicating that the picture of terrestrial wind stilling is probably more complex and involves many significant terms. Some of these terms might be even higher-order terms or interactions or covariates that were not included in this study. Further investigation has been taken up, for example, by Luu et al. (2023).
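To make the encoding of basis functions by covariate powers concrete, the following minimal sketch evaluates such a model and computes its r² value. The covariate order, the powers and the coefficients are hypothetical and are not taken from Fig. 3c.

```python
import numpy as np

def evaluate_model(X, powers, coef):
    """Evaluate sum_k coef[k] * prod_j X[:, j] ** powers[k, j]."""
    basis = np.prod(X[:, None, :] ** powers[None, :, :], axis=2)
    return basis @ coef

def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat for targets y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical example with two covariates: column 0 is TPI and column 1 is
# the local trend in the logarithm of the roughness length.
powers = np.array([[1, 0],   # linear in TPI
                   [0, 0],   # constant
                   [1, 1]])  # interaction of the two covariates
coef = np.array([1.0, 0.5, 2.0])
X = np.array([[2.0, 3.0]])
trend_hat = evaluate_model(X, powers, coef)  # 1*2 + 0.5 + 2*(2*3) = 14.5
```

A row of zeros in `powers` corresponds to the constant term, and a row with more than one non-zero entry corresponds to an interaction term.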

Conclusions
We have presented a new E-OBS gridded data set for daily mean wind speed. The main improvements are the use of forward selection linear regression to fit a covariate-based background field, as well as EDIT to ensure reliable ensemble dispersion. The study of the terrestrial wind stilling effect provides some initial results; however, more research is required.
Competing interests. The contact author has declared that none of the authors has any competing interests.

Disclaimer.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Special issue statement.
This article is part of the special issue "EMS Annual Meeting: European Conference for Applied Meteorology and Climatology 2022". It is a result of the EMS Annual Meeting: European Conference for Applied Meteorology and Climatology 2022, Bonn, Germany, 4–9 September 2022. The corresponding presentation was part of session OSA3.2: Spatial climatology.