Introduction
A fully integrated European energy market is one of the
priority policy areas of the European Commission.
Transmission system operators already use estimates of the energy production from
variable renewable sources within their transmission zones.
Besides technical aspects such as the reinforcement of the transmission grid,
the upscaling algorithms behind
these renewable power estimates also need to be revised when trading zones are
extended, in particular for increasing shares of renewables. In fact, the
large-scale integration of variable renewable energy sources (VRES) – such
as wind power – introduces additional factors of uncertainty. This
uncertainty poses new challenges to the power system operator since it is
necessary to keep the balance between production and consumption at every
moment in order to ensure the stability of the power system.
It is therefore
crucial to know the actual and future generation from the VRES within the
system. While future generation is the subject of forecasting techniques,
this work introduces an upscaling methodology to estimate
the actual Europe-wide wind
power generation based on spatio-temporal clustering.
Applying upscaling methodologies at the European scale is expected to bring
additional benefits: aggregating wind parks with a
wide geographical dispersion, for instance, is an effective way to reduce
short-term variability and forecast errors by taking advantage of the statistical smoothing effect.
Several upscaling approaches can be found in the current literature. One
proposes a typical upscaling function that uses a bi-exponential
function to estimate the cross-correlation.
Another benchmarks different approaches based on dynamic fuzzy
neural networks. In a third, the upscaling technique is based on
smoothing techniques to construct predictions of the aggregated wind
generation from historical wind speed predictions and the associated wind
generation measurements. More recently, a probabilistic
approach was proposed, showing that this type of methodology can provide competitive
interval forecasts when compared with conventional statistical approaches.
However, the upscaling
methodologies described above are usually applied to a set of wind parks and not at the European scale.
As wind is a meteorological quantity, weather conditions may have a strong
impact on wind power variability as well as on the uncertainty of its
forecasts. For instance, it has been shown
that the presence of cyclonic systems with strong dynamics, such as cold
fronts, can be related to larger forecast errors than
prevailing weather conditions associated with stationary systems such as
anticyclonic systems. A similar methodology was also applied to several wind
parks in Portugal, demonstrating the weather dependency of wind power
forecast errors. Further studies show that strong
wind variability can be associated with certain weather patterns and
that weather regimes have a strong impact on wind power ramps
in Portugal. Consequently, taking into account the underlying role of the
synoptic weather patterns could be an important step towards
reliable upscaling algorithms.
The objective of this work is to introduce a new upscaling approach for
Europe-wide wind power generation based on spatio-temporal clustering
(see "Reference site selection: spatio-temporal clustering"). The upscaling model is trained and
evaluated for different circulation weather types (CWTs, see "Circulation weather types")
using a set of Europe-wide wind power generation data
(see "Wind power generation data"). Training for specific CWTs is compared
with training over all time steps in the training period in order to
investigate the weather dependency of the model and the potential benefit of
weather-dependent training (see "Results"). Conclusions are drawn at the
end of the paper.
Methodology and data
Reference site selection: spatio-temporal clustering
Figure: Schematic dendrogram illustrating steps 4 and 5 of the spatio-temporal clustering approach.
The focus of this work is the presentation of a reference site selection scheme
based on spatio-temporal clustering. To derive a finite set of
reference sites for upscaling the generation of wind power across Europe at a
certain point in time, the following procedure is applied:
1. Cluster the locations of the wind farms (latitude/longitude coordinates) into $N$
(geographical) clusters via the k-means algorithm.
2. For each of the $N$ geographical clusters, select the site with the highest wind
power capacity. This yields the set $\Omega_{geo}$ with size $|\Omega_{geo}| = N$.
3. Compute the pairwise (temporal) correlations
$\varrho(r_i, r_j) = \varrho\bigl(p(r_i,t),\, p(r_j,t)\bigr) \;\forall\, r_i, r_j \in \Omega_{geo}$ of
the historical generation time series $p(r_i,t)$ at the $N$ sites $r_i$, $i = 1, \dots, N$,
selected in the previous step.
4. Use the correlation information to apply a hierarchical clustering,
with the distance between sites $r_i$ and $r_j$ defined
as $d(r_i, r_j) = 1 - |\varrho_{ij}|$.
5. Cut the dendrogram obtained from the hierarchical (temporal) clustering at height
$h = \tau$, where $\tau$ is the distance between two clusters. This yields
$k = k(\tau) \le N$ clusters. For each cluster, again select the site with the highest wind power capacity
as the cluster centre to obtain the final set of $k$ reference sites $\Omega_0$. This step is illustrated
by the schematic dendrogram.
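Steps 3 to 5 of the procedure above can be sketched in a few lines of pure Python. This is only an illustrative sketch under stated assumptions, not the implementation used for this study: the function names, the dictionary-based inputs and the toy data are hypothetical, the geographical k-means pre-selection (steps 1 and 2) is assumed to have been done already, and cutting the dendrogram is approximated by stopping the agglomeration once the smallest linkage distance exceeds $\tau$. The linkage is taken as the average pairwise distance within the merged cluster.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def select_reference_sites(series, capacity, tau):
    """Steps 3-5: correlation distances, agglomeration, cut at tau.

    series   -- dict: site id -> generation time series (hypothetical input)
    capacity -- dict: site id -> rated capacity (hypothetical input)
    tau      -- cut height, i.e. 1 minus the average intra-cluster correlation
    Returns the highest-capacity site of each resulting cluster.
    """
    sites = list(series)
    # Steps 3 and 4: pairwise distance d = 1 - |correlation|.
    dist = {
        frozenset((a, b)): 1.0 - abs(pearson(series[a], series[b]))
        for i, a in enumerate(sites) for b in sites[i + 1:]
    }

    def linkage(c1, c2):
        # Average pairwise distance within the union of both clusters.
        members = c1 + c2
        pairs = [(a, b) for i, a in enumerate(members) for b in members[i + 1:]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    clusters = [[s] for s in sites]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        (i, j), d = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda item: item[1],
        )
        if d > tau:  # assumption: stands in for cutting the dendrogram at tau
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    # Step 5: the site with the highest capacity represents each cluster.
    return sorted(max(c, key=lambda s: capacity[s]) for c in clusters)

# Toy data: sites "a" and "b" are perfectly correlated, "c" is not.
series = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, -1, 1, -1]}
capacity = {"a": 10.0, "b": 20.0, "c": 5.0}
print(select_reference_sites(series, capacity, tau=0.2))
```

With a small cut height the two correlated sites collapse into one cluster, while a cut height of 1 merges everything into a single reference site.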
Note that if the average group linkage method is used to agglomerate
clusters, τ can be interpreted as 1 minus the average intra-cluster
correlation. In other words, the final set of reference sites can be
determined by choosing the average intra-cluster correlation:
$$D(A,B) := \frac{1}{(|A|+|B|)(|A|+|B|-1)} \sum_{x,y \in A \cup B} d(x,y) = \frac{1}{(|A|+|B|)(|A|+|B|-1)} \sum_{x,y \in A \cup B} \bigl(1 - \varrho(x,y)\bigr) = 1 - \bar{\varrho}_C$$
Here, $A$ and $B$ are two clusters (sets) and $C = A \cup B$ is the cluster
resulting from their union. Choosing the average
intra-cluster correlation as the key parameter for determining the reference sites
makes it possible to investigate the behaviour of the clustering approach from a
physical-meteorological perspective. This is the major advantage of the
proposed methodology compared to, for instance, st-DBSCAN,
which does not allow distance measures other than the Euclidean
distance.
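As a small worked illustration of the relation between cut height and average intra-cluster correlation, consider a cluster $C$ of three sites with pairwise correlations 0.9, 0.8 and 0.7. These numbers are hypothetical and not taken from the data set, and the sum is assumed to run over all ordered pairs $x \neq y$, so that each pair is counted twice:

```latex
\begin{align}
D &= \frac{1}{3 \cdot 2} \sum_{\substack{x, y \in C \\ x \neq y}} \bigl(1 - \varrho(x,y)\bigr)
   = \frac{2\,(0.1 + 0.2 + 0.3)}{6} = 0.2, \\
\bar{\varrho}_C &= 1 - D = 0.8 .
\end{align}
```

Under these assumptions, cutting the dendrogram at $\tau = 0.2$ would keep these three sites together in one cluster.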
Upscaling and evaluation
Figure: Modelled spatial distribution of rated wind power capacity across Europe.
The upscaling estimate for time $t = t'$ is computed as a weighted sum
of the generation measured at the reference sites:
$$E(t = t') = \sum_{r_i \in \Omega_0} w(r_i)\, p(r_i, t = t')$$
where the weights $w_i = w(r_i)$ are computed from a multiple linear regression of the
total Europe-wide generation on the generation at the $k$ reference sites
$r_i \in \Omega_0$, performed over a pre-chosen training period. Note
that, in general, the $w_i$ depend on $\tau$ and $N$.
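The regression and the weighted sum can be sketched as follows. This is a minimal pure-Python illustration, not the code used for this study: the function names and toy numbers are hypothetical, the regression is assumed to have no intercept (in line with the weighted sum above), and the least-squares problem is solved through the normal equations, which is adequate only for small, well-conditioned examples.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_weights(site_generation, total_generation):
    """Least-squares weights w (no intercept) with sum_i w_i p_i(t) ~ total(t).

    site_generation  -- list of k training time series, one per reference site
    total_generation -- Europe-wide reference series of the same length
    """
    k = len(site_generation)
    # Normal equations: (P^T P) w = P^T y
    gram = [[sum(a * b for a, b in zip(site_generation[i], site_generation[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(a * y for a, y in zip(site_generation[i], total_generation))
           for i in range(k)]
    return solve_linear_system(gram, rhs)

def upscale(weights, site_values):
    """Upscaling estimate E(t') as the weighted sum of site generation at t'."""
    return sum(w * p for w, p in zip(weights, site_values))

# Toy training data: the total is constructed as 3 * site 1 + 5 * site 2.
p1 = [1.0, 2.0, 3.0, 4.0]
p2 = [2.0, 1.0, 0.0, 3.0]
total = [3.0 * a + 5.0 * b for a, b in zip(p1, p2)]
weights = fit_weights([p1, p2], total)
estimate = upscale(weights, [1.0, 2.0])
```

In this toy case the fit recovers the weights used to construct the total, because the constructed series contains no noise.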
For this study, the upscaling estimate $E(t)$ defined above
is evaluated using the Pearson correlation and the
root mean square error (RMSE) between $E(t)$ and the
reference time series over a testing period. The sum over all grid cells
of the wind power generation data (see "Wind power generation data") is used as the
reference. RMSE values
are normalized to the average hourly wind power production.
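The two skill measures can be written down compactly. The sketch below uses hypothetical function names and assumes that the RMSE is normalized by the mean of the reference series over the same period.

```python
import math

def pearson_correlation(estimate, reference):
    """Pearson correlation between the upscaling estimate and the reference."""
    n = len(reference)
    me, mr = sum(estimate) / n, sum(reference) / n
    cov = sum((e - me) * (r - mr) for e, r in zip(estimate, reference))
    var_e = sum((e - me) ** 2 for e in estimate)
    var_r = sum((r - mr) ** 2 for r in reference)
    return cov / math.sqrt(var_e * var_r)

def normalised_rmse(estimate, reference):
    """RMSE divided by the mean of the reference series (assumed normalization)."""
    n = len(reference)
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in zip(estimate, reference)) / n)
    return rmse / (sum(reference) / n)

# Toy example: an estimate that is perfectly correlated but biased by a factor of 2.
reference = [1.0, 2.0, 3.0]
estimate = [2.0, 4.0, 6.0]
corr = pearson_correlation(estimate, reference)
nrmse = normalised_rmse(estimate, reference)
```

The toy example shows why both measures are reported: a biased estimate can still be perfectly correlated with the reference while having a large normalized RMSE.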
To investigate the dependence on the prevailing weather situation
and the potential benefit of training the model for specific weather
situations, both training and testing are performed for the nine most
common circulation weather
types in Europe (see "Circulation weather types").
We use five years (2008–2012) for training and one year (2013) for testing.
Wind power generation data
Figure: Location of the cluster centres and the weights assigned to them
by the linear regression (size scale) for $\bar{\varrho}_C = 0.8$
(corresponding to $\tau = 0.2$), for training over all
time steps (a) and over the time steps with prevailing CWT SW (b).
The upscaling methodology introduced above is tested on a data set of
modeled hourly onshore wind power generation across Europe. The data are based
on two data sets: COSMO-EU analysis data provided by the German Weather
Service, which are used for the statistical downscaling of MERRA
reanalysis data provided by the National Aeronautics and Space Administration
of the United States. MERRA was used to cover a longer period of time.
The spatial distribution of rated wind power across Europe is modeled as a
function of the average (computed over the period considered) wind speed for
each location (grid cell) in Europe. The relation between wind speed and
rated power is estimated based on the available data of deployed wind power
capacity in Germany. Since this relation is not very distinct, artificial
noise has additionally been added:
$$y(r) = a\,\bar{w}(r) + b + \varepsilon$$
Here, $y(r)$ is the rated wind power at location $r$, $\bar{w}(r)$ is
the average wind speed at the same location, $a$ and $b$
are the coefficient and intercept fitted from the available data, and $\varepsilon$ is artificial Gaussian noise with zero mean.
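The capacity model can be illustrated with a short sketch. The function names, the toy numbers and the noise standard deviation are hypothetical; the fitted values of $a$ and $b$ and the noise level used for this study are not reproduced here.

```python
import random

def fit_line(x, y):
    """Ordinary least-squares slope a and intercept b of y = a x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    a = sxy / sxx
    return a, my - a * mx

def model_rated_capacity(mean_wind_speed, a, b, noise_std, rng):
    """Rated capacity y(r) = a * w(r) + b + eps, eps ~ N(0, noise_std^2)."""
    return [a * w + b + rng.gauss(0.0, noise_std) for w in mean_wind_speed]

# Toy calibration data (hypothetical): mean wind speed vs. deployed capacity.
a, b = fit_line([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])

# A seeded generator keeps the artificial noise reproducible.
capacity = model_rated_capacity([5.0, 6.0, 7.5], a, b, noise_std=0.5,
                                rng=random.Random(42))
```

Passing the random generator explicitly is a design choice of this sketch; it makes the modelled capacity field reproducible between runs.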
Figure: Locations of the 16 points used for the circulation weather type identification.
The modelled spatial distribution is shown in the figure of the rated wind
power capacity. Note that it does not, and is not meant to, represent the real
spatial distribution. Furthermore, offshore locations are not included.
Wind speed is converted to wind power by applying a regional power curve
model developed for the largest German transmission zone.
The procedure described here is similar to one used
in earlier work. For this study, the years 2008–2013 are considered.
Circulation weather types
Classification of atmospheric circulation into distinct states is a widely
used tool for describing and examining weather patterns and their impact on
meteorological phenomena such as rainfall. Several methodologies for
classifying weather circulation are
available in the literature. In this
study, an automatic version of the Lamb weather type classification is
applied to MERRA sea level pressure fields in order to obtain a time series
of prevailing circulation weather types. This method has been applied by
several authors since it was first proposed.
The algorithm is based on the sea level pressure at the 16 points depicted in
the figure of the grid point locations. Assuming geostrophic conditions, westerly and
southerly winds can be computed from the meridional and zonal pressure
gradients, respectively. In this way, six circulation indices (southerly flow $SF$,
westerly flow $WF$, resultant flow $FT$, southerly shear vorticity $ZS$,
westerly shear vorticity $ZW$ and total shear vorticity $ZT$) can be
computed from the sea level pressure data via:
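$$\begin{aligned}
SF &= A \cdot \tfrac{1}{4} \left(p_5 + 2p_9 + p_{13} - p_4 - 2p_8 - p_{12}\right) \\
WF &= \tfrac{1}{2} \left(p_{12} + p_{13} - p_4 - p_5\right) \\
FT &= \sqrt{SF^2 + WF^2} \\
ZS &= B \cdot \tfrac{1}{4} \left(p_6 + 2p_{10} + p_{14} - p_5 - 2p_9 - p_{13} - p_4 - 2p_8 - p_{12} + p_3 + 2p_7 + p_{11}\right) \\
ZW &= C \cdot \tfrac{1}{4} \left(p_{15} + p_{16} - p_8 - p_9\right) - D \cdot \tfrac{1}{4} \left(p_8 + p_9 - p_1 - p_2\right) \\
ZT &= ZS + ZW
\end{aligned}$$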
Southerly and westerly shear vorticity are estimated from the wind
shear in the centre of the domain. Subscripts indicate the grid point.
The four coefficients $A$, $B$, $C$ and $D$ are determined by the central
latitude of the chosen raster $\varphi_0$ (here: $\varphi_0 = 45^\circ$):
$$A = \frac{1}{\cos\varphi_0}, \quad B = \frac{1}{2\cos^2\varphi_0}, \quad C = \frac{\sin\varphi_0}{\sin(\varphi_0 - 5^\circ)}, \quad D = \frac{\sin\varphi_0}{\sin(\varphi_0 + 5^\circ)}$$
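The coefficients and the six circulation indices can be computed with a few lines of Python. The sketch below transcribes the formulas given above; the function names and the dictionary-based pressure input are hypothetical, and the resultant flow is assumed to be the Euclidean norm of $SF$ and $WF$.

```python
import math

def coefficients(phi0_deg):
    """Latitude-dependent coefficients A, B, C, D for central latitude phi0."""
    phi = math.radians(phi0_deg)
    step = math.radians(5.0)
    a = 1.0 / math.cos(phi)
    b = 1.0 / (2.0 * math.cos(phi) ** 2)
    c = math.sin(phi) / math.sin(phi - step)
    d = math.sin(phi) / math.sin(phi + step)
    return a, b, c, d

def circulation_indices(p, phi0_deg=45.0):
    """Six circulation indices from sea level pressure.

    p -- dict mapping the grid point number (1..16) to sea level pressure
    """
    a, b, c, d = coefficients(phi0_deg)
    sf = a * 0.25 * (p[5] + 2 * p[9] + p[13] - p[4] - 2 * p[8] - p[12])
    wf = 0.5 * (p[12] + p[13] - p[4] - p[5])
    ft = math.hypot(sf, wf)  # assumption: resultant flow as Euclidean norm
    zs = b * 0.25 * (p[6] + 2 * p[10] + p[14] - p[5] - 2 * p[9] - p[13]
                     - p[4] - 2 * p[8] - p[12] + p[3] + 2 * p[7] + p[11])
    zw = (c * 0.25 * (p[15] + p[16] - p[8] - p[9])
          - d * 0.25 * (p[8] + p[9] - p[1] - p[2]))
    return {"SF": sf, "WF": wf, "FT": ft, "ZS": zs, "ZW": zw, "ZT": zs + zw}

# Toy pressure field: uniform 1000 hPa except higher pressure at points 12 and 13.
pressure = {i: 1000.0 for i in range(1, 17)}
pressure[12] = pressure[13] = 1004.0
indices = circulation_indices(pressure)
```

A completely uniform pressure field gives zero for all six indices, which is a convenient sanity check for the point numbering.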
From the six circulation indices, 26 circulation weather types (CWTs) can be
deduced as follows:
(i) If $|ZT| < FT$, the mean flow dominates over the vorticity (local curvature of the wind field). These CWTs are
called directional and are named after the eight directions North (N), Northeast (NE), East (E), Southeast (SE), South
(S), Southwest (SW), West (W) and Northwest (NW). The flow direction is given by
$\tan^{-1}(WF/SF)$ if $WF \le 0$ and by $\tan^{-1}(WF/SF) + 180^\circ$ if $WF > 0$.
(ii) If $|ZT| > 2FT$, the vorticity exceeds the mean flow. The circulation is either cyclonic (L) if $ZT > 0$ or
anticyclonic (H) if $ZT < 0$.
(iii) If $FT < |ZT| < 2FT$, vorticity and mean flow are of comparable strength. These CWTs are called hybrid and are named
after the prevailing circulation, i.e. either cyclonic or anticyclonic, plus one of the eight flow directions.
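The three rules can be turned into a small classifier. This is an illustrative sketch only: the function names and the label format for hybrid types (for example "L-SW") are hypothetical, and the flow direction is computed with `atan2` rather than with the two-branch arctangent rule stated above, so that all four quadrants are resolved explicitly. Cases exactly on a threshold, which the rules above leave open, fall into the hybrid branch here.

```python
import math

DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def flow_direction(sf, wf):
    """Compass sector the flow comes from, given southerly (sf) and westerly (wf) flow."""
    # atan2 gives the bearing the flow is heading towards; adding 180 degrees
    # converts it to the direction the flow is coming from.
    bearing = (180.0 + math.degrees(math.atan2(wf, sf))) % 360.0
    return DIRECTIONS[round(bearing / 45.0) % 8]

def classify_cwt(sf, wf, zt):
    """Circulation weather type from southerly flow, westerly flow and total vorticity."""
    ft = math.hypot(sf, wf)  # resultant flow
    if abs(zt) < ft:
        return flow_direction(sf, wf)         # directional type
    rotation = "L" if zt > 0 else "H"         # cyclonic or anticyclonic
    if abs(zt) > 2.0 * ft:
        return rotation                       # pure rotational type
    return f"{rotation}-{flow_direction(sf, wf)}"  # hybrid type

print(classify_cwt(0.0, 4.0, 0.0))    # purely westerly flow
print(classify_cwt(1.0, 1.0, 10.0))   # vorticity dominates
print(classify_cwt(1.0, 1.0, 2.0))    # hybrid case
```

Eight directional, two rotational and sixteen hybrid labels give the 26 types mentioned above.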
For this study, the nine most common CWTs in Europe are chosen for
evaluation. These are the directional types except for Southeast, the
cyclonic type and the anticyclonic type.
Results
Cluster centres and reference site weights
As mentioned above, the number of reference sites depends on the
chosen average intra-cluster correlation. The figure of the cluster centres shows
the locations of the reference sites obtained from the spatio-temporal
clustering, as an example, for training over all time steps (a) and over the time
steps with prevailing Southwesterly circulation type (b). The average
intra-cluster correlation was set to $\bar{\varrho}_C = 0.8$ for this example.
The size of the dots additionally indicates the weights given to the
reference sites by the linear regression. Points with $|w(r)| < 0.5\sigma$ are considered neutral, where $\sigma$
denotes the standard deviation computed from all weights.
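The neutrality criterion is easy to state in code. The sketch below uses a hypothetical function name and toy weights, and it assumes the population standard deviation, since the text does not say which variant was used.

```python
import statistics

def neutral_sites(weights, factor=0.5):
    """Sites whose absolute regression weight is below factor * sigma.

    weights -- dict: site id -> regression weight (hypothetical input)
    """
    sigma = statistics.pstdev(list(weights.values()))  # assumption: population std
    return sorted(site for site, w in weights.items() if abs(w) < factor * sigma)

# Toy weights: only site "c" has a weight close to zero.
print(neutral_sites({"a": 2.0, "b": -2.0, "c": 0.1}))
```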
The number of reference sites for the CWT SW (panel b)
is lower (88 compared with 97). Hence, the correlations of
wind power generation between the geographical clusters are higher than average
during time steps of Southwesterly flow, especially on the Iberian
Peninsula, where the reduction of reference sites is most apparent. Here, wind
power production exhibits a relatively coherent spatial structure. This can
be related to the passage of large-scale atmospheric phenomena associated
with southwesterly circulation, such as cold fronts, which are able to cover the whole
region. However, not all of the nine CWTs
considered exhibit this higher-than-average correlation. In contrast to
southwesterly circulation, some CWTs are usually associated with relatively
weak and diffuse synoptic-scale phenomena. These may cause a less coherent
spatial structure of the wind field. Accordingly, the number of reference sites
for $\bar{\varrho}_C = 0.8$ ranges between 88 for SW and 105 for the
Easterly
flow type (not shown).
The same figure also shows that the weights given
to the selected reference sites vary as well. The reference sites on the
Iberian Peninsula receive relatively higher weights for the Southwesterly
circulation type than for all time steps.
Figure: Correlation versus the average intra-cluster correlation $\bar{\varrho}_C$ for CWT SW, obtained from the
specific training for this CWT (black) and from the training over all time steps (green).
Figure: As the previous figure, but for the RMSE normalised to the average generation.
Figure: Time series of the upscaling estimate [GWh] versus the reference time series [GWh] for all time steps
(green) and time steps with prevailing Southwesterly circulation (black).
Upscaling evaluation
The skill of the methodology introduced above, measured
by correlation and RMSE, is shown as an example for the Southwesterly
circulation type in the corresponding figures. Very high (> 0.95) values
of the correlation can be achieved for
average intra-cluster correlations above 0.1. For the Southwesterly CWT this
corresponds to $k = 17$ reference sites for the whole of Europe. For higher
$\bar{\varrho}_C$ the correlation asymptotically approaches 1.
A similar behaviour is found for the RMSE. For $\bar{\varrho}_C > 0.1$
the RMSE drops below 10 % of the average wind power generation in Europe. For
average intra-cluster correlations above 0.45, corresponding to $k = 41$,
RMSE values below 5 % of
the average generation can be achieved.
The good agreement between the upscaling estimate and the reference time
series can additionally be seen from the scatter plot
(again for $\bar{\varrho}_C = 0.8$). A systematic
error only appears for extremely high (above 75 GWh) wind power generation
values, where the upscaling model systematically underestimates the
generation. Furthermore, all of these extreme values occur during Southwesterly
circulations, which reduces the skill of the upscaling model for this CWT disproportionately.
The case $\bar{\varrho}_C = 0$ does not involve any hierarchical clustering. The
corresponding data point is considered
non-representative and is therefore excluded from further analysis.
Benefit from training for weather types
Figure: Range of correlation values achieved by training the upscaling for
the specific CWTs (black) and by training over all time steps (green).
In general, the Southwesterly CWT is the one for which the introduced
upscaling methodology works best with respect to the correlation (black bars
in the figure of the correlation ranges). Other CWTs exhibit lower correlations. With
respect to the RMSE, the SW type shows only average skill
(black bars in the figure of the RMSE ranges). Here, especially the Easterly type benefits from the specific training.
The two range figures show the
correlation and the RMSE for all $\bar{\varrho}_C \in (0, 1]$
obtained from (i) training specifically for the particular CWTs (black)
and (ii) training over all time steps (green). Evidently, the upscaling
skill benefits from the specific training: the range of both correlation and
RMSE is reduced markedly. It can furthermore be observed that the
cyclonic CWT and the Southerly CWT perform worst with respect to both
correlation and RMSE, while the Easterly, Southwesterly and Cyclonic types
perform best. The benefit from the CWT-specific training is strongest for the
Northeasterly and Northwesterly types with respect to correlation and RMSE,
respectively.
Figure: As the previous figure, but for the RMSE normalised to the average generation.