Classical approaches to the calculation of the photovoltaic (PV) power
generated in a region from meteorological data require the knowledge of the
detailed characteristics of the plants, which are most often not publicly
available. An approach is proposed with the objective to obtain the best
possible assessment of power generated in any region without having to
collect detailed information on PV plants. The proposed approach is based on
a model of PV plant coupled with a statistical distribution of the prominent
characteristics of the configuration of the plant and is tested over Europe.
The generated PV power is first calculated for each of the plant
configurations frequently found in a given region and then aggregated taking
into account the probability of occurrence of each configuration. A
statistical distribution has been constructed from detailed information
obtained for several thousands of PV plants representing approximately
2 % of the total number of PV plants in Germany and was then adapted to
other European countries by taking into account changes in the optimal PV
tilt angle as a function of the latitude and meteorological conditions. The
model has been run with bias-adjusted ERA-Interim data as meteorological
inputs. The results have been compared to estimates of the total PV power
generated in two countries: France and Germany, as provided by the
corresponding transmission system operators. Relative RMSE of 4.2 and
3.8 % and relative biases of

Time series of photovoltaic (PV) power generated within a region are needed for prospective studies on the transformation of the electricity supply system. Under classical approaches, an accurate calculation of the power generated by PV plants in a region requires knowledge of the detailed characteristics of the plants. A few cases can be found where enough information is available and can be used for model development and/or validation (see e.g. Jamaly et al., 2013; Lingfors and Widén, 2016; Shaker et al., 2015, 2016). Plant production data are, however, most often unavailable to the public. They may be collected from e.g. the operators of PV plants, but this represents a huge amount of work. Hence, an accurate calculation of the PV power generated in any European region with a classical method is a laborious and time-consuming task and is, in practice, intractable. Alternatively, statistical approaches may be used to train models from historical time series of the aggregated PV power generated, such as those published by the transmission system operators. In this case, the collection, understanding and quality check of the training data are essential to ensure accurate model output. Though more practical than the classical approaches, these statistical approaches are also time-consuming because of the handling of the data, and they cannot be applied in regions where no training data are available. Another common practice consists in estimating the total power generated in a region by upscaling the power generated by a subset of reference plants (Schierenbeck et al., 2010; Lorenz and Heinemann, 2012; Shaker et al., 2015, 2016; Saint-Drenan et al., 2016; Bright et al., 2017; Pierro et al., 2017; Killinger et al., 2017).
The major obstacles to this practice are, on the one hand, the establishment of criteria for selecting plants that are statistically representative of the region and for choosing their number and, on the other hand, access to measurements from the selected plants. In addition, those methods only allow the PV power generation to be estimated for historical periods where measurements are available, and their extrapolation to other periods is not straightforward. This can be a major drawback for e.g. generation adequacy studies or prospective analyses, where scenarios with long time series are needed. A further option consists in selecting a simple PV model with a very limited number of unknowns (Jerez et al., 2015), whose implementation for any region is easy but comes at the expense of model accuracy. Finally, several authors propose to consider a mix of different key parameters, where the distributions are chosen by the authors (Marinelli et al., 2015; Schubert, 2012). This requires deep expertise in the domain. This approach is limited to a few regions and cannot be used to model all EU countries.

This brief survey of the proposed approaches demonstrates the need for a new
approach that offers a better trade-off between implementation constraints
and model output accuracy. This paper describes a method addressing this
need. It has been developed in the framework of the EU-funded Copernicus
Climate Change Service ECEM

The innovation of our method is the extension of the regional PV model proposed by Saint-Drenan (2015) and Saint-Drenan et al. (2017) to any region, without the need for a priori knowledge of the characteristics of the installed PV plants. To this end, the plant-related parameters which are needed as input to the PV model are expressed as a function of known solar resource characteristics, thus making the model generalizable to any region, namely beyond Germany where the approach was originally tested. This is achieved in two steps: firstly, by reducing the number of inputs to the PV model by means of an analytical function for the statistical distribution of the module orientation and, secondly, by expressing the parameters of the chosen analytical function as a function of known geographically dependent information (optimal tilt angle).

The paper is organized as follows. After a short summary of the regional PV model proposed by Saint-Drenan (2015) and Saint-Drenan et al. (2017) in Sect. 2.1, the reduction of the number of parameters achieved by the use of an analytical function is detailed in Sect. 2.2. The approach chosen to relate the parameters of the analytical function to known geographically-dependent quantities is then explained in Sect. 2.3. Implementation details are provided in Sect. 2.4. The results of a validation of the model are described in Sect. 3, where the model output has been compared to estimates of the total PV power generation of France and Germany provided by transmission system operators (TSOs). Finally, the results and potential improvements of the approach presented are discussed in Sect. 4.

Our approach for modelling the PV power generation in any country uses a generic PV model which needs only the distribution of the two module orientation angles as inputs. This model is introduced in Sect. 2.1 and the methodology for estimating the distribution of the module orientation angles in any location is described in Sects. 2.2 and 2.3. Finally, some implementation details are given in Sect. 2.4.
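The aggregation principle behind this generic model can be illustrated with a minimal Python sketch. It assumes that the regional output is the weighted mean of the normalized outputs of a few reference orientations. The function `toy_plant_model`, the three orientations and their weights are hypothetical placeholders chosen for illustration only; they are not the single-plant model or the reference orientations used in this work, and the azimuth convention (180° pointing south) is likewise an assumption.

```python
import math
from typing import Callable, Sequence, Tuple

Orientation = Tuple[float, float]  # (azimuth_deg, tilt_deg)


def toy_plant_model(azimuth_deg: float, tilt_deg: float) -> float:
    """Hypothetical stand-in for the single-plant model.

    Returns the cosine of the incidence angle for a fixed sun position
    (elevation 40 deg, azimuth 180 deg), clipped at zero.
    """
    sun_elevation = math.radians(40.0)
    sun_azimuth = math.radians(180.0)
    tilt = math.radians(tilt_deg)
    azimuth = math.radians(azimuth_deg)
    cos_incidence = (
        math.sin(sun_elevation) * math.cos(tilt)
        + math.cos(sun_elevation) * math.sin(tilt) * math.cos(sun_azimuth - azimuth)
    )
    return max(0.0, cos_incidence)


def regional_power(
    orientations: Sequence[Orientation],
    weights: Sequence[float],
    plant_model: Callable[[float, float], float],
) -> float:
    """Weighted mean of normalized plant outputs over reference orientations."""
    if len(orientations) != len(weights):
        raise ValueError("orientations and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0.0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(
        w * plant_model(az, tilt) for (az, tilt), w in zip(orientations, weights)
    )
    return weighted_sum / total_weight


# Illustrative (made-up) orientations and shares of installed capacity.
orientations = [(135.0, 20.0), (180.0, 30.0), (225.0, 20.0)]
weights = [0.25, 0.5, 0.25]
print(regional_power(orientations, weights, toy_plant_model))
```

Each configuration is evaluated only once, so the cost of the calculation scales with the number of reference orientations rather than with the number of plants in the region.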

The proposed method is built upon previous works by Saint-Drenan
(2015) and Saint-Drenan et al. (2017), where a model for the aggregated PV power produced by a fleet
of PV plants installed in a region is described. The authors have shown
that an accurate estimate of the German PV power generation can be obtained
by using the statistical distribution of the orientation angle of PV panels
as the sole plant-relevant input to the model (Fig. 1). The model is based
on the simple idea that the aggregated PV power generated in a region is the
sum of the normalized outputs of all plants with characteristics

A first advantage of the chosen regional PV model is that each important
configuration is considered only once and the number of configurations

Flow chart of the single PV plant model.

The function

Saint-Drenan (2015) and Saint-Drenan et al. (2017) have created a dataset of
peak power and module orientation angles for 35 000 PV plants located in
Germany, which is used here. This number of plants represents approximately
2 % of the number of plants installed in Germany. It is assumed that this
dataset is representative of all plants in Germany. A realistic example of
the relationship between

Share of the installed capacity per module orientation evaluated from the 35 000 PV plants installed in Germany (coloured squares). Black squares denote the set of 19 reference orientations used for the implementation of the regional model.

The use of Eq. (1) requires that the space spanned by

The use of the analytic form in Eq. (2) for the weights

Comparison of the experimental histograms of two module orientation
angles (blue bars) with the fitted normal distribution function (red lines)
for the module azimuth angle

The three plant-related parameters necessary to calculate the total PV power produced in a given region from meteorological data have been determined above in the specific case of Germany. How can these three parameters be extended to other countries/regions? At this stage, one possibility may consist in using one's own expertise on the characteristics of PV plants installed in the considered regions, as in Marinelli et al. (2015) and Schubert (2012); another is a detailed statistical analysis of a dataset of information on the plants installed in the studied areas. Both ways hamper the easy use of the regional model aimed at in this work. To address this issue, a parameterization of the three parameters is proposed in this section, which makes the model implementable in any region without any prior knowledge of the installed PV plants.

The statistical distribution of the plant capacity as a function of the
module orientation of a region is the result of individual choices on the
configurations of each single plant. It is affected by many factors of
different nature such as the characteristics of the solar resource, the
shading profile, architectural characteristics, different installation
practices, etc. All these factors cannot be taken into consideration and we
make the assumption that the most important one is the characteristics of
the solar resource. We propose to take this into consideration through the
use of an optimal tilt angle. The optimal tilt angle corresponds to the
value of the tilt angle of a southwards oriented module yielding the largest
annual output. In this work, we use the raster file of optimal tilt angles
available on the PV-GIS website (

It can be observed in Fig. 4 that the optimal tilt angle

The weights corresponding to the different orientation angles are finally
estimated using Eq. (2), where the mean tilt angle is taken equal to the
optimal tilt angle times the factor

Some implementation details have been intentionally omitted in the previous sections for the sake of clarity and conciseness. This section provides some important details for the implementation of our method.

For the implementation of Eq. (1), the identification of a limited number of
vectors

Our parameterization of the distribution of the module orientation is a function of the optimal tilt angle found on the PV-GIS website. This dataset is displayed in the left map of Fig. 4, where it can be observed that greater than average values are present in mountains (see e.g. regions of the Alps or the Pyrenees). These high values presumably stem from the high irradiation values present at high elevations. Since few PV plants are installed in these regions, and to avoid overestimation of the tilt angle in the regions neighbouring the mountains, these values have been filtered out. The resulting data are displayed in the right map of Fig. 4.

As already mentioned in Sect. 3, the expression given in Eq. (2) cannot be
directly used to estimate the weights

Spatial distribution of the installed PV capacity in France and
Germany for the year 2014. The installed capacity is aggregated on the pixels
used for the calculation, which have a resolution of 0.5

Comparison of the model output (blue lines) with TSO estimates (red
lines) of the PV power generated in France

Scatter plots of the TSO data against model outputs for France

The model has been assessed by comparing its outputs to the PV power generated within a country. Given that the approach is influenced by uncertainties in the input meteorological parameters, this comparison allows only an indirect evaluation of our model and not a quantification of the modelling accuracy. However, this approach offers a good balance between accuracy and versatility. The goal of this evaluation is thus to verify the plausibility of the model output for a particular model setup. Not only are the input meteorological data uncertain, but various sources of uncertainty also affect the TSO data as well as the installed capacity used by the model, both of which make it difficult to draw conclusions from the validation. To address these issues, we conduct the validation in two steps. In the first step, the validation is conducted for two countries, France and Germany, where we have long experience with both the installed capacity and the TSO data. In this first step, the impact of the uncertainty on the installed capacities and TSO estimates is under control, but the spatial extent is limited. We therefore conduct a second step, where TSO data from 16 countries are considered. Given the lack of available information on the installed PV capacity in these countries, it is assumed to be spatially and temporally constant. The actual installed capacity being unknown, the validation is made by evaluating the correlation coefficient between TSO data and model output.

The assessment is first performed for Germany and France for the year 2014. The choice of these two countries has been strongly motivated by the comparatively high level of knowledge of their electricity supply structure and the availability of the data to conduct the validation. The PV power data were provided by the TSOs themselves with a time resolution ranging from 15 min to 1 h. A visual analysis of the time series was performed to check the data. The data were aggregated into 3 h means to conform to the temporal resolution of the meteorological data. Instants with no production by PV (night time) were excluded from the comparison.
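The preprocessing described above can be sketched as follows. This is a minimal illustration rather than the processing chain actually used: it assumes that each timestamp labels the start of its measurement interval and that the 3 h windows are aligned on 00:00, 03:00, and so on, which may differ from the convention of the meteorological dataset. The function names `three_hour_means` and `drop_night` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def three_hour_means(samples: Iterable[Tuple[datetime, float]]) -> Dict[datetime, float]:
    """Average samples into 3 h windows keyed by the window start time."""
    buckets: Dict[datetime, List[float]] = defaultdict(list)
    for timestamp, value in samples:
        # Floor the timestamp to the start of its 3 h window.
        window_start = timestamp.replace(
            hour=timestamp.hour - timestamp.hour % 3,
            minute=0,
            second=0,
            microsecond=0,
        )
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


def drop_night(means: Dict[datetime, float], threshold: float = 0.0) -> Dict[datetime, float]:
    """Keep only the windows whose mean production exceeds the threshold."""
    return {start: power for start, power in means.items() if power > threshold}


# Made-up hourly values for illustration: three night hours, three day hours.
t0 = datetime(2014, 6, 1, 0, 0)
hourly = [(t0 + timedelta(hours=h), p) for h, p in enumerate([0.0, 0.0, 0.0, 1.0, 2.0, 3.0])]
daytime_means = drop_night(three_hour_means(hourly))
print(daytime_means)
```

The same function handles 15 min and 1 h inputs, since every sample is simply assigned to the window that contains it.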

The German case is used to validate the assumption made that the statistical quantities evaluated with 35 000 plants can be generalized to the ca. 1 500 000 plants installed in Germany at that time. France has a different level of PV development compared to Germany and is located at slightly different latitudes. This second case will test the validity of our approach to generalize the statistical quantities evaluated in Germany to another country with somewhat different meteorological conditions.

Gridded values of the normalized PV power were computed with the model using
the bias-adjusted ERA-Interim data proposed by the ECEM project (Jones et
al., 2017) as meteorological inputs. A bias-adjusted dataset was preferred
to the original ERA-Interim re-analysis dataset in order to limit the effect
of error in the input meteorological data on the assessment of model
performance. The bias-adjusted ERA-Interim covers the period from 1 January 1979
to 31 December 2016 and covers Europe with a spatial resolution of
0.5

Histograms of the ratio of actual plant tilt angles to the corresponding optimal value for different classes of nominal capacity (coloured lines). In the upper plot, the German case is calculated with the IWES database. The French case is displayed in the lower plot, where data from BDPV are used.

Spatial distribution of the tilt angle for plants smaller than
25 kW

By using gridded maps of the installed PV capacity in each country (Fig. 5),
the generated PV power was computed at each grid cell and then spatially
summed to yield the production for each country. The data on the installed
PV plants used for this purpose have been retrieved from the websites of the
four German TSOs (PV-DE 2014) and from a data portal of the French
government (PV-FR 2014). Finally, all time series have been normalized by
the total installed PV capacity, which is equal to 6.17 and 36.87 GW

Some efforts were made to understand the reasons for the greater bias value observed for France. During this investigation we obtained access to the content of the bdpv.fr online portal (BDPV, 2018), which contains the main information for more than 20 000 PV plants installed in France. We used this new data source to compare the characteristics of the German and French PV plants and to verify the validity of our assumption for France.

The strongest assumption made in this work is to consider that the mean tilt
angle is equal to the product of the optimal tilt angle and a constant

It would be interesting to exploit the trend observed in Fig. 8 in our model. However, information on the size of installed PV plants is missing in most European countries, so this is unfortunately impossible. Based on these new results, one can wonder whether the choice of a value of 0.7 for the ratio between actual and optimal tilt is still relevant. Given that larger plants have more weight in the calculation of the regional PV power generation than smaller plants, we consider that our estimate is not unfounded and we decide to keep this value.
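To make the role of the 0.7 ratio concrete, the following minimal sketch derives orientation weights from an optimal tilt angle. It rests on several assumptions that are not taken from this paper: the analytic form is approximated by the product of two independent normal densities, the azimuth distribution is centred on south (taken here as 180°), and the standard deviations `azimuth_std_deg` and `tilt_std_deg` are arbitrary placeholder values. Only the ratio of 0.7 between mean and optimal tilt comes from the text.

```python
import math
from typing import Dict, Sequence, Tuple

TILT_RATIO = 0.7  # ratio between mean and optimal tilt angle retained in the text


def normal_pdf(x: float, mean: float, std: float) -> float:
    """Probability density of a normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def orientation_weights(
    optimal_tilt_deg: float,
    azimuths_deg: Sequence[float],
    tilts_deg: Sequence[float],
    azimuth_mean_deg: float = 180.0,  # assumption: south-facing on average
    azimuth_std_deg: float = 30.0,    # placeholder value, not from the paper
    tilt_std_deg: float = 10.0,       # placeholder value, not from the paper
) -> Dict[Tuple[float, float], float]:
    """Normalized weights on a grid of (azimuth, tilt) reference orientations."""
    mean_tilt_deg = TILT_RATIO * optimal_tilt_deg
    raw = {
        (az, tilt): normal_pdf(az, azimuth_mean_deg, azimuth_std_deg)
        * normal_pdf(tilt, mean_tilt_deg, tilt_std_deg)
        for az in azimuths_deg
        for tilt in tilts_deg
    }
    total = sum(raw.values())
    # Renormalize so that the weights of the discrete grid sum to one.
    return {orientation: value / total for orientation, value in raw.items()}


weights = orientation_weights(
    optimal_tilt_deg=36.0,
    azimuths_deg=[90.0, 135.0, 180.0, 225.0, 270.0],
    tilts_deg=[5.0, 15.0, 25.0, 35.0, 45.0],
)
print(max(weights, key=weights.get))
```

With an optimal tilt of 36°, the mean tilt is 25.2°, so the grid point nearest to a south-facing module tilted by 25° receives the largest weight.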

In Fig. 8, it can also be observed that for PV plants with an installed
capacity smaller than 10 kW

Scatter plots of three-hourly ENTSO-E solar generation data against the corresponding model output for 16 European countries for the year 2016. The modelled PV generation has been calculated with ERA-Interim data assuming a spatially constant installed capacity.

Spatial distribution of the correlation between ENTSO-E data and model output for a three-hourly time resolution and for the year 2016.

As reported in Saint-Drenan (2015), the spatial variations of the tilt angle of small plants result from regional architectural practices. It would therefore be tempting to integrate this information into our model. However, because this information is not commonly available (not even for France), it could not be accounted for in a robust way.

Though the results of this first validation can be considered satisfactory, it is important to also demonstrate that the results for Germany and France can be extrapolated to other (European) countries with different climates, engineering practices, etc. We therefore decided to conduct an additional validation step, in which we compared the output of our model to additional TSO data. To this end, we collected time series of solar power generation on the ENTSO-E Transparency Portal for 16 countries for the year 2015 and built 3-hourly averages to make the data comparable with the model output. The model setup is the same as in the previous validation except for the installed capacity, which is not known and thus assumed spatially and temporally constant (even in France and Germany). Indeed, the information available on the installed capacity is only updated yearly and we experienced several situations where the time series of the production did not match the given installed capacity (e.g. situations with production values greater than the installed capacity).
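The effect of the constant-capacity assumption on the spatial aggregation can be sketched as follows. The function `aggregate_country` and its data layout are hypothetical; the sketch only illustrates that, when no capacity map is available, the country-level series reduces to the plain average of the gridded normalized power over the grid cells of the country (differences in grid cell area are ignored here for simplicity).

```python
from typing import List, Optional, Sequence


def aggregate_country(
    cell_power: Sequence[Sequence[float]],
    cell_capacity: Optional[Sequence[float]] = None,
) -> List[float]:
    """Capacity-weighted mean of normalized PV power over grid cells.

    cell_power[c][t] is the normalized power of grid cell c at time step t.
    If cell_capacity is None, a spatially constant capacity is assumed.
    """
    n_cells = len(cell_power)
    if n_cells == 0:
        raise ValueError("at least one grid cell is required")
    if cell_capacity is None:
        cell_capacity = [1.0] * n_cells  # spatially constant capacity
    if len(cell_capacity) != n_cells:
        raise ValueError("one capacity value per grid cell is required")
    total_capacity = sum(cell_capacity)
    if total_capacity <= 0.0:
        raise ValueError("total capacity must be positive")
    n_steps = len(cell_power[0])
    return [
        sum(cap * series[t] for cap, series in zip(cell_capacity, cell_power))
        / total_capacity
        for t in range(n_steps)
    ]


# Two grid cells, two time steps (made-up values).
gridded = [[0.0, 0.2], [0.0, 0.6]]
print(aggregate_country(gridded))              # constant capacity
print(aggregate_country(gridded, [3.0, 1.0]))  # known capacity map
```

Passing a capacity map recovers the weighting used when the installed capacity per grid cell is known, so the same routine covers both validation steps.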

The comparison of the model output with the ENTSO-E data has been conducted for 16 countries. The scatter plot of the model output against ENTSO-E data is given in Fig. 10 for each country. As mentioned before, since the installed capacity is not known, the model output has not been scaled to the actual capacity. As a result, one should not consider the absolute error values in these plots but solely the correlation between the two time series. Accordingly, only the correlation coefficient is given in Fig. 10 and discussed in the remainder of this section. To facilitate the visualisation of the results, the correlation coefficients evaluated for the different countries are displayed as a map in Fig. 11.
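The reason why the correlation coefficient remains meaningful when the installed capacity is unknown is that it is invariant to a positive rescaling of either series. The short sketch below illustrates this with a plain Pearson correlation; the numerical values are made up and the function name `pearson_correlation` is only illustrative.

```python
import math
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up normalized model output and generation data in arbitrary units.
model_output = [0.10, 0.40, 0.70, 0.30]
tso_data = [12.0, 41.0, 69.0, 33.0]

r_unscaled = pearson_correlation(model_output, tso_data)
# Rescaling the model output by any positive capacity leaves r unchanged.
r_scaled = pearson_correlation([5.0 * p for p in model_output], tso_data)
print(r_unscaled, r_scaled)
```

Error measures such as the RMSE or the bias do not share this property, which is why they are not reported for this second validation step.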

With values greater than 0.97, the correlations are particularly high in
Italy, France and Germany. These results confirm those obtained for France
and Germany in the first validation. That the best correlation (0.982) is
found for Italy is a pleasant surprise since no information on the PV
plants installed in this country was considered in the model development. As
we can see in Fig. 12, the high installed capacity in Italy (ca. 19 GW

Scatter plot of the correlation coefficients between model output and ENTSO-E data against installed PV capacity for the 16 different countries.

This paper describes an innovative approach that offers a trade-off between implementation constraints and model output accuracy that is convenient for the goals of the C3S ECEM service and that may be used in other contexts. The validation of the model against country-aggregated production of electricity by PV plants for France and Germany shows that the model is accurate enough, with an RMSE of 3–4 % of the installed capacity. In addition, the model has been further validated against solar power generation time series from 16 countries, which give correlation coefficients above 0.94 except for 4 countries (Austria, Lithuania, Netherlands, and Switzerland). The reasons for the below-average scores for these countries could unfortunately not be identified, which represents a first possible continuation of the present work. This validation revealed that the greater the installed capacity, the better the performance of our model. This finding, together with the satisfactory results of our performance analysis, confirms that the proposed model is well suited for our targeted applications. Indeed, the goal of the present work was not to make a perfect model for a single country but to propose a generalized approach that can be implemented in any (European) region without having to collect any specific information on the fleet of plants installed in that country. We believe a suboptimal performance is thus acceptable with respect to the gain in flexibility offered by the proposed approach.

Additional validation work would bring better insight into the strengths and weaknesses of the proposed methodology and identify possible improvements. In addition, data on PV production are available from TSOs in many European countries, and the validation may be performed for these countries, thus confirming or refuting the performance of the model presented here. The model may be refined with respect to its parameters using more data from various countries. A possible approach to this end may consist in estimating the probability function of the regional PV model using inversion techniques, with the optimal-tilt-angle-dependent distribution described in this paper as a first guess.

Time series of PV power generation have been calculated in the framework of
the C3S ECEM service with the proposed approach using the ECEM bias-adjusted
ERA-Interim data and future climate projections for 33 countries at a 3 h
time resolution. These model output data are freely available on the
demonstrator of this project:

The set of adjusted reanalysis data is available on ESSD
(Jones et al., 2017) and has the following DOI

The authors declare that they have no conflict of interest.

This article is part of the special issue “17th EMS Annual Meeting: European Conference for Applied Meteorology and Climatology 2017”. It is a result of the EMS Annual Meeting: European Conference for Applied Meteorology and Climatology 2017, Dublin, Ireland, 4–8 September 2017.

The authors would like to acknowledge funding for the European Climatic Energy Mixes (ECEM) service by the Copernicus Climate Change Service, a programme being implemented by the European Centre for Medium-Range Weather Forecasts (ECMWF) on behalf of the European Commission. The specific grant number is 2015/C3S_441_Lot2_UEA.

Edited by: Sven-Erik Gryning

Reviewed by: Sven Killinger, Hans Georg Beyer, and one anonymous referee