Towards a definitive historical high-resolution climate dataset for Ireland – promoting climate research in Ireland

There is strong and constant demand from various sectors (research, industry and government) for long-term, high-resolution (both temporal and spatial), gridded climate datasets. To address this demand, the Irish Centre for High-End Computing (ICHEC) has recently performed two high-resolution simulations of the Irish climate, utilising the Regional Climate Models (RCMs) COSMO-CLM5 and WRF v3.7.1. The datasets produced contain hourly outputs for an array of sub-surface, surface and atmospheric fields for the entire 36-year period 1981–2016. In this work, we list the climate variables that have been archived at ICHEC. We present preliminary uncertainty estimates (error, standard deviation, mean absolute error) based on Met Éireann station observations, for several of the more commonly used variables: 2 m temperature, 10 m wind speeds and mean sea level pressure at the hourly time scale; and precipitation at hourly and daily time scales. Additionally, analyses of 10 cm soil temperatures, CAPE 3 km, Showalter index and surface lifted index are presented.


Table ES.1. The 36 climate change indices with descriptions produced from temperature and precipitation fields

It should be expected that different models will exhibit different errors depending on the climate field in question. A priori knowledge of data errors helps end-users choose the dataset that best suits their needs. For instance, one application may require fine resolution with little regard for minor errors, whereas another may depend heavily on the accuracy of the underlying dataset.

In this work, a full list of the climate variables that have been produced by these simulations is provided. Uncertainty estimates and skill scores for several of the more in-demand variables (essential climate variables; ECVs), as determined from potential end-users, have been produced utilising a variety of observations (station, satellite, turbine and radiosonde) and are presented here. Gridded datasets that are of interest to the renewable energy sector (winds at turbine heights and solar fields) have been validated and are described.
A total of 36 climate change indices (Table ES.1) have been produced from the temperature and precipitation fields. A total of 23 climate indices that are of interest to the agricultural sector have also been produced for the period 1981-2000 and are presented. The list includes average daily temperature and temperature range; average January/July temperature values; average monthly rainfall; mean number of wet (1 mm precipitation) days; mean number of 5, 10, 15, 20, 25 and 30 mm precipitation days; mean January, July and annual relative humidity; average annual wind speed; air frost indices; 5-year return period rainfall amounts (at several timescales); and a driving rain index. Some of the more commonly used climate variables and all the climate indices have been made available online and their location is provided.
Several recommendations based on the research undertaken during the lifetime of this project are made. These include the following:
• The use of fifth generation European Centre for Medium-Range Weather Forecasts atmospheric reanalysis of the global climate (ERA5) data to extend the reanalysis/downscalings described here would result in even greater accuracy and temporal coverage.
• The validated gridded climate variables that are now available can provide the data resources for numerous new studies that will be of benefit to Ireland; for example, they can motivate an improved understanding of climate change at local levels and extreme weather events.
• MÉRA shows the smallest errors for most of the climate variables examined (hourly/daily precipitation, hourly/daily 2 m temperature, 10 m wind speed and direction, relative humidity, sea level pressure, global irradiance, upper-air wind speed and direction) and should be considered the primary source. However, the finer resolution COSMO-CLM and/or WRF data may be preferred for variables for which there is little error difference (e.g. WRF for 10 m and upper-air winds) or for which the corresponding MÉRA variable is unavailable (e.g. COSMO-CLM5 for convective available potential energy 3 km, Showalter index and surface lifted index).
• Some of the rainfall and temperature indices examined suggest changes in rainfall and temperature patterns. A more robust statistical approach (e.g. non-parametric trend analysis, further regional and/or monthly/seasonal analysis) would shed further light on the perceived changes.
• The solar fields examined suggest a strong north-south gradient with an along-coastline resource that could be exploited for future installations.

Monthly maximum 1-day and consecutive 5-day precipitation amounts

R95pTOT and R99pTOT
Percentage of annual precipitation from wet days that exceeds the 95th and 99th percentile of (wet day) precipitation in the period

Introduction
There is a constant demand from industry, research and governmental sectors for high-quality, long-term gridded climate datasets with high spatial and temporal resolution for conducting climate research. Such datasets have the potential to be utilised in a wide range of applications, including agricultural applications (Collins et al., 1996).
In Ireland, station observations have traditionally been used to describe the Irish climate and produce gridded datasets. For instance, daily and monthly gridded datasets (at 1 km resolution) of precipitation have been created for Ireland (Walsh, 2012, 2016) and are based on station data from Met Éireann's rainfall network. The identification of changes in Irish precipitation patterns, whether they are driven by natural variability or man-made climate change, is particularly important to the country, with recent projections pointing to an increased likelihood of summer droughts and winter flooding (Nolan et al., 2013a,b). Unfortunately, gridded datasets based on station observations come with numerous caveats, as detailed by Prein and Gobiet (2017): they may not be particularly representative in regions with few stations and station densities may change over time; and station data are prone to error and/or missing values, precipitation undercatch and excessive smoothing. Furthermore, some observational datasets may have inhomogeneities, caused by changes in instrumentation, location or methods of measurement over time.
Gridded observational climate datasets for a wide range of parameters are not readily available for climate research applications in Ireland.
Climatologically important variables, such as wind speed and direction, humidity and radiation, are measured at a limited number of weather stations.
The reanalysis outputs from numerical weather models represent an alternative to observations for the production of gridded datasets. These outputs provide a consistent physical analysis over time. The European Centre for Medium-Range Weather Forecasts (ECMWF) has initiated several global reanalysis datasets beginning with ERA-15 (1979-1993; 190 km resolution; Gibson et al., 1995). Model improvements over the intervening years have led to higher resolution datasets: ERA-40 (1957-2002; 125 km; Uppala et al., 2005); ERA-Interim (1979-present; 80 km; Dee et al., 2011) and, more recently, ERA5 (31 km; 1950-present, with data for 1979-present currently available from the Copernicus Climate Data Store, https://cds.climate.copernicus.eu, and the remainder to be released throughout 2020).
Today, even with the most up-to-date global climate models (GCMs), long climate simulations are computationally feasible only with horizontal grid spacing of ~30 km or coarser. Such resolutions are inadequate to simulate the detail and pattern of Ireland's climate that are required at regional and local levels; for instance, climate fields such as precipitation, wind speed and direction are strongly influenced by the local topography. Fortunately, the computational obstacles can be overcome through the application of numerical weather prediction (NWP) or regional climate models (RCMs) to achieve higher resolutions than extant global reanalysis datasets.
Regional reanalysis and dynamical downscaling are two methods often used to achieve a higher resolution (and overcome the associated computational cost). Both methods include forcing (with global reanalysis data) at the boundaries, but differ in other ways. Reanalyses simulate past weather/climate utilising a consistent NWP and data assimilation scheme. Assimilation ensures that historical observations are included so as to produce the best representation of climate at any given time. Dynamical downscaling makes use of an RCM, typically with nested domains and without data assimilation. The computational cost of running the RCM for a given (high) resolution is considerably less than that of a global model as the simulated RCM domain is considerably smaller. High-resolution RCMs demonstrate an improved ability to simulate precipitation (Kendon et al., 2012;Lucas-Picher et al., 2012) and improve the simulation of topography-influenced phenomena and extremes with a relatively small spatial or short temporal character (Flato et al., 2013). An additional advantage is that the physically based RCMs explicitly resolve mesoscale atmospheric features and may provide a better representation of convective (Rauscher et al., 2010) (2015). Although there are numerous high-resolution regional reanalysis datasets available, until recently (2017) none has covered Ireland at spatial resolutions higher than 6 km.
Prior to the current research, two high-resolution Irish climate datasets that cover the period 1981-2016 were produced by researchers at the Irish Centre for High-End Computing (ICHEC). The datasets were created by downscaling ERA-Interim data using the Weather Research and Forecasting (WRF) model v3.7.1 (Skamarock et al., 2008) and COSMO-CLM5 (Rockel et al., 2008). These RCMs were run at 2 km and 1.5 km spatial resolution, respectively, with two additional 6 km and 18 km simulations run for both models. The data output from each model was archived at 1-hour intervals.
In addition, in 2017 the Irish meteorological service, Met Éireann, completed a 36-year reanalysis (MÉRA) at 2.5 km resolution for the period 1981-2016 (Gleeson et al., 2017). Although the MÉRA resolution is coarser than those of the two ICHEC simulations, it has the advantage of data assimilation that utilised time series of surface observations (Whelan et al., 2016). As such, the MÉRA dataset is expected to show better skill than the ICHEC datasets for assimilated surface variables (e.g. temperature, pressure, 10 m wind speeds). The MÉRA datasets, which are stored as a series of 3-hour and 33-hour forecasts, are archived by Met Éireann at 1-hour intervals.
The overall aim of this research project was the production and dissemination of validated, long-term, high-resolution (spatial and temporal) gridded datasets of climate variables and derived products that are of use to researchers, planners and policymakers from various diverse sectors (e.g. climate science, renewable energy and agriculture).
Descriptions of the ICHEC model set-ups are provided in Chapter 2 (the MÉRA set-up has been described elsewhere, e.g. Gleeson et al., 2017), as well as a description of the model outputs that are available. In Chapter 3, descriptions of the data validations conducted for several commonly used climate parameters (e.g. precipitation, 2 m temperature, sea level pressure, relative humidity, 10 m winds) and other more "exotic" climate parameters [e.g. convective available potential energy (CAPE) 3 km] are provided. This is followed in Chapter 4 by a description of validated wind and solar radiation fields, for use in renewable energy applications. In Chapter 5, we give a detailed description of 36 (19 temperature and 17 precipitation) gridded climate change indices that were produced for each of the models (temperature: COSMO-CLM and MÉRA only). In Chapter 6, we describe a selection of products (24 in total) that are of interest to the agricultural sector and that are based on the outputs of MÉRA. In Chapter 7, we give details of the data and their accessibility. Finally, recommendations for future work are given in Chapter 8.

Methods and Models
Both the ICHEC WRF and COSMO-CLM RCM simulations were performed utilising nested domains with 18 km, 6 km and 2 km (WRF) or 18 km, 6 km and 1.5 km (COSMO-CLM) resolutions. Figure 2.1 illustrates the spatial coverage and topography of the three COSMO-CLM and WRF domains. The 18 km simulations were driven at the boundaries by ERA-Interim reanalysis data, produced by ECMWF at 80 km resolution, with all outputs from each individual nested domain (for both COSMO-CLM and WRF) archived at hourly intervals.
The WRF model used (v3.7.1) provides topography data at four resolutions (approximately 11, 6, 2 and 0.6 km at Irish latitudes) that can be used to construct terrain data for the model grid. Given that some climate variables (e.g. winds) are affected by nearby

Data Validations
The three new datasets discussed here contain many climate variables for which observations have previously been unavailable, either at such temporal/spatial coverage and/or resolution or, indeed, at all. Given the datasets' novelty and potential, it is important that some indicator of quality be attached.
To this end, uncertainty estimates [bias, absolute error, standard deviation (STD) and root mean square error (RMSE)] and several skill scores (where appropriate) were calculated for average annual and daily precipitation and 2 m temperature utilising gridded datasets of observations made available by Met Éireann and the UK Met Office. The results of these analyses are presented in section 3.1. Uncertainty estimates for hourly precipitation, 2 m temperature, 10 m winds, relative humidity and mean sea level pressure were calculated utilising station observations and are presented in section 3.2.
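The uncertainty estimates named above can be illustrated with a minimal sketch. It assumes paired model and observed values with `None` marking missing data, and it takes STD to be the standard deviation of the model-minus-observation error; the function name and the sample values are hypothetical and are not taken from the project scripts.

```python
import math

def uncertainty_estimates(model, observed):
    """Return bias, MAE, STD of the error and RMSE for paired values."""
    # Keep only pairs where both values are present (None marks missing data).
    errors = [m - o for m, o in zip(model, observed)
              if m is not None and o is not None]
    if not errors:
        raise ValueError("no valid model/observation pairs")
    n = len(errors)
    bias = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    # Population standard deviation of the error about its mean (the bias).
    std = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {"bias": bias, "mae": mae, "std": std, "rmse": rmse}

# Hypothetical values; the third pair is skipped because the model value is missing.
stats = uncertainty_estimates([10.0, 12.0, None, 9.0], [9.0, 12.5, 11.0, 10.0])
```

With these definitions the squared RMSE equals the squared bias plus the squared STD, which gives a quick consistency check on any table of results.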

Annual and Daily Precipitation and 2 m Temperature
Gridded datasets of observed daily temperature and accumulated precipitation at 1 km resolution covering Ireland (Walsh, 2012, 2016, 2017) for the period 1981-2015 were obtained from Met Éireann. UK Met Office accumulated precipitation datasets with equal resolutions and time ranges were acquired for Northern Ireland from the Centre for Ecology and Hydrology (Tanguy et al., 2016). Since they are based on observations, the 1 km daily precipitation fields are the most authoritative source of information at daily (and longer) timescales and provide a strong benchmark against which the model data can be measured. However, they do not provide sub-daily information, which can come only from a limited number of stations or from models. The gridded datasets are available in monthly comma-separated values (CSV) files and require a level of processing before they can be used in any later analyses: the precipitation files contain some erroneous negative values that must be masked; easting and northing co-ordinates must be transformed to longitude-latitude pairs; and the gridded datasets are at 1 km resolution, whereas the COSMO, WRF and MÉRA datasets are at

To prevent the possibility of double-counting, any observed value that is used in a calculation is then marked as "missing". This routine has been applied to the gridded observations for each of the three model grids.
Daily and (subsequently) annual records of 2 m temperature and precipitation were built from each of the three models. For COSMO and WRF, each daily record is simply an aggregation of hourly values over the given time range: 00:00-00:00 UTC (Coordinated Universal Time) for temperature and 09:00-09:00 UTC for precipitation, the latter to match climate station observational practice. For MÉRA, daily records of precipitation were built utilising 33-hour forecast files (thereby avoiding any negative impact from errors linked to model spin-up through use of the 3-hour forecast files) and consecutive subtractions of the 09:00 UTC forecast from the 33-hour forecast for each day. For 2 m temperature, hourly values were first obtained from the 3-hour files, followed by daily averaging. Annual records are then easily obtained through summation (precipitation) and averaging (2 m temperature) and comparisons with gridded observations are made.
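As an illustration of the 09:00-09:00 UTC accumulation used for COSMO and WRF precipitation, the following is a minimal sketch. It assumes that each hourly value is keyed by the start of its accumulation hour and that a daily total is labelled with the date on which the window opens; both conventions, and the function name, are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_precip_09utc(hourly):
    """Sum hourly precipitation (mm) into 09:00-09:00 UTC rainfall days.

    `hourly` maps a UTC datetime (start of the hour, by assumption) to mm.
    Only days with all 24 hourly values present are returned.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for start, amount in hourly.items():
        # Shifting back 9 hours maps 09:00 on day D .. 08:00 on day D+1 to day D.
        rain_day = (start - timedelta(hours=9)).date()
        totals[rain_day] += amount
        counts[rain_day] += 1
    return {day: totals[day] for day in sorted(totals) if counts[day] == 24}

# Hypothetical series: 1 mm in every hour of 1 and 2 January 2000.
t0 = datetime(2000, 1, 1, 0)
hourly = {t0 + timedelta(hours=h): 1.0 for h in range(48)}
daily = daily_precip_09utc(hourly)
```

Only the rainfall day opening at 09:00 on 1 January is complete in this example, so it is the only day returned; incomplete days at the ends of a series are dropped rather than reported as low totals.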

Uncertainty estimates
The comparison for average annual 2 m temperature is shown in Figure 3.1. All three models capture the spatial distribution seen in the observations (cooler temperatures in the north and over the mountains), with WRF performing less well than either MÉRA or COSMO. This is clarified in Table 3

The comparison for average annual precipitation is shown in Figure 3.2. Again, all three models capture the spatial distribution seen in the observations: higher rainfall amounts in the west and over the mountains. In Table 3.2, WRF is shown to have the lowest MAE and STD (7.68% and 7.3%, respectively). Corresponding values for MÉRA and COSMO are 11.58% and 12.55%, respectively, and 10.02% and 12.41%, respectively.
A clearer picture emerges as the timescale for comparison is reduced from annual to daily. The results for daily 2 m temperature are given in

As a first step in this analysis, several representative thresholds were chosen. For precipitation, zeros were filtered out of the data and the thresholds chosen were 0.1, 1, 5, 10, 20, 30 and 50 mm. For temperature, the thresholds chosen were −5, 0, 5, 10, 15 and 20°C. Appropriate contingency tables of the form shown in Table 3.5 were then calculated for each threshold to facilitate further analysis.

Figure 3.3 shows the 24-hour precipitation frequency distribution for the gridded observations and each of the three models. The values are given as percentages, as the differing model grid resolutions mean different total numbers of observations and tend to distort the visual comparison. Additionally, the observed values were taken directly from the original data files, after removal of negatives and transformation of co-ordinates but before grid spacing was dealt with. This was done purely for visualisation purposes. From Figure 3.3, it is clear that the three models capture the overall trend of observed rainfall quite well. A similar trend (not shown) was found for 2 m temperature.
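The construction of a contingency table for one threshold can be sketched as follows. Table 3.5 is not reproduced here, so the layout is an assumption: a counts hits, b false alarms, c misses and d correct negatives, with a "yes" event defined as a value at or above the threshold. The function name and sample values are hypothetical.

```python
def contingency_table(model, observed, threshold):
    """Count hits (a), false alarms (b), misses (c) and correct negatives (d)."""
    a = b = c = d = 0
    for m, o in zip(model, observed):
        if m is None or o is None:
            continue  # skip pairs with missing data
        model_yes = m >= threshold
        obs_yes = o >= threshold
        if model_yes and obs_yes:
            a += 1
        elif model_yes:
            b += 1
        elif obs_yes:
            c += 1
        else:
            d += 1
    return a, b, c, d

# Hypothetical daily precipitation values (mm) evaluated at the 5 mm threshold.
model_mm = [0.2, 6.0, 0.5, 12.0, 3.0]
observed_mm = [7.0, 1.0, 0.3, 15.0, 5.5]
a, b, c, d = contingency_table(model_mm, observed_mm, threshold=5.0)
```

The total T used by the scores is then simply a + b + c + d.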

Accuracy
The accuracy score was calculated using the formula:

$$\mathrm{Accuracy} = \frac{a + d}{T}, \tag{3.1}$$

where a, d and T are as described in Table 3.5. This equation answers the simple question of what fraction of the model forecasts were correct. The results for each of the three models are shown in Table 3.6 (2 m temperature), Table 3.7 (Ireland 24-hour precipitation) and Table 3.8 (Northern Ireland 24-hour precipitation). For 2 m temperature, MÉRA has better scores than

Frequency bias
The frequency bias was calculated using the formula:

$$\mathrm{Frequency\ bias} = \frac{a + b}{a + c}, \tag{3.2}$$

where a, b and c are as described in Table 3.5. The frequency bias answers the question of how the model frequency of "yes" events compares with the observed frequency of "yes" events, with 1 being a perfect score. As shown in Table 3.6 (2 m temperature), all three models tend to underpredict at most thresholds (apart from MÉRA and COSMO at high thresholds), with MÉRA outperforming both WRF and COSMO at lower (0°C and 5°C) and higher (20°C) thresholds. At moderate thresholds (10°C and 15°C) COSMO is the better performer. In Table 3.7 (Ireland precipitation) and Table 3.8 (Northern Ireland precipitation), for thresholds below 5 mm the models perform similarly, tending to slightly underpredict. Above this threshold, each model overpredicts, with MÉRA outperforming WRF, which in turn outperforms COSMO.

Hit rate
The hit rate (HR) was calculated using the formula:

$$\mathrm{HR} = \frac{a}{a + c}, \tag{3.3}$$

where a and c are as described in Table 3.5. The HR answers the question of what fraction of the observed "yes" events are correctly predicted by the models, with 1 being a perfect score. For 2 m temperature (Table 3.6) all three models display a decreasing ability with increasing threshold, with MÉRA being the best performer at lower thresholds (0°C, 5°C and 10°C) and COSMO outperforming both WRF and MÉRA at higher thresholds (15°C and 20°C). For 24-hour precipitation [Table 3.7 (Ireland) and

False alarm rate
The false alarm rate (FAR) was calculated using the formula:

$$\mathrm{FAR} = \frac{b}{b + d}, \tag{3.4}$$

where b and d are as described in Table 3.5. The FAR answers the question of what fraction of the observed "no" events are incorrectly predicted by the models as a "yes" event, with 0 being a perfect score. For 2 m temperature (Table 3.6) all three models display an increasing ability with increasing threshold, with MÉRA being the best performer at the lowest (−5°C) and higher (10°C, 15°C and 20°C) thresholds. For the remaining (0°C and 5°C) thresholds, COSMO is marginally best. For 24-hour precipitation [Table 3.7 (Ireland) and

Hanssen-Kuiper skill score
The Hanssen-Kuiper skill score (KSS) (Dahlgreen et al., 2016) is simply the difference between the HR (equation 3.3) and the FAR (equation 3.4). It is used to indicate how well each model separated the "yes" events from the "no" events, with 0 indicating no skill and 1 being a perfect score. For 2 m temperature (

Equitable threat score
The equitable threat score (ETS) (Gandin and Murphy, 1992) was calculated using equations 3.5 and 3.6:

$$\mathrm{ETS} = \frac{a - a^{*}}{a + b + c - a^{*}}, \tag{3.5}$$

where

$$a^{*} = \frac{(a + b)(a + c)}{T}, \tag{3.6}$$

and a, b, c and T are as described in Table 3.5. The ETS measures how well the modelled "yes" predictions correspond to the observed "yes" events when adjusted for hits due to random chance (a*). Here, 0 indicates no skill and 1 indicates perfect skill. For 2 m temperature (Table 3.6) each model displays the highest skill at moderate thresholds (5°C, 10°C and 15°C) and decreased skill at lower (−5°C, 0°C) and the highest (20°C) thresholds. At all thresholds MÉRA is the best performer of the three models, with COSMO the second-best performer at moderate thresholds (WRF otherwise). For 24-hour precipitation (Tables 3.7 and 3.8) the trend is the same as that seen for the KSS: values are highest at moderate thresholds, with MÉRA consistently outperforming WRF, which, in turn, outperforms COSMO.
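The categorical scores discussed in this chapter can be computed together from a single contingency table. The sketch below assumes the standard 2 x 2 definitions (a hits, b false alarms, c misses, d correct negatives, T = a + b + c + d); the function name and the counts are hypothetical, and no guard is included for empty categories, which would cause a division by zero.

```python
def skill_scores(a, b, c, d):
    """Categorical skill scores from a 2 x 2 contingency table."""
    total = a + b + c + d
    hit_rate = a / (a + c)
    false_alarm_rate = b / (b + d)
    # Number of hits expected from random chance alone (a* in the text).
    chance_hits = (a + b) * (a + c) / total
    return {
        "accuracy": (a + d) / total,
        "frequency_bias": (a + b) / (a + c),
        "hit_rate": hit_rate,
        "false_alarm_rate": false_alarm_rate,
        "kss": hit_rate - false_alarm_rate,
        "ets": (a - chance_hits) / (a + b + c - chance_hits),
    }

# Hypothetical counts for one threshold.
scores = skill_scores(a=50, b=10, c=20, d=120)
```

In this example the frequency bias is below 1, so the hypothetical model underpredicts the number of "yes" events, while the positive KSS and ETS indicate skill beyond chance.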

Fractions skill score
An early question that needed to be addressed was what, if any, added value an increase in spatial resolution brings. The fractions skill score (FSS) (Roberts and Lean, 2008) provides a measure of forecast skill against spatial scale for selected thresholds (equation 3.12). A script to implement the FSS method was written and run for several thresholds (50th, 75th and 95th percentiles) for each of COSMO 1.5 km, WRF 2 km and MÉRA 2.5 km for neighbourhoods of length 6 km and 18 km. Example results for the three high-resolution precipitation datasets (95th percentile, 18 km length scale) are given in Table 3.9 (columns 2-4), alongside results from the COSMO and WRF 18 km datasets (columns 5 and 6) only (there are no lower resolution MÉRA datasets). The patterns in Table 3.9 (for high-resolution datasets MÉRA marginally outperforms WRF and WRF outperforms COSMO, while for lower resolution datasets the skill scores are lower) were seen for all thresholds and length scales tested.
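For readers unfamiliar with the method, the following is a minimal sketch of the FSS in its commonly stated form: both fields are converted to binary exceedance fields, the fraction of "yes" cells in each n x n neighbourhood is computed, and FSS = 1 − MSE/MSE_ref. It is not the project script. It assumes an odd neighbourhood width in grid cells, zero padding at the domain edge and an absolute threshold (the report uses percentile thresholds); all names and the test fields are hypothetical.

```python
def neighbourhood_fractions(binary, n):
    """Fraction of 'yes' cells in the n x n window centred on each cell."""
    rows, cols = len(binary), len(binary[0])
    half = n // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            count = 0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    y, x = i + di, j + dj
                    # Cells outside the domain count as 'no' (zero padding).
                    if 0 <= y < rows and 0 <= x < cols:
                        count += binary[y][x]
            out[i][j] = count / (n * n)
    return out

def fractions_skill_score(model, observed, threshold, n):
    """FSS for one threshold and one (odd) neighbourhood width n."""
    m_bin = [[1 if v >= threshold else 0 for v in row] for row in model]
    o_bin = [[1 if v >= threshold else 0 for v in row] for row in observed]
    m_frac = neighbourhood_fractions(m_bin, n)
    o_frac = neighbourhood_fractions(o_bin, n)
    pairs = [(mf, of) for mr, orow in zip(m_frac, o_frac)
             for mf, of in zip(mr, orow)]
    mse = sum((mf - of) ** 2 for mf, of in pairs)
    mse_ref = sum(mf ** 2 + of ** 2 for mf, of in pairs)
    # The 1/(Nx*Ny) normalisation cancels in the ratio.
    return float("nan") if mse_ref == 0 else 1.0 - mse / mse_ref

# Hypothetical 5 x 5 fields: one wet cell, displaced by one cell in the model.
obs = [[0.0] * 5 for _ in range(5)]
mod = [[0.0] * 5 for _ in range(5)]
obs[2][2] = 10.0
mod[2][3] = 10.0
fss_1 = fractions_skill_score(mod, obs, threshold=5.0, n=1)
fss_3 = fractions_skill_score(mod, obs, threshold=5.0, n=3)
```

The displaced wet cell scores no skill at the grid scale but recovers skill once the neighbourhood is widened, which is the behaviour the FSS is designed to expose when comparing resolutions.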

Hourly Estimates
Hourly synoptic station observations of 2 m temperature, precipitation, 10 m winds, relative humidity and mean sea level pressure were obtained from Met Éireann. There are 25 stations in total, with varying record lengths available. Each of the three model datasets was preprocessed so that a comparison with these observations could be made.

2 m temperature (°C)
A summary of the results from the analysis of 2 m temperature is given in Figure 3

Precipitation (mm)
For hourly precipitation amounts, COSMO-CLM and WRF show remarkably similar errors (Figure 3.5) and STDs (Figure 3.6): overall error values are less than 0.01 mm and overall STDs are 0.63 mm for both models. Additionally, the MAEs are 0.18 mm for both models. By comparison, the analysis performed for hourly MÉRA precipitation gave the following values: < 0.01 mm (error), 0.55 mm (STD) and 0.16 mm (MAE).

10 m winds (m s⁻¹)
An analysis of 10 m wind speed and direction was performed utilising hourly station data obtained from Met Éireann. In total, there are 23 stations where such data are recorded. The data are provided in CSV format and a latitude-longitude location is provided with each file. A degree of caution is required when handling these datasets: unfortunately, they are not always continuous (i.e. some data are missing) or of equal duration, which can cause erroneous results if not accounted for. Octave scripts were developed to correctly process these datasets and produce hourly time series at each station location that can then be compared with the three model outputs. The three model datasets contain hourly 10 m U (eastward) and V (northward) wind components for the period 1981-2015. A script that utilises Climate Data Operators (CDO) commands has been developed to extract 10 m wind speeds and direction based on the following simple formulae: speed = sqrt(U^2 + V^2) and direction = 180 + (180/pi)*atan2(U, V). This CDO-based script first interpolates both (U, V) wind components to each station location for comparison with observations. Another Octave script was developed to correctly match each fixed-length, continuous model time series with the relevant variable-length, non-continuous observed time series. This script also produces a wind rose at each location for the observations and each model. An example from Casement Aerodrome, Dublin, for the period from 1 January 1987 to 31 December 2015 is shown in Figure 3.7, in which each of COSMO, WRF and MÉRA have captured the fundamental characteristics of the observed wind profile. The overall bias, MAE, STD and RMSE were calculated at each location. The wind speed results are given in Table 3.10.
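The two formulae quoted above translate directly into code. The project used CDO and Octave; the Python sketch below is only an illustration, and the final modulo operation (which folds 360° back to 0°) is an addition made here for convenience rather than part of the quoted formula.

```python
import math

def wind_speed_direction(u, v):
    """Speed (same units as u, v) and meteorological direction in degrees.

    u is the eastward and v the northward component. The direction is the
    bearing the wind blows FROM, measured clockwise from north.
    """
    speed = math.hypot(u, v)  # sqrt(u^2 + v^2)
    direction = (180.0 + math.degrees(math.atan2(u, v))) % 360.0
    return speed, direction

# A wind blowing towards the east (u > 0, v = 0) is a westerly: 270 degrees.
westerly = wind_speed_direction(10.0, 0.0)
# A wind blowing towards the west is an easterly: 90 degrees.
easterly = wind_speed_direction(-10.0, 0.0)
# A wind blowing towards the north is a southerly: 180 degrees.
southerly = wind_speed_direction(0.0, 10.0)
```

Note the argument order atan2(U, V), with the eastward component first, which measures the angle clockwise from north rather than anticlockwise from east.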

Relative humidity (%)
Relative humidity bias, MAE, percentage error, absolute percentage error, correlation and associated STDs were calculated at each station location for the three models. The overall results of this analysis, in which all station data were amalgamated to form one single dataset, are given in Table 3.11.

Sea level pressure (hPa)
Hourly sea level pressure bias, MAE, percentage error, absolute percentage error, correlation and associated STDs were calculated at each station location for each model. The results of this analysis, in which all station data were amalgamated to form one single dataset, are given in Table 3.12.

Upper-air Winds
There is great interest in long-term time series of wind speed and direction, particularly at the 80 m level, the standard turbine height. Numerous attempts to obtain observational data were made, with varying degrees of success. Wind speed and direction data were obtained from 10 and 7 Midlands turbines, respectively (section 4.1.1), and radiosonde data were obtained from two locations (section 4.1.2).

Turbine data
The observational data files required a degree of processing before they could be used for model validation: the values are comma separated but with many different data elements and durations; there are various conventions for handling errors and missing values; the data are 10-minute resolution averages, maximums and minimums for wind speed (10 series) and vector averages for wind direction (seven series); and the data are from numerous (non-modelled) heights: 48, 68, 69, 70, 80 and 80.5 m. Octave scripts were written to process the observational data into 1-hour resolution time series, with missing/error values being flagged for future masking. Additional scripts were written that generate time series of appropriate duration from the three models. The model U and V components, recorded at various heights (e.g. 20, 40, 60, 80 m, …), were first interpolated to turbine heights using a simple log power law in which h is height, R is the roughness length and the subscripts old/new denote known/interpolated values. Once the new velocity components are found, they are interpolated to the latitude/longitude of the turbine and wind speeds and directions are then calculated. The results of the subsequent turbine height wind speed and direction validations are given in Tables 4.1 and 4.2, respectively. At each location, MÉRA consistently showed the lowest mean MAE and STD, with WRF outperforming COSMO.
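The exact interpolation formula is not reproduced in this text. As an illustration only, the sketch below assumes the standard logarithmic wind profile, in which a component known at height h_old is scaled by ln(h_new/R)/ln(h_old/R); the function name, the roughness length and the sample values are hypothetical.

```python
import math

def log_law_interpolate(value_old, h_old, h_new, roughness):
    """Scale a wind component from h_old to h_new (metres) with the log law.

    Assumes the neutral logarithmic wind profile; `roughness` is the
    roughness length R in metres and must be smaller than both heights.
    """
    if roughness <= 0 or min(h_old, h_new) <= roughness:
        raise ValueError("heights must exceed a positive roughness length")
    return value_old * math.log(h_new / roughness) / math.log(h_old / roughness)

# Hypothetical case: U = 6 m/s on the 60 m model level, moved to a 69 m mast.
u_69 = log_law_interpolate(6.0, h_old=60.0, h_new=69.0, roughness=0.1)
```

Applying the same factor to both U and V changes the speed but leaves the direction unchanged, which is consistent with interpolating the components before computing speed and direction.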

Radiosonde data
Radiosonde data were acquired from two locations: Castor Bay, County Down, and Valentia, County Kerry. The data files contain numerous upper-air measurements, including wind speed and direction, from ground level up to approximately 25 km and were processed for comparison with the three models. An example of wind speed data processed for comparison with COSMO-CLM at Castor Bay is shown in Figures 4.1 and 4.2. The uncertainty estimates found for wind speed are given in Tables 4.3 and 4.4, while those for wind direction are given in Tables 4.5 and 4.6. The results of other, "more exotic" parameter

Satellite data
Satellite-based radiation data were obtained from two different sources, the Photovoltaic Geographical Information System (PVGIS) and the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT). The data required several processing steps before use: the satellite datasets are on much larger domains and at different resolutions (0.05°) from the three models; the MÉRA 3-hour forecasts contain 1-, 2- and 3-hour accumulations; and the COSMO data are split into averaged direct and averaged diffuse surface downward short-wave radiation. Several CDO-based scripts were developed to prepare both the observational and the model data for error analysis. Results from the PVGIS analysis are shown in Figure 4.3; the WRF data display greater errors than either the COSMO or the MÉRA data. The PVGIS results (for all three models) and EUMETSAT results (for MÉRA and COSMO only) are given in

Station data
Yearly global irradiance totals were derived from 18 Met Éireann daily station observations and checked against the satellite observations (Figure 4.4). The length and range of each observation time series are station dependent but typically cover multiple years over the period 1987-2015. However, error values (not shown) are similar to those found during the satellite data analysis.

A high level of care is required when calculating these indices for "in-base" model data (i.e. model data from 1981 to 1990). The reasoning, reference period data and methods applied to overcome this problem are discussed below for the calculation of the percentage of days when the minimum temperature is below the 10th percentile (TN10p).

Annual Temperature Indices
These indices require yearly comparison with various temperature thresholds and include the number of frost days, icing days, summer days and tropical nights and the length (and start) of the growing season. The indices were calculated as follows.

Number of frost days
This index is defined as the annual count of days when the daily minimum temperature (TN) is less than 0°C. More precisely, for each year j in the period 1981-2017, the minimum temperature $TN_{ij}$ is found for each day i. The number of days when $TN_{ij}$ < 0°C is then counted. This index was calculated using the CDO command eca_fd (i.e. "cdo eca_fd infile outfile"), where infile is an annual file containing TN at each model grid point and outfile is the number of frost days per grid point. Some example results using COSMO-CLM 1.5 km and MÉRA temperature data (for the years 1981, 1991, 2001 and 2011)

Number of icing days
The number of icing days is defined as the annual count of days when the daily maximum temperature (TX) is less than 0°C. For each year j in the period 1981-2017, the maximum temperature $TX_{ij}$ is found for each day i. The number of days when $TX_{ij}$ < 0°C is then counted. This index was calculated using the CDO command eca_id. Some example results using COSMO-CLM 1.5 km and MÉRA data (for the years 1981, 1991, 2001 and 2011) are given in Figures 5.4 and 5.5, respectively. Spatially averaged values over the period 1981-2017 are shown in Figure 5.6 for both COSMO-CLM and MÉRA. Since Ireland's climate is relatively mild, there are few non-zero data points on which to base a regression. Consequently, the regression slopes would be highly susceptible to outliers and have not been calculated here.

Number of summer days and tropical nights
The number of summer days is the annual count of days when TX is greater than 25°C, while the number of tropical nights is the annual count of days when TN is greater than 20°C. These indices have been calculated for the period 1981-2017 using the CDO commands eca_su and eca_tr, respectively. Given Ireland's relatively mild climate, it can be expected that these indices are low in value, particularly the number of tropical nights. Figures 5.7 and 5.8 show some example outputs for number of summer days for individual years using COSMO-CLM and MÉRA data, respectively, while Figure 5.9 shows the spatially averaged number of summer days per year over the entire 1981-2017 period for COSMO-CLM and MÉRA. Figure 5.10 shows the equivalent data for the number of tropical nights. As expected, the values for each index are low, with no immediate trend recognisable from the data.
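The four threshold-based counts defined in this section (frost days, icing days, summer days and tropical nights) can be illustrated for a single grid point as follows. This is a minimal sketch, not the CDO implementation: the function names are hypothetical, the input values are made up, temperatures are assumed to be in °C and missing days are assumed to be stored as None.

```python
def count_days(values, predicate):
    """Count the days whose (non-missing) value satisfies the predicate."""
    return sum(1 for v in values if v is not None and predicate(v))

def annual_temperature_counts(tn, tx):
    """Threshold indices for one year of daily minimum (tn) and maximum (tx)."""
    return {
        "frost_days": count_days(tn, lambda t: t < 0.0),       # TN < 0 °C
        "icing_days": count_days(tx, lambda t: t < 0.0),       # TX < 0 °C
        "summer_days": count_days(tx, lambda t: t > 25.0),     # TX > 25 °C
        "tropical_nights": count_days(tn, lambda t: t > 20.0), # TN > 20 °C
    }

# Six made-up days standing in for a full year of daily data.
tn = [-2.0, -0.5, 1.0, 12.0, 21.0, 8.0]
tx = [-1.0, 3.0, 6.0, 26.5, 28.0, 15.0]
counts = annual_temperature_counts(tn, tx)
print(counts)
```

In the gridded case the same comparison is applied independently at every grid point, which is what the eca_fd, eca_id, eca_su and eca_tr operators do for a whole file.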

Growing season length
For the northern hemisphere, the growing season length (GSL) is defined as the annual count of days between the first span of 6 consecutive days with a daily mean temperature (TG) greater than 5°C and the first span (after 1 July) of 6 consecutive days with a TG of less than 5°C. The CDO command eca_gsl was used to output the growing season start (GSS) and GSL for each year. Example spatial outputs of GSS and GSL for the years 1981 and 2017 are given in Figures 5.11 and 5.12 for COSMO-CLM and MÉRA, respectively. The models show similar results for GSL, with noticeably increasing values towards the south-west and an almost year-round growing season. The spatially averaged annual GSL for the period 1981-2017 is shown in Figure 5.13 for COSMO-CLM and MÉRA. Linear regressions (y = a + bx) for both models are also shown. Both models suggest an increasing GSL, with b = 0.43 (95% CI −0.19 to 1.05) for COSMO-CLM and b = 0.62 (95% CI −0.03 to 1.27) for MÉRA, although both confidence intervals include zero, so neither trend is statistically significant at the 95% level. Linear regressions for GSS were also tested (not shown). As expected, both regressions have a        Figure 5.18.
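The GSL definition given above can be sketched for a single grid point as follows. The edge-case conventions are assumptions and may differ from those of eca_gsl: the season is taken to start on the first day of the first warm span, to end on the first day of the first cold span found on or after 1 July, and to run to the end of the year if no such cold span exists. The function names and the synthetic temperature series are hypothetical.

```python
def first_run(flags, length, start=0):
    """Index of the first day of the first run of `length` True values
    found at or after index `start`, or None if there is no such run."""
    run = 0
    for i in range(start, len(flags)):
        run = run + 1 if flags[i] else 0
        if run == length:
            return i - length + 1
    return None

def growing_season(tg, july1_index=181, threshold=5.0, span=6):
    """Return (GSS as a zero-based day index, GSL in days) for one year."""
    warm = [t > threshold for t in tg]
    cold = [t < threshold for t in tg]
    start = first_run(warm, span)
    if start is None:
        return None, 0                      # no growing season this year
    end = first_run(cold, span, start=max(july1_index, start))
    if end is None:
        end = len(tg)                       # season runs to the end of the year
    return start, end - start

# Synthetic non-leap year: cold until day 60, mild until day 300, then cold.
tg = [3.0] * 60 + [10.0] * 240 + [2.0] * 65
gss, gsl = growing_season(tg)
print(gss, gsl)
```

The july1_index default of 181 corresponds to 1 July in a non-leap year with zero-based day numbering; a leap-year-aware version would need to compute it from the calendar.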

Base Period (1961-1990) Temperature Indices
Several climate change indices that require comparison with the base/reference period (1961-1990) were calculated. Gridded observations of TG, TX and TN at 1 km spatial resolution were previously acquired from Met Éireann (during the first 6 months of this project) and have been utilised here to construct the base period. For the indices calculated here, a potential problem can arise as a result of inhomogeneity across the "in-base" (1981-1990) and "out-of-base"

Percentage of days when TN < 10th percentile
The daily TN for each day i over a period j was calculated; here, j is a calendar year from the period 1981-2017. These $TN_{ij}$ values were then compared with $TN_{in}10$, which is the calendar day 10th percentile centred on a 5-day window for the base period. The percentage of days when $TN_{ij}$ < $TN_{in}10$ was then calculated. For those years that are "in-base" years (1981-1990), the following algorithm (Zhang et al., 2005) was implemented:
1. The period 1961-1990 was divided into a single-year "in-base" dataset that contains the year for which the exceedance was to be calculated and a "29-year base period" from which the percentiles were to be calculated.
2. A selection of 29 new "temporary 30-year base periods" were constructed from the "29-year base period" by iteratively making a copy of one individual year within the "29-year base period".
3. Twenty-nine different TN10p values were then calculated by comparison of the "in-base" year from step 1 with each of the new "temporary 30-year base periods" constructed in step 2.
4. An average was then taken across all 29 TN10p values to form the TN10p value for this "in-base" year.
The procedure outlined above was then repeated for each of the years that are "in base", i.e. 1981-1990, generating a large volume of data. The processing was achieved through a combination of bash, CDO (e.g. the eca_tn10p command) and task farm scripts. A summary of the (annual, spatially averaged) results is given in Figure 5.19, where a potential downward trend was quantified by simple linear fits with slope b = −0.064 (95% CI −0.13 to 0) for COSMO-CLM and b = −0.079 (95% CI −0.17 to 0.01) for MÉRA. This downward trend suggests warming, as fewer daily minimum temperatures are below the 1961-1990 threshold.
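Steps 1 to 4 can be sketched for a single grid point as follows. This is an illustration under stated assumptions rather than the project's bash/CDO workflow: the percentile estimator (linear interpolation between order statistics), the wrap-around of the 5-day window at the ends of the year and all function names are hypothetical choices, and every year is assumed to have the same number of days.

```python
def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def calendar_day_percentile(base_years, q, window=5):
    """Per-calendar-day percentile pooled over a centred window and all years."""
    ndays = len(base_years[0])
    half = window // 2
    return [
        percentile([yr[(d + k) % ndays] for yr in base_years
                    for k in range(-half, half + 1)], q)
        for d in range(ndays)
    ]

def pct_below(year_values, thresholds):
    """Percentage of days on which the value is below that day's threshold."""
    below = sum(1 for v, t in zip(year_values, thresholds) if v < t)
    return 100.0 * below / len(year_values)

def in_base_tn10p(base, target_year):
    """TN10p for an in-base year; `base` maps year -> list of daily TN."""
    # Step 1: separate the in-base year from the remaining 29 years.
    others = [vals for yr, vals in sorted(base.items()) if yr != target_year]
    estimates = []
    for duplicated in others:
        # Step 2: temporary 30-year base = 29 years + a copy of one of them.
        temp_base = others + [duplicated]
        # Step 3: exceedance of the in-base year against this temporary base.
        thresholds = calendar_day_percentile(temp_base, 10.0)
        estimates.append(pct_below(base[target_year], thresholds))
    # Step 4: average over the 29 estimates.
    return sum(estimates) / len(estimates)

# Tiny synthetic base period: 30 "years" of 10 days each, constant 5 °C,
# except for an anomalously cold 1985.
base = {yr: [5.0] * 10 for yr in range(1961, 1991)}
base[1985] = [-10.0] * 10
print(in_base_tn10p(base, 1985), in_base_tn10p(base, 1986))
```

Because the in-base year never contributes to its own thresholds, its exceedance rate is estimated in the same way as for an out-of-base year, which is the purpose of the procedure.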

Percentage of days when TX < 10th percentile
The daily TX for each day i over a period j was calculated; here, j is a calendar year from the period 1981-2017. These $TX_{ij}$ values were then compared with $TX_{in}10$, which is the calendar day 10th percentile centred on a 5-day window for the base period. The percentage of days when $TX_{ij}$ < $TX_{in}10$ was then calculated (CDO command eca_tx10p). In Figure 5.20, the annual spatially averaged TX10p values found for COSMO-CLM and MÉRA are shown. As with TN10p, there is a noticeable downward trend, indicating warming, as fewer daily maximum temperatures are lower than the historical threshold. Also shown in Figure 5.20 are linear fits with slope b = −0.20 (95% CI −0.36 to −0.03) for COSMO-CLM and b = −0.27 (95% CI −0.42 to −0.12) for MÉRA, which quantify this trend.
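Linear fits of this kind, together with a 95% confidence interval for the slope, can be sketched as follows. This is an ordinary least-squares illustration only; how the confidence intervals quoted in this report were actually computed is not stated, so the t-based interval is an assumption. The critical value 2.030 is the two-sided 95% Student's t value for 35 degrees of freedom (37 annual values), and the function name and synthetic series are hypothetical.

```python
import math

def linear_trend(years, values, t_crit=2.030):
    """Least-squares fit y = a + b*x with a t-based confidence interval on b."""
    n = len(years)
    x_mean = sum(years) / n
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in years)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, values))
    b = sxy / sxx
    a = y_mean - b * x_mean
    # Residual variance with n - 2 degrees of freedom gives the slope's standard error.
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(years, values))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    return a, b, (b - t_crit * se_b, b + t_crit * se_b)

# Synthetic annual index: a decline of 0.2 per year plus alternating noise.
years = list(range(1981, 2018))
values = [10.0 - 0.2 * (y - 1981) + (0.5 if (y - 1981) % 2 == 0 else -0.5)
          for y in years]
a, b, ci = linear_trend(years, values)
print(round(b, 3), tuple(round(c, 3) for c in ci))
```

A trend is distinguishable from zero at the 95% level only when the interval excludes zero, as it does for both TX10p fits above.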

Percentage of days when TX > 90th percentile
Values for $TX_{ij}$ (calculated earlier) were compared with $TX_{in}90$, which is the calendar day 90th percentile centred on a 5-day window for the base period. The percentage of days when $TX_{ij}$ > $TX_{in}90$ was then calculated (CDO command eca_tx90p). In Figure 5.21, the annual spatially averaged TX90p values found for COSMO-CLM and MÉRA are shown. There is a noticeable upward trend for MÉRA TX90p and a much weaker (practically non-existent) trend for COSMO-CLM TX90p. The MÉRA trend suggests warming, as increasing numbers of daily maximum temperatures are higher than the historical threshold. These trends are quantified in Figure 5.21 by linear fits with slope b = 0.004 (95% CI −0.11 to 0.12) for COSMO-CLM and b = 0.08 (95% CI −0.01 to 0.16) for MÉRA.

Percentage of days when TN > 90th percentile
Values for $TN_{ij}$ (calculated earlier) were compared with $TN_{in}90$, which is the calendar day 90th percentile centred on a 5-day window for the base period. The percentage of days when $TN_{ij}$ > $TN_{in}90$ was then calculated (CDO command eca_tn90p). In Figure 5.22, the annual spatially averaged TN90p values found for COSMO-CLM and MÉRA are shown. As with TX90p, there is a much stronger upward trend for MÉRA TN90p than for COSMO-CLM TN90p. The MÉRA upward trend suggests warming, as increasing numbers of daily minimum temperatures are higher than the historical threshold. Both trends are quantified in

Warm spell duration index
The 1981-2017 annual count of warm spell days, i.e. the warm spell duration index (WSDI), has been calculated for both COSMO-CLM and MÉRA. A warm spell is defined as a period of at least 6 consecutive days when TX > 90th percentile (from the reference period 1961-1990). As part of the calculation, the CDO command eca_hwfi was invoked, which has two outputs: WSDI and the number of warm spell periods (NWSPs) per time period (i.e. year). In Figure 5.23, the annual spatially averaged WSDI values for COSMO-CLM and MÉRA are presented.

Cold spell duration index
The 1981-2017 annual count of cold spell days, i.e. the cold spell duration index (CSDI), has been calculated for both COSMO-CLM and MÉRA. A cold spell is defined as a period of at least 6 consecutive days when TN < 10th percentile (from the reference period 1961-1990). The CDO command eca_cwfi was utilised as part of the processing, which, like eca_hwfi, has two outputs: CSDI and the number of cold spell periods (NCSPs) per period (i.e. year). Figure 5.24 shows the annual spatially averaged CSDI values for COSMO-CLM and MÉRA.
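Both spell indices reduce to the same counting step once each day has been flagged as exceeding its percentile threshold. The sketch below shows that step for a single grid point and year; it assumes that the daily flags (TX above the 90th percentile for warm spells, TN below the 10th percentile for cold spells) have already been computed, and the function name and example flags are hypothetical.

```python
def spell_days(exceeds, min_length=6):
    """Return (number of days in qualifying spells, number of qualifying spells).

    A qualifying spell is a run of at least `min_length` consecutive True flags.
    """
    total_days = periods = run = 0
    # The trailing False acts as a sentinel that closes a run ending on the last day.
    for flag in list(exceeds) + [False]:
        if flag:
            run += 1
        else:
            if run >= min_length:
                total_days += run
                periods += 1
            run = 0
    return total_days, periods

# Made-up flags: a 7-day spell, a 5-day run (too short) and a 6-day spell.
flags = [True] * 7 + [False] * 3 + [True] * 5 + [False] * 2 + [True] * 6
wsdi, nwsp = spell_days(flags)
print(wsdi, nwsp)
```

Note that the 5-day run contributes nothing to either output, which is why these indices are often zero in individual years.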

Annual Precipitation Indices
A total of 13 annual precipitation indices were calculated for the period 1981-2015 utilising data from COSMO-CLM (1.5 km), WRF (2 km) and MÉRA (2.5 km). Each index requires different levels of processing and, in most cases, extant CDO commands were used. The datasets created are as follows: the simple precipitation intensity index (SDII); the annual count of days when precipitation is ≥ 1, 5, 10, 15, 20, 25 and 30 mm (RNmm); the maximum length of dry and wet spells [number of consecutive dry days (CDDs) and consecutive wet days (CWDs)]; the number of periods with more than 5 consecutive dry or wet days (CDDP/CWDP); and the annual total precipitation for wet days (PRCPTOT).
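The 13 indices listed above can be illustrated for a single grid point and year as follows. This is a minimal sketch rather than the CDO workflow: it assumes daily totals in mm, a wet day defined as one with at least 1 mm of precipitation and SDII defined as the mean precipitation on wet days; the function names and the example series are hypothetical.

```python
def run_lengths(flags):
    """Lengths of all runs of consecutive True values, in order of occurrence."""
    runs, run = [], 0
    for flag in list(flags) + [False]:   # sentinel closes a final run
        if flag:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    return runs

def precipitation_indices(rr, wet_threshold=1.0):
    """Annual precipitation indices from one year of daily totals `rr` (mm)."""
    wet = [r >= wet_threshold for r in rr]
    wet_amounts = [r for r in rr if r >= wet_threshold]
    dry_runs = run_lengths([not w for w in wet])
    wet_runs = run_lengths(wet)
    indices = {
        "PRCPTOT": sum(wet_amounts),                        # total on wet days
        "SDII": sum(wet_amounts) / len(wet_amounts) if wet_amounts else 0.0,
        "CDD": max(dry_runs, default=0),                    # longest dry spell
        "CWD": max(wet_runs, default=0),                    # longest wet spell
        "CDDP": sum(1 for n in dry_runs if n > 5),          # dry periods > 5 days
        "CWDP": sum(1 for n in wet_runs if n > 5),          # wet periods > 5 days
    }
    for n in (1, 5, 10, 15, 20, 25, 30):
        indices[f"R{n}mm"] = sum(1 for r in rr if r >= n)   # days with RR >= N mm
    return indices

# Made-up 22-day series standing in for a full year of daily data.
rr = [0.0] * 7 + [12.0, 3.0, 0.4, 25.0, 31.0, 1.0] + [0.0] * 3 + [2.0] * 6
idx = precipitation_indices(rr)
print(idx)
```

The day with 0.4 mm counts as dry under the 1 mm wet-day threshold, so it splits the wet period into two runs and adds a one-day dry spell.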

Simple precipitation intensity index
The SDII is based on the amount of precipitation ($RR_{wj}$) that occurs on wet days w (RR ≥ 1 mm) in period j and is given by the following formula:

$$\mathrm{SDII}_j = \frac{\sum_{w=1}^{N} RR_{wj}}{N}$$

where N represents the number of wet days in period j. The SDII was calculated at monthly, seasonal and annual timescales for each of the three models (as well as from 1 km daily gridded observations from Met Éireann, for comparison) after preprocessing (i.e. the generation of monthly, seasonal and annual precipitation files) and running the CDO command eca_sdii. Examples of the (temporal and spatial) mean summer and winter SDII over the period 1981-2015 are given in Figures 5.25

RNmm
The index RNmm describes the annual count of days when precipitation is greater than or equal to N mm, with N determined by meteorologically significant precipitation amounts. For instance, the values N = 1, 10 and 30 were used to determine the number of wet, heavy precipitation and extremely heavy precipitation days, respectively, and were calculated for each of the three models over the period 1981-2015 utilising the CDO command eca_rr1. Other values used were N = 5, 15, 20 and 25 mm. Figure 5.27 illustrates the R10mm values (temporal and spatial means) found for each of the three models. There is a clear east-west divide visible in each spatial map, with the west showing higher R10mm values than the east. R10mm values are also noticeably higher in mountainous regions. The three models display similar temporal patterns (Figure 5.27, bottom right panel) with no obvious overall trend; linear fits give slopes of b = −0.024 for COSMO-CLM, 0.041 for WRF and −0.005 for MÉRA. The results for R20mm (Figure 5.28) and R30mm (Figure 5.29) show a decrease in the number of days that fall into these categories (when compared with R10mm). The east-west pattern is no longer obvious, with mountainous areas accounting for the majority of the R20mm and R30mm results. Although weak, there appears to be a positive temporal trend for both R20mm (Figure 5.28, bottom right panel) and R30mm (Figure 5.29, bottom right panel). Linear fits show agreement in sign and (to a lesser extent) magnitude of this trend among all three models. The slopes found for R20mm were b = 0.018 for COSMO-CLM, 0.019 for WRF and 0.016 for MÉRA, while the values found for R30mm were b = 0.013 for COSMO-CLM, 0.008 for WRF and 0.011