Quality control procedures at Euskalmet data center

The Basque Country Mesonet records more than 130 000 observations daily from its 85 Automatic Weather Stations (AWS). Automated software is therefore an indispensable tool for quality assurance (QA) of this mesoscale surface observing network. This work describes a set of experimental semiautomatic quality control (QC) routines applied at the Euskalmet data center. Special attention is paid to the design of the validation levels and their associated flags, as well as to the system outputs, which are used by meteorologists and maintenance staff.


Introduction
In 1991 the Basque Meteorological Service (Basque Government) began the deployment of an AWS network. Priority was given to the real time observation of the water surface level of the rivers. Flood return periods are quite short in most of the short hydrologic basins of the Basque Country, especially those oriented to the Cantabrian Sea. To some extent, that objective still conditions the current design of the network. Thus many of the stations are gauging or water quality stations, located in valley bottoms along the river beds (Fig. 1). Over time the network acquired a more general purpose and was completed with other weather sensors. Nowadays the Mesonet has more than 85 AWS spread over the Basque Country (just over 7000 km²), a quite high density network (Fig. 2) (Gaztelumendi et al., 2003). Its data are used in a wide range of applications related to meteorology (nowcasting, climate, data assimilation, verification) and to many other fields (transportation, energy, insurance, planning, education, etc.). From the beginning it became necessary to perform quality control tasks, both in real time and on recorded data (Navazo et al., 1999; Maruri et al., 2003).
The context in which this work was carried out corresponds to the real time monitoring and nowcasting requirements of Euskalmet (Gaztelumendi et al., 2006). Thus, a set of validation tests has been implemented to keep erroneous Mesonet data out of the visualization system. Moreover, these tests provide information of great use to the QA system.

Overview of Mesonet QC processes
Four components make up the QC system of the Basque Country Mesonet (Maruri et al., 2010a): (i) laboratory calibration, (ii) maintenance services, (iii) automated routines and (iv) manual inspection. Each component provides valuable information on the operation of the network, and its results are shared across the system to ensure the accuracy of the data. The components work on different timescales, from the moment at which data are recorded to the analysis of data quality over time (Table 1).

Validation levels
The literature on QC methods for meteorological observations is extensive. It commonly describes a characteristic sequence of validation procedures: range, step, internal, persistence and spatial. The success of the checks depends largely on the thresholds used, so it is crucial to adapt them to the specific conditions of the region. In our case, many of the thresholds are based on those proposed by the University of the Basque Country (Maruri et al., 2010a), the WMO guidelines (WMO, 2008) and other meteorological services such as the Oklahoma Mesonetworks (Shafer et al., 2000; Fiebrich et al., 2010; Vejen et al., 2002).
According to the Spanish standard on AWS networks (UNE-AENOR, 2004), and following other operational services, we define six validation levels. Except for the visual check, the levels are applied successively to the meteorological variables (Table 2). The tests are usually generic, but some meteorological variables require specific treatment. Each level is briefly described below (Table 3).
Published by Copernicus Publications.

Validation of the structure of data recorded and the measurement time (level 0)
This level verifies the correct decoding of the data.
In the case of solar radiation, the theoretical maximum value used as a limit is given by a clear sky model. The limit is occasionally exceeded under partially covered skies, so the theoretical values are multiplied by a factor of 1.2. Beforehand, we check the signal for noise.
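The relaxed clear sky limit for solar radiation can be sketched as follows. This is only an illustration: the function name and arguments are hypothetical, and the clear sky irradiance is assumed to come from whichever clear sky model is in use, which the text does not specify. Only the factor of 1.2 is taken from the description above.

```python
def exceeds_clear_sky_limit(observed_wm2, clear_sky_wm2, factor=1.2):
    """Return True when the observed irradiance exceeds the relaxed limit.

    clear_sky_wm2 is the theoretical clear sky value for the same time step
    (assumed to be computed elsewhere); factor=1.2 follows the text.
    """
    return observed_wm2 > factor * clear_sky_wm2


# Example: with a theoretical value of 800 W/m2 the relaxed limit is 960 W/m2.
print(exceeds_clear_sky_limit(1000.0, 800.0))
print(exceeds_clear_sky_limit(950.0, 800.0))
```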

Validation of the temporal consistency (level 2)
At this level both the consistency of the data and the consistency of the series are analyzed. In the first case, the following checks are performed: (i) the step test ensures that data do not change by more than a certain limit in 10 min; (ii) the spike-dip test ensures that data do not successively increase and decrease (or vice versa) by more than a certain limit in 20 min.
Regarding the second aspect, the persistence test ensures that data change by more than a certain value within a defined period of time. Some variables require specific treatment here. For relative humidity, we check whether the hygrometers saturate above or below 100 %. For precipitation, we ensure that the rain gauge does not register high rainfall intensities over a questionably long period of time.
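A minimal sketch of the three temporal checks is given below, assuming a regularly sampled 10 min series held in a plain Python list. The function names, the window length and all thresholds are hypothetical and chosen only for illustration; the operational limits are not given in the text.

```python
def step_test(series, max_step):
    """Flag sample i when it differs from sample i-1 by more than max_step."""
    flags = [False] * len(series)
    for i in range(1, len(series)):
        if abs(series[i] - series[i - 1]) > max_step:
            flags[i] = True
    return flags


def spike_dip_test(series, max_change):
    """Flag sample i when the series jumps and then returns (or vice versa),
    both changes exceeding max_change, over three samples (20 min)."""
    flags = [False] * len(series)
    for i in range(1, len(series) - 1):
        before = series[i] - series[i - 1]
        after = series[i + 1] - series[i]
        opposite_sign = before * after < 0
        if opposite_sign and abs(before) > max_change and abs(after) > max_change:
            flags[i] = True
    return flags


def persistence_test(series, window, min_change):
    """Flag every sample of any window whose range (max - min) is below
    min_change, i.e. the signal has not changed enough."""
    flags = [False] * len(series)
    for start in range(0, len(series) - window + 1):
        chunk = series[start:start + window]
        if max(chunk) - min(chunk) < min_change:
            for i in range(start, start + window):
                flags[i] = True
    return flags


temps = [10.0, 10.2, 15.5, 10.3, 10.4, 10.4, 10.4]
print(step_test(temps, 3.0))
print(spike_dip_test(temps, 3.0))
print(persistence_test(temps, 3, 0.05))
```

In this sketch the step test flags both the jump and the return, while the spike-dip test isolates the single offending sample, which is one reason for running both.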

Validation of the internal consistency of the data (level 3)
The system checks the gust factor, i.e. the ratio between the mean wind speed and the maximum gust, which must exceed a predetermined threshold. We have also established relationships for precipitation, flagging those observations that occur with low humidity or with solar radiation at a high percentage of the clear sky model value.
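The internal consistency checks could look roughly like this. The sketch follows the definition given above (mean wind speed divided by maximum gust); the function names and every threshold value are assumptions made for illustration, not the operational settings.

```python
def gust_ratio_suspect(mean_speed, max_gust, min_ratio=0.1):
    """Return True when mean_speed / max_gust falls below min_ratio.

    min_ratio is a hypothetical threshold. With no gust recorded, the
    observation is suspect only if a non-zero mean speed is reported.
    """
    if max_gust <= 0.0:
        return mean_speed > 0.0
    return mean_speed / max_gust < min_ratio


def precipitation_suspect(precip_mm, rel_humidity, solar_wm2, clear_sky_wm2,
                          min_humidity=50.0, max_clear_sky_fraction=0.8):
    """Return True when rain is reported with low humidity or with solar
    radiation close to the clear sky value (both limits are hypothetical)."""
    if precip_mm <= 0.0:
        return False
    too_dry = rel_humidity < min_humidity
    too_sunny = (clear_sky_wm2 > 0.0
                 and solar_wm2 / clear_sky_wm2 > max_clear_sky_fraction)
    return too_dry or too_sunny


print(gust_ratio_suspect(0.5, 20.0))
print(precipitation_suspect(2.0, 95.0, 750.0, 800.0))
```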

Validation of the spatial consistency of the data (level 4)
The test tries to validate the spatial consistency of both absolute data and temporal changes. In the case of absolute data, it performs a cross validation process. The idea is to remove one datum at a time from the data set and re-estimate its value from the remaining data using kriging algorithms. The difference between interpolated and actual values is compared to the standard deviation of the spatial domain: the observation is flagged when the difference exceeds twice the standard deviation (∆ > 2σ) or when the estimation error is greater than a certain absolute value. The estimation methods used are ordinary kriging, kriging with external drift and simple kriging with varying local means (Goovaerts, 1997; Hernández, 2001). The last two account for secondary information (terrain elevation, etc.) and are preferably used to estimate the air temperature.
In the case of temporal changes, we perform a reanalysis of the values that have not passed the validation of the temporal consistency. The mechanism is the same as in the previous case, but using simple kriging. In this way the thresholds used in level 2 can be relaxed. The underlying assumption is that when a notable temporal variation of a meteorological variable occurs at a given station, it should be reflected in the neighbourhood.
Although the high density of the Mesonet is appropriate for this type of spatial test, the possible spatial anisotropy must be taken into account as far as possible. Several factors lead to strong gradients in the meteorological variables and affect the effectiveness of the tests. One is the existence of distinct climatic barriers. This is the case of the Cantabria Mountains, which delimit the comarca of Rioja Alavesa, located in the south of the Basque Country. This represents an additional problem, because kriging errors are known to grow towards the edges of the domain. Another factor is the impact of singular meteorological phenomena: galerna, strong temperature inversions, heat bursts, etc.
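The leave-one-out logic of the cross validation can be illustrated with the sketch below. Note that it deliberately swaps kriging for inverse distance weighting (IDW) to stay short and dependency free, so it shows only the flagging mechanism, not the estimator used operationally. The function names, the use of the population standard deviation and the optional absolute limit are assumptions.

```python
import math
import statistics


def idw_estimate(target_xy, neighbours, power=2.0):
    """Inverse distance weighted estimate at target_xy.

    neighbours is a list of ((x, y), value) pairs. IDW stands in here for
    the kriging estimators described in the text.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in neighbours:
        distance = math.hypot(x - target_xy[0], y - target_xy[1])
        if distance == 0.0:
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def leave_one_out_flags(stations, max_abs_error=None):
    """Remove each station in turn, re-estimate it from the others and flag
    it when |observed - estimated| > 2 * sigma, or above max_abs_error."""
    sigma = statistics.pstdev([value for _, value in stations])
    flags = []
    for i, (xy, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        difference = abs(value - idw_estimate(xy, others))
        suspect = difference > 2.0 * sigma
        if max_abs_error is not None and difference > max_abs_error:
            suspect = True
        flags.append(suspect)
    return flags


# 3 x 3 grid of stations at 15.0 with one outlier (25.0) in the centre.
grid = [((float(x), float(y)), 15.0) for x in range(3) for y in range(3)]
grid[4] = ((1.0, 1.0), 25.0)
print(leave_one_out_flags(grid))
```

Because the outlier also inflates the standard deviation of the domain, the absolute limit acts as a second safeguard in this sketch.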

Visual check (level 5)
It is not feasible to inspect visually all the information recorded by the Mesonet. Therefore, the effort is limited to the display of suspect data. At this level it is relatively easy to decide whether the assigned flag is right, but sometimes questions arise. One example is an anemometer that appears stuck during calm conditions. The test designed to detect the problem is the persistence test (level 2), but it is not trivial to decide where to set the threshold.
The casuistry of errors in an AWS is very large, and showing all of them is beyond the scope of this paper. Figure 3 summarizes what happened on a particular day at a given station. It shows at a glance the application of different types of validation tests. Subsequently, there is also adjustment work, which has a great impact on the quality of the database. For example, once a step is detected in air temperature, it is necessary to define the dates that delimit the problem and to calculate its magnitude. Other major supervised adjustments try to correct records coming from rain gauges and pyranometers that are not properly calibrated. Sometimes gauges underestimate the precipitation due to relay failures. Therefore, their amounts are compared with, and adjusted directly to, the volume stored by the totalizer system (collected approximately every month by maintenance). With regard to solar radiation, it is not unusual for the calibration constant of the pyranometer to lead to bad data. In this case, we reconstruct the series as well as possible, specifically by fitting the observed data to the theoretical values through a factor calculated on clear sky days.
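As an illustration of the rain gauge adjustment, the sketch below rescales a gauge series so that its total matches the totalizer volume. Proportional scaling is an assumption made here for simplicity; the text only states that the amounts are adjusted to the totalizer. The function name is hypothetical, and a new list is returned so that the raw series stays untouched.

```python
def adjust_to_totalizer(gauge_series_mm, totalizer_mm):
    """Return a rescaled copy of the gauge series whose sum equals the
    totalizer amount for the same period (proportional scaling assumed)."""
    gauge_total = sum(gauge_series_mm)
    if gauge_total <= 0.0:
        # Nothing recorded: no factor can be derived, return an unchanged copy.
        return list(gauge_series_mm)
    factor = totalizer_mm / gauge_total
    return [value * factor for value in gauge_series_mm]


raw = [1.0, 0.0, 3.0]
adjusted = adjust_to_totalizer(raw, 5.0)
print(adjusted)
```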

Flagging
Raw data are never altered. Instead, all records are coupled with quality flags that indicate the level of confidence assigned by the QC system. The flags are stored in a metadata field composed of four control bytes (Maruri et al., 2010b). The bytes deal with (i) the origin of the data, (ii) the status of the data, (iii) the validation levels and (iv) the adjustments. The adjusted data series, on the other hand, are considered new variables, so new fields must be defined for them.
Data are flagged as erroneous when the sensor-based range test is not passed. All other levels (except the visual check) qualify the data as suspicious. In addition, results from temporal checks are combined with those coming from spatial tests. Thus, if an observation is flagged by both tests, the datum is considered a failure.
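A four byte flag field of this kind could be handled as sketched below. The byte order follows the list above, but the meaning of the individual values and bits is not given in the text, so the encoding used here (bit n of the third byte set when level n was not passed) is purely an assumption, as are the function names.

```python
import struct


def pack_flags(origin, status, levels, adjustments):
    """Pack the four control bytes (each 0..255) into a 4-byte field."""
    return struct.pack("4B", origin, status, levels, adjustments)


def unpack_flags(raw):
    """Split a 4-byte field back into its four control bytes."""
    origin, status, levels, adjustments = struct.unpack("4B", raw)
    return {"origin": origin, "status": status,
            "levels": levels, "adjustments": adjustments}


def failed_levels(levels_byte):
    """List the validation levels (0 to 5) whose bit is set.
    Assumed encoding: bit n set means level n was not passed."""
    return [n for n in range(6) if levels_byte & (1 << n)]


field = pack_flags(1, 2, 0b00010100, 0)
print(unpack_flags(field))
print(failed_levels(unpack_flags(field)["levels"]))
```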
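The flagging rules of this paragraph can be summarized in a small decision function. This is a sketch under stated assumptions: the function name and the status labels are hypothetical, and a datum flagged by both the temporal and the spatial tests is mapped to "erroneous", which is one reading of "considered a failure".

```python
def overall_status(sensor_range_failed, climate_range_failed,
                   temporal_failed, internal_failed, spatial_failed):
    """Combine the results of the automatic levels into a single status."""
    if sensor_range_failed:
        return "erroneous"
    if temporal_failed and spatial_failed:
        # Flagged by both tests: treated here as a failure (assumption).
        return "erroneous"
    if (climate_range_failed or temporal_failed
            or internal_failed or spatial_failed):
        return "suspicious"
    return "valid"


print(overall_status(False, False, True, False, False))
print(overall_status(False, False, True, False, True))
```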

Automated QC summary report
The automated QC produces a daily summary report that compiles the data incidents of the previous day. This report is accessible to the meteorologist on duty, who is responsible for the surveillance of the network and determines whether further action is warranted. If so, the meteorologist sends a malfunction report to the maintenance services. An application has been designed for this purpose (Fig. 4).

Conclusions
This paper presents very briefly the QC procedures currently used by Euskalmet. As a special contribution on this issue, we highlight the efforts made in the development of algorithms for the analysis of spatial data consistency.
Despite the need for automation, it is important to note that data quality cannot rest solely on the application of automatic algorithms. Quality starts with a good location for each station. Subsequently, the information must flow properly between the different components of the QC/QA system. Among other things, this prevents errors from being perpetuated over time.

Figure 2. Basque Country location and Mesonet map.

Validation of the data according to limits (level 1)
Two types of checks are implemented: (i) the sensor-based range test ensures that data lie within the range of the sensor hardware specifications or within theoretical limits; (ii) the climate-based range test ensures that data lie within certain flexible limits. Currently an observation is compared with the climatological values calculated from representative stations and the expected standard deviation.
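The two range checks can be sketched as follows. The function names, the example limits and the number of standard deviations allowed around the climatological value are hypothetical; the text does not state the operational values.

```python
def sensor_range_ok(value, sensor_min, sensor_max):
    """Sensor-based test: the value must lie within the hardware
    specification (or theoretical) limits."""
    return sensor_min <= value <= sensor_max


def climate_range_ok(value, clim_mean, clim_std, n_std=3.0):
    """Climate-based test: the value must lie within n_std expected
    standard deviations of the climatological value (n_std is assumed)."""
    return abs(value - clim_mean) <= n_std * clim_std


# Air temperature example with illustrative limits.
print(sensor_range_ok(75.0, -40.0, 60.0))
print(climate_range_ok(35.0, 18.0, 4.0))
```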

Figure 3. This chart shows data from an AWS on a given day, with different types of errors in 3 measurement sensors and the kind of test that would be capable of detecting them.

Table 2. Meteors and applied levels.
* Based on clear sky model.