FAIR: a project to realize a user-friendly exchange of open weather data
Access to high-quality weather and climate data is crucial for a wide range of societal and economic issues. It allows industrial processes to be optimised, supports the identification of potential risks related to climate change, and enables the development of corresponding adaptation and mitigation strategies. Although such data has been freely available through Germany's national meteorological service DWD (Deutscher Wetterdienst) since 2017, the application potential in industry and society has not yet been fully unlocked. Major obstacles are the complexity of the raw data and the lack of tools for their simple integration into existing industrial applications. The goal of the research project FAIR is to simplify the information exchange between the DWD and economic players. To reach this goal, a requirement analysis was conducted with end users of weather data from three different sectors. A central requirement regarding the site assessment of wind plants is quick and easy access to historical wind series at specific sites, preferably downloadable in formats like CSV or via an API. Event planning partners are interested in quick access to health-relevant weather information at their event location, and the e-mobility sector in temperature data along planned routes. In this paper, we summarize the results of the requirement analysis and present the deduced technical architecture and FAIR services aiming at a user-friendly exchange of weather data.
The quote “Data is the new oil” (Clive Humby) very clearly describes the increasing relevance of information for society and the economy. A particularly valuable source of information in this regard is climate and weather data, which is instrumental in the safeguarding of traffic and transportation, the optimisation of industrial processes, the identification of climate-related risks and the development of corresponding adaptation and mitigation strategies. However, correct understanding and handling of such data is often difficult for users without a meteorological background. Furthermore, processing and analysing this data is a challenging task that requires specialised software solutions and an infrastructure that is able to deal with large data sets.
Since 2017, DWD has implemented a very open data policy and has significantly expanded access to its raw data for the public. The aim of the open data policy is to increase the application potential of existing DWD data. For this purpose, DWD publishes a continuously increasing amount of meteorological information on a freely accessible server. DWD is aware of existing issues with this approach, such as the weak visibility of data products in the directory tree or technical obstacles (especially for non-meteorological users) due to the heterogeneity of data formats. Therefore, DWD implemented the new CDC data portal (Kaspar et al., 2019) as a first step towards a more user-friendly provisioning of meteorological data. This portal allows access to a subset of the data located on the server via an interactive web interface. However, because it provides only a subset of the published data without any further services such as visualisation or data processing (like format conversions, statistics), there is still a lot of potential for improvement towards easy and intuitive access to DWD's comprehensive data set.
In order to bridge the gap from raw data to the required information, three pragmatic approaches currently exist: (1) the information processing based on the raw data is done internally by DWD staff members and only the processed information is made available to the customer, (2) professional users provide meteorological in-house expertise to evaluate the raw data in relation to their application, or (3) meteorological third-party providers (e.g. https://www.donnerwetter.de, last access: 7 September 2020, https://www.t-online.de/wetter/, last access: 7 September 2020) are commissioned to provide the desired information. However, each of these three solutions is either not scalable due to the limitations of DWD funding with regard to staff members (approach 1), expensive (approach 2) or non-transparent (approach 3). With the FAIR project described in the following, we aim to address these deficits by providing a transparent service platform.
Under the umbrella of FAIR, a research project funded by the German Federal Ministry of Transport and Digital Infrastructure (BMVI), DWD and eight partners from industry and universities collaborate in order to improve the access to open climate and weather data and to facilitate their exploitation by a broad spectrum of users. To find a generic solution for the wide range of individual requests to the heterogeneous raw data (Gregow et al., 2016), the central idea is to approach the problem with the implementation of a variety of micro services. Each of the services is planned to solve only one independent task, but the combination of the services will provide a complete end-to-end chain from raw data to the specific answer. Moreover, flexible orchestration of the micro services will allow information retrieval tailored to individual users. The services are also planned to be used to acquire data from users for applications of DWD.
Due to the early state of the project, this manuscript focuses on the requirement analysis for the intended services and on its implications for the FAIR architecture. Therefore, the paper is structured as follows. Section 2 introduces the application scenarios of the planned FAIR services and provides the background for the results of the requirement analysis in Sect. 3. Based on the collected requirements, Sect. 4 gives an overview of the planned FAIR services and the technical implementation. Section 5 provides conclusions and an outlook on FAIR.
The FAIR services are planned to solve a wide range of data requests as well as delivery processes. However, in order to develop the basic architecture and a fundamental set of micro services, we focus on four different use cases where weather data of DWD are used by industrial companies and three backchannel cases where weather data are collected and provided by users to DWD. The following paragraphs introduce the application and backchannel cases in detail.
2.1 Use case – Planning of wind farms
The energy yield of wind farms depends on the wind conditions, which vary from site to site. Thus, when planning wind power plants, the economic viability of the project has to be assessed. Sufficiently long time series of high-quality wind information, preferably spanning decades, are therefore essential. The state-of-the-art procedure for site assessment is to measure locally for at least one year and to relate this high-quality information to long-term wind estimates, either from measurements in spatial proximity or from meteorological models, called reanalyses (FGW, 2017).
2.2 Use case – Integration of meteorological data for individual traffic routing
Weather, and especially the air temperature, is a central factor when estimating the range of electric vehicles (Salisbury, 2016). As the number of e-vehicles increases, so does the need for optimised routing and charging recommendations tailored to the weather-dependent range of the vehicles. Currently, weather influences such as wind and rain are not taken into account in the range specification of an e-vehicle. Weather conditions have a decisive influence on the (optimal) charging and driving behaviour. Weather-related changes in charging requirements and the (regional and temporal) use of public charging points will be used in an advanced occupancy forecast for charging stations. This data can also be used for better planning of the charging infrastructure.
2.3 Use case – Planning of social events
Especially for outdoor events like festivals, weather plays a crucial role. The entire process of an event, including set-up, implementation, and dismantling, is highly dependent on the local weather conditions. Most serious are possible safety problems for visitors and employees in case of extreme weather situations such as storms, hail or extreme heat. Besides safety, weather also influences, for example, the planning phase due to the changing consumption behaviour of visitors on extremely hot or cold days. Furthermore, different soil conditions might influence the applicability of cranes and other assembly and dismantling tools. Thus, it is important to provide the event manager with the best available forecasts in real time.
2.4 Use case – Weather data in a map platform
Travellers are directly influenced by the weather. Of particular interest for travellers is the weather along a planned route. Such coupled information might be provided by routing platforms themselves. In this use case it is planned to integrate the freely available weather information of DWD into the existing mapping platform SmartMaps.
2.5 Backchannel cases
FAIR also aims for an improved bidirectional data exchange. Thus, besides making the open data of DWD available for users, FAIR also intends to make meteorological information of individual companies or research institutes available to DWD. The extended data base of observations can then be used to improve the meteorological forecasts and models which in turn enhances the quality of the open data provided by DWD. Therefore, in the scope of FAIR, three categories of data will be collected: (1) SCADA data from wind power plants (comprising wind speed, wind direction, and generated power), (2) lake surface temperature estimates derived from satellite observations, and (3) air temperature and pressure estimates derived from smartphone sensors.
To develop user-friendly access to weather and climate data, we first need to understand the user requirements. In the conducted requirement analysis, the project partners BayWa r.e. Wind GmbH (BayWa, use case: wind farms), KME (use case: social events), and YellowMap AG (YM, use cases: e-traffic routing and weather data in a map platform) were systematically asked about their requirements for user-friendly access to weather and climate data. For this purpose, a three-page questionnaire was designed with the main topics (1) required information, (2) typical further processing of the data, and (3) preferred access options.
3.1 Data requirements
The question of the required weather variables and quantities revealed interest in twelve parameters in total. For all four applications, the data requirements include wind speed, air temperature, precipitation amount and type, and occurrence of lightning (Table 1). In two of the use cases, the wind direction and slip hazard are also of interest. The other five parameters are of interest to the use case of BayWa only. A fundamental difference between the use cases is the required time coverage. While BayWa requests historical data, KME and YM mainly need forecasts (see Table 2). Historical data for YM is only required to improve the existing occupancy forecast for EV charging stations. Here, historical data with an hourly resolution is acceptable, and forecasts should be as highly resolved as possible. Furthermore, there are different requirements with regard to the height. While YM is only interested in near-surface data, KME asks for data up to about 60 m above ground in the context of crane operations, and BayWa for information between 80 and 300 m above ground in 10 m vertical steps.
Required metadata are especially the timeliness of the data, the exact time stamp, the institutional source, the generating model or measurement instrument, and the licence (Table 3).
3.2 Data processing
In addition to the required data, it is particularly important to trace the steps required for data processing and data use. Only with this approach can the appropriate architecture, the associated micro services and the APIs or user interfaces for the later exchange or display of the data be determined. Accordingly, there was a strong focus on this in the requirement analysis. APIs are requested for each use case.
YM will use an API to enrich the map platform with weather data as well as to improve the charging prognosis for EV charging stations. KME would like to receive push notifications in an app containing warnings about specific weather conditions. For this purpose, a visualisation of the weather situation in a desktop application or app is also intended in order to provide an overview of the current weather situation. BayWa needs a Web-GIS to access the above defined data sets to enable the visualisation of the wind potentials and the download after prior data analysis in the Web-GIS. For the latter use case, BayWa also requires the possibility for interpolation of data. The validation of the weather data of the DWD including an uncertainty estimate is requested by BayWa and YM.
A common and important requirement of all partners is to reduce the volume of the data by defining subsets of the original data sets. The parameters required for this data reduction refer to (1) the specific meteorological variable, e.g. temperature, (2) the region or coordinate of interest, (3) the time period, and (4) the height above ground level. Fusion of the data with other weather data is currently not a requirement.
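The four reduction parameters listed above can be read as the fields of a single subset request. The following Python sketch illustrates this reading only; the names `Record`, `SubsetRequest` and `select_subset`, the flat record layout and the bounding-box convention are assumptions made for this example and are not part of the FAIR specification.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Record:
    """One value of a meteorological variable (hypothetical flat layout)."""
    variable: str
    lat: float
    lon: float
    time: datetime
    height_m: float
    value: float


@dataclass(frozen=True)
class SubsetRequest:
    """The four reduction parameters named in the requirement analysis."""
    variable: str                              # (1) meteorological variable
    bbox: Tuple[float, float, float, float]    # (2) lat_min, lon_min, lat_max, lon_max
    start: datetime                            # (3) time period, inclusive
    end: datetime
    height_range_m: Tuple[float, float]        # (4) height above ground, inclusive

    def matches(self, rec: Record) -> bool:
        lat_min, lon_min, lat_max, lon_max = self.bbox
        h_min, h_max = self.height_range_m
        return (
            rec.variable == self.variable
            and lat_min <= rec.lat <= lat_max
            and lon_min <= rec.lon <= lon_max
            and self.start <= rec.time <= self.end
            and h_min <= rec.height_m <= h_max
        )


def select_subset(records: Iterable[Record], request: SubsetRequest) -> List[Record]:
    """Return only the records that satisfy all four reduction criteria."""
    return [rec for rec in records if request.matches(rec)]
```

A request for wind speed between 80 and 300 m above ground, as asked for by BayWa, would then be expressed as `SubsetRequest("wind_speed", bbox, start, end, (80.0, 300.0))`, where the variable name is again a placeholder.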
3.3 Data access
For the implementation of the user-oriented data access, it is important to define the requirements for data formats and update frequencies. These specifications are essential for the definition of the architecture and hardware. The Shapefile format, KML and GeoJSON were specified as necessary geodata formats. Furthermore, CSV, PDF and JPEG should be provided for further processing and presentation of the weather data. The requirements for the timeliness of the data vary widely among the use cases. KME and YM require a 5 min update frequency for the festival operation and the map platform. During the set-up or dismantling phase of the event, hourly updates are sufficient. A daily update is sufficient for the YM occupancy forecast. Depending on the use case and the underlying wind data, BayWa differentiates between the following scenarios: one day old (ideal), ≤ three months old (acceptable), > three months old (problematic). In order to predict the expected system load, the query frequency of the data must be considered in addition to the data timeliness. For acute weather warnings, KME would like the shortest possible query interval; YM wants to perform a 5 min, daily or weekly query, depending on the application. BayWa would also like to query the data daily to weekly.
In the requirement analysis, the required expertise of the end users was also determined. We found a strong dependency on the use case. For the application of YM, knowledge about e-mobility and map platforms is necessary, whereas the applications of KME and BayWa require background knowledge in the fields of weather apps and atmospheric reanalyses, respectively.
Finally, KME mentioned its interest in push notifications for occurring extreme weather situations such as hail or thunderstorms. Thus, what is required is not direct access to the data but a push notification if the probability of a hail event exceeds 50 %, the wind speed exceeds 61 km h⁻¹, or the temperature exceeds 35 °C.
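As a minimal sketch of how the three KME thresholds could be evaluated, the following Python function returns the warnings triggered by a single forecast. The threshold values are those stated above; the `Forecast` structure, the warning labels and the strict "greater than" comparison are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the KME requirement described above.
HAIL_PROBABILITY_LIMIT_PERCENT = 50.0
WIND_SPEED_LIMIT_KMH = 61.0
TEMPERATURE_LIMIT_C = 35.0


@dataclass(frozen=True)
class Forecast:
    """Hypothetical forecast record for one event location and time."""
    hail_probability_percent: float
    wind_speed_kmh: float
    temperature_c: float


def triggered_warnings(forecast: Forecast) -> List[str]:
    """Return a label for every threshold that the forecast exceeds."""
    warnings = []
    if forecast.hail_probability_percent > HAIL_PROBABILITY_LIMIT_PERCENT:
        warnings.append("hail")
    if forecast.wind_speed_kmh > WIND_SPEED_LIMIT_KMH:
        warnings.append("wind")
    if forecast.temperature_c > TEMPERATURE_LIMIT_C:
        warnings.append("heat")
    return warnings
```

A push notification would be sent whenever the returned list is not empty; how the notification reaches the app is outside the scope of this sketch.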
Based on the results of the requirement analysis (see Sect. 3), this section describes the first concept of the FAIR architecture. As a consequence of these requirements, the architecture needs to meet three major challenges. First, the FAIR backend needs to support a variety of specific workflows delivering quite individual results. The end users' demands differ considerably with regard to area, time intervals, granularity, variables, etc. Moreover, the use cases often require the computation of user-specific composite scores. For example, the guest of an outdoor event requires a simplified forecast (bad, normal or good weather), whereas the organizer additionally requires an estimate of the risk of critical weather conditions. Those terms refer to domain-specific definitions. Finally, there is also variation in the desired output, depending on what the user needs: e.g. selected and filtered raw data, a report, a visualisation or an output that is dedicated to a specific (third-party) application. The same holds true for the other use cases (UCs) as well. Hence, a FAIR workflow not only selects data, it also needs to interpret them according to domain-specific rules and deliver them in a specific form, and the architecture needs to take this into account. Secondly, when a new requirement arises, the corresponding workflow should build on existing functionality as much as possible. For example, if a sailing app requests information according to its own definition of bad, normal, good (or a new set of defined terms), the implementation should only cover those definitions. The remaining part should be done by assembling and configuring existing functions. Thirdly, the allocation of resources and the overall resources should be adapted to the current demand. When the demand shifts from wind power planning to e-mobility, resource allocation should follow. When the overall load of the system exceeds a certain threshold (leading to increased response times), the system should expand the overall resources.
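The second challenge, reusing existing functionality while exchanging only the domain-specific definitions, can be illustrated with a small Python sketch. The generic function `make_classifier` stands for the reusable part; the two rule sets stand for the application-specific part. All variable names and numerical limits below are hypothetical and were chosen only for this example; they are not definitions used by the project partners.

```python
from typing import Callable, Dict, List, Tuple

Weather = Dict[str, float]
Rule = Tuple[str, Callable[[Weather], bool]]


def make_classifier(rules: List[Rule], default: str) -> Callable[[Weather], str]:
    """Generic, reusable part: return the label of the first matching rule."""
    def classify(weather: Weather) -> str:
        for label, predicate in rules:
            if predicate(weather):
                return label
        return default
    return classify


# Application-specific part: hypothetical definitions for an outdoor event app.
event_rules: List[Rule] = [
    ("bad", lambda w: w["wind_speed_kmh"] > 61 or w["precip_mm_h"] > 5),
    ("good", lambda w: w["precip_mm_h"] == 0 and 15 <= w["temperature_c"] <= 28),
]

# A new sailing app only supplies its own (again hypothetical) definitions.
sailing_rules: List[Rule] = [
    ("bad", lambda w: w["wind_speed_kmh"] < 5 or w["wind_speed_kmh"] > 50),
    ("good", lambda w: 15 <= w["wind_speed_kmh"] <= 35),
]

classify_event = make_classifier(event_rules, default="normal")
classify_sailing = make_classifier(sailing_rules, default="normal")
```

With this split, the same calm and dry conditions can be rated "good" for the event and "bad" for sailing without changing any shared code.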
4.1 Technical implementation and orchestration of FAIR services
The architecture within FAIR shall be capable of transforming complex climate and weather data into a flexible but user-specific and user-friendly output. The approach was inspired by two research projects, BASMATI (Altmann et al., 2017) and GEISER (Georgala et al., 2018). It is based on three assumptions:
The processing can be performed in a (mainly) unidirectional end-to-end workflow.
It can be split into separate and (partly) generic services.
It requires an initial orchestration and configuration but sparse instructional interaction between the services involved.
In essence, it envisages a composable micro service infrastructure that is able to handle multiple micro services, each encapsulating a generic functionality. Each micro service can be replicated if required, leading to several instances of it. Reasons could be either load balancing or an individual configuration mirroring use-case-specific parameters. To ease communication and enhance robustness, the configuration, orchestration and control rely on a centralised message bus component. The data flow uses a centralised data lake to enhance efficiency and ease branching, i.e. the reuse of intermediate results as input of multiple processing chains, without redundancy. As a result, adding new services to the system is relatively easy. The service developer only needs to define the I/O control message types and corresponding topics that their service consumes or produces. Finally, it is relatively easy to monitor the distribution and the overall load on the system and connect it to state-of-the-art resource managers to auto-allocate and auto-scale computational and memory resources. To provide a comprehensive overview, Fig. 1 illustrates the planned architecture using an example.
Figure 1 depicts an initial draft of the planned architecture using, as an example, one of the use cases (a planning application for wind power plants). The grey box depicts the infrastructure of DWD, which currently provides data via FTP and in predefined formats that are not suited for non-meteorological users or application developers. Any updates are signalled to FAIR by an update messenger service. The message is then published on a central Message Queue Bus (MQ) or comparable streaming engine, which is the only fixed point of reference and handles the whole instructional workflow within FAIR. Received by a caching service, this message triggers a download directly to a data lake, the central storage component in use. Other references or parameters required are subject to configuration. The end user configures the desired output via a dedicated front-end application, which translates it into configuration messages that are also published via the MQ. Those are received by the three exemplary micro service instances: a spatiotemporal filtering service, an attributive filtering service and a service to create a visual representation. The services are configured to interpret the completion message of the preceding service sent to the queue as a trigger, load the intermediate result or stream, process it, write the result and publish their own completion message. The planning application reads the output and presents it to the user.
The design thereby enables:
maximum re-usability of all services,
re-usability of intermediate outputs,
an end-to-end workflow without fixed connections.
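The trigger mechanism described for Fig. 1 can be sketched in a few lines of Python. The example below is a toy, single-process stand-in written under strong simplifying assumptions: an in-memory queue replaces the message bus, a dictionary replaces the data lake, and the topic names, keys and sample values are invented for illustration. It is not the FAIR implementation, which is still in the planning stage.

```python
from collections import defaultdict, deque
from typing import Any, Callable, Dict


class MessageBus:
    """In-memory stand-in for the central message queue (assumption)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        self._queue.append((topic, message))

    def run(self) -> None:
        # Deliver messages until no service publishes a new one.
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(message)


data_lake: Dict[str, Any] = {}  # stand-in for the central data lake


def make_service(bus: MessageBus, name: str, in_topic: str, out_topic: str,
                 transform: Callable[[Any], Any]) -> None:
    """Register a service that is triggered by the preceding completion message."""
    def handle(message: dict) -> None:
        result = transform(data_lake[message["key"]])  # load intermediate result
        out_key = f"{name}/{message['key']}"
        data_lake[out_key] = result                    # write own result
        bus.publish(out_topic, {"key": out_key})       # publish own completion
    bus.subscribe(in_topic, handle)


bus = MessageBus()
make_service(bus, "spatiotemporal", "cache.done", "spatiotemporal.done",
             lambda rows: [r for r in rows if r["site"] == "A"])
make_service(bus, "attributive", "spatiotemporal.done", "attributive.done",
             lambda rows: [r["wind_speed"] for r in rows])
make_service(bus, "visual", "attributive.done", "visual.done",
             lambda speeds: "mean wind speed: %.1f m/s" % (sum(speeds) / len(speeds)))

# The caching service has written raw data and announces its completion.
data_lake["raw/wind"] = [
    {"site": "A", "wind_speed": 6.0},
    {"site": "B", "wind_speed": 9.0},
    {"site": "A", "wind_speed": 8.0},
]
bus.publish("cache.done", {"key": "raw/wind"})
bus.run()
```

The services know only their input and output topics, not each other, and every intermediate result remains in the data lake where a second processing chain could pick it up.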
4.2 Planned FAIR (micro) services
This section provides an overview of the planned micro services. In order to address the user's requirements as well as the provision of a backchannel for user or third-party data to DWD, the planned micro services can be divided into six topics: (1) Data processing, (2) data provisioning, (3) data visualisation, (4) metadata handling, (5) data acquisition, and (6) resource provisioning. The following describes the services per topic in more detail.
Data processing. Different user groups are used to specific variables, data formats and data volumes, and require different quality standards. However, meteorologists provide a variety of variables from different sources and with different quality in various data formats like GRIB, netCDF, CSV, or raw text. Therefore, FAIR services are developed in order to provide variable selection, data reduction (e.g. domain reduction in case of spatially resolved data), and reliable conversion to data formats known to the specific user group. In this respect, it is planned to implement converters for widely accepted formats, such as GeoJSON.
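To make the idea of a format converter concrete, the following Python sketch turns a list of point values into a GeoJSON FeatureCollection using only the standard library. The input layout (dictionaries with `lat`, `lon`, `value` and `time`) and the property names are assumptions for this example; the only fixed convention is that GeoJSON stores coordinates in longitude, latitude order.

```python
import json
from typing import Any, Dict, Iterable


def to_geojson(points: Iterable[Dict[str, Any]], variable: str, unit: str) -> Dict[str, Any]:
    """Convert point values (hypothetical input layout) to a GeoJSON FeatureCollection."""
    features = [
        {
            "type": "Feature",
            # GeoJSON positions are ordered longitude first, then latitude.
            "geometry": {"type": "Point", "coordinates": [p["lon"], p["lat"]]},
            "properties": {
                "variable": variable,
                "unit": unit,
                "value": p["value"],
                "time": p["time"],
            },
        }
        for p in points
    ]
    return {"type": "FeatureCollection", "features": features}


sample = [{"lat": 50.1, "lon": 8.7, "value": 3.5, "time": "2020-09-07T12:00:00Z"}]
geojson_text = json.dumps(to_geojson(sample, "air_temperature", "degC"), indent=2)
```

The resulting text can be loaded directly by common web mapping libraries, which is one reason why GeoJSON was named as a required geodata format.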
Data provisioning. In order to provide user-friendly access to historical, current and prognostic weather data, FAIR aims at developing a data portal which provides a transparent overview of available meteorological products as well as applicable FAIR services. Two services will be developed to provide the required information or data: (1) a download service, and (2) a Web Feature Service (WFS). Both services can be configured to provide either raw or processed data.
Data visualisation. Visualisations of statistical information are a necessary and essential step for a variety of users. Visualisations provide a first insight into the data in terms of units and value range and additionally provide the opportunity for a rough visual data control. Therefore, the FAIR visualisation service will provide flexible and user-group-specific visualisations which will be partly available via geoservices like the Web Map Service (WMS). Examples of visualisations are histograms as a standard visualisation tool or wind roses at different height levels as an important tool for specific users such as the renewable energy sector. Furthermore, it is planned to visualise the uncertainty of the data (e.g. uncertainty estimates of ensemble-based weather forecasts).
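The data preparation behind a wind rose is a histogram of wind directions over compass sectors. The sketch below shows this counting step in plain Python, assuming eight sectors centred on north, north-east, east and so on; the function name and the sector count are choices made for this example, and the plotting itself is left out.

```python
from typing import Iterable, List


def wind_rose_counts(directions_deg: Iterable[float], n_sectors: int = 8) -> List[int]:
    """Count wind directions per sector; sector 0 is centred on north (0 degrees)."""
    width = 360.0 / n_sectors
    counts = [0] * n_sectors
    for direction in directions_deg:
        # Shift by half a sector so that, e.g., 350 and 10 degrees both count as north.
        index = int(((direction % 360.0) + width / 2) // width) % n_sectors
        counts[index] += 1
    return counts
```

Repeating the count for each height level of interest would yield the input for wind roses at different heights.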
Metadata handling. Metadata are crucial for the interpretation of data sets. The lack of detailed information on data sources or pre-processing steps may render the data useless for users. Therefore, FAIR plans to implement services for provision and preparation of metadata. If, for example, different data sets are combined into one extensive data set, intelligent processing of metadata is necessary. Furthermore, standardised metadata are useful to support the retrievability and reproducibility of data and conclusions.
Backchannel. The backchannel aims to simplify the way for third parties to provide raw data to DWD. In order to make this backchannel transparent and to allow the use of a standardised format, specific FAIR micro services are planned.
Resource provision. In order to give small companies (or even individuals) access to the FAIR services, it is also planned to incorporate the services into an Infrastructure as a Service (IaaS). The IaaS allows the aforementioned FAIR services to be combined remotely with individual processing logic, without the need for users to have their own resources available. The IaaS provides the advantage that computational power and storage can be seen as a scalable service, leading to cost-optimised access to the FAIR services.
4.3 Planned end-to-end chains
Planning of wind farms – With the FAIR services, we plan to make the high-resolution reanalysis products COSMO-REA6 (Bollmeyer et al., 2015) and COSMO-REA2 (Wahl et al., 2017) of DWD easily accessible to the wind sector. Previous studies show a high quality of these data sets for wind applications (Kaspar et al., 2020), but a user-friendly platform for that community does not yet exist. Thus, relevant wind estimates and other weather data at the specific sites and heights will be made available via a data portal and downloadable in the required data format CSV. Furthermore, the user will be able to obtain graphical visualisations for the selected locations before having to download the complete data selection.
The integration of meteorological data for individual traffic routing – With the FAIR services we plan to integrate forecasted weather information of DWD's current numerical weather prediction models COSMO-D2 (Schraff et al., 2016) and ICON (Zängl et al., 2015) into routing applications. By estimating the impact of the temperature and wind speed on the cruising range of the electric vehicles, optimised routing recommendations will become feasible.
Planning of social events – The FAIR services aim to make data on severe weather events as well as health-related data of DWD directly accessible for event managers, i.e., the original huge data volume will be streamlined by filtering and by visualising it in a traffic-light manner, thereby providing simpler support to these decision makers. Again, the micro services will be used to read, cut, visualise, and add customised uncertainty visualisations to the forecasts.
Weather data in a map platform – With the FAIR services we make weather data accessible and usable via the map platform of YM. Thus, the data can be used for, e.g., the visualisation of weather along a planned track. Moreover, the map platform provides the possibility to make weather data available through a separate map module to enable the retrieval of weather data and their presentation on a map. The retrieval of data will not only be possible through an API request, but also via predefined map layers for the simple visualisation of weather information such as temperature data.
With a variety of micro services, a data portal, and an infrastructure as a service, FAIR has the goal to provide the openly available climate and weather data of DWD in a user-friendly way. At the same time, FAIR also aims to collect meteorological data from individual companies and institutes in order to complement DWD's database and thereby improve numerical weather prediction models.
In order to find a generic and flexible solution for data provisioning, pre-processing, conversion, visualisation and metadata handling, a requirement analysis was conducted. An important requirement from three application-oriented perspectives is an access point which provides a transparent overview of the available meteorological information. Another important need is efficient access to the data via an API or by download in data formats common to the specific application. Due to the large data volumes, filter options for specific time frames or variables are desired. In addition to these general requirements, the users specified a wide range of user-specific requirements.
In order to find a generic and flexible solution to satisfy most of the requirements, FAIR pursues an intelligent orchestration approach of so-called micro services. Each micro service is intended to solve an individual task in the chain of a specific request. The presented FAIR architecture provides a first approach on how to organise and orchestrate the individual micro services. Thus, the approach creates a framework for various future applications. For testing and further optimisations of the re-orchestration procedure of the FAIR services, a close collaboration with associated partners like DACHSER, Lufthansa, Deutsche Bahn, and other companies is planned. In this respect, the associated partners are expected to provide the opportunity to extend the planned FAIR services by even further specialised micro services unknown in the current stage.
CWF edited the manuscript with input from all authors.
The authors declare that they have no conflict of interest.
This article is part of the special issue “19th EMS Annual Meeting: European Conference for Applied Meteorology and Climatology 2019”. It is a result of the EMS Annual Meeting: European Conference for Applied Meteorology and Climatology 2019, Lyngby, Denmark, 9–13 September 2019.
This research has been supported by the Bundesministerium für Verkehr und Digitale Infrastruktur (project name “FAIR”, grant no. 19F2103A).
This paper was edited by Renate Hagedorn and reviewed by Andreas Hense and Janine Aquino.
Altmann, J., Al-Athwari, B., Carlini, E., Coppola, M., Dazzi, P., Ferrer, A. J., Haile, N., Jung, Y.-W., Marshall, J., Pages, E., Psomakelis, E., Santoso, G. Z., Tserpes, K., and Violos, J.: BASMATI: An Architecture for Managing Cloud and Edge Resources for Mobile Users, in: Economics of Grids, Clouds, Systems, and Services, edited by: Pham, C., Altmann, J., and Bañares, J. Á., Springer International Publishing, Cham, 56–66, https://doi.org/10.1007/978-3-319-68066-8_5, 2017.
Bollmeyer, C., Keller, J. D., Ohlwein, C., Wahl, S., Crewell, S., Friederichs, P., Hense, A., Keune, J., Kneifel, S., Pscheidt, I., Redl, S., and Steinke, S.: Towards a high-resolution regional reanalysis for the European CORDEX domain, Q. J. Roy. Meteor. Soc., 141, 1–15, https://doi.org/10.1002/qj.2486, 2015.
FGW: Technische Richtlinien für Windenergieanlagen, Tech. rep., available at: https://wind-fgw.de/produkt/bestimmung-von-windpotenzial-und-energieertraegen/ (last access: 7 September 2020), 2017.
Georgala, K., Obraczka, D., and Ngonga Ngomo, A. C.: Dynamic Planning for Link Discovery, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10843 LNCS, 240–255, https://doi.org/10.1007/978-3-319-93417-4_16, 2018.
Gregow, H., Jylhä, K., Mäkelä, H. M., Aalto, J., Manninen, T., Karlsson, P., Kaiser-Weiss, A. K., Kaspar, F., Poli, P., Tan, D. G., Obregon, A., and Su, Z.: Worldwide survey of awareness and needs concerning reanalyses and respondents views on climate services, B. Am. Meteorol. Soc., 97, 1461–1474, https://doi.org/10.1175/BAMS-D-14-00271.1, 2016.
Kaspar, F., Niermann, D., Borsche, M., Fiedler, S., Keller, J., Potthast, R., Rösch, T., Spangehl, T., and Tinz, B.: Regional atmospheric reanalysis activities at Deutscher Wetterdienst: review of evaluation results and application examples with a focus on renewable energy, Adv. Sci. Res., 17, 115–128, https://doi.org/10.5194/asr-17-115-2020, 2020.
Salisbury, S.: Cold Weather On-Road Testing of a 2015 Nissan Leaf, Idaho National Laboratory, Technical Report June, 2016.
Schraff, C., Reich, H., Rhodin, A., Schomburg, A., Stephan, K., Periáñez, A., and Potthast, R.: Kilometre-scale ensemble data assimilation for the COSMO model (KENDA), Q. J. Roy. Meteor. Soc., 142, 1453–1472, https://doi.org/10.1002/qj.2748, 2016.
Wahl, S., Bollmeyer, C., Crewell, S., Figura, C., Friederichs, P., Hense, A., Keller, J. D., and Ohlwein, C.: A novel convective-scale regional reanalysis COSMO-REA2: Improving the representation of precipitation, Meteorol. Z., 26, 345–361, https://doi.org/10.1127/metz/2017/0824, 2017.
Zängl, G., Reinert, D., Rípodas, P., and Baldauf, M.: The ICON (ICOsahedral Non-hydrostatic) modelling framework of DWD and MPI-M: Description of the non-hydrostatic dynamical core, Q. J. Roy. Meteor. Soc., 141, 563–579, https://doi.org/10.1002/qj.2378, 2015.