Climate change projections of maximum temperature in the pre-monsoon season in Bangladesh using statistical downscaling of global climate models

The climate of Bangladesh is very likely to be influenced by global climate change. To quantify the influence on the climate of Bangladesh, Global Climate Models were downscaled statistically to produce future climate projections of maximum temperature during the pre-monsoon season (March–May) for the 21st century for Bangladesh. The future climate projections are generated based on three emission scenarios (RCP2.6, RCP4.5 and RCP8.5) provided by the fifth Coupled Model Intercomparison Project. The downscaling process is undertaken by relating the large-scale seasonal mean temperature, taken from the ERA5 reanalysis data set, to the leading principal components of the observed maximum temperature at stations under Bangladesh Meteorological Department in Bangladesh, and applying the relationship to the GCM ensemble. The in-situ temperature data has only recently been digitised, and this is the first time they have been used in statistical downscaling of local climate projections for Bangladesh. This analysis also provides an evaluation of the local data, and the local temperatures in Bangladesh show a close match with the ERA5 reanalysis. Compared to the reference period of 1981–2010, the projected maximum pre-monsoon temperature in Bangladesh indicate an increase by 0.7/0.7/0.7 C in the near future (2021–2050) and 2.2/1.2/0.8 C in the far future (2071–2100) assuming the RCP8.5/RCP4.5/RCP2.6 scenario, respectively.


Background
Bangladesh is highly vulnerable to climate change due to its flat and low-lying delta landscape, dependence upon agriculture, high population density and poverty. Natural disasters and extreme weather events, such as floods and heat waves, occur frequently and are often detrimental to lives and livelihoods. Reliable information about how regional and local climates have changed in the past and may change in the future is important for managing climate change risks.
Global climate models (GCMs) can provide future projections of climate variables (Stocker et al., 2013) but typically at a coarse resolution that is of limited use in impact studies assessing the local response to global climate change. These projections can nevertheless provide some important information about large-scale climate change. A common way of approaching the question of local climate change is through so-called downscaling where information about large-scale climate change is combined with information about how the local climate depends on the large-scale situation and local geographical factors . One implicit assumption of downscaling is that the climate models have a minimum skillful scale that is different from the spatial resolution (Takayabu and Hibino, 2016), and that the local details have a predictable dependency on the large-scale features that the model is able to reproduce. Some explanations for the models' minimum skillful scales include their simplicity and idealistic representation of the world, their use of discrete numbers and imperfect numerical algorithms for solving mathematical operations, a lack of small-scale surface details, the presence of unresolved small-scale processes, and an imperfect model design due to our limited understanding of the climate system. There are two main approaches to downscaling GCM results to higher spatial resolutions: dynamical downscaling using regional climate models (RCM) (Rummukainen, 2010) and empirical-statistical downscaling (ESD) in which a statistical connection is established between the large-scale climate and local response (Hewitson et al., 2014). Dynamical downscaling is based on physical-dynamical relationships and provides a comprehensive picture of the climate, which makes it well suited for climate process studies (Giorgi and Gutowski, 2015;Kjellström et al., 2007). One main disadvantage with RCMs is the high computational demand which tends to limit the downscaling to one or a small set of GCM simulations (Gutman et al., 2012), but there are also other caveats such as mismatch between the driving GCM and the RCM (von Storch et al., 2000) and the need for bias-adjustment (Gudmundsson et al., 2012;Maraun, 2013). ESD on the other hand is computationally cheap and can easily be applied to a large GCM ensemble, but requires long observational time series and typically downscales each climate variable separately (Maraun and Widmann, 2018;Benestad et al., 2009). The assumption of a stationary statistical relationship is essential to obtain trustworthy projections for future periods in ESD, but this assumption is also implicitly made for GCMs and RCMs which rely on statistical parameterization schemes to represent some of the many complex physical processes that govern the climate.
Various methods can be used in ESD, such as analogue methods and circulation typing, regression analysis, or neural network methods (Wilby et al., 2002;Maraun et al., 2015). The statistical model is typically calibrated on historical observations and then applied to GCM results to generate output for future climate scenarios to study the impact of climate change on a regional scale (Wilby et al., 2004). There are two different approaches in ESD: (1) training the statistical model on each time step (e.g., daily weather charts and daily temperature) to downscale the local "weather"; (2) aggregating statistical aspects such as the mean µ and standard deviation σ over a season to downscale the parameters of probability density functions describing the local "climate" . Bias adjustments such as quantile mapping are often used to correct for systematic biases in global or regional climate model simulations, and can be seen as a simple and non-parametric statistical downscaling method (Gudmundsson et al., 2012;Gutjahr and Heinemann, 2013) which attempts to adjust the distribution of historical model data such that it matches the distribution of the observations, hence, it does not require a prior knowledge of the theoretical distribution of the weather variables. Large-scale features can be represented in different ways in ESD, either based on the nearest grid-box values from reanalyses and GCMs (e.g., Liu et al., 2016), or by an index that represents some spatially aggregated quantity or a statistic describing the spatio-temporal patterns in a larger region (e.g., Benestad, 2011). Common Empirical Orthogonal Function (common EOF) analysis applied to GCM and reanalysis data can be used to represent the large-scale predictors (Flury, 1988;Barnett, 1999;Benestad, 2001Benestad, , 2011. This means that EOF analysis is applied to an object combining the GCM and reanalysis data, which decomposes the combined data into a common set of spatial patterns, eigenvalues representing the relative importance of each pattern, and principal components (PCs) describing the temporal variations associated with the spatial patterns for the GCM and reanalysis data separately (see schematic in Fig. 1). In ESD, the part of the PCs that represents the reanalysis is used to tune the statistical downscaling model which is subsequently applied to the part of the PCs that represents the GCM data to obtain future projections. The method ensures that the exact same large-scale spatial structures are identified in both data sets. The common EOFs can also facilitate an evaluation of the GCMs' ability to reproduce the spatio-temporal covariance structure found in the reanalysis through a comparison between the part of the PCs that represents the reanalysis and GCM, respectively . Usually, one set of common EOFs is estimated for each combination of reanalysis and GCM simulation.
Bangladesh has been the subject of some statistical downscaling studies. Rahaman et al. (2015) found that a statistical downscaling model (SDSM) showed good agreement between observed and simulated maximum and minimum temperatures, indicating that the method can be used to predict future scenarios with some confidence. For precipitation, on the other hand, their results were not in such a good agreement. Nury and Alam (2013) downscaled temperature and precipitation for Northeastern Bangladesh from the HADCM3 model, using a similar method and found reasonable agreement between observed and downscaled data. Shourav et al. (2016) used SDSM to downscale future climate projections over the city of Dhaka, Bangladesh. Their study indicated that climate change will cause continuous increases in rainfall, temperature, and related extreme events. Pour et al. (2018), using a Support Vector Machine (SVM) based Model Output Statistics (MOS) approach (i.e., relating GCM model outputs to observational data, see Glahn andLowry, 1972 or Maraun andWidmann, 2018), downscaled 4-GCM simulations that were selected to cover the whole range of projections. Considering four different emission scenarios, they reported an increase in annual mean precipitation in almost all parts of Bangladesh for all scenarios, but a stronger increase in the western part of Bangladesh. Alamgir et al. (2019) also used an SVM method to establish downscaling models of minimum and maximum temperature in Bangladesh which were based on 8 GCM simulations. The downscaled projections showed a significant increase in temperature during the 21st century, with the minimum temperature increasing in winter (December-February) and the Figure 1. Schematic representation of the common Empirical Orthogonal Function (common EOF) analysis. The figure shows how EOF analysis is applied to a mixed object containing GCM and reanalysis data joined along the time axis, deconstructing the data into a set of n spatial patterns, eigenvalues representing the relative strength of the patterns, and principal components representing the temporal variations associated with the patterns. maximum temperature increasing in the pre-monsoon summer season (March-May). Khan et al. (2020) studying an ensemble of 11 RCM simulations projected a shift in the spatial patterns of drought in the pre-monsoon period in the coming century. All these past studies have involved a small number of GCM simulations, which can be problematic because small ensembles do not give a representative picture due to the presence of random noise (Mezghani et al., 2019).
There are examples of studies using GCMs or RCMs to evaluate the local response to global climate change in Bangladesh, but in general the spatial resolution has been too coarse for many impact studies. Using 11 GCMs, OECD (2003) estimated that by 2100, the average temperature in Bangladesh would increase by 2.4 • C following the SRES B1 scenario (Nakicenovic et al., 2000). Rahman et al. (2012) applied the RegCM3 to a single GCM simulation and also detected an overall rise in temperature while changes in precipitation were dependent upon season and region. Mishra (2015) found that RCMs show a larger uncertainty of increasing (1-3.6 • C) temperature in the CORDEX South Asia historical experiments than that of the observations in the Himalayan water towers like Indus, Ganges and Brahmaputra river basins. This evaluation also shows that the RCMs demonstrate large cold bias (6-8) • C and are not able to repeat the observed warming in the Himalayan water towers. The downscaled seasonal mean temperature in this multi-RCM ensemble was found to have relatively larger cold bias than their driving CMIP5 over the hilly sub-regions within the Hindu Kush Himalayan region (Sanjay et al., 2017).
It is essential to evaluate the downscaled results in order to have confidence in the downscaling methods. The European project VALUE established a framework for validating ESD-based downscaling and provided recommendations for techniques such as cross-validation (Maraun et al., 2015), but it did not consider a more comprehensive approach to vali-dating the results of the ESD results when applied to either a GCM (Benestad, 2001) or an ensemble of GCMs. The traditional approach to validate has been to apply the downscaling methods to some reanalysis to test the ability to reproduce the local temperature or precipitation when given a "perfect" large-scale predictor. However, when using downscaling to provide local climate projections, it is also important to test the downscaling when used with GCMs that also have biases. One advantage of ESD is that it is computationally cheap which makes it possible to downscale large ensembles of GCMs for extensive time intervals. It is then possible to validate the downscaling of the GCMs for the past based on how well it reproduces the past trends and interannual variability through various statistical tests .
The main purpose of this study is to reassess changes in the maximum temperatures in the pre-monsoon season (March-May) over Bangladesh through statistical downscaling of GCMs assuming three representative concentration pathways (RCP) which describe future greenhouse gas emissions -RCP2.6, RCP4.5 and RCP8.5. The statistical downscaling was applied to all GCM simulations of maximum temperature from the fifth phase of the Coupled Model Intercomparison Project (CMIP5) that were available from the KNMI Climate Explorer (64, 105 and 77 simulations for RCP2.6, RCP4.5 and RCP8.5, respectively). The ESD model was developed using a method developed by , in which stepwise multiple linear regression was used to establish a statistical relationship between the principle components of the maximum temperature observations at stations in Bangladesh and the large-scale temperature as represented by reanalyses and GCM data. One advantage of this method is that it requires only seasonal mean data, needs less computational resources than dynamic downscaling, and thus can be applied to many scenarios, GCM simulations and long time intervals rather than the brief time slices. The prin- ciple component analysis (PCA) of the station data also emphasises the large-scale temperature variability and reduces local noise, which is beneficial for statistical downscaling purposes (Benestad et al., 2015a). The outcomes of the study will assist in climate change impact evaluation and adaptation studies in Bangladesh.

Study region
In this study, we focus on the daily maximum temperature of the pre-monsoon season (March-May) which corresponds to the hottest season in Bangladesh exhibiting frequent heat waves with temperatures above 36 • C (36-38 • C are mild, 38-40 • C moderate, and above 40 • C are severe heat waves). Station based observations have shown that the average daily maximum temperature for Bangladesh in the pre-monsoon season varies between 23-30 • C with April and May being the hottest months (Khatun et al., 2016; see also our Figs. 2 and 3). The highest maximum temperature ranging from 36-40 • C has been attained in the northwestern and southwestern districts. Figure 2 displays the mean seasonal cycle of the maximum temperature at stations in Bangladesh. The western interior part of the country displays a strong seasonal variability and the highest maximum temperature in the premonsoon season.
Due to the daily intense heating of the land surface over northwestern Bangladesh, low pressure systems tend to develop over Bihar, West Bengal of India and the adjoining northwestern part of Bangladesh. In the afternoons, moisture from the Bay of Bengal occasionally incurs to the low pressure system resulting in the formation of thunder clouds and severe thunderstorms. These severe thunderstorms are known as Nor'westers ("Kalbaishakhi" in Bengali) and they occur frequently at many places across Bangladesh in the pre-monsoon season. The Nor'westers are often accompanied by destructive squalls, thunder and heavy rainfall with hail. Flash floods typically occur in the northeastern part of Bangladesh due to heavy rainfall associated with severe thunderstorms, especially over northeastern parts of Bangladesh and adjoining northeastern states of India. Nevertheless, only 19 % of the total annual rainfall occurs in the pre-monsoon season.
The pre-monsoon season is also characterized by cyclogenesis in the Bay of Bengal. Some of the low pressure systems formed over the Bay of Bengal intensify into depressions which sometimes turn into storms and move initially north-westwards and subsequently recurve to the northeast towards the coast of Bangladesh and Myanmar (Khatun et al., 2016). Some of these storms become very severe at landfall along the Bangladeshi coast. They are occasionally associated with storm surges and may cause high casualties and damages.

Data description
Local observations of the daily maximum temperature from stations in Bangladesh were collected from the Bangladesh Meteorological Department (BMD). Only stations with at least 30 years of valid data in the calibration period 1981-2019 were retained. Further information about the observational stations are found in Table S1 of the Supplement.
The ERA5 data set (Hersbach et al., 2020;C3S, 2017), a global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), was used as predictor data to describe the pre-monsoon seasonal mean temperature over the domain 80-100 • E/15-45 • N.
Mean temperature GCM data from the CMIP5 ensemble were downloaded from the KNMI Climate Explorer (https: //climexp.knmi.nl/start.cgi, last access: 21 January 2021) which provides monthly data regridded to a 2.5 • resolution for the time period of 1900-2100. Three future pathways were considered: the high emission scenario RCP8.5 (Riahi et al., 2007), the medium emission scenario RCP4.5 in which the radiative forcing stabilizes shortly after 2100 (Clarke et al., 2007), and the more optimistic peak-and-decline scenario RCP2.6 (Van Vuuren et al., 2007). It should be noted that the ensemble size and climate models included differ between RCPs, as shown in Table S2.

Empirical-statistical downscaling method
The empirical-statistical downscaling approach used in this study involved deriving statistical relationships between the mean daily maximum temperature over the pre-monsoon season from station based observations in the period 1981-2019 and the large-scale mean temperature patterns as represented by the common EOFs of the ERA5 reanalysis and GCM simulations. In other words, we chose the downscaling "climate" approach here, focusing on the statistical characteristics of the daily maximum temperature.
A principal component analysis (PCA) was applied to the seasonally averaged observational data which decomposed the data into a set of spatial patterns, associated principle components (PC) describing their variation in time, and eigenvalues describing the relative weight of the various patterns. The first two leading PCs, which accounted for approximately 94 % (80 % for PC1 and 14 % for PC2) of the variability of the observed maximum temperature, were used as predictands.
A common EOF analysis, as proposed by Benestad (2001), was applied to the reanalysis and GCM maximum temperature data extracted for the domain 80-100 • E/15-45 • N to represent the large-scale predictors . The GCM data (one simulation at a time, using the full period of available data 1850-2100) was combined with ERA5 reanalysis data for the period 1979-2019 along the time axis. EOF analysis was subsequently applied to the combined data set. This procedure decomposed the data into a set of spatial patterns and eigenvalues that were common to the reanalysis and GCM data, and principal components (PCs) for each pattern representing temporal variations where one part was associated with the reanalysis data and another part associated with the GCM data (Fig. 1). The part of the PCs associated with the reanalysis data were then used to calibrate a statistical model as follows: where Y 1 is the first leading PC of the predictand, X 1 , X 2 , . . . , X 10 are the ten leading PCs of the predictor, and c 0 , c 1 , . . . , c 10 are model coefficients obtained through multiple regression. Similar models were fitted for the second PC of the predictand. The number of predictor patterns was chosen because for all ensemble members, the ten leading common EOF patterns explain almost all (99 %-100 %) of the variance of the GCM simulation and reanalysis data. A stepwise multiple regression was used to estimate model parameters, using predictand and predictor data from the calibration period 1981-2019. Only the part of the predictor PCs that represent the reanalysis were included in the regression. The number of predictors was reduced using the Akaike Information Criterion (AIC), a measure of model quality which takes both model performance and complexity into consideration in order to avoid overfitting (Akaike, 1974). In practise, this means that one or several of the coefficients c 0 , c 1 , . . . , c 10 can be set to zero during model calibration. The statistical model was then applied to the predictor PCs associated with the GCM data to obtain future projections of the predictand PCs. Future estimates of the local maximum temperature could then be constructed from the projections of the predictand PCs combined with the corresponding spatial patterns and eigenvalues.
The procedure of common EOF analysis and stepwise multiple regression was repeated for each GCM simulation, but because the calculations were rather efficient they could nevertheless be applied to a large number of simulations (see Table S2 for a complete list of the included GCM simulations).

Validation
To evaluate the skill of the empirical-statistical models, a five-fold cross validation was applied in which the fitting process was repeated five times (Maraun et al., 2015), each time leaving one fifth of the predictor and predictand data out of the multiple regression (Wilks, 2011). Predictions were then made for the left out period and a cross validation score was calculated as the correlation (Pearson's r) between these independent predictions and the original principal components. The cross validation was performed for every regression model, i.e., for each PC and GCM simulation.
To assess how well the downscaled multi-model ensembles represent the past trends and interannual variability for each station, the observed seasonal mean maximum temperature in the period 1981-2019 was compared with statistical characteristics of the downscaled ensembles. The trend in the period 1981-2019 was calculated for the observed maximum temperature and for each downscaled ensemble member at all stations. To estimate the probability (p trend ) of seeing a maximum temperature trend of the observed strength given that the downscaled multi-model ensemble was a true representation of the distribution, the observed trend in the calibration period was compared with the probability density function (pdf) of a normal distribution (the "pnorm" function in R) with statistical characteristics (mean and standard deviation) given by maximum temperature trends of the downscaled ensemble members in the same period. To assess the representation of the interannual variability, the number of observed values outside of the 90 % confidence interval (CI) of the downscaled ensemble was counted and the probability of the outcome (p CI ) was calculated using a binomial pdf (the "pbinom" function in R). On average, 10 % of values are expected to fall outside of the 90 % CI, and a much higher (lower) number of values outside of this range suggests that the downscaled ensemble may be underestimating (overestimating) the interannual variability. The results of the ensemble validation are summarised in a target plot with p trend on the x axis and p CI on the y axis (see Fig. 6).

Analysis software
The analysis was carried out within the R-environment using the "esd" package (Benestad et al., 2015b) to analyze and visualize the data and perform the statistical downscaling. The "esd" package has a wide range of functionalities, including methods for reading and processing data, generating various infographics, and performing statistical analysis (e.g., calculating empirical orthogonal functions, principal component analysis, and empirical-statistical downscaling) and is suitable for processing results from global climate models.

Past climate change in Bangladesh
Based on the observations, the average daily maximum temperature in Bangladesh in the pre-monsoon season has changed by between −0.04 and +0.67 • C per decade over the period 1981-2019, depending on the station (Fig. 3; Table S1). The average temperature increase over all stations is +0.30 • C per decade. While the average observed maximum temperature during the period was highest in the west (Fig. 3a), the strongest change occurred in the east (Fig. 3b), which means that there has been a decrease in the east-west temperature gradient over the last decades. The warming observed in the central and eastern stations was statistically significant (p < 0.01 based on a student's t-test) while the trends in the west were not (Fig. 3b).

Validation of the statistical downscaling
An evaluation of the statistical downscaling models is shown in Figs. 4 and 5. The cross-validated correlation between the first (PC1) and second (PC2) principal components of the maximum temperature and the corresponding PCs estimated based on empirical-statistical downscaling were 0.90 and 0.61, respectively, when using only the reanalysis as predictor data (both statistically significant with p values lower than 0.01). A separate regression model was fitted for each reanalysis-GCM simulation combination and the cross validation was also repeated for all ensemble mem- bers. For PC1, the cross validation correlation score was between 0.82 and 0.92, in all cases statistically significant with p values < 10 −9 . For PC2, the cross validation ranged between 0.11 and 0.69, and only in 63 % of the ensemble members statistically significant at the 99 % level (i.e., p < 0.01). An attempt was made to downscale PC3 and PC4 too, but the cross-validation indicated that the esd models had little predictive power for these PCs, with lower correlation scores (0.36 and 0.13, respectively when using only the reanalysis as predictor) that were not statistically significant (the 95 percent confidence intervals of the correlation scores spanned negative and positive values). PC3 and PC4 are thus excluded from the future projections of maximum temperature.
As further validation, the statistical characteristics of the downscaled ensembles were compared to the observed maximum temperature as described in Sect. 2.3. The results presented in Fig. 6 suggest a spatial pattern where the downscaled ensemble tends to underestimate both the trend and interannual variability in the south-east, but gives a better representation of the observed statistical characteristics in the central and western part of Bangladesh. Figure 6 displays results only for RCP4.5, but the other emission scenarios (not included here) showed qualitatively similar results. Figure 6 shows an example of downscaled ensembles of climate projections for the station Dhaka for the different emission scenarios (RCP2.6, RCP4.5 and RCP8.5). The results for all stations are summarized in Table 1, which includes the ensemble mean of the change in the maximum temperature of the pre-monsoon season for two periods: the near future (2021-2050) and the far future (2071-2100) relative to the base period 1981-2010. Figure 8 additionally displays the ensemble mean of the projected temperature change at the various stations on a map. The ensemble mean of the downscaled projections indicated an increase in the maximum temperature in Dhaka by +0.8 • C for the near future and +1.0 • C for the far future assuming RCP2.6. For RCP4.5, the estimated temperature rise in Dhaka is +0.9 and +1.6 • C for the near and far future, respectively. The most severe scenario, RCP8.5, suggested an increase of +1.0 • C for the near future which is intensified for the far future reaching up to +3.0 • C. The projected change for Dhaka is slightly higher than the average increase in maximum temperature over all stations in Bangladesh. The estimated ensemble mean warming in the near future is independent of the chosen scenario (+0.7 • C for RCP2.6, RCP4.5, and RCP8.5), but for the far future the different scenarios have a larger impact on the results, which show an ensemble mean warming of +0.8, +1.2 and +2.2 • C for RCP2.6, RCP4.5, and RCP8.5, respectively. The highest projected ensemble mean warming at individual stations for RCP4.5 was +1.0 • C for the near future and +1.4 • C for the far future (both at the station Ishurdi). For the most severe emission scenario, RCP8.5, the strongest projected warming was +1.2 and +3.2 • C for the near and far future, respectively (also at Ishurdi).

Future projections of pre-monsoon maximum temperature
Tables 2-4 display ensemble statistics (5th, 50th and 95th percentiles) of the projected change in maximum temperature relative to the base period 1981-2010 for the emission scenarios RCP2.6, RCP4.5, and RCP8.5, respectively. The ensemble spread can be interpreted as representative of the internal climate variability as well as model differences. The average 5th percentile of projected temperature change over all stations in Bangladesh for the far future (2071-2100) assuming RCP2.6 (Table 2) is +0.1 • C, indicating that 5 % of the included projections show little change or even a temperature decrease compared to the base period . The corresponding 50th and 95th percentiles of temperature change is +0.8 and +1.6 • C, respectively, indicating that 50 % observations are showing changes of temperature of +0.8 • C or more but only 5 % showed a warming of +1.6 • C or higher. For RCP4.5 (Table 3), the 5th, 50th and 95th percentiles of the projected temperature changes averaged over all stations are +0.4, +1.2 and +2.3 • C from the base period to the far future. The corresponding ensemble statistics of maximum temperature change for RCP8.5 de- scribe an even stronger warming, with station average 5th, 50th and 95th percentiles of +0.9, +2.2 and +3.8 • C, respectively (Table 4).

Discussion
The three emission scenarios considered in this study lead to an increase in the expected seasonal mean maximum temperature in the future in Bangladesh. The low-emission scenario RCP2.6 describes a future in which CO 2 emissions remain constant until the early 21st century and become negative by the end of the century. As a result, the radiative forcing reaches a value of around 3.1 W m −2 by mid-century but returns to 2.6 W m −2 by 2100. Based on this scenario, our results suggest that the average pre-monsoon maximum temperature over meteorological sites in Bangladesh is likely to increase in the near future and then increase only slightly till the end of the 21st century. Following the low-to-moderate emission scenario RCP4.5 or the high-emission scenario RCP8.5, the average projected warming in Bangladesh in the near-future is similar to the warming under RCP2.6, but con-tinuing into the far future the warming is considerably higher, reaching around +1.2 and +2.2 • C for RCP4.5 and RCP8.5, respectively. Similarly to the RCP2.6, the intermediate path RCP4.5, is characterised by a peak-and-decline scenario in which emissions are considerably reduced after 2050 leading to a stabilization of radiative forcing toward the end of the 21st century. The continued increase of the projected maximum temperature following both RCP2.6 and RCP4.5 demonstrates the inertia of the climate system and the lag in the response of temperature to greenhouse gas emissions and radiative forcing.
The projected increase in maximum temperature in the already hot pre-monsoon season indicates that there will likely be an increase in the frequency and intensity of heat waves in Bangladesh. In the southwest and northwest regions, the maximum pre-monsoon temperature is already very high (Figs. 2 and 3, Table S1) and severe heat waves are already a common occurrence, and an increase of several degrees as our estimates suggest (see the regions Khulna, Rajshahi and Rangpur in Tables 1-4), could be detrimental to the life and health of people living in these areas. One important motivation behind the ESD method applied in this paper was to make use of the large scales that the models are able to reproduce realistically to say something about local changes. All GCMs have a minimum skillful scale which means that their individual grid-box values are not a good representation of the area they represent in the real world (through design and because computers work with discrete numbers). The use of common EOF analysis makes it possible to identify common spatial patterns in reanalysis and GCM data on a scale that is well represented by climate models.
A potential limitation of the statistical downscaling method presented here is that the predictor-predictand relationship may not be stationary, i.e., the statistical model may not be applicable to future periods. The stationarity assumption is central to empirical-statistical downscaling but difficult to validate, especially without access to long observational records. Here, the local pre-monsoon maximum temperature was downscaled from the mean temperature over the domain 80-100 • E/15-45 • N. There are circumstances that could change the predictor-predictand relationship in a period of climate change. For example, the local temperature response to large-scale temperature patterns could be influ- enced by changes in precipitation patterns, cyclonic activity and the timing of the monsoon. One way to address this could be to incorporate additional predictor variables that describe other aspects of the climate. While observations showed a stronger pre-monsoon warming in the east compared to the west in the past , which reduced the east-west temperature gradient in this period (Fig. 2), there was less of an obvious spatial pattern in the projected maximum temperature (Fig. 7). A simple correlation analysis indicates that there was no statistically significant correlation between the longitude/latitude and the rate of projected warming.
In the current analysis, only the two leading predictand PCs were downscaled. This was done because higher order PCs contained little information about the large-scale climate structure and thus could not be robustly downscaled. An attempt was made to downscale the third and fourth PCs but cross-validation indicated that the statistical models had little to no predictive power. This makes sense as the leading PCs represented coherent temperature anomalies on larger scales which are more likely connected to largescale forcing. Since the first two PCs stand for a majority of the observed maximum temperature variability and the observed trends, this should be enough to represent the relevant variations and trends, assuming sufficiently skillful regression models for PC1 and PC2. PC1 represents 80 % of the variability of the observed premonsoon maximum temperature in Bangladesh and partly contains the trend in the reference period 1981-2019 (Fig. 4). However, the spatial pattern in the observed temperature trend, which shows a stronger warming in eastern and southeastern Bangladesh compared to the western part of the country (Fig. 3), is contained within PC2 (Fig. 5). At the stations with strongest trends, 30 %-60 % of the trend is represented by predictand PC2. The regression model, which was trained on detrended data, slightly underestimated the long term trend in PC1 indicating that there is only a slight non-stationarity problem. This is more pronounced in PC2, which also has a poorer performance and represents a smaller fraction of the covariance. Since the downscaling models of PC2 are not as skillful as the models for PC1 (see cross-validation scores in Sect. 3.2 and Figs. 4c and 5c), the downscaling models fail to reproduce the observed gradient in the maximum temperature trend in the calibration period (Fig. 8). Further validation by inspection of the downscaled ensemble (Fig. 6) suggested that the downscaling reproduced the statistical characteristics of the observed climate well in central and western Bangladesh but in the east and south-east, which is the region with the strongest warming in the past (Fig. 3b), the downscaling tends to underestimate the trend and interannual variability at some stations. A possible explanation why the large-scale temperature is not a perfect predictor of PC2 could be that the observed spatial differences in warming are related to changes in precipitation patterns or other environmental circumstances that are not captured by the chosen predictor (the mean temperature). A preliminary analysis where the pre-monsoon maximum temperature was downscaled with alternative predictor variables from the ERA5 and CMIP5 data sets indicated that the mean Sea Level Pressure (SLP) or precipitation are better predictors than the mean temperature for PC2, but not as good for PC1 (not shown here). The downscaling method could be improved by the inclusion of multiple predictors, using multivariate EOFs and/or allowing different predictor variables for different predictor PCs. If the spatial patterns of pre-monsoon precipitation and drought change in the coming decade, as projected by Khan et al. (2020), a downscaling strategy allowing multiple climate drivers may have a better chance of representing the effect of these changes on the temperature.
To evaluate the influence of the predictor, the downscaling was performed with a variable number of predictor patterns. The cross validation scores indicated that the leading 8-10 predictor patterns should be included to achieve the best downscaling models, but including higher order patterns had    to the near future (2021-2050, shown in a-c) and far future (2071-2100, shown in d-f) assuming emission scenarios RCP2.6 (a, e), RCP4.5 (b, f), and RCP8.5 (c, g). The range of the color scale (0-+2 • C) does not span the full range of projected temperature changes for the far future assuming RCP8.5 (see Tables 1 and 4). little effect on the outcome, both in terms of cross validation, reproducing the present climate, and for the projected temperature change. A closer look at the common EOFs showed that 99 %-100 % of the variability was contained within the leading 8-10 predictor patterns for all GCM-reanalysis combinations, which explains why higher order modes of variability added no skill. These results suggest that the stepwise model selection should be applied to all predictor patterns that hold relevant information (e.g., considering all modes with an explained variance of 1 % or higher).
There is a subtle difference between the way that the principal component analysis of the predictand and the EOF analysis of the predictor is used in the downscaling method. For the predictor, only the eigenvectors representing the temporal variations of the predictor patterns are used in the regression models. This means that the predictor patterns are not ex-plicitly weighted by their relative importance. The stepwise model selection will instead determine which patterns are important, and since including additional parameters in the linear regression costs little time and computational resources, we can include all potentially relevant predictor patterns. For the predictand on the other hand, the downscaled predictand PCs are combined with the corresponding eigenvalues and spatial eigenvectors to obtain temperature projections. The leading PCs are thus given larger weight by its eigenvalue than higher order PCs. Every additional predictand PC included requires another regression for every GCM simulation, which can add significant computation time. Downscaling higher order predictand PCs can therefore be a waste of time and computational resources if they have low eigenvalues and their regression models have little predictive power.