Four centuries of summer temperature changes in Tierra del Fuego: atmospheric drivers and tree-ring 1 reconstruction from the southernmost forests of the world 2

12 Proxy climate records, such as those derived from tree rings, are necessary to extend relatively short instrumental meteorological observations into 13 the past. Tierra del Fuego is the most austral territory with forests in the world, situated close to the Antarctic Peninsula, which makes this region 14 especially interesting for paleoclimatic research. However, high-quality, high-resolution summer temperature reconstruction are lacking in the 15 region. In this study we used 63 tree-ring width chronologies of Nothofagus pumilio and Nothofagus betuloides and partial least squares regression 16 (PLSR) to produce annually resolved December-to-February temperature reconstruction since AD 1600 which explains up to 65% of instrumental 17 temperature variability. We also found that observed summer temperature variability in Tierra del Fuego is primarily driven by the fluctuations of 18 atmospheric pressure systems both in the South Atlantic and South Pacific, while it is insignificantly correlated to major hemispheric modes: ENSO 19 and SAM. This fact makes our reconstruction important for climate modelling experiments, as it represents specific regional variability. Our 20 reconstruction can be used for direct comparison with model outputs to better understand model limitations or to tune a model or contribute to 21 larger scale reconstructions based on paleoclimatic data assimilation. Moreover, we showed that PLSR has improved performance over principal component regression (PCR) in the case of multiple tree-ring predictors. According to these results, PLSR may be a preferable method over PCR for the use in automated tree-ring based reconstruction approaches, akin widely used point-by-point regression. of tree-ring chronologies were tested. We used a manual analogue of stepwise regression, adding and 148 removing predictors in an attempt to find the best set. The earliest part of the reconstruction, which was based on the longest chronology, was derived by the scaling method. The chronology was adjusted to have the same mean and standard deviation as the instrumental data for their common period. that anticyclone east of Patagonia, northern advection. The showed anticyclone to form near the west coast of TdF and moves further to the north-east. It was investigated (from 500 mb geopotential heights 256 composites fields) that this high-pressure centre forms within the large-scale baroclnic wave system over the South Pacific. This process begins 257 of a low-pressure anomaly near the south-east coast of and develops over a period of 21 days throughout the South Here we showed that anomalies TdF the scale

explained variance will increase steadily, while in PCR it will increase irregularly with big steps when adding the next principal component (PC) 130 which explains a significant amount of variance of the dependent variable. At the same time, when using PLSR it is especially important to use 131 cross-validation, as it is strongly subjected to overfitting due to its ability to adjust to the target variable (Geladi and Kowalski 1986;Abdi 2010). 132 In our study we used cross-validation to select the appropriate number of latent variables. This experiment may be considered as a theoretical 133 example of one step of point-by-point regression (PPR), tested with real data for TdF archipelago. 134

Nested reconstruction approach 135
To get advantages of both the number of chronologies and the lengths of some of these, we used a nested reconstruction approach (Meko 1997). In 136 this approach, we provide one reconstruction for the whole period covered with the appropriate data, but parts of the reconstruction are produced 137 using different sets of chronologies. Thus, each part of the reconstruction has different calibration-validation statistics, with the quality usually 138 decreasing back in time. In this way we, on the one hand, provide the maximum reasonable length of the reconstruction, and on the other hand 139 provide the maximum reconstruction skill for each period. Statistics r, R 2 , RE and CE on cross-validation were used to describe the skill of each 140 part of the reconstruction (see Appendix for details). 141 To select the target variable for the reconstruction we were guided by a set of considerations. First, it was previously demonstrated, that air 142 temperature for different summer months is the main driver of tree growth in the region, with higher correlations in December for N. pumilio temperatures has been widely used in multiple climatological and paleoclimatological studies, including modelling. Hence, this target variable was 145 preferred for compatibility with other studies. Third, in our experiments with different target variables we found that mean DJF temperature was 146 one of the best performing targets for our dataset. 147 To select the predictors, different sets of tree-ring chronologies were tested. We used a manual analogue of stepwise regression, adding and 148 removing predictors in an attempt to find the best set. The earliest part of the reconstruction, which was based on the longest chronology, was 149 derived by the scaling method. The chronology was adjusted to have the same mean and standard deviation as the instrumental data for their 150 common period. 151 3 Results 152

Observed summer temperature variations in Tierra del Fuego and its drivers 153
Although climate in southern South America is strongly influenced by large-scale atmospheric systems in the Southern Hemisphere (Garreaud et 154 al. 2009(Garreaud et 154 al. , 2013, we found that observed summer temperature in TdF was weakly connected to major Southern Hemisphere indices. There were no 155 significant correlations with either summer SAM (r=-0.10, p=0.3, 1901-200; r=0.01, p=0.94 1957-2016), nor with summer ENSO 3.4 (r=0.08, 156 p=0. 39,. Hence, we expected regional variability of atmospheric circulation to be responsible for summer temperature oscillations in 157 TdF. 158 Circulation pattern for positive temperature anomalies larger than 1 °C (Fig. 1a,b, red isolines) shows dominance of an active low-pressure centre 159 to the west of Antarctic Peninsula (mostly pronounced in December) and a high-pressure ridge near the east coast of Patagonia (which is stretched 160 to Antarctic Peninsula in February). This system enhances the meridional circulation and promotes the advection of warm northern air masses into 161 TdF. For the years with negative summer temperature anomalies with absolute values greater than 1 °C (Fig. 1a,b, blue isolines), the atmospheric 162 circulation pattern shows dominance of a cyclonic activity near the Weddell Sea, which probably strengthens the westerlies and produces the south-163 west cold air advection into TdF. These two patterns of pressure systems for positive and negative summer temperatures in Southern South America 164 were also discussed in Alessandro (2008). 165 According to the average wind speed data for the 1901-2015 period, during summer (DJF) climate in TdF is affected by the westerly winds ( Fig.  166 1a-c). However, there is a difference in geopotential heights between years with positive and negative temperature anomalies (Fig. 1c). These 167 anomalies represent an anticyclone in the Southern Atlantic with the centre between Malvinas and South Georgia Island. Another anomaly forms 168 a cyclone in the Southern Pacific with the centre migrating from north to south at the longitude of approximately 100°W. 169 These processes are not actual synoptic processes but rather an average of pressure anomalies for many years. However, this atmospheric dipole 170 explains rather well the variability of summer temperatures in TdF. Correlations of regional monthly temperatures with 850 mb geopotential heights 171 reach r=0.73 (p<0.001) in the centre of this anticyclone and have significant negative values (up to r=-0.4, p<0.001) in the cyclone (Fig. 2c). Wind 172 vectors show that during warm temperature anomalies the intensity and frequency of northern winds in TdF increase (Fig. 2a,c). The described 8 The geopotential height and wind anomaly composite fields for temperature anomalies greater than 1°C at TdF (Fig. 2a) confirm that in December 176 and January, the pressure centre that drives positive temperature anomalies is a part of a quasi-stationary wave train with a quasi-barotropic structure, 177 extending from Australia across the South Pacific, and ending with a cyclonic circulation centre in the South Atlantic Convergence Zone (SACZ). 178 In February, the wave train appears to extend less westward, spreading over the adjacencies of southern South America and the Antarctic Peninsula. 179 Conversely, negative temperature anomalies in TdF are associated with a circulation scheme with anomalies of inverse sign to those of the positive 180 temperature anomalies (Fig. 2b). An anomalous low-pressure centre is located over the South Atlantic affecting the TdF region with enhanced 181 southern circulation. This centre is also part of a wave train that extends along the Pacific ending in the SACZ, in this case in a more zonal form, 182 and extended towards the Indian Ocean for all months. 183 Based on these results, we argue that regional oscillations of large-scale pressure systems forming on the way of Southern Hemisphere subpolar 184 westerly winds affect summer temperature variability in TdF. The sign and intensity of summer temperature anomalies depend on the existence 185 and intensity of the atmospheric dipole and the storm tracks in the South Atlantic and South Pacific. 186

Comparison of PLSR and PCR 187
We first made a series of experiments to address the methods' performance with increasing number of predictors and variable number of latent 188 variables (Fig. 3). For the PCR method, the maximum R 2 on cross-validation was steadily increasing with the number of predictors used. It was 189 also increasing with the number of PCs, reaching maximum values with 10 to 25 PCs (Fig. 3a). The number of PCs for the best R 2 may be hard to 190 define, e.g. when using more than 50 predictors we got comparable values of R 2 for 15 to 25 PCs. For the PLSR method the maximum R 2 on cross-191 validation was also steadily increasing with the number of predictors used. In contrast to PCR, it consistently reaches maximum R 2 values with 192 much fewer (3-5) latent variables. Larger numbers of latent variables did not give an improvement of R 2 . On the contrary, the prediction skill drops 193 rapidly, especially when the number of predictors is big. This may be explained by an ability of PLSR to concentrate the signal in the first latent 194 variables, while the noise is left in the remaining ones. Figure 3c is a cross-section of the results presented in Figures 3a and 3b for the maximum 195 number of predictors (N=63). Here we can make several additional observations. First, the PCR method shows negative R 2 for the first several PCs. 196 It reaffirms that the main modes of the common variability in our tree-ring dataset are insufficient for an adequate representation of the target 197 variable, and that the additional lower-order components that contain valuable information are required. Second, the PCR method shows similar 198 explained variance for the number of PCs from 15 to 25, while the spread of the R 2 values acquired for various experiments with randomized 199 samples is increasing with the increased number of PCs. Lower spread points to an advantage of a reduced number of selected PCs. Finally, PLSR 200 consistently outperforms PCR in our experiment ( Figure 3d). The maximum average R 2 was higher for the PLSR method for each number of 201 predictors. 202 Although Figure 3 shows that the more predictors we have, the higher values of R 2 we get, it does not mean that we cannot get better performance 203 with fewer predictors. Our experiments showed that rigorous selection of predictors by a step-by-step procedure akin stepwise regression helps to 204 improve the results. In Figure 4 we show the results for 10 best predictors which were finally used for the most skilful part of the reconstruction 205 (AD 1803(AD -2002. In contrast to PLSR, where we observe that a large portion of target variance was explained by the first five latent variables, for 206 PCR we observe that the 3rd, 6th and 10th PCs added more information than the others. This is another confirmation that the important information 207 about the climate variability may be hidden in the principal components of lower order (those that explain less common variance of the dataset of 208 predictors). In this case, both PCR and PLSR reached equal performance with the maximum number of latent variables. 209

Summer temperature reconstruction in Tierra del Fuego since CE 1600 210
The final reconstruction ( Fig. 5) was derived for the target variable of summer (DJF) air temperatures using PLSR and a nested reconstruction 211 approach. In Figure 6, all individual reconstructions (nests) are shown, and their performances are described in Table 1 Figure 6 gives an opportunity to explore the similarities and differences of the individual reconstructions (nests). Even those parts that include 224 chronologies with EPS values lower than 0.85, and therefore were excluded from the final reconstruction, are in a good agreement, especially 225 throughout the 19th century. This fact increases our confidence in the reconstruction for this period. The spread among the reconstructions is higher 226 during the transition from the 18th to 19th century, with the reconstructions of higher explained variance demonstrating lower reconstructed values. 227 However, these parts are of less confidence because of lower sample replication. At the same time, the longest individual reconstruction (and the 228 less skilful one) shows rather good agreement with the other reconstructions for the period 1700-1980, especially at lower frequencies. Altogether, 229 the intercomparison of the individual reconstructions reaffirms credibility of the nested reconstruction approach used in this study. 230  Atlantic subtropical anticyclone occurred, and the westerlies zone moved south. These processes probably caused divergent trends in winter and 247 summer temperatures in Southern Patagonia after the year 1980, which are clear in the observed data (Fig. 7a,b). Negative temperature trend in 248 TdF summer temperature since early 1980s also diverges from positive trend in South America and South Hemisphere summer temperature 249 throughout the same period. Our reconstruction shows two long-period spectral peaks with cycles of 67 and 101 years long (Fourier analysis, 250 significant at p<0.05 against red-noise spectrum), which are evident from the smoothed reconstruction (Fig. 5a). This cyclicity persists for the whole 251 period of the reconstruction and is responsible for the observed summer temperature decline since 1970s. Hence, the recent climatic shift in the 252 region may be a recurrent feature which persists throughout at least four centuries. anticyclone begins to form near the west coast of TdF and moves further to the north-east. It was investigated (from 500 mb geopotential heights 256 composites fields) that this high-pressure centre forms within the large-scale baroclnic wave system over the South Pacific. This process begins 257 from the formation of a low-pressure anomaly near the south-east coast of Australia and develops over a period of 21 days throughout the South 258 Garreaud et al. (2013) highlighted that the stronger westerlies cause cooler summers at the southern part of South America. Besides, Alessandro 261 (2008) showed that cold summer T anomalies occur in TdF when the trough at 500 mb level is situated over the Antarctic Peninsula or moves to 262 the east and is centred over the Weddell Sea, producing the south-western cold advection. Thus, the positive summer T anomalies in TdF apparently 263 are consequences of high-pressure anomaly to the east of Patagonia, which corresponds to a ridge in the region at high altitudes. As T anomalies in 264 TdF are connected with the variations in storm tracks in South Pacific and South Atlantic, hence negative T anomalies occur when the storm track 265 moves to the north, enhancing the cyclonic activity to the south-east from TdF. Stability of climate-to-proxy relationships is another concern when reconstructing past climates, and especially when complex dependencies 275 including many predictors are considered. In our case, similarity of different reconstructions based on different sets of chronologies (Fig. 6), 276 including low-frequency variability, is an indirect confirmation of the stability of discovered relationships. Another confirmation is that all the 277 reconstructions passed cross-validation tests and showed positive R 2 values on independent data (Table 1). 278 Concerning the methods for the extraction of climatic information from tree rings, here we considered two of them -PCR and PLSR. The difference 279 between PLSR and PCR is that the former method constructs components to maximize the explained variance in the target variable, while the latter 280 maximizes it in the matrix of predictors. Such a modeling approach implemented in PLSR seems reasonable in case of dendroclimatic 281 reconstructions based on multiple chronologies, since we try to extract information related to the target climatic variable from every chronology, 282 and not to extract common signal first, as is happens when using the PCR method. That is why it usually needs less latent variables to reach the 283 same amount of explained variance. Less latent variables are better for interpretation, but we can also consider this method as a better filter that 284 separates the signal from the noise. For the PLSR, the signal is left in the first few latent variables, and the noise is omitted with the others, which 285 is confirmed by higher cross-validation statistics in comparison with PCR. For PLSR, the signal here is not considered as common variance of the As we have seen in the results (Section 3.2), when using PCR, some important information may be contained not in the first principal components. 288 Conversely, for PLSR explained variance increases rapidly for the first components, and then the increase decelerates. Such a property of PLSR 289 makes it easy to fix the necessary number of components to cut, which is especially important for automated working with large datasets, as 290 commonly happens for spatial reconstructions using the PPR method. Our results show perspective for substitution of PCR for PLSR in an 291 automated setting like PPR, because PLSR showed higher cross-validation statistics and also made it easier to define the number of latent variables 292 a priori. However, our experiments were performed for a specific region and dataset, and additional tests are required to confirm this finding. 293 Moreover, it may be difficult to define a priori the chronologies which contain important information for the final model, as they might not correlate 294 strongly with the target variable. That is why different screening procedures aimed at excluding some of the chronologies may lead to the loss of 295 important climatic signal. In an automated setting, when rigorous selection of the chronologies is impossible, PLSR may show improvement over 296 PCR, using less latent variables at the same time. 297 In our case PLSR better than PCR achieves the aim of extracting as much useful information as possible from a set of tree-ring predictors. It might 298 be that some of the trees growing in specific conditions provide some useful information that is different from the information from other locations. 299 That is why PCR which extracts common information from tree-ring predictors may leave this important information in the components of lower 300 order, consequently losing it when those components are omitted. 301

Comparison with other regional temperature reconstructions 302
In this work we updated and incorporated some of the existing tree-ring chronologies in the area. Thus, some similarities among the datasets are 303 unavoidable. We compared however our summer temperature reconstruction for TdF with other temperature reconstructions for TdF and 304 Southern Patagonia. Such a comparison is necessary to place the new reconstruction into the paleoclimatic context in the region and to define its , hereafter referred to as B89, N11, V03 and A02 respectively (Fig. 7, Table 2). All but N11 are based on tree-ring 309 data, while N11 is a multi-proxy reconstruction. All reconstructions except B89 are produced for different regions, while B89 tree-ring data are 310 acquired from the same region as our reconstruction, and many sites are the same. All reconstructions except N11 have different target seasons 311 compared to our reconstruction. To explore the differences in target variables, the different temperature targets for different regions are plotted 312 during the instrumental period (Fig. 7). Some of the differences between the reconstructions may be due to different target seasons or regions, 313 however certain similarities of the reconstructions are also obvious. These similarities indicate the most prominent and reliable paleoclimatic 314 shifts in the region. One of the most obvious similarities is the pronounced cold period lasting almost a decade in the 1850s. Other common 315 features include relatively warm periods in 1760s-90s (except B89), 1820s-30s, and 1910s. 316 317 To assess quantitative measure of similarity for the compared reconstructions, we calculated correlation coefficients between these 321 reconstructions for the common period 1830-1983 (Table 2). Our reconstruction has the strongest correlation with A02 (r = 0.38). To compare 322 the reconstructions at higher frequencies and to eliminate the effect of long-term trends, which may arise from data treatment, we also considered 323 detrended (subtracted linear trend) reconstructions A02 and V03. Correlations with detrended reconstructions increase, reaching r = 0.56 and 324 r = 0.41 respectively. We consider detrended data because some reconstructions based on tree-ring data may have spurious trends connected to 325 detrending procedures and sampling biases (Briffa and Melvin 2011). For example, V03 reconstruction may contain a trend arising from 326 'Modern-sample' bias (Briffa and Melvin 2011, look at their Fig. 5.6) due to application of the Regional Curve Standardization (RCS) method to 327 a sample consisting of living trees without inclusion of any subfossil trees. However, the question of actual low-frequency temperature variations, 328 including trends for the last 350-400 years, remains open, as our reconstruction is limited in its ability to reproduce climatic trends on the 329 timescales approaching the mean age of the trees due to the segment length curse (Cook et al. 1995) and the individual detrending approach we 330 used to standardize the tree-ring series. At the same time, reconstructed temperature variations at periods up to 150-200 years long should be 331 reliable due to mean length of tree-ring series used for the reconstruction.
The reconstructions N11, V03 and A02 showed high values of pairwise correlation coefficients (ranging from 0.47 to 0.52, Table 2) first due to 333 common trends (correlation coefficients for detrended reconstructions are lower, ranging from 0.27 to 0.31) and second because of the presence 334 of common predictors, since N11 includes data used for V03 and A02. 335 The differences between our and the other reconstructions (N11 and V03) are especially evident for the earlier part, particularly for the period 336 between 1750 and 1850. Our reconstruction showed opposite medium-frequency variations (Fig. 8), and also negative or zero correlations (r = -337 0.22 and r = 0.02 for N11 and V03 respectively). Exceptionally high reconstructed values that are represented in our reconstruction in the 1770s 338 and 1791 are not present in the other considered reconstructions. These discrepancies might be due to regional differences, as N11 and V03 339 represent continental part of the Southern Patagonia. Hence, our new reconstruction provides new paleoclimatic evidence for TdF which was not 340 previously available from other reconstructions. 341 The change in variance of our reconstruction that is evident before the year 1765 is the direct consequence of the regression-based methods and 342 difference of the variance explained in the instrumental period (Table 1). The latter part (after 1765) is much better in terms of the variance 343 explained. The earliest part (before 1750) is based on only one chronology of N. betuloides (LRB) with more than a double drop of explained 344 variance compared to the reconstruction segment after 1765. Still the earlier part has some skill, and we retain it, keeping in mind its difference 345 from the latter part. For the consistency of different parts of the reconstruction in terms of variance, for the earliest part before 1750 we used the 346 rescaling method. This technique is usually used to inflate variance that is partially lost due to application of regression-based methods (Lee et al. South Atlantic and South Pacific near the coast of Tierra del Fuego, while it is insignificantly correlated to major hemispheric modes such 356 as ENSO and SAM. This fact makes our reconstruction important for climate modelling experiments, as it is more relevant for specific regional variability. Our reconstruction can be used for direct comparison with model outputs to better understand model limitations or to 358 tune a model or contribute to larger scale reconstructions based on paleoclimatic data assimilation. At the same time, the pressure system 359 that drives summer temperature variability in TdF is a part of quasi-stationary wave train with a quasi-barotropic structure extending across 360 the South Pacific. 361 • PLSR showed improved performance over PCR in the case of multiple tree-ring predictors without pre-screening. According to these 362 results, PLSR may be a preferable method over PCR for the use in automated tree-ring based reconstruction approaches, e.g. widely used 363 point-by-point regression. However, this conclusion should be additionally tested on multiple datasets from other environments. 364 • Due to its location in a remote and poorly studied region, extended length, widespread target variable, high explained variance, and 365 described relationship of the target variable with the regional atmospheric processes, the new reconstruction is a unique and especially 366 valuable source for the paleoclimatic community, including climate modellers. 367

Appendix. Cross-validation design 381
We used the following design of cross-validation to achieve a trade-off between the sample size used for model training and the size of the sample 382 used for an independent validation. 383 To calculate the results for the Figure 3