Comparison of sub-series with different lengths using Şen-innovative trend analysis

Climate change causes trends in hydro-meteorological series. Traditional trend analysis methods such as Mann-Kendall and Spearman's Rho are sensitive to serial dependence and cannot detect non-monotonic trends. The Şen innovative trend analysis method was introduced into the literature to overcome these restrictions. It does not require restrictive assumptions such as serial independence and normal distribution, and it examines a given time series by dividing it into two equal sub-series. The Şen multiple innovative trend analysis methodology was developed to detect partial trends on different sub-series, again with equal lengths. Climate change strongly affects hydro-meteorological parameters today compared to the last twenty or thirty years and produces asymmetrical trend change points in hydro-meteorological time series. Because of these asymmetric trend change points, it may be necessary to analyze sub-series with different lengths in order to use all measured data. In this study, the Şen innovative trend analysis method is revised to satisfy these requirements (ITA_DL). Compared with the traditional Mann-Kendall (MK) and Şen innovative trend analysis (Şen_ITA) methods, the new approach gives successful and consistent results. The ITA_DL gives four monotonic trends in the May, July, September, and October rainfall series of Oxford, whereas the MK gives three monotonic trends in the May, July, and December series and cannot detect trends in the September and October series. In the ITA_DL visual inspection, the December rainfall series does not show an overall or partial trend. The ITA_DL trend results are consistent with the Şen_ITA except for the September rainfall series, although the trend slope magnitudes differ.


Introduction
With increasing levels of greenhouse gases, the atmosphere has warmed more or lost its essential ability to cool down. Many studies investigate increasing or decreasing trends in hydro-meteorological events. The most used classical methods are Mann-Kendall (Kendall 1975; Mann 1945), Spearman's Rho (Spearman 1987), and the Sen slope estimator (Sen 1968). These methods have some restrictive assumptions, such as Gaussian distribution and serial independence, which lead to type-1 and type-2 errors in trend calculations (Alashan 2020a; Cox and Stuart 1955; Wang et al. 2015; Yue et al. 2002; Yue and Wang 2004). Although the MK requires serial independence and normality, hydro-meteorological time series frequently have serial correlation (Yue et al. 2003). Prewhitening, over-whitening, and variance correction methods are used to remove serial correlation from a given time series (Hamed and Ramachandra Rao 1998; Kulkarni and Storch 1995; Şen 2017). Prewhitening processes that remove a portion of the trend together with the serial correlation may reduce the power of the MK on any given time series (Bayazit and Önöz 2007; Douglas et al. 2000). Şen (2012) introduced a new trend calculation method, the innovative trend analysis (Şen_ITA), to relax these restrictive assumptions. The method has a high visual capability and can identify non-monotonic trends in the low, medium, and high values of a given time series. Although the method is new, it has been used to determine trends in hydro-meteorological data such as temperature (Pandey et al. 2021; Sonali and Nagesh Kumar 2013); rainfall (Fanta 2022; Marak et al. 2020; Oruc 2021; Phuong et al. 2021; Şan et al. 2021; Wang et al. 2020); precipitation (Alifujiang et al. 2020); aerosol optical depth (Kotrike et al. 2021); sea surface temperature (Şişman 2021); evapotranspiration (Ma et al. 2018); flow (Yilmaz and Tosunoglu 2019); and water clarity (He et al. 2022).
Climate change has been a current topic for the last two to three decades, whereas hydro-meteorological data have been measured for more than one hundred fifty years. The Şen_ITA method detects trends on a time series divided into two sub-series with equal lengths. Temporal partial trends on the main time series cannot be detected by this approach. Also, the method may not show any trend on a long series if decreasing and increasing partial trends neutralize each other. Mohorji et al. (2017) prefer to divide a main time series into several sub-series with equal lengths, such as 5, 10, 20, 30, 40, and 50 years, using the Şen_ITA to detect partial trends. Güçlü and colleagues developed Şen_ITA derivatives, called halftime, double, triple, and triangular, to detect partial trends by dividing a main time series into several sub-series with equal length without detecting trend change points (Güçlü 2018a, 2018b, 2020; Güçlü et al. 2020). Arab Amiri and Gocić (2021) divided a given time series into three sub-series to detect partial precipitation trends in Serbia. Şişman and Kizilöz (2021) investigated the Oxford rainfall and temperature series by dividing a 150-year time series into 30-year sub-series and calculated the maximum trend slopes over the last 30 years.
One may want to examine partial trends on sub-series with different lengths based on trend change points. The Şen_ITA method requires equal data lengths in the sub-series to compare equally probable scatter points. In this study, the Şen_ITA method is revised to detect partial trends on sub-series with different lengths (ITA_DL) while still comparing equiprobable scatter points. It is useful for investigating trend stability and temporal partial trends in any time series. The method is applied to Oxford's annual and monthly rainfall series to detect partial trends and check trend stability.

Methodology
Mann-Kendall (MK) is the most frequently used trend analysis method despite some restrictive assumptions. It is based on comparing the temporally ordered values according to their magnitude. Let $Z$ be a time series $z_1, z_2, \ldots, z_n$. For all $j > i$, $\operatorname{sign}(z_j - z_i) = 1$ if $z_j > z_i$, $\operatorname{sign}(z_j - z_i) = -1$ if $z_j < z_i$, and $\operatorname{sign}(z_j - z_i) = 0$ otherwise. $S$ is a trend indicator calculated as $S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sign}(z_j - z_i)$, and its variance is calculated as $\operatorname{Var}(S) = n(n-1)(2n+5)/18$. The variance changes with the data length under the Gaussian distribution assumption and with the number of tied groups, which is not treated here. Standard normal $z_{MK}$ values are calculated as follows: if $S > 0$ then $z_{MK} = (S - 1)/\sqrt{\operatorname{Var}(S)}$, and if $S < 0$ then $z_{MK} = (S + 1)/\sqrt{\operatorname{Var}(S)}$. This indicates the presence of a trend at certain statistical significance levels (e.g. 5% and 10%) in a given time series. If the absolute $z_{MK}$ value is greater than the standard normal critical value (1.96 and 1.65, respectively) for the significance level, then there is a trend in the time series. Negative (positive) $z_{MK}$ values indicate decreasing (increasing) trends.
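The MK statistics described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `mann_kendall` is hypothetical, ties are not corrected in the variance (as in the text), and returning $z_{MK} = 0$ when $S = 0$ is the usual convention rather than something stated here.

```python
from math import sqrt


def mann_kendall(z):
    """Return (S, Var(S), z_MK) for a sequence z, without tie correction."""
    n = len(z)
    s = 0
    # S sums sign(z_j - z_i) over all pairs with j > i.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = z[j] - z[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z_mk = (s - 1) / sqrt(var_s)
    elif s < 0:
        z_mk = (s + 1) / sqrt(var_s)
    else:
        z_mk = 0.0  # assumed convention for S = 0
    return s, var_s, z_mk


if __name__ == "__main__":
    s, var_s, z_mk = mann_kendall([1, 2, 3, 4, 5])
    print(s, round(var_s, 3), round(z_mk, 3))
```

For a strictly increasing series of five values, every pair contributes +1, so $S = 10$ and $z_{MK}$ exceeds 1.96.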
The Şen_ITA divides a given time series ($Z = z_1, z_2, \ldots, z_n$) into two sub-series with equal lengths: the first half series, $Z_f = \{z_1, z_2, \ldots, z_{n/2}\}$, and the second half series, $Z_s = \{z_{n/2+1}, z_{n/2+2}, \ldots, z_n\}$. Each sub-series is sorted in ascending order. The sorted first half series, $X = \{x_1 (Z_{f,\min}), x_2, \ldots, x_{n/2} (Z_{f,\max})\}$, on the X axis and the sorted second half series, $Y = \{y_1 (Z_{s,\min}), y_2, \ldots, y_{n/2} (Z_{s,\max})\}$, on the Y axis are plotted on a graph with a 1/1 (45°) line. Thus, scatter points and the 1/1 (45°) line appear on the graph. If the scatter points are above (below) the 1/1 (45°) straight line, there is an increasing (a decreasing) trend. If some of the scatter points are above (below) the 1/1 straight line and the others are below (above) this line, there is a non-monotonic decreasing (increasing) trend. If the scatter points lie almost on the 1/1 trendless line, there is no trend in the given time series. The trend slope is given by $s = 2(\bar{y} - \bar{x})/n$, and its standard deviation is denoted $\sigma_s$. Here, $\bar{x}$ and $\bar{y}$ are the averages of the first and second halves, $n$ is the data length, $\sigma$ is the standard deviation, and $\rho_{\bar{y}\bar{x}}$ is the correlation coefficient between the first and second half series. The trend slope deviation formula assumes that the variances are equal, but this assumption may not be valid for skewed and dependent series (Serinaldi et al. 2020; Wang et al. 2019). A formula expressed in terms of $\mathrm{Var}_x$, $\mathrm{Var}_y$, and $\mathrm{Cov}_{xy}$ gives more successful and robust results compared to the MK and Şen_ITA, also in skewed and dependent series. Here, $\mathrm{Var}_x$, $\mathrm{Var}_y$, and $\mathrm{Cov}_{xy}$ are the first half variance, the second half variance, and the covariance between the first and second halves. $z_{ITA}$ values can be calculated as $z_{ITA} = s/\sigma_s$ for comparison with the traditional MK method. More detailed information and the steps to derive the formulas can be found in Alashan (2020b).
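The splitting, sorting, and slope steps of the Şen_ITA can be illustrated with a short Python sketch. The function name `sen_ita` is hypothetical, an even data length is assumed, and only the slope $s = 2(\bar{y} - \bar{x})/n$ and the point positions relative to the 1/1 line are computed; the slope standard deviation and $z_{ITA}$ are left out because their formulas are not reproduced here.

```python
from statistics import mean


def sen_ita(series):
    """Split a series into two sorted halves and return the ITA slope
    plus the counts of scatter points above and below the 1/1 line."""
    n = len(series)
    if n < 2 or n % 2 != 0:
        raise ValueError("this sketch assumes an even number of values")
    half = n // 2
    x = sorted(series[:half])   # sorted first half, horizontal axis
    y = sorted(series[half:])   # sorted second half, vertical axis
    slope = 2 * (mean(y) - mean(x)) / n
    above = sum(1 for xi, yi in zip(x, y) if yi > xi)
    below = sum(1 for xi, yi in zip(x, y) if yi < xi)
    return slope, above, below


if __name__ == "__main__":
    print(sen_ita([1, 2, 3, 4, 5, 6, 7, 8]))
```

For the linear series 1 to 8, the half means are 2.5 and 6.5, so the slope is 2(4)/8 = 1 per time step and all four points lie above the 1/1 line.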
Although the Şen_ITA does not have restrictive assumptions such as serial independence and normality as in the MK, it is limited to detecting trends by dividing a given time series into sub-series of equal length. It uses equal data lengths to compare scatter points that have equal probabilities. For example, let $n_1$ and $n_2$ be the data lengths of the first and second sub-series. If the sub-series data lengths are equal, $n_1 = n_2$, each of the sorted scatter points has equal probability for the cumulative probability values from $1/n_1 = 1/n_2$ to $(n-1)/n$. To release the equal sub-series length assumption of the Şen_ITA and compare sub-series with different data lengths, let $k = n_1/n_2$ and $i = jk$; then equal probability points from $i/n_1 = j/n_2$ to $(n-1)/n$ are obtained. Each k value divides a given time series at a certain ratio to obtain the first and second sub-series data lengths. The sorted first sub-series values $x_i$ are plotted on the horizontal axis and the sorted second sub-series values $y_j$ on the vertical axis, as in the Şen_ITA (Fig. 1).
The scatter point locations with respect to the 1/1 trendless line give monotonic or non-monotonic trends or no trend on the sub-series which have different lengths (ITA_DL). The ITA_DL converts to Şen_ITA for k = 1.
To test the ITA_DL method, a random series of one hundred values is produced. Increasing and decreasing trends are embedded in the last twenty values of the random series. The ITA_DL method detects trends on the sub-series for various $k = n_1/n_2$ values in Fig. 1. Trend slope values change with k, as seen in Fig. 1. k = 1 gives the Şen_ITA trend slope, which lies between the maximum and minimum trend slopes. k = 8/2 gives the trend change point and has the maximum (minimum) trend slope on the partial increasing (decreasing) series. The implementation steps to apply the ITA_DL in Fig. 1 are shown below.
Again for k = 8/2, the first half sorted sub-series, $X = \{x_1 (Z_{f,\min}), x_2, \ldots, x_{80} (Z_{f,\max})\}$, and the second half sorted sub-series, $Y = \{y_1 (Z_{s,\min}), y_2, \ldots, y_{20} (Z_{s,\max})\}$, are provided.
c) The i and j coefficients related to a certain k value are determined to convert the sub-series to equal length. For example, k = 8/2 gives the i and j values as 8 and 2, provided that i = jk.
d) For k = 8/2, the X and Y sub-series to be plotted on the graph are determined as $X_{p,i} = \{x_4, x_8, \ldots, x_{80}\}$ and $Y_{p,j} = \{y_1, y_2, \ldots, y_{20}\}$. Similar processes are repeated for other k values.
e) The remaining steps are as in the Şen_ITA.
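The equiprobable pairing used by the ITA_DL can be sketched as follows. This is an illustration under stated assumptions rather than the author's code: the function name `ita_dl_pairs` is hypothetical, and the use of the greatest common divisor to handle ratios that are not whole numbers is an assumption chosen so that $i/n_1 = j/n_2$ holds for every plotted pair. For $n_1 = 80$ and $n_2 = 20$ it selects $x_4, x_8, \ldots, x_{80}$ against $y_1, y_2, \ldots, y_{20}$, and for $n_1 = n_2$ it reduces to the Şen_ITA pairing.

```python
from math import gcd


def ita_dl_pairs(series, n_first):
    """Pair sorted values of two sub-series of different lengths at equal
    cumulative probabilities (i / n_first == j / n_second)."""
    n_second = len(series) - n_first
    if n_first <= 0 or n_second <= 0:
        raise ValueError("both sub-series must be non-empty")
    x = sorted(series[:n_first])   # sorted first sub-series
    y = sorted(series[n_first:])   # sorted second sub-series
    g = gcd(n_first, n_second)     # number of equiprobable points (assumption)
    step_x = n_first // g          # 4 when n_first = 80, n_second = 20
    step_y = n_second // g         # 1 when n_first = 80, n_second = 20
    # 1-based ranks step_x * m and step_y * m share the probability m / g.
    return [(x[step_x * m - 1], y[step_y * m - 1]) for m in range(1, g + 1)]


if __name__ == "__main__":
    pairs = ita_dl_pairs(list(range(1, 101)), n_first=80)
    print(len(pairs), pairs[0], pairs[-1])
```

Points whose second coordinate exceeds the first lie above the 1/1 line. A synthetic check like the one described above can be built by generating one hundred random values, adding a ramp to the last twenty, and calling the function with several `n_first` values.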
The Pettitt test is used for comparison with the trend change points obtained by the Şen_ITA. The method is based on the Mann-Whitney rank test. Let $D_{i,j} = \operatorname{sgn}(z_i - z_j)$, where $\operatorname{sgn}(z_i - z_j)$ is calculated as described above for the MK. The test statistics $U_{t,T}$ and $K_T$ are calculated as $U_{t,T} = \sum_{i=1}^{t} \sum_{j=t+1}^{T} D_{i,j}$ and $K_T = \max_{1 \le t < T} |U_{t,T}|$. If the significance probability, $p = 2\exp\left(-6K_T^2/(T^3 + T^2)\right)$, is smaller than the significance level α (α = 0.10 in this study), there is a change point in the given time series. More detailed information can be found in Güçlü (2020), Mallakpour and Villarini (2016), and Pettitt (1979).
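A direct, unoptimized sketch of the Pettitt statistics is given below. The function name `pettitt` is hypothetical, the statistic is taken as the maximum absolute value of $U_{t,T}$ as in Pettitt (1979), and capping the approximate probability at 1 is an added convenience rather than part of the description above.

```python
from math import exp


def pettitt(z):
    """Return (t, K_T, p): the change position t (number of values before
    the change), the test statistic K_T, and the approximate probability."""
    T = len(z)
    best_t, k_t = 0, 0
    for t in range(1, T):
        # U_{t,T}: sum of sgn(z_i - z_j) for i in the first t values
        # and j in the remaining T - t values.
        u = 0
        for i in range(t):
            for j in range(t, T):
                diff = z[i] - z[j]
                u += (diff > 0) - (diff < 0)
        if abs(u) > k_t:
            best_t, k_t = t, abs(u)
    p = 2 * exp(-6 * k_t ** 2 / (T ** 3 + T ** 2))
    return best_t, k_t, min(p, 1.0)  # cap at 1 (assumption of this sketch)


if __name__ == "__main__":
    # Illustrative step series: ten low values followed by ten high values.
    print(pettitt([1] * 10 + [5] * 10))
```

For the step series in the example, the largest absolute statistic occurs after the tenth value with $K_T = 100$, and the approximate probability is well below α = 0.10.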

Application and results
Oxford is chosen as the study area because of its long record of rainfall measurements. It has an area of approximately 46 km², an altitude of 60 m above sea level, and geographic coordinates of 51° 45′ N and 1° 15′ W in the southeast of England. A maritime temperate climate prevails in the city (https://en.wikipedia.org/wiki/Oxford#Geography). All months are rainy. Annual total rainfall values vary between approximately 379 and 984 mm, with an average of 657 mm (Table 1). The wettest (driest) month is October (February). April has minimum rainfall values. March rainfall values have the maximum skewness coefficient. The standard deviation is maximum in the October rainfall values.
The Mann-Kendall (MK) test is applied to Oxford's total annual and monthly rainfall series (Table 2). A series is said to have a trend (an important trend) if the absolute $z_{MK}$ value is equal to or greater than 1.65 (1.96), at a significance level of 10% (5%). So as not to overlook MK results, the values 1.62 and 1.92 in Table 2 are accepted as 1.65 and 1.96, with very small errors.
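The classification rule above can be written as a small helper. The function name `classify_trend` and the returned labels are hypothetical; the thresholds 1.65 and 1.96 are the ones stated in the text and are exposed as parameters.

```python
def classify_trend(z, trend_level=1.65, important_level=1.96):
    """Label a standardized trend statistic using the 10% and 5% levels."""
    magnitude = abs(z)
    if magnitude >= important_level:
        strength = "important "
    elif magnitude >= trend_level:
        strength = ""
    else:
        return "no trend"
    direction = "increasing" if z > 0 else "decreasing"
    return f"{strength}{direction} trend"


if __name__ == "__main__":
    for value in (2.2, -1.7, 0.5):
        print(value, classify_trend(value))
```

The same helper applies to $z_{ITA}$ values, since both statistics are compared against standard normal critical values.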
The MK detects trends in three of the thirteen series: May, July, and December. In May, there is an increasing trend in the rainfall series. An important decreasing trend prevails in the July rainfall series. The December rainfall series has an important increasing trend. The ITA_DL method gives partial increasing and decreasing trends on the Oxford rainfall series. Six of the 13 rainfall series have increasing or decreasing trends (Table 3). The Oxford rainfall series that have monotonic trends are given in the table. The ITA_DL is applied with k = 1, 2, 3, 4, and 7, corresponding to the 1940, 1970, 1980, 1990, and 2000 trend checkpoints. k = 1 (2, 3, 4, 7) indicates that the last 80 (50, 40, 30, 20) years of the Oxford rainfall series are compared with the previous 80 (100, 120, 120, 140) years.
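The relation between k, the sub-series lengths, and the trend checkpoints can be checked with simple arithmetic. The sketch below assumes a record covering 1861 to 2020 and assumes that the checkpoint year is the last year of the first sub-series; the function name `ita_dl_windows` is hypothetical.

```python
def ita_dl_windows(start_year, end_year, n_second, k):
    """Return ((first_start, first_end), (second_start, second_end)) for a
    second sub-series of n_second years and a first sub-series k times longer."""
    n_first = k * n_second
    checkpoint = end_year - n_second            # last year of the first sub-series
    first = (checkpoint - n_first + 1, checkpoint)
    second = (checkpoint + 1, end_year)
    if first[0] < start_year:
        raise ValueError("record is too short for this k and n_second")
    return first, second


if __name__ == "__main__":
    # k and second sub-series lengths as listed in the text.
    for k, n_second in ((1, 80), (2, 50), (3, 40), (4, 30), (7, 20)):
        print(k, ita_dl_windows(1861, 2020, n_second, k))
```

Under these assumptions, k = 2 and k = 4 use 150 of the 160 years (starting in 1871), while k = 1, 3, and 7 use the full record.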
The annual and November rainfall series have important increasing trends at the 1990 trend checkpoint (k = 120/30 = 4). Şişman and Kizilöz (2021) calculated positive trend slopes on the annual rainfall series of Oxford by comparing the 1990-2020 period with the 1960-1990 period, but they could not detect a statistically significant trend at the 5% and 10% significance levels. There is an important increasing trend in the March rainfall series at the 1970 trend checkpoint. In May, there is an important increasing trend at the 1940 checkpoint (k = 80/80 = 1, i.e. the Şen_ITA). July has an important decreasing trend at the 1970 and 1980 trend checkpoints (k = 100/50 and k = 120/40) and a decreasing trend at the 1990 trend checkpoint (k = 120/30 = 4). Important decreasing trends are dominant in the September rainfall series with k = 120/40 and k = 140/20 (1980 and 2000 trend checkpoints).
Trend stability on any time series can be investigated using the ITA_DL. In a stable trend case, all trend slopes on the time series have a similar direction (positive or negative) for all defined k values. As seen in Table 3, while the July and September rainfall series have negative trend slopes, the May rainfall series has positive trend slopes for all k values. Thus, it can be accepted that the May, July, and September rainfall series have stable trend slopes.
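The stability criterion above amounts to checking that the trend slopes for all k values share one sign. A minimal sketch follows; the function name `has_stable_trend` is hypothetical, and the slope values in the example are made-up placeholders, not values from Table 3.

```python
def has_stable_trend(slopes):
    """Return True if all trend slopes are positive or all are negative."""
    slopes = list(slopes)
    if not slopes:
        raise ValueError("at least one slope is required")
    return all(s > 0 for s in slopes) or all(s < 0 for s in slopes)


if __name__ == "__main__":
    # Placeholder slopes for k = 1, 2, 3, 4, 7 (illustrative only).
    print(has_stable_trend([-0.10, -0.22, -0.31, -0.18, -0.40]))
    print(has_stable_trend([0.05, -0.02, 0.11, 0.07, 0.03]))
```

A zero slope is treated as breaking stability in this sketch, which is a design choice rather than something specified in the text.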
To detect non-monotonic trends in the annual and monthly rainfall series of Oxford, ITA_DL graphics are used (Fig. 2). There is an increase in the annual high rainfall values. Although there is a decrease in the medium values in January, low and high rainfall values increase. The February, March, October, and November rainfall series have decreasing amounts only at high values. There is a decrease in low values and an increase in high values of the April and June rainfall series. In the May rainfall series, there is an increase in medium values and a decrease in high values.
The July and September rainfall series have monotonic decreasing trends. There is no non-monotonic or monotonic trend in the August rainfall series. In the December rainfall series, an increase is observed at low and medium values, and a decrease is observed at high values.
To determine trend change points, maximum absolute z ITA values calculated for the rainfall series of Oxford are used. 1940 is a trend change point for the May rainfall series. The March and July rainfall series have a trend change point in 1970 while the Annual and November rainfall series have a trend change point in 1990. September has a trend change point in 2000.

Conclusion and discussion
Oxford's annual and monthly rainfall series from 1861 to 2020 are analyzed to identify trends with the Mann-Kendall (MK), the Şen innovative trend analysis (Şen_ITA), and the innovative trend analysis with different sub-series lengths (ITA_DL). The ITA_DL gives results coherent with the MK and reduces to the Şen_ITA with equal sub-series lengths (k = 1). The MK and ITA_DL methods do not give opposite trends on the rainfall series of Oxford. Also, the trends on 7 of the 13 time series (6 no trend and 1 important decreasing trend) are the same. In the other 6 time series, except December, the ITA_DL either detects trends that the MK cannot detect or accepts trends at higher significance levels. The MK gives one trend (May) and two important trends (July and December). The ITA_DL trend test results confirm the MK except for December. Neither the MK nor the Şen_ITA can detect any trend in the September rainfall series. The ITA_DL does not show any trend or partial trend in December, although the MK shows a significant increasing trend. The Şen_ITA helps to detect a non-monotonic trend that cannot be determined by the MK in a given time series. It has the capacity to show trends in the low, medium, and high values of the time series. The ITA_DL can detect partial monotonic or non-monotonic trends in addition to holistic trends. This is important for determining climate change effects emerging over the last two or three decades and for inspecting trend stability on long time series. In the September rainfall series of Oxford, which has stable trend slopes, the MK and Şen_ITA cannot detect any trend, whereas the ITA_DL gives important decreasing trends with k = 120/40 (1980 trend checkpoint) and k = 140/20 (2000 trend checkpoint). It also gives results compatible with the MK in other months, and the Şen_ITA is a special case of the ITA_DL (k = 1).
The ITA_DL detects six trend change points, in the annual (1990), March (1970), May (1940), July (1970), September (2000), and November (1990) rainfall series, whereas the Pettitt test detects only one trend change point, in the July rainfall series. In the other rainfall series for which the ITA_DL detects trend change points, the years with the maximum absolute $U_{t,T}$ test statistics and the years with the maximum absolute $z_{ITA}$ values differ by no more than ± 2 years. The July (March) rainfall series has an important decreasing (increasing) trend at the 1970 trend change point, when world temperatures started to increase continuously (IPCC 2007). 1990 is a trend change point for the annual and November rainfall series, which have important increasing monotonic trends. Also, the UK had its maximum greenhouse gas emissions (800 million metric tons of carbon dioxide equivalent) in that year and gradually reduced this amount by 50 percent by 2020 (https://www.statista.com/statistics/326902/greenhouse-gas-emissions-in-the-united-kingdom-uk/). This may be one reason for the important increasing trends in Oxford's annual and November rainfall series.
Rainfall is the main source of clean water and an important part of the hydrological cycle. The monotonic important decreasing trend in July and the non-monotonic decrease in low rainfall values in June call for larger reservoir capacity in dams to provide drinking and irrigation water during summer droughts. Possible floods in Oxford could be more intense, given the monotonic important increasing trend in May and the non-monotonic increase in high rainfall values in January and April. City planners, hydrologists, and politicians should take the trends in the Oxford rainfall series into account to avoid possible damage from droughts and floods.

Conflict of interest
The author declares that he has no conflict of interest.