Wind speed persistence at the Fernando de Noronha archipelago, Brazil

The use of wind energy has been growing continuously worldwide in recent years owing to global efforts to combat climate change. Modern turbines are becoming ever more cost effective and are gaining features that aim to further reduce their environmental impact, such as lower noise or greater height. In this context, the current study investigates the persistence of wind at the Fernando de Noronha archipelago, an important Brazilian ecological site, using hourly wind speed data at 100-m height. To this end, data from the ERA5 reanalysis were used, as they offer high resolution and good performance in estimating meteorological data, and two well-known methods were applied to quantify wind persistence: the duration curve and conditional probability. In addition, a novel method is proposed based on the persistence probability of periods of viable energy generation. The results show that the Fernando de Noronha archipelago presents rather high wind persistence on a monthly scale, with considerably long intervals of wind speed useful for wind power generation.


Introduction
The use of wind energy has been growing worldwide at an exponential rate over the last decades, primarily due to the lack of emission of greenhouse gases. Moreover, wind energy generation is viable in remote locations, and the cost of investing in wind turbines has decreased to the point of making this energy source competitive with other (often subsidized) sources. Novel models of wind turbines are being developed in order to obtain greater power and at the same time further reduce the impact on the environment. Most new models tend to be more powerful and quieter (Jamieson 2018; Møller and Pedersen 2011), while other models were created with greater height (Caduff et al. 2012), and even models without propellers have been designed (Chizfahm et al. 2018).
Globally, an average yearly increase of 71 GW of wind energy installations is expected by 2024 (GWEC, Global Wind Energy Council 2019). In 2019, there were 60.4 GW of new wind energy installations, bringing the world production capacity to 651 GW. Brazil ranked fifth among the countries that invested the most in wind energy, adding 744.95 MW to its electric power grid in 2019 and bringing the installed capacity to 15.45 GW, which represents 9.1% of the nation's power matrix (ABEEólica, Brazilian Wind Energy Association 2019).
The implementation of wind farms requires a number of preliminary studies to estimate the characteristics of the wind speed and the wind energy potential. To find the ideal location for wind energy production, both the magnitude and the persistence of wind speed have to be taken into account (Cancino-Solórzano et al. 2010; Koçak 2008). Most studies focus on selecting the best wind speed distribution function (Mazzeo et al. 2019; Morgan et al. 2011; Ouarda and Charron 2018; Ouarda Taha et al. 2015; Soulouknga et al. 2018), while persistence properties of wind dynamics have been far less explored (Cancino-Solórzano et al. 2010; Koçak 2008; Koçak 2009; Santos et al. 2012).
The Fernando de Noronha archipelago is an important ecological site located in the Atlantic Ocean, 360 km off the Brazilian coast. Administratively, it belongs to the state of Pernambuco and is divided into two conservation units: the National Marine Reserve (retained for fauna, flora, and natural resource protection) and the Environmental Protected Area, which is reserved for human occupation. Each of these units has distinct preservation rules established by the federal and state governments, with the goal of preserving natural resources. It was the site of the first large wind turbine for commercial operation in South America, installed in 1992 (Araújo and Freitas 2008), but the turbine was (unfortunately) destroyed by lightning in 2009. Currently, the energy supply for the island comes mainly from diesel generators, and there has been a long-standing effort to develop efficient technological solutions for energy supply based on the integration of wind and solar resources (Rosas et al. 2013).
In this work, we analyze persistence properties of wind speed at the location of Fernando de Noronha Island in order to contribute to the development of a reliable scientific base for evaluating the renewable energy potential at this location. We use two statistical methods, the wind speed duration curve (WSDC) and the conditional probability approach (CPA) (Koçak 2008), and we propose a novel approach that we term the "persistence probability curve," which not only yields the same persistence threshold level as CPA but also offers additional insight into the persistence properties of the wind speed series (stretches of time viable for energy production) at different scales.

Materials and methods
The location of the Fernando de Noronha archipelago is shown in Fig. 1.
The Center for Weather Forecasting and Climate Studies (Centro de Previsão de Tempo e Estudos Climáticos, CPTEC), a division of the Brazilian National Institute for Space Research (Instituto Nacional de Pesquisas Espaciais, INPE), provides historical meteorological data for Brazil. There is a meteorological station installed on Fernando de Noronha island, which collects wind speed data at 3-h frequency, but after careful examination of the available database, we found that it is not suitable for our purposes because of its low quality (many data are missing, and several periods lasting months show the same wind speed value throughout).
Other data sources that have been used extensively in both academia and industry for modelling wind power are outputs from meteorological reanalyses (Archer and Caldeira 2009; Cannon et al. 2015; Holt and Wang 2012; Olauson and Bergkvist 2015). Reanalyses are consistent gridded datasets covering a long record of time (typically more than 30 years), which result from combining a state-of-the-art numerical model with observations from several sources. These datasets allow inferring features such as variability or trends for regions or variables where in situ observations are lacking (Chadee and Clarke 2014). In this work, we chose the ERA5 reanalysis because of its high spatial (~30 km) and temporal (1 h) resolution, and the availability of wind speed series at a hub height of 100 m. This reanalysis was shown to perform well in representing wind speed features at turbine hub heights (Ramon et al. 2019) and in modelling wind power, both for countries and for individual wind turbines (Olauson 2018). While both of these works demonstrate the advantage of ERA5 over MERRA-2 (which has been a de facto standard in numerous studies), they do not include Brazil in the study area. A recent validation of the reanalysis data was performed for different regions including Brazil (Gruber et al. 2020), and it also found that the newer ERA5 reanalysis outperforms MERRA-2. The data were downloaded for latitude −3.85 and longitude −32.42 (Copernicus Climate Change Service (C3S)).
The hourly data from the ERA5 reanalysis of the orthogonal components of wind u (eastward) and v (northward) at 100-m height for 20 years (between 1 January 2000 and 31 December 2019) were collected for the island of Fernando de Noronha. The value of interest here is the wind speed, obtained as $\sqrt{u^2 + v^2}$, and the resulting wind speed series are presented in Fig. 2. Three methods were applied to estimate the wind persistence, two of which have been widely used (Cancino-Solórzano et al. 2010; Koçak 2008; Korkmaz and Koçak 2017; Kulkarni et al. 2015; Masseran et al. 2012), and the last being a novel proposal of this study. The methods were implemented in C, and the figures were produced in R. In what follows, these methods are described in some detail.
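As a small illustration of the conversion from wind components to wind speed, the following Python sketch may help (Python is used here only for illustration; it is not the C implementation mentioned above). The function name `wind_speed` and the sample values are hypothetical and do not come from the ERA5 series.

```python
import math


def wind_speed(u, v):
    """Return the horizontal wind speed (m/s) from the eastward (u)
    and northward (v) components, i.e. sqrt(u^2 + v^2)."""
    return math.hypot(u, v)


# Hypothetical (u, v) pairs in m/s, for illustration only.
samples = [(-6.0, 2.5), (-3.0, -4.0), (0.0, 0.0)]
speeds = [wind_speed(u, v) for u, v in samples]

for (u, v), s in zip(samples, speeds):
    print(f"u = {u:5.1f}, v = {v:5.1f} -> speed = {s:.2f} m/s")
```

The sign of each component carries only the direction, so the speed is the same for (u, v) and (−u, −v).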

WSDC
The wind speed duration curve (Koçak 2008) is the two-dimensional graphical presentation of wind speed (vertical axis) against the percentage of time that wind speed is equal to or greater than a certain value (horizontal axis). Given the sequence of n wind speed data arranged in ascending order, and denoting by $v_i$ the wind speed of the ith ordered observation, the percentage of time that wind speed is equal to or greater than each particular value is denoted $P(v_i)$ (Eq. 1). On the WSDC plot, the value $P(v_i)$ that corresponds to a particular truncation level $v_i$ (e.g., the cut-in wind speed, at which the turbine starts rotating and generating electricity) can be used as an indication of the persistence of wind: greater $P(v_i)$ values correspond to higher persistence of wind above the truncation level.
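A minimal Python sketch of how such a duration curve could be computed is given below. It assumes the simple empirical estimator (the number of observations at or above a level, divided by n); the plotting position used in the study may differ. The function names and the toy data are hypothetical.

```python
def exceedance_percent(speeds, level):
    """Percentage of observations with speed >= level
    (simple empirical estimator, an assumption of this sketch)."""
    count = sum(1 for s in speeds if s >= level)
    return 100.0 * count / len(speeds)


def duration_curve(speeds):
    """Return (percent_of_time, speed) pairs for the speeds sorted in
    ascending order, one point per observation."""
    ordered = sorted(speeds)
    return [(exceedance_percent(ordered, v), v) for v in ordered]


# Hypothetical hourly speeds in m/s, for illustration only.
toy = [2.0, 3.5, 4.0, 5.5, 7.0, 8.0, 9.5, 6.0, 4.5, 3.0]
curve = duration_curve(toy)
print(exceedance_percent(toy, 4.0))  # share of time at or above a 4 m/s cut-in
```

Plotting the second element of each pair against the first gives the curve with wind speed on the vertical axis, as described above.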

CPA
The conditional probability approach (CPA) was proposed by Koçak (2008) for quantifying wind speed persistence and has subsequently been used in a number of works (Cancino-Solórzano et al. 2010; Kulkarni et al. 2015). Given a time series of wind speed observations (e.g., at hourly frequency) and a threshold level of interest (e.g., the cut-in speed, the minimum wind speed necessary for energy generation, which generally lies between 3 and 4 m/s), one can assign the attribute "windy" (W) to the instances with wind speed equal to or greater than the threshold, and the attribute "calm" (C) to those with wind speed below the threshold. The conditional probability that a "windy" hour follows another "windy" hour can be estimated as

$$\hat{P}(W \mid W) = \frac{n_{WW}}{n_W} \qquad (2)$$

where $n_{WW}$ represents the number of pairs of consecutive time steps in which wind speed belongs to category W, and $n_W$ represents the total number of time steps with wind speed in category W. Although the conditional probability value in (2) provides information about wind persistence, it is limited to a single time step ahead. This can be improved by including more previous wind speed values in the calculation of the conditional probability. More precisely, denoting by $n_W(q)$ the number of subsequences of W's (words) of length q (so that $n_W(1) \equiv n_W$, $n_W(2) \equiv n_{WW}$, etc.), the conditional probability of a word of length q is given by

$$\hat{P}_c(q) = \frac{n_W(q)}{n_W(q-1)} \qquad (3)$$

and the cutoff value $q_0$, the largest word length that still occurs in the series, beyond which this probability becomes equal to zero,

$$\hat{P}_c(q_0 + 1) = 0 \qquad (4)$$

is directly related to persistence: greater $q_0$ values correspond to higher persistence. Graphically, q values are shown on the x axis and the corresponding conditional probabilities on the y axis.
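The word-counting procedure described above can be sketched in Python as follows. This is an illustrative sketch, not the implementation used in the study: it assumes that the conditional probability for word length q is the ratio of the number of all-W words of length q to the number of length q − 1, and all function names and the toy series are hypothetical.

```python
def run_lengths(labels):
    """Lengths of the maximal runs of consecutive True (W) labels."""
    runs, current = [], 0
    for flag in labels:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def word_count(runs, q):
    """Number of length-q windows made only of W's: a run of length r
    contains r - q + 1 such windows when r >= q."""
    return sum(r - q + 1 for r in runs if r >= q)


def conditional_probability(runs, q):
    """Ratio of all-W words of length q to all-W words of length q - 1
    (q >= 2); returns 0.0 when no word of length q - 1 exists."""
    denom = word_count(runs, q - 1)
    return word_count(runs, q) / denom if denom else 0.0


# Hypothetical hourly speeds (m/s) and a 4 m/s threshold, for illustration.
speeds = [5.0, 6.0, 3.0, 7.0, 8.0, 9.0, 2.0, 4.0, 4.5, 1.0]
labels = [s >= 4.0 for s in speeds]   # True = "windy" (W)
runs = run_lengths(labels)
q0 = max(runs)                        # longest stretch of W's

for q in range(2, q0 + 2):
    print(q, conditional_probability(runs, q))
```

Counting through run lengths avoids scanning the series once per value of q, which matters for a 20-year hourly record.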

PP
While the CPA yields the cutoff length $q_0$ of the largest subsequence of consecutive values above a given threshold (which quantifies persistence), for wind speed series with long stretches of values above the threshold it yields values close to unity for most q values, which does not reflect well the intermediate-q behavior of the sequence. We find that a more informative measure, which yields the identical cutoff $q_0$, is the probability of finding a sequence of length q of consecutive values above the threshold, given by

$$\hat{P}_p(q) = \frac{n_W(q)}{n - q + 1} \qquad (5)$$

where the value n − q + 1 in the denominator corresponds to the total number of words of length q in the sequence of length n. By construction, this quantity, which we denote here "persistence probability," yields the same cutoff value $q_0$ as CPA, but the graph of $\hat{P}_p$ versus q also yields additional information on stretches of different lengths of values above the threshold encountered in the series.
To demonstrate the advantage of persistence probability (PP) over CPA, consider a simple example of hourly wind speed data for a year, with 24 × 365 = 8760 observations, in which there is a single week (7 × 24 = 168 observations) below the threshold (not useful for wind power generation). Consider now ordering these data in two ways so that the first group (G1) starts with 3949 W's, followed by 168 C's, and ends with 4634 W's, while the second group (G2) starts with 2000 W's followed by 100 C's, then 1949 W's followed by 68 C's, and ends with 4634 W's. Note that both groups were ordered so that the cutoff value is q 0 = 4634 for both CPA and PP. On the left panel of Fig. 3, it is seen that the CPA graphs do not distinguish between the two groups, while the PP graphs on the right panel demonstrate rather different behavior, showing that, e.g., the probability of having 3 months (90 × 24 = 2160) of continuous data above the threshold is much higher for the first group than for the second group.
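The comparison of the two groups can be reproduced with a short Python sketch. It assumes that the persistence probability is the number of all-W words of length q divided by n − q + 1, and that the CPA value is the ratio of word counts for lengths q and q − 1. The run lengths are taken as printed in the text; as printed, they sum to 8751 observations rather than 8760, so the sketch uses n = 8751. The function names are hypothetical.

```python
def word_count(runs, q):
    """Number of length-q windows made only of W's, given the run lengths."""
    return sum(r - q + 1 for r in runs if r >= q)


def cpa(runs, q):
    """Conditional probability: words of length q over words of length q - 1."""
    denom = word_count(runs, q - 1)
    return word_count(runs, q) / denom if denom else 0.0


def pp(runs, n, q):
    """Persistence probability: words of length q over all n - q + 1 windows."""
    return word_count(runs, q) / (n - q + 1)


# Run lengths of W's as printed in the text for the two orderings.
g1_runs, g1_n = [3949, 4634], 3949 + 168 + 4634
g2_runs, g2_n = [2000, 1949, 4634], 2000 + 100 + 1949 + 68 + 4634

q = 90 * 24  # three months of hourly data
print("CPA:", cpa(g1_runs, q), cpa(g2_runs, q))
print("PP: ", pp(g1_runs, g1_n, q), pp(g2_runs, g2_n, q))
```

Under these assumptions the two CPA values are both very close to one, while the PP values differ markedly (roughly 0.65 for G1 against 0.38 for G2), in line with the behavior described for Fig. 3.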
It should be noted that both CPA and PP are trivially generalized for situations where both the lower bound (cut-in wind speed) and the upper bound (cut-out) are necessary, simply by implementing the adequate criterion for useful speed (attributing label "U" for "useful," rather than "W" for "windy").
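A sketch of this generalization in Python might look as follows. The cut-out value of 25 m/s, the half-open interval (cut-in inclusive, cut-out exclusive), and the function names are assumptions made for illustration only; the study does not specify them.

```python
def label_useful(speeds, cut_in, cut_out):
    """True ("U") where cut_in <= speed < cut_out, False otherwise.
    The inclusive/exclusive bounds are an assumption of this sketch."""
    return [cut_in <= s < cut_out for s in speeds]


def longest_run(labels):
    """Length of the longest stretch of consecutive True labels."""
    best = current = 0
    for flag in labels:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


# Hypothetical speeds (m/s) with one value below cut-in and two at or above cut-out.
speeds = [3.0, 4.0, 12.0, 26.0, 10.0, 25.0]
labels = label_useful(speeds, cut_in=4.0, cut_out=25.0)
print(labels, longest_run(labels))
```

Once the labels are built this way, the word counts for CPA and PP are computed exactly as for the W/C labels.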

Results and discussions
In Table 1, descriptive statistics are presented for the Fernando de Noronha wind speed data at 100 m at the monthly level. The overall average wind speed is 7.56 m/s with a standard deviation of 2.09 m/s. Henceforth, a cut-in speed of 4 m/s is used. The months from August to December show high averages, with minimums close to, or even above, the cut-in (as in September and November). In Fig. 4, the wind speed distributions per month are shown, highlighting that the months from August to December have lower variability and a higher average than the others.

Table 2 Distributions most commonly used for modelling wind speed data
Columns: Distributions, Density function, Parameters
Weibull: x > 0; β, θ > 0; β is the shape and θ is the scale
Generalized gamma: x > 0; k, β, θ > 0; k and β are shapes, θ is the scale, and Γ(·) is the gamma function

Fig. 5 Histogram of the wind speed data with distribution fits
To identify the probability distribution that best describes the wind speed data, we consider here the three distributions that are mostly used to model wind speed: Weibull, gamma, and generalized gamma (Ouarda Taha et al. 2015;Laib et al. 2018), as listed in Table 2.
In Fig. 5, we show the histogram of the wind speed data from Fernando de Noronha, and the curves of the adjusted distributions. Using the Akaike information criterion (AIC) and Bayesian information criterion (BIC), it is found that the generalized gamma distribution is the best for fitting the data, with parameters k = 0.48, θ = 9.71, and β = 6.79. Figure 6 shows the WSDC curve for wind speed data. Approximately 94% of the speeds obtained are above or equal to 4 m/s which indicates a high persistence of the wind in this region.
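As a rough plausibility check of the reported parameters, the following Python sketch computes the mean and standard deviation implied by a generalized gamma distribution. It assumes the common Stacy parametrization, with density proportional to $(x/\theta)^{k\beta - 1} \exp[-(x/\theta)^\beta]$, for which the r-th moment is $\theta^r \Gamma(k + r/\beta)/\Gamma(k)$; whether Table 2 uses exactly this form is an assumption. The function name is hypothetical.

```python
import math


def gen_gamma_moment(r, k, theta, beta):
    """r-th raw moment of a generalized gamma distribution in the
    Stacy parametrization (assumed here): theta^r * G(k + r/beta) / G(k)."""
    return theta ** r * math.gamma(k + r / beta) / math.gamma(k)


# Fitted parameters reported in the text.
k, theta, beta = 0.48, 9.71, 6.79

mean = gen_gamma_moment(1, k, theta, beta)
sd = math.sqrt(gen_gamma_moment(2, k, theta, beta) - mean ** 2)
print(f"implied mean = {mean:.2f} m/s, implied sd = {sd:.2f} m/s")
```

Under this assumption the implied moments lie close to the sample mean of 7.56 m/s and standard deviation of 2.09 m/s reported above, which is consistent with the fit, although it does not replace the AIC and BIC comparison.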
The CPA and PP curves, with $q_0$ = 6682, are shown in Fig. 7. This $q_0$ value confirms a high persistence of wind at 100 m and indicates that the series contains consecutive months with wind speeds useful for generating wind energy. Moreover, the PP curve on the right panel of Fig. 7 demonstrates that the probability of finding a month (30 × 24 = 720 h) of consecutive hourly observations with useful speeds is about 50%, and that the probability for 5 consecutive months (150 × 24 = 3600 h) is approximately 14%.

Conclusions
This study investigated the persistence of wind speed at a height of 100 m for the Fernando de Noronha archipelago. Considering the results obtained via CPA, WSDC, and the novel persistence probability (PP) approach, it can be concluded that the island of Fernando de Noronha has a high persistence of wind speed at 100 m height, as roughly 50% of 500-h (20-day) intervals demonstrate continuous conditions favorable for energy generation (W's). This suggests that a greater investment in wind energy in this region should be made, although practical economic justification of such investments may require consideration of other factors.
Moreover, a new measure was introduced in this work to quantify wind persistence based on persistence probability of wind speed useful for generating wind energy. It was also shown that this proposal is simple and can complement information that is not observed from the conditional probability curve, widely used in the literature.