Bayesian method for estimating Weibull parameters for wind resource assessment in the Equatorial region: a comparison between two-parameter and three-parameter Weibull distributions

The two-parameter Weibull distribution has garnered much attention in the assessment of wind energy potential. The estimation of the shape and scale parameters of the distribution has brought forth a successful tool for the wind energy industry. However, it may be inappropriate to use the two-parameter Weibull distribution to assess energy at every location, especially at sites where low wind speeds are frequent, such as the Equatorial region. In this work, a robust technique for wind resource assessment using a Bayesian approach for estimating Weibull parameters is first proposed. Secondly, wind resource assessment techniques using a two-parameter Weibull distribution and a three-parameter Weibull distribution, which is a generalized form of the two-parameter Weibull distribution, are compared. Simulation studies indicate that the Bayesian approach is a more robust technique for accurate estimation of Weibull parameters. The research is conducted using data from seven sites in the Equatorial region, from 1° N to 19° S of the Equator. Results reveal that a three-parameter Weibull distribution with a non-zero shift parameter is a better fit for wind data having a higher percentage of low wind speeds (0-1 m/s) and low skewness. However, wind data with a smaller percentage of low wind speeds and high skewness showed better results with a two-parameter distribution, which is a special case of the three-parameter Weibull distribution with zero shift parameter. The results also demonstrate that the proposed Bayesian approach and the application of a three-parameter Weibull distribution are extremely useful for accurate estimation of wind power and annual energy production.


Introduction
Wind energy has now become one of the world's fastest-growing sources of energy. It is an inexhaustible source of energy with increasing utilization all around the world. Growing climate change concerns have prompted many developed and developing countries to implement policies that reduce their reliance on non-renewable energy and instead utilize renewable sources such as wind, hydro, and solar energies (Mostafaeipour et al., 2014). However, developing countries encounter several challenges in generating sustainable wind energy. There is a need for reliable wind data and proper assessments of a country's wind energy potential before initiating energy generation projects that would help them meet the sustainable development goals set by the United Nations.
While climate change is being experienced globally, some regions are affected more than others. Pacific Island countries (PICs), particularly those in the warmer Equatorial region, are more susceptible to its effects. The contribution of the PICs to current global greenhouse gas emissions, according to the UN Permanent Forum on Indigenous Issues (2015), is below 0.03%; yet they are among the first to be affected. It is projected that the people of PICs will be among the first who will need to adapt to climate change or be required to relocate from or abandon their traditional homeland. Some islands are already facing the impacts of climate change on their communities, infrastructure, water supply, coastal and forest ecosystems, fisheries, agriculture, and human health. Island states such as Kiribati, Marshall Islands, Tokelau and Tuvalu are the immediate victims of this phenomenon due to rising sea level. Knowledge of the effects of climate change on PICs should act as a driving force behind the commitment to decrease greenhouse emissions. The PICs, which currently depend heavily on imported fossil fuels and their byproducts, need to become more energy efficient and self-reliant (Mohanty, 2012). PICs have some of the lowest rates of access to electricity, and their electricity prices are among the highest in the world due to their heavy reliance on high-cost diesel-based generation. Energy security and low-cost energy are becoming increasingly important within the region, which requires increasing investments in renewable energy technologies. PICs are also among the countries most vulnerable to natural disasters. The energy sector can be highly vulnerable to such events, which requires adequate attention to these issues in the design of energy production and distribution infrastructure. This can only be achieved by adopting resilient renewable energy policies.
Most of the countries in the region have national sustainable development plans to achieve the United Nations' sustainable development goals (SDGs); for example, the Cook Islands aims to have 100% renewable power generation in the near future, and Fiji is committed to reducing its national greenhouse emissions by 30% and achieving 99% renewable energy generation by 2030 (VNR Report, 2019).
However, the lack of reliable and accurate wind resource data acts as a barrier to a clean energy future in the PICs, especially in the smaller developing islands (Kidmo et al., 2015). So far, wind resource assessment has received only limited attention in the PICs, and there is a need for further wind data collection and analysis and accurate wind energy potential assessment. The World Bank provides support to PICs through the Sustainable Energy Industry Development Project (SEIDP). In various phases of renewable energy resource mapping, it supports the countries in carrying out an assessment of solar and wind potential. The objectives of this component are to enhance awareness and knowledge of the potential for renewable technologies (solar and wind) among governments, power utilities and the private sector, and to provide governments with a spatial planning framework to guide investments in the renewable energy sector (PPA, 2015).
The utilization of wind power technology to generate energy is slowly increasing in PICs such as New Caledonia, Fiji, Vanuatu, Cook Islands and Samoa. However, there has been little to no attempt to establish wind power in many of these countries. The University of the South Pacific installed 34 m tall towers, named Integrated Renewable Energy Resource Assessment Systems (IRERAS), in Kiribati, Nauru, Niue, Tuvalu, Tokelau, Samoa, Tonga, Fiji, Vanuatu, Solomon Islands and Cook Islands to collect data on wind and solar energies (Gosai, 2014).
In recent years, the Weibull distribution has become a widely accepted tool in determining the potential of wind energy (Indhumathy, 2014). Wind energy professionals in different parts of the world have widely employed the use of Weibull distributions in the statistical analyses of wind characteristics and wind power density (Corotis et al., 1978). The Weibull shape parameter defines the width of wind distribution. A higher shape parameter indicates that the distribution is narrower, and the peak value is higher. The Weibull scale parameter controls the abscissa scale of the data distribution plot (Chang, 2011). Thus, the Weibull distribution function is comprehensively used for analyzing the wind power potential at a site.
Past researchers have found the two-parameter Weibull distribution to be a useful and practical tool for wind energy estimation. The advantages of two-parameter Weibull distribution include its flexibility, simplicity in parameter estimation, ability to use goodness-of-fit tests on these parameters as well as its dependence only on two parameters that can be expressed in closed form.
However, some authors suggested that the distribution is not suited for all wind regimes encountered in nature, such as regimes with a high percentage of low wind speeds and bimodal distributions. Therefore, its usage cannot be generalized. To minimize errors, a suitable probability density function must be carefully selected for different wind regimes (Carta et al., 2009; Sukkiramathi and Seshaiah, 2020). Tuller and Brett (1984) proposed a three-parameter Weibull function in wind analysis and found that it showed better fitness and flexibility than the two-parameter Weibull function. Recently, some authors utilized the three-parameter Weibull distribution and found that it has more flexibility with improved fitness than the two-parameter Weibull distribution in wind energy assessments. Wais (2017) compared the two-parameter and three-parameter Weibull distributions to study the most appropriate distribution of wind speed. The results revealed that methods other than the three-parameter Weibull distribution cannot account for cases where the frequency of low wind speed is higher. The author compared the wind speeds for three different sites and found that the three-parameter Weibull distribution performed best when there was a greater frequency of lower wind speeds. Sukkiramathi and Seshaiah (2020) also utilized the three-parameter Weibull distribution for analyzing wind power potential. However, to date, only limited research has been carried out on wind analysis using the three-parameter Weibull distribution.
Furthermore, many methods have been proposed for estimating Weibull parameters. Among these, maximum likelihood estimation (MLE), a popular frequentist technique, has been widely used for estimating the parameters (Teimouri and Gupta, 2013). Recently, the Bayesian estimation approach has received great attention from many researchers. Among them are Al Omari and Ibrahim (2011), who considered the Bayesian survival estimator for the Weibull distribution with censored data. Many authors, including Hossain and Zimmer (2003) and Pandey et al. (2011), carried out comparative studies on the estimation of the Weibull parameters using complete and censored samples, and Lye et al. (1993) determined the Bayes estimation for the extreme-value reliability function. Guure et al. (2012) examined the performance of MLE and Bayesian methods for estimating the two-parameter Weibull failure time distribution. However, the use of the Bayesian technique for modelling wind data and analyzing wind power potential was not explored in their work.
The present work aims to compare the two-parameter and three-parameter Weibull distributions to fit wind speed data more accurately at seven locations in the Equatorial region, where wind speeds are generally lower. The aim is also to develop a novel approach using the Bayesian method for estimating the Weibull parameters. The results from the Bayesian technique will be compared with those of the traditional MLE method to determine a more accurate evaluation method of wind speed characteristics.

Wind speed data
Wind speed data from seven different sites in the Equatorial region were used in the present work, as shown in Table 1. For sites 1, 2 and 3, data were obtained from measurements using 34 m tall towers with the help of the sensors described in Table 2. The NRG Systems towers, named Integrated Renewable Energy Resource Assessment Systems (IRERAS), with a height of 34 m were used. The data-logger was an NRG SymphoniePlus3, which was connected to seven different sensors installed on the tower. The sensors measured wind speed, temperature, pressure, rainfall, solar insolation, humidity, and wind direction. The data were either collected from the SD card in person or sent via the GSM-based network to a data-bank located at the ICT centre of the University of the South Pacific at the Laucala Campus, Fiji. The anemometers (serial numbers 179500189054-57, 179500189089-90) have an accuracy of 0.1 m/s and a range of 0.4 to 96 m/s. The wind vane is placed at 30 m AGL. The data were recorded in a time-series format in an RWD file, which was later transferred to a Microsoft Excel sheet. The wind speed data were recorded continuously at an interval of 10 minutes with cup anemometers at heights of 34 m and 20 m. For sites 4, 5 and 6, satellite data were downloaded; land data from ERA5 were used in the present work (Ref: https://cds.climate.copernicus.eu/). ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 4 to 7 decades. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics.
This principle, called data assimilation, is based on the method used by numerical weather prediction centres: every certain number of hours (12 hours at ECMWF), a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations and, when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product. For site 7, an NRG Systems tower, similar to the ones used for sites 1, 2 and 3 but 50 m high, was installed; it has anemometers at 50 m, 40 m and 30 m AGL. For the measured values, some uncertainties were taken into account, such as calibration errors, the terrain of the site, dynamic overspeeding, and the errors introduced by wind shear and the inflow angle (Jain, 2016). The measurements in the present work were performed close to the shoreline on flat terrain. The flow was in the horizontal plane, resulting in a lower uncertainty level. The calibration report for the anemometers used in the present work showed a maximum uncertainty of 0.6% for a wind speed range of 4-7 m/s, which reduced at higher wind speeds. The overall uncertainty in the estimation of wind speed is obtained by taking all the above uncertainties into account (Jain, 2016) and using the relation in equation (1):
$$\varepsilon = \sqrt{\sum_{i=1}^{N} \varepsilon_i^{2}} \tag{1}$$
where $\varepsilon_i$ is each component of uncertainty and $N$ is the number of components of uncertainty. The uncertainties were estimated at the 95% confidence level. As per the IEC Standard IEC 61400-12-1 (IEC, 2017), the uncertainty in the measurements was estimated to be approximately 1.74%.

Weibull Distribution
Assessment of wind power energy at a site requires knowledge of the appropriate probability distribution of the site's wind speed, as the estimation of wind energy depends on its accuracy.

Two-parameter Weibull distribution
The two-parameter Weibull probability density function (pdf) and cumulative distribution function (cdf) for wind speed ($U$) are given, respectively, by
$$f(U) = \frac{k}{A}\left(\frac{U}{A}\right)^{k-1}\exp\left[-\left(\frac{U}{A}\right)^{k}\right] \tag{2}$$
and
$$F(U) = 1 - \exp\left[-\left(\frac{U}{A}\right)^{k}\right] \tag{3}$$
where $f(U)$ is the probability of observing the wind speed $U$, $k$ is the shape parameter and $A$ is the scale parameter (m/s) of the distribution. The parameter $k$ indicates the wind potential and what peak the distribution can reach. Its value ranges between 1 and 3. A lower $k$ value signifies highly variable winds, while constant winds are characterized by a larger $k$. The parameter $A$ denotes how windy the site under study is, and it takes a value proportional to the mean wind speed (Manwell et al., 2010; Sukkiramathi and Seshaiah, 2020).
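As an illustration of the two-parameter pdf and cdf, the following minimal Python sketch evaluates both functions. The function names and the values k = 2 and A = 6 m/s are hypothetical and chosen only for demonstration; they are not estimates from any site in this study.

```python
import math

def weibull2_pdf(u, k, a):
    """Two-parameter Weibull pdf at wind speed u (m/s); assumes k >= 1 and a > 0."""
    if u < 0:
        return 0.0
    return (k / a) * (u / a) ** (k - 1) * math.exp(-((u / a) ** k))

def weibull2_cdf(u, k, a):
    """Two-parameter Weibull cdf: probability that the wind speed is <= u."""
    if u < 0:
        return 0.0
    return 1.0 - math.exp(-((u / a) ** k))

# Illustrative values only (not taken from the paper's tables).
k, a = 2.0, 6.0
for u in (1.0, 6.0, 12.0):
    print(f"U = {u:5.1f} m/s  pdf = {weibull2_pdf(u, k, a):.4f}  cdf = {weibull2_cdf(u, k, a):.4f}")
```

At U = A the cdf equals 1 - exp(-1) for any shape parameter, which gives a quick sanity check on a fitted scale parameter.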

Three-parameter Weibull distribution
The three-parameter Weibull pdf and cdf for wind speed are given, respectively, by
$$f(U) = \frac{k}{A}\left(\frac{U-\theta}{A}\right)^{k-1}\exp\left[-\left(\frac{U-\theta}{A}\right)^{k}\right] \tag{4}$$
and
$$F(U) = 1 - \exp\left[-\left(\frac{U-\theta}{A}\right)^{k}\right] \tag{5}$$
where $f(U)$ is the probability of observing the wind speed $U$, $k$ is the shape parameter, $A$ is the scale parameter (m/s), and $\theta$ is the shift or location parameter (m/s) of the distribution. If $\theta = 0$, $f(U)$ and $F(U)$ become the pdf and cdf of a two-parameter Weibull distribution, respectively.
As the name implies, the shift parameter, $\theta$, shifts the distribution along the abscissa. When $\theta = 0$, the distribution starts at $U = 0$, that is, at the origin. If $\theta > 0$, the distribution starts at the location $\theta$ to the right of the origin. If $\theta < 0$, the distribution starts at the location $\theta$ to the left of the origin. For the distribution of wind speed, $\theta$ provides an estimate of the earliest time-to-start the wind (Tuller and Brett, 1984; Wais, 2017).
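The effect of the shift parameter can be sketched in a few lines of Python. The function names and parameter values below are hypothetical and purely illustrative; the sketch only shows how a negative, zero or positive shift changes the probability assigned to low wind speeds.

```python
import math

def weibull3_pdf(u, k, a, theta=0.0):
    """Three-parameter Weibull pdf; theta = 0 gives the two-parameter case (assumes k >= 1)."""
    z = (u - theta) / a
    if z < 0:
        return 0.0
    return (k / a) * z ** (k - 1) * math.exp(-(z ** k))

def weibull3_cdf(u, k, a, theta=0.0):
    """Three-parameter Weibull cdf; zero to the left of the shift theta."""
    z = (u - theta) / a
    if z < 0:
        return 0.0
    return 1.0 - math.exp(-(z ** k))

# Illustrative values only: probability of a wind speed <= 1 m/s for three shifts.
k, a = 2.0, 6.0
for theta in (-1.5, 0.0, 1.5):
    print(f"theta = {theta:+.1f}  P(U <= 1 m/s) = {weibull3_cdf(1.0, k, a, theta):.4f}")
```

With these values, a negative shift raises the probability of speeds at or below 1 m/s, while a positive shift of 1.5 m/s assigns that range zero probability.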

Methods of Estimating Weibull Parameters
To estimate the Weibull parameters, we propose a Bayesian approach and compare its performance with a popular frequentist approach, the maximum likelihood estimation (MLE) method.

Two-parameter distribution
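MLE is the most popular technique for deriving estimators (Casella and Berger, 2002; Aukitino et al., 2017; Chaurasiya et al., 2018). If $U_1, \ldots, U_n$ are the wind speed values with the Weibull density function given in (2), the shape parameter ($k$) and scale parameter ($A$) are estimated by maximizing the log-likelihood function
$$\ln L(k, A) = n\ln k - nk\ln A + (k-1)\sum_{i=1}^{n}\ln U_i - \sum_{i=1}^{n}\left(\frac{U_i}{A}\right)^{k}$$
Setting the partial derivative with respect to $A$ to zero gives
$$A = \left(\frac{1}{n}\sum_{i=1}^{n} U_i^{k}\right)^{1/k} \tag{6}$$
Finally, equation (7) is used for estimating the shape parameter ($k$) as:
$$k = \left[\frac{\sum_{i=1}^{n} U_i^{k}\ln U_i}{\sum_{i=1}^{n} U_i^{k}} - \frac{1}{n}\sum_{i=1}^{n}\ln U_i\right]^{-1} \tag{7}$$
which may be solved to obtain the estimate of $k$ using the Newton-Raphson method or any other numerical procedure, because equation (7) does not have a closed-form solution. When $k$ is obtained, the value of $A$ is found from equation (6).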
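The numerical solution for the two-parameter case can be sketched as follows. This is a minimal illustration, not the procedure used in this study: it assumes the standard Weibull score equation for k, sum(U^k ln U)/sum(U^k) - 1/k - mean(ln U) = 0, and solves it by bisection (a simple stand-in for Newton-Raphson). The function name, bracket limits and synthetic data are hypothetical.

```python
import math
import random

def weibull2_mle(speeds, tol=1e-10):
    """MLE of (k, A) for strictly positive wind speeds, solving the score equation by bisection."""
    logs = [math.log(u) for u in speeds]
    mean_log = sum(logs) / len(speeds)

    def score(k):
        weights = [u ** k for u in speeds]
        weighted = sum(w * l for w, l in zip(weights, logs)) / sum(weights)
        return weighted - 1.0 / k - mean_log

    lo, hi = 0.05, 50.0  # assumed bracket: score(lo) < 0 < score(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    a = (sum(u ** k for u in speeds) / len(speeds)) ** (1.0 / k)
    return k, a

# Synthetic data with known parameters (shape 2, scale 6 m/s), for illustration only.
rng = random.Random(0)
speeds = [rng.weibullvariate(6.0, 2.0) for _ in range(2000)]
k_hat, a_hat = weibull2_mle(speeds)
print(f"k = {k_hat:.3f}, A = {a_hat:.3f} m/s")
```

Bisection is slower than Newton-Raphson but needs no derivative and cannot diverge once the root is bracketed, which makes it convenient for a sketch.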

Three-parameter distribution
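The likelihood function $L$ for estimating the parameters is given by
$$L(k, A, \theta) = \prod_{i=1}^{n}\frac{k}{A}\left(\frac{U_i-\theta}{A}\right)^{k-1}\exp\left[-\left(\frac{U_i-\theta}{A}\right)^{k}\right]$$
Setting the partial derivatives of $\ln L$ with respect to $k$, $A$ and $\theta$ to zero gives the equations of MLE of the parameters as shown in Equations (8)-(10):
$$\frac{n}{k} + \sum_{i=1}^{n}\ln\left(\frac{U_i-\theta}{A}\right) - \sum_{i=1}^{n}\left(\frac{U_i-\theta}{A}\right)^{k}\ln\left(\frac{U_i-\theta}{A}\right) = 0 \tag{8}$$
$$-\frac{nk}{A} + \frac{k}{A}\sum_{i=1}^{n}\left(\frac{U_i-\theta}{A}\right)^{k} = 0 \tag{9}$$
$$-(k-1)\sum_{i=1}^{n}\frac{1}{U_i-\theta} + \frac{k}{A}\sum_{i=1}^{n}\left(\frac{U_i-\theta}{A}\right)^{k-1} = 0 \tag{10}$$
There is no closed-form solution of these equations, but the non-linear equations (8)-(10) may be solved by applying optimization techniques such as the Newton-Raphson method or other numerical procedures (Lawless, 2003; Teimouri and Gupta, 2013).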
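One simple way to approximate the three-parameter MLE numerically is a profile search over the shift parameter: for each candidate shift below the smallest observation, the two-parameter MLE is fitted to the shifted data and the shift with the highest log-likelihood is kept. The Python sketch below illustrates this idea under stated assumptions; it is not the optimization routine used in this study, and the function names, grid and synthetic data are hypothetical.

```python
import math
import random

def fit_weibull2(x):
    """Two-parameter Weibull MLE (k, a) for positive data, by bisection on the score equation."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(x)

    def score(k):
        w = [v ** k for v in x]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    lo, hi = 0.05, 50.0
    while hi - lo > 1e-8:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    a = (sum(v ** k for v in x) / len(x)) ** (1.0 / k)
    return k, a

def loglik3(speeds, k, a, theta):
    """Log-likelihood of the three-parameter Weibull; requires theta < min(speeds)."""
    total = 0.0
    for u in speeds:
        z = (u - theta) / a
        total += math.log(k / a) + (k - 1) * math.log(z) - z ** k
    return total

def fit_weibull3_profile(speeds, theta_grid):
    """Return (loglik, k, a, theta) maximizing the profile log-likelihood over theta_grid."""
    best = None
    u_min = min(speeds)
    for theta in theta_grid:
        if theta >= u_min:  # the shift must lie below every observation
            continue
        k, a = fit_weibull2([u - theta for u in speeds])
        ll = loglik3(speeds, k, a, theta)
        if best is None or ll > best[0]:
            best = (ll, k, a, theta)
    return best

# Synthetic data with a true shift of 1 m/s (shape 2.5, scale 6 m/s), for illustration only.
rng = random.Random(0)
speeds = [1.0 + rng.weibullvariate(6.0, 2.5) for _ in range(500)]
grid = [i / 10 for i in range(-20, 20)]
ll, k_hat, a_hat, theta_hat = fit_weibull3_profile(speeds, grid)
print(f"k = {k_hat:.3f}, A = {a_hat:.3f}, theta = {theta_hat:.1f}, log-likelihood = {ll:.1f}")
```

Because the grid contains a shift of zero, the selected fit can never have a lower log-likelihood than the two-parameter fit, which mirrors the fact that the two-parameter model is a special case of the three-parameter one.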

Evaluation of MLE methods
To determine the best model, we can compare the fit of the two MLE methods using different measures of goodness-of-fit (Luceño, 2008;Ramachandran and Tsokos, 2015;Cousineau and Allan, 2015). The most used criteria are:

Log-likelihood (log-like):
If a pdf $f(U \mid \hat{\Theta})$ is fitted to the wind speed data and $\hat{\Theta}$ is the estimated parameter of the distribution, then the log-likelihood for the goodness of fit is obtained by the following equation:
$$\ln L = \sum_{i=1}^{n}\ln f(U_i \mid \hat{\Theta}) \tag{11}$$
where $U_i$ is the $i$th observed wind speed and $n$ is the number of observations in the dataset. A higher log-likelihood value indicates a better fit.

Akaike information criteria (AIC):
If $k$ is the number of distribution parameters to estimate (here $k$ denotes the parameter count, not the shape parameter), the AIC is obtained using equation (12):
$$AIC = -2\ln L + 2k \tag{12}$$
A lower value of AIC indicates that the model fits the data better. Compared to the log-likelihood, this criterion takes into consideration the parsimony of the model, as it includes a penalty term that increases with the number of parameters.

Bayesian information criteria (BIC):
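This criterion is obtained using equation (13):
$$BIC = -2\ln L + k\ln n \tag{13}$$
where $k$ is again the number of distribution parameters and $n$ is the number of observations. Similar to AIC, a lower value of BIC indicates that the model fits the data better. However, BIC provides a stronger penalty than AIC for additional parameters.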
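The three likelihood-based criteria can be computed together, as in the minimal Python sketch below. It uses the standard definitions AIC = -2 ln L + 2p and BIC = -2 ln L + p ln n, where p is the number of parameters. The wind speeds and parameter values are made up for illustration and are not fitted estimates.

```python
import math

def weibull_loglik(speeds, k, a, theta=0.0):
    """Log-likelihood of a Weibull model; theta = 0 gives the two-parameter case."""
    return sum(
        math.log(k / a) + (k - 1) * math.log((u - theta) / a) - ((u - theta) / a) ** k
        for u in speeds
    )

def aic(loglik, n_params):
    """Akaike information criterion: lower is better."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Made-up wind speeds (m/s) and hypothetical parameter values, for illustration only.
speeds = [2.1, 3.4, 4.8, 5.5, 6.2, 7.9, 9.3, 4.1, 5.0, 6.6]
ll_2p = weibull_loglik(speeds, k=2.5, a=6.2)
ll_3p = weibull_loglik(speeds, k=3.0, a=7.0, theta=-0.8)
print("2-p:", round(aic(ll_2p, 2), 2), round(bic(ll_2p, 2, len(speeds)), 2))
print("3-p:", round(aic(ll_3p, 3), 2), round(bic(ll_3p, 3, len(speeds)), 2))
```

The BIC penalty per parameter, ln n, exceeds the AIC penalty of 2 once n is larger than about 7, which is why BIC is the stricter criterion for datasets of any realistic size.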

Kolmogorov-Smirnov (KS):
The Kolmogorov-Smirnov (KS) test is also used to check the adequacy of a given theoretical distribution for a given set of wind speed data. The KS test computes the maximum difference between the predicted and observed distributions, and the test statistic $D$ is given by:
$$D = \max_{1 \le i \le n}\left|\hat{F}_i - F_i\right| \tag{14}$$
where $\hat{F}_i$ is the $i$th predicted cumulative probability from the theoretical cdf and $F_i$ is the empirical probability of the $i$th observed wind speed.
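A minimal Python sketch of the KS statistic follows. It uses the common two-sided form, which compares the theoretical cdf with the empirical cdf just before and just after each sorted observation; this is an assumption about the exact variant, and the function name and example values are hypothetical.

```python
import math

def ks_statistic(speeds, cdf):
    """Two-sided KS distance between the empirical cdf of speeds and a theoretical cdf."""
    xs = sorted(speeds)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f_hat = cdf(x)
        # Compare against the empirical cdf just after (i/n) and just before ((i-1)/n) x.
        d = max(d, i / n - f_hat, f_hat - (i - 1) / n)
    return d

def weibull_cdf(k, a, theta=0.0):
    """Return a Weibull cdf as a function of u (hypothetical helper)."""
    def cdf(u):
        z = (u - theta) / a
        return 0.0 if z < 0 else 1.0 - math.exp(-(z ** k))
    return cdf

# Made-up wind speeds (m/s) and hypothetical parameters, for illustration only.
speeds = [2.1, 3.4, 4.8, 5.5, 6.2, 7.9, 9.3, 4.1, 5.0, 6.6]
print(round(ks_statistic(speeds, weibull_cdf(2.5, 6.2)), 4))
```

A smaller value of the statistic means the fitted cdf tracks the empirical distribution more closely.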

Anderson-Darling (AD):
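For a finite data sample, the Anderson-Darling statistic is computed as:
$$AD = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln\hat{F}\left(U_{(i)}\right) + \ln\left(1-\hat{F}\left(U_{(n+1-i)}\right)\right)\right] \tag{15}$$
where $\hat{F}$ is the theoretical cdf and $U_{(1)} \le \cdots \le U_{(n)}$ are the observed wind speeds sorted in ascending order.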

The Bayesian Method
A classical frequentist approach such as MLE has certain drawbacks. Most of its properties hold only for large sample sizes, and it requires a symmetric form of the sample distribution. The Bayesian approach, however, is free from such limitations. Moreover, Bayesian simulation tools provide an exact method of inference even if the sample size is very small. Thus, in real-life situations where the sample size may be small, Bayesian methods seem to be more suitable than frequentist methods if prior information about the parameters is available.
In this paper, a Bayesian inference approach for modeling of wind speed data is proposed. In the Bayesian paradigm, data and prior information about the parameters are combined together to make an inference about the parameters of interest.
The most influential contribution of the Bayesian approach is its modification of the likelihood function into a posterior, a valid probability distribution defined by the classic Bayes' rule. The posterior distribution of the parameters ($\Theta$) given the wind speed data ($U$) is expressed as:
$$p(\Theta \mid U) = \frac{p(U \mid \Theta)\,p(\Theta)}{p(U)} \tag{16}$$
The denominator of equation (16), $p(U)$, normalizes the posterior. Since it does not depend on the parameters, it is often convenient to write the posterior distribution as:
$$p(\Theta \mid U) \propto p(U \mid \Theta)\,p(\Theta) \tag{17}$$
that is, the posterior distribution of the parameters is proportional to the likelihood function times the prior distribution of the parameters. While fitting wind speed data, we use a non-informative uniform prior distribution, as we have very little prior knowledge about the model parameters. In Bayesian computations, a sample from the joint posterior distribution is obtained by Markov chain Monte Carlo (MCMC) simulation using a Gibbs sampler. The desired posterior summaries are then calculated from this sample.
In this paper, the software JAGS is used to fit the model. The R package R2jags is used to summarize the posterior inference, which is discussed in more detail in Section 4.2.1.

Bayesian Fitting of Weibull distribution with JAGS
JAGS, an acronym for "Just Another Gibbs Sampler" (Plummer, 2003), accepts a model string written in an R-like syntax, compiles it, and generates MCMC samples from the model using Gibbs sampling. It is open-source software written in C++ using GNU compilers and packaging tools, freely available at http://mcmc-jags.sourceforge.net/. R packages such as R2jags or rjags allow running JAGS models within R on Windows machines for the summarization of posterior inference.
In the present work, we develop JAGS models for the Bayesian fit of wind speed with two-parameter and three-parameter Weibull distributions and generate MCMC data. In the MCMC simulations, we run the Gibbs sampler with the JAGS models for 10,000 iterations using the jags function in the R2jags package (Su and Yajima, 2020). The posterior estimates of the parameters are obtained from 3 chains, discarding the first 1000 iterations of each chain as burn-in to attain convergence and thinning the remainder at an interval of five, which leaves a sample of (10,000 - 1000)/5 = 1800 draws per chain. The model specifications to perform the Bayesian fit, and the jags function and its arguments, are presented in Appendix 1.
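The MCMC settings described above (10,000 iterations, a burn-in of 1000 and a thinning interval of five) can be illustrated with a small self-contained Python sketch. It is not the JAGS model used in this study: it swaps the Gibbs sampler for a plain random-walk Metropolis sampler on the two-parameter Weibull posterior with flat priors, runs a single chain, and uses hypothetical prior bounds, proposal step sizes and synthetic data.

```python
import math
import random

def log_posterior(k, a, speeds, sum_log):
    """Log posterior up to a constant, with flat priors on hypothetical ranges 0<k<10, 0<A<30."""
    if not (0.0 < k < 10.0 and 0.0 < a < 30.0):
        return -math.inf
    n = len(speeds)
    return (n * math.log(k) - n * k * math.log(a)
            + (k - 1.0) * sum_log - sum((u / a) ** k for u in speeds))

def metropolis_weibull2(speeds, n_iter=10000, burn_in=1000, thin=5, seed=1):
    """Random-walk Metropolis; returns the retained (k, A) draws after burn-in and thinning."""
    rng = random.Random(seed)
    sum_log = sum(math.log(u) for u in speeds)
    k, a = 1.0, sum(speeds) / len(speeds)  # crude starting values
    current = log_posterior(k, a, speeds, sum_log)
    kept = []
    for it in range(n_iter):
        k_new = k + rng.gauss(0.0, 0.1)  # assumed proposal step sizes
        a_new = a + rng.gauss(0.0, 0.2)
        proposed = log_posterior(k_new, a_new, speeds, sum_log)
        delta = proposed - current
        if delta >= 0 or rng.random() < math.exp(delta):
            k, a, current = k_new, a_new, proposed
        if it >= burn_in and (it - burn_in) % thin == 0:
            kept.append((k, a))
    return kept

# Synthetic data with known parameters (shape 2, scale 6 m/s), for illustration only.
data_rng = random.Random(0)
speeds = [data_rng.weibullvariate(6.0, 2.0) for _ in range(300)]
draws = metropolis_weibull2(speeds)
k_mean = sum(d[0] for d in draws) / len(draws)
a_mean = sum(d[1] for d in draws) / len(draws)
print(len(draws), round(k_mean, 3), round(a_mean, 3))
```

With these settings the chain retains (10,000 - 1000)/5 = 1800 draws, matching the per-chain sample size quoted above; posterior means and credible limits are then simple summaries of the retained draws.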

Evaluation of Bayesian Models
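The standard likelihood, AIC and BIC statistics discussed in Section 4.1.3 are not relevant while evaluating Bayesian methods such as MCMC. Instead, Spiegelhalter et al. (2002) suggested that the Deviance Information Criterion (DIC) be used to compare models. The DIC is a generalization of AIC that is based on the deviance statistic:
$$D(\Theta) = -2\ln p(U \mid \Theta) + 2\ln h(U) \tag{18}$$
where $h(U)$ is some standardizing function of the data. The DIC is then defined as:
$$DIC = \bar{D} + p_D, \qquad p_D = \bar{D} - D(\bar{\Theta}) \tag{19}$$
where $\bar{D}$ is the posterior mean of the deviance, $\bar{\Theta}$ is the posterior mean of the parameters and $p_D$ is the effective number of parameters.
Then, the expected wind power ($\bar{P}$) is estimated by combining the wind power relation with the pdf of the wind speed. Substituting (20), (2) and (4) in (21), the expected wind powers for the two-parameter and three-parameter Weibull distributions, respectively, are determined.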

If $t_{Year} = 8760$ is the total number of hours in a year, the expected wind energies for the two-parameter and three-parameter Weibull distributions, respectively, are determined by multiplying the corresponding expected wind power by $t_{Year}$.
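The calculation of expected wind power and annual energy can be sketched numerically as below. This is only an illustration under stated assumptions: it takes the wind power per unit swept area as 0.5 ρ U³ with an assumed air density of 1.225 kg/m³, integrates over physical wind speeds U ≥ max(θ, 0) with the trapezoidal rule, and uses hypothetical function names and parameter values. It does not reproduce the closed-form expressions of this study.

```python
import math

RHO = 1.225            # assumed air density, kg/m^3
HOURS_PER_YEAR = 8760  # total number of hours in a year

def weibull3_pdf(u, k, a, theta=0.0):
    """Three-parameter Weibull pdf; theta = 0 gives the two-parameter case (assumes k >= 1)."""
    z = (u - theta) / a
    if z < 0:
        return 0.0
    return (k / a) * z ** (k - 1) * math.exp(-(z ** k))

def expected_power_density(k, a, theta=0.0, u_max=60.0, steps=6000):
    """Trapezoidal estimate of the integral of 0.5*rho*U^3*f(U) over max(theta, 0) <= U <= u_max (W/m^2)."""
    u_lo = max(theta, 0.0)
    h = (u_max - u_lo) / steps
    total = 0.0
    for i in range(steps + 1):
        u = u_lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * 0.5 * RHO * u ** 3 * weibull3_pdf(u, k, a, theta)
    return total * h

def annual_energy_kwh_per_m2(power_density):
    """Expected annual energy per unit area (kWh/m^2) from mean power density (W/m^2)."""
    return power_density * HOURS_PER_YEAR / 1000.0

# Hypothetical parameter values, for illustration only.
p2 = expected_power_density(k=2.0, a=7.0)
p3 = expected_power_density(k=2.6, a=8.8, theta=-1.5)
print(round(p2, 1), round(annual_energy_kwh_per_m2(p2), 1))
print(round(p3, 1), round(annual_energy_kwh_per_m2(p3), 1))
```

For a zero shift, the numerical result can be checked against the well-known closed form 0.5 ρ A³ Γ(1 + 3/k) for the mean power density of a two-parameter Weibull wind regime.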

Analyzing performance of different estimators
The efficiency and performance of the MLE and Bayesian methods for estimating two-parameter and three-parameter Weibull distributions were determined using different goodness-of-fit and error measures, namely the coefficient of determination ($R^2$), root mean square error (RMSE), coefficient of efficiency (COE), mean absolute error (MAE) and mean absolute percentage error (MAPE). Arithmetically, these are computed as follows (Azad et al., 2014; Kidmo et al., 2015; Aukitino et al., 2017):

Coefficient of determination ($R^2$):
It is a statistical measure that gives some information about the goodness of fit of a model, that is, how much of the variance of the observed data is explained by the fitted model. It is defined as:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(U_i - \hat{U}_i\right)^2}{\sum_{i=1}^{n}\left(U_i - \bar{U}\right)^2} \tag{26}$$
where $n$ is the number of observations, $U_i$ is the $i$th actual data, $\hat{U}_i$ is the $i$th predicted data with the Weibull distribution, and $\bar{U}$ is the mean of the actual data. A higher $R^2$ value indicates a better fit, and $R^2 = 1$ indicates that the regression predictions perfectly fit the data.

Root mean square error (RMSE):
It determines the deviation of the predicted values of wind speed from the observed values and is obtained by
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(U_i - \hat{U}_i\right)^2} \tag{27}$$
A smaller RMSE value normally indicates accurate modeling. The calculated RMSE value approaches zero as the difference between the observed and predicted values becomes smaller (Indhumathy et al., 2014).

Coefficient of efficiency (COE):
It quantifies the ratio of the difference between the predicted wind speed and the mean wind speed to the difference between the actual values and the average of wind speeds. A higher COE value indicates a good fit for the data. It is expressed as:
$$COE = \frac{\sum_{i=1}^{n}\left(\hat{U}_i - \bar{U}\right)^2}{\sum_{i=1}^{n}\left(U_i - \bar{U}\right)^2} \tag{28}$$

Mean absolute error (MAE):
The mean absolute error is a measure of the absolute difference between predicted and actual values. A smaller value of MAE indicates higher accuracy. The MAE is mathematically expressed as:
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|U_i - \hat{U}_i\right| \tag{29}$$

Mean absolute percentage error (MAPE):
It is a comparative measure, indicating the error as a percentage of the actual data, which helps assess the accuracy of the prediction method. Like MAE, a lower value of MAPE indicates better accuracy. It is mathematically expressed as:
$$MAPE = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{U_i - \hat{U}_i}{U_i}\right| \tag{30}$$
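The five measures described in this section can be computed together, as in the following minimal Python sketch. It assumes the usual textbook forms (R² as one minus the ratio of residual to total sum of squares, COE as the ratio of the predicted to the actual sum of squares about the observed mean); the function name and the small data set are hypothetical.

```python
import math

def fit_metrics(actual, predicted):
    """Return R^2, RMSE, COE, MAE and MAPE (%) for paired actual and predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    sse = sum((u - p) ** 2 for u, p in zip(actual, predicted))  # residual sum of squares
    sst = sum((u - mean_actual) ** 2 for u in actual)           # total sum of squares
    ssp = sum((p - mean_actual) ** 2 for p in predicted)        # predicted sum of squares
    return {
        "R2": 1.0 - sse / sst,
        "RMSE": math.sqrt(sse / n),
        "COE": ssp / sst,
        "MAE": sum(abs(u - p) for u, p in zip(actual, predicted)) / n,
        "MAPE": 100.0 / n * sum(abs((u - p) / u) for u, p in zip(actual, predicted)),
    }

# Tiny made-up example, for illustration only.
metrics = fit_metrics(actual=[2.0, 4.0, 6.0], predicted=[2.0, 5.0, 6.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that MAPE is undefined when an actual value is zero, so null observations would need to be excluded or binned before this measure is applied.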

Results
In this section, the results of fitting two-parameter (2-p) and three-parameter (3-p) Weibull distributions are presented. Further, the results of applying MLE and the proposed Bayesian approach for estimating the parameters, as discussed in Sections 3 and 4, are also presented. To accomplish this, wind speed data at seven different sites were collected, as mentioned in Section 2. Table 3 provides wind speed distributions at these sites. The table shows that the range of wind speed varies between sites. The narrowest range of wind speed was observed at site 1 (0-19 m/s) and the widest range was found at site 3 (0-34 m/s). Some sites have a larger share of low to null wind speeds (0-1 m/s). Both the 2-p and 3-p Weibull distributions were fitted to the recorded wind speed data, and the parameters of the distributions were estimated using the MLE and the Bayesian methods. In the MLE method, the goodness of fit of the 2-p and 3-p Weibull distributions is evaluated using the statistical measures AIC, BIC, AD, KS and log-like. In the Bayesian estimates, uniform prior distributions of the parameters were used to fit the wind data. First, a sample from the joint posterior distribution is obtained by MCMC simulation using a Gibbs sampler, as discussed in Section 4.2.1. Finally, the DIC is obtained to evaluate the Bayesian parameters of the Weibull distributions. Table 5 presents the estimated mean values of the parameters with standard deviation (SD) of both the two-parameter and three-parameter Weibull distributions for all the sites. The 95% credible region (lower limit: 2.5%; upper limit: 97.5%) for each of the parameters and the model evaluation statistic DIC are also presented.
The goodness-of-fit criteria and summary statistics presented in Tables 4 and 5 indicate that the 3-p Weibull distribution fits better than the 2-p Weibull distribution for wind speed at sites 1, 2, 3, 5 and 7, as all the goodness-of-fit measures (AIC, BIC, AD, KS and log-like) are smaller in the MLE estimates and the DIC is also smaller in the Bayesian estimates. Moreover, as shown in Table 5

Trace plot (left) shows a 'fat hairy caterpillar' appearance, which indicates a random scatter around the stable mean of the shape parameter k = 1.947640 within the 95% credible region (1.936812, 1.958478) and the scale parameter A = 7.017671 within the 95% credible region (6.992180, 7.042895). The density plots (right) also display smooth curves of these simulated values for both parameters. Thus, the plots clearly indicate the convergence of the simulations of the Bayesian estimates presented in Table 5.

Trace plot (left) shows a 'fat hairy caterpillar' appearance, which indicates a random scatter around the stable mean of the shape parameter k = 2.635850 within the 95% credible region (2.604492, 2.667974) and the scale parameter A = 8.804653 within the 95% credible region (8.718409, 8.892835). The shift parameter also shows a random scatter around the stable mean of θ = -1.546999 within the 95% credible region (-1.624995, -1.472661), which is far from zero and reveals the appropriateness of using the 3-p Weibull distribution. The density plots (right) also display smooth curves of these simulated values for all the parameters. Thus, the plots clearly indicate the convergence of the simulations of the Bayesian estimates presented in Table 5.

Figure 3: Trace and posterior density plots for site 4 (2-p Weibull); panels show the shape (k) and scale (A) parameters. Trace plot (left) shows a 'fat hairy caterpillar' appearance, which indicates a random scatter around the stable mean of the shape parameter k = 2.385596 within the 95% credible region (2.347390, 2.424420) and the scale parameter A = 8.446062 within the 95% credible region (8.367275, 8.525195). The density plots (right) also display smooth curves of these simulated values for both parameters. Thus, the plots clearly indicate the convergence of the simulations of the Bayesian estimates presented in Table 5.

Figure 4: Trace and posterior density plots for site 4 (3-p Weibull); panels show the shape (k), scale (A) and shift (θ) parameters. Trace plot (left) shows a 'fat hairy caterpillar' appearance, which indicates a random scatter around the stable mean of the shape parameter k = 2.405809 within the 95% credible region (2.356128, 2.461114) and the scale parameter A = 8.507469 within the 95% credible region (8.377540, 8.660753). The shift parameter also shows a random scatter around the stable mean of θ = -0.055714 within the 95% credible region (-0.169674, 0.031573), which includes zero and reveals the appropriateness of using the 2-p Weibull distribution. The density plots (right) also display smooth curves of these simulated values for all the parameters. Thus, the plots clearly indicate the convergence of the simulations of the Bayesian estimates presented in Table 5.

Discussion
In Section 7, the results for the goodness of fit for the wind speed distributions at seven different sites were presented. Results showed that the 3-p Weibull distribution was a better fit for wind speeds at all the sites investigated in the Equatorial region, except sites 4 and 6. The 2-p Weibull distribution may be a better fit for the wind speed data at sites 4 and 6, as the shift parameter θ in the 3-p Weibull distribution was found to be zero in the Bayesian estimates. As discussed earlier, the 2-p Weibull distribution can be considered a special case of the 3-p Weibull distribution.
In this section, further investigations are carried out to explain the difference between the performance of the two distributions. Referring to the percentage of lowest wind speed (0-1 m/s) presented in Table 6, the results clearly show that the wind distributions of the sites (sites 1, 2, 3, 5 and 7) that have a high percentage (0.79%-7.04%) of low (or close to null) wind speeds are fitted better by the 3-p Weibull distribution. The shift parameter in the simulations is also found to be significant, i.e. θ ≠ 0, for these sites. On the other hand, the 2-p Weibull distribution was a better fit for wind speed distributions at sites 4 and 6, where the percentage of low wind speed was smaller (< 0.78%). Similar findings were reported by Wais (2017).
Moreover, the histograms presented in Figures 5-11 for wind distributions at sites 1 to 7 show different shapes, indicating a variation in skewness. Thus, another factor that determines which distribution fits better is the skewness of the wind speed distribution. The skewness ($\gamma_1$) is a measure of the asymmetry of the wind speed distribution about its mean, which is defined for a sample of $n$ values as:
$$\gamma_1 = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(U_i - \bar{U}\right)^3}{s^3} \tag{31}$$
where $s$ is the standard deviation and the skewness is expected to be positive. Table 6 presents the mean ($\bar{U}$), standard deviation ($s$) and skewness ($\gamma_1$) of the wind speed data at each site. It shows that the wind speed distributions of sites 4 and 6 have higher skewness compared to sites 1, 2, 3, 5 and 7. Thus, the results reveal that the 3-p Weibull distribution is a better fit than the 2-p Weibull distribution for wind speed data with both a greater frequency of low wind speeds (0-1 m/s) and low skewness. The Bayesian analysis also confirms that the wind speed data with a smaller percentage of low speeds are fitted better by a 2-p Weibull distribution than by the 3-p Weibull distribution, as the location parameter θ was found to be zero.
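The two site descriptors used in this discussion, the percentage of low wind speeds and the skewness, can be computed as in the minimal Python sketch below. It assumes a moment-based skewness with the population standard deviation, which may differ slightly from the estimator used for Table 6; the function names and the sample values are hypothetical.

```python
import math

def skewness(values):
    """Moment-based sample skewness: mean cubed deviation divided by s^3 (population s)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum((v - mean) ** 3 for v in values) / n / s ** 3

def low_speed_percentage(values, upper=1.0):
    """Percentage of observations in the low wind speed range 0 to `upper` m/s."""
    return 100.0 * sum(1 for v in values if 0.0 <= v <= upper) / len(values)

# Made-up wind speeds (m/s), for illustration only.
speeds = [0.4, 2.1, 3.4, 4.8, 5.5, 6.2, 7.9, 9.3, 4.1, 12.6]
print(f"skewness = {skewness(speeds):.3f}")
print(f"share of 0-1 m/s = {low_speed_percentage(speeds):.1f}%")
```

A positive result indicates a longer tail towards high wind speeds, while a value near zero indicates a nearly symmetric histogram.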
To reiterate, this research aims to compare the goodness of fit of the 2-p and 3-p Weibull distributions and to compare the performance of the frequentist MLE and Bayesian methods for estimating the Weibull parameters. Therefore, the four methods are compared using the fitted curves and histograms in Figures 5-11 and the goodness-of-fit measures in Table 7.
Figure 5: 2-p and 3-p Weibull curves by four methods and histogram of the observed wind speeds at site 1.
The wind speed distribution is almost symmetric, and the standard deviation of the wind speed at this site is the lowest of all the sites. The mean wind speed, 5.35 m/s, is close to the centre of the distribution. The percentage of the lowest wind speeds (0-1 m/s) is relatively high at this site, hence the Bayesian 3-p Weibull distribution fitted the data best, as can be seen from Tables 5 and 7.
Figure 6: 2-p and 3-p Weibull curves by four methods and histogram of the observed wind speeds at site 2.
The wind speed bars are very symmetric on both sides of the peak at 6-7 m/s, giving a symmetric Weibull distribution curve with a very small skewness. The percentage of the lowest wind speeds (0-1 m/s) is similar to that at site 1, while the mean wind speed is 5.83 m/s, as is evident from a comparison of Figures 5 and 6. It can also be seen that the Bayesian 3-p distribution fits the histogram well.
It is clear from [...] to sites 1 and 2 with a clearly high skewness. For this case, the percentage of the lowest wind speeds in the range of 0-1 m/s is the lowest, hence the Bayesian 2-p distribution fits the wind speed data well. Although the four Weibull distribution curves appear to coincide, the Bayesian 2-p distribution curve lies above the 3-p curve before the peak and below it after the peak. From Table 4, it can be seen that the value of the shift parameter is the smallest (close to zero) for this site, hence the 3-p Weibull distribution reduces to the 2-p distribution.
Figure 9: 2-p and 3-p Weibull curves by four methods and histogram of the observed wind speeds at site 5.
There are relatively high frequencies of wind speeds both below and above the mean wind speed. However, the skewness is still relatively low for this site. The percentage of the lowest wind speeds (0-1 m/s) is 0.79%, and it can be seen that the Bayesian 3-p Weibull distribution fits the histogram better. The fit is supported by the analysis presented in Tables 5 and 7.
Figure 10: 2-p and 3-p Weibull curves by four methods and histogram of the observed wind speeds at site 6.
The skewness is relatively high for this site. The mean wind speed is close to 7 m/s and the standard deviation is relatively high. The percentage of the lowest wind speeds (0-1 m/s) is 0.78%, for which the 2-p Weibull distribution is a better fit as per the analysis in Tables 5 and 7.
Figure 11: 2-p and 3-p Weibull curves by four methods and histogram of the observed wind speeds at site 7.
This site has the second highest percentage of the lowest wind speeds. It can be seen that the Bayesian 3-p distribution is a better fit here, with the 3-p curve touching almost all the bars at the peak. The error analysis in Table 7 shows that the Bayesian 3-p curve is the best fit for this site's wind speed data.
To evaluate the performance of the four methods, the five statistical goodness-of-fit measures discussed in Section 6 (R², RMSE, COE, MAE and MAPE) are estimated and compared. The results for the wind data from all seven sites are presented in Table 7, which shows that BAYESIAN.3P is the most efficient method for sites 2, 3, 5 and 7, as it produces the highest R² value and the lowest RMSE, MAE and MAPE values. This indicates that the 3-p Weibull distribution with Bayesian estimates is the better method for wind energy assessment at these sites. Moreover, the method performs as well as MLE.3P at site 1. In contrast, BAYESIAN.2P performs better for sites 4 and 6, producing the highest R² value and the lowest COE and RMSE values, which indicates that the 2-p Weibull distribution with Bayesian estimates is the better method for assessment at these two sites, which have the lowest occurrence of the smallest wind speeds (0-1 m/s).
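Four of the measures named above can be sketched as follows, comparing observed histogram frequencies with the frequencies predicted by a fitted curve. The formulas are the common textbook forms and are an assumption here, since the definitions in Section 6 are not reproduced in this section; COE is left out for the same reason. The function name `fit_measures` is hypothetical.

```python
import math

def fit_measures(observed, predicted):
    """R2, RMSE, MAE and MAPE between observed and predicted frequencies."""
    n = len(observed)
    pairs = list(zip(observed, predicted))
    mean_obs = sum(observed) / n
    # Residual and total sums of squares for the coefficient of determination.
    ss_res = sum((y - x) ** 2 for y, x in pairs)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    # MAPE is undefined for empty bins, so those bins are skipped (assumption).
    nonzero = [(y, x) for y, x in pairs if y != 0]
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(y - x) for y, x in pairs) / n,
        "MAPE": 100.0 * sum(abs((y - x) / y) for y, x in nonzero) / len(nonzero),
    }
```

Under these definitions a better fit shows a higher R² and lower RMSE, MAE and MAPE, which matches the way Table 7 is read in the text.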

Assessment of wind power and energy:
To estimate turbine productivity, it is necessary to calculate the wind power and annual energy production (AEP) obtained from each fitted model. With an air density of ρ = 1.16 kg/m³ and a turbine rotor diameter of D = 32 m, the expected annual wind power and energy at each site are determined using equations (22-25) for both the 2-p and 3-p Weibull distributions. Table 8 shows the estimated wind power and AEP.
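As an illustration of the calculation described above, the sketch below computes the mean power available in the wind from fitted Weibull parameters. It is not a reproduction of equations (22-25), which are not shown in this section: it assumes power equals 0.5 ρ × swept area × E[U³], uses the untruncated third raw moment of the shifted Weibull, ignores any turbine power curve or efficiency, and takes 8760 hours per year. All function names are hypothetical; only ρ and D come from the text.

```python
import math

RHO = 1.16               # air density in kg/m^3 (value given in the text)
DIAMETER = 32.0          # rotor diameter in m (value given in the text)
HOURS_PER_YEAR = 8760.0  # assumption: non-leap year

def third_moment(k, scale, shift=0.0):
    """E[U^3] for U = shift + X, where X is 2-p Weibull(k, scale)."""
    g1 = math.gamma(1 + 1 / k)
    g2 = math.gamma(1 + 2 / k)
    g3 = math.gamma(1 + 3 / k)
    # Binomial expansion of (shift + X)^3 with the Weibull raw moments.
    return (shift ** 3
            + 3 * shift ** 2 * scale * g1
            + 3 * shift * scale ** 2 * g2
            + scale ** 3 * g3)

def mean_wind_power(k, scale, shift=0.0, rho=RHO, diameter=DIAMETER):
    """Mean power in the wind through the rotor disc, in watts."""
    swept_area = math.pi * diameter ** 2 / 4
    return 0.5 * rho * swept_area * third_moment(k, scale, shift)

def annual_energy_kwh(power_w, hours=HOURS_PER_YEAR):
    """Energy in kWh if power_w were sustained for the given hours."""
    return power_w * hours / 1000.0
```

For example, `annual_energy_kwh(mean_wind_power(k=2.0, scale=6.0, shift=0.5))` gives an energy figure for made-up parameter values; with `shift=0.0` the same call gives the 2-p result.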
The relative error (RE) of the estimated power shown in Table 8 is calculated as follows:
$$\mathrm{RE}\,(\%) = \frac{\text{Estimated power} - \text{Actual power}}{\text{Actual power}} \times 100\%$$
If the RE is close to zero, the method's estimate is accurate. A positive RE implies an over-estimate and a negative RE implies an under-estimate by the method. From the values of power, RE and AEP presented in Table 8, BAYESIAN.2P and BAYESIAN.3P are found to be the most efficient methods for estimating power and AEP. BAYESIAN.3P yields the most accurate wind power and AEP for sites 1, 2, 3, 5 and 7, close to the actual power at these sites and with a smaller RE than the other methods. On the other hand, BAYESIAN.2P provides the best estimate of power and AEP for sites 4 and 6, with a smaller relative error. Thus, the comparison results, based on goodness of fit and power estimation at different sites in the Equatorial region, show that the 3-p Weibull distribution with Bayesian estimates is the method to be used for wind energy resource assessments. If a site has few occurrences of low wind speeds, the shift parameter in the 3-p Weibull distribution will be close to or equal to zero, which reduces it to the 2-p Weibull distribution with Bayesian estimates. Another advantage of the Bayesian approach is that it reduces the need for long-term measurements when assessing the wind power potential of a site. This is possible through the integration of prior information with short-term data from a candidate site or historical data from a neighboring survey station.
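The relative error used above to rank the methods can be sketched in a few lines. The function names and the numbers in `example` are hypothetical and purely illustrative; they are not values from Table 8.

```python
def relative_error_pct(estimated_power, actual_power):
    """Relative error in percent; positive means an over-estimate."""
    return 100.0 * (estimated_power - actual_power) / actual_power

def most_accurate(estimates, actual_power):
    """Return the label whose estimate has the smallest absolute RE."""
    return min(
        estimates,
        key=lambda label: abs(relative_error_pct(estimates[label], actual_power)),
    )

# Made-up power estimates (W) for three unnamed methods, illustration only.
example = {"A": 108.0, "B": 97.0, "C": 101.0}
```

Here `most_accurate(example, 100.0)` picks the label whose RE is closest to zero, regardless of sign.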

Conclusion
Knowledge of the correct statistical distribution of wind speeds at a given site is very important for accurate wind resource assessment. At some sites, fitting the traditional two-parameter Weibull distribution to wind speed data gives high uncertainty, which warrants exploring distributions that characterize wind speeds better, such as the three-parameter Weibull distribution, a generalized form of the two-parameter Weibull distribution with an additional non-zero shift parameter. In this study, wind characteristics and wind energy potential are investigated at seven locations in the Equatorial region: three sites in Fiji and one each in the Cook Islands, Tonga, Kiribati and Vanuatu. The wind speed data at these seven sites were tested to find the better model between the two-parameter and three-parameter Weibull distributions. Furthermore, as there is no unique method that characterizes wind data perfectly, it is also imperative to know the best method for estimating the parameters of the wind speed distribution at a given site. In this study, we also introduced a novel approach by using the Bayesian method to estimate the parameters of the wind speed distributions at the seven sites selected for testing the method. A comparison study was then conducted to assess the robustness of the proposed Bayesian method against the popular frequentist MLE method. Finally, the results suggest that the three-parameter Weibull distribution should be used in analyzing wind power potential irrespective of location. When the wind distribution has frequent low wind speeds and is less skewed, a three-parameter Weibull distribution is found to be a better fit. On the other hand, when the wind distribution has less frequent low speeds and is highly skewed, the two-parameter Weibull distribution, which is a special case of the three-parameter distribution with zero shift parameter, is found to be more appropriate.
The results also indicate that the Bayesian approach provides more accurate results when characterizing wind speeds and can be proposed as an alternative technique for estimating Weibull parameters. The proposed method can be incorporated into popular software packages such as WAsP, which are used for wind resource assessment and for planning wind energy projects.

Data accessibility statement
Sample data are provided with the manuscript. Full data will be made available upon request and after approval from the respective Governments.

Appendix 2
Trace and posterior density plots that show the convergence of Bayesian models for sites 1, 2, 5, 6 and 7.

[Figure panels: trace and posterior density plots for Shape, k and Scale, A]