Comparison of eight methods of Weibull distribution for determining the best-fit distribution parameters with wind data measured from the met-mast

To assess the wind characteristics of a specified region, a pre-analysis of the region can be made with different numerical methods. For instance, the two-parameter Weibull distribution is widely used in wind energy studies and the wind energy sector to obtain information about the wind characteristics of a region. The main goal of this study is to perform a detailed analysis of the data obtained from the wind measurement sensors on an 80 m meteorological mast to determine the wind characteristics and wind energy potential of a region in Osmaniye, Turkey. The suitability of the two-parameter Weibull distribution, the most popular probability distribution model, was investigated for describing the distribution of these wind data. To determine the Weibull distribution parameters (k and c) precisely, the suitability of eight different numerical methods was examined: graphical (GM), empirical of Justus (EMJ), empirical of Lysen (EML), power density (PDM), moment (MoM), maximum likelihood (MLM), modified maximum likelihood (MMLM), and alternative maximum likelihood (AMLM). Root-mean-square error (RMSE), chi-square (χ²), and the coefficient of determination (R²) were used to compare and verify the performance of these models. Compared with the actual measured data, MMLM performed best and GM worst among the eight methods. Wind power density was also calculated considering these methods and the prevailing wind directions.


Introduction
Energy is one of the foundation stones of a country's economic and industrial growth. In today's world, the rapid increase in energy consumption poses a fundamental problem for countries that struggle to meet the demand. Also, since not every country is self-sufficient in meeting its energy demand, countries are forced to reconsider their resource diversity and to plan their energy strategies carefully around the energy available to them. Hence, in a global energy crisis, the countries most affected are those without sufficient non-renewable energy resources. As this situation makes energy security and trade even more important globally, it is considered essential in forming states' international policies. Industrialization, urbanization, and population growth further complicate meeting the required energy supply (Kumar 2020). In addition, fossil fuels have an essential place in the energy supply chains of countries; however, their overuse causes energy and environmental problems and contributes to climate change and global warming. Because of the adverse effects of fossil fuels on human health and the environment, climate change and global warming, the decrease in countries' energy reserves, and the increase in external dependency, most countries are turning to renewable energy resources (OECD 2011). Demand for and interest in renewable energy resources are therefore growing, and they have become a focus of many countries. In this regard, renewable energy technologies are critical in solving environmental challenges by providing clean and reliable energy. Like other renewable energy sources, wind energy is clean, environmentally friendly, and sustainable, and it can reduce the use of fossil fuels (Ermolenko et al. 2017; Wang et al. 2018).
As stated above, the use of wind energy has been increasing rapidly in recent years owing to the decline of fossil fuel reserves and the negative effects of fossil fuels. The use of wind energy does not necessitate high technology requirements, and there is no transportation problem. Other important advantages are the abundance of wind energy worldwide and the low cost of energy generation. However, energy conversion is needed to benefit from wind energy: the kinetic energy of the wind is converted first into mechanical and then into electrical energy through wind turbines and generators (LeGourieres and South 1985; Ilkılıç and Türkbay 2010; Shoaib et al. 2017). The global wind report (GWEC 2019) states that new wind energy installations in 2019 exceeded 60 GW worldwide, a growth of 19% compared to 2018, and that the total installed capacity increased to 650 GW, a 10% growth compared to the previous year. Of the new installations, 54.2 GW is onshore and 6 GW is offshore wind energy. The offshore capacity accounted for 10% of the global new installation and was the highest for 2020. According to the report, the top five markets for new installations in 2019 are China, the USA, the UK, India, and Spain. Together, these countries accounted for 70% of the new global capacity in 2019. The report shows that in terms of cumulative installations, the top five markets at the end of 2019 are unchanged, together accounting for 72% of the total wind power market. Among renewable energy sources other than hydroelectric power in Turkey, wind energy ranks highest in installed power capacity (Emeksiz and Demirci 2019).
When electricity generation is analyzed by resource, natural gas-LNG ranks first with 1,863,701 MWh (33.88%), followed by imported coal with 1,112,497 MWh (20.22%), hydraulic with 1,001,784 MWh (18.21%), and hard coal-lignite with 872,840 MWh (15.87%). The amount of electricity generated via wind energy is 315,091 MWh (5.73%), and the other resources together contribute 335,202 MWh (6.09%) of the overall generation. Considering applicability, profitability, and sustainability, it is quite challenging to establish wind power plants throughout the country. Thus, it is necessary to estimate the wind energy in different regions to reveal the commercial value of wind energy (Kara and Yaniktepe 2021; Saulat et al. 2021). Since wind speed is a random variable, many researchers use the Weibull distribution (WD) function to determine the wind speed distribution and energy potential (Bilgili et al. 2004; Rehman et al. 2012, 2021; Wais 2017; Jung and Schindler 2019).
In this context, Deep et al. (2020) estimated the wind energy potential for three regions along the Indian coastline using the WD model. They described the methods used to fit the WD, namely the least-squares (LSQR), empirical (EM), energy pattern factor (EPF), maximum likelihood (MLM), modified maximum likelihood (MMLM), and moment (MoM) methods, and they estimated and compared energy potentials. Guarienti et al. (2020) analyzed the performance of numerical methods for determining the WD parameters in Mato Grosso do Sul, Brazil. They used six numerical methods (GM, MLM, MMLM, MoM, EM, and PDM) to estimate WD parameters from hourly wind speed data series of 27 stations. MLM and MMLM performed best at most of the evaluated stations. Shoaib et al. (2017) evaluated the wind energy potential and wind characteristics of Baburband (Pakistan) using the WD function, and the most suitable k and c parameters for the region were determined by comparing the MLM, MoM, EPF, and PDM methods over a 3-year period. Usta et al. (2018) used a new multi-objective moment method (MUOM) to define the WD parameters required for estimating the wind power of an area. They compared this method with well-known methods such as MLM, MMLM, MoM, and EPF and found that MUOM is more accurate than these methods. Azad et al. (2019) aimed to determine the wind energy potential and prospective wind sites of three different regions in Australia using three numerical methods for the WD. The most prospective windy site in their study is Hamilton Island, with an annual mean wind speed of 7.5 m/s. Moreover, they found that EM is the best method for estimating Weibull parameters. Li et al. (2020) evaluated the onshore and offshore wind characteristics and potential in China, using the WD function to predict wind speed. They determined Weibull parameters using MoM, MLM, the curve fitting method (CFM), and PDM, and examined wind power density and mean wind speed variations over monthly and hourly periods. Tiam Kapen et al. (2020) analyzed and compared ten numerical methods for estimating the Weibull parameters for the city of Bafoussam in the Western Region of Cameroon, using wind speed data collected at a height of 10 m at the meteorological station between 2007 and 2013. They found that MLM gives the most precise results in the simulation tests, followed by the equivalent energy method (EEM), EPF, and EMJ, respectively. The wind power density and wind energy potential results indicated that the area was unsuitable for wind farm installation, yet it could be used for rural electrification or for pumping water in agricultural applications. Sumair et al. (2020) developed a novel method, the wind energy intensification method (WEIM), to determine the Weibull parameters (k and c). They used wind data measured hourly for 3 years (2014-2017) at a mast height of 50 m at sixty different locations in Pakistan. The results of MLM and MMLM, which are frequently used in the literature, were compared with this method, and the reliability of the results was examined by statistical analysis. Bórawski et al. (2020) studied wind energy development, which has an important place in the renewable energy sector in Europe. They analyzed policy issues, installed capacity, and changes in wind capacity, providing statistics and forecasts.
Like other countries, Turkey prioritizes and invests in all renewable energy sources, especially wind energy, for which it has a high potential. Yaniktepe et al. (2013a, b) studied wind speed distribution to estimate the wind energy potential at a specific location in Osmaniye City. Using the graphical method, they calculated the WD parameters k and c, which are the parameters most commonly used in similar wind energy studies. Supciller and Toprak (2020) listed the wind turbine selection criteria found in the literature and investigated the criteria for selecting the best wind turbine in consultation with company experts. After comparing different methods, they concluded that the most suitable wind turbine could be selected with the Borda method. Celik (2004) used hourly wind speed data and the Weibull and Rayleigh statistical distribution functions to evaluate Iskenderun's wind energy potential and obtained an annual average wind energy density of 30.20 W/m². This value falls in wind power class 1 because the power density of the evaluated area is less than 100 W/m². He therefore stated that the wind energy potential of the investigated region is low and that it can only be used to meet basic requirements such as agriculture, water pumping, and lighting. Emeksiz and Demirci (2019) used an innovative hybrid site selection method to determine Turkey's offshore wind energy potential and to identify suitable coastal regions. They concluded that offshore wind farms are suitable for coastal regions of Turkey such as Bafra, Sinop, and Mersin, for which they calculated installed power capacities of 2112 MW, 1176 MW, and 1293 MW, respectively. Bagci et al. (2021) investigated the inverse Kumaraswamy distribution (IKum) as an alternative to the WD for modeling the wind speed data of Lake Van, Turkey. They compared the efficiencies of the different methods with IKum, and the results indicated that the IKum distribution is an alternative to the well-accepted WD. Yildiz et al. (2021) studied the estimation of the short-term wind power of a region using an artificial intelligence-based method. The direction and speed of the wind were used as the data set since they are the most critical parameters in power potential estimation. The method was compared with the results obtained from the physical model and from state-of-the-art deep learning architectures, and it was emphasized that the method performed better than the other architectures. Arslan et al. (2020), using wind speed data obtained from 335 stations for the period 1980-2013, comprehensively analyzed the effects of wind speed distribution on electricity generation across Turkey and emphasized that the Çatalca region has the highest wind energy potential in Turkey in terms of the most probable wind farm establishment. Gungor et al. (2020) examined the suitability of a wind farm in İzmir, the wind characteristics, and the region's wind energy potential using four different numerical methods. They also carried out an economic analysis of the wind farm investment and examined the region's suitability for investment. The standard deviation-average wind speed method was determined to be the most appropriate method, and the cost of electricity from wind was determined as 0.0111 USD/kWh. In addition, Akgül and Şenoğlu (2019) examined many different probability distribution functions as alternatives to the WD, which is the most popular in wind energy studies, and determined the most appropriate distributions for different wind regimes. They emphasized that the WD does not give effective results in all wind regimes.
The related literature shows that the Weibull parameters used to estimate the wind characteristics of a given location must be calculated precisely, and that several methods should be compared with each other rather than relying on a single method. Determining the optimal WD can provide more precise information about the wind characteristics of the region. In this context, eight statistical methods were studied to predict the two parameters of the Weibull distribution (k and c) precisely and accurately for the Osmaniye location, using wind data measured from a met-mast at 80 m. The Weibull probability distribution was investigated using GM, EMJ, EML, PDM, MoM, MLM, MMLM, and AMLM.

Wind data and site description
The wind speed data were obtained from the met-mast over 1 year. The time series was recorded at 10-min intervals throughout each 24-h day. The selected met-mast station and its geographic location are shown in Fig. 1. As can be seen in Fig. 1, the numerical methods for estimating the Weibull parameters were applied for Hasanbeyli in Turkey, which has high-altitude, forested, and mountainous terrain.
The observation time, the missing data period, the location, and other information about the measurement are given in Table 1. The wind speed and direction data set was measured by Thies first-class advanced sensors on the met-mast at a height of 80 m. The data were recorded with a CR800-series data logger. Technical information about the devices is also given in Table 1. Figure 2a shows the 10-min wind speed data recorded from the met-mast station throughout the year. The lowest and highest monthly mean wind speeds were measured as 5.36 m/s in October and 10.21 m/s in June, respectively. Besides, daily wind speed values show the same characteristics as monthly values. For this location, the average annual wind speed taken from the measurement station was calculated as 7 m/s.

Different Weibull distribution methods
The WD function is vital in determining a region's wind speed frequency distribution and predicting wind energy potential. The wind speed variation of the region can be characterized by two distribution functions, the probability density function (PDF) and the cumulative distribution function (CDF), which are given in Eqs. (1) and (2), respectively:
$$f(v) = \frac{k}{c}\left(\frac{v}{c}\right)^{k-1}\exp\left[-\left(\frac{v}{c}\right)^{k}\right] \qquad (1)$$
$$F(v) = 1 - \exp\left[-\left(\frac{v}{c}\right)^{k}\right] \qquad (2)$$
where v (m/s), k (dimensionless), and c (m/s) are the wind speed, shape factor, and scale factor, respectively, and all three are greater than zero. Many different numerical methods are used in the literature to estimate the Weibull parameters (Akdaǧ and Dinler 2009; Bilir et al. 2015a; Mohammadi et al. 2016; Natarajan et al. 2021). This study focuses on the eight most frequently used methods in order to characterize the wind in the area precisely, and these methods were compared with one another: GM, EMJ, EML, PDM, MoM, MLM, MMLM, and AMLM.
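As a small numerical illustration of the PDF and CDF referred to in Eqs. (1) and (2), the sketch below evaluates the standard two-parameter Weibull forms in Python. The function names and the demonstration values k = 2 and c = 8 m/s are hypothetical and are not taken from the study; returning zero for non-positive speeds is a simplification made here.

```python
import math


def weibull_pdf(v, k, c):
    """Two-parameter Weibull probability density f(v), shape k and scale c."""
    if v <= 0:
        return 0.0  # simplification: calm or invalid readings get zero density
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))


def weibull_cdf(v, k, c):
    """Two-parameter Weibull cumulative probability F(v) = P(V <= v)."""
    if v <= 0:
        return 0.0
    return 1.0 - math.exp(-((v / c) ** k))


# Hypothetical parameters, chosen only for illustration.
k_demo, c_demo = 2.0, 8.0
for speed in (2.0, 6.0, 10.0, 14.0):
    print(speed,
          round(weibull_pdf(speed, k_demo, c_demo), 4),
          round(weibull_cdf(speed, k_demo, c_demo), 4))
```

At v = c the CDF always equals 1 - 1/e (about 63%), which is a convenient sanity check on any fitted scale factor.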

Graphical method
GM, one of the most widely used methods in the literature, is derived by taking the logarithm of Eq. (2) twice (Seguro and Lambert 2000; Khalid Saeed et al. 2019).
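Taking the logarithm of the Weibull CDF twice gives the straight line ln[-ln(1 - F(v))] = k ln v - k ln c, so k is the slope and c = exp(-intercept/k). The following is a minimal sketch of that idea, assuming an empirical CDF built from equal-width bins and an unweighted least-squares fit; the binning scheme, the function name, and the synthetic test data are assumptions of this sketch, not details reported by the study.

```python
import math
import random


def fit_weibull_graphical(speeds, bin_width=1.0):
    """Estimate (k, c) from a line fit of ln(-ln(1 - F)) against ln(v)."""
    data = [v for v in speeds if v > 0]
    n = len(data)
    n_bins = int(max(data) // bin_width) + 1
    counts = [0] * n_bins
    for v in data:
        counts[min(int(v // bin_width), n_bins - 1)] += 1

    xs, ys = [], []
    cumulative = 0
    for i, count in enumerate(counts):
        cumulative += count
        # Skip empty bins and the final point, where F = 1 makes the log undefined.
        if count == 0 or cumulative == n:
            continue
        upper_edge = (i + 1) * bin_width
        xs.append(math.log(upper_edge))
        ys.append(math.log(-math.log(1.0 - cumulative / n)))

    # Ordinary least squares for y = slope * x + intercept.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(-intercept / slope)  # k = slope, c = exp(-intercept / k)


# Synthetic speeds from a known Weibull (scale 8 m/s, shape 2), for illustration only.
rng = random.Random(0)
sample = [rng.weibullvariate(8.0, 2.0) for _ in range(20000)]
k_gm, c_gm = fit_weibull_graphical(sample, bin_width=0.5)
print(f"GM estimate: k = {k_gm:.3f}, c = {c_gm:.3f} m/s")
```

Because every bin carries equal weight in the regression, sparsely populated bins at the extremes influence the line as much as well-populated ones, which may partly explain why graphical fits tend to be less accurate than likelihood-based fits.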

Empirical method of Justus
Based on the standard deviation method, this method was presented by Justus, and the parameters k and c can be calculated with Eq. (4) and Eq. (5) (Justus et al. 1978; Akdağ and Güler 2015). In Eqs. (4)-(5), v̄ and σ are the mean and standard deviation of the wind speed, respectively. The gamma function (Γ) can be obtained by Eq. (6).
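The equations themselves are not reproduced here, so the sketch below assumes the form of the Justus method commonly cited in the literature: k = (σ/v̄)^(-1.086) and c = v̄/Γ(1 + 1/k). The function name is hypothetical. The inputs are the annual mean and standard deviation reported in this study; the output is only illustrative and need not match the tabulated values, which were derived from the full data set.

```python
import math


def fit_weibull_justus(mean_speed, std_speed):
    """Empirical method of Justus (assumed standard form)."""
    k = (std_speed / mean_speed) ** -1.086      # shape from the coefficient of variation
    c = mean_speed / math.gamma(1.0 + 1.0 / k)  # scale from the mean and the gamma function
    return k, c


# Annual mean and standard deviation reported for the 80 m mast (m/s).
k_emj, c_emj = fit_weibull_justus(7.60, 3.50)
print(f"EMJ estimate: k = {k_emj:.3f}, c = {c_emj:.3f} m/s")
```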

Empirical method of Lysen
The EML (Mohammadi et al. 2016; Faghani et al. 2018), presented by Lysen, differs from the EMJ in the equation used to calculate the c value (Eq. (7)).
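A minimal sketch of the Lysen variant follows, assuming the commonly cited expression c = v̄ (0.568 + 0.433/k)^(-1/k) together with the same shape estimate as in the Justus method. The function name is hypothetical, and the inputs are the annual statistics reported in this study.

```python
import math


def fit_weibull_lysen(mean_speed, std_speed):
    """Empirical method of Lysen (assumed standard form)."""
    k = (std_speed / mean_speed) ** -1.086              # same shape estimate as EMJ
    c = mean_speed * (0.568 + 0.433 / k) ** (-1.0 / k)  # Lysen's closed-form scale factor
    return k, c


k_eml, c_eml = fit_weibull_lysen(7.60, 3.50)
print(f"EML estimate: k = {k_eml:.3f}, c = {c_eml:.3f} m/s")
```

The Lysen expression avoids evaluating the gamma function, and for typical wind-speed shape factors it gives a scale factor very close to the gamma-based one.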

Power density method
The power density method is essentially based on the energy pattern factor method but uses simpler and more accessible formulations for calculating the shape and scale factors. The first step in this method is to calculate the energy pattern factor E_pf by dividing the mean of the cubed wind speeds by the cube of the mean wind speed, as in Eq. (8). The k value can then be calculated from E_pf by Eq. (9). The c parameter is calculated as in Eq. (5) of the EMJ (Akdaǧ and Dinler 2009; Mohammadi et al. 2016).
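The steps just described can be sketched as follows. The relation k = 1 + 3.69/E_pf² is the form usually attributed to Akdağ and Dinler and is assumed here, since the equation is not reproduced in the text; the function name and the synthetic sample are hypothetical.

```python
import math
import random


def fit_weibull_power_density(speeds):
    """Power density method: k from the energy pattern factor, c from the mean."""
    n = len(speeds)
    mean_speed = sum(speeds) / n
    mean_cube = sum(v ** 3 for v in speeds) / n
    e_pf = mean_cube / mean_speed ** 3          # mean of cubes over cube of mean
    k = 1.0 + 3.69 / e_pf ** 2                  # assumed standard PDM relation
    c = mean_speed / math.gamma(1.0 + 1.0 / k)  # same scale formula as in EMJ
    return k, c, e_pf


# Synthetic speeds from a known Weibull (scale 8 m/s, shape 2), for illustration only.
rng = random.Random(2)
sample = [rng.weibullvariate(8.0, 2.0) for _ in range(20000)]
k_pdm, c_pdm, e_pf = fit_weibull_power_density(sample)
print(f"PDM estimate: E_pf = {e_pf:.3f}, k = {k_pdm:.3f}, c = {c_pdm:.3f} m/s")
```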

Method of moments
MoM uses v̄ and σ to calculate the parameters of the WD based on numerical iterations (Eqs. (10)-(11)) (Shoaib et al. 2017; Guarienti et al. 2020).
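One way to carry out the iteration is sketched below, assuming the usual moment relations v̄ = c Γ(1 + 1/k) and σ = c [Γ(1 + 2/k) - Γ²(1 + 1/k)]^(1/2). Dividing the two removes c, and the resulting equation in k is solved by bisection. The choice of bisection, the search bounds, and the function names are assumptions of this sketch; the study does not state which iteration scheme was used.

```python
import math


def coefficient_of_variation(k):
    """Ratio sigma / mean of a Weibull distribution with shape k."""
    g1 = math.gamma(1.0 + 1.0 / k)
    g2 = math.gamma(1.0 + 2.0 / k)
    return math.sqrt(g2 / g1 ** 2 - 1.0)


def fit_weibull_moments(mean_speed, std_speed, k_low=0.5, k_high=20.0, tol=1e-10):
    """Method of moments: match the sample mean and standard deviation."""
    target = std_speed / mean_speed
    # The Weibull coefficient of variation falls as k grows, so bisection applies.
    # If the target lies outside the bracket, the nearest bound is returned.
    while k_high - k_low > tol:
        k_mid = 0.5 * (k_low + k_high)
        if coefficient_of_variation(k_mid) > target:
            k_low = k_mid    # too much spread at k_mid: the root lies at larger k
        else:
            k_high = k_mid
    k = 0.5 * (k_low + k_high)
    c = mean_speed / math.gamma(1.0 + 1.0 / k)
    return k, c


# Annual mean and standard deviation reported for the 80 m mast (m/s).
k_mom, c_mom = fit_weibull_moments(7.60, 3.50)
print(f"MoM estimate: k = {k_mom:.3f}, c = {c_mom:.3f} m/s")
```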

Maximum likelihood method
MLM is applied to wind speed data in time series format and is solved using numerical iterations (Chang 2011). Using MLM, the k and c parameters are computed by Eqs. (12)-(13), respectively.
In these equations, N is the number of wind speed data points greater than zero and v_i is the wind speed at time step i (m/s).
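The iteration can be sketched as below, assuming the widely used likelihood equations k = [Σ v_i^k ln v_i / Σ v_i^k - (1/N) Σ ln v_i]^(-1) and c = [(1/N) Σ v_i^k]^(1/k). The starting value k = 2, the averaging of successive iterates (a damping step), and the function name are choices made for this sketch rather than details taken from the study.

```python
import math
import random


def fit_weibull_mlm(speeds, k_start=2.0, tol=1e-8, max_iter=200):
    """Maximum likelihood estimate of (k, c) by damped fixed-point iteration."""
    data = [v for v in speeds if v > 0]   # N counts only nonzero speeds
    n = len(data)
    logs = [math.log(v) for v in data]
    mean_log = sum(logs) / n

    k = k_start
    for _ in range(max_iter):
        powers = [v ** k for v in data]
        weighted_log = sum(p * lg for p, lg in zip(powers, logs)) / sum(powers)
        k_new = 1.0 / (weighted_log - mean_log)
        # Averaging old and new values is a damping choice made for this sketch.
        k_next = 0.5 * (k + k_new)
        converged = abs(k_next - k) < tol
        k = k_next
        if converged:
            break

    c = (sum(v ** k for v in data) / n) ** (1.0 / k)
    return k, c


# Synthetic speeds from a known Weibull (scale 8 m/s, shape 2), for illustration only.
rng = random.Random(1)
sample = [rng.weibullvariate(8.0, 2.0) for _ in range(20000)]
k_mlm, c_mlm = fit_weibull_mlm(sample)
print(f"MLM estimate: k = {k_mlm:.3f}, c = {c_mlm:.3f} m/s")
```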

Modified maximum likelihood method
MMLM is used when the wind speed data are available in the form of a frequency distribution. The k and c parameters are determined by Eqs. (14)-(15), respectively (Chang 2011; Shoaib et al. 2017; Tiam Kapen et al. 2020),
where f(v_i) is the Weibull frequency for the wind speed within bin i and f(v ≥ 0) is the probability that the wind speed is equal to or greater than zero.
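A binned version of the likelihood iteration can be sketched as follows, assuming the commonly cited form in which each sum over observations is replaced by a frequency-weighted sum over bin-centre speeds. Here the total of the supplied frequencies stands in for f(v ≥ 0); that substitution, the damping step, the function name, and the synthetic frequency table are assumptions of this sketch.

```python
import math


def fit_weibull_mmlm(bin_centers, frequencies, k_start=2.0, tol=1e-8, max_iter=200):
    """Modified maximum likelihood estimate of (k, c) from binned frequencies."""
    pairs = [(v, f) for v, f in zip(bin_centers, frequencies) if v > 0 and f > 0]
    total = sum(f for _, f in pairs)   # stands in for f(v >= 0) in this sketch
    mean_log = sum(f * math.log(v) for v, f in pairs) / total

    k = k_start
    for _ in range(max_iter):
        num = sum(f * v ** k * math.log(v) for v, f in pairs)
        den = sum(f * v ** k for v, f in pairs)
        k_new = 1.0 / (num / den - mean_log)
        k_next = 0.5 * (k + k_new)     # damping choice made for this sketch
        converged = abs(k_next - k) < tol
        k = k_next
        if converged:
            break

    c = (sum(f * v ** k for v, f in pairs) / total) ** (1.0 / k)
    return k, c


# Hypothetical frequency table built from an exact Weibull (k = 2, c = 8 m/s).
k_true, c_true, width = 2.0, 8.0, 0.5
edges = [i * width for i in range(81)]   # 0 to 40 m/s
cdf = [1.0 - math.exp(-((e / c_true) ** k_true)) for e in edges]
centers = [0.5 * (lo + hi) for lo, hi in zip(edges[:-1], edges[1:])]
freqs = [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]
k_mmlm, c_mmlm = fit_weibull_mmlm(centers, freqs)
print(f"MMLM estimate: k = {k_mmlm:.3f}, c = {c_mmlm:.3f} m/s")
```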

Power density
The wind power density P(v) for the measured data is calculated by Eq. (18) (Celik 2004; Arslan 2010),
where ρ is the standard air density (ρ = 1.225 kg/m³ for dry air at 1 atm and 15 °C). The power density of the WD (P_W) is given in Eq. (19) (Akpinar and Akpinar 2006; Akdaǧ and Dinler 2009; Arslan 2010). In the statistical error analysis, y_i, x_i, z_i, and n are the observed frequency, Weibull frequency, mean wind speed, and the number of data, respectively.
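The two power density calculations can be sketched as below, assuming the standard forms P = ½ ρ · mean(v³) for measured data and P_W = ½ ρ c³ Γ(1 + 3/k) for a fitted Weibull distribution. The function names and the demonstration parameters are hypothetical.

```python
import math

AIR_DENSITY = 1.225  # kg/m^3, standard dry air at 1 atm and 15 °C


def measured_power_density(speeds, rho=AIR_DENSITY):
    """Mean wind power density (W/m^2) computed directly from measured speeds."""
    return 0.5 * rho * sum(v ** 3 for v in speeds) / len(speeds)


def weibull_power_density(k, c, rho=AIR_DENSITY):
    """Wind power density (W/m^2) implied by Weibull parameters k and c."""
    return 0.5 * rho * c ** 3 * math.gamma(1.0 + 3.0 / k)


# Hypothetical parameters, chosen only for illustration.
print(round(weibull_power_density(2.0, 8.0), 2), "W/m^2")
```

Because power scales with the cube of wind speed, the mean of the cubes is always at least the cube of the mean, so using only the mean speed would underestimate the power density.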

Results and discussions
To obtain the monthly wind variability and wind characteristics of the region, standard deviation and mean speed values were calculated and are given in Fig. 3. According to the data obtained from the met-mast at 80 m, the annual mean wind speed and standard deviation are 7.60 m/s and 3.50 m/s, respectively. The highest wind speed values were 10.21 and 9.67 m/s in June and August, while the lowest were 5.36 and 6.31 m/s in October and April. The standard deviation and average wind speed results show that the behavior of the wind is consistent. These results were obtained from a sensor at a height of 80 m; if a large-scale wind turbine were placed there instead of a measurement sensor, the energy generated from the region could be expected to have a constant and regular output. Yearly and monthly WDs of wind speed calculated using the eight methods, histograms of the observed wind data, and the probability density function f(v) are given in Fig. 4. The figure also shows how well the values obtained from the WD probability density function match the wind speed histogram observed from the mast, which is used to identify the accuracy of the methods. The cumulative distribution function (CDF) obtained with each method on an annual basis is shown in Fig. 5. In the cumulative frequency curves, unlike the probability density curves, the methods other than AMLM are seen to be compatible.
Yearly k and c parameters of the WDs and the power density were obtained using the eight statistical methods, and these values are summarized in Table 2. The k and c parameters and the power density take the lowest values in the GM compared to the other methods. Among the remaining methods, the k and c parameters generally take the lowest values in the MLM. Table 3 and Fig. 6 present the monthly parameters of the WD according to the eight numerical methods. As can be seen from the table and figure, k and c vary in the ranges 1.5233-4.0301 and 5.3600-11.3552 m/s, respectively. In Fig. 6, the k and c parameters show the same trend in the first 5 months and change between the 5th and 10th months: they increase between the 5th and 8th months and decrease from the 8th to the 10th month. Between the 10th and 12th months, both parameters change in the same way as in the first 5 months. This table and figure suggest that the wind characteristics of the studied location are probably uniform.

Table 5 Statistical error analysis results according to all methods
Wind power density according to the actual values and the statistical methods is shown in Fig. 7. Power density is highest in the summer (June-August), followed by the months of November-March, and it is lower in the remaining months. The highest actual wind power density is 802.51 W/m² in June, whereas the lowest is 165.06 W/m² in October. For the MMLM method, which the performance analysis identified as the best method, the highest and lowest power densities are 798.84 W/m² in June and 164.14 W/m² in October, respectively. The annual average power densities for the actual data and the MMLM method are 444.85 W/m² and 450.47 W/m², respectively.
The power density generated by the wind can be classified according to its potential. Wind power classifications have been determined for different regions in different studies (Elliott and Schwartz 1993; Mohammadi and Mostafaeipour 2013; Bilir et al. 2015b). The classification by the US National Renewable Energy Laboratory (NREL) is for a hub height of 50 m (Elliott et al. 1986). However, wind power classes have also been derived by interpolation for turbines with hub heights of 80, 100, and 120 m (Phadke et al. 2011; Salvação et al. 2013). The wind power classification for 80 m is given in Table 4.
According to Table 4, wind classes 3 and above can be considered suitable for investment. Therefore, the region where our study was carried out, with a power potential of 444.85 W/m², is in class 3 and is potentially suitable.
The k and c parameters of the WD were calculated with different equations using the eight numerical methods. The performance of the methods was analyzed to confirm their effectiveness and reliability. For a more precise assessment, the methods were evaluated on a monthly basis, and the resulting statistical performance analysis is given in Table 5. When assessed on a monthly basis, the MLML method yielded the best results in February, March, April, June, July, and August. The MoM method obtained the best results in May, November, and December. The other methods gave reasonably good results in different months. According to the annual values calculated from the monthly results, the MMLM method recorded the smallest RMSE value of all methods, 0.0073. The same holds for the χ² statistic, for which MMLM has the minimum yearly value of 0.0254. The accuracy of each model's fit was also evaluated using the R² parameter; a value of R² approaching one indicates a good fit. Table 5 shows that the highest R² value, 0.9467, belongs to the MMLM method. These R², RMSE, and χ² values refer to the data measured at the height of 80 m.
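For readers who wish to reproduce this kind of comparison, the sketch below implements one common set of definitions for the three indicators, applied to observed bin frequencies y_i and Weibull-predicted frequencies x_i. Several variants of χ² and R² appear in the literature and the exact expressions used in this study are not reproduced here, so the forms below are assumptions; in particular, the reference value in R² is taken as the mean of the observed frequencies. The function names and example numbers are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted frequencies."""
    n = len(observed)
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(observed, predicted)) / n)


def chi_square(observed, predicted):
    """Chi-square statistic, normalizing each squared error by the prediction."""
    return sum((y - x) ** 2 / x for y, x in zip(observed, predicted) if x > 0)


def r_squared(observed, predicted):
    """Coefficient of determination relative to the mean observed frequency."""
    mean_obs = sum(observed) / len(observed)
    total = sum((y - mean_obs) ** 2 for y in observed)
    residual = sum((y - x) ** 2 for y, x in zip(observed, predicted))
    return (total - residual) / total


# Hypothetical frequencies, chosen only for illustration.
y_obs = [0.1, 0.2, 0.3, 0.4]
x_fit = [0.1, 0.2, 0.3, 0.3]
print(rmse(y_obs, x_fit), chi_square(y_obs, x_fit), r_squared(y_obs, x_fit))
```

Lower RMSE and χ² and an R² closer to one all point to a better fit, which is the ranking logic applied to the eight methods above.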
Based on the results obtained from the measurement mast, the frequencies of wind speed and wind direction are given in Table 6, and the same data are presented as a wind rose in Fig. 8. According to Table 6 and Fig. 8, the prevailing wind directions are 115-125° and 125-135°, with frequencies of 22.901% and 14.846%, respectively. The results showed that higher wind speeds, ranging from 5 to 15 m/s, tended to flow from the prevailing directions. In contrast, lower wind speeds between 0 and 5 m/s were found between 5° and 25° with an average frequency of 4.75%.

Fig. 8 Wind speed frequency distribution and wind direction in the wind rose diagram

Conclusion
In the present study, the wind characteristics of the Osmaniye region in Turkey were investigated using eight statistical methods. The two parameters of the WD were calculated with these methods using wind data measured from the met-mast at 80 m height. Hourly average wind speed values were calculated from the recorded 10-min average data, and the Weibull parameters k and c were determined with the eight methods from these hourly averages on a monthly and yearly basis. The performance of the methods was analyzed to confirm the effectiveness and reliability of the eight statistical methods. The PDFs according to the actual data and the statistical methods and the CDF on an annual basis were calculated. The prevailing wind direction according to wind speed frequency was determined. The accuracy of the methods was analyzed by calculating RMSE, R², and chi-square (χ²). Thus, the best method was determined by comparing all the numerical methods.
The findings obtained from these calculations are as follows: The Weibull parameters k and c vary in the ranges 1.5233-4.0301 and 5.3600-11.3552 m/s, respectively. The highest actual wind power density is 802.51 W/m² in June, whereas the lowest is 165.06 W/m² in October. The power density is high in the summer season (June-August), followed by November and March. The best results were obtained with the MMLM method. The highest R² and the smallest RMSE and χ² values were calculated as 0.9467, 0.0073, and 0.0254, respectively. The highest and lowest power densities of the MMLM method are 798.84 W/m² in June and 164.14 W/m² in October, respectively. The annual average power densities for the actual measurements and the MMLM method are 444.85 W/m² and 450.47 W/m², respectively. The region where our study was carried out is in class 3 and is potentially suitable for wind power production. The prevailing wind directions are 115-125° and 125-135°, with frequencies of 22.901% and 14.846%, respectively. The results showed that higher wind speeds, ranging from 5 to 15 m/s, tended to flow from the prevailing directions.
Author contribution CO contributed to the collection of measured data; BY and OK contributed to the first draft of the manuscript; IA and OK performed the analysis and prepared the tables and figures. All the authors read and approved the final manuscript.
Data availability Not applicable.

Declarations
Ethics approval and consent to participate Not applicable.

Competing interests
The authors declare no competing interests.