2-Tuple unbalanced linguistic multiple-criteria group decision-making using prospect theory data envelopment analysis

The linguistic terms used to describe qualitative data in decision problems are usually symmetrical about the central linguistic term. With increasing problem complexity and a large number of decision-makers (DMs) with different domain expertise involved in decision-making, the symmetric linguistic term set can be too restrictive. This paper examines multiple-criteria group decision-making problems in which the decision-makers use an unbalanced 2-tuple linguistic term set to describe the alternative-criteria assessment matrices. We adopt a data envelopment analysis (DEA) approach and develop a linear programming model to evaluate each decision-maker's weights on the alternative-criteria pairs. The non-rational factor of risk in the criteria is modeled using the value function from prospect theory. The prospect gain and prospect loss values on the cost and benefit criteria are computed and used to formulate a DEA model that assesses the weight of each criterion on each alternative. The cross-efficiency scores and their entropy values then provide a ranking of the alternatives. A numerical example illustrates the methodology. A comparative analysis of the proposed method with the fuzzy ordinal priority approach and fuzzy TOPSIS is performed on the illustrative example. Finally, an application to ranking faculty members on the end-semester feedback from their students in an academic institute is presented.


Introduction
Over the last few decades, multi-criteria group decision-making (MCGDM) problems have been widely studied and applied in a variety of areas, including supply chain management, medical diagnosis, alternative selection, technology evaluation, and investment decisions, to name a few. For instance, refer to Dursun and Karsak (2013), Ehrgott et al. (2004), Liu et al. The frameworks based on fuzzy sets or their generalized variants typically employ a membership function scale representation. All computing is performed on this scale using fuzzy computing rules or other mathematical operators defined in the context of the research. These methods are not apt when the uncertainty in a problem arises from vagueness in the meanings of the qualitative assessments by the DMs. Herrera and Martínez (2000) introduced the 2-tuple linguistic model to handle such vague information, together with an associated approach to computing with words (CWW). The language of CWW is simple to follow; it adds flexibility by allowing information to be expressed in natural language and gives meaning to the mathematical operations on such information. The 2-tuple linguistic model (Herrera and Martínez 2000) improves upon the basic linguistic model by including an additional parameter that brings the translation process into computing. In other words, the 2-tuple linguistic model possesses a continuous scale, resulting in no loss of information while computing with linguistic terms. Typically, the linguistic term set is balanced, meaning that the terms in the set are uniformly and symmetrically spread around the middle term.
We believe that every DM may have a different perspective on assessing each alternative on different criteria, and the DM can express it in natural language. Moreover, a DM may weigh favorable or positive viewpoints on objects differently from opposing ones. This paper aims to study MCGDM problems in which the DMs use non-uniform, asymmetric 2-tuple linguistic terms in the alternative-criteria valuation. If the information is unevenly distributed and possesses higher granularity on either side of the central term, handling it (Liu et al. 2004; Torra 2000) comes with its own challenges. Such a linguistic term set is called an unbalanced linguistic term set (Herrera et al. 2008b). The uneven spread could arise because one side of the central term has a higher cardinality of terms than the other, or because both sides have an equal number of terms but the spacing between them is unequal.
Recent years have witnessed significant contributions in developing procedures analogous to CWW for the unbalanced linguistic term set. Herrera and fellow researchers (Herrera et al. 2008a) utilized the notion of linguistic hierarchy to allocate semantics to each linguistic term via a parametric membership function. Wang and Hao (2007) unified the frameworks of Lawry (2001) and the 2-tuple linguistic model. Zou et al. (2012) proposed a linguistic aggregation operator elicited from the 2-tuple linguistic model and linguistic hierarchy. Dong and Cooper (2016) presented a new interpretation of the 2-tuple linguistic model by putting forward the notion of numerical scale. Zhang et al. (2020) studied large-scale group decision-making problems in which the decision-makers provide their multi-granular information using hesitant fuzzy linguistic term sets. The authors also proposed two algorithms: one transforms an unbalanced hesitant fuzzy linguistic term set into a balanced linguistic distribution, and the other transforms the balanced linguistic distribution into the unbalanced linguistic distribution assessment. More recently, Zhang et al. (2021b) put forward a simplified linguistic model for expressing multi-granular unbalanced linguistic terms and developed two optimization models that generate adjustment advice for DMs in reaching consensus in group decision-making. Yu et al. (2021) built a consensus reaching model for multi-criteria group decision-making involving multi-granular hesitant fuzzy linguistic term sets. Their model employs the trapezoidal fuzzy number representation of the multi-granular hesitant fuzzy linguistic term set. Malhotra and Gupta (2020a) proposed a novel idea to represent unbalanced linguistic information via a multiplicative linguistic label set. They developed a 2-tuple unbalanced linguistic term set by applying the concept of a minimum distance measure. Their multiplicative labeling model offers two advantages. First, it enables a formal definition of unbalanced linguistic information. Second, it is computationally simpler and less cumbersome than the linguistic hierarchy approach (Herrera and Herrera-Viedma 1997) and the numerical scale concept (Dong and Cooper 2016).
In the aggregation process, the weights of the decision-makers on each criterion and the weights of each criterion on each alternative are vital. The decision-makers' assessments may not carry equal weight; based on positional hierarchy, responsibility, professional expertise, and familiarity with the relevant problems, some experts may play more decisive roles than others in a group. As a result, determining the weights of the decision-makers in a group is critical. Furthermore, a decision-maker may not be an expert on every criterion and so may not assess an alternative accurately on each of them. If not always, the participating decision-makers often employ quantitative descriptors to communicate their assessments of alternatives on criteria. In many studies, the weights of the DMs are given in advance. In contrast, in certain other studies, the decision-makers' weights are determined by described procedures. Geng et al. (2017) employed the data envelopment analysis (DEA) approach to solve MCGDM problems, integrating 2-tuple linguistic DEMATEL to capture influence relationships among criteria. Wan et al. (2016) studied the MCGDM problem with intuitionistic fuzzy numbers and determined the DMs' weights using TOPSIS. Dong and Cooper (2016) used the Markov chain method to determine the DMs' weights. Ju (2014) applied the similarity degree in the 2-tuple linguistic evaluation, defined the 2-tuple linguistic positive, negative, and left negative ideal solutions, and proposed optimization models to determine the weights.
Against this brief backdrop, the foremost objective of the present study is to provide a ranking of the alternatives in an MCGDM problem when the decision-makers use the 2-tuple unbalanced linguistic term set to give the entries of the alternative-criteria matrices. We use a DEA-based linear programming model to evaluate each DM's weight on each alternative-criteria pair. The DEA model provides a more reliable optimization framework, yielding globally optimal weights and removing subjective judgments from the process. These weights are applied to find the aggregate alternative-criteria matrix. In addition, the criteria are put in two baskets, benefit and cost. Behavioral studies have shown that the pain of losing (cost) a value is greater in magnitude than the pleasure of gaining (benefit) the same value. We capture the impact of the criteria on the alternatives by integrating a DEA model based on the cumulative prospect theory (CPT) value function, which applies the prospect gain and loss values to the cost and benefit criteria of the aggregated matrix and computes the weight of each criterion on each alternative. These weights are used to compute the cross-efficiency scores and their entropy values, which finally rank the alternatives.
DEA is a nonparametric technique to measure the relative efficiency of homogeneous decision-making units (DMUs) that use multiple inputs to produce multiple outputs. In the context of MCGDM, the alternatives act as DMUs. Since the pioneering work of Farrell (1957) and Charnes et al. (1978), DEA has come a long way in research and applications in a variety of domains. Each DMU self-evaluates its best relative efficiency using its most favorable input and output weights from the DEA model. On the other hand, the cross-efficiency, proposed by Sexton et al. (1986) and elaborated by Doyle and Green (1994), assesses the efficiency of each DMU not only through self-evaluation but also through peer evaluation. The peer evaluation evaluates each DMU with the optimal weights of the other DMUs. The cross-efficiency score of a DMU is the average of its self-evaluation efficiency and its peer-evaluation efficiencies.
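The averaging of self- and peer-evaluation described above can be illustrated with a minimal Python sketch. It assumes the classical ratio form of efficiency (weighted outputs over weighted inputs) and takes the optimal weights of each DMU as given; the function names and data layout are hypothetical and are not taken from the cited works.

```python
def cross_efficiency_matrix(inputs, outputs, in_weights, out_weights):
    """E[d][i]: efficiency of DMU i evaluated with the weights chosen by DMU d.

    Assumption: classical ratio form, weighted outputs / weighted inputs.
    inputs[i][p], outputs[i][q] hold the data of DMU i;
    in_weights[d][p], out_weights[d][q] hold the optimal weights of DMU d.
    """
    m = len(inputs)
    matrix = []
    for d in range(m):
        row = []
        for i in range(m):
            weighted_out = sum(u * y for u, y in zip(out_weights[d], outputs[i]))
            weighted_in = sum(v * x for v, x in zip(in_weights[d], inputs[i]))
            row.append(weighted_out / weighted_in)
        matrix.append(row)
    return matrix


def average_cross_efficiency(matrix):
    """Average of the self-evaluation (d == i) and all peer evaluations of DMU i."""
    m = len(matrix)
    return [sum(matrix[d][i] for d in range(m)) / m for i in range(m)]
```

The diagonal entries of the matrix are the self-evaluations, and the off-diagonal entries in column i are the peer evaluations of DMU i.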
Soleimani-Damaneh and Zarepisheh (2009) applied Shannon (1948) entropy to aggregate different efficiency results. Wu et al. (2012) introduced the entropy value of the cross-efficiency score and developed a distance entropy function between the cross-efficiency score and the CCR efficiency score to generate relative weights for cross-efficiency aggregation. Xie et al. (2014) used Shannon's entropy to derive the degree of importance of each DMU and merged the calculated efficiency scores and the degrees of importance to help discriminate between traditional DEA models. Qi and Guo (2014) proposed a modified weight-restricted DEA model for deriving non-zero optimal weights and used Shannon's entropy to aggregate those weights into common weights. Wang et al. (2016) used the DEA entropy model to find the cross-efficiency intervals with imprecise inputs and outputs. The DMUs are ranked based on the distance to the ideal positive cross-efficiency.
The contributions made in this paper can be summarized as follows. Firstly, we provide a concise review of the key ideas of the unbalanced linguistic MCGDM problem and of CWW for it. We then evaluate the weights of each DM on each alternative-criteria pair using a non-parametric DEA linear programming model. Subsequently, we apply the CPT power function to accommodate the risk-averse perspective in computing the criteria weights. These weights finally lead to a comprehensive ranking of the alternatives.
We organize this paper as follows. Section 2 presents preliminary concepts, mainly the unbalanced linguistic term set and the prospect theory-based DEA model for computing the cross-efficiency of alternatives. Section 3 explains the DEA model for an MCGDM problem whose alternative-criteria matrices have entries from the unbalanced linguistic term set. Section 4 describes the step-wise procedure to solve the underlying MCGDM problem and rank the alternatives. Section 5 presents a numerical example to illustrate the proposed methodology. A comparative analysis of the proposed scheme with the fuzzy ordinal priority approach and fuzzy TOPSIS is also included. Section 6 presents a case study to show the applicability of the proposed model. Section 7 concludes the paper.

Preliminaries
In this section, we present two key ideas that we apply in developing our scheme for MCGDM problems. We keep the descriptions brief, covering only the points that are vital in the context of our study.

Unbalanced linguistic term set
The 2-tuple linguistic model adds a new parameter called symbolic translation to the basic fuzzy linguistic model. The model is represented by a pair $(l, \alpha)$, where $l \in L$ is a linguistic term from the linguistic term set $L = \{l_0, l_1, \ldots, l_g\}$ and $\alpha \in [-0.5, 0.5)$ is a numerical value for the symbolic translation. The 2-tuple linguistic model assists in explaining the process of CWW by imparting continuity to the linguistic domain. This model is also referred to as the continuous 2-tuple linguistic model. Martínez and Herrera (2012) and Malhotra and Gupta (2020b) presented comprehensive reviews of the 2-tuple linguistic model and its applications to decision-making problems.
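As a rough illustration of the symbolic translation, the following Python sketch implements the standard translation for a balanced term set $\{l_0, \ldots, l_g\}$ in the sense of Herrera and Martínez (2000): a value $\beta \in [0, g]$ is mapped to the closest term index and the remainder $\alpha$. The function names are hypothetical, and the sketch does not cover the unbalanced case of Definition 1.

```python
import math


def to_two_tuple(beta, g):
    """Map beta in [0, g] to (index, alpha) with alpha in [-0.5, 0.5).

    Assumption: balanced term set l_0, ..., l_g. Rounding half up keeps alpha
    in the half-open interval [-0.5, 0.5).
    """
    if not 0 <= beta <= g:
        raise ValueError("beta must lie in [0, g]")
    index = int(math.floor(beta + 0.5))
    alpha = beta - index
    return index, alpha


def from_two_tuple(index, alpha):
    """Inverse translation: recover the numerical value beta."""
    return index + alpha
```

For example, with seven terms (g = 6), the value 2.8 translates to the term $l_3$ with a symbolic translation of about -0.2, and the inverse translation recovers 2.8, so no information is lost.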

Definition 1 Let $L = \{t_{a^{-f}}, \ldots, t_{a^{0}}, \ldots, t_{a^{g}}\}$ be a multiplicative unbalanced linguistic term (MULT) set with cardinality $f + g + 1$, where $f, g$ are positive integers and $a$ is a real number, $a > 1$. Let $\beta \in [a^{-f}, a^{g}]$ be a numerical value representing the result of a symbolic aggregation. Then the unbalanced 2-tuple linguistic value that expresses the equivalent information to $\beta$ is explained by the function and the following procedure (Malhotra and Gupta 2020a) is used to compute $\sigma_L$, $\sigma_R$, $d_L$, and $d_R$.

Prospect theory based cross-efficiency evaluation
Prospect theory (Kahneman and Tversky 1979) rests on the following three guiding principles.
1. Reference dependence. A DM generally perceives outcomes as gains or losses relative to a reference point. The reference point divides the prospect value curve of a DM into the gain and loss domains.
2. Loss aversion. Decision-makers have different risk appetites toward gains and losses. In a decision process, a DM is typically more sensitive to losses than to gains of the same absolute size. Consequently, the prospect value curve is steeper in the loss domain than in the gain domain.
3. Diminishing sensitivity. A DM possesses a risk-averse tendency for gains and a risk-seeking tendency for losses. Consequently, the prospect value curve is concave in the gain domain and convex in the loss domain. The marginal values of both gains and losses decrease with an increase in their sizes.
One of the most common prospect value functions $\nu(\Delta z)$ representing the subjective attitude of a decision-maker is the power function, defined as follows:

$$\nu(\Delta z) = \begin{cases} (\Delta z)^{\alpha}, & \Delta z \ge 0, \\ -\theta\,(-\Delta z)^{\varphi}, & \Delta z < 0. \end{cases}$$

Here, $\Delta z = z - z_o$ is the deviation of $z$ from the reference point $z_o$. The function exhibits gains above the reference point and losses below it through its non-negative and negative values, respectively. The parameters α ∈ (0, 1) and φ ∈ (0, 1) denote the bump degree in the gain domain and the diminishing sensitivity in the loss domain of the power function, respectively. Greater values of α and φ indicate a greater risk appetite of the DM. The parameter θ > 0 is a loss-aversion coefficient; θ > 1 indicates that the decision-maker is more sensitive to losses than to gains. Liu et al. (2014) suggested taking α = φ = 0.85 and θ = 4.1. An experimental validation by Tversky and Kahneman (1992) suggested taking α = 0.89, φ = 0.92, and θ = 2.25. Tversky and Kahneman (1992) added a minor extension to prospect theory to present cumulative prospect theory (CPT), which applies cumulative rather than separable decision weights to the value function. CPT has attracted attention in many diverse decision-making areas.
Wu et al. (2012) proposed using the Shannon entropy of cross-efficiency for aggregating the efficiencies in the cross-efficiency matrix. The entropy model generates weights for aggregating and explaining the final cross-efficiency values of the DMUs. Song and Liu (2018) noticed that the weights produced by the approach of Wu et al. (2012) are inconsistent with the accepted views. They suggested an improvement in the method and presented a variation coefficient method based on the Shannon entropy for DEA cross-efficiency aggregation.
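The power value function can be sketched in a few lines of Python. This is a minimal illustration of the standard piecewise power form with the parameters described above; the function name and the default parameter values (those used later in the numerical example) are choices made here for illustration.

```python
def prospect_value(delta_z, alpha=0.89, phi=0.92, theta=2.25):
    """Power value function of prospect theory.

    delta_z is the deviation z - z_o from the reference point.
    Gains (delta_z >= 0) are valued as delta_z ** alpha;
    losses (delta_z < 0) are valued as -theta * (-delta_z) ** phi.
    """
    if delta_z >= 0:
        return delta_z ** alpha
    return -theta * (-delta_z) ** phi
```

With θ > 1, a loss is felt more strongly than a gain of the same size: for a unit deviation the function returns 1 for the gain and -2.25 for the loss under the default parameters.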

DEA model for aggregating alternative-criteria matrices
Without loss of generality, we assume that in each matrix $B^{(k)}$, the first $\ell$ columns correspond to the cost criteria (lower is better) and the remaining $n - \ell = s$ columns to the beneficial criteria (higher is better).
The distinction between the criteria makes DEA a natural choice for the MCGDM problem. The cost criteria data values are considered as input data, while the beneficial criteria data values are the output data in the DEA model. The $m$ alternatives act as the $m$ DMUs. The DEA model computes the weights $w^{(k)}_{ij}$ of the $k$th DM for the $i$th alternative on the $j$th criterion, $i = 1, \ldots, m$, $j = 1, \ldots, n$, $k = 1, \ldots, d$. We use the index $p$ to denote the cost criteria and the index $q$ for the beneficial criteria; $p = 1, \ldots, \ell$; $q = \ell + 1, \ldots, \ell + s$.
We use the output-oriented DEA model of Charnes et al. (1978), referred to as model (M1), for the efficiency evaluation of a DMU$_o$, with $p = 1, \ldots, \ell$, $q = \ell + 1, \ldots, \ell + s$, $i = 1, \ldots, m$, $j = 1, \ldots, n = \ell + s$, $k = 1, \ldots, d$. We normalize the optimal weight vector over the index $k$ and obtain the aggregated matrix $B^{agg}$ through relation (1).
Using the worst and the best reference points in DEA, a cross-efficiency model based on prospect theory has been proposed in the literature. The worst DMU uses the maximum inputs to yield the least outputs, while the best DMU does exactly the opposite by producing the maximum outputs using the least inputs. The gain and loss of any DMU are measured relative to values above the worst DMU and below the best DMU, respectively.
In the context of the MCGDM, the input and output data values of the $i$th DMU in the $B^{agg}$ matrix are, respectively, $b_{ip}$, $p = 1, \ldots, \ell$, and $b_{iq}$, $q = \ell + 1, \ldots, \ell + s$.

Definition 4 If the reference point is the worst DMU, then the prospect gain values with respect to the $p$th input and $q$th output of DMU$_i$, $i = 1, \ldots, m$, are defined through the value function applied to the deviations from, respectively, the $p$th input and $q$th output of the worst DMU.
Definition 5 If the reference point is the best DMU, then the prospect loss values with respect to the $p$th input and $q$th output of DMU$_i$, $i = 1, \ldots, m$, are defined through the value function applied to the deviations from, respectively, the $p$th input and $q$th output of the best DMU.
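One plausible reading of Definitions 4 and 5 is sketched below in Python. It is an assumption rather than the exact formulation used here: the worst DMU is taken column-wise as the maximum input and minimum output, the best DMU as the minimum input and maximum output, and the power value function is applied to the resulting non-negative deviations. All names are hypothetical.

```python
def prospect_gain_loss(inputs, outputs, alpha=0.89, phi=0.92, theta=2.25):
    """Prospect gains (vs. worst DMU) and losses (vs. best DMU) per data entry.

    Assumed forms (not confirmed by the source):
      gain on input   = (x_worst - x) ** alpha, with x_worst = column maximum
      gain on output  = (y - y_worst) ** alpha, with y_worst = column minimum
      loss on input   = -theta * (x - x_best) ** phi, with x_best = column minimum
      loss on output  = -theta * (y_best - y) ** phi, with y_best = column maximum
    inputs[i][p] and outputs[i][q] hold the data of DMU i.
    """
    n_in, n_out = len(inputs[0]), len(outputs[0])
    worst_in = [max(row[p] for row in inputs) for p in range(n_in)]
    best_in = [min(row[p] for row in inputs) for p in range(n_in)]
    worst_out = [min(row[q] for row in outputs) for q in range(n_out)]
    best_out = [max(row[q] for row in outputs) for q in range(n_out)]

    gain_in = [[(worst_in[p] - row[p]) ** alpha for p in range(n_in)] for row in inputs]
    gain_out = [[(row[q] - worst_out[q]) ** alpha for q in range(n_out)] for row in outputs]
    loss_in = [[-theta * (row[p] - best_in[p]) ** phi for p in range(n_in)] for row in inputs]
    loss_out = [[-theta * (best_out[q] - row[q]) ** phi for q in range(n_out)] for row in outputs]
    return gain_in, gain_out, loss_in, loss_out
```

Under these assumptions every gain is non-negative and every loss is non-positive, so the worst DMU has zero gain and the best DMU has zero loss on each criterion.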
A DM always prefers to allocate weights to inputs and outputs so as to maximize the gain and minimize the loss for DMU$_i$, $i = 1, \ldots, m$. To accomplish this task, the cross-efficiency model (CE) was proposed for DMU$_i$, in which $\eta \in [0, 1]$ is the trade-off parameter between the prospect gain and the prospect loss for the decision-maker.

Proposed methodology
Taking motivation from the cited works, in this section we put forward an algorithm to solve the MCGDM problem with entries coming from the unbalanced 2-tuple linguistic term set.

Algorithm 2 Ranking the alternatives
Step 1: Prepare a linguistic term set $LS$ for assessing the criteria. Convert the linguistic assessment of criteria into the corresponding 2-tuple linguistic term. If the linguistic information on criteria is partly supplied in the form of $t_\beta$ with $\beta \in [a^{-f}, a^{g}]$, then construct the 2-tuple linguistic assessment using the map in Definition 1.
Step 2: Collect the alternative-criteria matrices $B^{(k)} = [b^{(k)}_{ij}]_{m \times n}$, $k = 1, \ldots, d$, from the $d$ decision-makers with entries from the set $LS$.
Step 3: Convert the matrix $B^{(k)}$ into the corresponding unbalanced 2-tuple linguistic decision matrix.
Step 4: Solve the model (M1) described in Section 3, and determine the optimal weights $w^{(k)}_{ij}$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, with respect to the $i$th alternative and $j$th criterion for the $k$th DM, $k = 1, \ldots, d$.
Step 5: Compute the aggregated matrix $B^{agg} = [b_{ij}]_{m \times n}$ using relation (1) listed in Section 3.
Step 6: Compute the prospect gain and prospect loss values by choosing appropriate values of the parameters α, φ and θ in Definitions 4 and 5, respectively.
Step 9: Rank the alternatives in the ascending order of $E^{agg}_i$; that is, the alternative $A_i$ with the least value of $E^{agg}_i$ receives the highest rank.

Algorithm 3 Aggregated cross-efficiency scores
Step 1: Compute the cross-efficiency E ii of DMU i using the optimal weights u * iq and v * i p by (C E) model, described in Section 3, for DMU i .
Step 2: Compute Shannon's entropy of the cross-efficiency values.
Step 3: Compute the mean and standard deviation of the entropy.
Step 4: Compute the variation coefficients $\delta_i$ and normalize them to find the cross-efficiency weights $\xi_i$ of the alternatives $A_i$.
Step 5: Compute the aggregated cross-efficiency $E^{agg}_i$ of DMU$_i$.

Illustration
In this section, an example is presented to showcase the proposed MCGDM methodology. The example is set against the backdrop of medical device selection. Medical devices are expensive and incur a high maintenance cost. How to select the best medical device is a key problem for a hospital. Suppose a hospital H is ready to order a set of sensor devices to improve its technological level. Medical experts have shortlisted eight sensor devices $\{A_i, i = 1, \ldots, 8\}$. Seven criteria are identified for assessment: maintenance support difficulty ($C_1$), purchase cost ($C_2$), degree of environmental interference ($C_3$), stability ($C_4$), sensitivity ($C_5$), linear range ($C_6$), and the degree of intelligence ($C_7$). Among these, $C_1$, $C_2$, and $C_3$ are cost criteria and are thus considered as inputs, while $C_4$, $C_5$, $C_6$, and $C_7$ are beneficial criteria taken as outputs in the DEA model (M1). The assessment of the eight alternatives on the seven criteria is carried out by four decision-makers $\{D_k, k = 1, \ldots, 4\}$ using a linguistic term set. In its semantic representation, one can view medium as average, slightly poor as better than poor but below average, and slightly good as above average but not good. Table 1 exhibits the alternative-criteria assessment matrices of the four decision-makers. For instance, the four decision-makers have rated alternative $A_1$ on criterion $C_1$ by SG, SP, M, P, respectively, forming the entry in cell (1, 1) of Table 1. Thereafter, we apply Step 1 of Algorithm 2 to construct the 2-tuple linguistic assessment using the map in Definition 1. Applying Step 2 of Algorithm 2, Table 2 reports the optimal weights for the alternative-criteria pairs of the decision-makers obtained on solving model (M1) (defined in Sect. 3).
Table 1 Linguistic assessment of alternatives on the criteria provided by the four decision-makers (to read the linguistic abbreviations for each decision maker in order from left to right and then bottom left to right)
By Step 3 of Algorithm 2, the optimal weights are given in Table 2, and applying relation (1), we obtain the aggregated matrix $B^{agg}$. We take η = 0.5, α = 0.89, φ = 0.92 and θ = 2.25, and solve the (CE) model for each of the alternatives. The values of the parameters are taken from Kahneman and Tversky (1979), while in the absence of any information on the trade-off between loss and gain, we set η = 0.5. Table 3 exhibits the optimal weights from the (CE) model and Table 4 provides the cross-efficiency values $E_{ii'}$, $i, i' = 1, \ldots, 8$.
We next compute the entropy values of the cross-efficiencies and list them in Table 5. Table 6 presents the cross-efficiency weights $\xi_i$, $i = 1, \ldots, 8$, of the alternatives, the aggregated cross-efficiency scores, and the ranks of the alternatives.
Keeping the parameters α = 0.89, φ = 0.92 and θ = 2.25 at the same levels, we vary the trade-off parameter η ∈ (0, 1) to see the change it brings in the ranks of the alternatives. The results are summarized in Table 7.
Table 3 Optimal solution of the (CE) model, defined in Sect. 3, for each alternative when the parameters are α = 0.89, φ = 0.92, θ = 2.25, η = 0.5
The different values of the parameter η, capturing different perspectives toward gain and loss, yield different rankings of the alternatives that cannot be ignored. We apply Kendall's coefficient of concordance to test the statistical significance of the rank change for the pessimistic and optimistic situations. Kendall's coefficient of concordance is a measure of agreement among several quantitative variables that assess a finite set of objects of interest. The test statistic $W$ is given by

$$W = \frac{12 \sum_{i=1}^{m} (R_i - \bar{R})^2}{\tau^2 (m^3 - m)},$$

where $R_i$ is the sum of ranks for the $i$th object judged by the $\tau$ assessing variables, $\bar{R}$ is the mean of all $R_i$ values, and $\tau$ is the number of judgments on the $m$ objects. The random variable $\chi^2 = \tau (m - 1) W$ is asymptotically chi-square distributed with $m - 1$ degrees of freedom. We first consider the pessimistic case when η = 0.1, 0.2, 0.3, 0.4, 0.5, so that τ = 5 (judges) and m = 8 alternatives ($A_1, \ldots, A_8$). The null hypothesis and the alternative hypothesis are set as follows: $H_0^{pess}$: there is no concordance in the rankings of alternatives in the pessimistic case; $H_1^{pess}$: there is a concordance in the rankings of alternatives in the pessimistic case.
The test statistic is W = 0.9714, and χ² = 34 with p value 1.76 × 10⁻⁴. The null hypothesis is rejected. Along similar lines, in the optimistic case when η = 0.5, 0.6, 0.7, 0.8, 0.9, the test statistic is W = 0.958, and χ² = 33.53 with p value 7.8 × 10⁻⁵. The null hypothesis of no concordance in the rankings of alternatives in the optimistic case is also rejected. In both cases the W value is close to 1, indicating that the rankings are largely in agreement.
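The concordance computation can be reproduced with a short Python sketch. It assumes the standard form of Kendall's W without tie correction, with each ranking given as a list of ranks 1, ..., m over the same m objects; the function name is hypothetical, and the p value would additionally require a chi-square distribution routine, which is omitted here.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance W and the statistic tau * (m - 1) * W.

    rankings: list of tau rankings, each a list of m ranks (no ties assumed).
    """
    tau = len(rankings)
    m = len(rankings[0])
    # R_i: sum of the ranks given to object i by the tau judges.
    rank_sums = [sum(ranking[i] for ranking in rankings) for i in range(m)]
    mean_rank_sum = sum(rank_sums) / m
    squared_dev = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    w = 12 * squared_dev / (tau ** 2 * (m ** 3 - m))
    chi_square = tau * (m - 1) * w
    return w, chi_square
```

Identical rankings give W = 1 and fully reversed pairs of rankings give W = 0, which matches the interpretation of W as the degree of agreement among the judges.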

Comparative analysis
In addition, for the same example, we compare our method with the existing fuzzy ordinal priority approach (OPA-F) (Mahmoudi et al. 2021). The cost criteria $C_1$, $C_2$, $C_3$ are first converted into benefit criteria, and $B^{agg}$ is then represented linguistically using the linguistic semantics in $LS$. Mahmoudi et al. (2021) described the representation of each linguistic term as a triangular fuzzy number (TFN) using the fuzzy scale presented in Table 8. Each entry of $B^{agg}$ is converted into a TFN by applying this scale.
Following the steps of the OPA-F algorithm by Mahmoudi et al. (2021), explained toward the end of Sect. 3, we formulated the linear programming attributes model of Mahmoudi et al. (2021) to obtain the criteria weights. This model involves 38 constraints and 45 variables, and the obtained triangular fuzzy weights of the criteria are exhibited in Table 9.
We then solve the alternatives model of the OPA-F algorithm and compute the scores of the alternatives. The problem model, when written for the data in Example 1, involves 300 constraints and 339 variables. The triangular fuzzy weights of the alternatives and the ranking function of their total fuzzy scores are depicted in Table 10, where the total fuzzy score of the $i$th alternative is $\widetilde{TS} = (L, M, U)$, obtained from the triangular weights of the $j$th criterion and the $i$th alternative, and $R(\widetilde{TS}) = \frac{L + 4M + U}{6}$. The values $R(\widetilde{TS})$ yield the ranking of the alternatives. We observe that the top two ranks and the bottom two ranks of the alternatives by our proposed approach with η = 0.5 and by the OPA-F method are the same. However, it is in the middle ranks that one notices a difference. While we obtain $A_8$ mainly at the fourth rank, the OPA-F ranking places it at the sixth spot; $A_7$ gets a lower rank of fifth or sixth place in our method, but it gets the third rank by the OPA-F method.
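The ranking function $R = (L + 4M + U)/6$ applied to a triangular fuzzy score can be sketched as follows. The ordering convention (a larger $R$ value receives a better rank) and all names are assumptions made for this illustration.

```python
def tfn_rank_value(tfn):
    """Ranking value R = (L + 4M + U) / 6 of a triangular fuzzy number (L, M, U)."""
    low, mid, up = tfn
    return (low + 4 * mid + up) / 6


def rank_by_fuzzy_score(scores):
    """Order alternative labels by decreasing ranking value.

    scores: dict mapping an alternative label to its total fuzzy score (L, M, U).
    Assumption: a larger ranking value means a better (higher) rank.
    """
    return sorted(scores, key=lambda label: tfn_rank_value(scores[label]), reverse=True)
```

The function weights the modal value M four times as heavily as the two end points, so a wide but symmetric fuzzy score is ranked by its middle value.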
We then tested whether the two rankings differ statistically. We compared the ranking of our method when η = 0.5 with that of the OPA-F method. We set the null hypothesis $H_0$: there is no concordance in the rankings, against the alternative hypothesis $H_1$: there is a concordance in the rankings. Here, τ = 2, m = 8, the test statistic is W = 0.6486, χ² = 9, and the p value equals 0.2527, indicating that we fail to reject $H_0$ at any significance level below 25.27%. In other words, the statistical test agrees with our observation that the proposed and OPA-F methods rank the alternatives differently. The W value close to 0.65 also indicates that the two rankings agree in part but that this agreement is not very strong.
Although it is hard to single out one reason for the ranking difference, we point out the major ones. Unlike the OPA-F algorithm, where the usual linear program computes the criteria weights, our method uses a prospect theory-based value function to capture the difference in perspective on the cost and benefit criteria. We also use the cross-efficiency scores, which consider the efficiency of the other alternatives while calculating the self-efficiency of an alternative. Another limitation is that the OPA-F algorithm uses fuzzy equivalence scales to capture the semantics of the linguistic terms. This scale is subjective and cumbersome to use when the cardinality of the linguistic term set increases and its granularity becomes finer with complex interpretation.
In order to expand the scope, we include fuzzy TOPSIS (F-TOPSIS) in our comparative analysis. Table 12 depicts the triangular fuzzy weights of the criteria obtained by applying the linguistic semantics in $LS$.
Following the procedure for F-TOPSIS of Nȃdȃban et al. (2016) on the aggregated matrix $B^{agg}$, we obtain the ranks of the alternatives. Comparing the F-TOPSIS ranking with ours for η = 0.5, the two rankings appear to be quite different. For instance, $A_2$ is ranked seventh by our proposed approach while F-TOPSIS ranks it at the top. Also, $A_3$ gets the top rank by our approach but is ranked the lowest by F-TOPSIS. The difference in the rankings of $A_4$ and $A_6$ is also quite evident. We test our observation statistically by setting the null hypothesis $H_0$: there is no concordance in the rankings of alternatives by the two approaches (ours with η = 0.5 and F-TOPSIS); against the alternative hypothesis $H_1$: there is a concordance in the rankings of alternatives by the two approaches.
Here, τ = 2 and m = 8, and the test statistic is W = 0.238, with χ² = 3.33 and a very high p value of 0.853. We fail to reject the null hypothesis at any significance level below almost 85.3%. It is in line with what we have observed.
Table 13 Rank of alternatives by our method with η = 0.5, OPA-F method (Mahmoudi et al. 2021) and F-TOPSIS (Nȃdȃban et al. 2016) with criteria weights in Table 12
The W value is far from one and closer to zero, clearly signaling that the two rankings differ and that any agreement between them is fairly weak. The rankings by F-TOPSIS are sensitive to the choice of the pre-defined criteria weights. The same is summarized in Table 15. We have picked four weight vectors, depicted in Table 14, to solve the F-TOPSIS method and obtained the rankings of the alternatives. The first weight vector $W_1$ is derived using the column-wise minimum of the first component, the average of the middle component, and the maximum of the third component of the TFNs from the equivalent representation of the matrix $B^{agg}$ with triangular fuzzy entries, applying the scale given in Table 12. The weight vector $W_2$ is the one obtained from the OPA-F linear programming problem, while the other two weight vectors $W_3$ and $W_4$ are randomly chosen.
From Table 15, we observe that altering the weight vector causes change in rankings of alternatives by F-TOPSIS method.
In Table 16, we report the correlation of rankings between our method and F-TOPSIS, and between OPA-F and F-TOPSIS, for different choices of the pre-defined weight vectors on the seven criteria in F-TOPSIS. We observe both positive and negative correlation values.
The summary of our observations on comparative analysis for Example 1 is as follows: (i) One can obtain different rankings of alternatives by different MCGDM methodologies. (ii) Ranking of the alternatives by F-TOPSIS is sensitive to the choice of the fuzzy weight vectors of the criteria, which are assumed to be pre-specified. (iii) The correlation coefficient in different rankings between our method, OPA-F, and F-TOPSIS, could take both negative and positive values. (iv) Our method involves the four parameters α, φ, θ in the value function of CPT and the fourth parameter

Case study
In this section, we present a decision-making problem that requires evaluating the overall teaching performance of faculties in the courses they teach in a semester, based on the feedback received from students on the end-semester course questionnaire. A class of students is asked to fill in the end-semester questionnaire and provide feedback on five core courses, taught to them by five different faculties in the semester, on the following nine criteria. The first two criteria relate to the examination and evaluation policy in the course, while the remaining seven pertain to course organization, delivery, and learning.
C 1 The exams are lengthy and involve concepts not covered in the lectures
C 2 The evaluation is not fair and not effective to understand mistakes and improve on basics
C 3 Course plan and lectures met the course objectives
C 4 Effective in response and addressing queries
C 5 Addresses and answers to queries
C 6 Effective and clear communication and concepts explanation
C 7 Inspire to pursue the subject area further
C 8 Encourage independent and logical thinking
C 9 Punctuality in holding lectures and exams
Among these, C 1 and C 2 are cost criteria and are thus considered as inputs, while C 3 to C 9 are benefit criteria taken as outputs in the DEA model (M1). The five alternatives (faculty) are assessed on the nine criteria by the decision-makers (students) using the linguistic term sets L S 1 (for criteria C 1 and C 2 ) and L S 2 (for criteria C 3 to C 9 ), each with its own semantic representation.
More than 30 students provided feedback by filling in the questionnaire; after scrutinizing the responses, we discarded the incomplete forms in which students left questions unanswered. We are then left with 30 acceptable responses.
We set α = 0.89, φ = 0.92 and θ = 2.25 in the prospect value function and apply Algorithm 3, varying the tradeoff parameter η ∈ (0, 1) to see the change it brings in the ranks of the alternatives. The results are summarized in Table 19.
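To give a feel for the parameter values α = 0.89, φ = 0.92 and θ = 2.25, the following sketch evaluates a prospect value function at a few points. It assumes the standard Kahneman-Tversky piecewise power form (x^α for gains, −θ(−x)^φ for losses) with the reference point at zero; the exact form used in Algorithm 3 is defined elsewhere in the paper and may differ, and the function name `prospect_value` is hypothetical.

```python
def prospect_value(x, alpha=0.89, phi=0.92, theta=2.25):
    """Assumed piecewise power value function of prospect theory.

    x     : deviation from the reference point (gain if x >= 0, loss if x < 0)
    alpha : curvature in the gain domain (concave for 0 < alpha < 1)
    phi   : curvature in the loss domain (convex for 0 < phi < 1)
    theta : loss-aversion coefficient (losses loom larger when theta > 1)
    """
    if x >= 0:
        return x ** alpha
    return -theta * ((-x) ** phi)


# Illustrative deviations only; these are not values from the case study.
for deviation in (-0.3, -0.1, 0.0, 0.1, 0.3):
    print(f"x = {deviation:+.1f}  ->  v(x) = {prospect_value(deviation):+.4f}")
```

With θ > 1, a loss of a given size receives a larger absolute value than a gain of the same size, which is the loss-aversion effect the value function is meant to capture.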
We consider the pessimistic case, η = 0.1, 0.2, 0.3, 0.4, 0.5, so τ = 5, with m = 5 alternatives (A 1 , . . . , A 5 ), and test the null hypothesis against the alternative hypothesis: H 0 : there is no concordance in the rankings of alternatives in the pessimistic case; H 1 : there is concordance in the rankings of alternatives in the pessimistic case. The test statistics are W = 0.896 and χ² = 17.92, with a p value of 0.000195. The null hypothesis is rejected. Analogously, in the optimistic case, η = 0.5, 0.6, 0.7, 0.8, 0.9, the test statistics turn out to be W = 1.957 and χ² = 68.5, with a p value of 1.2 × 10⁻⁸. The null hypothesis is rejected.
In conclusion, one can say that, in general, faculty A 5 is top ranked, while faculties A 2 and A 3 receive lower ranks among the five faculties.

Comparative analysis
After converting the first two cost criteria into benefit ones, we applied the OPA-F procedure on the 3 × 9 aggregated matrix. Table 20 provides the fuzzy scores, and subsequently the alternatives are ranked as A 5 > A 2 > A 1 > A 3 > A 4 . We tested the hypothesis on concordance between the rankings by OPA-F and our approach with η = 0.5. We obtain W = 0.65 and χ² = 5.2, with a p value of 0.2674. With a p value of almost 26.74%, we fail to reject the null hypothesis of 'no concordance' between the rankings by the two approaches at any conventional significance level.
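The concordance test used here can be sketched as follows, assuming it is the usual Kendall's W test for untied rankings with χ² = τ(m − 1)W on m − 1 degrees of freedom. The OPA-F rank vector is read off the ranking A 5 > A 2 > A 1 > A 3 > A 4 ; the second rank vector is a hypothetical one chosen for illustration and is not taken from Table 19. The function names are also assumptions. To stay dependency-free, the p value uses the closed-form chi-square tail that holds only for an even number of degrees of freedom (here m − 1 = 4).

```python
import math


def kendall_w(rankings):
    """Kendall's coefficient of concordance for tau untied rankings of m items."""
    tau = len(rankings)
    m = len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(m)]
    mean_rank_sum = tau * (m + 1) / 2
    s = sum((rs - mean_rank_sum) ** 2 for rs in rank_sums)
    return 12 * s / (tau ** 2 * (m ** 3 - m))


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


def concordance_test(rankings):
    """Return (W, chi-square statistic, p value) for the given rankings."""
    tau, m = len(rankings), len(rankings[0])
    w = kendall_w(rankings)
    chi2 = tau * (m - 1) * w
    return w, chi2, chi2_upper_tail_even_df(chi2, m - 1)


# Ranks of A1..A5. OPA-F: A5 > A2 > A1 > A3 > A4 (from the text).
opa_f_ranks = [3, 2, 4, 5, 1]
# Hypothetical second ranking (A5 > A4 > A1 > A2 > A3), for illustration only.
hypothetical_ranks = [3, 4, 5, 2, 1]

w, chi2, p_value = concordance_test([opa_f_ranks, hypothetical_ranks])
print(f"W = {w:.3f}, chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

For these two rank vectors the sketch gives W = 0.65, χ² = 5.2 and p ≈ 0.2674. For an odd number of degrees of freedom, a library routine for the chi-square survival function would be needed instead of the closed form.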
We next examine the ranking with F-TOPSIS on B agg . Table 21 lists the criteria weights.
The ranking is A 4 > A 5 > A 1 > A 3 > A 2 . The test statistics are W = 0.9 and χ² = 7.2, with a p value of 0.1257. We fail to reject the null hypothesis of 'no concordance' between the rankings by F-TOPSIS and our approach with η = 0.5.
The example and the case study indicate that the rankings of alternatives by our approach, OPA-F, and F-TOPSIS are not in concordance. More experiments are needed to conduct a thorough comparative analysis on decision-making problems involving many decision makers who evaluate a large number of alternatives on varying numbers of cost and benefit criteria. All these parameters can be varied, and we can simulate assessment matrices for the decision makers from the linguistic term set. A large-scale experimental investigation can reveal the benefits and drawbacks of our suggested technique compared to OPA-F and F-TOPSIS.

Conclusions
The main contributions of our paper are fourfold. Firstly, we have used the 2-tuple unbalanced linguistic term set, which provides a broader and more realistic domain set for DMs in supplying the alternative-criteria judgment matrices. Unlike several other studies on MCGDM problems that demand precise numerical quantification, the linguistic terms embrace decision-makers' multi-granularity and cognitive process in comprehending the available alternatives on different criteria.
Secondly, we present the DEA optimization model to compute the weights of each DM on each alternative-criteria pair in the group decision-making. The DEA model incorporates complete information from the input criteria (or cost criteria) and output criteria (or beneficial criteria) from the decision matrices to designate weights that enable the computation of one aggregated decision matrix.
Thirdly, we determine the positive ideal (the best) and negative ideal (the worst) alternatives and apply the prospect theory to determine the alternatives' gain and loss values concerning the two ideal options. The value function allows for segregating the gain and loss domains based on the risk perspective, where guaranteed small gains are preferred over uncertain large gains and large uncertain losses get preferred over certain small losses.
Finally, we put forward a cross-efficiency DEA optimization model and aggregation of cross-efficiency scores to rank the alternatives.
Thus, allowing an unbalanced 2-tuple linguistic term set in decision-making, using DEA to compute the weights of DMs on each alternative-criteria pair, and using the prospect theory value function to determine criteria weights in the different loss-gain domains for computing the cross-efficiency of alternatives together offer a comprehensive package to rank the alternatives. We present a numerical example to illustrate the proposed methodology. The rankings obtained on varying the parameter in the cross-efficiency model are statistically tested for concordance or agreement. The null hypothesis of no concordance is rejected at a very high confidence level, thereby showing the efficacy and robustness of the proposed methodology in ranking the alternatives in both the pessimistic and optimistic situations.
For future work, a promising research avenue could be to use personalized individual semantics to represent the distinct understanding of each linguistic term by individual decision-makers. We may explore extending the proposed method to deal with more complex linguistic expressions in decision-making by applying multi-granular hesitant fuzzy linguistic term sets and incomplete criteria weight information. Some work in this direction is already reported (see Ren et al. 2018; Zhang et al. 2021a), yet more needs to be done, specifically to handle large-scale MCGDM problems that often arise in data science and explainable artificial intelligence. We can also delve into applications of other utility functions of cumulative prospect theory in MCGDM. In addition, we can employ different cross-efficiency DEA models, including satisfaction and consensus information, for ranking alternatives.