Coalitional manipulation of voting rules: simulations on empirical data

Using computer simulations based on empirical data, we show that seven voting rules that we call the IRV family (Instant-runoff voting, exhaustive ballot, Condorcet-IRV, Benham, Smith-IRV, Tideman and Woodall) are less sensitive to coalitional manipulation than a large selection of prominent voting rules. While the relative performances of these seven rules still deserve further investigation, we show that the differences are at most marginal.


Introduction
The Gibbard-Satterthwaite theorem (Gibbard, 1973; Satterthwaite, 1975) implies that essentially any voting rule is coalitionally manipulable (CM), i.e. sensitive to strategic voting by a coalition of voters (except dictatorship, and provided that at least three distinct candidates can be elected). However, not all voting rules need be equal in this respect: they may differ by the frequency of the situations where they are CM, the complexity of computing the strategic ballots, the number of candidates who can benefit from the manipulation, the consequences of strategic voting on the quality of the outcome, and the balance of power between naive, sincere voters and sophisticated, strategic ones. In order to investigate all these quantitative aspects, we run computer simulations with the Python package SVVAMP (Durand et al., 2016a) on the basis of two datasets: the FairVote dataset, gathering 162 American political elections, and the Netflix dataset, which enabled us to generate 2243 preference profiles of users about movies. The rest of the paper is organized as follows: Section 2 gives our general definitions and notations; Section 3 defines the voting rules of our study; Section 4 introduces the two datasets; Section 5 gives an overview of SVVAMP and the algorithms that we use; Section 6 presents our results; Section 7 concludes.

Definitions and notations
A profile is defined by:
• Two non-empty finite sets $\mathcal{V}$ and $\mathcal{C}$, whose elements are respectively called voters and candidates,
• For each voter $v$, a utility function $u_v$ that assigns a value $u_{vc}$ to each candidate $c$.
We denote $V = \operatorname{card}(\mathcal{V})$ and $C = \operatorname{card}(\mathcal{C})$. In the following, $v$ denotes a generic voter; $c$ and $d$, generic candidates. We always assume that for any voter, all her utility values are distinct. We denote by $r_v$ her preference ranking, defined by: $r_{vc} = 1 + \operatorname{card}\{d \in \mathcal{C} \text{ s.t. } u_{vd} > u_{vc}\}$. For example, $r_{vc} = 1$ if $c$ is her most liked candidate.
$W$ denotes the weighted majority matrix of the profile, defined by $W_{cd} = \operatorname{card}\{v \in \mathcal{V} \text{ s.t. } u_{vc} > u_{vd}\}$. The graph naturally associated to $W$ is called the weighted majority graph. $M$ denotes the majority matrix of the profile, defined by $M_{cd} = \mathbb{1}_{W_{cd} > W_{dc}}$ (where $\mathbb{1}$ denotes the indicator function). A candidate $c$ is a Condorcet winner (CW) if, for any other candidate $d$, $M_{cd} = 1$. The Smith set is the smallest non-empty set of candidates $\mathcal{S}$ such that, for any $c \in \mathcal{S}$ and $d \notin \mathcal{S}$, $M_{cd} = 1$. A profile has a Condorcet Order (CO) if the binary relation represented by $M$ is a strict total order. A candidate $c$ is a majority favorite (MF) if $\operatorname{card}\{v \in \mathcal{V} \text{ s.t. } r_{vc} = 1\} > \frac{V}{2}$.
A voting rule is defined, for any $\mathcal{V}$ and $\mathcal{C}$, by:
• A set of strategies $S$ (in this paper, it is the same for all voters),
• A counting function $f$ that maps a tuple of strategies $(\sigma_v)_{v \in \mathcal{V}} \in S^{\mathcal{V}}$ to a winning candidate $w \in \mathcal{C}$,
• A sincerity function $s$ that maps the utility function $u_v$ of a voter $v$ to a strategy $\sigma_v \in S$.
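As an illustration of the matrix definitions above, here is a minimal Python sketch that computes the weighted majority matrix, the Condorcet winner and the majority favorite of a small profile. The function names and the toy profile are hypothetical (they are not part of SVVAMP); utilities are stored as one row per voter and one column per candidate, and are assumed distinct for each voter.

```python
def weighted_majority_matrix(utilities):
    """W[c][d] = number of voters who strictly prefer c to d."""
    n_cand = len(utilities[0])
    return [[sum(1 for u in utilities if u[c] > u[d]) for d in range(n_cand)]
            for c in range(n_cand)]


def condorcet_winner(utilities):
    """Return the candidate c such that M[c][d] = 1 for every other d, or None."""
    w = weighted_majority_matrix(utilities)
    n_cand = len(w)
    for c in range(n_cand):
        if all(w[c][d] > w[d][c] for d in range(n_cand) if d != c):
            return c
    return None


def majority_favorite(utilities):
    """Return the candidate ranked first by more than V/2 voters, or None."""
    n_voters, n_cand = len(utilities), len(utilities[0])
    for c in range(n_cand):
        tops = sum(1 for u in utilities if u[c] == max(u))
        if 2 * tops > n_voters:
            return c
    return None


# Hypothetical toy profile: 3 voters (rows), 3 candidates (columns).
toy_utilities = [[3, 2, 1], [1, 3, 2], [2, 3, 1]]
```

On this toy profile, candidate 1 beats both other candidates in pairwise comparisons and is ranked first by two voters out of three, so it is both the Condorcet winner and a majority favorite.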
Together, the utility function of the profile, the set of strategies and the counting function of the voting rule define a game in the usual sense of game theory. All the voting rules in this paper consist of a succession of rounds (often only one) where all voters play simultaneously. The action of a voter at a given round is called a ballot. In particular, for rules in one round, the strategy of a voter is simply called her ballot.
In the following, we will always denote $w = f\big((s(u_v))_{v \in \mathcal{V}}\big)$ and call her (by a slight abuse of language) the sincere winner. We say that a voting rule is Condorcet-consistent if, for any profile with a Condorcet winner $c$, it holds that $w = c$.
For any $c \neq w$, we denote $\mathcal{V}_{NM}(c) = \{v \in \mathcal{V} \text{ s.t. } u_{vw} > u_{vc}\}$ and $\mathcal{V}_{M}(c) = \{v \in \mathcal{V} \text{ s.t. } u_{vc} > u_{vw}\}$, whose respective cardinalities are denoted $V_{NM}(c)$ and $V_{M}(c)$. We say that a voting rule is coalitionally manipulable (CM) in a given profile if there exist $c \neq w$ and strategies $(\sigma_v)_{v \in \mathcal{V}_M(c)} \in S^{\mathcal{V}_M(c)}$ such that $f\big((s(u_v))_{v \in \mathcal{V}_{NM}(c)}, (\sigma_v)_{v \in \mathcal{V}_M(c)}\big) = c$. In that case, we say that $c$ is a CM winner. We say that a voting rule is unison manipulable (UM) in a given profile if there exist $c \neq w$ and a strategy $\sigma \in S$ such that $f\big((s(u_v))_{v \in \mathcal{V}_{NM}(c)}, (\sigma)_{v \in \mathcal{V}_M(c)}\big) = c$ (note that all the manipulators use the same strategy).
In a given profile, we say that a candidate $c$ is a resistant Condorcet winner (RCW) if, for any pair of candidates $(d, e)$ that are different from $c$: $\operatorname{card}\{v \in \mathcal{V} \text{ s.t. } u_{vc} > u_{vd} \text{ and } u_{vc} > u_{ve}\} > \frac{V}{2}$. This is equivalent to: in this profile, any Condorcet-consistent voting rule elects $c$ and is not CM (Durand et al., 2016b). This property is stronger than CW and weaker than MF, in the sense that: MF ⇒ RCW ⇒ CW.

Voting rules under study
The voting rules described in the literature are often irresolute: in a (generally limited) number of cases, they can output several candidates. In order to always select a single winner, we use the same tie-breaking principle for all of them: the candidates of a profile are equipped a priori with distinct integer indices and, in case of tie, candidates with lower indices are favored. Hence when describing a voting rule, we may write "the candidate with highest (resp. lowest) score is elected (resp. eliminated)" as a shorthand meaning "among the candidates with the highest (resp. lowest) score, the candidate of lowest (resp. highest) index is elected (resp. eliminated)".
The classification of the voting rules in the following sections is made for exposition purposes only; many rules could be included with reason in several categories. For each rule, we also define an abbreviated name that we use in the figures. Sections 3.1 to 3.4 present ordinal rules, where sincere voting depends only on the voter's preference ranking; unless otherwise stated, the set of strategies is the set of rankings over the candidates, and sincere voting consists in giving one's true preference ranking. Section 3.5 presents cardinal rules (i.e. non-ordinal).


Score-based voting rules
Each candidate $c$ is assigned a numerical score denoted $\operatorname{score}(c)$. Elect the candidate with the highest score.

Bucklin rule (Buc)
$\operatorname{score}(c) = (-m_c, x_c)$, where $m_c = \operatorname{median}_{v \in \mathcal{V}}\, r_{vc}$ and $x_c = \operatorname{card}\{v \in \mathcal{V} \text{ s.t. } r_{vc} \leq m_c\}$. Scores are compared using the lexicographic order.
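The Bucklin score can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the toy ranks are hypothetical, and when the number of voters is even we take the lower median (`statistics.median_low`), a convention that the text does not specify.

```python
from statistics import median_low


def bucklin_winner(ranks):
    """ranks[v][c] is the rank of candidate c for voter v (1 = most liked)."""
    n_cand = len(ranks[0])

    def score(c):
        column = [r[c] for r in ranks]
        m_c = median_low(column)  # assumption: lower median when V is even
        x_c = sum(1 for r in column if r <= m_c)
        return (-m_c, x_c)  # tuples are compared lexicographically

    # max() returns the first maximum, so ties favor the lowest index.
    return max(range(n_cand), key=score)


# Hypothetical toy profile: 5 voters, 3 candidates.
toy_ranks = [[1, 2, 3], [2, 1, 3], [3, 1, 2], [1, 3, 2], [2, 3, 1]]
```

Here all three candidates have a median rank of 2, and candidate 0 wins because four voters rank it second or better, against three for each of the others.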

Elimination rules
In the six following rules, one or several candidates are eliminated, and the process is iterated until only one candidate remains, who is then declared the winner. When a score is used (plurality score, Borda score, etc.), it is always computed on the profile restricted to the non-eliminated candidates.

Instant-runoff voting (IRV)
Eliminate the candidate with the lowest plurality score.

Baldwin rule (Bal)
Eliminate the candidate with the lowest Borda score.

Nanson rule (Nan)
Eliminate all candidates whose Borda score is below the average.

Coombs rule (Coo)
Eliminate the candidate with the lowest veto score.

Kim-Roush rule (KR)
Eliminate all candidates whose veto score is below the average (Kim & Roush, 1996).

Viennot rule (Vie)
Let $(c, d)$ be the two candidates with the lowest plurality scores. If $W_{cd} > W_{dc}$, then eliminate $d$, and vice versa (Durand, 2015).
In the two following rules, the election proceeds in several rounds.

Exhaustive ballot (EB)
At each round, each voter casts a ballot for one candidate. Sincere voting consists in voting for one's preferred candidate among the non-eliminated ones. The candidate with the lowest score is eliminated. Note that if voters are sincere, the winner is the same as in IRV.

Two-round system (TR)
This is similar to exhaustive ballot, but after the first round, only the two candidates with the highest plurality scores are selected for the second and last round. Note that for $C = 3$, this rule is equivalent to exhaustive ballot.
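To make the elimination process concrete, the following Python sketch implements IRV with the tie-breaking principle stated at the beginning of this section (among the candidates with the lowest score, the one of highest index is eliminated). The function name and the toy profile are hypothetical; candidates are assumed to be integer indices, and each ranking lists them from most to least liked.

```python
def irv_winner(rankings):
    """Sincere IRV winner; rankings[v] lists candidates, most liked first."""
    remaining = set(rankings[0])
    while len(remaining) > 1:
        # Plurality scores on the profile restricted to non-eliminated candidates.
        plurality = {c: 0 for c in remaining}
        for ranking in rankings:
            top = next(c for c in ranking if c in remaining)
            plurality[top] += 1
        # Lowest score is eliminated; among ties, the highest index goes first.
        loser = min(remaining, key=lambda c: (plurality[c], -c))
        remaining.remove(loser)
    return remaining.pop()


# Hypothetical toy profile: 9 voters, 3 candidates.
toy_rankings = [[0, 1, 2]] * 4 + [[1, 2, 0]] * 3 + [[2, 1, 0]] * 2
```

In this toy profile, candidate 2 is eliminated first, its two ballots transfer to candidate 1, and candidate 1 then beats candidate 0 by five votes to four.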

Condorcet-consistent variants of IRV
Together with IRV and exhaustive ballot, we call the five following rules the IRV family.

Condorcet-IRV (CI)
If a Condorcet winner exists, elect her. Otherwise, elect the IRV winner.

Benham rule (Ben)
As long as the profile has no Condorcet winner, eliminate the candidate with the lowest plurality score. Then elect the Condorcet winner of the restricted profile.

Tideman rule (Tid)
Alternately, eliminate all the candidates outside the Smith set (if any), and the candidate with the lowest plurality score. When only one candidate remains, she is declared the winner.

Smith-IRV (SI)
Eliminate the candidates outside the Smith set, then run IRV on the restricted profile.

Woodall rule (Woo)
Among the candidates of the Smith set, elect the one that is eliminated latest in IRV.
Condorcet-IRV is defined by Green-Armytage et al. (2014) and Durand et al. (2016b); the four other rules above are described by Green-Armytage (2011).
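As an example of how these variants combine a Condorcet check with plurality eliminations, here is a minimal Python sketch of the Benham rule. The function names and the toy profile are hypothetical; candidates are integer indices, rankings are strict, and ties in plurality scores are broken against the highest index, as in the rest of the paper.

```python
def condorcet_winner(rankings, remaining):
    """Candidate of `remaining` beating every other one by a strict majority."""
    for c in remaining:
        if all(
            2 * sum(r.index(c) < r.index(d) for r in rankings) > len(rankings)
            for d in remaining if d != c
        ):
            return c
    return None


def benham_winner(rankings):
    """Eliminate plurality losers until the restricted profile has a Condorcet winner."""
    remaining = set(rankings[0])
    while True:
        cw = condorcet_winner(rankings, remaining)
        if cw is not None:  # always true once a single candidate remains
            return cw
        plurality = {c: 0 for c in remaining}
        for r in rankings:
            plurality[next(c for c in r if c in remaining)] += 1
        remaining.remove(min(remaining, key=lambda c: (plurality[c], -c)))


# Hypothetical toy profile with a Condorcet cycle (0 beats 1, 1 beats 2, 2 beats 0).
toy_cycle = [[0, 1, 2]] * 4 + [[1, 2, 0]] * 3 + [[2, 0, 1]] * 2
```

In the cyclic toy profile, candidate 2 has the lowest plurality score and is eliminated; candidate 0 is then the Condorcet winner of the restricted profile and is elected.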


Black rule (Bla)
If a Condorcet winner exists, elect her. Otherwise, elect the Borda winner (Black, 1958).

Ranked Pairs (RP)
Construct a graph whose vertices are the candidates. One by one, add the same edges as in the weighted majority graph by order of decreasing weight, except when the newly added edge would create a cycle. Finally, elect the candidate at the maximal vertex of the graph (Tideman, 1987).
We now denote by $S_{cd}$ the width of the widest path from $c$ to $d$ in the weighted majority graph.

Schulze rule (Sch)
Elect the candidate $w$ such that $\forall c \neq w$, $S_{wc} \geq S_{cw}$ (Schulze, 2011).
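The widest-path widths used by the Schulze rule can be computed with a variant of the Floyd-Warshall algorithm. The sketch below is illustrative only: the function name and the toy matrix are hypothetical, and we assume that every ordered pair of distinct candidates is an edge of the weighted majority graph, weighted by the corresponding entry of the weighted majority matrix (other edge conventions exist).

```python
def schulze_winner(w):
    """w[c][d] = number of voters preferring c to d; returns the Schulze winner."""
    n = len(w)
    # s[c][d] will hold the width of the widest path from c to d.
    s = [[w[c][d] if c != d else 0 for d in range(n)] for c in range(n)]
    for k in range(n):
        for c in range(n):
            for d in range(n):
                if c != d and c != k and d != k:
                    # A path through k is as wide as its narrowest half.
                    s[c][d] = max(s[c][d], min(s[c][k], s[k][d]))
    for c in range(n):  # lowest index first, consistent with the tie-breaking rule
        if all(s[c][d] >= s[d][c] for d in range(n) if d != c):
            return c
    return None


# Hypothetical weighted majority matrix of a 9-voter Condorcet cycle.
toy_w = [[0, 6, 4], [3, 0, 7], [5, 2, 0]]
```

In this toy example, the widest path from candidate 0 to each other candidate has width 6, while the widest paths back to candidate 0 have width 5, so candidate 0 is elected.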

Approval voting (AV)
Each voter votes for any number of candidates. Elect the candidate with most votes.

Range voting (RV)
Each voter assigns a numerical grade to each candidate, in a set of authorized grades. Elect the candidate with the highest total grade.

Scoring then automatic runoff (Star)
Ballots are the same as in range voting. Let $c$ and $d$ be the two candidates with the highest total grades. If $c$ is rated higher than $d$ by more voters than the opposite, then elect $c$, and vice versa.

Majority Judgment (MJ)
Each voter $v$ assigns a mention $m_{vc}$ to each candidate $c$, in an ordered set of authorized mentions. Denote $m_c = \operatorname{median}_{v \in \mathcal{V}}\, m_{vc}$, […]. Scores are compared using the lexicographic order (Balinski & Laraki, 2010).
We will specify the sincerity function that we consider for these rules (and the set of authorized grades or mentions for range voting, Star and majority judgment) when we present our datasets in Sect. 4.

Datasets
In this paper, we study two datasets that we call the FairVote dataset and the Netflix dataset.
The FairVote organization (www.fairvote.org) has collected the ballots of 172 single-winner elections using IRV in the US: member of city council, member of board of supervisors, mayor, sheriff, district attorney, school director, assessor treasurer, etc. Generally, ballots give a truncated preference ranking: voters are allowed to mention their $k$ most-liked candidates (with $k = 3$ typically). For reasons of computation time, we limit our dataset to the 162 elections with at most 11 candidates. Figure 1 gives an overview of the selected profiles, with elections ranging from 3 to 11 candidates and from 1560 to 299,107 voters.
Our second dataset is extracted from the "training set" of the Netflix prize (www.netflixprize.com). The original dataset consists of 100,480,507 integer grades, from 1 to 5 stars, that 480,189 users gave to 17,770 movies. For a given number of candidates $C$, we generate several preference profiles by the following greedy algorithm. We select the movie that was graded by most voters; then we select a second movie that maximizes the number of common voters with the first movie; a third movie, that maximizes the number of common voters with the first and second movies, etc. When $C$ movies are selected, we save our first profile, defined by these $C$ movies and their common voters (removing the voters who assign the same grade to all of them). Then we remove these $C$ movies from the database, and we proceed similarly to generate the next profile with $C$ candidates. We continue as long as the generated profile has at least 1000 voters. This whole algorithm is used for all $C \in \{3, \ldots, 11\}$. Finally, this process generates 2243 profiles with 3 to 11 candidates (movies) and 1000 to 91,880 voters (users), as illustrated in Fig. 2. The interest of this dataset is threefold: it provides a large number of profiles; the preferences are cardinal, and not only ordinal; and voters (users) have an incentive to reveal their true preferences, because it helps Netflix's algorithm advise them about other movies that they may like.
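The greedy extraction described above can be sketched as follows. This is a simplified illustration, not the code used for the paper: the function name, the data layout (a dictionary mapping each movie to the grades of its users) and the toy data are hypothetical, and ties between movies are broken by dictionary order, which the text does not specify.

```python
def greedy_profiles(ratings, n_cand, min_voters):
    """ratings: {movie: {user: grade}}. Yields (movies, voters) for each profile."""
    pool = dict(ratings)
    while len(pool) >= n_cand:
        # Start from the movie graded by most users.
        first = max(pool, key=lambda m: len(pool[m]))
        chosen, common = [first], set(pool[first])
        while len(chosen) < n_cand:
            # Add the movie sharing most users with all the movies chosen so far.
            best = max((m for m in pool if m not in chosen),
                       key=lambda m: len(common & set(pool[m])))
            chosen.append(best)
            common &= set(pool[best])
        # Drop the users who give the same grade to all the selected movies.
        voters = {u for u in common if len({pool[m][u] for m in chosen}) > 1}
        if len(voters) < min_voters:
            break
        yield chosen, voters
        for m in chosen:  # remove the selected movies from the database
            del pool[m]


# Hypothetical toy data: 4 movies, 4 users.
toy_ratings = {
    "A": {1: 5, 2: 3, 3: 4, 4: 1},
    "B": {1: 4, 2: 3, 3: 2},
    "C": {1: 5, 4: 2},
    "D": {2: 1, 3: 1},
}
```

With two candidates per profile and a threshold of two voters, the toy data yields a single profile made of movies A and B with users 1 and 3 (user 2 is dropped for giving the same grade to both movies).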
For the FairVote dataset, we convert the truncated rankings into cardinal preferences by considering an adapted Borda score (where $c$ has 1 point for each $d$ such that $v$ ranks $c$ higher than $d$, and 0.5 points for each $d$ such that $v$'s ballot treats $c$ and $d$ equally). Note that this choice has only an impact on the cardinal voting rules.
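A minimal Python sketch of this adapted Borda score is given below. It assumes that the candidates left off a truncated ballot are treated as tied with each other and ranked below all the mentioned candidates; the function name and the example ballot are hypothetical.

```python
def adapted_borda(truncated_ranking, n_cand):
    """Cardinal utilities of one voter from a truncated ranking (best first)."""
    ranked = list(truncated_ranking)
    unranked = [c for c in range(n_cand) if c not in ranked]
    scores = [0.0] * n_cand
    for position, c in enumerate(ranked):
        # 1 point for each candidate ranked lower or not mentioned at all.
        scores[c] = float(n_cand - 1 - position)
    for c in unranked:
        # 0.5 points for each other unmentioned candidate (treated as tied).
        scores[c] = 0.5 * (len(unranked) - 1)
    return scores
```

For example, with 5 candidates and the truncated ballot `[2, 0, 4]`, candidates 2, 0 and 4 receive 4, 3 and 2 points respectively, and the two unmentioned candidates receive 0.5 points each.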
For each profile of both datasets, for all cardinal ratings, we add i.i.d. uniform noise whose amplitude is negligible compared to the differences between cardinal ratings. As a consequence, if $v$ declares preferring $c$ to $d$ in her original ballot, then it is the case in her noised ballot; but if $v$ puts several candidates as tied, then they are in a uniformly random order after adding the noise. The objective is twofold: conduct investigations in a space that is richer than the original profile alone; and simplify the analysis by considering only strict preferences. For each profile, we actually draw several noised profiles: 62 for the FairVote dataset, and 5 for the Netflix dataset, so that the margin of uncertainty due to the random realization is of order $1/\sqrt{162 \cdot 62} < 1\%$ and $1/\sqrt{2243 \cdot 5} < 1\%$ respectively. By convention, this statistical uncertainty will not be represented in the figures.
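The following Python sketch illustrates the noising step and checks the two orders of magnitude quoted above. The function name, the noise amplitude and the seed are hypothetical choices made for the example; the paper only requires the amplitude to be negligible compared to the differences between ratings.

```python
import math
import random


def add_noise(utilities, amplitude=1e-6, seed=0):
    """Add i.i.d. uniform noise in [0, amplitude] to every cardinal rating."""
    rng = random.Random(seed)
    return [[u + rng.uniform(0, amplitude) for u in row] for row in utilities]


# One voter who prefers candidate 0 and is indifferent between 1 and 2.
noised = add_noise([[3.0, 0.5, 0.5]])

# Order of magnitude of the statistical uncertainty for each dataset.
margin_fairvote = 1 / math.sqrt(162 * 62)   # about 0.0100
margin_netflix = 1 / math.sqrt(2243 * 5)    # about 0.0094
```

After noising, the strict preference for candidate 0 is preserved, while the tie between candidates 1 and 2 is broken at random.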
For approval voting, we consider that a sincere voter will vote for all candidates who have a cardinal utility at least equal to the average possible value, i.e. $\frac{C-1}{2}$ in the FairVote dataset and 3 stars in the Netflix dataset. For range voting, Star and majority judgment:
• In the FairVote dataset, the authorized grades (or mentions) are the continuous interval $[0, 1]$; we consider that sincere voters will apply an affine transformation to their cardinal preferences so that their most (resp. least) liked candidate has a grade of 1 (resp. 0).
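The two sincerity functions described above for the FairVote dataset can be sketched as follows. The function names and the example utilities are hypothetical; utilities are assumed to be distinct for a given voter, so the affine transformation is well defined.

```python
def sincere_approval(utilities_v, threshold):
    """Approve every candidate whose utility is at least the threshold."""
    return [c for c, u in enumerate(utilities_v) if u >= threshold]


def sincere_grades(utilities_v):
    """Affine rescaling: most liked candidate gets 1, least liked gets 0."""
    lowest, highest = min(utilities_v), max(utilities_v)
    return [(u - lowest) / (highest - lowest) for u in utilities_v]


# Hypothetical voter with 5 candidates; the threshold is (C - 1) / 2 = 2.
toy_utilities_v = [3.0, 0.4, 4.0, 0.6, 2.0]
approved = sincere_approval(toy_utilities_v, threshold=(5 - 1) / 2)
grades = sincere_grades(toy_utilities_v)
```

This voter approves candidates 0, 2 and 4, gives grade 1 to candidate 2 and grade 0 to candidate 1.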

Algorithms
In order to study the manipulation by coalition, we use the Python package SVVAMP 0.8.3: Simulator of Various Voting Algorithms in Manipulating Populations (Durand et al., 2016a). A core feature of SVVAMP is to study the unweighted coalitional optimization problem (UCO): compute $X(c)$, the minimal number for which there exist strategies […] Roughly speaking, it is the minimal number of manipulators needed to make $c$ win. Candidate $c$ is a CM winner if and only if $V_M(c) \geq X(c)$. Unfortunately, computing $X(c)$ can be very expensive: for example, it is NP-hard for IRV (Bartholdi & Orlin, 1991), maximin and ranked pairs (Xia et al., 2009), Borda, Baldwin and Nanson (Davies et al., 2014). For this reason, SVVAMP computes bounds $\underline{X}(c)$ and $\overline{X}(c)$ such that $\underline{X}(c) \leq X(c) \leq \overline{X}(c)$. Table 1 indicates the type of algorithm used for each voting rule and their time complexity: "exact" means that $\underline{X}(c) = X(c) = \overline{X}(c)$; "approximate" means that there is a theoretically proven guarantee on the ratio or the difference between $\underline{X}(c)$ and $\overline{X}(c)$; "heuristic" means that there is no such approximation guarantee. Table 1 also indicates what type of algorithm is used to compute UM.
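The way such bounds translate into a proven, disproven or undecided manipulation can be sketched in Python. This is a hypothetical illustration of the reasoning, not SVVAMP's API: the function names and status labels are invented for the example.

```python
def candidate_status(n_manipulators, x_lower, x_upper):
    """Classify candidate c from V_M(c) and the bounds on X(c)."""
    if n_manipulators >= x_upper:
        return "cm"        # V_M(c) >= upper bound >= X(c): c is a CM winner
    if n_manipulators < x_lower:
        return "not_cm"    # V_M(c) < lower bound <= X(c): c is not a CM winner
    return "unknown"       # the bounds are too loose to conclude


def profile_status(per_candidate):
    """per_candidate: (V_M(c), lower bound, upper bound) for each c != w."""
    statuses = [candidate_status(*triple) for triple in per_candidate]
    if "cm" in statuses:
        return "cm"        # one proven CM winner is enough
    if all(status == "not_cm" for status in statuses):
        return "not_cm"
    return "unknown"
```

A profile is proven CM as soon as one candidate is a proven CM winner, proven not CM only if every candidate is ruled out, and undecided otherwise; the undecided profiles correspond to the algorithmic uncertainty.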
Since even UM cannot always be computed exactly in polynomial time, we also use the notion of trivial manipulation (Durand, 2015). Let $t(u_v, c, w)$ be the trivial strategy of voter $v$ in favor of $c$ against $w$, defined as follows:
• If the voting rule is ordinal, $v$ acts as if $c$ was her most liked candidate, $w$ her most disliked candidate, with other candidates in the same relative order as in her true preferences;
• If the voting rule is cardinal, $v$ gives the best grade or mention to $c$, and the worst one to all other candidates.
We say that the voting rule is trivially manipulable (TM) in a given profile if there exists $c \neq w$ such that $f\big((s(u_v))_{v \in \mathcal{V}_{NM}(c)}, (t(u_v, c, w))_{v \in \mathcal{V}_M(c)}\big) = c$. Firstly, this can always be computed in polynomial time (provided the winner can be computed in polynomial time, which is the case for all voting rules in this paper). Secondly, it is a relatively simple and natural manipulation heuristic, requiring little information about the whole profile; it is arguably more realistic for human manipulators than a sophisticated manipulation, like the one resulting from a non-polynomial algorithm.
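For ordinal rules, the trivial strategy and the TM test can be sketched as follows. The function names and the toy profile are hypothetical, and the Borda rule is used only as an example of counting function; candidates are integer indices and ties favor the lowest index.

```python
def trivial_ranking(true_ranking, c, w):
    """Put c first and w last, keeping the other candidates in their true order."""
    middle = [d for d in true_ranking if d not in (c, w)]
    return [c] + middle + [w]


def borda_winner(rankings):
    """Example counting function: Borda, ties broken in favor of the lowest index."""
    n = len(rankings[0])
    scores = [0] * n
    for ranking in rankings:
        for position, d in enumerate(ranking):
            scores[d] += n - 1 - position
    return max(range(n), key=lambda d: scores[d])


def is_trivially_manipulable(rankings, rule):
    """True if, for some c, the voters preferring c to w elect c by trivial ballots."""
    w = rule(rankings)
    for c in sorted(rankings[0]):
        if c == w:
            continue
        ballots = [trivial_ranking(r, c, w) if r.index(c) < r.index(w) else r
                   for r in rankings]
        if rule(ballots) == c:
            return True
    return False


# Hypothetical toy profile: the sincere Borda winner is candidate 1.
toy_profile = [[0, 1, 2]] * 3 + [[1, 2, 0]] * 2
```

In the toy profile, the three voters who prefer candidate 0 to the sincere winner 1 can bury candidate 1 in last position, which makes candidate 0 win; Borda is therefore TM in this profile.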

Qualitative features of the profiles
Figures 3 and 4 represent the qualitative features of the profiles. In both datasets, more than 99% of the profiles have a Condorcet winner (CW), which qualifies previous theoretical work (such as Gehrlein (2006)), but confirms previous similar empirical findings (Tideman, 2006). Even having a Condorcet order (CO) happens very often: 99% in the FairVote dataset and 97% in the Netflix dataset. 41% of the profiles in the FairVote dataset, and 7% in the Netflix dataset, have a resistant Condorcet winner (RCW): no Condorcet-consistent rule can be CM in these profiles. Finally, 37% of the profiles in the FairVote dataset, and 5% in the Netflix dataset, have a majority favorite (MF): some rules such as plurality or IRV cannot be CM in these profiles. Since all these rates are higher in the FairVote dataset, we can already expect more possibilities of coalitional manipulation in the Netflix dataset.

CM rate
The CM rate of a voting rule is the proportion of profiles where the rule is CM (in a given dataset or probabilistic model). Figures 5 and 6 show the CM rates of the voting rules under study. In these figures and all the following bar plots, the solid bar gives a lower bound, and the upper end of the thin black line provides an upper bound. For example, for the Benham rule (Ben) in Fig. 5: SVVAMP proves that Benham is CM in 3% of the profiles (solid blue bar), is unable to conclude in less than 1% of the profiles (thin black line, representing the algorithmic uncertainty), and proves that Benham is not CM in the remaining 96% of the profiles. These figures also indicate the RCW bound: no Condorcet-consistent rule can have a higher CM rate because of the profiles having an RCW. As we already suspected, all the values of CM rate are higher in the Netflix dataset than in the FairVote dataset. However, several qualitative conclusions are common.
Our main conclusion is that the seven rules of the IRV family have a lower CM rate than all the other ones. Their CM rates are very similar: the difference is lower than 3% in both datasets. In the Netflix dataset, it cannot be ruled out that Tideman, Benham and Smith-IRV have a CM rate significantly lower than the other rules of the family (with a difference of at most 2%); this would deserve further investigation. Apart from the IRV family, the two-round system has the lowest CM rate. This can be partly explained by the fact that in both datasets, approximately one third of the profiles have 3 candidates, a case where the two-round system is equivalent to exhaustive ballot. We will discuss the performances of the two-round system depending on the number of candidates in Sect. 6.6. As for the Condorcet-consistent rules that are not part of the IRV family, we can take maximin and Schulze as references, because their results are almost exact in practice (the algorithmic uncertainty is less than 0.1%), and the lower bound for their CM rate is identical, whatever the dataset. Compared to them:
• Baldwin and Copeland show promising results (better lower bound) that would deserve further investigation;
• Viennot, ranked pairs and split cycle have essentially the same lower bound, but more precise algorithms would be necessary to determine if they are as good or worse than maximin and Schulze;
• Nanson and Black exhibit worse CM rates; in both datasets, Black has a CM rate that is close to the RCW bound, i.e. the worst possible CM rate for a Condorcet-consistent rule.
Plurality has a higher CM rate than maximin and Schulze, and other rules have an even higher CM rate, for example majority judgment, Bucklin, Kim-Roush, veto and Borda.Four rules have a higher CM rate than the RCW bound in both datasets: Star, range voting, approval voting and Coombs.

UM rate
Figures 7 and 8 represent the UM rate, defined similarly to the CM rate. Most conclusions are similar to Sect. 6.2, with the following clarifications or differences.
• The Condorcet-consistent rules of the IRV family show the same results, with a UM rate of at most 1% in both datasets. This UM rate is strictly lower than for IRV or exhaustive ballot, but the difference is less than 1%.
• Baldwin and Copeland confirm their promising results, compared to maximin and Schulze.
• Veto has much better results for the UM rate than for the CM rate (for example, its UM rate is lower than maximin and Schulze). This is not surprising because a typical manipulation for $c$ in veto consists in dividing the manipulators' ballots between all the other candidates.

TM rate
Figures 9 and 10 present the TM rates. The conclusions are similar to Sect. 6.3, with the following clarifications.
• Star has significantly better performances in terms of TM rate than for the CM rate or UM rate, but with a dramatic difference between the FairVote dataset (0%) and the Netflix dataset (80%).
• Viennot, ranked pairs and split cycle have essentially the same TM rate as maximin and Schulze; the TM rate of Viennot is slightly lower, but the difference is less than 1% in both datasets.

CM complexity index
The computational complexity of computing the strategic ballots has often been mentioned as a way to deter manipulation: for example, it is NP-hard for Borda (Davies et al., 2014). However, in practice, Borda has exactly the same CM rate and TM rate in both datasets, suggesting that strategic ballots are actually easy to compute. To formalize this idea, we introduce the CM complexity index as the share of the profiles where the rule is CM but neither UM nor TM, divided by the share of the profiles where the rule is CM. Since our margins of uncertainty are too high in the Netflix dataset to have interpretable results, we present only the results for the FairVote dataset in Fig. 11.
All the rules of the IRV family, the two-round system and veto have a CM complexity index that is higher than 75%. Baldwin, Copeland, Viennot, ranked pairs, split cycle, Nanson, Kim-Roush, Black and Schulze have a CM complexity index that is lower. Approval voting, range voting, Coombs, majority judgment and plurality have a CM complexity index that can easily be proven equal to 0% in general, because they are CM if and only if they are UM. Star, Borda, Bucklin and maximin have no such theoretical property, but in practice, their CM complexity index is also equal to 0% in this dataset.
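As a worked illustration, the index can be computed from per-profile flags as in the Python sketch below. The function name and the toy flags are hypothetical, and the sketch assumes that the numerator counts the profiles where the rule is CM without being UM or TM (a UM or TM profile is always CM, so the index lies between 0 and 1).

```python
def cm_complexity_index(flags):
    """flags: one (is_cm, is_um, is_tm) triple of booleans per profile."""
    n_cm = sum(1 for is_cm, _, _ in flags if is_cm)
    n_hard = sum(1 for is_cm, is_um, is_tm in flags
                 if is_cm and not is_um and not is_tm)
    # The index is undefined when the rule is never CM in the dataset.
    return n_hard / n_cm if n_cm else None


# Hypothetical toy dataset of 4 profiles.
toy_flags = [
    (True, True, False),    # CM, found by a unison manipulation
    (True, False, False),   # CM, but neither UM nor TM
    (True, False, True),    # CM, found by the trivial manipulation
    (False, False, False),  # not CM
]
```

In the toy dataset, one of the three CM profiles requires a manipulation that is neither unison nor trivial, so the index equals one third.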

Number of CM winners
We now investigate the indeterminacy of the outcome that is due to strategic voting. Figures 12 and 13 […] of CM winners divided by $C - 1$. This confirms the good results of the IRV family, with 0-2% in both datasets, better than all the other rules.
In Fig. 14, we represent the average number of CM winners as a function of the number of candidates $C$, only for the voting rules whose algorithmic uncertainty is less than 1%. We omit the FairVote dataset, where we do not have enough different elections for each possible number of candidates. Globally, the average number of CM winners can roughly be described as an affine function of $C$. In increasing order, we have: Condorcet-IRV, IRV and exhaustive ballot, with barely 0.3 CM winners for $C = 11$; Schulze, veto and Bucklin ($\approx 5$ CM winners for $C = 11$); majority judgment and Borda ($\approx 8$ CM winners for $C = 11$); approval voting, range voting, Coombs, Star and plurality, which can lead to the election of almost any candidate on average. The case of the two-round system deserves a particular mention: almost 0 CM winners when $C = 3$ (when the rule is equivalent to exhaustive ballot), but performance degrades when $C$ increases and reaches approximately 8 CM winners for $C = 11$.

Condorcet violation rate
The sincere Condorcet consistency rate is the share of the profiles where the sincere winner is the Condorcet winner, divided by the share of the profiles where a […] By definition, all Condorcet-consistent rules have a sincere Condorcet violation rate equal to 0%. When coalitional manipulation is taken into account, the seven rules of the IRV family outperform all the others, and have relatively similar results between them: in the Netflix dataset, where the differences are larger, their Condorcet violation rates with CM span from 1% (lower bound for Tideman) to less than 4% (for Condorcet-IRV, IRV, exhaustive ballot and Woodall).

Loss of social welfare
The social welfare of candidate $c$ is defined as $SW(c) = \sum_{v \in \mathcal{V}} u_{vc}$. The loss of normalized social welfare of candidate $c$ is […] $SW(d)$ […]. Applied to $c = w$, it yields the sincere loss of normalized social welfare. Applied to the CM […] For the sincere value, by definition, range voting is optimal. When coalitional manipulation is taken into account, once again, the seven rules of the IRV family outperform all the others, with a loss of 2% or lower.

CM power index
Strategic voting can generate an inequality of power between the naive, sincere voters and the sophisticated, strategic ones. To quantify this idea, we introduce the CM power index of a voting rule. In a given profile, it is defined as $\max_{c \neq w} \frac{V_{NM}(c)}{X(c)}$. In a dataset or probabilistic model, it is the average value of the above quantity over the profiles. Roughly speaking, if a voting rule has a CM power index of $x$, then a strategic voter has $x$ times as much power as a non-strategic voter.
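The definition can be illustrated by a short Python sketch. The function names and the numbers are hypothetical: for each candidate other than the sincere winner, we assume that the number of sincere voters opposing the manipulation and the minimal number of manipulators are already known.

```python
def cm_power_index(n_sincere, n_needed):
    """Profile-level index: max over c != w of V_NM(c) / X(c).

    n_sincere[c] is V_NM(c), n_needed[c] is X(c), for each candidate c != w.
    """
    return max(n_sincere[c] / n_needed[c] for c in n_needed)


def average_cm_power_index(profiles):
    """Dataset-level index: average of the profile-level values."""
    values = [cm_power_index(n_sincere, n_needed) for n_sincere, n_needed in profiles]
    return sum(values) / len(values)


# Hypothetical profile with two candidates other than the sincere winner.
toy_index = cm_power_index({1: 60, 2: 70}, {1: 50, 2: 20})
```

In this toy profile, 20 manipulators are enough to defeat 70 sincere voters in favor of candidate 2, so each strategic voter weighs as much as 3.5 sincere voters.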
Figures 19 and 20 show, in both datasets, that the rules of the IRV family have a CM power index between 1.00 and 1.15, thus being close to the "one person, one vote" principle.All other voting rules have a higher CM power index, with values that can be as high as 5.93 (reached for Star in the Netflix dataset).

Conclusion
We have studied coalitional manipulability by computer simulations on the basis of two empirical datasets. For all the indicators, the seven rules of the IRV family (exhaustive ballot, IRV, Condorcet-IRV, Benham, Smith-IRV, Tideman and Woodall) outperform all the other rules of our study. Although the differences between these seven rules seem inconsequential, further studies with more precise algorithms would be interesting to evaluate their respective performances.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Fig. 3 Qualitative features of the profiles: FairVote dataset

Table 1 […] their time complexity
* The algorithm by Gaspers et al. (2013) is exact, but since we consider a different tie-breaking rule, it induces an uncertainty of 1 manipulator