This section presents the proposed model, the configurations, and the training and evaluation of the model. The conceptual model represented in Fig. 1 shows the relationships between the key concepts in our proposed model. We discuss how these concepts were operationalized in our model.
Figure 1 shows how each stage in the model feeds into the next. It also shows how the exchange decisions feed back into the model to inform subsequent decisions. Stages 1 and 2 are collectively termed the learning phase, while stages 3 and 4 are collectively termed the prediction phase. In the learning phase, we theoretically identified motives (internal states) for exchange decisions. The observed behaviours in this phase are the combinations of the observable differences between individuals and the definite behaviours that signify the individuals’ motives. Subsequently, cluster analyses of the observed behaviours (of the learning phase) were used to impute unobserved game strategies (internal states of the prediction phase) to predict exchange decisions (observable behaviour of the prediction phase). Further explanations are provided in the subsections below. The material (code, data and analyses) for this model can be found via the link: https://tinyurl.com/bdd46utb.
Social Exchange Data and Exchange Decisions
The proposed model aims to predict human exchange decisions during interactive social exchanges. The model was trained and tested using interactive social exchange data and theoretical knowledge of possible motives underlying social exchange decisions. The data was obtained from Virtual Interaction Application (VIAPPL) 2013 and 2014 experiments reported by Durrheim, et al. 6. In the experiments, individuals were required to allocate a single token per round to any player in the two 7-player groups. The selection was based on the observable differences (e.g., token balance, group identity, and previous allocations) that differentiate the players.
An individual’s token balance in each round shows how wealthy the individual was in that round. Thus, a token balance was regarded as a show of the individual’s status. An individual’s status was measured relative to the wealth of the other individuals in the game, i.e., we measure how wealthy an individual is in comparison to other individuals. Status was determined by the formula \(Status = a/\max\left(A\right)\), where \(a\) is the token balance of an individual in a round, and \(A\) is a vector of the token balances of all the players in that round. Thus, status is represented as a real number in the range of 0 to 1 inclusive. An individual is of high status if the individual’s status is greater than the average status in the round; otherwise, the individual is of low status. Group identity simply indicates the group to which the individual belongs. The group identities are represented as 1 and 2 for group 1 and group 2 respectively. Previous allocation indicates an in-group giving, out-group giving or self-giving in the previous round, and is represented as 0, 1 or 2 respectively.
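As a minimal sketch of the status measure described above (not the authors' code; the function names `status_scores` and `status_levels` and the handling of an all-zero round are illustrative assumptions), the relative status and the high/low split could be computed as follows:

```python
def status_scores(balances):
    """Return each player's status a / max(A) for one round."""
    top = max(balances)
    if top == 0:
        # Assumption: if nobody holds tokens, treat every status as 0.
        return [0.0 for _ in balances]
    return [b / top for b in balances]


def status_levels(balances):
    """Label a player 'high' if their status exceeds the round's mean status."""
    scores = status_scores(balances)
    mean_status = sum(scores) / len(scores)
    return ["high" if s > mean_status else "low" for s in scores]
```

Because the comparison is strict, a round in which all players hold equal balances labels everyone as low status under this sketch.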
The experimental games were randomly assigned to conditions in which players and groups started with either equal or unequal token balances. The data for each allocation in each round records the game identifier, the group number, experimental condition, starting and ending token balance, and directed ties showing player-to-player allocations. These ties provide traces of relational interdependencies (e.g., competition and cooperation) that develop between interacting individuals and groups. The unprocessed data contains 2 (conditions) × 40 (rounds) × 14 (participants) × 5 (games) = 5,600 exchanges from VIAPPL 2013 and 1 (condition) × 40 (rounds) × 14 (participants) × 4 (games) = 2,240 exchanges from VIAPPL 2014, as reported in 6. This implies 7,840 exchanges in total. To reduce noise in the data, the first rounds of the data were not used because players were likely to randomly allocate their tokens.
Each exchange shows the presence or absence of each definite behaviour: (1) reciprocity, defined as giving to the player who gave to you in the previous round, vs non-reciprocity, (2) giving to a rich vs giving to a poor player, and (3) giving to the out-group vs giving to the in-group. Table 1 presents the exchange decisions and their explanations as used in the current study. An act of reciprocation is indicated as 1 while its absence is indicated as 0. Likewise, giving to a rich player and giving to the out-group are each indicated as 1, and their absence as 0.
Motives
This section shows how definite behaviours help to signify the motives underlying exchange decisions. The theoretical framework in Fig. 2 identifies three motives, namely, group ties, wealth aspiration, and interpersonal ties, that are likely to inform exchange behaviours. These motives, inferred from previous receipts (in the case of ties) and previous allocations (in the case of wealth aspiration), are the theoretically identified motives that account for the absence or presence of the definite behaviours in exchange decisions. The strength of each motive is measured from past observed behaviour and is used to impute strategies which, in turn, are used to predict future exchange decisions. For example, an individual who has a strong motive for fairness in the game will often give a token to the poorest player (definite behaviour), irrespective of the poor player’s group.
Group ties measure the relationship between individuals and groups. Individuals and groups adopt various identity management strategies aimed at creating positive social identities 21,22. Social identity theory 23 provides a theoretical lens to understand the social psychological motives of social exchanges. The theory argues that individuals categorise themselves as group members in intergroup contexts 22, compare the status of the in-group and out-group, and are motivated to identify with groups that are positively valued within the hierarchy 24. These processes occur as the individual seeks to achieve a positive internal perception of self or a high perception of self-worth 25. Prejudice and discrimination occur as an expression of this positive distinctiveness motive, which is expressed as in-group favouritism behaviour or parochial altruism in intergroup exchange experiments 26,27.
Bounded generalised reciprocity (BGR) theory 28,29 argues that parochial altruism is ultimately motivated by self-interest. Individuals favour in-group members because there is an expectation to do so. To avoid acquiring a bad reputation and being excluded from the exchange network, individuals will favour the in-group over the out-group. In other words, individuals expect profitable and advantageous interactions with in-group members because favours are more likely to be reciprocated by in-group members than by out-group members 26,28.
Both social identity theory (SIT) and BGR expect that in-group favouring behaviour will strengthen the bond between the in-group members. We refer to this bond as a group tie. An individual’s motive may be to strengthen the bond with the in-group (or even with the out-group) to maintain a good reputation. Therefore, group ties were defined in terms of in-group versus out-group exchange.
For each player, we measure the strength of each (in-group and out-group) tie by determining how often the in-group and out-group members allocate tokens to the individual (see Table 2). We determine the in-group relationship by the ratio \(R_{r} = T_{\text{in}}/N_{\text{in}}\) of the number of tokens received from in-group members (\(T_{\text{in}}\)) to the number of in-group members (\(N_{\text{in}}\)) in round \(r\), given that the game rules specify one token allocation per round. Thus, \(T_{\text{in}} = N_{\text{in}}\) results in \(R_{r} = 1\) (a very strong bond), while \(T_{\text{in}} = 0\) results in \(R_{r} = 0\) (no bond). The out-group relationship, i.e., the ratio of the number of tokens received from out-group members to the number of out-group members, is calculated in the same way, with \(T_{\text{in}}\) replaced by \(T_{\text{out}}\) and \(N_{\text{in}}\) replaced by \(N_{\text{out}}\). An individual may have a strong bond with both the in-group and out-group members. Thus, in-group and out-group relationships are two separate measures.
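The two tie ratios can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `tie_strengths`, the dictionary-based data layout, and the choice to exclude the player from their own in-group count are hypothetical and are not taken from the published material.

```python
def tie_strengths(player, allocations, groups):
    """Return (in_group_ratio, out_group_ratio) for `player` in one round.

    allocations: dict mapping each giver to the receiver of their single token.
    groups: dict mapping each player to a group identity (1 or 2).
    """
    own_group = groups[player]
    # Assumption: the player is not counted among their own in-group members.
    in_members = [p for p in groups if p != player and groups[p] == own_group]
    out_members = [p for p in groups if groups[p] != own_group]

    # Each member gives one token per round, so tokens received = givers.
    tokens_in = sum(1 for p in in_members if allocations.get(p) == player)
    tokens_out = sum(1 for p in out_members if allocations.get(p) == player)

    ratio_in = tokens_in / len(in_members) if in_members else 0.0
    ratio_out = tokens_out / len(out_members) if out_members else 0.0
    return ratio_in, ratio_out
```

For example, with `groups = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2}` and `allocations = {"A": "B", "B": "A", "C": "A", "D": "A", "E": "D"}`, player A receives from both other in-group members and from one of two out-group members, giving ratios of 1.0 and 0.5.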
In most cases, the comparison between in-group and out-group is made relative to status (low/high/equal status), which is associated with power. Wealthy individuals are often seen as powerful individuals. Thus, wealth aspiration underpins two exchange behaviours: giving to the rich (or seeking power) and giving to the poor (or fairness).
Capraro, et al. 4 suggested that fairness can be the basis on which some individuals interact. According to Tajfel and Turner 27, interaction may be moderated by the legitimacy (perceptions of fairness) of the status hierarchy. When perceptions of fairness are low, low-status group members challenge the status quo by strengthening in-group reciprocal behaviours and in-group favouritism. High-status group members may either enter into intergroup competition or rectify the injustice by out-group altruism, that is, giving to the poor. In contrast, when the situation is viewed as legitimate, low-status group members may seek to enrich themselves individually by making exchanges with rich out-group members.
Wealth aspiration was measured relative to the wealth of the player to whom the individual allocates a token, that is, the individual’s aspiration to associate with the poor (fairness) or the rich (power-seeking). Associating with the poor means allocating one’s token to a low-status individual, while associating with the rich means allocating one’s token to a high-status individual in the round. The former is considered as being fair, while the latter is considered as seeking power. Thus, participants’ wealth aspiration in round \(r\) is calculated based on the allocations made in round \(r-1\), using the formula \(WealthAspiration = a_{r-1}/\max\left(A_{r-1}\right)\), where \(a_{r-1}\) is the start token (the individual’s token before allocation) of the player to whom an individual allocated a token in round \(r-1\), and \(A_{r-1}\) is a vector of the start tokens of all the players in round \(r-1\).
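A minimal sketch of this calculation is given below. The function name `wealth_aspiration`, the dictionary inputs, and the zero guard are illustrative assumptions; only the ratio itself comes from the formula above.

```python
def wealth_aspiration(giver, prev_allocations, prev_start_tokens):
    """Wealth aspiration of `giver` in round r, from round r-1 data.

    prev_allocations: dict giver -> receiver for round r-1.
    prev_start_tokens: dict player -> token balance before allocating in r-1.
    """
    recipient = prev_allocations[giver]
    richest = max(prev_start_tokens.values())
    if richest == 0:
        # Assumption: with no tokens in play, report an aspiration of 0.
        return 0.0
    return prev_start_tokens[recipient] / richest
```

Values near 1 indicate that the previous token went to one of the richest players (power-seeking), while values near 0 indicate giving to a poor player (fairness).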
Interpersonal ties of trust are built by reciprocity. Relationships between individuals are more trusting when exchanges occur without explicit negotiations between members 30,31. Trustworthy individuals gain positive reputations; they are likely to be rewarded by other individuals 32, and their actions are more likely to be reciprocated, especially by the in-group members. As shown by De Dreu, et al. 33, expectations of reciprocity can promote in-group interactions, and reciprocation. But powerful reciprocity norms also motivate reciprocation with individuals who are not in-group members.
Interpersonal ties were measured in terms of the reciprocation motive, that is, A’s desire to allocate a token to another participant B who allocated a token to A in the previous round. This motive was measured by the history of reciprocity, with the presence or absence of reciprocity in the previous round indicated as 1 and 0 respectively.
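The reciprocity indicator can be sketched as follows. The function name and data layout are hypothetical, and treating self-giving as non-reciprocal is an assumption made here for illustration rather than a rule stated in the text.

```python
def reciprocity(giver, current_allocations, prev_allocations):
    """Return 1 if `giver` now gives to someone who gave to them last round."""
    recipient = current_allocations[giver]
    if recipient == giver:
        # Assumption: a self-allocation is never counted as reciprocity.
        return 0
    return 1 if prev_allocations.get(recipient) == giver else 0
```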
Motives and experience (represented as observed behaviours) form the basis on which individuals in social interaction plan and/or adjust their plans. We refer to the combination of motives and observable differences between individuals as features used to impute strategies in social interaction. In this study, features are group ties, wealth aspiration, reciprocity, status, group identity, and previous allocation. These features form input to the cluster analysis used to determine strategies in the game. Table 3 summarises the features while Table 4 shows the representation of the features as input to the cluster analysis.
Strategies
This section identifies unobserved strategies in the minds of players. It does this by finding patterns of association between features of the exchange behaviours of each player over the course of the game. Note that each row in Table 4 represents a single exchange decision by one player in one round. The actual decision, in the three-character depiction (see Table 1), is recorded in the final column. The “Features” columns record observable features that characterize the motives and game features of the decision. We use cluster analysis to identify and categorize patterns of exchange behaviours based on the co-occurrence of features. These categories of exchange behaviours represent unseen and implicit strategies.
As shown by Zaki, et al. 34, clustering behaviours can ensure high performance in predicting exchange decisions. The partitioning around medoids (PAM) clustering algorithm in R 35 was applied with Gower distance 36 as the distance measure. Although other distance measures, such as Euclidean and Manhattan, can be used (see 37), Gower distance is very useful and performs well in a domain with mixed data types (categorical and non-categorical data) 36,38.
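To make the choice of distance measure concrete, the following is a minimal sketch of Gower distance for one pair of mixed-type records. It is written in Python for illustration only (the study itself used R), and the function name and argument layout are assumptions: numeric features contribute their absolute difference scaled by the feature's range, categorical features contribute 0 on a match and 1 otherwise, and the result is the average over features.

```python
def gower_distance(x, y, is_numeric, ranges):
    """Gower distance between two records with mixed feature types.

    is_numeric[k]: True if feature k is numeric, False if categorical.
    ranges[k]: observed range (max - min) of numeric feature k; ignored
               for categorical features.
    """
    total = 0.0
    for xi, yi, numeric, rng in zip(x, y, is_numeric, ranges):
        if numeric:
            # Range-scaled absolute difference; a constant feature adds 0.
            total += abs(xi - yi) / rng if rng > 0 else 0.0
        else:
            # Simple matching for categorical features.
            total += 0.0 if xi == yi else 1.0
    return total / len(x)
```

For instance, two exchanges with status 0.5 and 1.0 (range 1.0), group identities 1 and 2, and the same previous allocation give a distance of (0.5 + 1 + 0) / 3 = 0.5.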
The silhouette width, also referred to as the silhouette coefficient 39, was used to determine the optimal number of clusters. It measures the within-cluster cohesion and the separation distance between clusters. The silhouette width \(s\) of a data sample ranges from −1 to 1, where a large \(s\) (near 1) implies that the sample is well clustered, a small \(s\) (near 0) implies that the sample lies between clusters, and a negative \(s\) implies that the sample has been placed in the wrong cluster. Thus, the higher the silhouette width, the better the clustering.
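As a hedged illustration of how the silhouette width is obtained, the sketch below uses the standard definition \(s = (b - a)/\max(a, b)\), where \(a\) is a sample's mean distance to the other members of its own cluster and \(b\) is its smallest mean distance to any other cluster. The function name and the precomputed distance matrix input are assumptions; this is not the R routine used in the study.

```python
def silhouette_widths(dist, labels):
    """Return the silhouette width s(i) of every sample.

    dist: square matrix (list of lists) of pairwise distances.
    labels: cluster label of each sample; at least two clusters are assumed.
    """
    n = len(labels)
    clusters = set(labels)
    widths = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            # Convention: a sample alone in its cluster gets s = 0.
            widths.append(0.0)
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        # Mean distance to each other cluster; keep the nearest one.
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c)
            / sum(1 for j in range(n) if labels[j] == c)
            for c in clusters if c != labels[i]
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths
```

Averaging these widths over all samples gives the score that is compared across candidate numbers of clusters.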
The clustering procedure and results
The optimal number of clusters was determined experimentally by obtaining the silhouette width for two to 20 clusters. Figure 3 plots the number of clusters on the x-axis against the silhouette width on the y-axis. The optimal number of clusters is indicated by the highest silhouette width. The plot shows that the optimal number of clusters is six, with a silhouette width of 0.685.
We interpret the clusters by identifying the two dominant exchange decisions that characterize each one (see Table 1). Figure 4 plots the stacked bar charts of these exchange decisions for each cluster, which show the dominant and recessive decisions that characterize each strategy. Table 5 reports the two dominant decisions for each cluster and interprets the strategy represented by this cluster. For example, Cluster 1 is represented by the Ingroup-Care strategy (individuals allocate their tokens to in-group members irrespective of their status) while Cluster 6 is represented by the Ingroup-Promotion strategy (individuals allocate their tokens and reciprocate only to the poor in-group member).
Predicting Strategies via Machine Learning and results
Whereas the cluster analysis identifies the strategies on the basis of all the data, our ultimate objective was to predict the strategy that motivated a single exchange behaviour. To this end, we trained an artificial neural network (ANN) 40–42 to predict the cluster membership of each exchange in each round by each player and represent the strategy of the player as that depicted by the predicted cluster. The inputs to the ANN are the features (see Table 4), including the exchange decision, generated from an individual’s previous round, while the output is the cluster to which the past exchange belongs. The ANN was designed to operate in real time to classify a single exchange decision into one of the identified strategies or as a newly formed strategy. Thus, we use an artificial neural network to compute the probability that an exchange decision in the past round forms part of a complex strategy. Where the probability is below a given threshold (95% in this study), the decision is categorised as part of a new strategy. Recognising the strategy of an individual during interactive social exchanges will improve the prediction of the individual’s exchange decision. This capability has been shown to work in other domains such as traffic congestion prediction 34.
The ANN was implemented using the Deeplearning4j framework 43. The study used a feed-forward artificial neural network with three layers: an input layer, a hidden layer with four neurons, and an output layer with six neurons, one for each cluster. The final artificial neural network model was trained using a batch size of 40 and a learning rate of 0.01, with 15 epochs. It makes use of the softmax activation and the NegativeLogLikelihood function in Deeplearning4j 43 for computing the error, which is used to determine the direction of learning. These parameters were determined experimentally.
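The thresholding step can be sketched independently of any particular framework. The code below is an illustration in plain Python, not the Deeplearning4j implementation: the names `softmax`, `classify_strategy`, and `NEW_STRATEGY`, and the idea of passing raw output-layer values (`logits`) are assumptions. It applies a softmax to six output values and reports a new strategy when the largest probability falls below the 95% threshold.

```python
import math

NEW_STRATEGY = None  # hypothetical marker for "fits no known cluster"


def softmax(logits):
    """Convert raw output-layer values into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_strategy(logits, threshold=0.95):
    """Return the cluster number (1-based) or NEW_STRATEGY if unsure."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return NEW_STRATEGY
    return best + 1
```

A confident output such as `[10, 0, 0, 0, 0, 0]` maps to cluster 1, whereas a flat output such as six zeros gives each cluster a probability of 1/6 and is therefore treated as a new strategy.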
To train the artificial neural network, data (i.e., features and their corresponding strategies discovered by the cluster analysis) were divided into training and testing sets, each having X (the features) and Y (the strategy) components. Of the data, 70% were used for training while the remaining 30% were used for testing the artificial neural network. Both X and Y were provided to the neural network during training, but only X was provided during testing. The function of the neural network is then to classify X into one of the available clusters, irrespective of the round at which X is produced.
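A 70/30 split of this kind might look like the sketch below. Whether the original split was shuffled, and with what seed, is not stated in the text, so the shuffling, the `seed` parameter, and the function name are assumptions made for illustration.

```python
import random


def train_test_split(features, strategies, train_fraction=0.7, seed=0):
    """Split paired X (features) and Y (strategies) into train and test sets."""
    indices = list(range(len(features)))
    # Assumption: rows are shuffled before splitting, with a fixed seed.
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(indices)))
    train, test = indices[:cut], indices[cut:]
    x_train = [features[i] for i in train]
    y_train = [strategies[i] for i in train]
    x_test = [features[i] for i in test]
    y_test = [strategies[i] for i in test]
    return x_train, y_train, x_test, y_test
```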
The artificial neural network was evaluated using the accuracy score, which is the proportion of samples correctly classified. However, accuracy is not a true reflection of performance when the number of samples in each class is not equal or almost equal (an imbalanced dataset). To ensure a more accurate measure, the multi-class confusion matrix, detailed in 44, was used. Precision, recall and F1 scores were calculated from the confusion matrix. Precision measures the number of samples that actually belong to a class as a proportion of the total number of samples the artificial neural network identified as belonging to the class. The value ranges from 0 or 0% (no correct identification) to 1 or 100% (perfect identification). Recall measures the number of samples the artificial neural network correctly identified as belonging to a class as a proportion of the samples that actually belong to that class. Again, the value ranges from 0 or 0% to 1 or 100%. The F1-score measures the balance between precision and recall (their harmonic mean). It ranges from 0 to 1, and a higher value indicates a better score.
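The following sketch shows how these scores can be derived from a multi-class confusion matrix using the standard definitions. The function names and the row/column convention (rows are actual classes, columns are predicted classes) are assumptions; the matrices produced by Deeplearning4j may be laid out differently.

```python
def per_class_metrics(confusion):
    """Precision, recall and F1 per class.

    confusion[i][j] = number of samples of actual class i predicted as class j.
    """
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, wrong
        fn = sum(confusion[k]) - tp                       # actual k, missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


def accuracy(confusion):
    """Proportion of all samples that lie on the diagonal."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total
```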
We present the result of predicting the strategies applied by individuals based on the previous exchange decisions. Figure 5 shows the learning curve for the artificial neural network.
Tables 6 and 7 show the confusion matrix and the performance measures, respectively, for the test set, obtained in one of the experiments with the number of epochs set to 5 (all other parameters remained as reported). The performance statistics show that the neural network predicted the strategies with a high accuracy of above 94%. This result is similar to that obtained for the training set. The F1 scores of 90% (see the supplementary material) and 89% on the training and test sets, respectively, confirm that the performance is not biased towards any cluster. The final parameters were obtained when the number of epochs was set to 15. The confusion matrices generated internally by the Deeplearning4j 43 inbuilt function for the training and test datasets are provided in the supplementary materials https://tinyurl.com/bdd46utb.
Predicting future moves via a Hidden Markov Model
The ANN is trained to identify the exchange strategy that a single exchange behaviour belongs to. It can take all the exchange decisions enacted by the player at round \(r\) and classify them into various strategies. We now develop a hidden Markov model to predict what each player will do in the next round, \(r + 1\). The main aim of the hidden Markov model is to predict future moves from a player’s past behaviour. Rather than taking the player’s past behaviour as input in the form of Features, the hidden Markov model takes the player’s past strategies as input and predicts the player’s next exchange decision. This was done to improve the predictive accuracy of the model.
A hidden Markov model has hidden states on which the observables are conditioned. For example, an altruistic act can be motivated by empathic concern 45. An altruistic act is an observation while empathy is the state on which the act is conditioned. See 46 for a recent review.
A hidden Markov model process 47 is characterised by the five-tuple \(\{Q, O, \pi, A, B\}\):
- \(Q = \{q_{1}, q_{2}, q_{3}, \dots, q_{T}\}\) is the set of states, each one drawn from \(N\) possible states, where \(q_{t}\) denotes the state at time \(t\) for \(1 \le t \le T\), and \(T\) is the maximum number of times an observation was made.
- \(O = \{O_{1}, O_{2}, O_{3}, \dots, O_{T}\}\) is the set of observations, each one drawn from \(M\) possible observations.
- \(\pi = \{\pi_{1}, \pi_{2}, \pi_{3}, \dots, \pi_{N}\}\) is the initial state probability for each \(q\) in the set of all possible states.
- \(A = \left\{a_{ij}\right\}\) is the transition probability. This describes the probability of moving from state \(i = q_{t-1}\) to state \(j = q_{t}\).
- \(B = \left\{b_{j}\left(O_{t}\right)\right\}\) is the emission probability. This denotes the probability of the observation at time \(t\), given state \(j = q_{t}\).
- \(\lambda = \{\pi, A, B\}\) denotes the parameters of the hidden Markov model.
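To show how the parameters \(\lambda = \{\pi, A, B\}\) can yield a one-step-ahead prediction, the sketch below runs a normalised forward pass over the observed sequence, propagates the resulting state belief through \(A\), and weights the emissions in \(B\). It is a generic textbook-style illustration with hypothetical function names and integer-indexed states and observations, not the authors' implementation.

```python
def normalise(values):
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    if total == 0:
        # Assumption: fall back to a uniform distribution if all mass vanishes.
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def predict_next_observation(pi, A, B, observations):
    """Return (most likely next observation, its full distribution).

    pi[i]: initial probability of state i.
    A[i][j]: probability of moving from state i to state j.
    B[j][o]: probability of emitting observation o in state j.
    observations: non-empty list of observation indices O_1..O_T.
    """
    n_states = len(pi)
    # Forward pass, normalised at every step (filtered state belief).
    belief = normalise([pi[j] * B[j][observations[0]] for j in range(n_states)])
    for obs in observations[1:]:
        predicted = [sum(belief[i] * A[i][j] for i in range(n_states))
                     for j in range(n_states)]
        belief = normalise([predicted[j] * B[j][obs] for j in range(n_states)])
    # One step ahead: state distribution at T+1, then expected emissions.
    next_state = [sum(belief[i] * A[i][j] for i in range(n_states))
                  for j in range(n_states)]
    n_obs = len(B[0])
    next_obs = [sum(next_state[j] * B[j][o] for j in range(n_states))
                for o in range(n_obs)]
    best = max(range(n_obs), key=next_obs.__getitem__)
    return best, next_obs
```

In the setting described here, states would correspond to strategies and observations to exchange decisions; the integer indices stand in for those labels.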
For each participant in the VIAPPL experiments, a hidden Markov model takes the previous strategies \(S\) as states and the previous exchange decisions \(O\) as observations, as shown in Fig. 6. It then predicts the next exchange decision of the participant. Thus, we used a time-homogeneous hidden Markov model that uses round-forward chaining time-series cross-validation for training and testing. As shown in Table 8, round-forward chaining starts by using data from rounds 1 to \(r\) to train the hidden Markov model, which is tested by predicting the exchange decisions in round \(r + 1\). Next, it includes the prediction from round \(r + 1\) in the training set and predicts round \(r + 2\). This process continues until the last round is predicted. The hidden Markov model is retrained after each round of the game, and the transition and emission probabilities change per round. This process was implemented to accommodate changes over time in individuals’ strategies.
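The round-forward chaining scheme can be sketched as a generator of training and test rounds. The function name and the `first_train_rounds` parameter are hypothetical; the sketch only enumerates which rounds are used at each step and leaves model fitting aside.

```python
def round_forward_chaining(n_rounds, first_train_rounds=1):
    """Yield (training_rounds, test_round) pairs for rounds 1..n_rounds.

    Each step trains on rounds 1..r and tests on round r + 1, then grows
    the training window by one round.
    """
    for r in range(first_train_rounds, n_rounds):
        yield list(range(1, r + 1)), r + 1
```

With five rounds and an initial window of two rounds, this yields `([1, 2], 3)`, `([1, 2, 3], 4)` and `([1, 2, 3, 4], 5)`.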
In each round of the forward chaining time-series cross-validation, the hidden Markov model is trained using the Baum-Welch expectation-maximisation algorithm described in the seminal work of Rabiner 47. Given the exchange decisions and the strategies, training the hidden Markov model implies finding the parameters that would make the exchange decisions most likely. This is also known as parameter estimation.
Using sample observations from five participants, Table 9 shows the hidden Markov model evaluation. The hidden Markov model is evaluated using the average accuracy score per round in the round-forward chaining. That is, for each participant in each round, the hidden Markov model predicts the exchange decision (in the test set) of the participant. The average accuracy score per round is the number of exchange decisions correctly predicted divided by the number of participants in that round. The hidden Markov model is also evaluated on its accuracy in predicting the actions that make up the exchange decisions. The individual and combined evaluation are crucial for the application of the model, as it presents the opportunity to plan interventions based on one or more actions during interactive social exchanges.
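The combined and per-action accuracy scores can be illustrated as follows. This sketch assumes that each exchange decision is encoded as a string with one character per action, in line with the three-character depiction mentioned earlier; the function names and the example encoding are illustrative assumptions.

```python
def round_accuracy(actual, predicted):
    """Share of participants whose whole exchange decision was predicted."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


def per_action_accuracy(actual, predicted):
    """Accuracy for each action (character position) taken separately."""
    n_actions = len(actual[0])
    return [
        sum(1 for a, p in zip(actual, predicted) if a[k] == p[k]) / len(actual)
        for k in range(n_actions)
    ]
```

For actual decisions `["101", "000", "110"]` and predictions `["101", "010", "100"]`, only one full decision matches (accuracy 1/3), while the first and third actions are each predicted correctly for every participant.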
The data is imbalanced, meaning that certain behaviours occur less often than others. For example, out-group allocation occurs less often than in-group allocation. For this reason, sensitivity and specificity scores are included to measure the performance of the model more accurately. The sensitivity of the model is the percentage of the definite behaviours that are predicted as present among those that are truly present, whereas the specificity of the model is the percentage of the definite behaviours that are predicted as absent among those that are truly absent 48.
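A minimal sketch of these two scores for a single definite behaviour, coded as 1 (present) or 0 (absent), is shown below. The function name and the convention of returning 0 when a denominator is empty are assumptions.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels 1/0."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # present, found
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # present, missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # absent, found
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # absent, missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity
```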