2.1 Literature review
In recent years, the role of social processes in the formation of networks has become a growing area of research (Leung, 2019). In their seminal study, Bala and Goyal (2000) examined strategic tie formation in the presence of perfect information. Tie formation is an important area of research given the heterogeneity of actors (Goldsmith-Pinkham & Imbens, 2013). While some studies have shown that differences in productivity levels among actors may lead to homophily (Jackson & Xing, 2014; McPherson et al., 2001), the presence of differences in the endowment of actors may also facilitate exchange and encourage heterophily (Johnson et al., 2009; Kimura & Hayakawa, 2008; Rogers & Bhowmik, 1970). In general, however, imperfect information, asymmetric linking opportunities and the presence of existing ties among some actors may constrain the preferential attachment that optimises individual pay-offs (Goeree et al., 2009). Lacking information, actors joining a network may perform local searches in the neighbourhood of randomly met peers (Jackson & Rogers, 2007). Alternatively, actors may rely on other cues for selecting peers (Biggs et al., 2002; Biggs & Shah, 2006; Kossinets & Watts, 2009; C. Zhang et al., 2017); such focal play may lead to the evolution of cultures or conventions (Jackson & Xing, 2014; Schelling, 1960). A recent study reports that incomplete information may lead to the development of biased perceptions about potential peers, so that individual utility maximisation may lead to group segregation, reducing social welfare (Zhang & Carver, 2022). In dynamic games, however, repeated interaction enables agents to observe their potential peers over time and assess their productivity, trustworthiness and other relevant attributes; it also enables actors to revise their assessment of which actors to form ties with (Chen et al., 2014; Roth & Schoumaker, 1983).
Star networks may develop in such situations, based on a single high value agent, with the centrality, stability and efficiency of the network increasing over time (Goeree et al., 2009).
Existing studies on tie formation among heterogeneous actors with incomplete information are either based on a theoretical structure (Bala & Goyal, 2000; Dutta & Jackson, 2003; Jackson, 2010; Jackson & Wolinsky, 1996), or behaviour of agents in laboratory settings (Goeree et al., 2009). Empirical studies based on the actual behaviour of communities are rare, and mostly use either an endogenously formed network (Sacerdote, 2001), or large data sets containing information on existing ties (Chandrasekhar & Jackson, 2018; Jain & Kapoor, 2015; Jain & Langer, 2019; Patacchini et al., 2017). In contrast, longitudinal studies of the formation of networks are rare because of the practical difficulties of studying networks over the entire duration of interaction. Fresh entry and attrition are other constraints (Biggs et al., 2002; Greif, 1989; Kali, 1999).
2.2 Objective and hypothesis
The present study uses data collected from postgraduate students in a single department of an Indian university to examine the evolution of ties formed for academic purposes over the first three semesters of the programme.[1] Studying student networks in non-residential academic institutions has some advantages. Given the short duration of the course, the stability in network size, the heterogeneity with respect to academic merit and the availability of information about actors in the initial stages of interaction,[2] the study of ties between students provides an opportunity to examine how heterogeneity and lack of information jointly shape the evolution of the network and its characteristics over time.
The present study argues that the absence of information about the productivity (in terms of academic merit) of each actor constrains optimal tie formation in the first semester.[3] Since there has to be some basis for choosing peers, ties will initially be formed on the basis of easily observed markers such as the undergraduate college (whether or not the student studied at Presidency). In this phase, class representatives will be the key players in the network. Subsequently, as actors gain information about each other, ties will form on the basis of academic performance in the previous semester. High value actors will emerge with high centrality scores; they will compete with, and even displace, the class representatives as the key players.[4] Simultaneously, given that academically strong students gain by interacting among themselves rather than with weaker students, polarisation is expected to occur within the network. It will be reflected in increasing homophily.
The hypotheses of this study are:
1. Over time the density, centrality, compactness and connectivity of the network will increase (H1);
2. Ties will be formed selectively, leading to polarisation and the emergence of marks-based homophily (H2);
3. The number of high value actors will increase over time (H3); and
4. The network will become more efficient and stable over time (H4).
2.3 Data
The data was collected from students enrolled in the Masters of Applied Economics course of the Economics Department of Presidency University, Kolkata, in 2020. The postgraduate programme comprises four semesters over a period of two years. The first semester normally starts in July and ends in December. However, for the 2020 intake, the course started in January 2021 as a result of the Covid-19 pandemic. The first two semesters had to be completed within the first six months due to the delay in the academic schedule (they ran from January to March, and April to July). The third semester commenced in mid-August and continued until January 2022.
The data was collected using Google Forms, where the students were asked to report the names of the classmates whom they approached for academic help. Such data was collected after the end of each semester to study the evolution of the network and assess the corresponding changes in the network position of each actor. In addition, after informing the students, information about their gender, caste, religion, undergraduate college (that is, whether Presidency University or elsewhere) and Semester Grade Point Average (SGPA) was extracted from the records. Informed verbal consent of the students was obtained and the data was anonymised.
The number of students was 21. The majority of the students were female, Hindu, from the general social category, and had completed their undergraduate degree outside Presidency University (Table 1).
Table 1
| Characteristics | Number | Percentage |
|---|---|---|
| Gender | | |
| Male | 10 | 47.62 |
| Female | 11 | 52.38 |
| Religion | | |
| Hindu | 18 | 85.71 |
| Muslim | 3 | 14.29 |
| Social category | | |
| General | 11 | 52.38 |
| Disadvantaged | 10 | 47.62 |
| Undergraduate college | | |
| Presidency University | 8 | 38.10 |
| Other universities | 13 | 61.90 |
Source: Estimated from data
2.4 Methodology
This study uses standard measures like degree, density, reachability, distance and centrality to capture the main network characteristics.
The number of actors in the network is its size. Given a network of n actors, there are potentially n×(n-1) directed ties. The density of a network is the ratio of the actual number of ties to the potential number of ties. The average number of connections each node has with other nodes is the average degree of the network.
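As a minimal sketch of these definitions, the following computes density and average degree from a list of directed ties. The function name, the integer actor labels and the four-actor example are hypothetical, and the average degree is taken here to be the average out-degree, which is an assumption for directed data.

```python
def density_and_average_degree(n, ties):
    """Return (density, average out-degree) of a directed network.

    n    : number of actors (network size)
    ties : iterable of (i, j) pairs meaning "i approached j"
    """
    # Keep each directed tie once and ignore self-ties.
    distinct = {(i, j) for i, j in ties if i != j}
    potential = n * (n - 1)          # potential number of directed ties
    density = len(distinct) / potential
    average_degree = len(distinct) / n
    return density, average_degree


# Hypothetical four-actor network (not the study data).
ties = [(0, 1), (0, 2), (1, 2), (3, 0)]
density, avg_degree = density_and_average_degree(4, ties)
print(density, avg_degree)
```

With four actors there are 4 × 3 = 12 potential directed ties, so four observed ties give a density of one third.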
If reachability is high in a network, information will be transmitted throughout the network regardless of the point of its origin. An actor is reachable by another if we can trace a set of connections from the origin (i.e., the source) to the target actor, irrespective of the number of actors in between them. However, even if an actor might successfully communicate with another, the connection between them may be weak. By weak ties we mean that there are few pathways connecting two actors. In such cases, actors have a low connectivity, which means that there are few routes for information to travel from one actor to the next.
A common approach to capturing how actors are embedded in a network is to examine the distance between them. If two actors are adjacent, then the distance between them is one. The distance among the actors is an important macro-characteristic of the network as a whole. The most commonly used definition of distance between two actors is the geodesic distance, that is, the length of the shortest path between them. The largest geodesic distance from an actor to any other actor is called the eccentricity of that actor, and the largest eccentricity in the network is the diameter of the network.
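Reachability, geodesic distance, eccentricity and diameter can all be obtained from a breadth-first search. The sketch below is illustrative only: the function names and the four-actor path network are hypothetical, and unreachable actors are simply left out of the distance table.

```python
from collections import deque


def geodesic_distances(adjacency, source):
    """Breadth-first search: geodesic distance from source to each reachable actor."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour not in distances:      # first visit gives the shortest path
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    return distances


def eccentricity(adjacency, actor):
    """Largest geodesic distance from actor to any actor it can reach."""
    return max(geodesic_distances(adjacency, actor).values())


def diameter(adjacency):
    """Largest eccentricity over all actors."""
    return max(eccentricity(adjacency, actor) for actor in adjacency)


# Hypothetical undirected path network A - B - C - D.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(geodesic_distances(path, "A"))
print(eccentricity(path, "B"), diameter(path))
```

An actor j is reachable from i exactly when j appears in `geodesic_distances(adjacency, i)`.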
While the centrality of an actor assesses the role of each agent and identifies their importance in the network, the average centrality score of the network indicates whether the actors are becoming more powerful over time. Centrality is estimated using degree, betweenness and closeness. Degree centrality is a measure of an actor's level of involvement or activity in the network. Specifically, it measures the number of ties an actor has. Here we use out-degree centrality, which measures the number of alters from whom an actor sought information:
\(C_D(i) = \sum_{j=1}^{n} x_{ij}\) [1]
where \(x_{ij}\) is the value of the tie from actor i to actor j (either 0 or 1) and n is the number of nodes in the network. Betweenness centrality deals with an actor's position in the entire network. It counts how many times an actor lies on a geodesic, that is, the shortest path linking two other actors (Cook et al., 1983; Freeman, 1978):
\(C_B(k) = \sum \frac{\partial_{ikj}}{\partial_{ij}}, \quad i \neq j \neq k\) [2]
where \(\partial_{ikj}\) is the number of geodesics linking actors i and j that pass through node k, and \(\partial_{ij}\) is the total number of geodesics linking actors i and j. Closeness is another measure of centrality that considers the entire network. It emphasises the actor's ability to access information in the network easily (Leavitt, 1951), as well as the actor's power (Coleman, 1973) and influence (Friedkin, 1991). Here, closeness is measured as the sum of the distances from an actor to all other actors:
\(C_c(i) = \sum_{j=1}^{n} d_{ij}\) [3]
where \(d_{ij}\) is the distance connecting actor i to actor j.
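The three centrality measures in equations [1] to [3] can be sketched in a few lines. This is an illustration under stated assumptions rather than the software routine used in the study: the function names and the four-actor network are hypothetical, ties are directed (i → j means i sought help from j), and actors that cannot be reached are left out of the closeness sum.

```python
from collections import deque


def bfs_counts(adjacency, source):
    """Return (distance, number of geodesics) from source to each reachable actor."""
    distance = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                paths[neighbour] = 0
                queue.append(neighbour)
            if distance[neighbour] == distance[current] + 1:
                # current lies on a shortest path to neighbour
                paths[neighbour] += paths[current]
    return distance, paths


def out_degree(adjacency, i):
    """Equation [1]: number of alters that actor i approached."""
    return len(set(adjacency[i]))


def closeness_sum(adjacency, i):
    """Equation [3]: sum of geodesic distances from i (reachable actors only)."""
    distance, _ = bfs_counts(adjacency, i)
    return sum(distance.values())


def betweenness(adjacency, k):
    """Equation [2]: share of i-j geodesics passing through k, summed over pairs."""
    info = {a: bfs_counts(adjacency, a) for a in adjacency}
    dist_k, paths_k = info[k]
    total = 0.0
    for i in adjacency:
        dist_i, paths_i = info[i]
        if i == k or k not in dist_i:
            continue
        for j in adjacency:
            if j in (i, k) or j not in dist_i or j not in dist_k:
                continue
            # k is on an i-j geodesic only if the two legs add up exactly.
            if dist_i[k] + dist_k[j] == dist_i[j]:
                total += paths_i[k] * paths_k[j] / paths_i[j]
    return total


# Hypothetical directed network (not the study data).
network = {"A": ["B", "C"], "B": ["C"], "C": ["D"], "D": []}
print(out_degree(network, "A"), closeness_sum(network, "A"), betweenness(network, "C"))
```

In this example actor C lies on the only geodesics from A to D and from B to D, so its betweenness is 2, while B lies on no geodesic between other actors.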
Substructures within the network are analysed using the clustering coefficient and E-I index. While the overall graph clustering coefficient is simply the average of the densities of the neighborhoods of all of the actors, the weighted version attaches a weight to the neighborhood densities proportional to their size (Watts, 1999).[5] The E-I index is used to measure the degree of homophily (Krackhardt & Stern, 1988). It is given by:
E-I index = \(\frac{T_E - T_I}{T_E + T_I}\) [4]
where \(T_E\) is the number of external ties (ties between actors in different groups) and \(T_I\) is the number of internal ties (ties between actors in the same group).
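A small worked example of equation [4] may help. The group labels and ties below are hypothetical; a tie counts as internal when both actors carry the same label.

```python
def ei_index(ties, group):
    """E-I index: (external - internal) / (external + internal)."""
    external = sum(1 for i, j in ties if group[i] != group[j])
    internal = sum(1 for i, j in ties if group[i] == group[j])
    return (external - internal) / (external + internal)


# Hypothetical grouping by academic performance (not the study data).
group = {"A": "high", "B": "high", "C": "low", "D": "low"}
ties = [("A", "B"), ("B", "A"), ("C", "D"), ("A", "C")]
print(ei_index(ties, group))
```

Here three ties are internal and one is external, giving (1 - 3) / (1 + 3) = -0.5. The index ranges from -1 (all ties internal, complete homophily) to +1 (all ties external).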
Sociograms—visual depictions of ties linking actors in a network—are used to examine the nature of ties formed, particularly whether a star formation develops. They also identify key players. In addition, the centrality scores of each actor and their scores as hubs and authorities (Kleinberg, 1999) are estimated. Authorities are actors who can directly provide information; on the other hand, hubs are actors who do not have the information themselves but can either direct other actors to the best source of information or provide the information after collecting the same from another actor.
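Hub and authority scores are usually obtained by the iterative procedure described by Kleinberg (1999), in which an actor's authority score is the sum of the hub scores of those who point to it, and its hub score is the sum of the authority scores of those it points to. The sketch below is a plain power iteration under assumed settings: the function name, the fixed number of iterations and the example network are hypothetical and are not taken from the study.

```python
def hubs_and_authorities(adjacency, iterations=50):
    """Iteratively compute (hub, authority) scores for a directed network."""
    actors = list(adjacency)
    hub = {a: 1.0 for a in actors}
    authority = {a: 1.0 for a in actors}
    for _ in range(iterations):
        # Authority: sum of the hub scores of actors pointing to this actor.
        authority = {a: 0.0 for a in actors}
        for i in actors:
            for j in adjacency[i]:
                authority[j] += hub[i]
        # Hub: sum of the authority scores of the actors this actor points to.
        hub = {i: sum(authority[j] for j in adjacency[i]) for i in actors}
        # Normalise both score vectors to unit length so they stay bounded.
        norm_a = sum(v * v for v in authority.values()) ** 0.5 or 1.0
        norm_h = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        authority = {a: v / norm_a for a, v in authority.items()}
        hub = {a: v / norm_h for a, v in hub.items()}
    return hub, authority


# Hypothetical network: i -> j means i sought information from j.
network = {"A": ["C"], "B": ["C"], "C": [], "D": ["C", "A"]}
hub, authority = hubs_and_authorities(network)
print(max(authority, key=authority.get), max(hub, key=hub.get))
```

In this example C is approached by everyone else and emerges as the top authority, while D, who approaches both C and A, has the highest hub score.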
The efficiency of the network was analysed using the average of the mean ranks of egos of every actor:
\(\sum_{i=1}^{21} \sum_{j=1}^{n_i} R_{ij}\) [5]
where \(R_{ij}\) is the rank of ego j (j = 1, …, \(n_i\)) for actor i. A decline in the average rank over time implies that actors are choosing their peers more efficiently.
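One way to read the efficiency measure is as the average, over actors, of the mean rank of the peers each actor approached. The sketch below follows that reading. The normalisation by the number of peers and the number of actors, the convention that rank 1 is the best-performing student, and the skipping of actors who approached nobody are all assumptions, and the names and data are hypothetical.

```python
def average_mean_rank(choices, rank):
    """Average over actors of the mean rank of the peers each actor approached."""
    mean_ranks = []
    for actor, peers in choices.items():
        if peers:  # assumption: actors who approached nobody are skipped
            mean_ranks.append(sum(rank[p] for p in peers) / len(peers))
    return sum(mean_ranks) / len(mean_ranks)


# Hypothetical ranks (1 = best) and choices, not the study data.
rank = {"A": 1, "B": 2, "C": 3, "D": 4}
choices = {"A": ["B"], "B": ["A"], "C": ["A", "B"], "D": ["C"]}
print(average_mean_rank(choices, rank))
```

The four mean ranks are 2, 1, 1.5 and 3, so the average is 1.875; a lower value in a later semester would indicate that actors are approaching better-ranked peers.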
The stability of the network is examined using stability and expansion ratios. They are defined as follows:
Stability ratio = \(\frac{\text{Ties retained from previous semester}}{\text{Ties in previous semester}}\) [6]
Expansion ratio = \(\frac{\text{Ties newly formed in current semester}}{\text{Ties in previous semester}}\) [7]
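Equations [6] and [7] reduce to set operations on the ties observed in two consecutive semesters. The following is a minimal sketch with hypothetical tie lists; ties are treated as directed pairs.

```python
def stability_and_expansion(previous_ties, current_ties):
    """Return (stability ratio, expansion ratio) between two semesters."""
    previous, current = set(previous_ties), set(current_ties)
    retained = previous & current      # ties present in both semesters
    newly_formed = current - previous  # ties that appear only in the current one
    return len(retained) / len(previous), len(newly_formed) / len(previous)


# Hypothetical ties in two consecutive semesters (not the study data).
semester_1 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
semester_2 = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "B"), ("C", "A")]
stability, expansion = stability_and_expansion(semester_1, semester_2)
print(stability, expansion)
```

Two of the four earlier ties are retained and three new ties are formed, giving a stability ratio of 0.5 and an expansion ratio of 0.75.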
Fragmentation is a measure that helps to identify how many nodes are not able to interact among themselves if a certain actor is removed from the network. The mean fragmentation score also indicates the stability of the network.
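The exact fragmentation formula is not spelled out here, so the sketch below assumes one common operationalisation: the proportion of ordered pairs of remaining actors that cannot reach each other once a given actor is removed. The function names and the star-shaped example are hypothetical.

```python
from collections import deque


def reachable_from(adjacency, source, removed):
    """Actors reachable from source when the removed actor is deleted."""
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour != removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def fragmentation_after_removal(adjacency, removed):
    """Share of ordered pairs of remaining actors that are disconnected."""
    remaining = [a for a in adjacency if a != removed]
    pairs = len(remaining) * (len(remaining) - 1)
    unreachable = 0
    for source in remaining:
        # reachable_from includes the source itself, so the difference
        # counts the other remaining actors the source cannot reach.
        unreachable += len(remaining) - len(reachable_from(adjacency, source, removed))
    return unreachable / pairs


# Hypothetical undirected star: H is the hub, A, B and C are the leaves.
star = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H"]}
print(fragmentation_after_removal(star, "H"), fragmentation_after_removal(star, "A"))
```

Removing the hub disconnects every pair of leaves (score 1.0), while removing a leaf leaves the rest fully connected (score 0.0), which shows why a star network depends so heavily on its central actor.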
[1] In the fourth semester students undertook projects, which limited their interaction; so ties in the last semester were not studied.
[2] Some students had studied in the University for their Bachelor's programme; such individuals have inherited ties among themselves. The network also comprises actors who are strangers to each other as they have joined the University from other institutes.
[3] The lack of information is compounded by the online nature of interaction in the first semester. The difficulties of forming ties in online student communities have been discussed in Namhata et al. (2022).
[4] The two class representatives were selected through a process of self-nomination, followed by voting by the students. Though not a rule, students were advised to select one representative from among those who had been undergraduate students of the University and another from among the new entrants to the institute.
[5] It implies that actors with larger neighborhoods get more weight in computing the average density.