Exploring Intelligent Approach of Influence Minimization considering the Node Surveillance in Online Social Networks

Attractive information such as innovations, awareness campaigns, branding, and advertising benefits people, whereas awful information such as rumors, malicious viruses, pornography, and revenge disturbs them. Negative information contributes to chaos among people; therefore, it must be blocked and hindered from further diffusion. This has motivated us to study the problem named influence minimization. As a real-world network can be modeled as a multilayer network, we focus our study on information diffusion through a multilayer network. Each node is assigned a threshold, and its variation affects the rate of influence propagation across the network. In the influence minimization problem, the energy level of each node changes, which helps to formulate the function that minimizes the influence propagation. By applying two reduction policies, we are able to optimize our objective of minimizing the influence of repulsive information. In this article, we consider the user response and its surveillance in the network. Repeated experiments on real networks have helped us to validate the proposed methods.

Online social networks provide gratifying support for sharing information among people globally. Some examples of social networks are Facebook, Twitter, Instagram, Google+, and LinkedIn. Social networks act as a medium for propagating information across the world. The information includes both positive information, such as novel ideas, interesting news, and advertising, and negative information, such as disinformation, hateful rumors, and malicious viruses. Positive information should propagate along the length and breadth of the network and reach most of the entities; this problem is named influence maximization [7], [4]. It is commonly observed that negative information needs minimal effort to spread widely. This calls for a design strategy for diminishing the advance of influence, called influence minimization [31], [33].
Influence minimization intends to block or remove nodes that are capable of spreading negative information in each social community. The impact can be restricted by deleting edges to block malicious nodes. Minimizing the propagation of awful information in a networked graph is a challenging issue. Influence minimization is accomplished by analyzing the entropy of each node capable of inflating the spread. In general, influence minimization works by cutting edges so that nodes become disconnected, no matter where the nodes are or how much capacity they have to carry the information. Thus we can attain our primary objective of protecting society from untruthful facts and troubles.
Influence minimization finds applications across various domains such as social media, epidemiology, security, and public health. Here we coin a novel concept for the problem and propose the term entropy-based Influence Minimization (e-IM). The seed nodes (k) [8] start propagating information, following the process of influence maximization [7]. The e-IM model selects m nodes that tend to propagate negative information. This set of nodes that propagate awful information is called the set of malicious nodes. The nodes between the seed set and the malicious set are removed from the network, so that the subgraphs that link to malicious nodes are temporarily disconnected. Immunization is similarly an efficient process for controlling diseases by isolating the infected individuals; the same idea restricts the spreading of rumors to a limited area and prevents infection by a computer virus.
The surveillance nodes are the optimal subset of nodes that advertise the spreading information. When the transfusion of rumors or viruses is reported, it helps to shape the social network based on an epidemic outbreak. A function of the energy level of the infected nodes forms the subset of surveillance nodes. Hence, it affects the estimation of the source and the outbreak path. This helps in early detection of the spreading of awful information. Therefore, designing a minimization algorithm contributes to maximizing the accuracy of the surveillance function, which leads to social well-being by preventing unwanted information.
In this article, we study the influence minimization problem with node surveillance. It is carried out as a function of the source of information. When a node receives information, its threshold energy increases [6]. Energy spikes are measured in terms of entropy. The influence maximization algorithm generates a seed set (k) for maximizing the information spread explicitly. The e-IM algorithm maximizes the threshold function and forms a subset of nodes called malicious nodes (m). Applying Tarjan's algorithm, we identify the articulation nodes. Let p be the number of articulation nodes. We design a k-m policy that removes all common nodes from the graph so that the two sets are disjoint. We further verify that the sets p and k-m are disjoint. This helps to improve the performance of our method.
The k-m policy and the ((k-m)-p) policy improve our algorithm and reduce the influence to a great extent. In both cases, the principle of disjoint subsets improves the e-IM problem and helps to attain a running time linear in the network size. To show the reliability and effectiveness of the e-IM algorithm, we conduct experiments on real-world networks with the help of a Python-based simulator. Our experiments indicate that the proposed algorithm and policies outperform the state-of-the-art methods we compare against, and that the algorithm is faster than methods of a similar kind. To summarize, the major contributions of this article are:
-We propose a novel influence minimization algorithm called e-IM.
The rest of this paper is organized as follows. Section 2 reviews related work in the field of influence minimization and gives an insight into the literature. Section 3 describes the background concepts needed to understand the proposed methods. Section 4 presents the problem and the proposed procedure to downsize the influence of information diffusion in social networks, and Section 5 describes its validation. Section 6 illustrates the results. Finally, the article is concluded in Section 7.

Related Work
There are two broad types of information propagating across social networks. One category is positive information that is beneficial to the community, whereas the other is malicious information. The first category should propagate widely, whereas the spread of the other should be minimized. Influence maximization and influence minimization are the two corresponding research problems in social networks.
The influence minimization problem reduces the propagation of rumors or disinformation by blocking nodes from a topic modelling perspective [34]. When undesirable events propagate in a social network, the size of the infected volume is reduced by blocking some nodes outside the infected area. This optimization problem makes use of HDA-LDA and KL to analyze the influence in topic modeling under the independent cascade model [2]. The topic-aware influence minimization approach is based on betweenness centrality and the concept of out-degree. We observed that this approach is better than the other centrality-based approaches, but mainly at the beginning of the contamination.
Targeted influence minimization is intended to minimize the influence of negative information on particular classes of user groups in social networks [31]. The algorithm focuses on two cases of the influence minimization problem: the first is the impact of budget, and the second is robust sampling [21], [17]. The algorithm provides an optimal solution and a greedy approximation. Neither is appropriate for large dynamic networks such as online social networks. The robust sampling method applies to real social networks and guarantees an effective solution. The sampling-based solution covers the maximum area where the information spreads, but the method is less efficient when nodes carrying awful information are added incrementally.
Another approach is the minimization of rumor spreading by considering user experience dynamically [30]. The method disconnects a subset of the nodes so that malicious rumors can be blocked. The constraint on user experience is also taken care of by the minimization problem. A threshold, called the time-stamp or blocking time, is applied to each node. If the blocking time of a user exceeds the threshold, the utility of the network system declines. Based on these restrictions, a problem has been defined and a solution proposed based on survival theory and the maximum likelihood principle [26]. The popularity of rumors and user inclination towards them are analyzed through the Ising model [12]. The algorithm becomes less accurate as the structure of the network becomes more complex. If each node is examined closely, its surveillance can be incorporated to enhance the effectiveness of influence minimization.
The influence minimization problem has also been studied in the Linear Threshold (LT) model [32], [33]. The model consists of directed graphs where each node can be in one of two states: active or inactive. As information propagation progresses, the status of a node can change from inactive to active. Two scenarios for influence minimization have been analyzed. The first is to cut the supply of products to distributors to minimize the sale of faulty products. The second is to block the information at the node level to minimize further propagation. An integer programming formulation yields an optimal solution for both cases. However, the optimal solution fails to scale to large graphs or multilayer graphs.
The study of influence minimization in online social networks is challenging. When we consider real-world networks, layers are arranged in the form of a stack to build a model. The information diffusion across the layers has to be minimized. We observe that none of the present methods is capable of minimizing the influence spread in dynamic complex networks. This has motivated us to design an effective method that blocks widespread rumors and awful information.

Preliminaries
In this section, we discuss some basic concepts of the mathematical representation of complex networks, the diffusion model, and influence minimization. Here we consider multilayer networks. These concepts help the reader to understand the proposed work.

Multilayer Networks
A monolayer network is represented as a graph given by the tuple $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges [23]. This mathematical representation can easily be extended to a network with multiple layers. A multilayer graph is expressed as a quadruple, where $V$ is the set of $N$ vertices and $L$ is the set of elementary layers [27].
In the layer $L^{d}_{m}$, $m$ is the total number of layers and $d$ is the aspect of the network [20]. The notation $L = \{L^{d}_{m}\}_{m=1}$ stands for the stack of layers in the network described through the $d$ aspect. We can build up the layers of a multilayer network by integrating the set of all possible combinations of elementary layers. Each layer $\alpha$ has its own set of vertices, and the interconnection edges between layers $\alpha$ and $\beta$ form the set $E_{\alpha\beta}$.

Supra-Adjacency Matrix
The adjacency matrix of each layer $G^{[\alpha]}$ is represented as $A^{[\alpha]} = (a^{[\alpha]}_{ij})$, where $1 \le i, j \le N^{[\alpha]}$ and $1 \le \alpha \le m$. Each element $a^{[\alpha]}_{ij} = 1$ if and only if there is an edge between $i$ and $j$ in $G^{[\alpha]}$. It is a symmetric matrix of order $N^{[\alpha]}$, the number of nodes in layer $\alpha$. The cross-layer (inter-layer) adjacency matrix corresponding to $E_{\alpha\beta}$ is the matrix $A^{[\alpha\beta]} = (a^{\alpha\beta}_{ij})$. The coupling matrix $C = \{C_{ij}\}$ of the coupling graph $G_c$ is an $N \times N$ matrix with $C_{ij} = 1$ if and only if $i$ and $j$ are the same node in different layers [10].

Table 1: Supra-adjacency matrix for the multiplex network shown in Fig. 1.

The supra-Laplacian matrix can be derived from the supra-adjacency matrix. It represents the multilayer graph and discloses major spectral properties such as the coupling strength between layers and the dynamic diffusion process [13]. The supra-Laplacian matrix is $\bar{L} = \bar{D} - \bar{A}$, where $\bar{D}$ is the diagonal (degree) matrix and $\bar{A}$ is the supra-adjacency matrix [11].

An additional parameter $\theta$ is the threshold value, $\theta_i \in (0, 1]$, and the function $W$ assigns an influence weight $W(i, j) \in (0, 1]$ to each edge $(i, j) \in E$ [19]. As this model scales to multilayer networks, the tuple is extended with further elements to represent multilayer networks. The function assigns a weight to each edge, either within a layer or between layers. The weighted supra-adjacency matrix is formed for simulation purposes. The in-neighbor set and out-neighbor set of each node are defined, and an edge $(i, j)$ represents that node $j$ can be influenced by node $i$, where $i$ and $j$ are nodes at any layer [25].
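As an illustration of the supra-adjacency and supra-Laplacian matrices discussed in this section, the following is a minimal sketch in plain Python. It assumes a multiplex network in which every layer contains the same N nodes and in which replicas of the same node are coupled with unit weight; the function names `supra_adjacency` and `supra_laplacian`, the `coupling` parameter, and the two toy layers are hypothetical and are not taken from the simulator used in this article.

```python
def supra_adjacency(layers, coupling=1):
    """Build a supra-adjacency matrix for a multiplex network.

    layers: list of N x N symmetric 0/1 adjacency matrices, one per layer.
    Assumption of this sketch: all layers share the same N nodes.
    """
    m, n = len(layers), len(layers[0])
    size = m * n
    supra = [[0] * size for _ in range(size)]
    # Diagonal blocks: intra-layer adjacency matrices.
    for a, layer in enumerate(layers):
        for i in range(n):
            for j in range(n):
                supra[a * n + i][a * n + j] = layer[i][j]
    # Off-diagonal blocks: couple the same node across different layers.
    for a in range(m):
        for b in range(m):
            if a != b:
                for i in range(n):
                    supra[a * n + i][b * n + i] = coupling
    return supra


def supra_laplacian(supra):
    """Return D - A, where D is the diagonal matrix of row sums of A."""
    size = len(supra)
    lap = [[-supra[i][j] for j in range(size)] for i in range(size)]
    for i in range(size):
        lap[i][i] += sum(supra[i])
    return lap


# Toy multiplex network: three nodes, two layers (hypothetical data).
layer_1 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
layer_2 = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
A = supra_adjacency([layer_1, layer_2])
L_bar = supra_laplacian(A)
```

Each row of `L_bar` sums to zero, which is a quick sanity check that the degree matrix and the supra-adjacency matrix are consistent.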

Linear Threshold Model
The threshold matrix is represented by $S = \mathrm{diag}\{[\theta_1, \theta_2, \ldots, \theta_n]\}$. The diagonal elements of the matrix are the thresholds of the respective nodes. Let $(M_{live})_{t=0}$ be the active set of nodes at $t = 0$; nodes change from the inactive state to the active state at every step, so that $(M_{live})_t = \bigcup_{x=0}^{t} (M_{live})_x$. The process continues until all nodes reach the active state.
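The diffusion process under the linear threshold model can be sketched as follows. This is a minimal illustration, not the simulator used in this article: the function name `linear_threshold`, the dictionary layout of the weights, and the toy graph are assumptions. The sketch also stops as soon as a step activates no new node, which is the usual stopping rule for the LT model and covers the case where some nodes never reach their threshold.

```python
def linear_threshold(in_weights, thresholds, seeds):
    """Simulate linear threshold diffusion.

    in_weights[j] maps each in-neighbour i of j to the weight W(i, j).
    thresholds[j] is the threshold theta_j of node j.
    Returns the list of cumulative active sets, one per time step.
    """
    active = set(seeds)
    history = [set(active)]
    while True:
        newly_active = set()
        for j, theta in thresholds.items():
            if j in active:
                continue
            # Total incoming weight from neighbours that are already active.
            total = sum(w for i, w in in_weights.get(j, {}).items() if i in active)
            if total >= theta:
                newly_active.add(j)
        if not newly_active:
            break  # no further activation is possible
        active |= newly_active
        history.append(set(active))
    return history


# Toy directed graph (hypothetical weights and thresholds).
in_weights = {"b": {"a": 0.6}, "c": {"a": 0.3, "b": 0.4}, "d": {"c": 0.2}}
thresholds = {"a": 0.5, "b": 0.5, "c": 0.6, "d": 0.5}
history = linear_threshold(in_weights, thresholds, seeds={"a"})
```

In this toy run node b is activated by a alone, node c needs both a and b, and node d is never activated because its only incoming weight stays below its threshold.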

Entropy-Based Influence Minimization
In this section, we define influence minimization in regular networks using the linear threshold model. We consider the public interest in the information and the behavioural changes of users who receive the information. The physiological changes associated with emotion lead to a change in the energy level and the activity of each user. These emotional changes are analyzed through the LT model. The goal of the model is to minimize the influence of awful information: the number of activated nodes needs to be minimal at the final stage of the information diffusion under node surveillance. We propose a survival model in which a node $v$ is activated according to the weighted sum of the probabilities of the previously activated nodes. The proposed algorithm blocks the active nodes newly generated in the previous state, $t_{n-1}$. The propagation of negative information causes a social contagion process [18]. People's interest in such information tends to shrink with time. When a node receives a rumor, its energy level increases [15]. The difference in energy level is measured in terms of entropy and is computed by the equation $\Delta E = c\, m_p\, \Delta T$, where $\Delta T$ is the change in heat energy, $c$ is a constant, and $m_p$ is the m-PageRank, which shows the energy distribution capability [9]. Once a node is activated, it transmits the information to its neighbours. Influence minimization is achieved either by blocking the user from further transmission through an annealing process or by removing the link between infected nodes [3]. Removing a link can isolate one group of nodes completely. The influence minimization process is depicted in Fig. 2. The removal of an edge, process (c), is possible only when the infected node is an articulation node. If the blocked time exceeds a threshold, the user either leaves the social network or gives intimation about the status. The system therefore has to unblock the user or restore the link after a certain time-stamp.
The time-stamp depends on how long the negative information lives in the system. The tolerance for latency is a major issue in the influence minimization problem.
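The energy computation and the time-stamp rule of this section can be illustrated with a short sketch. The function names `energy_change` and `release_expired`, the parameter `max_block_time`, and the numeric values are hypothetical and serve only to show how the quantities relate; they are not part of the proposed algorithm.

```python
def energy_change(c, m_pagerank, delta_t):
    """Energy difference of a node: Delta E = c * m_p * Delta T."""
    return c * m_pagerank * delta_t


def release_expired(blocked_since, now, max_block_time):
    """Return the blocked nodes whose blocking time exceeds the threshold.

    blocked_since maps each blocked node to the time at which it was blocked.
    """
    return {node for node, start in blocked_since.items()
            if now - start > max_block_time}


# Hypothetical values: c = 2.0, m-PageRank = 0.25, Delta T = 4.0.
delta_e = energy_change(2.0, 0.25, 4.0)

# Node "a" was blocked at time 0 and node "b" at time 3; the current time is 5.
to_unblock = release_expired({"a": 0, "b": 3}, now=5, max_block_time=3)
```

Here only node "a" has been blocked for longer than the allowed duration, so it is the only node returned for unblocking.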

Solution Approach
The proposed influence minimization is a greedy algorithm in which we find the optimal solution either by maximizing the threshold or by minimizing it. The algorithm works well in the LT model and in epidemic models. Degree centrality is the fundamental metric for finding the most significant nodes, but it suits only simple networks and is not suitable for multilayer networks. We therefore select an appropriate metric, m-PageRank, for studying the information diffusion. The formation of the seed set and the malicious nodes depends on the propagation probability. The algorithm randomly picks an edge that has a weight given by the weight function $p : E_m \to [0, 1]$. Let $(i, j)$ be a live edge with weight $p_{ij}$; all live edges form a subset $E_{live}$. Each node connected to an edge in the subset $E_{live}$ has a threshold; let the threshold of $j$ be $\theta_j$. The node $j$ is infected if $p_{ij} \ge \theta_j$. Let $B$ be the subset of immunized nodes, which are to be blocked for a duration.

Algorithm 1
    Initialize B ← ∅; P ← ∅
    for w ← 1 to n do
        for i ← 1 to k do
            /* check the threshold value for each node */
            if E_w ← Max(argMax_{w ∈ V−B} (σ(B ∪ {w}) − σ(B)), 1) then   /* calculate the entropy */
                B ← B ∪ {w}
    for j ← 1 to m do
        /* maximizing the influence */
    while j = n do
        /* construct the DFS tree */
        construct a DFS tree
        consider an edge (u, v) ∈ E_live
        continue
    for j ← 1 to m do
        if (v, p) ∪ (p, u) ∉ E then
            M*_live ← argMin((σ(M_live | V) − σ(B | V)) − σ(P | V \ B))   /* minimize the number of nodes influenced */
    Return(M*_live)   /* final list of live nodes */
End procedure

The algorithm for minimizing the influence spread while maintaining both policies is depicted in Algorithm 1. Two classical centrality metrics are used as the basis for the function that maximizes the spread and forms a subset of nodes that are most influential in the given network. The metrics are described as follows:
-Multiplex PageRank. A node in a multilayer network resides at different layers, and these distinct positions constitute the rank of each node. The weight of the node in the corresponding layers is aggregated with intra-layer influence and votes [28]. The rank depends on the coupling relation between layers and the characteristics of the interconnections.
-m-PageRank. The rank of a node is computed by considering the interconnection strength, not only from the same layer but also from the neighbouring layers. The rank of each layer is accounted for in the computation. It works based on a biased random walk and adds the impact of links from the other layers.
The performance of these metrics in the proposed algorithm is analyzed through experiments, and the analysis is shown in Section 6. The influence minimization problem is further improved by the ((k-m)-p) policy. The problem takes a multilayer social graph as input and produces a subset of nodes to block in order to minimize the spread of negative information across the network. The algorithm finds the live nodes of the network graph. A set of seed nodes is formed by using a centrality metric. The first policy states that the set of live nodes is formed by taking the set difference between the set of live nodes and the seed set.
The algorithm formulated for influence minimization based on the energy level is illustrated in Algorithm 1, and its time complexity is $O(m^2 k)$, where $m$ is the cardinality of the set of live nodes and $k$ is the size of the seed set. The algorithm has $m$ iterations in the worst case.
Influence minimization is further optimized by the second policy, in which the articulation points are removed from the set of live nodes. The subset of blocked nodes is formed by the intersection of the live nodes and the seed set using Algorithm 1. It is further optimized by taking the union of the blocked nodes with the set of articulation nodes, so that the spread across the network is minimized as completely as possible.
The procedure for the ((k-m)-p) policy is as follows. During the first stage of the algorithm, construct a DFS tree and choose a random node u. Check whether any cyclic path exists to reach u through the p selected nodes. If such a path exists, node u is added to the previously defined set of articulation nodes, $|\{P\}| = p$. The process continues until all the nodes in the live graph are discovered.
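The two policies can be sketched with plain set operations on top of a DFS-based search for articulation points. The code below is an illustrative sketch under stated assumptions, not the implementation evaluated in this article: it uses the standard recursive DFS low-link method for articulation points on an undirected graph, the function names `articulation_points` and `apply_policies` and the toy graph are hypothetical, and only articulation points that are still live are added to the blocked set.

```python
def articulation_points(adj):
    """Find articulation points of an undirected graph with a DFS.

    adj maps each node to the set of its neighbours.
    Recursive, so intended for small graphs only.
    """
    disc, low, points = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:
                # Back edge: u can reach an earlier-discovered node.
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # Non-root u separates v's subtree from the rest.
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
        # A root is an articulation point if it has several DFS children.
        if parent is None and children > 1:
            points.add(u)

    for node in adj:
        if node not in disc:
            dfs(node, None)
    return points


def apply_policies(live, seeds, adj):
    """Apply the set-difference policy, then remove articulation points."""
    blocked = live & seeds       # live nodes that are also seed nodes
    remaining = live - seeds     # first policy: set difference
    cut = articulation_points(adj) & remaining
    blocked |= cut               # second policy: add articulation nodes
    remaining -= cut
    return blocked, remaining


# Toy graph: path 1-2-3-4 attached to the triangle 4-5-6 (hypothetical).
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
points = articulation_points(adj)
blocked, remaining = apply_policies({2, 3, 4, 5, 6}, {5}, adj)
```

In the toy graph, nodes 2, 3, and 4 each disconnect the graph when removed, so they join the seed node 5 in the blocked set and only node 6 remains live.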

Experiment and Validation
The competency of our proposed e-IM algorithm is validated through experiments. We developed a Python-based simulator using Python 3.6.0 and NetworkX 2.4 [14]. The proposed method is compared with two state-of-the-art algorithms, DRIMUX [33] and DGMT [30].
DRIMUX: A rumor spreading minimization model called dynamic rumor influence minimization with user experience. It reduces rumor propagation by blocking a certain subset of nodes in the network. If the block period exceeds the time limit, the utility of the whole system is affected. DRIMUX considers the characteristics of the rumor and the experience of the user.
DGMT: It applies a linear restriction on the seed set and finds the optimal solution for minimizing diffusion using an integer linear programming problem.

Performance Parameter
We want to block k susceptible nodes among the live nodes, where the k nodes are the most central according to the m-PageRank ranking metric. The set of seed nodes (k = |seed|) is removed from the set of live nodes, $m = |M_{live}|$. Let p be the number of nodes that connect two or more connected components, where p ≤ k. The optimal set of live nodes is formed by deducting the p nodes, which minimizes the spread rate by isolating graph components.
The performance of our influence minimization policies is evaluated on the live networks described in Section 5. We fix k to a constant (k = 50, k = 100) and compute the number of nodes that have been blocked when the iterations complete on Facebook. We then analyze how many nodes the negative information has reached and compare the result with DRIMUX and DGMT. The statistical relationship between blocked nodes and live nodes for Facebook is shown in Fig. 3. The experiments are repeated for other networks using other centrality metrics, DRIMUX, and DGMT. We found that as the number of blocked nodes increases, the number of nodes that receive negative information decreases. The algorithm gives a more optimized solution, and the spread is found to be minimal when the second policy, ((k-m)-p), is applied. Fig. 4 shows the correlation between blocked nodes and influence spread on Hep-Th and Netscience. The influence minimization of the various algorithms is compared with the help of the minimization factor ($\mu$), defined as the ratio of the number of blocked nodes to the number of live nodes. The higher the value of $\mu$, the better the minimization capability of the algorithm. The comparison between e-IM and the other methods is shown in Table 3.
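The minimization factor can be computed directly from the two node sets. The helper below is a small sketch; the name `minimization_factor` and the example sets are hypothetical and do not reproduce any value reported in this article.

```python
def minimization_factor(blocked, live):
    """Ratio of the number of blocked nodes to the number of live nodes."""
    if not live:
        raise ValueError("the set of live nodes must not be empty")
    return len(blocked) / len(live)


# Hypothetical example: 2 blocked nodes out of 4 live nodes.
mu = minimization_factor({1, 2}, {1, 2, 3, 4})
```

A larger value of `mu` on the same set of live nodes indicates that the algorithm blocked a larger share of the nodes that could spread the negative information.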

Conclusion
In this article, we formulated two policies for developing an algorithm capable of minimizing the influence spread across the network. We identified the most central nodes in the network through a leading ranking metric, m-PageRank. Blocking or removing such nodes makes the network capable of minimizing the spread of information. The algorithm is further optimized by adding another policy, which isolates nodes by infecting them with virus. The results show that the proposed method outperforms the state-of-the-art methods it was compared with. As a future enhancement, the work can be scaled to the present pandemic environment. It can be modified to implement an effective method for isolating people from any kind of viral disease.