Event Prediction Based on Large-Scale Network Subgraph Convolution

Aiming at the problem of event prediction in large-scale event networks, a collapsed subgraph convolution (CSGCN) algorithm is proposed, which uses event subgraphs to predict the subsequent events of an event group. The CSGCN algorithm collapses an edge-induced event subgraph of the large-scale event network, removes irrelevant event nodes, and forms a new event subgraph. A GCN is then used to learn a graph embedding representation of the event subgraph, and the subsequent events of the event group are predicted by comparing the similarity between the embeddings of the event group and of candidate subsequent events. Because only a small set of related nodes is processed at a time, applying the model to large-scale data graphs is feasible. Through experiments, we explore and verify the effectiveness of extracting features from subgraphs of a large-scale graph by graph convolution training to obtain graph embedding representations. We find that the GCN predicts events better than Euclidean distance and cosine similarity baselines, which further shows that graph convolution performs well at graph feature extraction.


Introduction
At present, most event prediction methods are based on temporal event series. Bilod et al. 1 use Markov or RNN models, Mei et al. 2 use an LSTM model, and Chunyan Sang et al. use an LSTM with an attention mechanism to predict the evolution of information dissemination in social networks. However, in concrete event prediction scenarios the relationships between events are more complex than a time series, and time-window lengths are inconsistent, so dynamic event prediction based on a time window is often unsatisfactory. An event chain describes the chain structure formed by multiple events triggering one another in succession within a certain time and space, and many scholars have tried to use chain structures to express the temporal and spatial relationships of sequential events and thereby achieve event prediction. The causal chain mining of Yuan et al. 3, the epidemic model of Li et al. 4, and the accident path analysis of Kritika Singh 5 mostly use event-pair methods 6,7 to predict subsequent events; studies such as that of Wang et al. 8, which obtain high-accuracy predictions from sequential event chains, are rare. Event-chain and event-pair prediction models share a typical weakness: their ability to express the relationships between events is insufficient, and the rich relationships between events have not been fully explored. In 2004, Leveson 9 pointed out, in research on safety-system accident models, that analyzing an event process through sequential events ignores the potential relationships between events; the results above confirm this view. Researchers have therefore kept exploring methods that fully express the relationships between events and yield more effective event prediction.
In 2007, Liben-Nowell et al. 10 put forward the concept of link prediction: through analysis of network structure and node information, the likelihood of a link between two unconnected nodes is evaluated to realize prediction. Traditional link prediction is based on judging the similarity of two nodes, or whether a link exists between two event nodes.
Common methods include node attribute comparison 11,12 , network topology comparison 13,14 , and maximum likelihood estimation 15,16 . With the development of graph neural networks, graph embedding representations are increasingly used for feature learning in link prediction 17,18,19,20 .
In reference 21 , the covariance matrix adaptation evolution strategy (CMA-ES) is used to optimize the linear combination weights in the similarity comparison of neighborhoods and nodes; the combination of topological features and node attributes improves the accuracy of link prediction.
O'Madadhain et al. 22 proposed a local conditional probability model based on network structure and node attributes to predict time-varying relationships between nodes. At the same time, link prediction research has advanced people's understanding of network structural characteristics and the evolution laws of complex networks 23,24 , which supports using complex networks to express the relationships between events. Most link prediction methods predict abnormal events or behaviors through the prediction of abnormal or missing edges in the network. However, in predictions such as terrorist events, the occurrence of a group of events implies that other events may occur, and there is no strict temporal ordering among the events that have occurred. Traditional methods cannot solve this kind of problem effectively, while link prediction based on complex networks provides ideas for solving event group prediction.
Li 28 constructed a narrative event evolutionary graph (NEEG) from script events with event context, and proposed an event chain prediction method based on a scaled graph neural network (SGNN) built on a gated graph neural network (GGNN). Based on graph embedding representations, the interaction between event sequences is modeled and the subsequent events of a known event chain are predicted. This method still follows sequence prediction, which requires events to have clear temporal characteristics. This paper improves the SGNN algorithm, replaces the gated graph neural network (GGNN) with a graph convolutional neural network, and proposes the collapsed subgraph convolution algorithm (CSGCN). Through the associations among context event groups, the events that may follow a group of events are predicted, and the effectiveness of the method on large-scale networks is verified by experiments.

Theoretical Method
Graph structures can express complex relationships. An event network graph can not only express the relationships between events, but also combine event features with network structure, which helps people use more accurate and structure-sensitive methods 25 to study the evolution of the event network and predict future events. This paper uses an event network graph to represent the contextual relationships between events in text: events are nodes, and edges represent the relationships between event nodes.
We regard the prediction of the subsequent target event of an event group, based on the event graph, as a binary classification problem. If the causal relationship between the event group and the target event is known, the sample is marked as a positive link; otherwise it is a negative link.
Extracting useful information from graph structure and node features is very important for event prediction. The local relationships among events within an event group are comparatively tightly coupled. Therefore, we extract event subgraphs from the large-scale narrative event network and use a graph neural network to obtain graph-embedded representations of the event nodes for event prediction. By training on a large number of samples, on the basis of the graph embedding of the subgraph structure and the event node features, we compare the similarity between the event-group features and the target-event features, learn features describing the relationship between the event group and the target event, and thereby realize event prediction.

Event network graph structure and event prediction description
A graph can be represented as G = {V, E}, where V = {v 1 , …, v N } is a collection of N nodes and E is a collection of edges. The graph structure can also be represented by an adjacency matrix A ∈ R N×N . In addition, each node in the graph has features; we use X ∈ R N×d to denote the node feature matrix, where d is the feature dimension and N is the number of nodes. Each node in graph G can be seen as a d-dimensional graph signal.
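As a concrete illustration of these definitions, the sketch below builds a toy adjacency matrix A and feature matrix X with NumPy; the graph and feature values are invented for illustration and are not from the paper's data.

```python
import numpy as np

# Toy event graph G = (V, E) with N = 4 event nodes and d = 3 features per node.
N, d = 4, 3
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]  # undirected event relations

# Adjacency matrix A in R^{N x N}: A[i, j] = 1 iff (v_i, v_j) is an edge.
A = np.zeros((N, N))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Node feature matrix X in R^{N x d}; each row is a d-dimensional graph signal.
rng = np.random.default_rng(0)
X = rng.random((N, d))
```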
A group of graphs is represented as {G i }, with corresponding target event y i . We combine each graph G i with its label event y i to form an event subgraph, and judge whether y i is the follow-up event of G i through binary classification of the event subgraph, thus transforming the event prediction problem into a graph classification problem. From the event graph we extract a subgraph G i composed of K related events. The task of event group prediction is to take G i (including its structural information and node features) as input and predict its corresponding event y i . If the features of G i and y i obtained from the node feature information and the structure information match, then y i is considered a subsequent event of G i . Given an edge set E ′ ⊆ E, let V ′ be the set of endpoints of the edges in E ′ ; then G ′ = (E ′ , V ′ ) is also a graph, called the edge-induced subgraph of E ′ . The subgraphs mentioned in this paper are edge-induced subgraphs.
The edge-induced subgraphs of event nodes partition the event network: the event network graph is divided into several different event subgraphs. After an event subgraph is extracted, it is separated from the large-scale network structure graph by subgraph collapse. To collapse an event subgraph, we first extract it to form a new graph, then form a new adjacency matrix according to the arrangement of events in the graph, representing the network structure of the event subgraph. The selection of scope is the core step of the graph collapse process: we place the event group and target event from a data sample into one event subgraph, and remove all nodes of the event network graph that do not belong to that subgraph. After the event subgraph collapses, we further fuse the attribute information and structure information of the nodes in the graph, and obtain the features of different subgraphs through training. The specific process is shown in Figure (1).
Let G = (V, E) be a graph with n vertices; its adjacency matrix A is the square matrix of order n with a ij = 1 if (v i , v j ) ∈ E and a ij = 0 otherwise. We obtain the collapsed adjacency matrix A ′ of the edge-induced subgraph G ′ of graph G by Equation (1), where S is the selection matrix formed from the nodes of G ′ :
A ′ = S T A S (1)
The feature matrix X ′ of G ′ is expressed as Equation (5), where X is the feature attribute matrix of G:
X ′ = S T X (5)
This yields the matrix representation of the event subgraph G ′ .
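The collapse step can be sketched as follows, assuming S is a binary node-selection matrix (one column per kept node) so that A ′ = S T A S and X ′ = S T X; the helper name `collapse_subgraph` and the toy graph are ours, not from the paper.

```python
import numpy as np

def collapse_subgraph(A, X, nodes):
    """Collapse an edge-induced subgraph: keep only `nodes`.
    S is a binary selection matrix with one column per kept node,
    so the collapsed adjacency is S^T A S and the features are S^T X."""
    N = A.shape[0]
    S = np.zeros((N, len(nodes)))
    for col, v in enumerate(nodes):
        S[v, col] = 1.0
    A_sub = S.T @ A @ S   # adjacency restricted to the kept nodes
    X_sub = S.T @ X       # feature rows of the kept nodes
    return A_sub, X_sub

# 4-node toy graph: edges (0,1), (1,2), (2,3), (0,3)
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
X = np.eye(4)
A_sub, X_sub = collapse_subgraph(A, X, [0, 1, 3])
# A_sub is 3x3 and contains only the edges among nodes {0, 1, 3}
```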

Pooling
A pooling operator must be trained to collapse the event subgraph. The design of the pooling operator is based on the graph Fourier transform of the subgraph G i . L = D ′ − A ′ is the Laplacian matrix of the subgraph, where D ′ is the degree matrix of A ′ .
The eigenvectors of the Laplacian matrix L are denoted u 1 , …, u N . Following reference 29 , an upsampling operator c is applied so that the eigenvectors serve as basic features of the subgraph, and each eigenvector is upsampled into the whole graph as ū l = c u l (l = 1, …, N).
The pooling operator is obtained from the upsampled eigenvectors according to X l = ū l T X, where X l is the pooled feature represented by eigenvector u l and the sampling operator. For convenience of calculation, only the first H eigenvectors may be used: X p = [X 0 , …, X H ].
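A minimal sketch of this eigenvector-based pooling, assuming for simplicity that the upsampling operator c is the identity (the toy subgraph and dimensions are invented for illustration):

```python
import numpy as np

# Collapsed subgraph adjacency A' (3-node triangle) and its degree matrix D'.
A_sub = np.array([[0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]], dtype=float)
D_sub = np.diag(A_sub.sum(axis=1))
L = D_sub - A_sub                 # Laplacian L = D' - A'

# Eigenvectors u_1..u_N of L (eigh returns eigenvalues in ascending order).
eigvals, U = np.linalg.eigh(L)

rng = np.random.default_rng(1)
X = rng.random((3, 4))            # subgraph node features (N x d)
H = 2                             # keep the first H eigenvectors

# Pooling: project the features onto each eigenvector, X_l = u_l^T X,
# then stack the first H projections into X_p.
X_p = np.concatenate([U[:, l:l + 1].T @ X for l in range(H)], axis=0)
```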

Embedding representation of event subgraphs
Convolution networks on graphs are divided into spectral-domain and spatial-domain convolutions. The graph convolution algorithm GCN 26 proposed by Thomas Kipf is a spectral convolution, which transforms the convolution kernel and the graph signal into the Fourier domain to extract features of the graph signal. Spatial convolution networks sample a fixed number of neighborhood nodes and obtain the embedded representation of the network through the inner product of neighborhood points and convolution kernel parameters. In comparison, spectral convolution has higher accuracy, but it scales poorly with the number of nodes and is therefore unsuitable for training on large-scale graph classification tasks; spatial convolution is highly scalable, but because it relies on sampling, its accuracy is slightly inferior to spectral convolution. In this paper, we apply a graph convolution network to event subgraphs. Because the subgraph scale is small, the spectral-domain algorithm GCN is used to realize information embedding and message propagation among graph nodes, obtaining the best effect. The CSGCN algorithm uses Equation (6) to realize the spectral-domain convolution of the event subgraph:
H (k+1) = M(A ′ , H (k) , W (k) ) = σ(D ′ −1/2 A ′ D ′ −1/2 H (k) W (k) ) (6)
Here M is the message function; this paper uses frequency-domain graph convolution to extract the typical feature structure. W (k) ∈ R d×d is the weight matrix to be learned. A ′ here denotes the sum of the adjacency matrix of G ′ and the identity matrix, and D ′ is the degree matrix of A ′ .
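A minimal NumPy sketch of one such spectral convolution layer in the Kipf-Welling style, consistent with the description above; the function name and toy data are ours, and in the real model W is learned by backpropagation rather than drawn at random.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One spectral-domain graph convolution step:
    H_out = ReLU(D'^{-1/2} A' D'^{-1/2} H W),
    where A' = A + I (adjacency plus self-loops) and D' is the degree matrix of A'."""
    A_hat = A + np.eye(A.shape[0])            # A' = A + I
    deg = A_hat.sum(axis=1)                   # degrees of A'
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))  # D'^{-1/2}
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(2)
A = np.array([[0, 1], [1, 0]], dtype=float)   # toy 2-node subgraph
X = rng.random((2, 8))                        # initial node features
W1, W2 = rng.random((8, 8)), rng.random((8, 2))
H1 = gcn_layer(A, X, W1)   # K = 2 convolution levels, as in the experiments
H2 = gcn_layer(A, H1, W2)  # final output dimension 2
```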
Event prediction based on event subgraph convolution
After the embedded representation of the graph is obtained by the CSGCN algorithm, the graph is pooled into a single node: the feature values of the nodes in the collapsed subgraph are averaged to give the features of the pooled node. The similarity between the pooled node and candidate subsequent event nodes is then compared to realize event prediction.
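This pool-then-compare step can be sketched as follows, using mean pooling and vector-subtraction similarity; the helper `predict_score` and the numbers are illustrative, not from the paper.

```python
import numpy as np

def predict_score(H_group, h_candidate):
    """Pool the event-group subgraph into one node by averaging node
    embeddings, then score a candidate event by similarity.
    Vector subtraction is used here, matching the variant that worked
    best in the experiments; cosine similarity is the obvious alternative."""
    h_group = H_group.mean(axis=0)   # mean-pool the node embeddings
    diff = h_group - h_candidate     # vector-subtraction similarity
    return -np.linalg.norm(diff)     # higher score = more similar

# Toy event-group embeddings (3 nodes, 2-dimensional output).
H_group = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
score_a = predict_score(H_group, np.array([0.66, 0.66]))  # near the group mean
score_b = predict_score(H_group, np.array([5.0, -3.0]))   # far from the group mean
```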
Embedding nodes and averaging their feature values cannot further reflect the structural information of the subgraph, so we try to integrate and extract the information in each subgraph. EigenPooling 30 uses the graph Fourier transform to integrate graph structure information with attribute information. In this paper, after convolving the subgraph to obtain the graph embedding, we further apply the EigenPooling algorithm to enhance the fusion of graph structure and feature information, but the effect is not obvious. We also try an attention mechanism to capture the complex relationships among the events in an event group, and achieve good results.

Experiments and results
We tested our approach experimentally, predicting subsequent events by performing graph convolution after the event subgraph is collapsed. First, an undirected event evolution graph (UDEEG) is constructed to represent the collapsed event subgraph; its nodes are events, and its edges represent the temporal or causal relationships between events.
In the experiment, we use the relationships between known event groups and their subsequent events in the sample data to form the UDEEG.

Data sets
The data set for this study comes from the New York Times portion of the billion-word corpus. After POS tagging, dependency parsing, phrase-structure parsing, and coreference resolution, independent context event groups and candidate events are extracted. This paper builds on the data set used in reference 1: each group of context events, together with the correct event among its five candidate follow-up events, is taken as a positive training sample, and an incorrect event is randomly selected to form a negative sample.
The ratio of positive to negative samples in the test set is 1:1, forming the data set used in this experiment. Table 1 shows the detailed sample counts in the dataset.

Experimental setting
In the experiment, the number of event groups processed in one batch is the batch size.
According to Equations (6), (7), and (11), collapsed subgraph convolution is performed for each event group in the data, with convolution depth K = 2.
Hyperparameters are optimized on the training set. Since CSGCN solves a binary classification problem, the cross-entropy loss is used as the objective function:
L = −Σ i [p(x i ) log q(x i ) + (1 − p(x i )) log(1 − q(x i ))] + λ‖W‖ 2
where p(x i ) is the true value and q(x i ) the predicted value, W is the model parameter set, and λ is the L2 regularization coefficient, set to 0.00001. The learning rate is 0.0001, the batch size is 4000, and the convolution depth K is 2. The initial node features are the embeddings of event predicate-GRs trained by the DeepWalk algorithm in reference 28, with embedding dimension d = 128. Model parameters are optimized with the RMSprop algorithm. The ratio of training set to test set is 9:1.
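The objective can be sketched as follows, assuming the standard binary cross-entropy form with L2 weight decay; the function name and toy values are ours.

```python
import numpy as np

def loss(p, q, W_list, lam=1e-5):
    """Binary cross-entropy objective with L2 regularization:
    L = -sum_i [p_i log q_i + (1 - p_i) log(1 - q_i)] + lam * sum ||W||^2,
    where p holds true labels, q predicted probabilities, lam = 0.00001."""
    eps = 1e-12  # avoid log(0)
    ce = -np.sum(p * np.log(q + eps) + (1 - p) * np.log(1 - q + eps))
    l2 = lam * sum(np.sum(W * W) for W in W_list)
    return ce + l2

# Toy batch: three samples, one weight matrix.
p = np.array([1.0, 0.0, 1.0])   # true labels
q = np.array([0.9, 0.1, 0.8])   # predicted probabilities
W = [np.ones((2, 2))]
val = loss(p, q, W)
```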

Comparison of experimental results and baselines
In previous studies, predicting the subsequent target event of a narrative text event sequence is realized by selecting one of several candidate events. For example, Li 28 divided the five candidate events into five categories, setting their labels to 0, 1, 2, 3, and 4 according to the order of the candidates, and selected one of the five during prediction. In this setup the candidate-event labels have no practical meaning, and candidate events assigned to the same category do not share features. Since neural network algorithms rely on feature learning, this is one of the important reasons for the poor performance in narrative text event prediction.
In this paper, the candidate events are divided into positive and negative samples, so the classification has practical meaning. Despite the complexity of event group prediction, the experiment still achieves good results. The learning curve of the optimized CSGCN, showing accuracy over training epochs, is given in Figure (2); the optimized algorithm reaches 62.56% accuracy. In the experiment, we further optimize on the basis of the CSGCN algorithm.
First, the graph embedding obtained from the subgraph convolution is combined with the graph structure information: the eigenvectors of the subgraph adjacency matrix are combined with the graph embedding representation (CSGCN + Eigen). This interactive use of local structure information and node information yields an improvement of 1.33%. We also use an attention mechanism (CSGCN + Attention) to learn the event-group relationships, obtaining a better result, 3.32% higher than the CSGCN algorithm. The accuracy comparison between our algorithm and the baselines is shown in Table 2.

Analysis of experimental results
The CSGCN algorithm requires comparing the similarity between the feature representations of the event group and the target event. We tried a variety of methods for the convolution-layer event representation, the event similarity comparison, and the embedded representation of the event group, and analyze them experimentally.
• Analysis of convolution-layer event representation. We tried and compared the following settings. Taking the feature output of the second layer of the two-level graph convolution directly achieved no obvious effect; setting the second-layer convolution dimension to 128 or 12 was likewise ineffective. Finally, setting the output dimension of the second-level graph convolution to 2, followed by similarity comparison, produced good results.
• Analysis of similarity comparison methods. We use the similarity between the event group and the target event as the probability that a sample is positive. When the convolution output dimension is 128 or 12, training with either cosine similarity or vector subtraction does not significantly improve accuracy, and recall during training is high. When the convolution output dimension is 2 and vector subtraction is used, training accuracy improves greatly and training recall is relatively low. The specific learning results are shown in Figure (3).
• Analysis of the attention mechanism. With the attention mechanism, the efficiency of the algorithm improves effectively, but serious over-fitting occurs. This phenomenon first shows that identifying the important events within an event group is significant. However, the over-fitting that accompanies the added parameters shows that, although the important events in the event group can be recognized well through training, the generality of the learned rule is insufficient.
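The two similarity variants discussed above can be sketched side by side on 2-dimensional convolution outputs; the toy vectors are illustrative only.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def subtraction_sim(a, b):
    """Vector-subtraction similarity: smaller distance -> higher score."""
    return -float(np.linalg.norm(a - b))

# Toy 2-dimensional outputs: b is close to a, c is not.
a = np.array([0.9, 0.1])
b = np.array([0.8, 0.2])
c = np.array([0.1, 0.9])
# Both measures should rank b above c as a match for a.
```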

Conclusion
The CSGCN algorithm processes only a set of related nodes at a time, which makes applying the model to large-scale data graphs feasible. Through experiments, we explored and verified the effectiveness of extracting features from subgraphs to obtain graph embedding representations for event prediction.
The experiments verify the following conclusions:
• Graph convolution training on subgraphs of a large-scale graph yields more effective graph embedding representations.
• Predicting subsequent events from event groups based on event subgraphs achieves relatively high accuracy.