Heterogeneous graph neural network with semantic-aware differential privacy guarantees

Most social networks can be modeled as heterogeneous graphs. Recent advanced graph learning methods exploit the rich node attributes and topological relationships for downstream tasks, which means that more private information is embedded in the learned representations. However, existing privacy-preserving methods protect only a single type of node attribute or relationship and neglect the significance of high-order semantic information. To address this issue, we propose a novel Heterogeneous graph neural network with Semantic-aware Differential privacy Guarantees, named HeteSDG, which provides a double privacy guarantee and a performance trade-off over both graph features and topology. In particular, we first reveal the privacy leakage in heterogeneous graphs and define a membership inference attack with semantic enhancement (MIS), in which the attacker strengthens membership inference by obtaining side background knowledge through semantics. We then design a two-stage mechanism comprising a feature attention personalized mechanism and a topology gradient perturbation mechanism, both built on differential privacy. These mechanisms defend against MIS and offer stronger interpretability, but they also introduce noise into representation learning. To better balance noise perturbation and learning performance, we adopt a bi-level optimization pattern to allocate a suitable privacy budget to the two modules. Experiments on four public benchmarks, including performance comparisons, ablation studies, and inference attack verification, demonstrate the privacy protection capability and generalization of HeteSDG.


Introduction
Heterogeneous graphs are well suited to depicting entity relationships in realistic scenarios and can describe sophisticated interaction histories in social networks, so many existing works represent social networks as graphs that form a more complex semantic structure and fuse the representations of multiple node features and multi-relationship structures [1,2]. Representation learning over heterogeneous graph properties enables neural network models to learn latent information and use it for downstream tasks such as recommendation systems [3,4]. However, in most cases, they exploit only explicit information such as nodes and edges, without considering higher-order features, complex topology, or implicit semantic information. Higher-order node features refer to the fact that a node's features are determined not only by its own characteristics but also by the features of its neighboring nodes. The complexity of the network topology refers to the intricate and non-trivial patterns of connections and interactions among nodes in the network.
To adapt to the heterogeneity of social networks, heterogeneous graph neural networks (HGNNs) are popular in graph representation learning because of their powerful representational capabilities [5][6][7][8][9][10]. To learn the dependencies between a node, its neighbors, and high-order nodes in graphs [11][12][13][14][15][16][17], existing HGNNs can be divided into two main categories: neighbor aggregation based on convolutional kernels, and random walks based on meta-paths, meta-graphs, and meta-schemas [1,18,19]; meta-paths are often regarded as semantic information. However, in real-world applications, powerful HGNNs boost downstream representation ability while introducing an additional risk of privacy leakage. They mine implicit information in the social graph, so more private information is undoubtedly implied in the representation results. For example, malicious actors can deliberately push insecure links based on community detection to obtain private information such as social relationships [20], behavioral trajectories [21][22][23], and medical records. Existing works focus on improving the representational power of heterogeneous graphs and ignore the security of private information.
As we benefit from the convenience brought by HGNNs, our private information faces the risk of malicious collection, learning, and utilization. This leads to a series of privacy leakage issues reflected in two important graph properties, i.e., features and topology. Fortunately, to address these issues, we can draw on prominent protection methods in machine learning and privacy-preserving techniques for homogeneous graphs [20,22,24]. These methods perturb node features before input or add noise to the gradients during model learning, and they rely on popular and rigorous privacy technologies such as differential privacy [25,26], which perturbs the data distribution under a strict mathematical definition. However, existing differential privacy-based designs are difficult to adapt to semantic data and are not resistant to MIS. MIS is an attack that infers the membership of dataset members by leveraging the social relationships and interaction history between users in the dataset (i.e., semantic similarity and correlation between samples). We use recommendation systems as an example to illustrate our MIS scenario. Figure 1a shows a heterogeneous graph schema that expresses the node types and edge types in a social network.

Fig. 1 A toy example of privacy risk in a heterogeneous graph. The red line represents the user's purchase history, and "predict" indicates the target of the attacker's attack. Specifically, the attacker can identify the identity information of a certain user by analyzing the user's purchase history, and further infer the identity information of other users (target).

For feature MIS in Fig. 1b, some traditional methods on homogeneous graphs protect nodes from malicious inference attacks by perturbing the correlation between homogeneous nodes and reducing the prediction probability, e.g., perturbing the relationship between user A and user B to minimize the expected likelihood of user C purchasing item 1, which confuses the preference inference for C. But on a heterogeneous graph, a meta-path (e.g., "UserA-Item1-Store-Item2-UserC (UA-I1-S-I2-UC)") can carry semantic information that raises the probability of a successful attack by exposing the similarity between nodes and enabling inference of potential purchase records. For topology MIS, we assume that a malicious attacker can find public networks with similar topologies and infer information about the target network by comparing topological attributes (e.g., degree, quadrilaterals). For instance, in Fig. 1c, user A is the only node in the subgraph with degree 2 that lies in a quadrilateral, while store B is a node with degree 4 that lies in two quadrilaterals. This background knowledge allows the attacker to obtain connections between arbitrary nodes, regardless of node type, and to analyze semantics to gain insight into the preferences of specific users. In a heterogeneous graph, even if there is no direct connection between nodes of the same type, node information can still be inferred under semantic guidance. Figure 1c shows that a meta-graph instance "StoreD-UserA-StoreB-UserC (SD-UA-SB-UC)", taken as a semantic relationship, helps the attacker extract topological characteristics and infer node information.
In conclusion, the difference between feature MIS and topological MIS lies in how attackers exploit semantic enhancement of different attributes in the graph, and both carry potential privacy risks (i.e., they represent privacy leakage caused by semantic enhancement of high-order features and of topological complexity, respectively). Therefore, we propose a unified framework to implement privacy protection, facing three main challenges. (1) On heterogeneous graphs, nodes are diverse and can aggregate over different types of neighboring nodes, which brings high-order semantic information but also enhances the attacker's inference ability; it is therefore difficult to determine node-level privacy sensitivity. (2) The complex topological structure of heterogeneous graphs can be decomposed into meta-paths. Privacy protection requires processing multiple meta-paths and considering their correlation, while semantic associations can easily be captured by attackers for malicious inference. (3) The number of nodes and edges in heterogeneous graphs is usually very large, so differential privacy protection requires a large amount of noise, which easily degrades data accuracy and model performance. Thus, a key challenge is to balance privacy guarantees and convincing predictions.
To resolve the above problems, we propose a novel Heterogeneous Graph Neural Network with Semantic-aware Differential Privacy Guarantees, named HeteSDG. First, we define a novel privacy leakage scenario for heterogeneous graph recommendation systems and reveal the privacy leakage risks associated with heterogeneity. We then design two stages of privacy-preserving strategies: the feature attention personalized mechanism (FeatADP) and the topology gradient perturbation mechanism (TopoGDP). FeatADP is based on a heterogeneous attention mechanism that learns and perturbs the node representations, where the degree of perturbation depends on the sensitivity of the features' Gaussian noise [27], which is influenced by the different types of neighbors and relationships. Specifically, we follow meta-paths as semantic information to build node neighbors and debias them for semantic attention, since the building process would otherwise diminish the semantic representation. For TopoGDP, we input the perturbed node representations into a heterogeneous variational graph auto-encoder [28] to reconstruct the privacy-preserving topology, and we design a regularization term as a soft supervision objective for side semantic relationships. The link predictor sets learnable gradient clipping hyperparameters as the noise sensitivity to clip and perturb the gradients. In addition, to better integrate these two properties, we use a bi-level optimization mechanism to achieve a trade-off between privacy preservation and performance and a reasonable privacy budget allocation.
A preliminary version of this work, focusing on a Heterogeneous Graph Neural Network for Privacy-Preserving Recommendation (HeteDP), was published in the proceedings of ICDM 2022 [29]. This journal version extends the method from the traditional Markovian walk to personalized awareness enhancement for semantic representation and designs a semantic-centric regularization term. In other words, we extend the two-stage mechanism with a semantic-aware debias mechanism. In addition, we add more detailed technical explanations and include attack experiments on more extensive datasets and analyses to evaluate our extended model, HeteSDG.
The organization of this paper is listed as follows. We revisit previous works on related topics in Sect. 2. We introduce the problem definition in Sect. 3, and the overall proposed method in Sect. 4. The experimental results and analysis are presented in Sect. 5. In the last Sect. 6, we present the conclusion of this work.

Heterogeneous graph embedding
Heterogeneous graph neural networks (HGNNs) [30][31][32] are proposed to deal with ubiquitous heterogeneous data. We divide HGNNs into two main categories: neighbor aggregation based on convolutional kernels, and random walks based on meta-paths, meta-graphs, and meta-schemas. To learn high-order information, some existing models performed graph convolution directly on the original heterogeneous graph. HGT [12] proposed a transformer-based model for handling large academic heterogeneous graphs with heterogeneous subgraph sampling. RGCN [31] captured the heterogeneity of graphs by projecting node embeddings into different relational spaces using multi-relational aggregation weight matrices. HetSANN [10] used a type-specific graph attention layer for the aggregation of local information, avoiding manually selected meta-paths.
In addition to mining explicit node features and topology structure in graphs, some works extracted semantic information as additional guidance for heterogeneous graph embedding by adding meta-paths, meta-graph, or meta-schema as prior knowledge to effectively fuse heterogeneous data and improve learning performance. HetGNN [30] considered each node's heterogeneous content (node's attribute information) and used the random walk to sample a fixed number of strongly associated heterogeneous neighbors for graph nodes, and then used BiLSTMs to process the heterogeneous graph. Metapath2vec [19] designed a metapath-based random walk and utilized skip-gram to generate node embeddings. HIN2Vec [18] learned node representations based on meta-path random walks to incorporate semantic information in heterogeneous graphs. HGConv [33] introduced node representation based on mixed micro/macro-level convolution operations on heterogeneous graphs. A micro-level convolution can learn the dependency of nodes under the constraints of the same relation, and a macro-level convolution was used to distinguish subtle differences between relation types.
Alongside these advances, many works have also made excellent progress in recommendation [34,35]. However, with the representation and application of rich information, more user information is undoubtedly exposed: an adversary can collect side information about node features, topology structure, or semantics from public sources to infer private information.

Graph privacy protection
Since graph neural networks (GNNs) play a crucial role in deep learning, the privacy problems in graph embedding have been exposed, and some early works tried to protect graph data privacy and achieved meaningful results.
For homogeneous graphs, some works preserved user information through personalized privacy protection [36], while anonymization mechanisms [37,38] prevented attackers from inferring private information. Recently, DPGGAN [26] introduced differential privacy into GNNs by following the DP-SGD [39] privacy-preserving design pattern and taking advantage of VGAE [28]. LDP [40] disturbed local user features to protect the privacy of node features and mitigated the precision degradation caused by excessive injected noise through KProp. PrivGnn [41] randomly sampled private data from the training set, fed the sampled data into the model to train pseudolabels, and then mixed the pseudolabels into public data to achieve privacy protection. In application, GERAI [25] is a recommendation model in which a combination of GNNs and LDP ensures learning practicability and protects users' information from attribute inference attacks.
For heterogeneous graphs, recent work designed a heterogeneous differential privacy mechanism [42] whose target was to solve the privacy budget allocation problem caused by the different distributions of heterogeneous data. However, with the addition of more side information (semantic information), the inference capability of attackers may be enhanced, and existing methods are difficult to adapt to the high-order features and topological complexity caused by semantics.

Preliminaries and problem definition
To efficiently implement privacy protection for heterogeneous graphs, we use differential privacy (DP) techniques [43], which are consistent with our data type and framework design. Differential privacy is recognized as one of the most quantifiable and practical privacy-preserving models. The basic idea is that no computation can be significantly affected by any single-record operation such as add, delete, or modify: even if an attacker knows all records except one, they cannot obtain any information about that record.

(ε, δ)-Differential Privacy [44]. A random algorithm M satisfies (ε, δ)-differential privacy if, for any two neighboring datasets D and D′ that differ in at most one record, and for any possible subset of outputs O ⊆ Range(M), it holds that

$$\Pr[M(D) \in O] \le e^{\varepsilon}\, \Pr[M(D') \in O] + \delta. \quad (1)$$

The privacy strength of DP increases as the privacy budget decreases, which is controlled by ε and δ. Thus, (ε, δ)-DP is guaranteed by adding appropriate noise to the output of the algorithm, and the amount of injected noise is calibrated to the sensitivity.

Sensitivity [44]. Given any query S on D, the l₂ sensitivity over any neighboring datasets D and D′ is defined as

$$\Delta_2 S = \max_{D, D'} \big\| S(D) - S(D') \big\|_2. \quad (2)$$

Gaussian Mechanism [27]. Let S : D → O^K be an arbitrary K-dimensional function with l₂ sensitivity Δ₂S. The Gaussian mechanism with parameter σ adds noise scaled to N(0, σ²) to each of the K components of the output. Given ε ∈ (0, 1), the Gaussian mechanism is (ε, δ)-DP when

$$\sigma \ge \frac{\sqrt{2 \ln(1.25/\delta)}\; \Delta_2 S}{\varepsilon}. \quad (3)$$

Adding noise is the primary means of implementing privacy preservation with differential privacy. In this work, we apply Gaussian noise to the node features and to the link-prediction gradients of the heterogeneous graph G, respectively; the overall form is defined as

$$M(D) = S(D) + \mathcal{N}\big(0, (\sigma\, \Delta_2 S)^2\big), \quad (4)$$

where Δ₂S controls the amount of noise in the generated Gaussian distribution, from which we sample noise into the target.
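To make the calibration in Eq. (3) concrete, the following minimal sketch adds Gaussian noise to a query output; the helper name `gaussian_mechanism` is illustrative, not part of the paper's implementation:

```python
import numpy as np

def gaussian_mechanism(query_output, l2_sensitivity, epsilon, delta, rng=None):
    """Add Gaussian noise calibrated to (epsilon, delta)-DP.

    Uses the classical calibration (valid for epsilon in (0, 1)):
        sigma >= sqrt(2 * ln(1.25 / delta)) * l2_sensitivity / epsilon
    """
    assert 0 < epsilon < 1, "classical Gaussian mechanism requires epsilon in (0, 1)"
    rng = rng or np.random.default_rng()
    sigma = np.sqrt(2 * np.log(1.25 / delta)) * l2_sensitivity / epsilon
    # Independent N(0, sigma^2) noise on each of the K output components
    noise = rng.normal(loc=0.0, scale=sigma, size=np.shape(query_output))
    return np.asarray(query_output, dtype=float) + noise
```

A smaller ε (or δ) yields a larger σ and hence stronger perturbation, matching the budget/strength trade-off described above.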
Privacy leakage analysis. Some existing works [26] only consider the high-order influence of the same node type while reducing the probability that malicious attackers steal user interest orientations by perturbing their edges. However, the complex data in a heterogeneous graph enhance the attacker's inference ability, i.e., the attacker can obtain the membership of nodes by inferring semantic information from other types of nodes. Consequently, existing privacy-preserving methods are hard to adapt to the high-order features and topological complexity caused by semantics. Moreover, MIS in HGNNs utilizes node features and topology structure to further formulate semantic links and infer private information. Therefore, we transform the privacy problem on heterogeneous graphs into an associative differential privacy problem on graphs with solid semantic correlation. This means our problem further becomes a multi-objective optimization problem over representation learning as well as optimal privacy budget allocation. The key symbols of this paper are summarized in Table 1.

Problem definition
We aim to maximize the privacy-preserving effect while minimizing the information loss due to noise. Therefore, we combine optimal privacy budget allocation with model optimization as a multi-objective optimization problem. There is a heterogeneous graph G = (V, E, φ, ψ) with an entity mapping function φ(v) : V → A and a relation mapping function ψ(e) : E → R, where V and E are the sets of nodes and edges. Each node v ∈ V belongs to the node type set A, and each edge e ∈ E belongs to the edge type set R. The graph has meta-paths of the form

$$m = a_1 \xrightarrow{r_1} a_2 \xrightarrow{r_2} \cdots \xrightarrow{r_l} a_{l+1},$$

where a_i ∈ A and r_i ∈ R. Then, given objective functions f(x, y) for FeatADP and g(x, y) for TopoGDP over the privacy budgets ε_f and ε_s, the problem can be defined as

$$\min_{\varepsilon_f,\, \varepsilon_s} \big( f(x, \varepsilon_f),\; g(y, \varepsilon_s) \big) \quad \text{s.t.} \quad \varepsilon = \varepsilon_f + \varepsilon_s, \quad (5)$$

where ε is the global privacy budget. Such problems rarely admit a single unified optimal solution, as with multi-objective optimization in existing graph learning. Inspired by the multi-head attention mechanism [45] and differentially private stochastic gradient descent [39], we formulate two protection strategies for node features and topology, respectively. In the next section, we specify our proposed privacy-preserving approach.

Fig. 2 The framework of HeteSDG. HeteSDG consists of two major components: FeatADP and TopoGDP. The first part secures the node attributes, and the second part protects the topology. The two parts are constrained by a global privacy budget so that the perturbation to the model stays within a reasonable range while optimal accuracy is pursued.

Proposed methodology
In this section, we introduce the HeteSDG framework, a heterogeneous graph neural network with semantic-aware differential privacy guarantees, and illustrate the details of its privacy mechanisms and learning optimization mechanisms. Figure 2 presents the proposed framework with its two-stage DP mechanisms over node features and topology structure. Specifically, we elaborate on the HeteSDG approach (see Algorithm 1) and give the intuitive picture that FeatADP provides the perturbed features for TopoGDP to execute downstream. Furthermore, the technical difference between HeteSDG and HeteDP is whether the semantic-aware design is taken into account. In other words, HeteSDG protects node privacy by introducing semantic-aware noise, where the magnitude of the noise depends not only on topological information but also on the semantic information of the nodes. This better protects node privacy compared with HeteDP, which does not consider semantic information, may fail to resist MIS, and can damage the semantic consistency of the graph.

Feature attention personalized mechanism
In this section, we detail the node-level privacy-preserving mechanism and incorporate it into feature representation learning. Specifically, we compute the representation of each node by learning the influence weights of its neighbors and semantics. The semantic-neighbor building follows a Markov chain over meta-paths, i.e., the conditional probability of reaching node u of type a_{i+1} at the next step is determined only by the current node v of type a_i, and the next node type in the walk is fixed. In particular, to adapt to the heterogeneity of the data, we constrain the generation of semantic neighbors through the meta-paths m.
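As an illustration of this meta-path-constrained Markov walk, the sketch below collects semantic neighbors of a start node; the helper name `metapath_neighbors`, the dictionary graph encoding, and the visit-count weighting are illustrative assumptions, not the paper's implementation:

```python
import random
from collections import defaultdict

def metapath_neighbors(adj, node_types, start, metapath, walks=10, rng=None):
    """Collect semantic neighbors of `start` via walks constrained to `metapath`.

    adj        : dict node -> list of neighbor nodes
    node_types : dict node -> type label (e.g. "U" for user, "I" for item)
    metapath   : sequence of type labels with metapath[0] == node_types[start]
    Returns a dict mapping each reached endpoint to its visit count
    (a raw semantic-neighbor weight).
    """
    rng = rng or random.Random()
    counts = defaultdict(int)
    for _ in range(walks):
        cur = start
        for next_type in metapath[1:]:
            # Markov step: the next node type is fixed by the meta-path
            candidates = [v for v in adj[cur] if node_types[v] == next_type]
            if not candidates:
                break
            cur = rng.choice(candidates)
        else:
            counts[cur] += 1  # walk completed the full meta-path
    return dict(counts)
```

Normalizing the counts would give an empirical version of the transition-probability-based weights used for the semantic-aware debias mechanism.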
In addition, we obtain natural semantic-neighbor information weights C to prepare for further conditioning of semantic-level feature learning, which we call the semantic-aware debias mechanism. The meta-path-guided walk is expressed formally as

$$p\big(v^{i+1} = u \mid v^{i} = v,\, m\big) = \begin{cases} \dfrac{1}{\big|N_{a_{i+1}}(v)\big|}, & (u, v) \in E,\ \phi(u) = a_{i+1} \\ 0, & \text{otherwise,} \end{cases} \quad (6)$$

where N_{a_{i+1}}(v) is the set of neighbors of v with type a_{i+1}, and a_{l+1} = a_1 for a symmetric meta-path. For multiple node types, we map them to a uniform space through a type-specific linear transformation, so the embedding at the l-th layer of the neural network is

$$h_u^{(l)} = W_{\phi(u)}^{(l)}\, h_u^{(l-1)},$$

where h_u^{(0)} = x_u is the initial feature of node u. For neighbor-level aggregation, to learn the dependence between node u and neighbor v, we leverage the attention mechanism and normalize the overall attention values. We calculate the attention score between nodes as

$$e_{uv} = \sigma\big(\alpha^{\top}\big[h_u \,\Vert\, h_v\big]\big), \qquad \beta_{uv} = \frac{\exp(e_{uv})}{\sum_{k \in V_{gs}(u)} \exp(e_{uk})},$$

where V_gs(u) is the set of neighbors of u (including v, built following Eq. (6)), σ(·) is an activation function, α is a learnable weight vector, and ∥ denotes concatenation. Then, we introduce multi-head attention for node representation learning to capture more comprehensive neighbor information, and we simultaneously obtain the learnable influence weight of node u from other nodes to guide the neighbor-level perturbation. So we design the multi-head attention coefficients and node representations on the (l + 1)-th layer of each subgraph as

$$z_u^{(l+1)} = \Big\Vert_{k=1}^{K}\, \sigma\Big(\sum_{v \in V_{gs}(u)} \beta_{uv}^{k}\, z_v^{(l)}\Big), \qquad w_u = \frac{1}{K} \sum_{k=1}^{K} \sum_{v \in V_{gs}(u)} \beta_{uv}^{k},$$

where K is the number of attention heads and z_v^{(l)} is the embedding of the neighbor node. Specifically, we use two types of multi-head attention aggregation, concatenation for node embeddings and averaging for attention weights; this design simply fits our data format, and in practice they can be mixed.
For semantic-level aggregation, we utilize a residual connection for the node embeddings to retain more semantic dependency:

$$z_u^{m} = z_u^{(L)} + h_u^{(0)}, \qquad m = 1, \ldots, M,$$

where M is the number of meta-paths and z_u^{(L)} is the output of the last neighbor-level layer under meta-path m. As analyzed above, heterogeneous data are more vulnerable to semantic inference attacks, so we further consider the impact of the semantic level on node representation. The semantic attention from a multilayer perceptron (MLP) is

$$W_m = \frac{1}{|V|} \sum_{u \in V} q^{\top} \tanh\big(W\, z_u^{m} + b\big), \qquad B_m = \frac{\exp(W_m)}{\sum_{m'=1}^{M} \exp(W_{m'})},$$

where W_m is the attention weight of meta-path m and B_m is the normalized attention coefficient. We note that C can further measure the awareness of semantics and provide more accurate semantic preferences following Eq. (6). So we obtain the multi-level embeddings as

$$z_u = \sum_{m=1}^{M} B_m\, C_u^{m}\, z_u^{m}.$$

For privacy-preserving feature learning, we inject personalized noise into each node individually, which means our noise fuses the weights of neighbors and semantics. Following the sensitivity definition in Eq. (2) and the neighbor- and semantic-level weights above, we design the sensitivity and Gaussian noise on the heterogeneous graph as

$$\Delta_2 S_{feat}^{u} = \lambda \Big(w_u + \sum_{m} B_m\, C_u^{m}\Big), \qquad \tilde{z}_u = z_u + N_{feat}^{u},$$

where λ is a hyperparameter, the privacy budget ε_f < ε, and N_feat^u is drawn from a Gaussian distribution with mean 0 and standard deviation σ_f Δ₂S_feat^u for node u to satisfy (ε_f, δ)-DP.
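As a rough sketch of the personalized perturbation idea, the snippet below modulates per-node Gaussian noise by neighbor and semantic influence weights; the helper name `featadp_perturb` and the exact sensitivity formula are illustrative assumptions:

```python
import numpy as np

def featadp_perturb(z, neighbor_weight, semantic_weight, eps_f, delta,
                    lam=0.01, rng=None):
    """Perturb node embeddings z (n x d) with per-node Gaussian noise whose
    scale is modulated by learned influence weights.

    neighbor_weight, semantic_weight : arrays of shape (n,)
    lam : hyperparameter scaling the per-node sensitivity (cf. lambda above)
    """
    rng = rng or np.random.default_rng()
    # Per-node sensitivity: larger learned influence -> larger calibrated noise
    sens = lam * (neighbor_weight + semantic_weight)            # shape (n,)
    sigma = np.sqrt(2 * np.log(1.25 / delta)) * sens / eps_f    # shape (n,)
    noise = rng.normal(size=z.shape) * sigma[:, None]
    return z + noise
```

Each node thus receives noise proportional to how strongly it is exposed through neighbors and semantics, rather than a single global scale.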

Topology gradient perturbation mechanism
In this section, our design is based on a heterogeneous variational auto-encoder that contains an embedding encoder and a link predictor. It executes heterogeneous differentially private stochastic gradient descent to achieve privacy protection for the topology structure. For the embedding encoder, we build a two-layer heterogeneous graph neural network (HeteGCN) inspired by state-of-the-art models [28,31,46]. Its aggregated representation over multiple node and relationship types is

$$H = \sum_{r \in R} f_r(X, A_r), \quad (16)$$

where f_r is the HeteGCN module for each relation r ∈ R, X = h̃ is the (perturbed) node feature matrix, and A_r is the relationship matrix. The hidden-layer representation of each node under a relational subgraph is

$$h_u^{(l+1)} = \sigma\Big(\sum_{r \in R} \sum_{v \in N_r(u)} \frac{1}{\zeta}\, w_r^{(l)}\, h_v^{(l)}\Big), \quad (17)$$

where ζ is a normalization constant, and w_r^{(l)} and h_v^{(l)} are the learnable weight matrices and neighbor node embeddings of the l-th layer, respectively.
Our training process is unsupervised representation learning, and we design negative sampling to enhance the generalizability of the model, which computes the difference in scores between two connected nodes and an arbitrary node pair. For example, for an edge e ∈ E between nodes u ∈ V and v ∈ V in graph G, we want the score between u and v to be higher than the score between u and each of the k nodes v′ sampled from an arbitrary distribution v′ ∼ P_n(v). We randomly extract a batch of negative samples in each training iteration through the neighbor sampling of the multi-layer GNN. Then, we adopt a two-layer HeteGCN model following Eq. (16) as the encoder and utilize the reparameterization trick in training:

$$z = \mu + \sigma \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I),$$

where z is the stochastic latent sample, μ_r = HeteGCN_μ(X, A_r) is the matrix of mean vectors μ_i^r, and log σ_r = HeteGCN_σ(X, A_r) is the matrix of standard-deviation vectors σ_i^r. For the link predictor, we compute the inner product between latent variables to reconstruct edges, expressing the connection probability of two nodes of different types φ(z_u) and φ(z_v) as

$$p\big(A_{uv} = 1 \mid z_u, z_v\big) = \mathrm{sigmoid}\big(z_u^{\top} z_v\big),$$

where z_u^⊤ is the transpose of z_u. Topology representation learning aims to learn a suitable, superior data distribution and discover latent structure. Therefore, we can learn the interdependence and association of nodes u and v based on semantic association rules and calculate the score between a node pair with the unsupervised cross-entropy loss of the graph:

$$\mathcal{L}_{pred} = -\log \mathrm{sigmoid}\big(z_u^{\top} z_v\big) - \sum_{i=1}^{k} \mathbb{E}_{v'_i \sim P_n(v)} \log \mathrm{sigmoid}\big(-z_u^{\top} z_{v'_i}\big),$$

where k is the number of negative samples. To alleviate the topology perturbation, we use the KL divergence L_KL to constrain the distribution between generated and real samples. Specifically, we design a regularization term L_C as the soft supervision objective of the side semantic relationship.
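The reparameterization trick and the inner-product link predictor can be sketched in a few lines; the function names are illustrative and the encoder that produces `mu` and `log_sigma` is assumed to exist:

```python
import numpy as np

def reparameterize(mu, log_sigma, rng=None):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    Sampling noise from a fixed standard normal keeps the path from
    (mu, log_sigma) to z differentiable, which is the point of the trick.
    """
    rng = rng or np.random.default_rng()
    return mu + np.exp(log_sigma) * rng.normal(size=np.shape(mu))

def link_score(z_u, z_v):
    """Connection probability via the inner-product decoder: sigmoid(z_u . z_v)."""
    return 1.0 / (1.0 + np.exp(-np.dot(z_u, z_v)))
```

A pair with strongly aligned latent vectors scores near 1, while unrelated (near-orthogonal) pairs score near 0.5, which is what the negative-sampling loss pushes apart.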
The loss function is

$$\mathcal{L} = \mathcal{L}_{pred} + \gamma\, \mathcal{L}_{KL} + \eta\, \mathcal{L}_{C}, \qquad \mathcal{L}_{C} = \sum_{i} \Big\| z_i - \frac{1}{|V(u_i)|} \sum_{k \in V(u_i)} z_k \Big\|^2,$$

where γ and η are hyperparameters, p(Z_i) = Π_i P_n(z_i | 0, I) is the Gaussian prior, and (1/|V(u_i)|) Σ_{k∈V(u_i)} z_k is the predicted average of the neighbors V(u_i) of node u_i. For topology privacy-preserving learning, we protect the topology by perturbing the gradients of representation learning. We inject Gaussian noise into the training gradients, so the sensitivity is further defined as Δ₂S_topo = C following Eq. (4).
Then, for each training iteration, we calculate the gradient of the predictor g = ∇L by backpropagation, inject noise into the gradient after gradient clipping and before the gradient update, and finally perform gradient descent. The perturbed gradient is

$$\bar{g} = \frac{1}{|B|}\Big(\sum_{i \in B} \frac{g_i}{\max\big(1,\, \|g_i\|_2 / C\big)} + N_{topo}\big(0,\, \sigma_s^2\, (\Delta_2 S_{topo})^2\, I\big)\Big),$$

where B and |B| are the batch and its size for each training iteration, ∥g_i∥₂ is the l₂ norm used for gradient clipping, and N_topo(·) is Gaussian noise with mean 0 and standard deviation σ_s Δ₂S_topo. This distribution satisfies (ε_s, δ)-DP, where the privacy budget ε_s < ε. We control the noise sensitivity by limiting the norm bound C of the gradient. To adapt to the noise distribution in heterogeneous data, we utilize privacy accounting [39] to regulate the privacy budget of each iteration: for a constant c₂, sampling probability P, and number of training iterations T, we require

$$\sigma_s \ge \frac{c_2\, P \sqrt{T \log(1/\delta)}}{\varepsilon_s}.$$
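A minimal DP-SGD-style sketch of this clip-then-noise step follows; the helper name `dp_sgd_step` is illustrative, and the privacy accountant and optimizer update are omitted:

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm_C, sigma_s, rng=None):
    """Clip each per-example gradient to l2 norm C, average, and add
    Gaussian noise with standard deviation sigma_s * C (DP-SGD style)."""
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down only gradients whose norm exceeds C
        clipped.append(g / max(1.0, norm / clip_norm_C))
    mean_g = np.mean(clipped, axis=0)
    noise = rng.normal(scale=sigma_s * clip_norm_C,
                       size=mean_g.shape) / len(per_example_grads)
    return mean_g + noise
```

Clipping bounds each example's contribution (the sensitivity C), which is what lets the added Gaussian noise yield a per-iteration DP guarantee.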

Bi-level optimization of HeteSDG
In this section, we introduce a bi-level optimization mechanism [47] to achieve a two-stage privacy budget allocation under Eq. (5). The aim is to maximize the privacy-preserving effect while minimizing the information loss due to noisy inputs. The optimization is organized into two processes. For FeatADP optimization, we fix the privacy budget hyperparameter ε_s and find the optimal value of ε_f with {ε_f ∈ ℝ : 0 < ε_f < ε}; the approximate update is

$$y^{t+1} = y^{t} - \nabla_y f\big(x^{t}, y^{t}\big),$$

where y ∈ ε_f, x ∈ ε_s, t = 1, 2, …, T, and ∇ denotes a gradient-descent step. For TopoGDP optimization, we fix the privacy budget hyperparameter ε_f and find the optimal value of ε_s with {ε_s ∈ ℝ : 0 < ε_s < ε}; the parameter update is

$$x^{t+1} = x^{t} - \nabla_x g\big(x^{t}, y^{t+1}\big).$$
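A toy sketch of this alternating budget allocation, using finite-difference gradients and quadratic stand-in losses purely for illustration (the helper `bilevel_budget` and the loss shapes are assumptions, not the paper's objectives):

```python
def bilevel_budget(total_eps, loss_f, loss_g, steps=50, lr=0.01):
    """Alternately optimize the feature budget eps_f (with eps_s fixed) and the
    topology budget eps_s (with eps_f fixed), keeping eps_f + eps_s = total_eps.

    loss_f / loss_g: callables mapping a budget to a scalar loss; gradients
    are approximated by central finite differences for this illustration.
    """
    eps_f = total_eps / 2.0
    h = 1e-4
    for _ in range(steps):
        # Lower level: update eps_f with eps_s = total_eps - eps_f held fixed
        grad_f = (loss_f(eps_f + h) - loss_f(eps_f - h)) / (2 * h)
        eps_f = min(max(eps_f - lr * grad_f, 1e-6), total_eps - 1e-6)
        # Upper level: update eps_s, then re-derive eps_f from the constraint
        eps_s = total_eps - eps_f
        grad_g = (loss_g(eps_s + h) - loss_g(eps_s - h)) / (2 * h)
        eps_s = min(max(eps_s - lr * grad_g, 1e-6), total_eps - 1e-6)
        eps_f = total_eps - eps_s
    return eps_f, eps_s
```

With losses minimized at complementary budgets, the alternation converges to a split that satisfies the global constraint ε = ε_f + ε_s by construction.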

Privacy-preserving analysis of HeteSDG
In this section, we give a detailed privacy analysis and proof for HeteSDG in the following theorem. We combine the two-stage privacy protection serially, i.e., since the features are used for topology learning, feature learning must be performed first. In particular, the two-stage optimization is by nature a sequential process (bi-level optimization), which is consistent with our message-passing setup.

Theorem 1 The composed random function A, consisting of the feature mechanism A_feat satisfying (ε_f, δ)-DP followed by the topology mechanism A_topo satisfying (ε_s, δ)-DP, yields (ε, δ)-DP for the Gaussian mechanism with ε = ε_f + ε_s, where ε_f and ε_s denote the privacy budgets of the noise on node features and topology, respectively.

Proof Let D and D′ be any neighboring datasets. A_feat satisfies (ε_f, δ)-DP on D, and A_topo, applied to the already-privatized output of A_feat, satisfies (ε_s, δ)-DP. By the sequential composition property of differential privacy, the privacy budgets of sequentially executed mechanisms accumulate additively, which shows that the combined algorithm A satisfies (ε_f + ε_s, δ)-DP. □

Experimental setup
In this section, we conduct experiments on four datasets and two tasks to demonstrate the adaptability of heterogeneous privacy protection and the effectiveness of heterogeneous graph learning. The experimental results of HeteSDG are shown in Table 3, where the best accuracy is shown in bold and the best privacy-preserving results are underlined. Furthermore, "−" indicates that the corresponding model can hardly be implemented on that dataset. We then further analyze how HeteDP and HeteSDG are affected by changing the privacy-preserving strength, and the contribution of our optimization to overall model performance. The dataset statistics are listed in Table 2, where we mark the classified nodes and the predicted edges in bold. For example, in the downstream task of the ACM dataset, we perform node classification on "paper" and link prediction on "paper-author". These choices follow the majority of heterogeneous graph learning works.
Baselines We compare HeteDP and HeteSDG with state-of-the-art heterogeneous baseline methods in two categories: (1) meta-path-based GNNs, for which we select metapath2vec [19] and HetGNN [30], where HetGNN ignores node representation fusion since our nodes have no self-loop edges; and (2) convolution-based GNNs, for which we choose HGConv [33], HGT [12], and RGCN [31]. For the other parameters in feature representation learning, we set the training dropout to 0.8, the regularization coefficient to 0.001, the number of heads K of the multi-head attention mechanism to 8, and the hyperparameter λ to 0.01. Specifically, we define the meta-paths m for each node type from all possible walks, and the number of layers depends on the meta-paths m and the edge types R in the graph. In topology learning, we set the batch size |B| to 2048 and the number of negative samples k to 5. For the baseline models, the parameters are set to the default values in their papers. To sum up, the node classification and edge prediction categories for each dataset follow Table 2, and the dataset split setting follows VGAE [28].

Performance comparison
We set up two downstream tasks to test the performance of our proposed method: node classification (NC) and link prediction (LP). Table 3 summarizes the performance of HeteDP and HeteSDG on the two downstream tasks and four datasets compared with the baseline methods, reflecting both their inherent generalizability and their privacy-preserving effect. For the node classification task, we follow the practice of unsupervised node classification [46], using negative sampling of edges for training and 2-order neighbor sampling at each validation iteration, and we report the F1 score as the classification measure. For the link prediction task, we extend the sampler of [46] to negative sampling on heterogeneous graphs, drawing k negative pairs for each edge. Each training step randomly selects a fixed-size subset of the data to form a batch. The encoder of TopoGDP consists of heterogeneous convolutional layers following Eqs. (16) and (17), and the link predictor scores positive and negative sample pairs by their inner products. We use the ROC-AUC score to judge the performance of HeteSDG. The experimental results show that HeteDP and HeteSDG reduce the average score by 10.0% and 8.5%, respectively, at a privacy budget of ε = 1. This indicates the utility of the privacy preservation and shows that, with the semantic-aware mechanism, privacy noise can be designed more efficiently and with a lower loss of accuracy. Our framework also has the highest average accuracy ranking in the pure (noise-free) state, indicating that our privacy-preserving mechanism can be built on a state-of-the-art learning model, ensuring solid base performance under perturbation.
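The LP evaluation protocol can be sketched as follows: inner-product scoring of node-pair embeddings, k random negative pairs per positive edge, and ROC-AUC over the pooled scores. The embeddings and pairs below are synthetic toys, and all names are hypothetical, not the authors' code.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def score_pairs(z, pairs):
    # Inner-product link predictor: score(u, v) = <z_u, z_v>.
    return np.einsum("ij,ij->i", z[pairs[:, 0]], z[pairs[:, 1]])

def evaluate_lp(z, pos_pairs, num_nodes, k=5):
    """ROC-AUC of positive edges vs. k random negative pairs per edge."""
    neg_pairs = rng.integers(0, num_nodes, size=(len(pos_pairs) * k, 2))
    scores = np.concatenate([score_pairs(z, pos_pairs), score_pairs(z, neg_pairs)])
    labels = np.concatenate([np.ones(len(pos_pairs)), np.zeros(len(neg_pairs))])
    return roc_auc_score(labels, scores)

# Toy embeddings: linked nodes share a common latent direction.
base = rng.normal(size=16)
z = rng.normal(size=(100, 16))
z[:50] += base  # a "community" whose members link to each other
pos_pairs = rng.integers(0, 50, size=(200, 2))  # positive pairs inside the community
auc = evaluate_lp(z, pos_pairs, num_nodes=100)
```

Because negative pairs are drawn uniformly, some coincide with plausible links; in practice a negative sampler typically filters out observed edges.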
Overall, on the LP task for DBLP and IMDB, our proposed HeteDP (pure) improves performance over the runner-up model by 21.62% and 7.75%, respectively. We likewise observe that each dataset's sensitivity to noise leads to different levels of influence: model accuracy improves by different magnitudes as the privacy budget increases, e.g., the ROC-AUC score on Amazon is reduced by only 0.28% with ε = 1. This shows that the generalization ability of our model is preserved to a large extent and that the model maintains data utility under the influence of noise. Similarly, the ACM dataset shows an accuracy reduction of about 14% on the LP task at a privacy-preserving strength of ε = 0.01. This shows that our privacy-preserving method can weaken MIS enhanced by semantic relations (graph topology).

Further analysis
In this section, we conduct an ablation study, a bi-level optimization analysis, a sensitivity analysis, and MIS verification. Some results are demonstrated with HeteDP because the corresponding validation for HeteSDG takes a similar form, so we do not present those results again.

Ablation study
We further conduct ablation experiments to assess the necessity of the FeatADP and TopoGDP privacy-preserving mechanisms. We design four experiments on the LP task for comparison: the first, "w/o TopoGDP" (feature perturbed), only protects the features of the various node types via Eq. (15) in FeatADP, whose output is then fed to TopoGDP; the second, "w/o FeatADP" (topology perturbed), obtains features by aggregating neighbor and semantic information and protects the semantic relationships during topology representation learning via Eq. (23); the third is the HeteDP version, in which both the node features and the topology are protected; finally, the semantic-aware mechanism is added to obtain HeteSDG. The privacy budget is ε = 0.01 and the results are shown in Fig. 3. We observe that both perturbations are significant, and that the topology perturbation has a greater impact on the model than the node feature perturbation. Training eventually converges and the model maintains a degree of utility. Moreover, the accuracy of HeteSDG improves faster than that of HeteDP and is higher at convergence, which indicates that HeteSDG has better availability and stronger adaptive capability thanks to the semantic-aware mechanism.
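Both perturbation stages rest on Gaussian noise calibrated by a privacy budget. The sketch below shows the classic (ε, δ) Gaussian mechanism with row-wise norm clipping to bound sensitivity; it is a generic stand-in, not the paper's Eqs. (15) and (23), which additionally involve attention weights and gradients.

```python
import numpy as np

rng = np.random.default_rng(0)

def clip_rows(x, c):
    # Bound each row's L2 norm by c so the per-row sensitivity is at most c.
    scale = np.maximum(np.linalg.norm(x, axis=1, keepdims=True) / c, 1.0)
    return x / scale

def gaussian_mechanism(x, sensitivity, eps, delta=1e-5):
    # Classic Gaussian mechanism: sigma = sqrt(2 ln(1.25/delta)) * sensitivity / eps.
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * sensitivity / eps
    return x + rng.normal(0.0, sigma, size=x.shape)

features = rng.normal(size=(8, 4))
clipped = clip_rows(features, c=1.0)
noisy = gaussian_mechanism(clipped, sensitivity=1.0, eps=1.0)
```

Note how the noise scale grows as ε shrinks, which is why the ε = 0.01 setting above stresses the model far more than ε = 1.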
Furthermore, we visualize node types to observe the utility of privacy protection in Fig. 4, which shows a two-dimensional t-SNE [48] embedding of all nodes in ACM, with colors indicating node types. We compare the pure model, feature perturbed, topology perturbed, HeteDP, and HeteSDG, with a privacy budget of ε = 3. From left to right, the visualizations generally show increasingly tight clustering among similar nodes. Under feature perturbation, we observe that the spacing within the "paper" node type becomes closer, which affects classification within that type. Topology perturbation yields clearer boundaries among the different node types and causes large changes in the positions of individual nodes even at a higher ε; it has a lower impact within each node type, while the different node types become more dispersed. Compared with the pure model, this indicates that our method can distinguish different types of nodes and perturb similar nodes, achieving privacy while preserving utility. Finally, compared with HeteDP, HeteSDG provides semantic information to FeatADP to make nodes of the same type harder to distinguish, and adds a semantic regularization term to TopoGDP to make the boundaries between node types clearer. Since the ablation study shows that topology perturbation has the greater impact on the model, the semantic regularization term has the greater effect, and the overall model accuracy is improved.
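The visualization pipeline can be reproduced along these lines. The embeddings, cluster centers, and type labels below are made up for illustration; a scatter plot colored by `types` would complete the figure.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Toy node embeddings for three node types, e.g. "paper", "author", "subject".
emb = np.vstack([rng.normal(loc=c, size=(30, 16)) for c in (0.0, 3.0, 6.0)])
types = np.repeat(["paper", "author", "subject"], 30)

# Project to 2-D with t-SNE; coloring by `types` reveals type-level clustering.
xy = TSNE(n_components=2, perplexity=10.0, init="pca",
          random_state=0).fit_transform(emb)
```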
Bi-level optimization Privacy budget allocation is an essential task in privacy protection. The aim is to reduce the probability of data being accessed by attackers while weighing training accuracy, addressing the model performance degradation caused by privacy noise. To further improve the utility of the model under privacy preservation, we design a bi-level optimization trick to allocate the privacy budget of the Gaussian noise between FeatADP and TopoGDP. We first fix the topology noise and experimentally seek the optimal privacy budget for the node features within a specific interval according to Eqs. (5) and (24). Next, we fix the amount of noise on the features and find the optimal privacy budget for the topology following Eqs. (5) and (25). The results are shown in Fig. 5, which compares an equally divided privacy budget with the bi-level optimized one and shows that bi-level optimization brings better model performance, achieving the desired trade-off between protection strength and utility.
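The allocation step can be illustrated with a simplified one-level grid search over the split of a fixed total budget, standing in for the full bi-level scheme. The `utility` surface below is entirely synthetic; it only mimics the qualitative finding that topology perturbation matters more than feature perturbation.

```python
import numpy as np

def utility(eps_feat, eps_topo):
    # Synthetic accuracy surface: both budgets help, topology noise hurts more.
    # In the real pipeline this would be a training-and-evaluation run.
    return 0.6 + 0.2 * np.tanh(eps_feat) + 0.3 * np.tanh(eps_topo)

def allocate_budget(eps_total, utility, grid):
    """Grid-search the feature/topology split of a fixed total privacy budget."""
    best_r = max(grid, key=lambda r: utility(r * eps_total, (1.0 - r) * eps_total))
    return best_r * eps_total, (1.0 - best_r) * eps_total

grid = np.linspace(0.1, 0.9, 17)       # candidate fractions for the feature budget
eps_feat, eps_topo = allocate_budget(1.0, utility, grid)
```

On this toy surface the search assigns the larger share to the topology stage, matching the intuition that the more performance-sensitive mechanism deserves the larger budget (i.e., less noise).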

Sensitivity analysis
We analyze the sensitivity of HeteDP and HeteSDG to the privacy budget parameter ε on different datasets. Specifically, we test the influence of ε on the LP task, setting five values of ε on ACM, IMDB and DBLP, as shown in Fig. 5. We observe that HeteSDG on the ACM dataset achieves a score close to 80% at ε = 1, nearly 5% higher than at ε = 0.01. Overall, model performance trends upward as the privacy budget increases, and the semantic-aware mechanism introduced in HeteSDG further improves performance. On the IMDB dataset, however, performance peaks around a privacy budget of 0.5 and decreases as the budget increases further, because its network structure is fragile and its learning ability is harder to restore once disturbed. These experiments show that different datasets vary greatly in their sensitivity to the privacy budget due to their different natural data distributions, so it is necessary to analyze the privacy parameters to find a level of protection that balances privacy and data utility.

MIS verification
In the real world, the membership inference attack with semantic enhancement (MIS) models used by attackers are diverse and unknown. MIS differs from traditional membership inference attacks (MIA) in whether semantic information is obtained to enhance the inference: MIS infers the membership of a dataset by analyzing the semantic similarity and correlation of the target model's output. The datasets we use do not contain links between nodes of the same type, but virtual links can be established between them based on the topological structure and semantic information (meta-paths), so that such nodes can be regarded as high-order neighbors of each other, which addresses the data heterogeneity. We use a shadow model with a structure similar to the target model to generate the training set of the attack model, and query the target model to obtain the attack model's test set [49]. Specifically, we provide semantic information to the shadow model based on the output of the target model. To make the attack experiments more representative, we use three widely familiar classical inference models as attack models: naive Bayes, KNN, and decision trees. We select the convolution-based aggregation methods from the baselines (HGConv, HGT, RGCN) as comparison models, i.e., meta-path-based methods are not considered in the design of the attack models, since semantic enhancement is an independent process from model training. Table 4 shows the attack accuracy of each attacker when inferring the "paper" node type on ACM and the "movie" node type on IMDB. The results show that the attack models obtain more information from the baseline models but achieve poorer inference results against our model.
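A shadow-model attack of this kind can be sketched as follows. The "shadow outputs" are simulated by making member posteriors sharper than non-member ones, which is the overconfidence signal membership inference exploits; everything below is synthetic and only illustrates the attack-model training loop.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def shadow_outputs(n, member):
    # Simulated posteriors: members yield sharper (overconfident) predictions.
    return softmax(rng.normal(size=(n, 4)) * (4.0 if member else 1.0))

# Attack training set built from the shadow model (label 1 = member) ...
X_tr = np.vstack([shadow_outputs(200, True), shadow_outputs(200, False)])
y_tr = np.repeat([1, 0], 200)
# ... and an attack test set built from queries to the target model.
X_te = np.vstack([shadow_outputs(100, True), shadow_outputs(100, False)])
y_te = np.repeat([1, 0], 100)

attackers = {
    "naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "decision tree": DecisionTreeClassifier(random_state=0),
}
attack_acc = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
              for name, clf in attackers.items()}
```

Differential privacy defends against exactly this signal: the injected noise flattens the member/non-member confidence gap, driving the attackers' accuracy toward chance.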
In particular, among the two privacy-protected models HeteDP and HeteSDG, HeteSDG performs best in terms of model performance at ε = 1 while being the hardest to attack. In other words, at higher privacy budgets, HeteSDG can resist MIS while maintaining utility close to the baseline models. When the privacy budget is low, the heavy perturbation limits the model's capability, resulting in low data availability but strong privacy protection. This indicates that our model combines excellent utility with the ability to resist attacks. Since judging models solely by their final accuracy does not give a global view of performance, we further analyze the models from a new perspective using the Precision-Recall curve, as shown in Fig. 7. For instance, at ε = 1, the curve of HeteSDG covers that of HeteDP, indicating that HeteSDG performs better overall under the same privacy requirement and further demonstrating the effectiveness of the semantic-aware mechanism.
In addition, we show the "paper" embedding visualization for ACM in Fig. 8, which supports the same conclusion. An additional observation is how strongly the privacy budget perturbs representation learning. In general, HeteSDG can guarantee privacy while ensuring accuracy for downstream tasks: with ε = 1, the classification accuracy for "paper" is 88.31%, higher than HGT and RGCN, and the classification visualization is well bounded, indicating availability, while the attack accuracy decreases by at least 40.7%, showing the strength of the privacy protection.

Conclusion
In this work, we propose HeteSDG, a novel heterogeneous graph neural network with semantic-aware differential privacy guarantees, i.e., a two-stage privacy-preserving mechanism based on differential privacy that adapts to the high-order features and topological complexity induced by semantics. For multiple node and relationship types, we learn the representation distribution and aggregation of nodes on each relationship through multi-relational convolutional layers and adapt to various downstream tasks through unsupervised learning. Considering that nodes and topology are vulnerable to MIS in heterogeneous graph scenarios, we perturb the node features and the topology structure, respectively. In particular, we design a unique semantic-aware debias mechanism to guide more accurate noise generation and enhance the utility of the privacy-preserving model. We then balance the privacy budget allocation between the node features and the topology structure, achieving higher performance through bi-level optimization. Comprehensive experiments on four datasets demonstrate the privacy-preserving capability and adaptability of HeteSDG, and the MIS experiments with three basic attack models show that our model resists MIS with guaranteed accuracy utility. We hope that our work can bring some inspiration to privacy protection for more complex graph data.

Yuecen Wei is currently an M.S. candidate at the School of Computer Science and Engineering, Guangxi Normal University, Guilin, China. Her research interests include graph representation learning, privacy protection, and social network analysis. She has published one paper at ICDM.