Beyond model splitting: Preventing label inference attacks in vertical federated learning with dispersed training

Federated learning is an emerging paradigm that enables multiple organizations to jointly train a model without revealing their private data. As an important variant, vertical federated learning (VFL) deals with cases in which collaborating organizations own data of the same set of users but with disjoint features. VFL is generally regarded as more secure than horizontal federated learning. However, recent research (USENIX Security’22) reveals that label inference attacks are still possible in VFL, allowing an attacker to acquire the privately owned labels of other participants; even VFL constructed with model splitting (the VFL architecture with the higher security guarantee) cannot escape them. To solve this issue, we propose the dispersed training framework. It utilizes secret sharing to break the correlations between the bottom model and the training data. Accordingly, even if the attacker receives the gradients in the training phase, he is unable to deduce the feature representation of labels from the bottom model. In addition, we design a customized model aggregation method so that the shared model can be privately combined, and the linearity of secret sharing schemes ensures that the training accuracy is preserved. Theoretical and experimental analyses indicate the satisfactory performance and effectiveness of our framework.


Introduction
Machine learning has gained great success in numerous fields, such as decision-making, risk identification, and disease diagnosis. The widespread adoption of machine learning and its
In this paper, we propose the dispersed training framework to combat such attacks and secure VFL. The basic idea of dispersed training is to utilize secret sharing to break the correlations between the gradients and the training data. As shown in Figure 1, participant B holds his own labeled data and hopes to train a model; he wants to use participant A's data to improve the quality of his model. Participant A is the potential attacker and intends to infer participant B's labels. In the dispersed training framework, a shadow model (i.e., participant C) is created for participant A, and part of the data of participant B is shared to participant C. In the training phase, the clients (i.e., participants A and C) update their bottom models with their shared data and upload their partial outputs to the server. The server aggregates the clients' partial outputs and trains its top model; the outputs can be aggregated efficiently because of the linearity of secret sharing schemes. The server's training output is also split into two segments, which are delivered to A and C, respectively; A and C then iteratively train and update their bottom models. With this method, even if the attacker receives the gradients in the training phase, he is unable to deduce the feature representation of labels from the bottom model.
The rest of the paper is organized as follows. Section 2 presents the basic background of federated learning and describes how label inference attacks are conducted on vertical federated learning. We present our dispersed training framework and its construction in Section 3. The performance evaluation is presented in Section 4. Related work on this topic is presented in Section 5, and we conclude the paper in Section 6.

Federated learning
Compared with conventional centralized learning, federated learning provides a collaborative method for training models on distributed data while protecting data privacy [7]. According to the characteristics of the data, federated learning is mainly divided into three categories: horizontal federated learning, vertical federated learning, and transfer federated learning [8].
Horizontal federated learning is suitable when the data features of the participants overlap heavily but the sample IDs overlap little. Its purpose is to train a more accurate prediction model by combining more samples with the same features. For example, two e-commerce platforms in different regions each hold purchase records of users at the same consumption level in their respective regions. The two platforms can train a model through horizontal federated learning and use it to recommend products to users of this consumption level. Vertical federated learning is suitable when the sample IDs overlap heavily but the data features overlap little. Its purpose is to train a model by combining more features of the same samples. For example, consider a bank and a lending company in the same area: the bank has data on the economic status of a group of people and whether they owe money, while the lending company has other asset information about the same group of people. By cooperating, the lending company and the bank can carry out vertical federated learning on their data and train an accurate model to predict risk and decide whether to grant a loan to someone [8]. Transfer federated learning is more suitable when both the data features and the samples of the participants overlap little, or when the data distributions of the participants differ greatly. For example, hospitals and lending companies in different regions want to train a prediction model, but the two parties have little overlap in either data features or samples. In this case, transfer federated learning can be used to train the model [9].
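The difference between the horizontal and vertical settings can be illustrated with a toy table. The following is only a sketch: the user IDs, feature names, and the helper functions horizontal_split and vertical_split are hypothetical and are not part of any federated learning library.

```python
from typing import Dict, List

# One row per user ID, one column per feature (hypothetical toy data).
Table = Dict[str, Dict[str, float]]


def horizontal_split(table: Table, ids: List[str]) -> Table:
    """Horizontal setting: a party holds ALL features of SOME users."""
    return {uid: dict(table[uid]) for uid in ids}


def vertical_split(table: Table, features: List[str]) -> Table:
    """Vertical setting: a party holds SOME features of ALL users."""
    return {uid: {f: row[f] for f in features} for uid, row in table.items()}


table: Table = {
    "u1": {"income": 50.0, "debt": 5.0, "assets": 120.0},
    "u2": {"income": 32.0, "debt": 9.0, "assets": 40.0},
    "u3": {"income": 75.0, "debt": 0.0, "assets": 300.0},
}

# Horizontal: same feature space, disjoint users.
hfl_a = horizontal_split(table, ["u1", "u2"])
hfl_b = horizontal_split(table, ["u3"])

# Vertical: same users, disjoint features (e.g. bank vs. lending company).
vfl_bank = vertical_split(table, ["income", "debt"])
vfl_lender = vertical_split(table, ["assets"])
```

In the vertical case both parties index the same users, which is why a label held by one party relates to features held by the other.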

Label inference attacks in VFL
The label inference attack on vertical federated learning was proposed by Fu et al. [6], who revealed that vertical federated learning can leak private labels. Fu et al. [6] designed three attack methods for inferring private labels, which enable malicious participants to infer the labels of any participant's data, resulting in serious privacy leakage.
The passive label inference attack targets vertical federated learning with model splitting. Although no participant can access the top model, the trained bottom model can be used for inference attacks [10]. The principle is that, during training, the adversary's bottom model learns to convert its features into information that is indicative of the labels, and this information can then be used to make predictions. The adversary takes part in vertical federated learning to train a bottom model that outputs such indicative features, then adds a newly initialized layer on top of the trained bottom model to form a complete model, and fine-tunes this attack model by semi-supervised learning with a small amount of labeled data. With this model, the labels can be inferred directly.
In the active label inference attack, the adversary accelerates the gradient descent of its bottom model so that the bottom model is trained better in each iteration. The bottom model then provides better features to the server, which makes the server's top model more dependent on it. The adversary then fine-tunes this bottom model to form a complete attack model that can perform inference attacks on labels. Fu et al. [6] designed a malicious local optimization algorithm that adds to gradient descent a mechanism for limiting the learning rate, which keeps the learning rate appropriate while accelerating gradient descent, so that the malicious bottom model is trained better.
The direct label inference attack targets vertical federated learning without model splitting. In this setting, the adversary receives the gradient leaked from the server. Fu et al. [6] proved by mathematical analysis that the adversary can infer the label directly from the sign of the leaked gradient. When the gradient sign for a guessed label is negative, the guessed label is exactly the actual label. When the gradient sign is positive, the guessed label is incorrect. This can cause serious label leakage.
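The sign argument can be checked numerically. The sketch below assumes a softmax output trained with cross-entropy loss, for which the gradient of the loss with respect to the logits is softmax(z) minus the one-hot label; the function names and the example logits are illustrative and are not taken from [6].

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_grad(logits: List[float], label: int) -> List[float]:
    """Gradient of the cross-entropy loss w.r.t. the logits: p - one_hot(label)."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


def infer_label(grad: List[float]) -> Optional[int]:
    """Return the index of the single negative gradient entry, if there is one."""
    negatives = [i for i, g in enumerate(grad) if g < 0]
    return negatives[0] if len(negatives) == 1 else None


logits = [2.0, -1.0, 0.5, 0.1]  # arbitrary example logits
true_label = 2
leaked = cross_entropy_grad(logits, true_label)
recovered = infer_label(leaked)  # equals true_label
```

Every softmax probability lies strictly between 0 and 1, so only the entry of the true class becomes negative after the one-hot label is subtracted. This is why the sign alone identifies the label under these assumptions.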
In summary, all three attacks result in serious privacy breaches. The passive label inference attack works because the bottom model acquires good inference ability during training; the adversary then fine-tunes it by semi-supervised learning to form an attack model. The active label inference attack builds on the passive attack: it uses a malicious SGD optimizer to accelerate gradient descent, so that the malicious bottom model is trained better and has stronger inference capability. The direct label inference attack, on the other hand, is largely unrelated to the first two; it is caused directly by the label information contained in the gradient leaked from the server. According to previous work [6], common privacy protection methods cannot defend against passive and active label inference attacks. Therefore, the model proposed in this paper mainly addresses these first two attacks.

The training model
Before describing the training model in detail, we present it in brief. First, a shadow model c is generated, which has the same structure as model a but is of the benign type. The difference between the malicious and benign types is that, during training, the benign model uses SGD for gradient descent, while the malicious model uses its own local optimizer to accelerate gradient descent. Then the malicious model shares its training dataset with the shadow model through secret sharing, and both perform local training on their bottom models. Next, the outputs of the bottom models are merged and aggregated, and the result is uploaded to the top model. Finally, the top model assigns the gradients to the bottom models. Note that the shadow model should share the gradient with the malicious model. The above process is repeated until the model converges.
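The role of linearity in the merging step can be sketched as follows. The example assumes plain additive sharing over floating-point numbers and a single linear layer whose weights are identical for models a and c. Both are simplifications made for illustration: the exact sharing scheme is not fixed here, and real bottom models are nonlinear. The names share and linear are hypothetical.

```python
import random
from typing import List, Tuple


def share(x: List[float], rng: random.Random) -> Tuple[List[float], List[float]]:
    """Split x into two additive shares x_a and x_c with x_a + x_c == x.

    A float mask is used only for illustration; practical schemes usually
    work over a finite ring or field.
    """
    x_a = [rng.uniform(-1.0, 1.0) for _ in x]
    x_c = [xi - ai for xi, ai in zip(x, x_a)]
    return x_a, x_c


def linear(weights: List[List[float]], x: List[float]) -> List[float]:
    """Linear layer without bias: returns weights @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


rng = random.Random(0)
x = [0.5, -1.2, 3.0]                         # one training sample (toy values)
W = [[0.1, 0.2, -0.3], [0.4, 0.0, 0.7]]      # assumed shared linear weights

x_a, x_c = share(x, rng)                     # shares held by models a and c
out_a, out_c = linear(W, x_a), linear(W, x_c)

# Merging: adding the partial outputs reproduces the output on the full sample.
merged = [a + c for a, c in zip(out_a, out_c)]
direct = linear(W, x)
```

Neither share alone equals the original sample, yet the merged output matches the direct computation up to floating-point rounding.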
Algorithm 1 is the pseudo-code of the training model. More details are as follows.
(6) Sharing the gradient and updating the model: the gradient is calculated and then passed down to the bottom models. In the previous work [6], the top model passes gradients down to models a and b, respectively. In this paper, model b receives its gradient in the same way as in the previous work [6]. However, passing the gradient down to model a is different because model c is involved. In particular, the gradient g_a of model a is generated first. Then a new g_a and g_c are obtained by the secret sharing function SS() and are passed to models a and c. This function is implemented by the merging layer.
(7) Repeat (1)-(6) for the bottom models and the top model until convergence.
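One possible reading of the SS() step is an additive split of the gradient into two shares. The sketch below makes that assumption explicit; the function name ss, the Gaussian mask, and the example gradient values are illustrative choices and are not taken from Algorithm 1.

```python
import random
from typing import List, Tuple


def ss(gradient: List[float], rng: random.Random) -> Tuple[List[float], List[float]]:
    """Split the gradient of model a into two additive shares.

    Assumption: g_c is a random mask and the new g_a is the remainder,
    so that new_g_a + g_c equals the original gradient.
    """
    g_c = [rng.gauss(0.0, 1.0) for _ in gradient]
    new_g_a = [g - r for g, r in zip(gradient, g_c)]
    return new_g_a, g_c


rng = random.Random(42)
g_a = [0.12, -0.40, 0.03, 0.75]      # gradient produced for model a (toy values)
new_g_a, g_c = ss(g_a, rng)          # shares passed to models a and c

# Only the sum of both shares reconstructs the original gradient.
reconstructed = [a + c for a, c in zip(new_g_a, g_c)]
```

Under this reading, model a alone sees a masked gradient, which is the intuition behind breaking the correlation between the gradients and the labels.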

The attack model
In the attack model, we still use the model completion technique proposed by Fu et al. [6]. After the federated training process is complete, we obtain the trained bottom model, which may itself have a strong capability for label inference attacks. As in Fu et al. [6], we retrain the bottom model with a small amount of labeled data. More specifically, the malicious attacker adds an extra layer on top of the bottom model and continues training with semi-supervised learning. The newly trained model is used as the attack model to execute the label inference attack.

Experiments settings
All experiments are performed on an Intel(R) Core i5-12500H @ 2.50GHz with 16GB RAM and an NVIDIA GeForce RTX 2050 card (Table 1). The three datasets used in the experiments are CIFAR-10, CINIC-10, and BCW, which are also the datasets used by previous work [6].
In this paper, Top-1 accuracy is selected as the performance indicator for both the original federated task and the label inference attack. Top-1 accuracy takes the class with the largest value in the final probability vector as the prediction result. If this prediction result is the actual label, the prediction is correct; otherwise, it is wrong.
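The metric can be written down in a few lines. This is a minimal sketch with made-up probability vectors; top1_accuracy is a hypothetical helper name.

```python
from typing import Sequence


def top1_accuracy(probabilities: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of samples whose highest-probability class equals the true label."""
    correct = 0
    for probs, label in zip(probabilities, labels):
        predicted = max(range(len(probs)), key=lambda i: probs[i])  # argmax
        correct += int(predicted == label)
    return correct / len(labels)


# Four toy samples over three classes; the third sample is misclassified.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.2, 0.2, 0.6],
]
labels = [0, 2, 0, 2]
accuracy = top1_accuracy(probs, labels)  # 3 of 4 predictions are correct
```

For a balanced ten-class dataset, an attack model that guesses uniformly at random would be expected to score about 10% on this metric.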
To train the attack model, we select a small additional amount of labeled data for semi-supervised training of the bottom model. In the attack model experiments, 40 labeled samples are selected for CIFAR-10 and for CINIC-10, and 20 labeled samples are selected for the BCW dataset. As mentioned in the previous work [6], the number of labeled samples affects the effectiveness of the label inference attack, but once the number of samples reaches a certain level, the attack performance grows only slowly. Therefore, as in the previous work [6], 40 and 20 labeled samples are used in this experiment.

CIFAR-10: CIFAR-10 is a typical classification dataset that contains 60,000 images, of which 50,000 are used as the training set and 10,000 as the test set. Each image is a 32×32 color image, and the dataset covers 10 different categories.
CINIC-10: CINIC-10 extends CIFAR-10 with subsampled ImageNet images and, like CIFAR-10, is divided into 10 categories. It was created to address the small size of CIFAR-10. It contains 270,000 images, which is 4.5 times as many as CIFAR-10. The images are evenly divided into three subsets: a training set, a validation set, and a test set. Each subset has 90,000 images, so the training set is 1.8 times the size of the CIFAR-10 training set and the test set is 9 times the size of the CIFAR-10 test set. With this dataset, we can test the effect of our scheme on large datasets.
BCW: The Breast Cancer Wisconsin (BCW) dataset is a breast cancer dataset with a total of 569 samples and 32 feature columns, which mainly describe cell nucleus features. The label of each sample is whether the diagnosis is benign or malignant. The split used in the experiment is the one used by Fu et al. [6], which randomly selects 426 samples as the training set and uses the remaining 143 samples as the test set.
To allow a comparison with Fu et al. [6], we use the same top and bottom model structures as Fu et al. [6]. For large datasets such as CIFAR-10 and CINIC-10, the bottom model is a residual network and the top model is a fully connected neural network. For the BCW dataset, both the bottom and top models are fully connected neural networks. The specific structures are shown in Table 2.

Comparison with original attack
The performance of the model after dispersed training against a passive label inference attack is shown in Figure 2. The Top-1 accuracy of the attack on the three datasets shows that, compared with the label inference attack proposed by Fu et al. [6], the attack effect has dropped significantly, especially on the CIFAR-10 dataset, where it drops by about 70%. After dispersed training, the Top-1 accuracy rates of attacks on the CIFAR-10, CINIC-10, and BCW datasets are 9.99%, 10.02%, and 36.36%, respectively. For the CIFAR-10 and CINIC-10 datasets, the Top-1 accuracy rate of the attack is reduced to about 10%. These two datasets each have ten categories, so an attack Top-1 accuracy of 10% is equivalent to random guessing among the ten categories. The results on both CIFAR-10 and the larger CINIC-10 are around 10%, indicating that our scheme also has a good defense effect on large datasets. Overall, the dispersed training proposed in this paper can effectively prevent passive label inference attacks.
In addition, this paper also evaluates active label inference attacks. As can be seen from Figure 3, the model after dispersed training is also effective against active label inference attacks. On the CIFAR-10 dataset, the attack Top-1 accuracy rate dropped from 84.84% to about 10%, and the attack Top-1 accuracy rate on the other two datasets also dropped significantly.
Table 2 Neural network structure of the bottom model and top model
Dataset     Bottom model structure     Top model structure
CIFAR-10    ResNet-18                  FCNN-4
CINIC-10    ResNet-18                  FCNN-4
BCW         FCNN-3                     FCNN-3
Overall, our scheme can reduce the accuracy of label inference attacks to the level of random guessing, which shows that dispersed training can effectively mitigate label inference attacks.
Table 3 shows the accuracy of the original federated tasks for each dataset after dispersed training. Compared with the original federated learning, the federated accuracy decreases after dispersed training. It drops by 4%, 15%, and 18% on the BCW, CIFAR-10, and CINIC-10 datasets, respectively. The accuracy on the CIFAR-10 and CINIC-10 datasets drops the most, which we attribute to the large size of these datasets. Although label inference attacks can be effectively prevented after dispersed training, the federated accuracy also decreases to a certain extent, especially on large datasets. Therefore, implementing dispersed training requires a trade-off between defending against label inference attacks and preserving federated training accuracy.

Comparison with gradient compression
Compared with common privacy-preserving methods for machine learning, our scheme reduces the performance of both active and passive label inference attacks to the level of random guessing without degrading the performance of the original federated task too much. For example, when noise is added to the gradient, the performance of the original federated learning task decreases from 0.8 to about 0.1, which amounts to a failure of the federated task. Take gradient compression, another privacy-preserving method, as an example. As shown in Table 4, the accuracy of the active label inference attack decreases from 0.8484 to 0.64 after gradient compression at a compression rate of 0.9, while our scheme reduces it to about 0.10, the level of random guessing. In both cases, the performance of the original federated task drops to roughly 0.7.
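For readers unfamiliar with gradient compression, the sketch below shows one common variant, magnitude-based sparsification, in which only the largest-magnitude gradient entries are transmitted. It is an assumption that the baseline in Table 4 works this way, and the mapping between the hypothetical parameter keep_ratio and the reported compression rate of 0.9 is not specified here.

```python
from typing import List


def compress_gradient(gradient: List[float], keep_ratio: float) -> List[float]:
    """Keep the keep_ratio fraction of entries with the largest magnitude; zero the rest."""
    k = max(1, round(len(gradient) * keep_ratio))
    ranked = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    keep = set(ranked[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(gradient)]


grad = [0.05, -0.9, 0.3, -0.01, 0.6, 0.02, -0.2, 0.001, 0.4, -0.07]  # toy gradient
sparse = compress_gradient(grad, keep_ratio=0.3)
# Only the three largest-magnitude entries (-0.9, 0.6, 0.4) survive.
```

The surviving entries still carry their original signs and values, which is consistent with the observation above that compression only partially reduces the attack accuracy.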

Output distribution of malicious model
We use t-SNE [11] to map the output of bottom model A into 2D space. As shown in Figure 4, the points of different colors (classes) are not clearly separated into clusters. Because of our dispersed training, bottom model A cannot learn the relationship between labels and features and therefore has a poor ability to perform label inference attacks. As a result, the attack model formed after model completion cannot perform label inference attacks effectively.

Related work
Figure 4 The outputs of attack model a
At present, data security is a thorny issue that covers, for example, the malicious recovery of image data, the anonymity of network data transmission, privacy protection of distributed systems, and data privacy protection in big data environments [12]. Various solutions have been proposed to solve these kinds of problems [13][14][15][16][17][18][19]. Zhang et al. [20] proposed an interesting approach for optimizing multicast traffic based on the advantages of software-defined networking. Among these topics, machine learning, especially federated learning, is a hot topic. Federated learning (FL) was first proposed by Google [21][22][23], aiming at building machine learning models with distributed entities (e.g., devices or datasets), where the private information of the entities should be protected [13]. In general, FL has three categories: Horizontal Federated Learning [23][24][25], Vertical Federated Learning [26][27][28][29][30], and Federated Transfer Learning [31]. Federated Reinforcement Learning [32] has also recently emerged. Security issues [33][34][35][36][37][38][39], especially in vertical federated learning, have aroused wide concern [40,41]. Wei et al. [42] investigate the issues of security and privacy in VFL. Fu et al. [6] discuss the problem of label leakage instead of membership inference or sample property inference [43][44][45][46]. Rassouli et al. [47] prove that it is possible for the adversary to reconstruct the passive party's features in the black-box setting. To protect privacy in VFL, Zhu et al. [48] propose the secure framework PIVODL, and Han et al. [49] propose FedValue, which uses Shapley-CMI and guarantees data privacy from the viewpoint of game theory.
In general, differential privacy (DP) [50][51][52][53] and homomorphic encryption (HE) [54][55][56][57] are used to protect the privacy of data in VFL. Geyer et al. [58] guarantee the privacy of the users in the training process. Yuan et al. [59,60] utilize HE to train on data in the cloud. However, DP and HE do not work in our setting. First, if we only add some random salts to the training data, the training and attack processes are almost identical to those of the previous work [6]. Therefore, we add a convergence layer, where the outputs and the shared gradients converge. HE also fails in our setting, because the bottom model can still be trained well even if the messages between the bottom and top models are encrypted with HE. Consequently, the attack can still be implemented by adding one layer. Recently, secure multi-party computation (SMC) [61][62][63][64] has been used to solve the privacy issues in VFL. For example, SecureML, an SMC framework, is used for scalable privacy-preserving machine learning. SMC can preserve the privacy of sensitive data [65]. Mohassel et al. [66] propose a 3PC model that utilizes secret sharing with non-colluding servers. Personalized federated learning [67][68][69] is used to address data heterogeneity in federated learning. Secret sharing [70] preserves the information with respect to the intersection elements.

Conclusion
Previous work revealed that VFL also carries serious privacy risks. Malicious participants in VFL can launch inference attacks on the labels of other participants, resulting in serious privacy leakage. To solve this problem, we propose a dispersed training framework that introduces a new bottom model. Through secret sharing, this model takes over part of the gradient during the training of the malicious bottom model, so that the malicious bottom model cannot properly learn the relationship between labels and features, which prevents label inference attacks. Experiments show that dispersed training can effectively prevent label inference attacks. However, the accuracy of the original federated task is also affected to a certain extent, so the framework currently has to trade off original federated task accuracy against attack accuracy. This provides a good direction for our future research: reducing attack accuracy while also preserving the performance of the original federated task.