A Comprehensive Social Matrix Factorization with Social Regularization for Recommendation Based on Implicit Similarity by Fusing Trust Relationships and Social Tags

Social relationships play an important role in improving the quality of recommender systems (RSs). A large number of experimental results show that social relationship-based recommendation methods alleviate the problems of data sparseness and cold start in RSs to some extent. However, because the social relationships between users are extremely sparse and complex, it is difficult to obtain an accurate user preference model, which limits the performance of existing social recommendation methods. To model social relationships accurately and improve recommendation quality, we use explicit social relationships, such as user-item ratings and trust relationships, and implicit social relationships, such as social tags, to mine the potential interest preferences of users, and we propose an improved social recommendation method that integrates trust relationships and social tags. The method maps user features and item features to a shared feature space using these social relationships, obtains user similarity and item similarity from the latent feature vectors of users and items, and trains them continuously to obtain accurate similarity relationships and improve recommendation performance. Experimental results demonstrate that the proposed approach outperforms the other social recommendation approaches.


Introduction
In recent years, the latent factor model based on matrix factorization (MF) is widely used in RSs due to strong scalability and

Preliminaries and related works
In this section, we review work related to social relationships and to the low-rank social MF-based latent factor model, given the importance of social relationships in improving recommendation quality [10,11,13,26,29,50].

Social relationships
Trust relationships and trust propagation are used to evaluate indirect social relationships between users, which in turn are used to build recommendation models [5]. In recent years, several trust metrics have been proposed, among which TidalTrust and RTCF are the most representative [10,12]. TidalTrust is a trust measurement method based on trust propagation theory; its main idea is that trust gradually decreases as the distance between users increases. Its trust relationship is expressed by Eq. (1) [2,24], where $F_u$ denotes the set of users that user $u$ trusts directly, and $S_{uw}$ denotes the degree of user $u$'s trust in $w$. RTCF is a measure of the relationship between users that takes both similarity and trust into consideration.
The social relationship between two users is calculated by Eq. (2) [6,23], where $sim_{uv}$ denotes the similarity of preference between users $u$ and $v$, and $T_{uv}$ denotes the trust degree of user $u$ in user $v$, which is bounded in the range [0,1]. Trust propagation can be used to measure the strength of trust between two users who are not directly connected. For example, Figure 1(a) describes the process of predicting ratings for items using trust relationships between users. In Figure 1(a), user u2's rating for item i2 is 5; because user u1's trust strength for user u2 is 0.6, trust propagation gives user u1's rating for item i2 as 3. However, social networks contain complex social relationships, and the various social relationships among users affect each other, so it is difficult to measure the social relationship between users accurately. Figure 1(b) shows the complex social relationships between users.
In recent years, various social relationships such as social tags, personal interests and other influencing factors have been integrated into the recommendation model to improve recommendation quality by mapping user rating information and social relationships to shared user and item feature spaces [7,11,12,14,17,19,32,34,38]. Although this alleviates, to some extent, the inaccurate recommendations caused by sparse rating data, these methods do not further train the user social relationships to obtain accurate similarity when user characteristics are measured through neighborhood and trust relationships. As a result, the user preference model obtained from the nearest neighbors or trusted users may deviate from the real user preference model, which limits the achievable improvement in recommendation accuracy.
Recently, some methods for predicting user trust relationships have been proposed, but few studies have applied the predicted social relationships to recommendation models [7,13,14,21,34,37].
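The propagation idea described in this subsection can be illustrated with a minimal Python sketch. It assumes the commonly cited TidalTrust aggregation, in which the inferred trust of $u$ in a target user is the average of the neighbors' trust in that target, weighted by $u$'s direct trust $S_{uw}$. The function and variable names are hypothetical. The second helper only reproduces the arithmetic of the Figure 1(a) example, where a trust strength of 0.6 and a rating of 5 give a predicted rating of 3.

```python
def propagate_trust(direct_trust, trust_in_target):
    """Infer trust in a target as a weighted average over trusted neighbors.

    direct_trust: {neighbor w: S_uw}, the direct trust of user u in w.
    trust_in_target: {neighbor w: trust of w in the target user}.
    Assumption: TidalTrust-style weighted average; neighbors without a
    known trust value for the target are skipped.
    """
    shared = [w for w in direct_trust if w in trust_in_target]
    denominator = sum(direct_trust[w] for w in shared)
    if denominator == 0:
        return 0.0
    numerator = sum(direct_trust[w] * trust_in_target[w] for w in shared)
    return numerator / denominator


def rating_from_trust(trust_strength, trusted_user_rating):
    """Scale a trusted user's rating by the trust strength (Figure 1(a))."""
    return trust_strength * trusted_user_rating


if __name__ == "__main__":
    print(propagate_trust({"w1": 0.8, "w2": 0.4}, {"w1": 0.5, "w2": 1.0}))
    print(rating_from_trust(0.6, 5))
```

The weighted average keeps the inferred value inside the range of the neighbors' trust values, which is one reason this form is popular for propagation over short paths.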

Matrix factorization model based on social relationships
Social MF has become one of the most widely used methods in RSs owing to its accurate prediction and high efficiency. Neighborhood relationships and trust relationships are two of the most commonly used social relationships, and they are integrated into the MF process to obtain more accurate user and item features [21,22,24,32,54].

Recommendation model based on neighborhood relationships
The MF approach can be depicted as a probabilistic graphical model, as shown in Figure 2. The MF method uses the potential relationship between users and items to decompose the user-item rating matrix into two low-dimensional matrices: users' preference feature matrix and items' attribute feature matrix. The two matrices are mapped into the same latent factor space, and the unknown ratings are predicted according to the degree of matching between users' preferences and items' attributes.
For the user-item rating matrix R, MF techniques map users and items to their respective latent feature spaces. Each rating in R is modeled through the inner products of the item feature with the features of the user and of the user's neighbors; the probabilistic graphical model is shown in Figure 2. To obtain a more accurate personalized preference, each user is assumed to follow a different prior variance: the more ratings a user has, the more accurately the preference is estimated, and the fewer ratings, the greater the uncertainty of the preference. The Gaussian prior distribution of the user features is given in [8,14,39], where $n_{u_u}$ denotes the number of ratings from user $u$. According to Bayesian inference, the feature matrices U and V can be obtained by minimizing the objective function in [39], where $N_u$ represents the set of neighbors of user $u$, and $S_{uk}$ denotes the similarity between users $u$ and $k$.
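As a hedged illustration of how a rating can be composed from the user, the neighbors and the item, the sketch below blends the user's own inner product with a similarity-weighted sum over neighbors. The blending weight `alpha` and the exact convex-combination form are assumptions borrowed from RSTE-style models, not a reproduction of the objective in [39]; all names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def predict_rating(u, i, U, V, S, neighbors, alpha):
    """Blend the user's own preference with that of the neighbors.

    U, V: {id: latent vector}; S: {(u, k): similarity S_uk};
    neighbors: {u: list of neighbor ids N_u}.
    Assumption: alpha in [0, 1]; alpha = 0 uses only the user's own
    features, alpha = 1 uses only the neighbors.
    """
    own = dot(U[u], V[i])
    social = sum(S[(u, k)] * dot(U[k], V[i]) for k in neighbors.get(u, []))
    return (1.0 - alpha) * own + alpha * social


if __name__ == "__main__":
    U = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
    V = {"i1": [2.0, 4.0]}
    S = {("u1", "u2"): 1.0}
    N = {"u1": ["u2"]}
    print(predict_rating("u1", "i1", U, V, S, N, alpha=0.5))
```

In practice the similarities $S_{uk}$ of each user are usually normalized to sum to one so that the social term stays on the rating scale.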

Recommendation model based on trust relationships
In [14,28], user interest preference models are built from the perspectives of trusters and trustees, respectively. The modeling idea is based on the following considerations: the users that a user $u$ trusts are likely to have the same or similar preferences as $u$, so these trustees have feature vectors similar to that of $u$. Similarly, when several users are trusted by the same user $u$, these users have characteristics similar to those of $u$. The user preference model is therefore described as in [7], with the truster and trustee features following Gaussian priors with variances $\sigma_B^2$ and $\sigma_E^2$, respectively [7,8]. According to the Bayesian formula, the loss function in [7,8,15] is obtained, where $T_{uv}$ represents the trust relationship between users $u$ and $v$.
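A minimal sketch of a trust factorization loss is given below for orientation. It assumes the TrustMF-style form in which each observed trust value $T_{uv}$ is approximated by the inner product of a truster vector $B_u$ and a trustee vector $E_v$, with L2 regularization on both; the single regularization weight `reg` and all function names are hypothetical simplifications rather than the loss used in [7,8,15].

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def squared_norm(vectors):
    """Sum of squared entries over a {id: vector} mapping."""
    return sum(x * x for vec in vectors.values() for x in vec)


def trust_loss(T, B, E, reg):
    """Squared-error trust loss with L2 regularization.

    T: {(u, v): observed trust T_uv}; B: truster vectors; E: trustee vectors.
    Assumption: 0.5 * sum (T_uv - B_u . E_v)^2 + 0.5 * reg * (|B|^2 + |E|^2).
    """
    fit = sum((t - dot(B[u], E[v])) ** 2 for (u, v), t in T.items())
    return 0.5 * fit + 0.5 * reg * (squared_norm(B) + squared_norm(E))


if __name__ == "__main__":
    T = {("u", "v"): 1.0}
    B = {"u": [1.0, 0.0], "v": [0.0, 0.0]}
    E = {"u": [0.0, 0.0], "v": [0.5, 0.0]}
    print(trust_loss(T, B, E, reg=1.0))
```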

Recommendation model based on social tags
If there is no explicit trust relationship between users, the tag weight information of users and items can be used to express implicit social relationships between users and between items, because a user's annotations and comments on an item reflect the user's preference for the item to some extent. To better measure the social tag weights so that they reflect the user's personalized preference characteristics, each tag is assumed to follow a different prior variance.
The Gaussian prior distribution of the tag feature is defined in terms of $n_{l_t}$, which indicates the number of tags marked by user $u$. $G_{up}$ and $H_{ip}$ indicate the weight of tag $p$ for user $u$ and for item $i$, respectively, and can be obtained by TF-IDF [10,12], where $c_{up}$ denotes the number of times user $u$ selects tag $p$, $c_p^{(u)}$ denotes how many users have chosen tag $p$, $c_{ip}$ indicates the number of times tag $p$ appears in item $i$, and $c_p^{(i)}$ indicates how many items tag $p$ appears in. The more often a user selects a tag and the fewer other users use it, the greater the weight the tag should have for that user. Likewise, the more often a tag appears for an item and the less often it appears for other items, the more important the tag is for that item and the greater its weight should be. Inspired by [11,32,34,39,41], the features of users and items can be obtained with the latent semantic model, and the conditional probabilities of the user-tag and item-tag weights are defined on the user-tag weight matrix G and the item-tag weight matrix H, respectively. The conditional probability distributions are as follows:

$$p(G \mid U, L, \sigma_G^2) = \prod_{u}\prod_{p} \mathcal{N}\left(G_{up} \mid g(U_u^{T} L_p), \sigma_G^2\right), \qquad p(H \mid V, L, \sigma_H^2) = \prod_{i}\prod_{p} \mathcal{N}\left(H_{ip} \mid g(V_i^{T} L_p), \sigma_H^2\right)$$

where G is mapped to the user feature space U and the tag feature space L, and H is mapped to the item feature space V and the tag feature space L. According to the Bayesian formula, the posterior probabilities of G and H are obtained, and maximizing them is equivalent to minimizing the corresponding loss function.
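The TF-IDF weighting described above can be sketched as follows. Because the exact formula is not reproduced in the text, this sketch assumes one standard variant: term frequency is the tag count divided by the entity's total tag count, and inverse document frequency is the logarithm of the number of entities divided by the number of entities using the tag. The same hypothetical function serves for user-tag counts ($c_{up}$) and item-tag counts ($c_{ip}$).

```python
import math


def tfidf_weights(counts):
    """Compute TF-IDF tag weights for users or items.

    counts: {entity: {tag: count}}, e.g. c_up for users or c_ip for items.
    Returns {entity: {tag: weight}}.
    Assumption: tf = count / total count of the entity,
                idf = log(number of entities / entities using the tag).
    """
    n_entities = len(counts)
    usage = {}  # number of entities that use each tag
    for tags in counts.values():
        for tag, c in tags.items():
            if c > 0:
                usage[tag] = usage.get(tag, 0) + 1
    weights = {}
    for entity, tags in counts.items():
        total = sum(tags.values())
        weights[entity] = {
            tag: (c / total) * math.log(n_entities / usage[tag])
            for tag, c in tags.items()
            if c > 0
        }
    return weights


if __name__ == "__main__":
    c_up = {"u1": {"rock": 3, "jazz": 1}, "u2": {"rock": 2}, "u3": {"pop": 1}}
    print(tfidf_weights(c_up))
```

A tag used by every entity receives weight zero under this variant, which matches the intuition that a tag shared by everyone carries no personalized information.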

The proposed recommendation framework based on similarity feedback on user and item characteristics
This paper assumes that user preferences are influenced by the users themselves, their neighbors, the users they trust, and social tags. We first analyze the impact of social tags on user preferences, and then build a comprehensive recommendation model that uses rating information, trust relationships, and social tag weights to obtain user and item features from the two perspectives of explicit and implicit social relationships. At the same time, based on the framework of [7,8,15,19,20,23], the feature spaces of users and items are trained to improve the quality of recommendations. The ISocialMF recommendation framework presented in this paper is shown in Figure 3. The recommendation method is divided into the following steps: (1) a social recommendation model based on explicit interaction information such as user ratings, direct trust relationships, and social tags is established; (2) user features, trustee users and item features are each mapped to a shared space using MF; (3) user weights and item weights are established using interaction relationships such as social tags; (4) the implicit feature space and the social tag weight information are combined to obtain the implicit similarity between users and between items, respectively; (5) the above social recommendation model is combined with the implicit similarity, using the SocialIT framework structure, to establish a model of explicit and implicit relationships; (6) the implicit feature spaces of users and items are obtained by training this model; (7) the ratings are predicted and recommendations are generated.

Regularization of social relationships based on implicit interaction
To estimate the similarity of users and the similarity of items accurately, this section proposes a social MF framework with explicit and implicit interactions, which combines various social factors to map user and item features to a shared low-rank space. We use the similarity of trust relationships between trusters and trusted users to constrain the truster and trustee characteristics. For two trusters, the closer their features are, the more similar their preferences are. Therefore, two truster feature vectors can be constrained by the feature similarity of the users' trustees, as in Eq. (14) [7], where $P_{uv}^{(B)}$ denotes the similarity between users $u$ and $v$ based on the trust relationship, which can be obtained from the trust relationships of $u$ and $v$ with their commonly trusted users $k$. In addition, the more a user is trusted by other users, the more users will adopt his or her suggestions, and the greater the user's in-degree will be. According to [2,28,34,41], the improved user similarity is described by Eq. (15), where $d_u^{in}$ and $d_u^{out}$ represent the in-degree and out-degree of user $u$ in the trust network, respectively. Similarly, for two trusted users, the closer their characteristics are, the more similar their preferences are. Therefore, the constraint term on two trusted-user feature vectors is given by Eq. (16) [7], where $Q_{uv}^{(E)}$ denotes the similarity of the trust relationship between two trusted users. According to [7,8,22], the improved user similarity based on the trustee is given by Eq. (17). In summary, considering the contributions of user ratings and trust relationships, an improved similarity between users is obtained. Inspired by [7], if a user trusts another user or is trusted by another user, their features will be very close, so the truster and the trusted user have similar feature vectors.
A regularization term is therefore obtained, where $B_u$ and $E_u$ represent the sets of trusters and trusted users of user $u$, respectively.
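To make the role of such a constraint concrete, the following sketch computes a similarity-weighted social regularization term. It assumes the widely used form in which the squared distance between two feature vectors is penalized in proportion to their similarity; this is an illustrative assumption and not a reconstruction of Eqs. (14) to (17). The names are hypothetical.

```python
def social_regularizer(features, similarity):
    """Similarity-weighted squared distance between feature vectors.

    features: {user: feature vector}, e.g. truster vectors.
    similarity: {(u, v): similarity weight}, e.g. trust-based similarity.
    Assumption: 0.5 * sum over pairs of sim(u, v) * ||f_u - f_v||^2.
    """
    total = 0.0
    for (u, v), s in similarity.items():
        distance = sum((a - b) ** 2 for a, b in zip(features[u], features[v]))
        total += s * distance
    return 0.5 * total


if __name__ == "__main__":
    features = {"u": [1.0, 0.0], "v": [0.0, 1.0], "w": [1.0, 0.0]}
    similarity = {("u", "v"): 0.5, ("u", "w"): 0.9}
    print(social_regularizer(features, similarity))
```

Pairs with high similarity but distant features contribute most to the penalty, so minimizing it pulls similar users together in the latent space.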

Implicit similarities between users and between items
To describe the degree of similarity between users accurately, and building on [14,27,36], we comprehensively analyze the similarity relationships involving item similarity and social tags. The user's explicit interactions with items (such as ratings and direct trust relationships) and implicit interactions (users' tag information for items) are optimized jointly to obtain low-rank feature vectors of users and items, from which user-preference relationships and item similarity relationships are established.

Improved user implicit similarity
According to [7,8,15,23], user feature vectors, trust relationships and social tags can reflect the similarity between users to some extent. In [13,103], user characteristics are measured directly, while the contribution of similarity from other social relationships is ignored. In [45], user characteristics and trust relationships are employed to measure similarity, but the similarities are simply added linearly, without considering the impact of data imbalance or the weights of the different influencing factors. According to [7,8,31], a comprehensive trade-off among user characteristics, trust relationship characteristics and social tag weights can measure user similarity accurately. Among them, the user's rating similarity is calculated from the users' rating vectors as follows [7,8,15].
If two trustee feature vectors are similar, the corresponding preferences are similar. Considering trusters and trusted users together avoids the user feature deviation caused by data sparseness and reflects the user's preference characteristics more accurately. The average preference similarity based on the trust relationship is defined accordingly. The implicit similarity between users is modeled as a normal distribution composed of user feature similarity, trust relationship implicit similarity and tag similarity, with a conditional probability distribution in which relation(u,v) indicates a direct trust relationship between users u and v. This avoids the inaccurate description of user features caused by data sparseness and imbalance, and it also avoids the deviation from real user characteristics caused by linear superposition of user similarities when explicit rating and trust data are lacking. The tag-based implicit preference similarity between two users is obtained by comparing the two users' tag weights for items [7,20]. Taking the logarithm of the posterior probability in Eq. (24), the objective function is obtained as follows:

Item implicit similarity
If a user likes an item, the user will often like items with similar characteristics. We therefore introduce this idea to improve the recommendation quality. Specifically, item similarity is assumed to follow a normal distribution composed of item feature similarity and social tag relationship similarity. According to Bayesian inference, the corresponding loss function can be obtained, in which the similarity between items is computed by jointly considering the item characteristics and the social tag factors. Here, the similarity between items based on social tags is given by Eq. (29) [54]. Similarly, the similarity based on item characteristics is calculated as follows:
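The two ingredients of item similarity named above can be sketched in Python. This sketch assumes cosine similarity for both the latent item vectors and the sparse tag-weight vectors, and it combines them with a hypothetical weight `beta`; neither the cosine choice nor the convex combination is confirmed by the text, so the code should be read as an illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / norm


def cosine_sparse(a, b):
    """Cosine similarity of two sparse {tag: weight} vectors."""
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    if norm == 0:
        return 0.0
    return sum(w * b[tag] for tag, w in a.items() if tag in b) / norm


def item_similarity(v_i, v_j, tags_i, tags_j, beta):
    """Blend latent-feature and tag-based similarity (assumed form)."""
    return beta * cosine(v_i, v_j) + (1.0 - beta) * cosine_sparse(tags_i, tags_j)


if __name__ == "__main__":
    print(item_similarity([1.0, 0.0], [0.0, 1.0], {"a": 1.0}, {"a": 2.0}, beta=0.5))
```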

Social recommendation algorithm model integrating trust relationships and social tags
To fully exploit and accurately measure the potentially complex social relationships between users and items in online social networks, and inspired by [7,8,15,20,23,29,33,36], we analyze in depth the effects of user ratings and of explicit and implicit interactions, such as social tags and user trust, on recommendation quality. Some classic recommendation frameworks, such as SocialIT, RSTE and RoRec, are integrated into the recommendation process, and an improved social recommendation algorithm that fuses social tags and trust relationships, namely ISocialMF, is proposed. The method maps the rating information, social tags and user trust relationships to the low-rank user feature space, item feature space and tag feature space, respectively, and uses the optimized feature vectors to obtain the implicit similarity of users and of items. The user and item feature vectors are then trained continually, and the implicit similarities between users and between items are optimized using the various influencing factors to improve the accuracy of the recommendation. Because social tag information is introduced into the recommendation model, the inaccuracy caused by sparse and imbalanced data is mitigated, and the diversity of recommendations can also be improved.

ISocialMF algorithm model
Considering the impact of the user trust relationship, rating information and social tags on user preference similarity and item similarity, and combining Eq. (4) and Eq. (6), the objective function of ISocialMF is obtained, where $sim(u,t)$ denotes the similarity of users $u_u$ and $u_t$ (see Eq. (12)), and $sim(i,j)$ indicates the similarity between items $i_i$ and $i_j$. In Figure 4, the user feature space is constrained by the similarity relationship $S_{uv}$, which is composed of the user feature U, the trust relationship features B and E, and the tag weight relationship G; the similarity relationship $S_{ij}$, which is composed of the item feature V and the social tag weight relationship H, is used to constrain the items.

Model learning
For the above objective function, the gradients with respect to the feature vectors, including the tag feature $L_t$, can be calculated, and stochastic gradient descent is used to reach a local minimum as follows:

Figure 4 Probability graph model of ISocialMF
Here, $U'_u$, $B'_u$, and $E'_v$ are the partial derivatives with respect to $U_u$, $B_u$, and $E_v$, respectively.
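The update pattern of stochastic gradient descent can be sketched on the simplest case. The code below assumes a plain L2-regularized MF objective for a single observed rating and omits the trust, tag and similarity terms of ISocialMF, whose gradients are not reproduced in the text. The learning rate `lr` plays the role of η, and `reg` is a hypothetical regularization weight.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sgd_step(U_u, V_i, rating, lr, reg):
    """One SGD step on 0.5 * (U_u . V_i - r)^2 + 0.5 * reg * (|U_u|^2 + |V_i|^2).

    Both updates use the values from before the step, so the two gradients
    are evaluated at the same point.
    """
    error = dot(U_u, V_i) - rating
    new_U = [u - lr * (error * v + reg * u) for u, v in zip(U_u, V_i)]
    new_V = [v - lr * (error * u + reg * v) for u, v in zip(U_u, V_i)]
    return new_U, new_V


if __name__ == "__main__":
    U_u, V_i = [0.1, 0.1], [0.1, 0.1]
    for _ in range(200):
        U_u, V_i = sgd_step(U_u, V_i, rating=1.0, lr=0.1, reg=0.0)
    print(dot(U_u, V_i))  # approaches the observed rating
```

In the full model, each additional term of the objective contributes an extra summand to the gradient of the feature vectors it involves, while the update rule itself keeps this shape.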

Algorithm 1 Social recommendation algorithm that integrates trust relationships and social tags (ISocialMF).
Input: user-item rating matrix R, trust relationship matrix T between users, user-tag and item-tag relationship matrices G and H, respectively, latent feature dimension K, regularization parameters $\lambda_U$, $\lambda_V$, $\lambda_T$, $\lambda_P$, $\lambda_Q$, $\lambda_W$, $\lambda_S$, $\lambda_L$, and learning rate η.
Output: user latent factor feature vector U, item latent factor feature vector V, truster feature vector B, trusted user feature vector E, user-to-user similarity relationship matrix
3 Get social tag-based user similarity and item similarity according to Eqs. (26) and (31).
11 Obtain the user feature, trust relationship and social tag similarities according to Eqs. (22), (23), and (26).
12 Obtain the item similarities based on social tags and item features according to Eqs. (31) and (32).
13 Obtain the user and item implicit similarities according to Eqs. (27) and (29).
14 end for
15 Update L according to Eq. (30)

Experimental results and analysis
In this section, some experiments are performed on the Epinions and Douban datasets, and the performance of the algorithm is compared with mainstream social recommendation algorithms to evaluate the effectiveness of our model.

Datasets and evaluation indicators
To verify the effectiveness of our algorithm, we select two popular social network datasets, Epinions and Douban. In the Epinions dataset, user relationships are directed; in the Douban dataset, the relationships between users are undirected. The statistics of the extracted Epinions and Douban datasets are shown in Table 1.
We first use the MAE and RMSE evaluation indicators to evaluate the performance of each algorithm. In addition, we use P@N, R@N and NDCG, which are commonly used for Top-N recommendation in practical scenarios, to evaluate the algorithms more comprehensively.
NDCG is an evaluation index for measuring the quality of recommendation ranking. It considers the relevance and ranking position of all recommended items, and it is defined as the discounted cumulative gain (DCG) of the recommended list normalized by the DCG of the ideal ranking.
Table 1 (fragment) Density of trust network: 0.00201, 0.00378
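For reference, the evaluation indicators named in this subsection can be computed as in the sketch below. It uses the standard definitions of MAE, RMSE, P@N and R@N, and it assumes binary relevance with a logarithmic position discount for NDCG; the paper's exact gain function is not reproduced in the text, so that choice is an assumption. The function names are hypothetical.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n recommended items that are relevant."""
    return len(set(ranked[:n]) & set(relevant)) / n


def recall_at_n(ranked, relevant, n):
    """Fraction of the relevant items that appear in the top n."""
    return len(set(ranked[:n]) & set(relevant)) / len(relevant)


def ndcg_at_n(ranked, relevant, n):
    """NDCG with binary relevance: DCG of the list over DCG of the ideal list."""
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:n])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(n, len(relevant))))
    return dcg / ideal if ideal else 0.0


if __name__ == "__main__":
    print(mae([3, 4], [4, 2]), rmse([3, 4], [4, 2]))
    print(ndcg_at_n(["b", "a", "c"], {"a", "c"}, 3))
```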

Effect of parameters
To verify the impact of social tags on the performance of the algorithm, we divide the training set into five groups according to the number of tags: "=0", "1~5", "6~10", "11~20" and ">20", and then conduct the experiment on each group. The performance of the ISocialMF algorithm in terms of MAE and RMSE for different numbers of tags is shown in Figure 5.
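The grouping step can be sketched in a few lines of Python. The bin labels are taken from the text; the input format, a mapping from a user to that user's number of tags, and the function names are assumptions made for illustration.

```python
def tag_group(n_tags):
    """Return the label of the tag-count group a count falls into."""
    if n_tags == 0:
        return "=0"
    if n_tags <= 5:
        return "1~5"
    if n_tags <= 10:
        return "6~10"
    if n_tags <= 20:
        return "11~20"
    return ">20"


def group_by_tag_count(tag_counts):
    """Partition users into the five groups by their number of tags."""
    groups = {}
    for user, n_tags in tag_counts.items():
        groups.setdefault(tag_group(n_tags), []).append(user)
    return groups


if __name__ == "__main__":
    print(group_by_tag_count({"u1": 0, "u2": 5, "u3": 6, "u4": 21}))
```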

Figure 5 The MAE performance of the algorithm under different numbers of tags
As can be seen from Figure 5, because the ISocialMF algorithm takes into account the user's tag information for items, the MAE performance of the recommendation algorithm improves significantly on both the Epinions and Douban datasets. As the number of tags increases, the performance of the RS improves significantly, and when the number of tags reaches 30, the performance tends to be stable. This indicates that once the number of tags reaches a certain threshold, the user's preferences and item characteristics can be expressed accurately. As the number of tags continues to increase, tag information becomes redundant and the accuracy of the recommendations no longer improves significantly.
In the ISocialMF algorithm, $\alpha$ and $\lambda_P$ are important parameters that affect the recommendation performance. The parameter $\alpha$, whose range is [0, 1], controls the relative contribution of the user and the neighboring users to the preference prediction. When $\alpha = 0$, the prediction relies mainly on the user's own behavior; when $\alpha = 1$, it relies mainly on the behavior of the neighboring users. Figure 6 depicts the effect of $\alpha$ on MAE and RMSE for the Epinions and Douban datasets, respectively. The effects on MAE and RMSE are very similar: as $\alpha$ increases, the values of MAE and RMSE first decrease, then gradually increase, and finally become stable.
On both the Epinions and Douban datasets, the prediction error reaches its minimum at an intermediate value of $\alpha$. This indicates that it is more accurate to consider the behavior data of users and their neighbors together. Figure 7 shows the effect of the parameter $\lambda_P$ on MAE. The role of $\lambda_P$ is to control the impact of user implicit similarity on recommendation performance. The larger its value, the greater the role of neighboring users in rating prediction, and the more the user characteristics depend on neighboring users. On the Epinions and Douban datasets, as $\lambda_P$ increases, the MAE first decreases and then increases slowly until it stabilizes, so the prediction accuracy of ISocialMF on both datasets is best at an intermediate value of $\lambda_P$.

Impact of Different Sparse Degrees of Trust Relationship on Recommendation Performance
The sparsity of trust relationships is also a key factor that directly affects the quality of social recommendation algorithms. To verify the robustness of the ISocialMF algorithm when trust relationships are sparse, we first classify all users according to their trust association degree (outdegree + indegree), and then evaluate the MAE performance of the algorithm. The distribution of each group of users and the average number of ratings are shown in Figure 8. Users with 0~5 and 5~10 connections account for the highest proportion, and users with more than 200 connections account for the lowest proportion, although the latter rate the largest number of items.
Figure 8 The social relationship distribution of each user group
Figure 9 The MAE performance of users in different groups for each algorithm
Figure 9 shows the MAE performance of each algorithm at different social connection densities. On the Epinions and Douban datasets, ISocialMF performs best. The performance trends of the other social recommendation algorithms are the same: as the number of social relationships increases, the recommendation quality also increases, but once the number of social relationships grows beyond a certain extent, the recommendation quality begins to decline and finally becomes stable.
Only the PMF algorithm improves gradually, because it does not consider the influence of social relationships, and the amount of rating data grows as the number of social relationships grows. The performance of PMF therefore keeps improving and finally tends to be stable. For the social algorithms, although the number of friends increases as the number of social relationships increases, many of these relationships are only casual. There is no substantial trust relationship or common interest between such users, and these casual connections reduce the algorithms' ability to learn the user's true preference characteristics, so the recommendation quality does not improve further.
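The trust association degree used for the grouping above (outdegree + indegree) can be computed from a directed edge list as in this sketch. The edge format, a list of (truster, trustee) pairs, and the function name are assumptions; for an undirected network such as Douban, each friendship would need to be listed in both directions or counted once, depending on the convention chosen.

```python
from collections import Counter


def trust_degree(edges):
    """Return {user: outdegree + indegree} for a directed trust network.

    edges: iterable of (truster, trustee) pairs.
    """
    degree = Counter()
    for truster, trustee in edges:
        degree[truster] += 1  # outgoing trust statement
        degree[trustee] += 1  # incoming trust statement
    return dict(degree)


if __name__ == "__main__":
    edges = [("a", "b"), ("a", "c"), ("b", "a")]
    print(trust_degree(edges))
```

Users can then be bucketed by this degree in the same way as any other per-user count before computing MAE per group.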

Performance comparison and analysis
To evaluate the performance of the ISocialMF algorithm proposed in this paper, the algorithm is compared with recommendation methods such as PMF [23], TrustPMF [14], SocialMF [18], PRM [31], EnSocialMF [25], USSHMF [28], and SocialIT [3]. Compared with the social network-based recommendation algorithms, the traditional PMF algorithm has the worst MAE and RMSE performance. For the Douban dataset, whose data are relatively dense, ISocialMF shows a smaller improvement over PMF and the other social collaborative filtering algorithms. In the Epinions dataset, the association between users is directed, and ISocialMF can describe the relationship between users accurately. In the Douban dataset, the relationship between users is a two-way friend relationship, so the performance improvement of ISocialMF is larger on the Epinions dataset.

Performance of the algorithm on sparse datasets
Table 3 shows the P@N, R@N, and NDCG performance of each algorithm on the Epinions and Douban datasets. The performance of ISocialMF is significantly better than that of the other social relationship-based recommendation algorithms, which indicates that jointly considering the rating similarity, the implicit similarity of the trust relationship and the tag weight similarity effectively improves Top-N recommendation performance. The SocialIT algorithm optimizes the user feature space using feedback from user ratings and trust relationships and improves recommendation quality by mining implicit social relationships between users, but it does not consider the implicit social relationships expressed by social tags or the item similarity. Compared with other social recommendation algorithms, the USSHMF and EnSocialMF models improve the recommendation performance to a certain extent, but because they model social relationships directly, they lack further training of the model. The user preferences they learn may deviate from the real user preferences, which affects the quality of recommendations. The experimental results show that our method improves recommendation quality effectively by mining the implicit information of social relationships and social tags to model user preferences, and by training user and item features to obtain accurate user similarity and item similarity.

Performance of the algorithm on cold start users
Cold start is a challenge for CF-based recommendation algorithms, so we compare our algorithm with other social recommendation algorithms on cold-start users. Here we define users who have rated fewer than 5 items as cold-start users. Table 4 shows the recommendation performance of the various algorithms on cold-start users. It can be seen that the performance of the ISocialMF algorithm on the Epinions dataset is better than that on the Douban dataset, and for cold-start users the improvement is more significant on the Epinions dataset.
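The cold-start definition above translates directly into a filter. The sketch assumes ratings are available as (user, item, rating) triples and that the full user list is passed separately, so that users with no ratings at all are also counted; both the data layout and the function name are hypothetical.

```python
from collections import Counter


def cold_start_users(all_users, ratings, threshold=5):
    """Return the users who have rated fewer than `threshold` items.

    all_users: iterable of user ids, including users without any rating.
    ratings: iterable of (user, item, rating) triples.
    """
    n_rated = Counter(user for user, _item, _rating in ratings)
    return [user for user in all_users if n_rated[user] < threshold]


if __name__ == "__main__":
    ratings = [("u1", i, 4.0) for i in range(5)] + [("u2", "i0", 3.0)]
    print(cold_start_users(["u1", "u2", "u3"], ratings))
```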

Conclusion and future work
Data sparsity and cold start are major challenges for CF-based RSs. This paper further eases the impact of these problems by introducing trust relationships and user tag information into the MF-based CF method. We improve the social relationship measurement using social tags and trust relationships, and propose a comprehensive social recommendation method, namely ISocialMF. The main advantage of ISocialMF is that it integrates social tags and the social relationships of items into the MF process, and trains the shared user and item features through a continuous feedback mechanism during matrix factorization to achieve more accurate recommendations. Finally, the proposed model is applied to different datasets to verify its effectiveness. The experimental results show that the ISocialMF algorithm proposed in this paper is superior to other algorithms in recommendation accuracy, which verifies the correctness of establishing the

Declaration
The content of this article has not been published, nor has it been submitted to other journals. There is no conflict of interest in the content of this article. With the consent of all the authors, this article is authorized for publication. The contributions of each author are as follows: Dr. Rui Chen is responsible for the writing of the article, Prof. Jian-wei Zhang is responsible for the revision of the paper, Prof. Zhifeng Zhang put forward many valuable suggestions for this article, Dr. Jingli Gao verified the method and experiment, Dr. Pu Li completed the experiment of the paper, and Prof. Hui Liang revised the grammar of the paper. The publication of this article is supported by the following funds.