An improved constrained Bayesian probabilistic matrix factorization algorithm

With the continuing growth of the Web, and consequently of e-commerce, recommendation systems are applied ever more widely, and a good recommendation algorithm can provide a better user experience. Many existing approaches to collaborative filtering can neither handle very large datasets nor easily deal with users who have very few ratings. To address these problems, this paper proposes an improved constrained Bayesian probabilistic matrix factorization algorithm. The algorithm introduces a latent similarity constraint matrix for users with sparse ratings to influence the user feature vectors, uses the Logistic function to express the nonlinear relationship of the latent factors, and trains the model with the Markov chain Monte Carlo method. Finally, the algorithm is tested and comparatively evaluated on real datasets. The experiments on the MovieLens and Netflix datasets show that the model can be trained efficiently with Markov chain Monte Carlo methods. The experimental results show that the algorithm has better predictive performance and is well suited to the problem of sparse rating matrices for specific users.


Introduction
In recent years, recommendation systems have been widely used in fields such as e-commerce, video, and social networking (Hu et al. 2020; Han et al. 2020; Xu et al. 2021). Recommendation algorithms are mainly divided into content-based recommendation and collaborative filtering-based recommendation. The collaborative filtering algorithm completes the recommendation task by analyzing user behavior information; it is a personalized recommendation algorithm that relies on historical data. At present, most recommendation systems mainly use collaborative filtering to complete the recommendation task.
Although the collaborative filtering algorithm can analyze users' preference information and recommend items they will like, the sparsity of the user rating matrix leads to low prediction accuracy (Wang 2020). Later, the latent factor model based on matrix factorization was successfully applied to collaborative filtering; it effectively alleviates the sparse rating matrix problem by analyzing the latent characteristics of users and items, and its prediction accuracy and stability have been widely recognized (Fang et al. 2020; Han et al. 2020; Zhang and Yang 2021).
With the continuous development of technology, expectations for recommendation accuracy keep rising, and there are many ways to improve prediction accuracy by improving matrix factorization techniques. Zhang et al. (2020) proposed a collaborative filtering algorithm based on user social networks, which improves the accuracy of rating prediction but still needs refinement to handle sparse rating matrices. Ni et al. (2020) proposed a heterogeneous similarity recommendation model based on implicit feedback, which completes personalized recommendation through the similarity of users' implicit feedback; this algorithm effectively alleviates the sparse rating matrix problem, but its complexity is too high. Salakhutdinov and Mnih introduced a constraint matrix into probabilistic matrix factorization, which prevents the features of users with few ratings from collapsing toward the mean of the prior distribution and improves prediction accuracy (Hu et al. 2020). However, the model requires manual adjustment of parameters and is prone to over-fitting.
Salakhutdinov and Mnih also proposed a Bayesian architecture and used Monte Carlo sampling to control the model hyperparameters, which effectively addresses the problems of over-fitting and optimization. However, it does not offer a satisfactory solution to the problem of sparse user ratings. The improved probabilistic matrix factorization algorithms available today still have some problems (Wang 2020): for users with sparse ratings, the algorithms converge slowly and predict with low accuracy, and the regularization parameters must be tuned manually. Although the probabilistic matrix factorization algorithm was proposed long ago, later optimizations of it have not achieved the desired effect.
With the rapid updating and iteration of technology, various video and social applications have emerged as the times demand. Collaborative filtering recommendation algorithms integrated into these applications can usually capture user preferences well and thereby improve the user experience. However, collaborative filtering suffers from the cold-start problem, and when there is a large amount of data but little of it is valuable, that is, when the user rating matrix is sparse, collaborative filtering often cannot complete the recommendation task well. Many scholars have improved the collaborative filtering algorithm, but the recommendation performance is still limited. Some improved collaborative filtering algorithms do achieve good recommendation performance, but they involve too many parameters, cumbersome procedures, and high algorithmic complexity. To solve these problems, this paper proposes an improved constrained Bayesian probability matrix factorization algorithm that improves recommendation performance without increasing algorithmic complexity. Aiming mainly at the sparse user rating matrix problem in collaborative filtering recommendation, the paper introduces constraint vectors in addition to the user feature vectors and uses the Logistic function to represent the nonlinear relationship of the latent feature factors, establishing the LCBPMF model without increasing model complexity; the model is trained with the Markov chain Monte Carlo method, which effectively improves the interpretability of the algorithm.
The rest of this paper is organized as follows: Sect. 2 introduces probabilistic matrix factorization (PMF), constrained probability matrix factorization (CPMF), and Bayes-based probability matrix factorization (BPMF). Section 3 introduces the improved constrained Bayesian probability matrix factorization algorithm, analyzes the model construction and inference process, and gives the algorithm design and analysis. Section 4 introduces the datasets used in the experiments and the standardized evaluation criteria, and analyzes the experimental results. Section 5 summarizes the main work of this paper and discusses conclusions and future work.

Related work
In recent years, with the rapid development of information technology, scholars and practitioners have done much new work on recommendation algorithms. To better measure and learn from implicit feedback and improve system performance, a collaborative metric learning model named CML was proposed (Hsieh et al. 2017); using metric learning for CF with implicit feedback, it can not only encode user preferences but also learn a joint metric space. A latent relational metric learning model (LRML) was proposed, which analyzes user-item interactions through the latent relationships of users (Tay et al. 2018); this method helps prevent the geometric inflexibility of translation-based knowledge graph embedding from affecting the results. A CF approach using Apache Spark was applied to movie recommendation, focusing on selecting the parameters of the ALS algorithm (Aljunid and Huchaiah 2019); the selected parameters affect the performance of the resulting movie recommendation engine, and the model was tested and evaluated with different measurements. A CF method based on knowledge graph embedding (KGECF) was proposed; it expresses the projection relation as a complex-space rotation to ensure that properties such as composition and inversion are captured. A multi-model deep learning method to optimize collaborative filtering recommendation was proposed (Aljunid and Huchaiah 2020); it improves the performance of collaborative filtering, but it does not consider multiple recommendation methods comprehensively and only improves CF. Match4Rec, a task recommendation method based on task-method matching and a bidirectional encoder, was proposed; it has certain advantages in improving recommendation accuracy through task matching. To improve recommendation systems and enhance the user experience, an efficient hybrid recommendation system was proposed (Aljunid and Huchaiah 2020); it builds further on CF and contributes greatly to improving collaborative filtering performance. An explicit and implicit feedback integration model based on a deep learning collaborative filtering algorithm was proposed (Aljunid and Huchaiah 2022); under a framework called "IntegrateCF", the explicit and implicit coupling interactions within and between users and items are fully considered, along with additional information about users and items, and experiments on two datasets show that this model performs better than previous methods.

The remainder of this section introduces the traditional probability matrix factorization model, the constrained probability matrix factorization model, and the Bayes-based probability matrix factorization model, together with the related theory behind the two more recent models that improve on probability matrix factorization.

Probability matrix factorization model (PMF model)
Because traditional filtering systems cannot handle prediction for users who give very few ratings, Salakhutdinov and Mnih proposed the probabilistic matrix factorization model in 2007 (Yan and Xie 2018). One of the most popular approaches to collaborative filtering is based on low-dimensional factor models. The idea behind such models is that the attitudes or preferences of a user are determined by a small number of unobserved factors (Yang et al. 2019). In a linear factor model, a user's preferences are modeled by linearly combining item factor vectors using user-specific coefficients (Wen et al. 2020). Suppose N users and M products form an N × M rating matrix R, whose element $R_{ij}$ is the rating of user i on item j (Bai and Li 2020). Assuming the number of latent features is D, the D × N matrix U is the users' latent feature matrix, the D × M matrix V is the products' latent feature matrix, $U_i$ is the latent feature vector of user i, and $V_j$ is the latent feature vector of product j (Zhang et al. 2020). Training such a model amounts to finding the best rank-D approximation to the observed N × M target matrix R under the given loss function (Cai and Huang 2020). The probability model diagram is shown in Fig. 1. Assuming that the conditional distribution of the known rating data obeys a Gaussian distribution (Yang et al. 2019; Fang et al. 2020), the conditional distribution of the observed ratings can be defined as in formula 1:

$$p(R \mid U, V, \sigma^2) = \prod_{i=1}^{N} \prod_{j=1}^{M} \left[\mathcal{N}\!\left(R_{ij} \mid U_i^T V_j, \sigma^2\right)\right]^{I_{ij}} \tag{1}$$
In formula 1, $I_{ij}$ is the indicator function of formula 2:

$$I_{ij} = \begin{cases} 1, & \text{if user } i \text{ rated item } j \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

Both the user latent feature vectors and the product latent feature vectors are then assumed to obey zero-mean Gaussian prior distributions, as in formulas 3 and 4:

$$p(U \mid \sigma_U^2) = \prod_{i=1}^{N} \mathcal{N}\!\left(U_i \mid 0, \sigma_U^2 \mathbf{I}\right) \tag{3}$$

$$p(V \mid \sigma_V^2) = \prod_{j=1}^{M} \mathcal{N}\!\left(V_j \mid 0, \sigma_V^2 \mathbf{I}\right) \tag{4}$$
To further solve the objective function, the posterior probabilities of users and items are needed (Kim et al. 2020; Liu et al. 2021). Calculating the posterior probability of the user latent feature matrix U and the item latent feature matrix V and taking the logarithm yields formula 6 for $\ln p(U, V \mid R, \sigma^2, \sigma_U^2, \sigma_V^2)$. Maximizing formula 6 is equivalent to minimizing the objective function of formula 7 (Zhang 2020):

$$E = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{M} I_{ij} \left(R_{ij} - U_i^T V_j\right)^2 + \frac{\lambda_U}{2} \sum_{i=1}^{N} \lVert U_i \rVert^2 + \frac{\lambda_V}{2} \sum_{j=1}^{M} \lVert V_j \rVert^2 \tag{7}$$

where the regularization coefficients are $\lambda_U = \sigma^2/\sigma_U^2$ and $\lambda_V = \sigma^2/\sigma_V^2$. Formula 7 is differentiated with respect to $U_i$ and $V_j$ separately (formula 8), and the stochastic gradient descent method is then used to update the user latent feature vector $U_i$ and the item latent feature vector $V_j$ until the convergence condition is met or the maximum number of iterations is reached, as shown in formula 9. Here $\eta$ is the step size, also called the learning rate; a proper learning rate is conducive to rapid convergence and avoids oscillation (Fang et al. 2021).
Let $e_{ij} = R_{ij} - U_i^T V_j$ be the difference between the real value and the predicted value of user i's rating of item j. The gradient descent updates of $U_i$ and $V_j$ can then be written as:

$$U_i \leftarrow U_i + \eta\left(e_{ij} V_j - \lambda_U U_i\right), \qquad V_j \leftarrow V_j + \eta\left(e_{ij} U_i - \lambda_V V_j\right)$$

The Logistic function can be used in place of the original linear Gaussian mean so that predictions do not exceed the valid rating range, as shown in formulas 13 and 14:

$$g(x) = \frac{1}{1 + e^{-x}} \tag{13}$$

with the conditional mean $U_i^T V_j$ in formula 1 replaced by $g(U_i^T V_j)$.
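To make the training loop above concrete, the following is a minimal sketch of one SGD pass implementing these updates, assuming a dense NumPy rating matrix in which zero marks an unobserved entry; the function and parameter names are illustrative, not from the paper.

```python
import numpy as np

def pmf_sgd_epoch(R, U, V, lr=0.005, lam_u=0.02, lam_v=0.02):
    """One SGD pass over the observed entries of R.

    R : N x M rating matrix, where 0 marks an unobserved entry (I_ij = 0)
    U : D x N user latent feature matrix
    V : D x M item latent feature matrix
    lr : the learning rate eta; lam_u, lam_v : lambda_U and lambda_V
    """
    for i, j in zip(*np.nonzero(R)):
        e_ij = R[i, j] - U[:, i] @ V[:, j]   # prediction error e_ij
        u_old = U[:, i].copy()               # keep old U_i for the V_j update
        U[:, i] += lr * (e_ij * V[:, j] - lam_u * U[:, i])
        V[:, j] += lr * (e_ij * u_old - lam_v * V[:, j])
    return U, V
```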
Although the traditional probability matrix model improves the accuracy of prediction and improves the quality of recommendation, the traditional probability matrix decomposition model will make the features of users with less ratings approach the average value of the prior distribution, which will lead to the prediction of ratings of users with less ratings close to the average score of the item, which affects the accuracy of the recommendation (Zeng et al. 2018).

Constrained probability matrix factorization model (CPMF model)
To solve the problem that the rating predictions for users with few ratings tend toward the item's average score, which harms recommendation accuracy, the constrained probability matrix factorization algorithm was proposed (Cai and Huang 2020). It constrains the user features by adding a constraint matrix: each item is given a constraint vector in addition to its original feature vector, and the average of the constraint vectors of all items rated by a user influences that user's feature vector. The probability model diagram is shown in Fig. 2.
After adding the constraint matrix, the model defines a new user feature vector as follows:

$$Y_i = U_i + \frac{\sum_{k=1}^{M} I_{ik} W_k}{\sum_{k=1}^{M} I_{ik}}$$

The conditional distribution of the observed ratings is then redefined accordingly. Assume that the user feature vector U, the item feature vector V, and the constraint feature vector W all obey zero-mean Gaussian prior distributions with variances $\sigma_U^2$, $\sigma_V^2$, and $\sigma_W^2$, respectively. Taking the logarithm of the posterior probability of U, V, and W and then maximizing it is equivalent to minimizing the objective function shown in formula 20.
Here $\lambda_U = \sigma^2/\sigma_U^2$, $\lambda_V = \sigma^2/\sigma_V^2$, and $\lambda_W = \sigma^2/\sigma_W^2$; as before, the stochastic gradient descent algorithm is used to solve for the optimal U, V, and W of the objective function.
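As an illustration of the constrained representation above, this sketch computes a user's new feature vector from the constraint matrix; the matrix shapes and names are assumptions of the example rather than the paper's code.

```python
import numpy as np

def constrained_user_vector(U, W, I, i):
    """Constrained user feature vector of CPMF: U_i plus the average of
    the constraint vectors of all items rated by user i.

    U : D x N matrix of user offset vectors U_i
    W : D x M matrix of per-item constraint vectors W_k
    I : N x M indicator matrix, I[i, k] = 1 if user i rated item k
    """
    rated = np.nonzero(I[i])[0]              # items rated by user i
    offset = W[:, rated].mean(axis=1) if rated.size else 0.0
    return U[:, i] + offset
```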
Compared with the traditional probability matrix factorization model, the constrained model solves the problem that the rating predictions for users with few ratings tend toward the item's average score, and it has a good recommendation effect for users with sparse ratings. However, the model does not handle over-fitting well, and it ignores the mutually independent attributes of items and users, which also affect rating accuracy to a certain extent (Jiang and Dong 2020).

Probability matrix factorization based on Bayes (BPMF model)
Because the performance of the traditional probability matrix factorization model is not stable enough, its regularization parameters must be adjusted manually, it is prone to over-fitting, and its prediction accuracy for users with sparse ratings is insufficient (Wang et al. 2018; Dong et al. 2019). To solve these problems, Salakhutdinov and Mnih proposed placing the matrix factorization model in a Bayesian framework to obtain a rating probability model with multivariate Gaussian prior distributions, namely BPMF (Jiang and Chen 2019). The probability model diagram is shown in Fig. 3. Assume that the prior feature vectors of users and items obey Gaussian distributions, as shown in formulas 21 and 22.
The conditional distribution function of the model is defined in formula 23. To improve the prediction accuracy of the model, the prior distributions of the user and item hyperparameters $\Theta_U = \{\mu_U, \Lambda_U\}$ and $\Theta_V = \{\mu_V, \Lambda_V\}$ are set to the Gauss-Wishart distribution (Zhang et al. 2018).
In formulas 24 and 25, $\mathcal{W}$ denotes the Wishart distribution with $\nu_0$ degrees of freedom and scale matrix $W_0$, as shown in formula 26.
In formula 26, C is the normalizing constant. Bayesian probability matrix factorization is generally trained with the Markov chain Monte Carlo method, and the resulting error is relatively small (Wang et al. 2018; Dong et al. 2019). However, because of its linear model and sensitivity to data, its performance is limited and the prediction accuracy still needs to be improved (Zhang and Yang 2021).
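For a sense of how the Gauss-Wishart prior is used during training, here is a hedged sketch of sampling the user hyperparameters from their conjugate posterior; it follows the standard BPMF conditionals rather than this paper's exact parameterization, and the helper name and arguments are assumptions.

```python
import numpy as np
from scipy.stats import wishart

def sample_user_hyperparams(U, mu0, beta0, W0, nu0, rng):
    """Draw (mu_U, Lambda_U) from the Gaussian-Wishart posterior given the
    current user feature samples U (a D x N matrix)."""
    D, N = U.shape
    U_bar = U.mean(axis=1)
    S = np.cov(U, bias=True)                 # sample covariance about the mean
    beta_star = beta0 + N
    mu_star = (beta0 * mu0 + N * U_bar) / beta_star
    nu_star = nu0 + N
    diff = (U_bar - mu0).reshape(-1, 1)
    W_star = np.linalg.inv(np.linalg.inv(W0) + N * S
                           + (beta0 * N / beta_star) * (diff @ diff.T))
    Lambda_U = wishart.rvs(df=nu_star, scale=W_star, random_state=rng)
    mu_U = rng.multivariate_normal(mu_star, np.linalg.inv(beta_star * Lambda_U))
    return mu_U, Lambda_U
```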

Model construction and algorithm design
This section introduces a constrained Bayesian probability matrix factorization model based on the Logistic function, the LCBPMF model. Because many existing approaches to collaborative filtering can neither handle very large datasets nor easily deal with users who have very few ratings, they suffer from low prediction accuracy and poor recommendation quality. The LCBPMF model presented in this section addresses these problems. The section describes the construction and inference process of the LCBPMF model, then designs and analyzes the algorithm.

LCBPMF model predictions
The main improvement of the model, built on the Bayesian framework, is to add a latent similarity constraint matrix to the user feature vectors: each item is assigned a constraint vector in addition to its feature vector, and the constraint vectors of all items rated by a user influence that user's feature values, preventing them from collapsing toward the mean of the prior distribution. At the same time, to better express the nonlinear relationship of the latent factors, suppose that user i's rating $R_{ij}$ of item j obeys a Gaussian distribution with mean $B_i\, g(Y_i^T V_j)$ and variance $\alpha^{-1}$. For the symbols of the model parameters, refer to Table 1. The probability model diagram is shown in Fig. 4.
Set the latent similarity constraint matrix W; the new feature vector of the user is then:

$$Y_i = U_i + \frac{\sum_{k=1}^{M} I_{ik} W_k}{\sum_{k=1}^{M} I_{ik}} \tag{27}$$

In formula 27, a new user feature vector is generated by constraining the user vector $U_i$. The indicator $I_{ik}$ shows whether user i has rated item k: if so, $I_{ik} = 1$; otherwise $I_{ik} = 0$. The constraint vectors $W_k$ of all items k rated by user i are accumulated and averaged, and the result is added to user i's offset from the mean of the prior distribution to obtain the new user feature vector.
Supposing that user i's rating $R_{ij}$ of item j obeys a Gaussian distribution with mean $B_i\, g(Y_i^T V_j)$ and variance $\alpha^{-1}$, a new probability objective function for the LCBPMF model is constructed, and the conditional distribution of the observed ratings is defined in formula 28. In formula 28, g(x) is the Logistic function and $B_i$ is a parameter representing the rating scale of user i. Suppose B obeys a Gaussian distribution with mean $\mu_B$ and precision $\Lambda_B$, and set the prior distributions of the hyperparameters at the same time. The model parameters are initialized as $\Theta_0 = \{\mu_0, \mu_1, \mu_2, \nu_0, \nu_1, \nu_2, W_0, W_1, W_2, \alpha, \beta_0\}$; to reduce the effort of tuning parameters, one generally sets the values given in formulas 29 and 30, where m is the average user rating and D is the dimension of the latent feature vector (Liu et al. 2021).
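The sketch below computes the mean of the rating distribution just described, combining the constrained user vector of formula 27 with the Logistic link and the per-user scale $B_i$; shapes and names are illustrative assumptions.

```python
import numpy as np

def logistic(x):
    """Logistic function g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def predicted_mean(U, V, W, B, I, i, j):
    """Mean B_i * g(Y_i^T V_j) of the Gaussian rating distribution.

    U : D x N user offsets; V : D x M item features
    W : D x M constraint vectors; B : length-N rating-scale parameters
    I : N x M indicator matrix of observed ratings
    """
    rated = np.nonzero(I[i])[0]
    Y_i = U[:, i] + (W[:, rated].mean(axis=1) if rated.size else 0.0)  # formula 27
    return B[i] * logistic(Y_i @ V[:, j])
```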
In this model, the predictive distribution of the rating value $R^*_{ij}$ for user i and query item j is obtained by marginalizing over the model parameters and hyperparameters (Zeng et al. 2019; Liu and Li 2020). The probability objective function for user i's predicted rating $R^*_{ij}$ of item j is shown in formula 31.
Under normal circumstances, solving the posterior distribution is complicated, and the joint probability in formula 31 is generally not easy to obtain (Jiang and Dong 2020; Zhang et al. 2020). The approach used in this article is to draw samples with the Markov chain Monte Carlo method and then use formula 32 to approximate the complex objective function.
The samples $\{U_i^t, V_j^t, W_k^t, B_i^t\}$ are generated by running a Markov chain whose stationary distribution is the posterior distribution over the model parameters and hyperparameters $\{U, V, W, B, \Theta_U, \Theta_V, \Theta_W, \Theta_B\}$. The advantage of Monte Carlo-based methods is that asymptotically they produce exact results (Zhang et al. 2018; Dong et al. 2019; Liu and Meng 2020). In practice, however, MCMC methods are usually perceived as so computationally demanding that their use is limited to small-scale problems (Wen et al. 2019; Jiang and Chen 2019; Liu and Li 2020).
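A minimal sketch of the Monte Carlo approximation in formula 32 follows, averaging predictions over the retained samples; it reuses the hypothetical predicted_mean helper from the previous sketch.

```python
import numpy as np

def predict_rating(samples, I, i, j):
    """Approximate the predictive mean of R*_ij by averaging over T Gibbs
    samples (formula 32); `samples` is a list of (U, V, W, B) tuples
    collected after burn-in."""
    preds = [predicted_mean(U, V, W, B, I, i, j) for (U, V, W, B) in samples]
    return float(np.mean(preds))
```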

LCBPMF model inference
One of the simplest MCMC algorithms is Gibbs sampling, a special case of the MCMC method that cycles through the latent variables, sampling each one from its distribution conditioned on the current values of all the other variables. Gibbs sampling is typically used when these conditional distributions can be sampled from easily (Jiang and Dong 2020; Liu and Meng 2020).
To facilitate sampling of the model parameters and hyperparameters, Bayesian inference is used, which combines prior beliefs with sample data and then infers from the posterior distribution. In training the model, this article uses Gibbs sampling, within the MCMC framework, for Bayesian inference. Conditional probabilities are used to construct a Markov chain whose stationary distribution is the desired joint probability distribution; after sampling T times, the samples (U, V, W, B) can be treated approximately as draws from the joint probability, and the rating prediction is finally made with formula 32.
In the LCBPMF model, when the other parameters are known, the objective function of the posterior probability of the user feature vector $U_i$ is as shown in formula 33.
To further simplify the objective function of the posterior probability of $U_i$, perform a Maclaurin expansion of $B_i\, g(Y_i^T V_j)$, as shown in formula 34. Under normal circumstances, transforming the posterior distribution into a conditional distribution is more conducive to sampling. According to conjugate prior theory, the conjugate prior of the mean of a Gaussian distribution is itself Gaussian, so the conditional distribution of the user feature vector $U_i$ is found to be Gaussian.
The parameters $\Lambda^*_{U_i}$ and $\mu^*_{U_i}$ in formula 35 are given in formulas 36 and 37, respectively.
In the same way, the conditional posterior probability of $V_j$ can be solved; the result is shown in formula 38.
The conditional posterior probability of $W_k$ is calculated next; the result is shown in formula 39.
The parameters $\Lambda^*_{W_k}$ and $\mu^*_{W_k}$ in formula 39 are given in formulas 40 and 41, respectively.
Finally, the conditional posterior probability of $B_i$ is solved; the derivation is shown in formula 42.
The parameters $\Lambda^*_{B_i}$ and $\mu^*_{B_i}$ in formula 42 are given in formulas 43 and 44, respectively.
The posterior probability of the hyperparameter $\Theta_W$ can be obtained using the properties of the Gauss-Wishart distribution: the posterior of the corresponding hyperparameter follows from the posteriors of the mean and variance (Wen et al. 2019; Jiang and Chen 2019; Liu and Li 2020). Formula 46 follows from formula 45 through the properties of the Gauss-Wishart distribution, and the quantities $W^*_W$ and $\bar{W}$ appearing in formula 46 are given below it. In the same way, the conditional posterior probability of the hyperparameter $\Theta_B$ can be obtained, with the quantities $W^*_B$ and $\bar{B}$ in formula 49 given below it. Likewise, the posterior probabilities of the hyperparameters $\Theta_U$ and $\Theta_V$ can be obtained; the derivation is not repeated, and the results are shown in formulas 51 and 52.
The quantities $W^*_U$, $\bar{U}$, $W^*_V$, $\bar{V}$, and S in formulas 51 and 52 are defined analogously. This completes the derivation of the conditional probabilities of all the parameters of the model.
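Once a conditional posterior of the form $\mathcal{N}(\mu^*, (\Lambda^*)^{-1})$ is available, drawing from it is routine; the following sketch uses a Cholesky factor of the precision matrix to avoid an explicit inverse. This is an implementation choice of the example, not something prescribed by the paper.

```python
import numpy as np

def sample_conditional_gaussian(precision, mean, rng):
    """Draw one sample from N(mean, precision^-1), the form of the
    conditional posteriors in formulas 35, 38, 39 and 42."""
    L = np.linalg.cholesky(precision)        # precision = L @ L.T
    z = rng.standard_normal(mean.shape)
    # x = mean + L^{-T} z has covariance L^{-T} L^{-1} = precision^{-1}
    return mean + np.linalg.solve(L.T, z)
```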

Design and analysis of algorithms
In this study, Bayesian inference is adopted, combining prior beliefs with sample data and reasoning from the posterior distribution. First, conditional probabilities are used to construct a Markov chain whose stationary distribution is the desired joint probability; Gibbs sampling is then used to sample the hyperparameters T times, so that the samples (U, V, W, B) can be treated approximately as draws from the joint probability, after which the feature values of the U, V, W, and B vectors are traversed and updated. Finally, the sparse matrix is filled in: the model is trained on the training set, errors are verified on the test set, and the prediction accuracy of the model is obtained. The algorithm steps are as follows (an illustrative outline is sketched below).
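As an illustrative outline of these steps, the skeleton below shows the structure of the Gibbs training loop under the assumptions of the earlier sketches; the conditional-sampling steps are indicated by comments rather than spelled out, and none of this is the authors' reference implementation.

```python
import numpy as np

def train_lcbpmf(R, D=30, T=100, burn_in=20, seed=0):
    """Skeleton of the LCBPMF Gibbs training loop described above."""
    rng = np.random.default_rng(seed)
    N, M = R.shape
    I = (R > 0).astype(int)                  # indicator of observed ratings
    U = 0.1 * rng.standard_normal((D, N))    # user offset vectors
    V = 0.1 * rng.standard_normal((D, M))    # item feature vectors
    W = 0.1 * rng.standard_normal((D, M))    # constraint vectors
    B = np.full(N, R[R > 0].mean())          # per-user rating scales
    samples = []
    for t in range(T):
        # 1. sample the hyperparameters Theta_U, Theta_V, Theta_W, Theta_B
        #    from their Gauss-Wishart posteriors (formulas 45-52)
        # 2. sample each U_i, V_j, W_k and B_i from its conditional
        #    Gaussian (formulas 35-44), e.g. with sample_conditional_gaussian
        # 3. after burn-in, keep the draw for prediction (formula 32)
        if t >= burn_in:
            samples.append((U.copy(), V.copy(), W.copy(), B.copy()))
    return samples
```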
The total computational complexity of the model is the same as that of the BPMF model. Although the complexity is unchanged, LCBPMF optimizes the traditional Bayesian model and greatly improves its ability to extract latent features; the model achieves a good prediction effect even when the dimension of the user latent feature vector is very low.

Experimental design and result analysis
To demonstrate the superiority of the LCBPMF algorithm, this article conducts experiments on different datasets: it compares LCBPMF with PMF, CPMF, and BPMF; compares BPMF and LCBPMF across different latent feature dimensions; tests the algorithm's ability to extract latent feature vectors from sparse matrices by varying the feature vector dimension; tests the prediction accuracy of the model on different datasets; and evaluates the effectiveness of the algorithm with standardized evaluation indicators.

Description of data set and evaluation indicators
This article uses the MovieLens and Netflix datasets, specifically MovieLens-100K, MovieLens-1M, and Netflix. MovieLens-100K contains 1×10⁵ ratings of 1682 movies by 943 users (Yang et al. 2019; Zhang and Yang 2021). MovieLens-1M contains 1 million ratings of 8662 movies by 6039 users (Wang et al. 2020). Netflix contains more than 100 million ratings of more than 1.7×10⁴ movies from 4.8×10⁵ anonymous Netflix customers. In this experiment, about 3×10⁵ ratings of 3000 movies by 8662 users were randomly selected from Netflix. Each dataset is split at a ratio of 8:2, with 80% of the data used as the training set and the remaining 20% as the test set. The statistics of the datasets are shown in Table 2. There are many common evaluation criteria for recommendation systems (Liu and Li 2020; Jiang and Dong 2020). To better evaluate the superiority of the proposed model, this article uses the root mean square error (RMSE) and the mean absolute error (MAE) as the evaluation basis: the smaller the RMSE and MAE, the smaller the prediction error and the higher the accuracy (Kim et al. 2020; Liu and Meng 2020). The standardized indicators RMSE and MAE can evaluate the performance of the model well. The definitions of RMSE and MAE are shown in formulas 54 and 55.
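For reference, formulas 54 and 55 correspond to the following straightforward implementations; the function names are this example's own.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error (formula 54)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error (formula 55)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))
```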
To fully verify the advantages of the proposed method, the normalized discounted cumulative gain (NDCG@k) and hit rate (HR@k) are also used in this study to evaluate the proposed algorithm model.
The idea behind the discounted cumulative gain (DCG) is to rank the items that users like first, which improves their experience. The formula for DCG is given in formula 56, where $r_i$ indicates whether the item ranked i-th is liked by the current user: if $r_i = 1$, the user likes the item; otherwise, the user does not. L is the length of the recommendation list, and b is a free parameter, generally set to 2.
Because directly comparing DCG across different users is not reasonable, it is generally normalized. In this study, the original DCG is divided by the DCG in the ideal state to obtain the normalized discounted cumulative gain, NDCG, as defined in formula 57. The hit rate (HR@k) is the ratio of the number of users whose test-set items appear in the top-k recommendation list to the total number of users, as defined in formula 58, where |users| is the total number of users and |hits| is the number of users whose test-set items appear in the top-k recommendation list.
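A compact sketch of formulas 56-58 follows; the binary-relevance convention and function names are assumptions of this example.

```python
import numpy as np

def dcg_at_k(rels, k, b=2):
    """Discounted cumulative gain over the top-k ranked items (formula 56);
    rels[i] = 1 if the user likes the item at rank i+1, else 0."""
    rels = np.asarray(rels, dtype=float)[:k]
    discounts = np.log(np.arange(2, rels.size + 2)) / np.log(b)  # log_b(rank + 1)
    return float(np.sum(rels / discounts))

def ndcg_at_k(rels, k, b=2):
    """NDCG@k: DCG divided by the DCG of the ideal ordering (formula 57)."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k, b)
    return dcg_at_k(rels, k, b) / ideal if ideal > 0 else 0.0

def hit_rate_at_k(n_hit_users, n_users):
    """HR@k: |hits| / |users| (formula 58), where |hits| counts users whose
    test items appear in their top-k recommendation list."""
    return n_hit_users / n_users
```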

Parameter setting
To test the model's ability to extract latent feature vectors from a sparse matrix, the feature vector dimension D is set to 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100 in turn, with 50 iterations. The average of the converged RMSE and MAE over the iterations is used to test the rating prediction performance of the model. By comparing the performance of the proposed model with that of the baseline models, the latent feature dimension is chosen as 30 and the sample size as 100; the samples $\{U_i^t, V_j^t, W_k^t, B_i^t\}$ are generated by running a Markov chain whose stationary distribution is the posterior distribution over the model parameters and hyperparameters $\{U, V, W, B, \Theta_U, \Theta_V, \Theta_W, \Theta_B\}$. To reduce the effort of tuning parameters, $W_0$, $W_1$, and $W_2$ are set to the identity matrix, $\mu_0 = \mu_1 = \mu_2 = 0$, $\nu_0 = \nu_1 = \nu_2 = D = 30$, the learning rate is 0.05, the regularization factor is 0.02, and the degrees of freedom are 30. Comparisons were made under different feature vector dimensions; D = 30 was selected for its good stability, the recommended number K = 10 and the number of epochs set to 30, and the performance of the LCBPMF model was tested on the three datasets.

Experimental process and comparative analysis
To further verify the effectiveness of the proposed model, three sets of experiments were carried out, with root mean square error and mean absolute error as evaluation indexes. The RMSE and MAE of the compared models were analyzed from different angles to further enhance the interpretability of the proposed algorithm (Bai and Li 2020; Guo et al. 2020; Dong et al. 2019). The first set of experiments uses the MovieLens-100K dataset after preprocessing, and the results are obtained on the test set. The dimension D of the latent feature vectors is set to 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100 in turn, to test the ability of the model to extract latent feature vectors from the sparse matrix, with a total of 50 iterations (Jiang et al. 2020). The average of the converged RMSE and MAE over the iterations is taken. The experimental results are shown in Figs. 5 and 6.
Figures 5 and 6 show that when the rating matrix is sparse, the RMSE and MAE of the LCBPMF model are reduced: the RMSE of LCBPMF is about 0.4, 0.3, and 0.1 lower than that of the PMF, CPMF, and BPMF models, respectively, and its MAE is about 0.12, 0.06, and 0.05 lower. For feature vectors of different dimensions, the proposed model consistently shows strong latent feature extraction ability. The analysis of Figs. 5 and 6 indicates that the LCBPMF model is better suited than the current models to solving the insufficient prediction accuracy caused by sparse user ratings: even when the feature vector dimension is very low, the error of the proposed model remains small, so its usability is stronger. LCBPMF extracts latent feature vectors from sparse matrices better than the other models across different feature vector dimensions, and it is superior to them in solving the sparse rating problem.
Using standardized evaluation criteria increases the interpretability of the algorithm (Wen et al. 2019). The second set of experiments compares the RMSE and MAE of the PMF, CPMF, BPMF, and LCBPMF models on the three datasets. In this experiment, the latent feature dimension is set to 30, the number of iterations to 50, the number of samples to 100, the learning rate to 0.05, the regularization factor to 0.02, and the degrees of freedom to 30, with $\mu_0 = \mu_1 = \mu_2 = 0$ and $\nu_0 = \nu_1 = \nu_2 = D = 30$. The experimental results are shown in Fig. 7, which plots the performance of the proposed model and the baseline methods on the three datasets using the RMSE metric.
According to the convergence intervals in Fig. 7a, on the MovieLens-100K dataset the root mean square error (RMSE) of LCBPMF decreases by about 0.10, 0.09, and 0.07 compared with PMF, CPMF, and BPMF, respectively. As Fig. 7b shows, on the MovieLens-1M dataset the RMSE of LCBPMF decreases by about 0.13, 0.11, and 0.10, respectively. As Fig. 7c shows, on the Netflix dataset the RMSE of LCBPMF decreases by about 0.16, 0.13, and 0.12, respectively, compared with the other three models. In conclusion, compared with the other three algorithms, LCBPMF has smaller error, faster convergence, and greatly improved prediction accuracy. The LCBPMF algorithm therefore has more advantages than the others in solving the sparse rating problem; it can better meet practical recommendation requirements, recommend satisfactory products to customers, and improve recommendation quality.
From the above experiments, whether judged by predictive performance, by different latent feature dimensions, or by different datasets, the LCBPMF model proposed in this paper has better prediction accuracy than PMF, CPMF, and BPMF. For users, the LCBPMF algorithm can complete the recommendation task more accurately and greatly improve the user experience; for merchants, it can recommend products to the users who need them, greatly improving recommendation quality. Compared with other models, the hyperparameters of the LCBPMF algorithm are the optimal model parameters obtained by continuously updating and iteratively sampling the model with the MCMC method under the Bayesian framework, so the model is interpretable and the selection of hyperparameters is more reasonable.
To fully verify the advantages of the proposed method, the NDCG@k and hit rate (HR@k) indicators are used to evaluate the proposed algorithm model. This experiment applies NDCG@k and HR@k as follows: (1) Fill in the user-item matrix with the prediction scores of the above models. Generate the recommendation list from the rating prediction results, and obtain the result score list and the ideal result score list from the recommendation list.
(2) Calculate the cumulative gain CG. For a result list A containing k items, where the score of the i-th item is $r_i$, the total score is $\sum_{i=1}^{k} r_i$.
(3) CG does not consider order, but in the final result the score of a higher-ranked item should count for more than that of a lower-ranked one. Therefore, a discount factor is applied to the gain of lower-ranked items, completing the DCG calculation through formula 56.
(4) Calculate the IDCG in the ideal state and, using the DCG results from step (3), apply formula 57 to normalize DCG by IDCG to obtain NDCG, which indicates how close the current result is to the ideal result.
(5) If an item in the test set appears in the top-k recommendation list, it is counted as a hit. According to the number of hits, use formula 58 to calculate the hit rate (HR@k).
With D = 30 selected for its good stability, the recommended number K = 10, and the number of epochs set to 30, the performance of the LCBPMF model was tested on the three datasets. The experimental results are shown in Table 3 and Fig. 8, which report the performance of the proposed model and the baseline methods on the three datasets using the HR@10 and NDCG@10 metrics. On these indicators, under different epochs, the NDCG and hit ratio of the proposed LCBPMF are higher, and the algorithm proposed in this study outperforms other current algorithms.
To prove the effectiveness of the proposed method, the latest baselines are added for analysis and comparison, making the contributions of this study more convincing. The baseline models used in this study are as follows: • CML (Hsieh et al. 2017). This model is a collaborative metric learning model. It uses CF metric learning with implicit feedback, which can not only encode user preferences but also learn a joint metric space.
• LRML (Tay et al. 2018). This is a latent relational metric learning method that analyzes user-item interactions through the latent relationships of users. It helps prevent the geometric inflexibility of translation-based knowledge graph embedding from affecting the results.
• KGECF. This CF method based on knowledge graph embedding expresses the projection relation as a complex-space rotation to ensure that properties such as composition and inversion are captured.
• Match4Rec. Match4Rec is a task recommendation method using task-method matching and a bidirectional encoder. This method has certain advantages in improving recommendation accuracy.
• IntegrateCF (Aljunid and Huchaiah 2022). The model uses a framework called "IntegrateCF", which combines explicit and implicit coupling interactions within and between users and items, together with additional information about users and items.
The results of the experiment are shown in Table 4. The method proposed in this study more accurately improves user rating prediction and thus improves recommendation performance. Compared with the CML, LRML, KGECF, and Match4Rec models, the LCBPMF model shows significant improvements in HR@10 and NDCG@10; the performance difference between the IntegrateCF model and the LCBPMF model is small. In general, the recommendation performance of LCBPMF has certain advantages. The experiments show that compared with the latest baseline models, the proposed model has higher recommendation performance, which increases the interpretability of this study.
As shown in Table 3, the PMF model makes the features of users with few ratings approximate the mean of the prior distribution, so the rating predictions for such users approach the item's average score, which affects recommendation accuracy. The CPMF model does not solve the over-fitting problem well and ignores the independent attributes of items and users, which also affect rating prediction accuracy and model performance to some extent. The performance of BPMF is limited by its linear model and sensitivity to data. Generally, treating user-item data as independent instances makes it difficult to exploit item data to improve system recommendation performance (Aljunid and Huchaiah 2022); moreover, it is often untenable to assume that all attributes are independent of each other, since the coupling relationships between attributes and instances are complex. In Table 4, the CML, LRML, and KGECF models all share such disadvantages. Dynamic changes in user preferences make it difficult to capture user patterns from historical sequences, which may lower the performance of Match4Rec and IntegrateCF.

We construct a novel LCBPMF model, adding a rating constraint matrix to prevent the features of users with few ratings from approximating the mean of the prior distribution, and introducing a Bayesian network architecture to account for the independent attributes of items and users. In addition, the latent similarity constraint matrix is introduced for specific sparse-rating users to influence their feature vectors, the Logistic function is used to represent the nonlinear relationship between latent factors, and the Markov chain Monte Carlo method is used for training. We use conditional probabilities to construct a Markov chain whose stationary distribution is the desired joint probability, then use Gibbs sampling to sample the hyperparameters T times. We add the constraint vectors W and scale vector B, treat the samples (U, V, W, B) approximately as draws from the joint probability, and then traverse and update the feature values of the vectors, so the latent characteristics of users can be extracted more accurately. As can be seen from Tables 3 and 4, the performance of this model is superior to the current baseline models.
The theoretical and experimental results show that the proposed method can be applied to practical scenarios. Collaborative filtering recommendation relies on users' historical behavior data, such as browsing duration, whether the user likes an item, and whether the user comments; from these data, developers can calculate users' rating values for the corresponding items. However, most platforms have a large number of items and users, and many users generate no behavior data for most items, which makes the user rating matrix sparse. The method proposed in this study can better predict users' ratings of items and, by filling in the sparse rating matrix, obtain a wealth of user rating data, which improves recommendation performance.
The improved algorithm proposed in this study targets the low prediction accuracy of collaborative filtering under sparse ratings. Based on the probability matrix factorization model and combined with the work of current scholars, the algorithm makes further progress: judged by the evaluation indicators, it outperforms current improvements to the probability matrix factorization algorithm without increasing algorithmic complexity. However, the algorithm still has the following limitations: (1) The algorithm provided in this study only aims at achieving good performance under existing dataset conditions and in a laboratory environment; its performance on real user and item data in actual application scenarios has not been verified in practice.
(2) The algorithm provided in this study improves recommendation performance mainly from the perspective of rating prediction; other ways of improving recommendation performance are not reflected in this study.

Conclusions and future work
Because matrix factorization in existing collaborative filtering algorithms cannot solve the low prediction accuracy and poor recommendation quality caused by the sparse ratings of specific users, this article proposes an improved constrained Bayesian probability matrix factorization model and uses the MCMC method to sample and train it. Comparative analysis across three sets of experiments shows that the proposed model solves well the insufficient prediction accuracy caused by sparse user ratings; it improves the quality of recommendations and recommends items for users more accurately. The innovations of this article are summarized as follows: 1. An improved constrained Bayesian probability matrix factorization algorithm is proposed. On the basis of the Bayesian framework, the model is optimized by adding a constraint matrix and using the Logistic function to express the nonlinear relationship of the latent factors. Compared with earlier methods, it reduces errors and improves prediction accuracy.
2. We apply the Markov chain Monte Carlo method to the MovieLens and Netflix datasets; the hyperparameters of the LCBPMF algorithm are the optimal model parameters obtained by continuously updating and iteratively sampling the model with the MCMC method under the Bayesian framework, making the algorithm more interpretable.
In future work, we will use the LCBPMF algorithm to complete recommendation tasks for specific users with sparse ratings, improving the quality of recommendation and enhancing the user experience.
Funding The research content of this article is part of a science and technology project of the Jiangxi Provincial Department of China (2021204201400711). This research is mainly led by the big data application technology group, with the intelligent network technology group as strategic partner.

Data Availability Enquiries about data availability should be directed to the authors.

Declarations
Conflict of interest The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.