## 3.1 Genetic algorithm

From the modern mathematical theory of genetic algorithms, researchers can optimize the underlying mathematical structure of a genetic algorithm with respect to its convergence, its mode of convergence, the mechanism by which parameters affect the search process, and the mathematical functions it employs. Three kinds of mathematical models are generally used to study the convergence behavior of genetic algorithms: the axiomatic model, the Vose–Liepins model, and the Markov chain model. When measuring the similarity rate between individuals, the information entropy of a group can be computed as follows:

$$H(M)=-\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{S}p_{ij}\log p_{ij}\tag{1}$$

where M denotes the group of individuals, N is the number of genes per individual, S is the number of alleles available for selection, and \(p_{ij}\) is the proportion of individuals whose ith gene takes the jth allele. The similarity between individual P and individual Q is then defined through the entropy H(2) of the two-individual group:

$$A_{PQ}=\frac{1}{1+H(2)}\tag{2}$$
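As a minimal sketch of Eqs. (1)–(2), the following snippet (all function names are illustrative, not from the paper) computes the group information entropy of a set of equal-length gene strings and the derived similarity between two individuals:

```python
import math

def group_entropy(population):
    """H(M) of Eq. (1): p_ij is the fraction of individuals whose gene at
    locus i takes allele j; the entropy is averaged over the N loci."""
    m = len(population)
    n_genes = len(population[0])
    h = 0.0
    for i in range(n_genes):
        counts = {}
        for ind in population:
            counts[ind[i]] = counts.get(ind[i], 0) + 1
        for c in counts.values():
            p = c / m
            h -= p * math.log(p)
    return h / n_genes

def similarity(p, q):
    """A_PQ of Eq. (2): similarity via the entropy of the two-member group."""
    return 1.0 / (1.0 + group_entropy([p, q]))
```

Identical individuals give zero entropy and hence similarity 1; the more loci on which two individuals differ, the lower their similarity.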

At present, the roulette-wheel method is a common way to implement the selection operator. It decides whether an individual is passed on to the offspring according to a probability proportional to its fitness value. Its advantage is that individuals with high fitness values are more likely to be inherited, while those with poor fitness values are less likely to be. Its disadvantage is equally obvious: even the best individual, however likely to be selected, may still fail to be inherited at all. The specific steps of roulette-wheel selection are as follows:

Step 1: Calculate the sum F of all individual fitness function values:

$$F=\sum_{i=1}^{n}f_{i}\tag{3}$$

where \(f_{i}\) is the fitness function value of the ith individual.

Step 2: Calculate the selection probability \(P_{si}\) of each individual:

$$P_{si}=\frac{f_{i}}{F}\tag{4}$$
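The two steps above can be sketched as follows (a hedged illustration; `roulette_select` is a hypothetical helper, not code from the paper):

```python
import random

def roulette_select(fitness, rng=random):
    """Roulette-wheel selection (Eqs. 3-4): spin once and return the index
    of the selected individual; selection probability is P_si = f_i / F."""
    total = sum(fitness)          # F, Eq. (3)
    r = rng.uniform(0.0, total)   # position where the "ball" lands
    acc = 0.0
    for i, f in enumerate(fitness):
        acc += f                  # cumulative boundary of individual i
        if acc >= r:
            return i
    return len(fitness) - 1       # guard against floating-point round-off
```

As the text observes, each spin is random, so even the fittest individual is only *likely* to be chosen and may still fail to be inherited.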

Using the input pattern \(P_k\), the connection weights \(w_{ij}\), and the thresholds \(\theta_{j}\), calculate the input \(s_{j}\) of each middle-layer (hidden) unit, and then compute the middle-layer output \(b_{j}\) from \(s_{j}\):

$$s_{j}=\sum_{i=1}^{n}w_{ij}a_{i}-\theta_{j},\quad j=1,2,\dots,p\tag{5}$$

$$b_{j}=f(s_{j}),\quad j=1,2,\dots,p\tag{6}$$

From \(b_{j}\), the connection weights \(v_{jt}\), and the thresholds \(\gamma_{t}\), calculate the input \(L_{t}\) of each output-layer unit, and then the response \(C_{t}\) of each output-layer unit:

$$L_{t}=\sum_{j=1}^{p}v_{jt}b_{j}-\gamma_{t},\quad t=1,2,\dots,q\tag{7}$$

$$C_{t}=f(L_{t}),\quad t=1,2,\dots,q\tag{8}$$

From the kth target vector \((y_{1}^{k},\dots,y_{q}^{k})\) and the network's actual output \(C_{t}\), calculate the output-layer unit error \(d_{t}^{k}\):

$$d_{t}^{k}=\left(y_{t}^{k}-C_{t}\right)\cdot C_{t}\left(1-C_{t}\right),\quad t=1,2,\dots,q\tag{9}$$
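Under the common assumption that the activation f is the sigmoid (which makes the \(C_t(1-C_t)\) factor in Eq. (9) the derivative of f), one forward pass of Eqs. (5)–(9) can be sketched in NumPy; the array names mirror the symbols in the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_and_error(a, w, theta, v, gamma, y):
    """Forward pass of the three-layer network plus output-layer error.

    a: input pattern (n,);       w: input-to-hidden weights (n, p);
    theta: hidden thresholds (p,);  v: hidden-to-output weights (p, q);
    gamma: output thresholds (q,);  y: target vector (q,).
    """
    s = a @ w - theta            # Eq. (5): hidden-layer input s_j
    b = sigmoid(s)               # Eq. (6): hidden-layer output b_j
    L = b @ v - gamma            # Eq. (7): output-layer input L_t
    C = sigmoid(L)               # Eq. (8): output-layer response C_t
    d = (y - C) * C * (1 - C)    # Eq. (9): output-layer error d_t^k
    return C, d
```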

The fitness function values of the offspring show that the prediction accuracy of the resulting model converges to a level well above that of the original model, which is the convergence we require. Its relationship with the number of iterations is shown in Fig. 1: the horizontal axis is the iteration count, the vertical axis is the highest fitness function value in each generation of individuals, and the dotted line marks the fitness function value of the original model.

As can be seen from Fig. 2, the MAE of the ItemCF algorithm decreases gradually as the neighbor-list length K increases from 5 to 100. As K grows from 5 to 20, the MAE drops rapidly; for K > 20 it changes little, remaining between 0.80 and 0.84.

## 3.2 Cloud computing recommendation

The concept of cloud computing has caused a sensation in the industry, and major IT companies are scrambling to join cloud computing research; it is one of the trends of future IT development. What exactly is cloud computing? According to the definition in IBM's white paper, the term is usually used to describe certain information-system network platforms, or to refer to certain types of applications. Such a platform can provide on-demand deployment, configuration, reinstallation, and even deletion of services. Anyone who builds a Web site or runs an App business inevitably faces thorny problems such as data storage and big-data computing, and the cloud computing platform undoubtedly provides a practical set of solutions to them. Cloud computing delivers computing as a service and achieves cheap, elastic scaling over the Internet. It pools most computer system resources into a shared data pool, distributes computing tasks in parallel, and supplies computing power, storage capacity, and information to application systems according to their differing requirements.

The main idea of the content-based recommendation method is to identify the features of items that a consumer has rated highly, and then recommend other items with the same features to that consumer. For items represented by text, such as news articles and web pages, the feature vector usually consists of term frequency–inverse document frequency (TF-IDF) weights, i.e., the weights of the most informative keywords. The user profile vector is generally obtained from the subset of items the user has rated:

$$X_{u}=\sum_{i\in I_{u}}r_{ui}X_{i}\tag{10}$$

An unknown rating can be estimated by averaging the ratings that the neighboring users \(N_{i}(u)\) gave to item i:

$$\widehat{r}_{ui}=\frac{1}{\left|N_{i}(u)\right|}\sum_{v\in N_{i}(u)}r_{vi}\tag{11}$$

If the sum of these weights (i.e., similarities) is not 1, the predicted rating may fall outside the allowed range of rating values. It is therefore customary to normalize the weights, so the rating prediction becomes:

$$\widehat{r}_{ui}=\frac{\sum_{v\in N_{i}(u)}w_{uv}r_{vi}}{\sum_{v\in N_{i}(u)}\left|w_{uv}\right|}\tag{12}$$
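Equations (11)–(12) can be sketched as a small neighborhood predictor (a hedged example; the dict-based data layout is an assumption, not the paper's):

```python
def predict_rating(ratings, weights, u, i, neighbors):
    """Eq. (12): similarity-weighted average of the neighbors' ratings on i.

    ratings:   dict (user, item) -> rating r_vi
    weights:   dict (u, v) -> similarity w_uv
    neighbors: users v in N_i(u) who have rated item i
    """
    num = sum(weights[(u, v)] * ratings[(v, i)] for v in neighbors)
    den = sum(abs(weights[(u, v)]) for v in neighbors)
    return num / den if den else 0.0
```

With all weights equal this reduces to the plain average of Eq. (11), and dividing by the sum of absolute weights keeps the prediction inside the rating scale whenever the weights are non-negative.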

Applying an invertible transformation h to the ratings yields a new rating-prediction method:

$$\widehat{r}_{ui}=h^{-1}\left(\frac{\sum_{v\in N_{i}(u)}w_{uv}h(r_{vi})}{\sum_{v\in N_{i}(u)}\left|w_{uv}\right|}\right)\tag{13}$$

Once a rating-prediction model has been developed, it can predict all the unrated entries in the user/item rating matrix. A new prediction model can then be designed on top of it to minimize the error (loss function):

$$err(P)=\sum_{u,i}\left(r_{ui}-\widehat{r}_{ui}^{(k)}-\widehat{r}_{ui}^{(k+1)}\right)^{2}\tag{14}$$

If enough rating-prediction models are available, they can be combined by linear fitting. The linear combination of K prediction models is:

$$\widehat{r}=\alpha_{0}+\sum_{k=1}^{K}\alpha_{k}\widehat{r}^{(k)}\tag{15}$$

If the constant term is not considered:

$$\widehat{r}=\sum_{k=1}^{K}\alpha_{k}\widehat{r}^{(k)}\tag{16}$$
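The blend coefficients \(\alpha_k\) of Eqs. (15)–(16) can be estimated by ordinary least squares on held-out ratings; a minimal NumPy sketch (function name and data layout are assumptions):

```python
import numpy as np

def fit_blend(preds, targets, intercept=True):
    """Fit alpha for r_hat = alpha_0 + sum_k alpha_k * r_hat^(k) (Eq. 15).

    preds:   (n_samples, K) predictions of the K base models
    targets: (n_samples,) observed ratings
    Set intercept=False to drop the constant term, as in Eq. (16).
    """
    preds = np.asarray(preds, dtype=float)
    X = np.column_stack([np.ones(len(preds)), preds]) if intercept else preds
    alpha, *_ = np.linalg.lstsq(X, np.asarray(targets, dtype=float), rcond=None)
    return alpha
```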

SVD can be used to reduce the dimensionality of the data, extracting the most prominent information in the original data set while discarding noise and redundant information. Suppose there is a rating matrix of m users on n items. First, the missing entries are filled with the user/item averages to obtain the completed matrix R′. Then R′ can be decomposed by SVD as:

$$R^{\prime}=U^{T}SV\tag{17}$$

The rating matrix after dimensionality reduction, keeping only the top f singular values, is:

$$R_{f}^{\prime}=U_{f}^{T}S_{f}V_{f}\tag{18}$$
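A minimal NumPy sketch of Eqs. (17)–(18) (note that NumPy's `svd` returns the factorization as `U @ diag(s) @ Vt`, a transposed convention relative to the text):

```python
import numpy as np

def truncated_svd(R, f):
    """Keep the top-f singular values of the completed rating matrix R'.

    Truncation discards the smallest singular values, which carry the least
    information (and the most noise), yielding the reduced matrix R'_f.
    """
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, :f] @ np.diag(s[:f]) @ Vt[:f, :]
```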

In this paper, the selection proceeds as follows: first compute the squared Frobenius norm of the matrix, denoted f, and then compute the probability of each row. The probability of selecting a row, shown below, is the ratio of the squared norm of that row vector to f. By comparing these ratios, the r vectors with the highest probabilities are selected.

$$p_{i}=\frac{\sum_{j}m_{ij}^{2}}{f}\tag{19}$$

After generating matrices C and R, the intersection matrix W of C and R is obtained; computing the SVD of W, \(W=X\Sigma Y^{T}\), and the Moore–Penrose generalized inverse \(\Sigma^{+}\) of the diagonal matrix \(\Sigma\) then yields the U matrix of the CUR decomposition. The U matrix is calculated as follows:

$$U=Y\left(\Sigma^{+}\right)^{2}X^{T}\tag{20}$$
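Putting Eqs. (19)–(20) together, a hedged sketch of the CUR construction described here (deterministic top-r selection as in the text, with no rescaling of the selected vectors, so this is an approximation scheme rather than an exact factorization):

```python
import numpy as np

def cur_decompose(M, r):
    """CUR sketch: select the r rows/columns with the largest share
    p_i = sum_j m_ij^2 / f of the squared Frobenius norm (Eq. 19), then
    build U = Y (Sigma^+)^2 X^T from the SVD of the intersection W (Eq. 20)."""
    f = np.sum(M ** 2)                            # squared Frobenius norm
    rows = np.argsort(np.sum(M ** 2, axis=1) / f)[-r:]
    cols = np.argsort(np.sum(M ** 2, axis=0) / f)[-r:]
    C, R = M[:, cols], M[rows, :]
    W = M[np.ix_(rows, cols)]                     # intersection of C and R
    X, sig, Yt = np.linalg.svd(W)                 # W = X Sigma Y^T
    sig_pinv = np.array([1.0 / s if s > 1e-12 else 0.0 for s in sig])
    U = Yt.T @ np.diag(sig_pinv ** 2) @ X.T       # Eq. (20)
    return C, U, R
```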

The CUR decomposition technique is used to transform the recommendation algorithm. Because the C and R matrices are chosen according to the selection probabilities of the vectors, this step can be viewed as retaining the most important vectors. After the dimensionality-reduction step, the matrix R is obtained, and the recommendation coefficient vector for a user u is then estimated by:

$$R_{m}=\left(u\times R^{T}\right)\times R\tag{21}$$

In a local environment, this paper compares the results of the two recommendation methods through programming experiments to verify the effectiveness of the improvement. The test data set used in this experiment is the sample in Table 1.

Table 1

| Method | Resource 1 | Resource 2 | Resource 3 | Resource 4 | Resource 5 | Resource 6 | Resource 7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Basic method | 39.21 | 18.13 | 24.01 | 37.24 | 25.48 | 16.17 | 15.19 |
| CUR method | 509.016 | 223.260 | 273.273 | 373.279 | 178.842 | 354.514 | 0.000 |