Although these strategies improve the performance of KG-based recommendation models, they do not make full use of the intrinsic structural information contained in KGs, and none of them successfully connects the knowledge graph with user or item characteristics. This paper therefore proposes a knowledge-graph-based technique for selecting educational resources for graduate-level music education programs. Drawing on the rich facts available in knowledge graphs, we construct a two-way attention network that refines the neighborhoods on both the user side and the item side and incorporates user preferences, making the resulting predictions more interpretable.

First, we introduce the relevant definitions and concepts used in this paper. The sets of users and items are represented as

$$\begin{gathered} U=\left\{ {{u_1},{u_2}, \cdots {u_M}} \right\} \hfill \\ I=\left\{ {{i_1},{i_2}, \cdots {i_N}} \right\} \hfill \\ \end{gathered}$$

1

The user-item interaction (feedback) matrix is

$$X=\left\{ {{x_{ui}}\left| {u \in U,i \in I} \right.} \right\}$$

2
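The construction of the feedback matrix can be sketched in Python; the function name and the list-of-pairs input format are illustrative, not from the paper.

```python
import numpy as np

def build_interaction_matrix(interactions, num_users, num_items):
    """Build the binary feedback matrix X of Eq. 2, where x_ui = 1
    iff user u has interacted with item i, and 0 otherwise."""
    X = np.zeros((num_users, num_items), dtype=np.int8)
    for u, i in interactions:
        X[u, i] = 1
    return X

# Example: 3 users, 4 items, three observed (user, item) pairs
X = build_interaction_matrix([(0, 1), (0, 3), (2, 0)], 3, 4)
```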

That is, when there is an interaction between user *u* and item *i*, we have

$${x_{ui}}=1$$

3

The entity set of user *u* in the KG is

$$E\left( u \right)=\left\{ {e\left| {e \in N\left( i \right),i \in \left\{ {i\left| {{x_{ui}}=1} \right.} \right\}} \right.} \right\}$$

4
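Eq. 4 collects the KG neighbors of every item the user has interacted with. A minimal sketch, assuming `neighbors[i]` holds the KG neighbor set \(N(i)\) of item *i* (the function and variable names are ours):

```python
def user_entity_set(X, neighbors):
    """E(u): KG entities adjacent to the items that user u interacted
    with, i.e. Eq. 4. X is the binary feedback matrix of Eq. 2."""
    E = {}
    num_users, num_items = len(X), len(X[0])
    for u in range(num_users):
        E[u] = set()
        for i in range(num_items):
            if X[u][i] == 1:             # item i is in u's history
                E[u] |= set(neighbors.get(i, []))
    return E

X = [[1, 0, 1],
     [0, 1, 0]]
neighbors = {0: ['e1', 'e2'], 1: ['e2'], 2: ['e3']}
E = user_entity_set(X, neighbors)
```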

A KG is composed of triples stored in the form \(\left( {h,r,t} \right)\), where *h* is the head entity, *r* the relation, and *t* the tail entity. A dropout layer is added to the network to prevent overfitting: first, a random deactivation vector \({r_u}\) is sampled from a Bernoulli distribution.

$${r_u}\sim B\left( p \right)$$

5
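Sampling the deactivation vector of Eq. 5 is a one-liner; here *p* is read as the keep probability (one common convention — the paper does not say which convention it uses):

```python
import numpy as np

def dropout_mask(dim, p, rng):
    """Sample r_u ~ B(p) of Eq. 5: each entry is kept (1.0) with
    probability p and zeroed with probability 1 - p."""
    return (rng.random(dim) < p).astype(np.float32)

rng = np.random.default_rng(7)
r_u = dropout_mask(16, p=0.8, rng=rng)   # entries are 0.0 or 1.0
```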

The attention network is expressed as follows:

$${H_0}={\text{tanh}}\left( {{W_0}\left( {C\left[ {{h_i},{r_i},{t_i}} \right] \odot {r_u}} \right)+{b_0}} \right)$$

6

$${H_1}={\text{sigmoid}}\left( {{W_1}\tanh \left( {{W_0}{H_0}+{b_0}} \right)+{b_1}} \right)$$

7

Further, we normalize the attention weights

$${H^{\prime}_1}=\frac{{\exp \left( {{H_1}} \right)}}{{\sum {\exp \left( {{H_i}} \right)} }}$$

8
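Eqs. 6-8 can be sketched as follows. We read \(C\left[ {h,r,t} \right]\) as the concatenation of the three embeddings, and we assume \(W_0\) is square so that its reuse in Eq. 7 is dimensionally consistent; both are our readings of the notation, and the helper names are ours.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))   # shift for numerical stability
    return e / e.sum()

def triple_attention(triples, W0, b0, W1, b1, r_u):
    """Attention weights over a user's KG triples (Eqs. 6-8).
    triples: list of (h, r, t) embedding vectors."""
    scores = []
    for h, r, t in triples:
        c = np.concatenate([h, r, t]) * r_u                 # C[h,r,t] ⊙ r_u
        H0 = np.tanh(W0 @ c + b0)                           # Eq. 6
        z = W1 @ np.tanh(W0 @ H0 + b0) + b1                 # Eq. 7 (pre-sigmoid)
        scores.append(float(1.0 / (1.0 + np.exp(-z))))
    return softmax(np.array(scores))                        # Eq. 8

rng = np.random.default_rng(1)
d = 4                                                       # illustrative dim
W0 = rng.standard_normal((3 * d, 3 * d)) * 0.1
b0 = np.zeros(3 * d)
W1 = rng.standard_normal((1, 3 * d)) * 0.1
b1 = np.zeros(1)
r_u = (rng.random(3 * d) < 0.8).astype(float)               # Eq. 5 mask
triples = [tuple(rng.standard_normal(d) for _ in range(3)) for _ in range(3)]
w = triple_attention(triples, W0, b0, W1, b1, r_u)
```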

Similarly, the scores between users and relations also need to be normalized:

$${d^{\prime}_{ur}}=\frac{{\exp \left( {{d_{ur}}} \right)}}{{\sum\nolimits_{r^{\prime}} {\exp \left( {{d_{ur^{\prime}}}} \right)} }}$$

9

$${d_{ur}}=d\left( {u,r} \right)$$

10

KG nodes may have anywhere from zero to an arbitrary number of neighbors. We therefore adopt a fixed-size sampling scheme: if an entity has more than K neighbors, K of them are chosen at random; otherwise, a set of K neighbors is obtained by duplicating the entity's existing neighbors. When aggregating the neighborhood of a target entity, multi-hop operations capture higher-order entity dependence information. The aggregation is computed as

$$A=f\left( {W \cdot \left( {I+\sum {{{d^{\prime}}_{ur}}} } \right)+b} \right)$$

11
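The fixed-size neighborhood sampling described above can be sketched as follows; duplication for under-sized neighborhoods is implemented as sampling with replacement, which is one reasonable reading of "duplicating the entity's neighbors".

```python
import random

def sample_fixed_neighbors(neighbors, K, rng):
    """Return exactly K neighbors: sample K without replacement when
    there are enough, otherwise pad by duplicating existing neighbors.
    (Entities with zero neighbors would need a placeholder; not shown.)"""
    if len(neighbors) >= K:
        return rng.sample(neighbors, K)
    return [rng.choice(neighbors) for _ in range(K)]

rng = random.Random(42)
a = sample_fixed_neighbors(['a', 'b', 'c', 'd', 'e'], 3, rng)  # subsample
b = sample_fixed_neighbors(['a', 'b'], 3, rng)                 # pad to 3
```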

Structured and unstructured course features call for different feature-extraction approaches. The feature matrix V is obtained by directly embedding and concatenating structured attributes such as course category and course duration. Text-CNN is applied to the unstructured text data to perform in-depth content analysis and produce a more nuanced abstract feature representation. The procedure is as follows:

First, the description text is segmented, and Word2vec is used to embed the words, converting them into the word-vector matrix *Z* in the latent semantic space. Let \({z_i}\) be the word vector of the i-th word in *Z*; then we have:

$$Z=\left[ {{z_1},{z_2}, \cdots {z_n}} \right]$$

12

Next, the word-embedding matrix is fed into the convolutional layer to obtain the feature map *y*:

$${y_i}=g\left( {W \cdot {Z_{i:i+k - 1}}+b} \right)$$

13

$$y=\left[ {{y_1},{y_2}, \cdots {y_{n - k+1}}} \right]$$

14

where *k* is the convolution kernel window size.
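The convolution of Eqs. 13-14 slides a window of *k* word vectors over *Z*; a single-filter sketch, taking *g* to be ReLU (a common choice — the paper does not name the activation):

```python
import numpy as np

def conv_feature_map(Z, W, b, k):
    """Compute y_i = g(W · Z[i:i+k-1] + b) for every valid window
    position (Eq. 13), giving the feature map y of Eq. 14."""
    n = Z.shape[0]
    y = []
    for i in range(n - k + 1):              # valid size-k window positions
        window = Z[i:i + k].ravel()         # flatten the k word vectors
        y.append(max(0.0, float(W @ window + b)))   # g = ReLU (assumed)
    return np.array(y)

rng = np.random.default_rng(0)
n, d, k = 6, 5, 3                           # words, embedding dim, kernel size
Z = rng.standard_normal((n, d))             # word2vec matrix Z (Eq. 12)
W = rng.standard_normal(k * d)              # one convolution filter
y = conv_feature_map(Z, W, b=0.0, k=k)      # n - k + 1 = 4 features
```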

The loss function is

$$L=\sum {H\left( {\hat {y},y} \right)} - \alpha \sum {d\left( {\hat {y},y} \right)} +\beta \left\| W \right\|_{2}^{2}$$

15

where \(\left\| W \right\|_{2}^{2}\) is a regularization term.
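Eq. 15 can be sketched as follows. We take *H* to be binary cross-entropy; the paper leaves *d* abstract, so it is passed in as a function, and all names here are illustrative.

```python
import numpy as np

def total_loss(y_hat, y, weights, d, alpha, beta):
    """Objective of Eq. 15: cross-entropy H(y_hat, y), minus an
    alpha-weighted term d(y_hat, y) (unspecified in the paper), plus
    beta-weighted L2 regularization over the weight matrices."""
    eps = 1e-12                              # avoid log(0)
    H = -np.sum(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
    reg = sum(np.sum(W ** 2) for W in weights)   # ||W||_2^2
    return float(H - alpha * np.sum(d(y_hat, y)) + beta * reg)

y = np.array([1.0, 0.0, 1.0])
y_hat = np.array([0.9, 0.1, 0.8])
loss = total_loss(y_hat, y, weights=[np.ones((2, 2))],
                  d=lambda a, b: np.zeros_like(a),   # placeholder d
                  alpha=0.1, beta=0.01)
```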