A Hybrid Semantic Recommender System Based on an Improved Clustering

Abstract. A recommender system is a model that automatically recommends meaningful cases (such as clips, films, goods, or items) to clients/users according to their previous interests. There are two traditional general recommender system models, i.e., the Collaborative Filtering Recommender System (ColFRS) and the Content-based Filtering Recommender System (ConFRS). There is also a model that hybridizes these two traditional recommender systems; it is called a Hybrid Recommender System (HRS). An HRS usually outperforms simple traditional recommender systems. Problems such as scalability, cold start, and sparsity are among the main problems that any recommender system must deal with. Memory-based (modeless) recommender systems benefit from good accuracy but suffer from a lack of admissible scalability. Model-based recommender systems suffer from a lack of admissible accuracy but benefit from good scalability. In this paper, we propose a hybrid model based on an automatically improved ontology to deal with the scalability, cold start, and sparsity problems. Our proposed HRS also uses an innovative approach to clustering as an augmented section. When there are enough ratings, it uses a collaborative filtering approach to predict the missing ratings; when there are not enough ratings, it uses a content-based filtering approach instead. In the content-based filtering section of our proposed HRS, ontology concepts are used to improve the accuracy of rating prediction. If the target client is severely sparse, we cannot trust even the ratings predicted by the content-based filtering section; therefore, our proposed HRS uses additive clustering to improve the prediction of the missing ratings in that case. It is experimentally shown that our model outperforms many of the newly developed recommender systems.


Introduction
The amount of data on web servers' databases is vast. These data can be used to build learning models that are capable of predicting users' behaviors. A learning model can be materialized by a simple statistical model or a sophisticated machine learning model for a given application. As the data is huge in these applications, learning models that need the original data during prediction are not considered scalable. Therefore, data mining and machine learning models are considered better options for these applications. One of the applications of these models is data retrieval/filtering systems. A data retrieval/filtering system is a model that extracts meaningful data from a database according to the patterns it previously learned from the data. One subfield of data retrieval/filtering systems is the topic of recommender systems (Wang et al., 2022b; Kirubanantham et al., 2022). These systems are defined for applications where users buy/choose/rate lists of items (for example, a movie website where different users can rate different films) (Manimurugan et al., 2022; Rostami et al., 2022). The aim of these systems is to predict the potential favorite list of items for any given (unknown) user. It is worth mentioning that these systems need not interactively communicate with users. These systems are sometimes discussed under the names of approximation concepts, data-retrieving systems, forecasting theories, management science, and customer choice modeling in marketing and business (Jain et al., 2022).
Recommender systems help users find their customized items. These systems have been widely used in industry; for example, the "Amazon" organization benefits from a very useful recommender system. These systems are really useful to users in their shopping and item searching.
One of the main challenges for recommender systems is their inability to deal with an (unseen) recently added item. This challenge is known as the item cold start problem. We present a hybrid recommender system to overcome the item cold start problem. Another aim of our work is to present a scalable recommender system. We also want to improve on traditional recommender systems in terms of the precision of recommendations. To achieve these goals, we introduce a clustering-based recommender system model that uses a WordNet/ontology. The ConFRS section of our hybrid recommender system model uses WordNet/ontology and semantic similarity; the ColFRS section of our hybrid recommender system model uses clustering and ontological semantic similarity.
Being a pioneer in consolidating ontology/WordNet and hybrid recommender systems, we present an ontological hybrid scalable recommender system. To achieve precise results, our model also uses a memory-based approach with some modifications. We use clustering, user profiling, and ontology/WordNet in its ColFRS section to enhance recommendation precision, scalability, cold start handling, and sparsity handling. We also use item ontology, ontological semantic similarity, and K-nearest neighbors in the ConFRS section to enhance recommendation precision in item cold start situations. An "IsA" Degree (IaD) is also presented to estimate the semantic similarity between two ontologies. To summarize our innovations, we can present the following:
• We propose a new semantic similarity metric while eliminating the uniform-edge taxonomy assumption;
• We propose a procedure to create an ontology-based similarity value;
• We propose a procedure that retains the benefits of K-nearest neighbors while lacking its non-scalability;
• We improve the precision of the recommendation list.
Indeed, movie recommender systems have been considered a hot topic in the recommender system field (Jain et al., 2018; Dwicahya et al., 2019; Patra and Ganguly, 2019; Hu et al., 2020; Lin and Chi, 2020; Nguyen et al., 2020; Shristi and Mohanty, 2018; Forouzandeh et al., 2021; Thakker et al., 2021). Therefore, we have presented a movie recommender system method that is faster than KNN models at the cost of a slight performance decline. Also, our method sacrifices a little execution time to enhance performance meaningfully compared to model-based recommender systems.
The related works are reviewed in the next section. The problem definition and our approach to its solution are then presented, followed by the details of our model. The empirical study comes next. The final section is dedicated to the paper's conclusions.

Literature Review and Related Works
A brief introduction to clustering, WordNet/ontology, ColFRS, and ConFRS is presented in this section, as these are vital elements of our method.

History of Recommender Systems
Large companies use a wide range of recommender systems. Companies such as Google, Netflix, and Amazon and websites like the Internet Movie Database (IMDB) are good examples. Since the mid-1990s, a wide range of research has been carried out in this field; at that time, the phrase "Recommender System" emerged. These systems can be considered an efficient mechanism for refining data and filtering information (Adomavicius et al., 2020).
Recommender systems are categorized into different types according to the features they use. For example, the usage of environmental conditions and location leads to the emergence of a group of recommender systems called domain-based systems, and system knowledge of the user provides the background for the emergence of knowledge-based systems. In (Jannach et al., 2010), these systems are classified as follows:
• Collaborative-based model: suggestions that have been of interest to other users with similar conditions (i.e., similar desires or preferences) are provided to the user;
• Knowledge-based model: this model presents its suggestions along with an argument for which items address the user's needs;
• Combined model: this model applies a combination of the mentioned models.
Also, Fig. 1 shows some types of recommender systems, the algorithms they use, and the relationships between them.
Collaborative filtering is the most convenient and practical filtering algorithm (Su and Khoshgoftaar, 2009). Methods based on this approach compute the utility of items for a user according to the ratings of similar users (Ekstrand et al., 2014; Zhang et al., 2019).

Fig. 2. Different collaborative filtering techniques
These techniques apply the users' obtained feedback about items. The feedback of all users is stated in a two-dimensional matrix (number of users × number of items). Element $r_{u,i}$ denotes the value of item $i$ in the opinion of user $u$. According to the circumstances of the issue, any element of this matrix can be a Boolean, Integer, or Real number. Also, this number can be negative when the user does not like the corresponding item. On each item, the comments of users can be expressed explicitly (Guo et al., 2014) or implicitly (Jin and Chen, 2012).
According to Fig. 2, collaborative filtering-based methods can be categorized into two classes: model-based and memory-based. In the first class, the system creates a learning model to recommend items to the user. In the second class, given the user interest matrix, the algorithm is performed in three steps: 1) the similarity between users is calculated, 2) the users with the most similarity are identified as the neighborhood, and 3) by summarizing the neighbors' suggestions, a new proposal is selected for the user.
These approaches typically use the KNN algorithm, and they often yield acceptable results. Using KNN, the similarities between users are calculated. The steps of implementing the algorithm are as follows:
1. In the first step, based on a similarity criterion (cosine, Pearson correlation, mean square difference), k neighbors are selected for user $u$. These neighbors are the users who are the most similar to user $u$.
2. In the second step, for any item $i$ in the system, a quantitative predicted value indicating whether item $i$ will be in user $u$'s favorite list is calculated. This quantitative value is computed by applying various methods (average scores, weighted sum, etc.) to the rates that were given to item $i$ by user $u$'s neighbors.
3. Based on the second step, among all items, the item that has the highest prediction value is suggested to user $u$.
An example of the KNN algorithm with the user-by-user similarity approach is shown in Fig. 3. In this example, each rating belongs to the interval [0,5].
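As an illustration of these three steps, the following minimal sketch (our illustration, not the paper's exact method; the function and variable names are hypothetical) predicts a missing rating with user-based KNN, cosine similarity, and a mean-centered weighted aggregation:

```python
import numpy as np

def predict_rating(R, u, i, k=3):
    """User-based KNN sketch: R is a (users x items) matrix, 0 = missing."""
    mask = R[:, i] > 0                       # users who rated item i
    mask[u] = False
    candidates = np.where(mask)[0]
    if candidates.size == 0:
        return np.nan
    # Step 1: cosine similarity between user u and each candidate neighbor
    def cos(a, b):
        common = (a > 0) & (b > 0)
        if not common.any():
            return 0.0
        a, b = a[common], b[common]
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    sims = np.array([cos(R[u], R[v]) for v in candidates])
    order = np.argsort(sims)[::-1][:k]       # the k most similar users
    top, top_sims = candidates[order], sims[order]
    # Step 2: mean-centered weighted aggregation of the neighbors' ratings
    mean_u = R[u][R[u] > 0].mean()
    means = np.array([R[v][R[v] > 0].mean() for v in top])
    num = (top_sims * (R[top, i] - means)).sum()
    return mean_u + num / (np.abs(top_sims).sum() + 1e-12)

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 4, 4]], dtype=float)
# Step 3 would rank items by this predicted value for the target user
print(predict_rating(R, u=0, i=2))
```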

Fig. 3. An exemplary user-by-user similarity approach.

This algorithm, despite its simplicity and proper results, suffers from two essential problems: (1) it is weak in dealing with sparsity in the data space (Bobadilla and Serradilla, 2009), and (2) it has low scalability (Luo et al., 2012).
In the item-to-item version, the main steps are as follows (Bobadilla et al., 2013): 1. First, we determine $k$ neighbors based on similarity criteria for each item $i$; 2. If user $u$ has not rated item $i$ so far, we calculate its approximation based on the rates that this user gave to the neighbors of item $i$; 3. We suggest the items with the greatest predicted values to user $u$.
Collaborative filtering-based recommender systems search for users who have interests similar to those of the new user. The process of this algorithm is shown in Fig. 4. The value of an unknown rate $r_{u,i}$ for user $u$ and item $i$ is typically determined by aggregating the rates of the other users similar to user $u$, as in Eq. (1).
$$ r_{u,i} = \operatorname*{aggr}_{u' \in \hat{U}} r_{u',i} \qquad (1) $$

such that $\hat{U}$ specifies a group of users who are the most similar to user $u$ and have also rated item $i$ ($|\hat{U}|$ can range from 1 to the total quantity of users). In various studies, the aggregation is expressed according to Eq. (2), Eq. (3), or Eq. (4): a simple average of the neighbors' rates, a similarity-weighted sum, or a similarity-weighted sum of mean-centered rates, respectively:

$$ r_{u,i} = \frac{1}{|\hat{U}|} \sum_{u' \in \hat{U}} r_{u',i} \qquad (2) $$

$$ r_{u,i} = \kappa \sum_{u' \in \hat{U}} sim(u, u')\, r_{u',i} \qquad (3) $$

$$ r_{u,i} = \bar{r}_u + \kappa \sum_{u' \in \hat{U}} sim(u, u')\,(r_{u',i} - \bar{r}_{u'}) \qquad (4) $$

where $\kappa$ is a normalizing factor and $\bar{r}_u$ is the average rate of user $u$. User feedback is the main foundation of ColF-based systems; consequently, these systems may face the following challenges:
1. First Item and First User (cold start): As we said, in ColF methods, the recommender system ranks items based on their rates and recommends the highest-rated items to the user. For that reason, if the system has just started or a new item has been added to the system, enough information about the items (or that item) will not be available, and therefore it cannot be properly scored and ranked. This is a significant drawback of such systems, known as the "cold start" problem (Bahrani et al., 2020).

2. Data sparsity:
These systems also suffer from another problem: data sparsity. This means that information exists in the system, but it is scattered, and it cannot be reliably determined which item is more relevant (Moreno et al., 2013).
The rate matrix is a matrix with dimensions $n \times m$, in which $n$ is the number of users and $m$ is the number of items. The $(u,i)$th component of this matrix represents the rate that user $u$ gives to item $i$. Normally, users deal with less than 1% of the items on a website and only rate those items, resulting in a large matrix with most of its elements missing. This makes searching this matrix difficult. As a result, the accuracy and integrity of these systems are reduced. This problem is somewhat overcome in an HRS approach.

3. Scalability:
User-based ColF algorithms respond well when hundreds or thousands of users exist. But the scope of current E-Commerce is growing rapidly, and these applications have more than a million users nowadays. Therefore, user-based ColF systems are no longer responsive: the calculations are computed during the test phase, their amount of data is high, and so their response time becomes very long, which is no longer acceptable. Item-based ColF algorithms can be used to solve this problem.
Because of the cold start and data sparsity problems, ColF-based systems are often used in conjunction with other solutions to reduce their disadvantages and increase their benefits.The collaborative tags can be used to identify users' interests and categorize items according to users' demands.
Another method used to deal with cold start is clustering techniques. Users, items, or both (bi-clustering) can be clustered. These techniques can also improve recommender systems' performance. Dimension reduction techniques, such as Latent Semantic Indexing and Singular Value Decomposition (SVD), can also be used to solve the data sparsity problem. Of course, the SVD method, despite its very good results, requires a lot of processing and is usually used in offline applications.
Recommender system history shows that the ConF approach was the first strategy used in these systems. These systems make and keep a profile containing past information about each user's desires in order to offer, in the future, the items that are most closely related to the user's favorites.
The ConF methods originate from information retrieval and information filtering. Significant advances have been made in the field of information retrieval and document filtering in text-based systems; hence, many recommender systems have also focused on items with textual information (like documents, addresses of websites, text news, etc.). A good source for traditional information retrieval methods is the user profile (UP), which identifies the customer's preferences and needs. Profile information is either entered through a questionnaire or implicitly deduced by examining user behavior in transactions. Items also have profiles, which we indicate with the item profile (IP): a vector of the attributes of an item, whose data is filled by measuring the characteristics of that particular item. The process of a ConF recommender system is shown in Fig. 5.

Fig. 5. The process of a ConFRS
ConFRSs are often used in text-based domains. In these systems, the content keywords are specified in such a way that the importance and meaning of keyword $k_i$ in document $d_j$ are defined by a weight $w_{ij}$. To measure these weights, the TF-IDF measurement is used, which is described below. Suppose that $N$ is the total number of potential texts which may be offered to the user. Also, the keyword $k_i$ is available in $n_i$ texts. $f_{ij}$ specifies the number of repetitions of keyword $k_i$ in text $d_j$, and the term frequency $TF_{ij}$ is computed with Eq. (5):

$$ TF_{ij} = \frac{f_{ij}}{\max_z f_{zj}} \qquad (5) $$

where the maximum in the denominator is taken over all of the keywords $k_z$ existing in the text $d_j$. When a keyword is repeated in many texts, the keyword cannot be very helpful in representing the difference between the texts, and with the help of that keyword, it cannot be determined whether a text is appropriate or not. For this reason, the Term Frequency (TF) of simple words is often multiplied by the Inverse Document Frequency (IDF). The IDF for the keyword $k_i$ is typically determined based on Eq. (6):

$$ IDF_i = \log \frac{N}{n_i} \qquad (6) $$

Hence, the TF-IDF weight for the keyword $k_i$ in the text $d_j$ is defined according to Eq. (7):

$$ w_{ij} = TF_{ij} \times IDF_i \qquad (7) $$
Eq. (8) is used to describe the content of the text $d_j$:

$$ Content(d_j) = (w_{1j}, w_{2j}, \ldots, w_{kj}) \qquad (8) $$

ConFRSs offer items that are similar to those previously selected by the user. Therefore, the candidates are compared with the previous choices, and the best and most similar ones are presented. To formalize this, for each user we consider a user profile called the User Content-Based Profile (UCBP). The UCBP specifies the interests and preferences of user $c$. This data comes from the analysis of content related to the user's past activities, resulting from ranking or from the analysis of keywords obtained through information retrieval. $UCBP_c$ denotes the UCBP of user $c$ and includes a vector of weights $(w_{1c}, w_{2c}, \ldots, w_{kc})$, where each $w_{ic}$ is the weight of keyword $k_i$ for user $c$. In addition, for each item, an item profile is considered as its Item Content-Based Profile (ICBP). The data of the ICBP is obtained by analyzing the contents of the corresponding item; $ICBP_s$ denotes $Content(\dot{d}_s)$, where $\dot{d}_s$ is the text related to item $s$. Usually, the utility function is computed according to Eq. (9):

$$ u(c, s) = score(UCBP_c, ICBP_s) \qquad (9) $$

The proposed system uses the ICBPs of different items and the UCBP of the user $c$ to improve the accuracy. To determine the desirability of item $s$ for user $c$, we use Eq. (10), which calculates the similarity between the item profile $\vec{w}_s$ and the user profile $\vec{w}_c$ with the cosine similarity measurement:

$$ u(c, s) = \cos(\vec{w}_c, \vec{w}_s) = \frac{\sum_{i=1}^{K} w_{ic}\, w_{is}}{\sqrt{\sum_{i=1}^{K} w_{ic}^2}\, \sqrt{\sum_{i=1}^{K} w_{is}^2}} \qquad (10) $$

where $K$ is the number of keywords in the entire system. As an example, if user $c$ views many web-based articles about biology, text-based techniques can suggest other articles on biology and relevant articles in fields such as genetics. The text-based profile of $c$, defined by the vector $\vec{w}_c$, contains words like 'biology' with their weights $w_{ic}$. Consequently, in recommender systems that use cosine or similar similarities, the utility function $u(c, s)$ will give higher utility values to the articles $s$ where the term 'biology' has a higher weight ($w_{is}$) and lower utility values to the articles where the term 'biology' has a lower weight.
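The following minimal sketch (ours; the toy documents and all names are hypothetical) computes the TF-IDF weights of Eqs. (5)-(7) and the cosine utility of Eq. (10):

```python
import math
from collections import Counter

docs = {"d1": "biology cell genetics", "d2": "biology plant", "d3": "movie actor"}
vocab = sorted({w for text in docs.values() for w in text.split()})

def tfidf_vector(text, docs, vocab):
    """Eqs. (5)-(7): TF normalized by the most frequent term, times IDF."""
    counts = Counter(text.split())
    max_f = max(counts.values())
    N = len(docs)
    vec = []
    for w in vocab:
        tf = counts[w] / max_f                                # Eq. (5)
        n_w = sum(w in d.split() for d in docs.values())
        idf = math.log(N / n_w) if n_w else 0.0               # Eq. (6)
        vec.append(tf * idf)                                  # Eq. (7)
    return vec

def cosine(a, b):
    """Eq. (10): cosine similarity between profile vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

user_profile = tfidf_vector("biology genetics", docs, vocab)  # UCBP from past reads
for name, text in docs.items():
    print(name, round(cosine(user_profile, tfidf_vector(text, docs, vocab)), 3))
```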
The ConF systems may face the following challenges: 1. Content analysis limitation: ConF algorithms are restricted by the features of objects, and therefore, they need an adequate and proportional set of features. Note that these recommender systems need such features for analysis in order to suggest items automatically. Information retrieval techniques can fine-tune features as long as the items are text-based, but other types of items have issues with automatic extraction. For example, automatic extraction techniques for multimedia data such as graphic images, audio data, and video data are confronted with many problems, and in addition, it is often not possible to enter the properties manually.

2. The challenge of the first user:
A ConF system can only give a reliable suggestion to a user who has previously rated enough items. Therefore, in the case of a new user, who has rated few items, the recommender system is not able to provide correct suggestions.

3. Overspecialization:
During the life of the system, it attempts to suggest to the user items that are identical to the items he/she selected in the past. This causes items that may be user-friendly, but not similar to previously selected items, to never be offered to any user; therefore, they remain hidden forever.

4. Unable to get users' feedback:
Usually, in systems that use this solution, feedback cannot be obtained from users. For example, in such systems, users usually do not rate items explicitly (as opposed to what we had in ColF systems). This makes it impossible to determine whether the offer to the user was correct or not.
Because of the above limitations, ConF techniques are typically used in combination with other strategies. It should be noted that ConF systems are less efficient than other systems due to their need for heavy processing and analysis. One effective way to improve their performance is to group users and submit the recommendations to the entire group (instead of submitting them to only one user). Although this method does not improve the accuracy and quality of the results, it has a strong effect on reducing the system overhead. Various studies have been conducted to address these challenges, like the model-based clustering method (Gong, 2010; He et al., 2011; Sadaei et al., 2016; Shinde and Kulkarni, 2012; Wang, 2012). The clustering technique groups specific items or users into clusters to determine the neighborhood of users; this leads to the preparation of a list for the current user.

Of course, clustering also has some drawbacks, such as reduced performance in scalability and precision, as well as overgeneralization and overlapping. Because it is necessary to perform user-to-user comparisons for each cluster in order to determine the current user's neighbors, a performance reduction in scalability occurs. Also, because the recommendations are generated by the cluster representative (which can be a virtual item or user), there is a reduction in precision. In addition, items and users can be placed in multiple clusters, which leads to overlapping clusters. Likewise, this approach may result in overgeneralization and produce less-personalized suggestions. As a result, the KNN algorithm may be superior to this approach in terms of precision.

In the following, clustering approaches at both the user level and the item level are considered. Fig. 6 shows an example of these processes. In both tables, $n$ and $m$ indicate respectively the number of users and the number of items, and $r_{ui}$ presents the rating provided by user $u$ for item $i$. Part (a) shows the arrangement of user clustering, and part (b) shows item clustering in the ColF recommender system. In Fig. 6b, each row is considered a record, and each column is considered an attribute by the clustering algorithm. Indeed, we want to partition users into a number of clusters in Fig. 6a, and we want to partition items into a number of clusters in Fig. 6b. As mentioned earlier, there are several ways to overcome some KNN-based ColF shortcomings. One is to use model-based algorithms that are scalable and just as accurate as KNN. For this purpose, clustering methods have been used directly or as a preprocessing step in recommender systems (Nilashi et al., 2014a; Gong, 2010; Nilashi et al., 2014b; Troung et al., 2007; Zhang et al., 2011). Clustering may also resolve overgeneralization in recommender systems (Kushwaha and Vyas, 2014).
According to the above-mentioned methods and their problems, recommender systems have moved toward using hybrid methods to solve those problems or reduce their effects. These recommender systems are considered Hybrid-Based Filtering (Burke, 2002). For example, a ColF method does not use the items' properties and only uses interactions between users. Since a new user has little interaction with the system, this method cannot be very effective in providing accurate recommendations to her/him. Consequently, with a combination of ConF and ColF, we can more accurately understand the user's needs and provide more effective advice. Fig. 7 shows the different approaches for combining ConF and ColF. Models based on hybrid-based filtering use different mechanisms to combine the approaches of basic recommender systems. Some of these mechanisms are as follows (a small sketch of the first two mechanisms follows this list):
• Weighted mechanism: Several recommender systems' results (rates or scores) are combined to produce a single recommendation.
• Switching mechanism: This method dynamically chooses one of the recommender systems according to the current situation of the system to make a recommendation.
• Mixed mechanism: The recommendation is made up of several different recommender systems that are displayed at the same time.
• Feature combination mechanism: Features come from different sources of data from different recommender systems integrated together in a simple algorithm.
• Cascading mechanism: The system refines other system recommendations.
• Feature augmentation mechanism: The output of a recommender system is utilized as an input feature for another recommender system.
• Meta-level mechanism: A learned model through a recommender system is utilized by other recommender systems.
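As an illustration of the weighted and switching mechanisms (our sketch; the blending weight and the rating threshold are hypothetical, not the paper's settings):

```python
def weighted_hybrid(pred_colf, pred_conf, alpha=0.7):
    """Weighted mechanism: linearly blend two recommenders' predicted ratings."""
    return alpha * pred_colf + (1 - alpha) * pred_conf

def switching_hybrid(pred_colf, pred_conf, n_user_ratings, min_ratings=5):
    """Switching mechanism: fall back to content-based output for sparse users."""
    return pred_colf if n_user_ratings >= min_ratings else pred_conf

print(weighted_hybrid(4.0, 3.0))                      # 3.7
print(switching_hybrid(4.0, 3.0, n_user_ratings=2))   # 3.0 (cold user -> ConF)
```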
The expansion of social networks and the increase in the information contained in them, such as likes, comments, friends, followers, followings, and tags, have created a rich source of information for researchers (Adomavicius et al., 2020).
Knowledge-Based Filtering is a newer approach to recommender systems. This approach, which is based on existing knowledge about users and items, provides recommendations based on its interpretation of the user's interests and desires. This approach is theoretically more accurate and of higher quality than other methods (Jorro-Aragoneses et al., 2019; Feely et al., 2020).
With the emergence of Web 3.0 and the Internet of Things, a new generation of recommender systems has been created. In this environment, various devices and sensors collect information about the user's context. Such information can be used in recommender systems to form a new generation of systems known as Context-Aware Filtering (Adomavicius et al., 2020).
An effective approach utilized and well studied in recent years is location-aware filtering recommender systems. These types of systems, which are commonly found in mobile applications, make specific recommendations to users based on their current position.
Demographic information is information such as age, gender, nationality, etc. Demographic information is used in Demographic-based Filtering systems.In these systems, it is assumed that users with similar demographic characteristics (for example, in the same age range) are likely to have similar interests and requirements.
By default, some items from the set of items are selected and recommended to the user in random forecast-based filtering recommender systems. The accuracy of these recommender systems depends entirely on luck, and the probability of error is high. Therefore, they are never seriously among the selection options for recommender systems. A similar algorithm to random forecast-based filtering is the rate-based algorithm, which selects the most popular items instead of randomly selected items.
If a customer repeatedly selects an item, we can use his/her repeated pattern to recommend the rest of the items to him/her. The only limitation of this algorithm is that the user must already have a purchase history. Such a system is named a frequent sequence-based filtering recommender system.
A new filtering paradigm based on deep learning has been developed recently, which is very successful when large data are available (Mu and Zeng, 2020; Pan et al., 2020a; Pan et al., 2020b). It is named a deep learning-based filtering recommender system.

Ontology in Recommender Systems
A basic definition of ontology in the field of computer science was presented by Gruber (1992). He defined it as the "explicit specification of a conceptualization." This definition was later supplemented by Staab and Studer (2009). It is also expressed by Taniar and Rahayu (2006) as follows: "a knowledge domain conceptualization into a computer process-able format which models entities, attributes, and axioms." An ontology can be automatically extracted from WordNet, and it usually consists of words and the relationships between concepts. An ontology may refer to a particular domain and may be a conceptual model of that domain. The term taxonomy (topic hierarchy) is used in cases where the ontology possesses only "IsA" links; moreover, the use of the word ontology should be limited to systems that contribute to the diversity of relations among the principles, including rational propositions that officially express the relations.
Ontology may be used in different scopes like machine learning, statistical correlations, domain-specific heuristics, and user profiling. Commercial recommender systems usually use a simple form of ontology. The recommendation accuracy is high if an item-to-item similarity search is used in the ConF-based recommender system, but this is applicable only if there are enough training data. Compared to the ConF-based recommender system, the ColF-based recommender system performs better in environments with many users.
Semantic data in the context of recommender systems means features, relations among items, and the relations between items and meta-data. Recently, ontology has been able to increase the success of recommender systems and reduce their problems (Martin-Vicent, 2014; Lopez-Nores et al., 2010; Moreno et al., 2013). Porcel et al. (2015) improved their recommender system by incorporating a fuzzy ontology. Some other researchers have implemented semantic recommendation by combining item-based semantic similarity and the item-based ColF method (Lu et al., 2010). Daramola et al. (2009) introduced an e-tourism recommender system that uses ontology for tourism purposes.
Recommendations based on semantic relations and knowledge can be created by available data considering application domains (namely in Web 2.0 applications) (Pham et al., 2016).
Ontologies can supply the domain concept in a machine-readable format. This format is usually used in frameworks that contain concept relations, axioms, and features (Giaretta, 1995). Determining the semantic similarity between concepts in an ontology is necessary for a comprehensive structure, e.g., Flickr (Buitelaar et al., 2005).
Improving ontology formation plays an important role in precision enhancement. For example, the assumption that all "IsA" relationships have the same meaning or worth reduces the precision of similarity determination among concepts.
Here, we introduce the different methods of ontology usage in recommender systems. Suppose that there is a matrix $R$ of various scores. It can even be a matrix obtained after reducing the dimensions of the original matrix using matrix factorization (Koren et al., 2009), the SVD method (Burke, 2007), or other related dimension-reduction methods (Pan et al., 2015). The most conventional recommender method in collaborative systems is to use the k-nearest items or neighboring users to predict missing values. For example, in the method of item-based k-nearest neighbors, the missing values are predicted using Eq. (11):

$$ \hat{r}^{I}_{u,i} = b_{u,i} + \frac{\sum_{j \in N^{k}_{I}(u,i)} \left(s^{I}_{ij}\right)^{w} (r_{u,j} - b_{u,j})}{\sum_{j \in N^{k}_{I}(u,i)} \left(s^{I}_{ij}\right)^{w}} \qquad (11) $$

where $R$ is the scoring matrix, $\hat{R}^{I}$ is the estimate of the scoring matrix using the item-based method, $r_{u,i}$ is the score assigned to item $i$ by user $u$, $\hat{r}^{I}_{u,i}$ is the estimate of the score assigned to item $i$ by user $u$ using the item-based method, $N^{k}_{I}(u,i)$ is a k-member set of the indices of items that are most similar to item $i$ and scored by user $u$, $b_{u,i}$ is the scoring base (it is equal to the scoring base of user $u$, denoted by $b_u$, plus the scoring base of item $i$, denoted by $b_i$, plus the scoring base of the system, denoted by $\mu$), $w$ indicates the weighted or non-weighted type of the method, and $s^{I}_{ij}$ is the degree of similarity between items $i$ and $j$. The term $b_{u,i}$ is calculated according to Eq. (12) here and in all subsequent equations:

$$ b_{u,i} = \mu + b_u + b_i \qquad (12) $$
where $\mu$ is the scoring base of the system, $b_u$ is the scoring base of user $u$, and $b_i$ is the scoring base of item $i$. The scoring base of the system (i.e., $\mu$) is equal to the average of the scoring matrix ($\bar{r}$), which is obtained from Eq. (14):

$$ \mu = \bar{r} = \frac{1}{|K|} \sum_{(u,i) \in K} r_{u,i} \qquad (14) $$

where $K$ is the set of (user, item) pairs with observed scores.
The scoring base of item $i$ (denoted by $b_i$) is obtained from Eq. (15):

$$ b_i = \frac{\sum_{u \in R(i)} (r_{u,i} - \mu)}{\lambda_1 + |R(i)|} \qquad (15) $$

where $\lambda_1$ is a regularization parameter adjustable by experiment and $R(i)$ is the set of users who scored item $i$. The term $b_u$ indicates the scoring base of user $u$, which is obtained from Eq. (16):

$$ b_u = \frac{\sum_{i \in R(u)} (r_{u,i} - \mu - b_i)}{\lambda_2 + |R(u)|} \qquad (16) $$
where $\lambda_2$ is a regularization parameter adjustable by experiment and $R(u)$ is the set of items scored by user $u$. Without losing generality, and by transposing matrix $R$, the other conventional recommender method in collaborative systems is defined as using the k-nearest neighboring users to predict missing values in the above equations. For example, the missing values are predicted using Eq. (17) in the user-based k-nearest neighbors method:

$$ \hat{r}^{U}_{u,i} = b_{u,i} + \frac{\sum_{v \in N^{k}_{U}(i,u)} \left(s^{U}_{uv}\right)^{w} (r_{v,i} - b_{v,i})}{\sum_{v \in N^{k}_{U}(i,u)} \left(s^{U}_{uv}\right)^{w}} \qquad (17) $$

where $N^{k}_{U}(i,u)$ is a k-member set of the indices of users who are most similar to user $u$ and have scored item $i$, and $s^{U}_{uv}$ is the degree of similarity between users $u$ and $v$.
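A minimal sketch of the baseline estimates of Eqs. (12)-(16) and the item-based prediction of Eq. (11) (our illustration; the variable names are hypothetical, and the item-item similarity matrix is assumed to be given):

```python
import numpy as np

def baselines(R, lam1=25.0, lam2=10.0):
    """Regularized scoring bases: mu (system), b_i (items), b_u (users)."""
    observed = R > 0
    mu = R[observed].mean()                                        # Eq. (14)
    b_i = np.zeros(R.shape[1])
    for i in range(R.shape[1]):
        users = observed[:, i]
        b_i[i] = (R[users, i] - mu).sum() / (lam1 + users.sum())   # Eq. (15)
    b_u = np.zeros(R.shape[0])
    for u in range(R.shape[0]):
        items = observed[u, :]
        b_u[u] = (R[u, items] - mu - b_i[items]).sum() / (lam2 + items.sum())  # Eq. (16)
    return mu, b_u, b_i

def predict_item_knn(R, sims, u, i, k=2):
    """Eq. (11): baseline plus similarity-weighted residuals of u's k nearest items."""
    mu, b_u, b_i = baselines(R)
    rated = np.where(R[u] > 0)[0]
    neighbors = rated[np.argsort(sims[i, rated])[::-1][:k]]        # N_I^k(u, i)
    w = sims[i, neighbors]
    resid = R[u, neighbors] - (mu + b_u[u] + b_i[neighbors])       # r_uj - b_uj
    return mu + b_u[u] + b_i[i] + (w @ resid) / (np.abs(w).sum() + 1e-12)

R = np.array([[5., 3., 0.], [4., 0., 2.], [1., 4., 5.]])
sims = np.array([[1., .8, .1], [.8, 1., .2], [.1, .2, 1.]])
print(predict_item_knn(R, sims, u=0, i=2))
```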
In the text representation of each item, the items are shown in a TFIDF matrix space. The entry $x_{\omega i}$ of this matrix indicates the importance of the feature or word $\omega$ in the text or item $i$ and is obtained from Eq. (18):

$$ x_{\omega i} = TFIDF(\omega, i) = f_{\omega i} \times \log \frac{n_I}{n_\omega} \qquad (18) $$

where $f_{\omega i}$ is the number of occurrences of the word $\omega$ in the text or item $i$, $n_I$ is the number of items, and $n_\omega$ is the number of items containing the word $\omega$. This matrix can first be separated into its SVD elements, i.e., the $U$, $S$, and $V$ matrices, using techniques such as those presented in (Deerwester et al., 1990). In this separation, the $U$ and $V$ matrices are orthogonal, and the $S$ matrix is diagonal and contains the singular values. A new feature matrix $X_f$, which has only $f$ meaningful features, is obtained by keeping the $f$-largest values in the diagonal matrix $S$, replacing the rest of the elements with zero, and then performing the inverse operation. In this way, the number of words in this matrix can be reduced in a controlled way if the total number of words is high and the matrix is bulky. Now, the item-item similarity matrix $s^{I}_{ij,TFIDF}$ is defined as follows:

$$ s^{I}_{ij,TFIDF} = \frac{\sum_{\omega \in \Omega} TFIDF(\omega, i)\, TFIDF(\omega, j)}{\sqrt{\sum_{\omega \in \Omega} TFIDF(\omega, i)^2}\, \sqrt{\sum_{\omega \in \Omega} TFIDF(\omega, j)^2}} \qquad (19) $$

where $\Omega$ is the set of words used in the texts (or items) or their controlled number (i.e., $f$). This parameter is meaningful only if the method proposed in (Deerwester et al., 1990) is used.
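The following sketch (ours; the toy corpus is hypothetical) builds the TFIDF matrix of Eq. (18), truncates it with SVD as in (Deerwester et al., 1990), and computes the cosine item-item similarity of Eq. (19):

```python
import numpy as np

texts = ["action hero explosion", "action hero romance", "romance drama tears"]
vocab = sorted({w for t in texts for w in t.split()})
n_items = len(texts)

# Eq. (18): x_{wi} = f_{wi} * log(n_I / n_w)
X = np.zeros((len(vocab), n_items))
for i, t in enumerate(texts):
    for w in t.split():
        X[vocab.index(w), i] += 1
n_w = (X > 0).sum(axis=1)
X *= np.log(n_items / n_w)[:, None]

# Truncated SVD: keep the f largest singular values, rebuild a denoised matrix
U, S, Vt = np.linalg.svd(X, full_matrices=False)
f = 2
X_f = U[:, :f] @ np.diag(S[:f]) @ Vt[:f, :]

# Eq. (19): cosine similarity between item columns
norms = np.linalg.norm(X_f, axis=0)
sims = (X_f.T @ X_f) / np.outer(norms, norms)
print(np.round(sims, 2))
```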
In addition to the above method, in which the number of words (i.e., $|\Omega|$) can be unlimited, there are other methods that do this on a certain set $\Omega$, which can be predetermined (Chan et al., 2016) or generated during the system mechanism (Becerra et al., 2013). The similarity between the items is then measured using the equation above. TF is meaningless in this case since the word frequency is not used; TFIDF is therefore replaced by LimitedIDF in the above equation to obtain Eq. (20):

$$ LimitedIDF(\omega, i) = x_{\omega i} \times \log \frac{n_I}{n_\omega}, \quad \omega \in \Omega \qquad (20) $$
When the number of words is limited, the binary attribute $x_{\omega i}$, which indicates the presence of word $\omega$ in the text or item $i$, can be defined as follows:

$$ x_{\omega i} = \begin{cases} 1 & \text{if } \omega \text{ appears in item } i \\ 0 & \text{otherwise} \end{cases} \qquad (21) $$
The item-item similarity criterion is now defined as follows.
Besides, the item-item similarity criterion $s^{I}_{ij,DICE}$ is defined as follows:

$$ s^{I}_{ij,DICE} = \frac{2 \sum_{\omega \in \Omega} x_{\omega i}\, x_{\omega j}}{\sum_{\omega \in \Omega} x_{\omega i} + \sum_{\omega \in \Omega} x_{\omega j}} $$
In an ontology-enhanced similarity-based system, an ontology must first be formally defined (Jimenez et al., 2013). An ontology $O \equiv (E, \theta, \Im)$ contains a set of entities represented by $E$ such that a root node is in the form of $\theta$ (that is, $\theta \in E$). Furthermore, $\Im$ is a function that maps one member of $E$ to another member of $E$ such that entity $\Im(e)$ is the parent of entity $e$; that is, entity $e$ is a type of entity $\Im(e)$. In other words, $e$ IsA $\Im(e)$. An ontology can be represented as a tree with root $\theta$ and directional edges, each of which indicates an IsA relationship from a lower to a higher node. A distance-based similarity is now defined between entities $t_1$ and $t_2$ as follows.
where $h(t_1, t_2)$ is the minimum distance (in steps) between entities $t_1$ and $t_2$ in the ontology tree. The ontology-based similarity between two items or texts is then defined as follows, where $p$ is a parameter (Jimenez et al., 2013).
where $E_i$ is the set of entities in text or item $i$, $\|E_i\|_{s,p}$ is the degree of similarity (of the similarity type $s$) within the set of entities of text or item $i$ (i.e., $E_i$), and $p$ is a parameter. For better generalization, the value of $\|E_i \cap E_j\|_{s,p}$ is defined as follows.
The calculation of $\|E_i\|_{s,p}$ is defined as follows.
Several $s$ functions are then defined to measure the similarity between entities $t_1$ and $t_2$ (Jimenez et al., 2013). First, $s_1(t_1, t_2)$ is calculated from the equation below.
where $D$ is the depth of the ontology tree and $\gamma_1$ is a parameter, which is used to map the range of the function $s_1(t_1, t_2)$ to below one. The value of parameter $\gamma_1$ is 3 by default. A value of one indicates the similarity of entities $t_1$ and $t_2$, and zero indicates their difference. Next, $s_2(t_1, t_2)$ is calculated according to Eq. (29).
where $t_{1,2}$ is the first common parent of entities $t_1$ and $t_2$ in the ontology tree, $\gamma_2$ is a parameter, and $IC_t$ is the information value of entity $t$, which is obtained from the following equation. The value of parameter $\gamma_2$ is 10 by default.
where $p_t$ is the probability of occurrence of entity $t$ in all items; it is obtained from Eq. (31).
where $\gamma_3$ is a parameter whose value is set to 2 by default. Finally, $s_3(t_1, t_2)$ is calculated according to Eq. (33).
where $d_t$ is the depth of entity $t$ in the ontology tree. The depth of the root of the ontology tree, i.e., $d_\theta$, is zero. An example ontology can be seen in Fig. 8.

Fig. 8. A hypothetical example of an ontology.
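The similarity functions above are given only descriptively here. As a minimal illustration (ours, not the paper's exact $s_1$-$s_3$), the sketch below computes a representative depth-based similarity (Wu-Palmer) over a small tree like the one in Fig. 8; the node names are hypothetical:

```python
class Node:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent
        self.depth = 0 if parent is None else parent.depth + 1

def ancestors(n):
    """Return the list of nodes from n up to the root (inclusive)."""
    out = []
    while n is not None:
        out.append(n)
        n = n.parent
    return out

def first_common_parent(t1, t2):
    a1 = ancestors(t1)
    return next(a for a in ancestors(t2) if a in a1)

def wu_palmer(t1, t2):
    """Representative node-based similarity: deeper common parents => more similar."""
    lcp = first_common_parent(t1, t2)
    total = t1.depth + t2.depth
    return 2 * lcp.depth / total if total else 1.0

root = Node("Motor-Vehicle")
truck, car = Node("Truck", root), Node("Motorcar", root)
coupe = Node("Coupe", car)
print(wu_palmer(truck, coupe))   # 0.0: first common parent is the root (depth 0)
print(wu_palmer(coupe, car))     # 2*1/(2+1) = 0.67
```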
Consider three hypothetical items with the concepts expressed in Fig. 9.
Fig. 9. An example of hypothetical items with their corresponding concepts.
The proposed approach to measure the degree of semantic similarity using the proposed criterion $s_4$ is explained below.
where $doc_k(e)$ is the $k$th most relevant item or text to entity $e$; $\gamma_1$ and $\gamma_2$ are parameters; $Ch(e, l)$ is the set of all entity children of entity $e$ down to lower depth $l$; $\ddot{d}_{t_1 t_2}$ is the depth difference between entities $t_1$ and $t_2$ in the ontology tree; and $Ch(e \mid doc_k)$ is the set of all entity children of entity $e$ appearing in $doc_k$. The function $Ch(e, l)$ is defined as follows.
As previously mentioned, one of the important steps in the development of the ontology model is the conceptual classification of entities based on the degree of semantic similarity between them. This can be done according to the position of each particular entity in the hierarchical tree. In conceptual classification, the concepts that are closer to each other in the hierarchical tree are more similar. However, the weight of the edges between two different entities must be calculated to accurately calculate the degree of similarity between concepts, because the semantic similarity in conceptual classification is measured based on the weight of edges. The first step to measure the semantic similarity between two terms in a weighted hierarchical tree (where the edges between entities are weighted) is to calculate the distance between the two terms using the weights obtained in previous steps. The distance between two terms can be calculated according to Eq. (38):

$$ dist(c_1, c_2) = \sum_{e \,\in\, path(c_1, c_2)} weight(e) \qquad (38) $$

where $dist(c_1, c_2)$ is the distance between entities or concepts $c_1$ and $c_2$, i.e., the sum of the edge weights on the path between them. The degree of semantic similarity between concepts $c_1$ and $c_2$ can then be measured using Eq. (39).
where $c_{1,2}$ is the first common parent of entities $c_1$ and $c_2$ in the ontology tree, and $d_c$ is the depth of entity $c$. The edge weight between $c_1$ and $c_2$ is defined according to Eq. (40).
Eq. (40) indicates that the semantic variances among the higher levels of the hierarchy are greater than those among the lower levels. Meanwhile, the distance between sibling concepts is greater than that between a child and its parent.
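A minimal sketch of the weighted-tree distance of Eq. (38) (our illustration; the tree and the edge weights are hypothetical):

```python
def path_to_root(node, parents):
    """Return the list of nodes from `node` up to the root."""
    path = [node]
    while node in parents:
        node = parents[node]
        path.append(node)
    return path

def weighted_distance(c1, c2, parents, weights):
    """Eq. (38): sum of edge weights along the path joining c1 and c2."""
    p1, p2 = path_to_root(c1, parents), path_to_root(c2, parents)
    common = next(n for n in p1 if n in p2)           # first common parent
    d = 0.0
    for path in (p1, p2):
        for child in path[:path.index(common)]:
            d += weights[(child, parents[child])]
    return d

parents = {"Coupe": "Motorcar", "Motorcar": "Motor-Vehicle", "Bus": "Motor-Vehicle"}
weights = {("Coupe", "Motorcar"): 0.4, ("Motorcar", "Motor-Vehicle"): 0.9,
           ("Bus", "Motor-Vehicle"): 0.9}
print(weighted_distance("Coupe", "Bus", parents, weights))  # 0.4 + 0.9 + 0.9 = 2.2
```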

Discussion on Literature
Although ConF-based recommender systems have somehow solved two main recommender system problems, i.e., the cold start and sparsity issues, these systems need to be improved and still have some drawbacks in many domains like music and film. The serendipity issue introduced by (Iaquinta et al., 2008) is another problem in ConF systems. This problem means that these systems only offer high-ranking samples that already match the user profile (UP) (Lops et al., 2011).
ColF-based recommender systems are categorized into memory-based and model-based techniques. In memory-based techniques, predictions are made based on all the ratings submitted by all users to the items; therefore, all ratings must be stored in memory. In model-based techniques, a set of selected ratings is used for rating prediction and is applied for model learning; they often produce an offline brief rating pattern. For example, in this approach, clustering methods have been applied (Liu et al., 2020; Wang et al., 2022a) to improve efficiency and scalability (Nilashi et al., 2014a; Nilashi et al., 2014b), but this solution can cause other issues, such as decreased precision, overgeneralization, and overlapping. These methods have less precision compared to memory-based or KNN techniques.
Nevertheless, model-based ColF can be a substitute for KNN to solve the scalability of the memory-based approaches.
Using clustering techniques may lead to some problems, like reduced scalability and precision, overgeneralization, less-personalized suggestions, and overlapping. Therefore, these approaches may entail a reduction in precision compared to KNN and other memory-based approaches.
HRSs can be made by combining two or more methods to use their advantages and eliminate their drawbacks.Utilizing hybrid techniques improves the efficiency of the recommender system.Typically, we can create a hybrid method using ColF and ConF to find better recommendations (Meymandpour and Davis, 2014;Ebadi and Krzyzak, 2016).Data and domain properties are important in creating an HRS method.
ConF-based recommender systems may face some limitations in finding the traits that are clearly related to the items. For example, content-based film recommendations may be based only on written content about films, like plot summaries, genre, actors, etc. Another problem is that a content-based recommender system can only retrieve items with higher rates for a particular UP. Of course, the main limitation of ConF is cold start: since items should be rated accurately before recommendation, new items with few ratings, or those without any rating, cannot be chosen as recommendations. In summary, ConF-based recommender systems must deal with three issues: new-item cold start, lack of diversity, and content over-specialization. Another limitation mentioned before arises because ConF is mostly applicable in situations where items are textually defined, namely web pages and text files. A solution can be using ColF to reduce the ConF drawbacks, including the limited content analysis issue, deficiency of diversity, and content over-specialization. ColF typically makes recommendations according to similarities between users; the similarity between users is defined based on the shared patterns in individual and collective user behavior.
The ColF method has no limitation in suggestion space and can prevent over-specialization and lack of content diversity, but it has limitations in cold start, sparsity, and scalability. Therefore, we combine ColF and ConF with the aim of solving the mentioned problems. In this combination, we try to use the advantages of both methods and remove their limitations. Although the hybrid system solves a great part of the problems, there remains a part where ontology can be used to improve the accuracy of hybrid methods.
In this research, ontology has been used as a conductive channel to alleviate the mentioned problems. To this end, it is possible to implement ontology, which is one of the prominent tools of the Semantic Web, to address issues such as the new-item cold start. Also, to solve the problems of sparsity and scalability, a clustering method can be used in the ColF phase of this system. In this case, similarity links are required for clustering users in the collaborative filtering phase and items in ConF, the accuracy of which can be improved by using semantic similarity. Semantic similarity relations can be acquired by applying node-based and edge-based algorithms to a hierarchical tree (Jiang, 2013). In the exemplary tree depicted in Fig. 11, each edge or link is an "IsA" relation, and each node indicates a concept in the tree. Indeed, the relations among concepts are shown by "IsA." Here, each parent node gives the same similarity support to its child nodes.

Fig. 11. An exemplary concept tree rooted at "Motor-Vehicle" (with child concepts such as "Truck," "Motorcar," "Cargo-Trailer," "Coupe," and "Bus").
In previous research, the ontological edges that represent relationships among concepts have been supposed to be conceptually similar to each other; that is, the labels (or weights) of the edges are equal (uniform). This can reduce the precision of similarity between each pair of concepts. Improving the structure of the ontology can increase the performance of ontological similarity measures. Under the uniform assumption, two concepts that are on the same level of a hierarchical tree are considered identically related to other concepts. Consequently, the accuracy of the recommender system in detecting similar users or items can be affected.
For example, in Fig. 11, "Bus" and "Coupe" have the same distance from "Cargo-Trailer" in WordNet. Therefore, it is supposed that "Bus" and "Coupe" are equally similar to "Cargo-Trailer." However, it is obvious that the similarity between "Bus" and "Cargo-Trailer" should be greater than that between "Coupe" and "Cargo-Trailer." In this regard, each link should be assigned a weight in the taxonomy to improve the accuracy of semantic similarity. In addition to algorithms that can automatically find relationships between concepts, uniform edges may be removed through data mining and ontology. Then, semantic similarity relationships may be obtained by applying the IaD determined between the two concepts.
In previous research, clustering in ColF has only been used based on the ratings given by actual users. In this case, an efficiency issue may arise, and this dependence of ColF on user ratings may be undesirable, because ratings alone are usually relatively less accurate and can cause overgeneralization as well as less-personalized recommendations.
In this regard, domain ontology can be used to improve personalization. This is because user interest in ColF is modeled effectively and accurately using ontology and the domain-based inference technique.
In the presented research, a heuristic method is used to improve the accuracy and customization in the ColF part of the hybrid system through ontology and the content of items. Finally, content-based features and ontologies can be used to improve customized recommendations and accuracy in ColF by combining model-based and memory-based methods.

Proposed Recommender System
This paper aims at performance enhancement and dealing with the scalability, sparsity, and cold start challenges using an ontology-enhanced hybrid recommender system. Our hybrid recommender system consists of two separate sections: it benefits from both collaborative filtering and content-based filtering. It also uses a model-based recommender system and a modeless (K-nearest neighbors) recommender system. If the size of the nearest-neighbor set is small, a missing-value imputation, which was proposed by Pan et al. (2015), is used to improve the predicted ratings. The collaborative filtering section generates a profile for each user according to its extracted ontology. It also provides us with a metric to assess ontological similarity, and according to this type of similarity, we cluster the users. This section also uses item ontology and K-nearest neighbors. In our model, the content-based filtering section extracts the items' ontology and their ontological semantic similarity.
The main innovations of this article include (1) the usage of a self-organized WordNet, (2) the definition of a similarity measurement criterion considering the created WordNet, (3) the creation of a recommender system that uses both a model-based method to be fast and a memory-based method to be accurate, and (4) the use of a missing-value imputation method to improve our method. Fig. 12 shows the pseudocode (Fig. 12.a) and the general structure (Fig. 12.b) of our hybrid recommender system. As seen in Fig. 12.b, our hybrid recommender system consists of two main sections, the primary content filtering section and the primary collaborative filtering section; therefore, it is a combination of the primary memory-based method and the primary model-oriented method.
In our hybrid recommender system, the collaborative filtering section, on one hand, uses the video ontology database and users' latent information to construct the ontology of users' profiles and, on the other hand, uses the users' explicit ranking information as helpful knowledge in the clustering stage and to complete the ontology.
We represent the main steps of our algorithm in Fig. 12.a, which gives an overview of the proposed hybrid system. It is clear that our proposed hybrid system includes two basic approaches and has both parts. Collaborative filtering uses the movie ontology repository (MOR), explicit valuation, and implicit data of the user (IDU) to construct an ontology of the user's profile (UP). In this section, we use explicit valuations for the clustering task; then, we improve the ontology through this clustering. In the content-based filtering section, we only use the movie ontology repository. We use the results of this section to validate the IsA relationships.
The primary ontology construction, the collaborative filtering section, the content-based filtering section, and the combiner are the four parts of our framework. The primary ontology construction gathers all the data for an ontology, the user profile-based ontology, and the item-based ontology. The collaborative filtering section and the content-based filtering section then run in parallel. We give the outputs of the collaborative filtering section and the content-based filtering section as input to the combiner at the last step.
In the primary content-based filtering section, we use the knowledge contained in the ontology repository of films to determine the ontological IsA values, leading to a weighting of the concepts contained in the hierarchical ontological tree; in this way, our method can identify the best items as a recommendation list for the target user. As we mentioned, measuring the IsA degrees before calculating the degree of between-concept equality is one of the main steps in the proposed hybrid method. Clustering, user profile ontology, and the proposed basic memory algorithm (such as the KNN algorithm) are used in the primary collaborative filtering section.
In the primary content-based filtering section, the uniformity of all IsA relationships in the ontology is measured, and all of the uniform edges are removed. Then, the ontological semantic similarity between every pair of concepts (based on the given weights) is evaluated. Accordingly, we identify items similar to the target user's profile. On the other hand, the construction of the improved clustering and ontology, the proposed algorithm for finding similar users (for example, improved KNN), and finding the ontological semantic similarity are the main processes in the collaborative filtering section of the proposed method. In order to cluster users, information about (explicit) user ratings and movie content features is used. In the proposed clustering method, overlap in clustering and other problems of traditional clustering are eliminated. In the next step, the ontology of the items is improved using the clustering step. At this point, a feature called UC (User-Clustering) is added to the ontology of items. The UC feature of an item lists the users who have purchased or are interested in the item. It is noteworthy that the clustering of users is conducted based on the clustering phase, encapsulated in the UC feature.
In the following, the users with the most similarity to the target user (the k closest neighboring users to the target user) are identified based on the mentioned feature. In the next step, a list of the top-N items is determined according to the needs and interests of the k users who are similar to the target user. Unlike traditional algorithms, to detect the k nearest neighboring users of the target user, we do not need to search all clusters to find the clusters of similar users, nor do we need to check all users in the cluster of similar users. For this purpose, only those users in clusters similar to the target user are examined. Also, in order to find the most similar users to the target user within the target cluster, only users who are, based on the UC feature, within a common cluster with the target user are considered.
The URL of any movie on the IMDB website is a unique indicator representing a single film. Using a Web Crawler Service (WCS) and the unique URL of each movie, the primary content features of that film can be extracted from the mentioned repository. These properties are stored in a database to generate ontology-based metadata. The WCS analyzes representative web pages based on the critical features of each predefined movie. In this research, ten critical characteristics of items (films) are used, including genre, actors, country of production, release time, average website rating, color, director, writer, and film language. After that, the user ontology is constructed according to the ontology of items and the implicit ranking of users, as well as the user's feedback (explicit ranking); all of them are collected through a web proxy.
At the stage of measuring the IsA degree, the ontology of the concepts must first be generated. To design an ontology, we use the concept tree as our knowledge representation model. In the concept tree, the relationships between concepts, i.e., the IsA relationships, are defined. Each node of the concept tree represents a concept, and each edge represents the parent-child relationship between two nodes. The concept tree is one of the simplest available ontology models, and we use it in the present study. The UNSPSC coding method (UNSPSC, visited in 2016) is used to classify concepts in this research. Each concept has its characteristics, such as its product code in UNSPSC and its WordNet code. In addition to the mentioned features, this study uses another unique feature for concepts, called the between-concept equality (CE) grade. This feature shows how much a child concept (ChC) is supported by its parent concept (PaC). The following algorithm is used to automatically detect the relationship between any pair of concepts; the weight of the edge between the mentioned pair of concepts is considered to be their between-concept equality grade. In order to ascertain the concept equality grade between two concepts, $c$ and $p$, Eq. (41) is used.
where $Ch(c, l)$ takes a concept $c$ and an integer $l$ as inputs and returns the set of all the $l$th-level child concepts of concept $c$ (formally, $Ch(c, 1)$ is the set of all the child concepts of concept $c$, and $Ch(c, l) = \bigcup_{x \in Ch(c, l-1)} Ch(x, 1)$); $doc_k(c)$ takes a concept $c$ and an integer $k$ as inputs and returns the $k$th most relevant document (among those documents found by the search algorithm) to concept $c$; $Ch(c \mid d)$ takes a concept $c$ and a document $d$ as inputs and returns the set of the child concepts of concept $c$ that appear in document $d$; $l$ is a parameter equal to three in this paper; $N$ is a parameter; $\ddot{d}_{c,p}$ takes two concepts $c$ and $p$ as inputs and returns an integer indicating how far these two concepts are from each other in the hierarchical concept tree; and finally, $I(\cdot)$ takes a condition as input and returns one if the condition is true; otherwise, it returns zero, as presented by Eq. (42).
To compute $CE_{1000,2}(Truck, Motor\text{-}Vehicle)$, let us assume we search $N$ (here $N = 1{,}000$) documents containing child concepts of the "Motor-Vehicle" concept. Assume the Motor-Vehicle concept emerges as the parent concept of the Truck, Motorcar, Cargo-Trailer, Coupe, and Bus concepts respectively 311, 871, 0, 4, and 275 times in our 1,000 documents. Therefore, $CE_{1000,2}(Truck, Motor\text{-}Vehicle) = \frac{311}{311 + 871 + 0 \times 0.5 + 4 \times 0.5 + 275 \times 0.5} + 1 \approx 1.2353$ (a small sketch of this computation follows this passage).

The proposed algorithm includes the following steps to measure the IsA grade. In the first step, all child concepts of a parent concept are identified using the UNSPSC standard in the hierarchy of concepts. Then, we find a set of documents related to the parent concept used as the keyword. For this purpose, we can use the methods that are used to automatically detect the relationship between concepts (Buitelaar et al., 2005). "Related documents" means documents related to the parent concept. The statements shown in Fig. 13 should be searched on Google to find documents related to the parent concept (Buitelaar et al., 2005; Hearst, 1992). In Fig. 13, the phrase "PaC" means "parent concept," and the phrase "ChC" means "child concept." Also, the phrase hyponyms("ChC_i", "PaC") indicates "ChC_i IsA PaC". We should search each phrase without taking a specific child concept into account, since the aim is to acquire a general basis of documents that includes every child concept of a parent concept. For example, the expressions in Fig. 14 should be searched to find related documents and determine the grade of "IsA" for the children of the phrase "Bus". In the next phase, we can look at a certain number (indicated by $N$) of the documents found for each phrase in order to find the concepts. For instance, the first thousand searched web pages for any phrase can be selected. In the final step, the sub-phrases (child concepts) of a particular phrase (parent concept) are explored using the previously mentioned relationships. For each phrase (child concept) discovered in the documents retrieved in step 3 (as well as in the relationships in step 2), a positive point is added to the value of that phrase relative to its parent concept.

Now, we introduce our proposed semantic similarity (SS) measure. One of the essential parts of this work is calculating the between-concept ontology-based similarity according to the concepts' positions in the concept tree. Accordingly, the closer two concepts are to each other in the concept tree, the greater their between-concept ontology-based similarity. The most crucial part of calculating the between-concept ontology-based similarity between two concepts is defining a meaningful weight for the edges between the concepts in the concept tree. It is noteworthy that the degree of between-concept equality is the weight of the edge between the two concepts in the concept tree; the between-concept ontology-based similarity is calculated based on these edges. Calculating the between-concept ontology-based similarity is the most important innovation of this paper.
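A minimal sketch of this worked example (ours; the counts and level weights are taken from the example above, and the helper names are hypothetical):

```python
def ce_grade(child_counts, target, level_weights):
    """Concept-equality grade: the target child's count over the weighted sum
    of all child-concept counts, plus 1 (matching the worked example)."""
    denom = sum(level_weights[c] * n for c, n in child_counts.items())
    return child_counts[target] / denom + 1

# Counts of each child concept of "Motor-Vehicle" in the 1,000 documents
counts = {"Truck": 311, "Motorcar": 871, "Cargo-Trailer": 0, "Coupe": 4, "Bus": 275}
# Assumed weights: 1.0 for first-level children, 0.5 for deeper ones
weights = {"Truck": 1.0, "Motorcar": 1.0,
           "Cargo-Trailer": 0.5, "Coupe": 0.5, "Bus": 0.5}
print(round(ce_grade(counts, "Truck", weights), 4))  # 1.2353
```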
The degree of between-concept ontology-based inequality is presented in Eq. (43). Eq. (43) takes the concepts c1 and c2 and the values of the parameters α and β as input and, according to these parameter values, calculates and returns the degree of between-concept ontology-based inequality of the concepts c1 and c2.
We present the between-concept ontological semantic similarity between the concepts c1 and c2 in Eq. (44). Eq. (44) takes the concepts c1 and c2 and the values of the parameters α and β as input and, according to these parameter values, calculates and returns the degree of between-concept ontological semantic similarity between the concepts c1 and c2.
where ṙ is the root of the concept tree, ṙ_{c1,c2} is the concept that is the closest common ancestor of the concepts c1 and c2, and the regulator reg_{α,β}(c1, c2) is defined in Eq. (45).
The between-concept ontological semantic similarity between the concepts c1 and c2 (Eq. (44)) has two desirable properties: the similarity variance among concepts near the top of the concept tree is greater than the similarity variance among concepts near the bottom of the tree, and the similarity between a parent concept and its child is greater than the similarity between two sibling concepts.
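Since Eqs. (43)-(45) depend on the closest common ancestor and the parameters α and β, a rough illustration of a tree-based similarity with the properties just listed may help. The sketch below is a Wu-Palmer-style stand-in, not the paper's exact Eq. (44); the function names and the roles given to alpha and beta are our assumptions.

```python
# An illustrative tree-based concept similarity (a stand-in for Eq. 44):
# the tree is given by parent pointers, with None marking the root.
def depth(parent, c):
    d = 0
    while parent[c] is not None:
        c = parent[c]
        d += 1
    return d

def lca(parent, c1, c2):
    # Closest common ancestor: walk c1's chain, then climb from c2 until hit.
    anc = set()
    while c1 is not None:
        anc.add(c1)
        c1 = parent[c1]
    while c2 not in anc:
        c2 = parent[c2]
    return c2

def tree_similarity(parent, c1, c2, alpha=0.2, beta=0.6):
    # Deeper common ancestors yield higher similarity, so similarity varies
    # more near the root than near the leaves; alpha and beta act as regulators.
    a = lca(parent, c1, c2)
    da, d1, d2 = depth(parent, a), depth(parent, c1), depth(parent, c2)
    return (2.0 * da + alpha) / (d1 + d2 + beta)

parent = {"Motor-Vehicle": None, "Truck": "Motor-Vehicle",
          "Motorcar": "Motor-Vehicle", "Coupe": "Motorcar", "Bus": "Motor-Vehicle"}
print(tree_similarity(parent, "Truck", "Motorcar"))   # siblings: ~0.08
print(tree_similarity(parent, "Motorcar", "Coupe"))   # parent-child: ~0.61
```

Note that the parent-child pair scores higher than the sibling pair, matching the second property above.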
Next, a list of the top items (the Top-N list) is prepared based on the target user's past records using the primary content-based filtering method. The content-based filtering used here relies on the between-concept ontological semantic similarity described in the previous step.
In the collaborative filtering segment, both content-based features (implicit information) and user ratings (explicit information) are used to cluster items, because attending to only one of these feature types leads to problems such as reduced accuracy, poorer generalizability, and overlapping clusters. Different algorithms could be used for clustering in this research; we use the k-means algorithm, one of the simplest and oldest unsupervised learning algorithms (MacQueen, 1967; Lloyd, 1982).
The k-means algorithm is a clustering method that assumes the clusters are globular. Its goal is to divide the set of items into k clusters. The initial positions of the cluster centers can be predefined; for the clustering to work properly, it is best to keep these initial centers far apart from one another. The algorithm consists of the following stages. Stage 1: Initialize a set of k seed points as the centers of the k initial clusters, where each point is a rating vector over all items.
Stage 2: Calculate the distance of each point to every center and allocate each user to its nearest cluster, i.e., the cluster whose center has the lowest distance to the user's rating vector. Stage 3: Recalculate the clusters' centers. Stage 4: Iterate Stages 2 and 3 until there is no change between two consecutive allocations.

Based on the content characteristics of the items, we first generate a set of clusters of users with this clustering approach. In this research, the genre features are used to build a prototype movie recommender system. For this purpose, the importance of each genre to each user must be calculated. To achieve this, we first represent each item as a vector: let g_ij denote whether the i-th item contains the j-th genre (a Boolean value). We then have an importance matrix as in Eq. (46), where w_ij is the importance of the j-th genre to the i-th user. At this point we have the important genres for each user, and a primary clustering algorithm such as k-means can allocate the users into a predefined number of clusters. After finding the clusters, we assign the i-th item to the c-th cluster whose center weights the item's genres most, i.e., if for every other cluster ċ the match between the item's genre vector and the c-th center is at least as great as its match with the ċ-th center, where w_cj is the importance of the j-th genre to the center point of the c-th cluster. After accomplishing this stage, we have a set of clusters, each of which contains a number of items and a number of users; a sketch of the whole stage follows.
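The sketch below illustrates this stage under one plausible reading. The paper does not spell out how w_ij is computed, so we assume here that it is the mean rating user i gave to items carrying genre j; the toy data and scikit-learn k-means are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

# R: (num_users x num_items) rating matrix, 0 meaning "not rated".
# G: (num_items x num_genres) Boolean item-genre matrix (g_ij in the text).
rng = np.random.default_rng(0)
R = rng.integers(0, 6, size=(100, 40))            # toy data
G = rng.integers(0, 2, size=(40, 18)).astype(float)

# Assumed importance measure: mean rating a user gave to items of each genre.
rated = (R > 0).astype(float)
W = (R @ G) / np.maximum(rated @ G, 1.0)          # (num_users x num_genres)

# Cluster users by genre importance with plain k-means.
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(W)
user_cluster = km.labels_

# Assign each item to the cluster whose center weights its genres most,
# mirroring the nearest-center rule described above.
item_cluster = np.argmax(G @ km.cluster_centers_.T, axis=1)
```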
One of the most important tasks in this study is to complete the ontology according to the clustering of users in order to achieve efficient and accurate clustering-based collaborative filtering. We define a new assumptive feature named UC (user clustering) in the items' ontologies; the UC of each item is defined as part of its ontology. The users located in the same cluster as an item during the aforementioned clustering are stored in the UC part of that item's ontology. As discussed later, UC can speed up the process of determining the k nearest neighbors of the target user: we explore only the users in the UC of the items highly rated by the target user to discover the potentially near users.
In this study, in order to prepare a list of recommended items based on collaborative filtering, we must identify the most similar (neighboring) users to the target user. Different algorithms and approaches can be used for this purpose; applying the primary KNN algorithm to the rating matrix is one option. However, to improve the efficiency of the algorithm, this research presents a solution that does not need to search through all users to find the neighbors of the target user. To do this, we first collect the set of all users in the UC of any item that is highly rated by the target user. Then we apply the primary KNN algorithm only to this set to find the most similar (neighboring) users to the target user. As this set is far smaller than the original set of users, we expect the scalability and accuracy of the algorithm, and hence the performance of the entire recommender system, to improve.
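The candidate-set idea can be sketched as follows, reusing the user_cluster and item_cluster arrays from the previous sketch. The function names are ours, and the cosine similarity is only a placeholder: the paper itself compares user profiles with the ontology-based semantic similarity described next.

```python
import numpy as np

def candidate_neighbors(target, R, item_cluster, user_cluster, high=4):
    """Restrict the neighbor search to users sharing a cluster (the UC field)
    with at least one item the target user rated highly."""
    liked_items = np.where(R[target] >= high)[0]
    clusters = set(item_cluster[liked_items])
    cand = np.where(np.isin(user_cluster, list(clusters)))[0]
    return cand[cand != target]

def knn_from_candidates(target, R, cand, k=20):
    # Plain cosine-similarity KNN, but only over the reduced candidate set.
    t = R[target].astype(float)
    sims = (R[cand] @ t) / (np.linalg.norm(R[cand], axis=1) * np.linalg.norm(t) + 1e-9)
    return cand[np.argsort(sims)[::-1][:k]]

cand = candidate_neighbors(0, R, item_cluster, user_cluster)
neighbors = knn_from_candidates(0, R, cand)
```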
Semantic similarities among ontologies are used in this algorithm to compare the target user's profile with the other UPs and acquire the k nearest neighbor users (Maedche and Staab, 2002). During the semantic similarity computation, both conceptual and ontological similarities are taken into account; at the conceptual level, two taxonomies and the relationships among their associated concepts are compared. After finding the k nearest neighboring users (let N_u denote the set of the k nearest neighbors of the u-th user), any item that is highly rated by at least one user in N_u is suggested to the u-th user.
The final recommendations for the u-th user are obtained by hybridizing the items recommended by the content-based phase and those recommended by the collaborative phase according to a weighting mechanism (Burke, 2002).
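A weighted hybridization in the spirit of Burke (2002) might look like the following sketch; the weight w_cf, the score dictionaries, and the item names are illustrative assumptions, not values from the paper.

```python
def hybrid_score(cb_scores, cf_scores, w_cf=0.5):
    """Combine content-based and collaborative scores per candidate item."""
    items = set(cb_scores) | set(cf_scores)
    return {i: w_cf * cf_scores.get(i, 0.0) + (1 - w_cf) * cb_scores.get(i, 0.0)
            for i in items}

# Toy usage: merge two ranked candidate pools and take the Top-N.
scores = hybrid_score({"m1": 0.9, "m2": 0.4}, {"m2": 0.8, "m3": 0.7})
top_n = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:2]
```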

Empirical Study
The proposed algorithm is evaluated on real benchmarks so that its results can be compared with those of other methods in this field. The proposed recommender system and the compared algorithms are implemented in Matlab R2019a.

Benchmark Details
Two datasets are used to evaluate the different algorithms. The first is collected from the Movielens benchmark and refined to be usable during evaluation. The Movielens dataset contains 3,706 movies and 6,040 users. Users score a movie with an integer from 1 to 5; the average score is 3.58 with a standard deviation of 1.17. Each user has submitted at least 20 and at most 2,314 ratings; on average, each user has provided 166 ratings with a standard deviation of 193. Each item is rated at least once and at most 3,428 times; on average, each item is rated 270 times with a standard deviation of 384. The dataset consists of 1,000,209 anonymous ratings, so its sparsity is 95.53% (i.e., 1 − 1,000,209 / (3,706 × 6,040)). In evaluating a recommender system, the dataset is divided into a training set and a test set: only the training set is used to build the recommender system, and the test set is then used to evaluate its performance. A subset of 80% of the data is randomly selected as the training set, and the remaining 20% forms the test set.

The second benchmark is the Netflix dataset. It contains 100 million anonymous ratings of date-stamped movies accumulated by 480,000 anonymous users; on average, each user has rated about 208 items, and each movie has been rated about 5,600 times. Each rating is an integer in the range [1, 5], where a "1" indicates that the user greatly dislikes the movie and a "5" indicates that the user greatly likes it. We report results on a test set predefined by Netflix as the Quiz set, which contains over 1,400,000 ratings. The Quiz set contains many users who never rate in the training data or rate only a few movies there; thus, it is a real challenge, especially in cold-start (CS) situations (Bennett and Lanning, 2007).
We define an implicit feedback matrix as follows. We create a binary matrix B with the same size as the rating matrix R, where each b_ij is 1 if r_ij has a value and 0 otherwise. Thus, b_ij is one if the i-th user has rated the j-th movie (Marlin et al., 2007).
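In code the construction is a one-liner; the sketch below follows the matrix names in the text and assumes missing ratings are stored as NaN.

```python
import numpy as np

# R holds ratings with np.nan for missing entries; B is the implicit
# feedback matrix: b_ij = 1 iff user i rated movie j (Marlin et al., 2007).
R = np.array([[4.0, np.nan, 2.0],
              [np.nan, 5.0, np.nan]])
B = (~np.isnan(R)).astype(int)   # [[1, 0, 1], [0, 1, 0]]
```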

Setting the Number of Clusters
The Silhouette measure can be used to evaluate the quality of a clustering. After distributing the dataset into a number k of separate clusters, the clustering is evaluated using the Silhouette method (Frahling and Sohler, 2008; Nilashi et al., 2014). Since k is variable, different clusterings will be produced for different values of k; therefore, the quality of the different clusterings must be tested to find the optimum k (Nilashi et al., 2019). The results are presented in Fig. 15.
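The model-selection loop behind Fig. 15 can be sketched with scikit-learn as follows; the feature matrix and the range of candidate k values are placeholders, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 18))   # stand-in for the user feature matrix

# Sweep k and keep the clustering with the best average silhouette.
scores = {}
for k in range(2, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)
```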

Recommender System Assessment
To evaluate the proposed method, it should be compared with a basic memory-based recommender system (for example, basic KNN) and a basic model-based recommender system (for example, a basic clustering-based recommender system) in terms of time complexity and accuracy.
The KNN method is used in memory-based ColF, and the clustering method is used in model-based ColF. It is noteworthy that memory-based recommender systems have better accuracy, while model-based ones are significantly better in terms of time complexity. The purpose of the proposed recommender system is to provide an algorithm that benefits from the accuracy of a memory-based system and the scalability of a model-based system.
Many state-of-the-art recommender systems are compared with the proposed system. These include some basic algorithms (LMRS, LULCS, GMRS, LMRS+ (LMRS enriched by demographic information), and NoPE (user-based ColF)) and some sophisticated approaches: the non-normalized ConF recommender system, Item-based + SVD + EM + Ontology (ISVDEMO) (Nilashi et al., 2019), User- and Item-based + SVD + EM + Ontology (UISVDEMO) (Nilashi et al., 2019), the Popularity-based recommender system (Pop-based RS) (Cremonesi et al., 2010), and the ontology-based Top-N recommender system using matrix factorization (OTopN) (Bambini et al., 2011). For all of these methods, we use the best parameters recommended in their corresponding papers. The experiments are conducted independently in three sections: ICS, UCS, and TN. Furthermore, the weights of Q and Z are set to 25, as the best experimental results are attained for these values. The average time consumed to predict a missing rating by the different RSs, in terms of neighbor size, on the TN dataset of Movielens is presented in Fig. 16. The proposed recommender system and the compared algorithms are run in Matlab R2019a on an Intel Core i9-9900K processor with 16 GB (2 × 8 GB) of 4,000 MHz RAM under Windows Server 2019 (Version 1809, Build 17763.805).

The time complexity of the compared recommender systems is analyzed as follows. Let n denote the number of users, k1 the number of clusters, k2 the number of items similar to the UP of the target user, and |U| the number of users who have rated those items. The time complexity of the proposed algorithm is O(k2 × |U|). This improves on the conventional KNN-based ColF approach (Adomavicius and Kwon, 2007), which is O(n), and is close to the time complexity of the model-based clustering system (Liu et al., 2011), which is reported to be O(k1). To evaluate the proposed system empirically, its consumed time is compared with those of the primary memory-based recommender system (KNN) and the primary model-based recommender system (the clustering-based technique); the relevant parameters are the total number of users, the number of clusters, the number of items similar to the target user's profile, and the number of users who have rated those items. Fig. 16 shows the consumed time (in seconds) of the three algorithms, all executed under the same conditions on a system with the same configuration. Fig. 16 shows that the consumed time of the proposed method is less than that of the primary memory-based recommender system and slightly more than that of the primary model-based recommender system. The consumed time of the primary memory-based system is significantly higher than those of the other two, and the primary model-based system is the best in terms of time complexity. It is also clear that the size of the neighbor set in KNN has little effect on the consumed time, irrespective of the method.
Statistical metrics such as the Mean Absolute Error (MAE) between the actual and predicted ratings are computed. In contrast, decision-support metrics compare the suggested items with the relevant items, e.g., by counting their overlap. Eq. (47) presents the MAE:

MAE = (1/n) Σ_{i,j} |r_ij − r̂_ij|, (47)

such that n is the number of ratings that the test users have provided and r̂_ij is the predicted rating that the i-th user gives to the j-th item. The MAE results of the different recommender systems are presented in Fig. 17 in terms of profile expansion size. A statistical t-test has been performed on the results of Fig. 17 for the UCS dataset, and the results are presented in Table 1; according to these data, the proposed method significantly improves the performance here.
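Eq. (47) is straightforward to compute. The sketch below assumes the actual and predicted test ratings have been aligned into two arrays of equal length.

```python
import numpy as np

def mae(actual, predicted):
    """Eq. (47): mean absolute error over the n test ratings."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(actual - predicted))

print(mae([4, 3, 5], [3.6, 3.2, 4.1]))  # 0.5
```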

The accuracy results of the different recommender systems are presented in Fig. 18. With regard to precision measures, decision-support metrics play a particularly significant role in the field of multi-index recommender systems; several such metrics are prevalent in data retrieval and are discussed below. Eq. (48), the precision equation, determines the share of retrieved items that are relevant:

Prec = TPR / (TPR + FPR), (48)

and Eq. (49), the recall equation, determines the share of relevant items that are retrieved. Both metrics must be used together, since increasing the number of retrieved items raises the recall while reducing the precision at larger sizes.

Rec = TPR / (TPR + FNR), (49)

such that FNR is the quantity of wrongly unrelated predictions (relevant items that are not retrieved), TPR is the quantity of truly related predictions, and FPR is the quantity of wrongly related predictions.
The F-measure metric considers both the Prec and Rec values (Tsai and Hung, 2012). It is computed as shown in Eq. (50) to combine precision and recall into a single average. The parameter Φ may be used to weight the impact of either measure: Φ > 1 increases the significance of precision and Φ < 1 increases the impact of recall. To achieve a balanced F-measure, Φ is set to one.

F_Φ = ((1 + Φ²) × Prec × Rec) / (Φ² × Prec + Rec). (50)

Additionally, to assess the recommended approach via decision-support metrics, the recall and precision are derived for various Top-N values, N = 2, 3, ..., 20. The recall and precision values for the various Top-N values are then used to calculate the F1 metric, shown in Fig. 19. The proposed method is also assessed using the MAE and compared with memory-based recommender systems (such as KNN) and model-based ones (such as clustering approaches); Fig. 20 shows the MAE for various neighbor sizes k on the benchmark dataset. As can be inferred from Fig. 19a, increasing N in Top-N improves the efficacy of all recommender systems in terms of recall. Also, the proposed HRS is superior to the other state-of-the-art recommender systems irrespective of N. Although the OTopN, ColF KNN, and model-based k-means recommender systems are, respectively, the best methods after the proposed HRS in terms of recall, only the performance of OTopN is comparable. As can be seen, GMR is the best method in terms of run time: it is fast because it lacks a time-consuming task, managing the missing ratings as a global imputation method. ColF KNN is the best method in terms of the quality of its recommendations, but it is significantly slow; meanwhile, the recommendations of the GMR method are of low quality. While model-based k-means is not the best method in terms of run time, it is not fundamentally slow.
Later in this section, we compare the different recommender system methods in terms of precision, recall, and F-score on a dataset named Flickr. The compared recommender systems are presented in Table 2. Using some of the state-of-the-art methods from Table 2 as baselines for our method, we obtain the results of Figs. 23-25. As can be seen in Fig. 23, the larger N, the greater the number of recommendations. The recall metric grows as the number of suggestions is raised, as seen in Fig. 24. According to Fig. 25, our method is the best in terms of F-measure.

Conclusions and Guidelines for Future Research
We propose a new recommender system that uses ontology and WordNet and is based on the profiling of users and items, with the profiling of users improved by clustering. Our recommender system deals with the item cold-start challenge with the help of ontology and WordNet, and it overcomes the scalability and sparsity challenges with the help of a clustering method. It outperforms the state-of-the-art recommender systems in terms of recommendation quality, is more scalable than many modern recommender systems, and is comparable with many of the scalable recommender systems in terms of scalability. Our recommender system utilizes the concepts of both the collaborative filtering recommender system and the content-based filtering recommender system. Experiments have been conducted on the Movielens and Netflix benchmarks; analyzing the results, we find that our method is superior to the state-of-the-art recommender systems in terms of accuracy and execution time. In future work, demographic information can be incorporated into this method, and the method can be generalized to other problems.

Declarations

Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.

Fig. 1. Various types of recommender systems and their relations (e.g., in the content-based model, suggestions similar to the content of a user's profile are given to the user).

Fig. 4. The process of ColFRS. Collaborative filtering (ColF) algorithms are categorized into two fundamental types: (a) memory-based (or heuristic-based) ones, whose predictions are inherently heuristic and based on the total set of ratings, and (b) model-based ones.

Fig. 7. Hybridization structures: (a) each method is defined and used separately, and the union of their results is used; (b) both methods are combined in an integrated parallel structure, with the results combined like a voting mechanism to provide more effective recommendations; (c) a ColF approach is implemented first, and then some ConF features are added; (d) a ConF approach is implemented first, and then some ColF features are added.

Fig. 11. An example of a graphical fragment of WordNet.

(Algorithm fragment: the union of the k3 most similar users to the target user, over all items rated by the target user, is taken as the neighbor set to predict the missing elements of the target user's rating vector; a missing-value imputation method (Pan et al., 2015) fills the rating matrix, and the two results are combined.)

Fig. 14. Finding child concepts of the Bus concept.

Fig. 15. Silhouette of the different partitions obtained for different values of the number of clusters in the clustering algorithm.

Fig. 17. The performance comparison of different RSs in terms of MAE for different profile expansion sizes on (a, top) the TN dataset, (b, middle) the ICS dataset, and (c, bottom) the UCS dataset.

Fig. 18. The performance comparison of different RSs in terms of RC on (a, top) the TN dataset, (b, middle) the ICS dataset, and (c, bottom) the UCS dataset.

Fig. 19. (a) The recalls of different methods against N in Top-N. (b) The F1 scores of different methods against N in Top-N.

Fig. 20. The MAEs of different methods against the number of nearest neighbors, k.

Fig. 21. The performance comparison of different RSs in terms of MAE on the Netflix dataset.

Fig. 23. The precision of different methods against N in Top-N.

Fig. 24. The recall of different methods against N in Top-N.

Table 1. Statistical test of the state-of-the-art methods against the proposed method.

Table 2. Some of the state-of-the-art methods used in comparison with the proposed method.