Recop: Fine-grained Opinions and Collaborative Filtering based Recommender System for Industry 5.0

In the Industry 5.0 framework, user interactions with a product are seamlessly integrated into the product life cycle. A recommender system is an information filtering tool that suggests products, music, friends, topics, etc., to users based on their interests. Several research works have improved recommendation accuracy using matrix factorization, trust-based, hybrid, machine learning, and deep learning techniques. However, to the best of our knowledge, very few existing works have leveraged textual opinions for recommendation. Existing research has focused only on numerical ratings, which do not reflect actual user behaviour. In this research work, the sentiments of textual opinions are analyzed for an in-depth study of users' behaviour. Furthermore, Natural Language Processing techniques such as lemmatization, stemming, stop-word removal, and Part-of-Speech (POS) tagging are applied to the textual opinions. Recommendation accuracy is improved using the proposed Recop score calculated from opinion sentiments, and the sparsity issue is resolved by the proposed approach. Amazon and Yelp review datasets are used for experiment analysis. Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) values improve significantly over existing approaches: MAE and RMSE scores on the Yelp dataset are 1.51 and 0.85, respectively, and on the Amazon dataset 0.66 and 0.93, respectively.


Introduction
In Industry 5.0, smart manufacturing is becoming the norm. Many technologies are grouped across different layers to build a production stack. Many products are customized to the exact specifications given by the customer, and others are modified based on user feedback. Business intelligence is significant for the growth of a company (Tavera Romero et al., 2021). Machine learning-based models improve B2B sales (Wisesa et al., 2020; Romero et al., 2021; Singh & Kapoor, 2021).
Large-scale heterogeneous and unstructured data is generated by social networks, business transactions, and e-commerce sites. It is very difficult for users to extract relevant information from such overloaded and distributed data (Kumar et al., 2021; Rangra & Kapoor, 2021). A recommender system is an information filtering tool that can be used to overcome this information overload (Adomavicius and Tuzhilin 2005; Ghasemi and Momtazi 2021). Users are provided with suggestions based on their likes and dislikes for a product or topic. Content-based, collaborative filtering, and hybrid approaches are the main types of recommender systems. Several researchers have proposed novel approaches for improving product, movie, book, friend, and stock recommender systems. Trust (Guo et al. 2015), social connections, social regularization, and matrix factorization (Koren and Bell 2015) are benchmark improvements proposed for better recommender system accuracy.
The limitation of existing techniques is that only numerical ratings are used. Users provide these numerical ratings based on product or topic relevance. Several standard datasets, such as Epinions (http://www.trustlet.org/epinions.html) and FilmTrust, provide ratings on a 1-5 scale. Numerical ratings do not reflect the actual feedback of users. Opinions and textual reviews contain more valuable information than numerical ratings (Ghasemi and Momtazi, 2021; Ganu et al., 2009). Opinions express the state of users in subjective terms (Bifet and Frank 2010) and can enhance recommendation capabilities (Poria et al., 2016). An ample number of reviews helps recommendation (Ray et al., 2021), and reviews can be used together with ratings to provide better predictions (Lei et al., 2016). With the advent of e-commerce and review hosting sites, textual reviews are gaining a lot of popularity.
Sample review (Yelp): "Affordable and good. Clean restaurant." Rating: 3

A positive opinion is provided in the above sample review, but the accompanying rating is low and does not reflect the user's actual preference. There are several other samples in which the review is good but the rating is only 1-2. Moreover, multi-faceted views can be captured from users' opinions (Chen et al. 2015; Singh et al. 2020), because opinions carry latent information and emotions. In addition, numerical ratings suffer from a sparsity issue, as many users do not provide ratings. These issues can be resolved by using users' opinions expressed in natural text: if textual opinions are converted into a numerical score, better recommendations can be provided. Several research works have tried to improve recommender systems using opinions and reviews (Lei et al. 2016; Wang et al. 2012; Diao et al., 2014; Jakob and Ag 2009; Ling et al. 2014). However, an extensive polarity-based analysis of textual opinions is still required.
Sparsity and cold start are the main limitations of recommender systems. Several researchers have proposed novel approaches to overcome these issues; in particular, opinion mining can overcome both sparsity and cold start (Chen et al., 2015). Reviews address these issues through three techniques: (i) in content-based recommendation, user profiles are created from terms, i.e. frequent keywords used in reviews;
(ii) virtual ratings are filled in for sparse data;
(iii) ratings are improved by using reviews.
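Technique (ii) above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the user-item keys are hypothetical, and the 0-1 review-sentiment scores stand in for the output of a trained sentiment model.

```python
# Technique (ii): fill sparse rating entries with "virtual ratings"
# derived from review sentiment. Sentiment scores here are hypothetical
# placeholders for the output of a trained model.

def fill_virtual_ratings(ratings, review_scores):
    """Replace missing ratings (None) with a virtual rating obtained by
    scaling a 0-1 review-sentiment score onto the 1-5 rating scale."""
    filled = {}
    for (user, item), rating in ratings.items():
        if rating is None and (user, item) in review_scores:
            filled[(user, item)] = round(5 * review_scores[(user, item)], 2)
        else:
            filled[(user, item)] = rating
    return filled

ratings = {("u1", "i1"): 4, ("u1", "i2"): None, ("u2", "i1"): None}
review_scores = {("u1", "i2"): 0.9, ("u2", "i1"): 0.3}  # assumed sentiment scores

print(fill_virtual_ratings(ratings, review_scores))
# {('u1', 'i1'): 4, ('u1', 'i2'): 4.5, ('u2', 'i1'): 1.5}
```

The densified matrix can then be fed to an ordinary collaborative filtering pipeline in place of the sparse original.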
In this research work, a numerical score, Recommendation using opinions (Recop), is proposed based on textual opinions. Furthermore, the Recop score is normalized for review strength. 'Review strength' is the level of opinion expressed by the user, such as strongly like, like, average, dislike, or strongly dislike. Numerical ratings only provide the scale of a user's liking for a product, whereas reviews detail the reasons behind the like or dislike, which is the basis of the high accuracy of the proposed Recop score.
In this research work, the Amazon review dataset is examined thoroughly. Numerical values are chosen for the Recop score because similarity measures are far more accurate on numerical than on categorical attributes, and because the MAE and RMSE evaluation metrics work effectively on numerical scores.
The major contributions of this paper are as follows: (i) a fine-grained analysis of textual opinions is employed instead of numerical ratings; (ii) machine learning is applied to textual reviews to compute a numerical score that reflects actual user behaviour and interest; (iii) sparsity and cold start issues are resolved by the proposed approach; (iv) recommendation accuracy is improved significantly by the proposed approach.
The rest of the paper is structured as follows. Section 2 covers related work on recommender systems and opinion mining. In Section 3, the proposed approach is elaborated. Section 4 focuses on experiment analysis. Finally, Section 5 concludes the paper with future directions.

Related Work
With the expedited deployment of Industry 5.0, collaborative filtering based task recommendation models are being used in crowdsensing networks for trajectory prediction, dwell time estimation, and degree-of-trust calculation.
In (Cheng et al. 2019), a multi-modal aspect topic model is proposed, applying aspect-based techniques to text reviews and images. Weighted ratings are combined with aspect ratings, and the authors conclude that this combination resolves the sparsity issue. Experiment analysis on the Amazon dataset shows that recommendation is improved compared to existing approaches. In another work, the authors explain how aspect-based opinion mining improves recommendation accuracy. Existing studies give equal weightage to different aspects, whereas users in fact weigh aspects differently. In that work, aspect-based opinion mining using deep learning is embedded with collaborative filtering; rating prediction and ranking performance are evaluated using RMSE and precision, respectively. In the proposed MCNN architecture, word embedding and POS tag layers are included to incorporate aspects.
In (Tarus et al. 2018), knowledge-based recommendation using ontologies is explained. Several research works on ontology-based recommendation for e-learning are reviewed in that paper, and it is noted that no comprehensive survey had previously been conducted on ontology-based recommenders for e-learning.
Research surveys had covered only content-based, collaborative filtering, and hybrid recommendation. Future trends in ontology-based recommendation are also presented in that paper. In (Chen et al. 2015), review-enhanced recommender systems are surveyed. It is observed that sparsity is resolved by including reviews in content-based and collaborative filtering-based systems, since the valuable information in reviews can improve recommendation accuracy. Review elements such as frequent items, opinions, review topics, review emotions, feature opinions, and contextual opinions are combined with ratings and the recommender approach to enhance the system's effectiveness. (Musat et al. 2013) proposed review-based collaborative filtering, noting that traditional recommender systems use numerical ratings to find similar users, whose limitation is that aspects and topics are not covered properly. The authors create topic-based user profiles, which enhances accuracy, and correlate reviews with ratings to validate the approach. In another work, the authors consider explicit ratings and implicit opinions to overcome data sparsity. Collaborative filtering relies on a score to find similarities between users, and reviews provide information that can be used effectively for recommendation; in that work, features and polarity are extracted from opinions, and experiment analysis shows better recommendation accuracy. In (Shen et al. 2019), a sentiment score is calculated based on a dictionary. The authors argue that previous works considered review scores together with ratings but computed no sentiment score, which is significant for rating prediction. The proposed approach is implemented on the Amazon dataset.
The NRMSE (Normalized Root Mean Square Error) evaluation metric is used in this work, and experiment analysis shows that the proposed approach outperforms state-of-the-art approaches. In (Sundermann et al. 2019), many research works on opinion mining-based recommender systems are reviewed. Opinions and reviews are found to be relevant for recommender systems. In this review, a well-defined protocol is followed, selecting 17 papers out of 195 from four digital libraries.
In this review paper, recommender systems are categorized into collaborative filtering, hybrid-based, and content-based recommendation. Numerous research works on opinions, aspects, and entities are reviewed. The paper also observes that opinion mining and context-based recommender systems are active recent topics, judging by the many recent publications, and it identifies research gaps that future researchers can fill. Authors in (Lei et al. 2016) propose a sentiment-based rating prediction approach to improve recommendation accuracy, considering sentiment similarity, the inter-sentiment factor, and product reputation. Product features are extracted using LDA, and cosine similarity is used to compute sentiment similarity. In experiment analysis, the Yelp dataset, which contains reviews in various categories, is used; Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) metrics evaluate the proposed approach against existing approaches. The authors demonstrate that sentiment factors improve recommendation accuracy. In (Asani et al. 2021), a context-aware recommender system is proposed for restaurants and food. The authors highlight that existing recommender systems use static information such as food quality, service, etc. Clusters of names are extracted from comments, and the TripAdvisor dataset is used for experiment analysis; the proposed approach outperforms existing approaches with a precision of 92.8%.
The observation from existing research is that review strength is not calculated, although it is required for better recommendation accuracy. This motivates the proposed Recop score, which considers the strength of reviews, captures users' preferences in a fine-grained manner, and provides better recommendation accuracy.

Proposed Approach
The collaborative filtering approach provides good results when sufficient ratings are available, but it cannot provide accurate recommendations on sparse entries (Su and Khoshgoftaar, 2009). Researchers have addressed this issue using various approaches such as trust (Guo et al., 2015), social influence, social regularization, matrix factorization, and reviews. Explicit ratings and implicit opinions address data sparsity, but a fine-grained analysis using the opinions in reviews is still needed. In the proposed approach, this limitation is resolved by using comprehensive textual reviews. The similarity between users is calculated by Equation 1.
$$pcc(u_a, u_b) = \frac{\sum_{i}\left(r_{a,i} - \bar{r}_a\right)\left(r_{b,i} - \bar{r}_b\right)}{\sqrt{\sum_{i}\left(r_{a,i} - \bar{r}_a\right)^2}\,\sqrt{\sum_{i}\left(r_{b,i} - \bar{r}_b\right)^2}} \qquad (1)$$

where $r_{a,i}$ and $r_{b,i}$ are the ratings provided by users $u_a$ and $u_b$ for item $i$, and $\bar{r}_a$ and $\bar{r}_b$ are the average ratings of users $u_a$ and $u_b$ respectively. The main limitation of existing approaches is that they are based on numerical ratings only. Users rate a product on a 1-5 scale, and this numerical rating is the basis of several research works in which rating prediction is computed using collaborative filtering, matrix factorization, social regularization, etc. A user's rating is calculated as the average of the ratings of the nearest users, as formulated in Equation 2.
$$r_{i,j} = \frac{1}{k}\sum_{u=1}^{k} r_{u,j} \qquad (2)$$

where $r_{i,j}$ is the predicted rating of the $i$th user for the $j$th product and $r_{u,j}$ are the ratings of the $k$ nearest users. Several researchers have shown that numerical ratings do not provide a specific description of users' likes or dislikes (Li et al., 2021). A better input to the recommender system is needed, one that reflects user behaviour more precisely. The main limitation of numerical ratings is that if a user likes some aspects of a product and dislikes others, the final rating is a single number that does not reflect the user's actual sentiment. Actual sentiments are reflected in opinions, and sentiment analysis of opinions is one of the fastest evolving research fields (Mäntylä et al., 2018). Textual opinions provide better predictions than numerical ratings (Ganu et al., 2009). In our proposed approach, textual opinions are converted into numerical scores using logistic regression. Figure 1 depicts users providing reviews of products based on their experience with them. The advantage of using reviews is that users give detailed descriptions in textual opinions, including their liking of specific aspects of the product, and recommendation accuracy is improved by employing these opinions.
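Equations 1 and 2 can be illustrated with a small, self-contained sketch. The rating dictionaries and the helper names `pcc` and `predict` are illustrative, not the paper's implementation:

```python
import math

def pcc(ra, rb):
    """Pearson correlation between two users over co-rated items (Eq. 1)."""
    common = [i for i in ra if i in rb]
    if not common:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = (math.sqrt(sum((ra[i] - ma) ** 2 for i in common))
           * math.sqrt(sum((rb[i] - mb) ** 2 for i in common)))
    return num / den if den else 0.0

def predict(ratings, user, item, k=2):
    """Predict r_{user,item} as the mean rating of the k most similar
    users who have rated the item (Eq. 2)."""
    neighbours = sorted(
        (u for u in ratings if u != user and item in ratings[u]),
        key=lambda u: pcc(ratings[user], ratings[u]),
        reverse=True,
    )[:k]
    return sum(ratings[u][item] for u in neighbours) / len(neighbours)

ratings = {
    "a": {"x": 5, "y": 3},
    "b": {"x": 4, "y": 2, "z": 5},
    "c": {"x": 1, "y": 5, "z": 2},
}
print(predict(ratings, "a", "z", k=1))  # neighbour "b" (pcc = 1.0) -> 5.0
```

With `k=2` the dissimilar user "c" is also averaged in and the prediction drops to 3.5, which shows why neighbourhood size matters.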

Fig. 2(a) Workflow of recommendation using numerical ratings
Figures 2(a) and 2(b) highlight the difference between the numerical ratings and review-based approaches. The research methodology is the same for both, except that in the review-based approach numeric scores are computed using logistic regression.

Fig. 2(b) Workflow of recommendation using reviews and ratings
Algorithm 1. Calculation of the proposed Recop score
Input: Textual reviews from users U1...Un, where n is the total number of users
Output: Predicted ratings R1...Rm for users U1...Um, where m is the size of the testing data (20%)
1. Preprocess textual reviews using stemming, lemmatization, tokenization, and POS tagging.
2. Initialize i := 1, j := 0, k := 0, n := count(reviews R).
3. for i := 1 to n do
   (a) Apply logistic regression with the sigmoid function to Ri and compute the Recop score.
   (b) Normalize the score to a 1-5 scale: Recop := 5 * σ(WX + b).
   end for
4. Compute similar users, treating Recop scores as virtual ratings, using the correlation coefficient pcc(Ua, Ub).
5. Calculate the predicted rating r_{i,j} (rating of user i for product j) as the average of the ratings of the k nearest users: r_{i,j} = (1/k) Σ_{u=1}^{k} r_{u,j}.
6. Calculate MAE and RMSE for the proposed approach.
7. Exit.

In Algorithm 1, the calculation of the Recop score is explained step by step. In step 1, the review text is preprocessed using stemming, lemmatization, and POS tagging. The logistic regression score is then calculated and normalized to a 1-5 scale. Next, the correlation between users is computed, which assists in predicting ratings, and MAE and RMSE are obtained from the deviation between actual and predicted ratings. In our proposed approach, logistic regression is applied to textual reviews, and the logistic probability score is calculated from the reviews column of the dataset. Logistic regression is used because it can provide an accurate score for the strength of like or dislike in a textual review. For example, for an 'excellent' review of a product, logistic regression will give a score of around 0.9; for a 'good' review, 0.6; for a 'bad' review, 0.3; and for a 'worst' review, 0.1. Furthermore, logistic regression performs better compared to other techniques (Pranckevičius and Marcinkevičius, 2017).
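Steps 1-3 of Algorithm 1 can be sketched as follows. The weight lexicon below is a hypothetical stand-in for the trained logistic regression weights W (the bias b is omitted); it only illustrates how a weighted sum over review tokens, passed through the sigmoid and multiplied by 5, yields a Recop score on a roughly 0-5 scale:

```python
import math
import re

# Hypothetical token weights standing in for learned logistic-regression
# weights W; in the paper these would come from a model trained on reviews.
WEIGHTS = {"excellent": 2.5, "good": 1.0, "clean": 0.8, "bad": -1.5, "worst": -3.0}

def sigmoid(z):
    """Map the weighted sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def recop_score(review):
    """Steps 1-3 of Algorithm 1 (simplified): tokenize, score with the
    (toy) logistic model, then scale onto the rating range (multiply by 5)."""
    tokens = re.findall(r"[a-z]+", review.lower())   # step 1, simplified
    z = sum(WEIGHTS.get(t, 0.0) for t in tokens)     # W . X, bias omitted
    return 5 * sigmoid(z)                            # steps 3(a)-(b)

for text in ["Excellent and clean", "Affordable and good", "Worst service"]:
    print(text, "->", round(recop_score(text), 2))
```

A strongly positive review lands near 5, a mildly positive one near 3.7, and a strongly negative one near 0, mirroring the 0.9/0.6/0.3/0.1 ordering described above.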

Fig. 3 Schematic diagram of Proposed Approach
The sigmoid function is used in logistic regression so that the probability score lies in the range 0-1; the numerical score is then normalized to a 1-5 scale. When these scores are used to compute user similarity, prediction accuracy improves, as validated by the improved MAE and RMSE values in Section 4. The proposed approach is depicted in Figure 3. The textual review dataset is preprocessed using stemming, lemmatization, and stop-word removal; logistic regression is applied to the preprocessed text to calculate the Recop score; and recommendations are provided to users based on that score. Logistic regression is calculated using Equation 3.
$$\hat{y} = \sigma(WX + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}} \qquad (3)$$

where W is the weight, X is the input, and b is the bias. σ is the activation function used to map the final values into the range [0, 1]; in this research, the sigmoid function is used, as shown in Equation 3.
The Recop score is then converted to a 1-5 scale using Equation 4.

$$Recop_{u,i} = 5 \times \sigma(WX + b) \qquad (4)$$

where the Recop score for a user-item review is calculated by multiplying the logistic regression score by 5. The resulting score reflects the user's emotions about a product or topic more correctly than numerical ratings alone. In the proposed approach, stemming and lemmatization are used to preprocess the reviews, and stop words are then removed. Further NLP steps such as POS tagging and entity recognition are also included to improve review analysis. Unigrams (Barnaghi et al. 2016) and n-grams are used for feature extraction in the proposed approach.
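The unigram and n-gram feature extraction named above can be sketched in plain Python; the stop-word list and function names here are illustrative only, a minimal stand-in for the NLTK-based pipeline:

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) from a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

STOP_WORDS = {"and", "the", "a", "is"}  # tiny illustrative stop-word list

def features(review):
    """Unigram + bigram features after lower-casing and stop-word removal,
    mirroring the preprocessing steps named in the text."""
    tokens = [t for t in review.lower().split() if t not in STOP_WORDS]
    return ngrams(tokens, 1) + ngrams(tokens, 2)

print(features("Affordable and good Clean restaurant"))
```

The bigrams let the model distinguish, for example, "not good" from "good", which unigrams alone cannot.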

Experiment Analysis
This work may become part of the virtual layer in a smart manufacturing setup and give insights relevant to the product and dataset used in a real deployment. In this section, the experiment setup, dataset statistics, and results are discussed. Experiments are conducted on a Google Colab GPU. MAE and RMSE scores are plotted for the proposed and existing approaches to depict the results.

Experiment Setup
The NLTK library is used for stemming, tokenization, lemmatization, and stop-word removal. The Pandas 1.2.3 library is used to read CSV files, and the Sklearn 0.24.2 library is used to implement logistic regression and the sigmoid function. MAE is calculated from the difference between predicted and actual ratings, as shown in Equation 5.
$$MAE = \frac{1}{n}\sum_{i=1}^{n} \left| pr_i - p_i \right| \qquad (5)$$

where pr is the rating predicted by our proposed approach and p is the actual rating in the dataset. RMSE is the square root of the mean of the squared differences between predicted and actual ratings, as shown in Equation 6.

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( pr_i - p_i \right)^2} \qquad (6)$$

Researchers state that RMSE reflects recommendation accuracy better than MAE.
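Equations 5 and 6 can be implemented directly; this is a minimal pure-Python sketch with illustrative helper names and made-up example ratings:

```python
import math

def mae(predicted, actual):
    """Mean Absolute Error (Eq. 5)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root Mean Square Error (Eq. 6)."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

pred = [4.5, 3.0, 2.0, 5.0]  # hypothetical predicted ratings
act  = [5.0, 3.0, 1.0, 4.0]  # hypothetical actual ratings
print(mae(pred, act), rmse(pred, act))  # 0.625 0.75
```

Note that RMSE penalizes large errors more heavily than MAE, which is why RMSE is often preferred when large prediction errors are especially undesirable.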

Dataset Statistics
In this research work, the Yelp and Amazon product review datasets are used because they provide detailed reviews and numerical ratings for products. The Amazon dataset contains 21 columns describing ID, name, reviews, date, brand, etc.; Table 1 describes its metrics in detail. This dataset is selected because both ratings on a 1-5 scale and textual reviews are available, and the textual reviews can be converted into numerical scores so that MAE and RMSE can be calculated effectively. The Yelp dataset is also used for experiment analysis; Yelp is a review site where reviews and ratings are available. Table 2 describes the Yelp dataset in detail: it has 10 columns such as business_id, review date, and reviews; user ratings are on a 1-5 scale; and there are 4174 unique business_id and 6403 unique user_id values.

Results & Discussion
Experiment analysis shows that recommendation accuracy is better when textual opinions are considered instead of numerical ratings, because users' sentiments and preferences are better reflected in textual opinions. Textual opinions are converted into numerical scores because MAE and RMSE are calculated effectively from numerical scores, as shown in Table 3 and Table 4. In Figure 4, the RMSE score on the Yelp dataset is calculated using numerical ratings only as well as the numerical scores derived from reviews; the proposed approach outperforms the traditional calculation using ratings only.

Fig. 4 RMSE scores on Yelp dataset
In Figure 5, the MAE score on the Yelp dataset is calculated using numerical ratings only as well as the numerical scores derived from reviews. Five folds are used during the experiment, and the mean over these folds is also reported. In Figure 6, the RMSE score on the Amazon dataset is calculated using numerical ratings only and the numerical scores derived from reviews, and Figure 7 shows the corresponding MAE scores on the Amazon dataset.

Fig. 7 MAE scores on Amazon dataset
It is validated that RMSE and MAE scores improve significantly when the numerical score is calculated from textual reviews using the proposed approach. This improvement is observed because reviews reflect actual user behaviour more effectively than numerical ratings alone. In experiment analysis, the proposed approach is compared with existing approaches (Yang et al., 2013); an explicit factor model is used to improve prediction accuracy in (Zhang et al., 2014). Experiment analysis proves that the proposed approach outperforms existing works in terms of MAE values. Table 5 lists the MAE scores of the proposed and existing approaches on the Yelp dataset, and Table 6 lists them for the Amazon dataset; the MAE scores on both datasets are depicted in Figure 8. Several researchers have stated that even a slight improvement in MAE reflects a significant improvement in recommendation, and it is clear from the experiment analysis that recommendation accuracy improves significantly with the proposed approach. In Table 7, the RMSE score is calculated on the Amazon dataset; our proposed approach performs better than the baseline approaches, as depicted in Figure 9.

Conclusion and Future Directions
The proposed system can become a subpart of the intelligence layer of the cyber-physical systems to be deployed in Industry 5.0. Traditional recommender systems are based on numerical ratings only. In this research work, it is shown that numerical ratings do not reflect users' actual interest in a product, since several aspects of the product are not captured by numerical ratings alone. In the proposed approach, textual opinions are employed thoroughly, i.e. many aspects of the product are extracted from textual reviews using machine learning, and the sentiments of these reviews are converted to a numerical score. This conversion is efficient because the evaluation metrics MAE and RMSE operate on numerical scores, and the score is normalized to a 1-5 scale to allow precise comparison with existing approaches. The Amazon and Yelp datasets are used for experiment analysis; they are selected because both numerical ratings and comprehensive textual reviews are available. A deep study of these datasets shows that reviews capture users' likes and dislikes in more depth than plain numerical ratings on a 1-5 scale. Experiment analysis proves that the proposed approach outperforms existing approaches in terms of MAE and RMSE: the MAE and RMSE scores on the Yelp dataset are 1.51 and 0.85, respectively, and on the Amazon dataset 0.66 and 0.93, respectively. Moreover, the proposed approach is also compared with the traditional numerical-ratings-based recommendation system, and recommendation accuracy improves significantly compared to state-of-the-art approaches. In the future, Precision, Recall, and F-measure will be evaluated, and recommendation using opinions built on BERT remains an open research area that researchers and practitioners can explore.

Declaration
Funding: NA