BbNCPD: Bayesian belief Network Based Contextual Polarity Disambiguation in Sentiment Analysis

Sentiment analysis has made substantial contributions to many areas of natural language processing, e-commerce and social network analysis. Although several methods have been designed for this task, the major thrust of the field is machine-learning-based polarity classification of user-generated text, reviews and posts. Classification is even harder for text containing multi-polarity words. Several studies have focused on the contextual information of a word in natural language processing, but contextual polarity disambiguation remains challenging because no evidence is available to verify whether the sentiment expressed by the context has been correctly identified. Therefore, this work computes a quantified amount of polarity for terms/words in a given text, based on both their local and global contexts. The proposed work has two major parts. First, it extracts local and global features from semantic dependency parsing and co-occurrences, which associate the word to be disambiguated with previous polarities. Second, it builds a Bayesian belief network over these features, whose conditional probabilities are used to disambiguate and compute sentiments. The performance of the proposed scheme is exhaustively tested on standard datasets, and the results clearly show its efficacy.


Introduction
Studying sentiment analysis and understanding polarity disambiguation have been interesting topics for many researchers. This interest stems in part from the wealth that can be generated by predicting the nature of sentiment and its behavior toward social users. Internet usage has been affecting the emotions and attitudes of society, and the experiences of social users have changed how sentiment is expressed. With the enormous growth in the number of social users, predicting sentiment has therefore become more complicated. People communicate with each other through several forms of communication, and over the past era the stream of internet traffic has grown manifold [1] [2]. With this growth, innumerable users post their sentiments and opinions in a number of ways [3]. In this digital domain, things change within a precise period and become popular and trendy over social media at a particular time t. In the past decade, opinion mining and sentiment analysis have made momentous progress [4] [5], as micro-blogging has grown to become a source of varied kinds of information [1] [3]. Users post real-time communications about their views and opinions on multiple themes and matters, debate recent concerns, criticize, grumble, and express positive, negative and neutral sentiments [6] [7] according to their own understanding and experiences [5]. Twitter is a prominent micro-blogging service for sharing information with other users through instant messages called tweets, represented as {t1, t2, t3, ...}, where U is a user and t represents time. Within this process of sentiment analysis, polarity disambiguation is a serious research problem. In determining the polarity of sentiment words, contextual information plays an important role for sentiment-ambiguous words in a specific context [9] [10]. The required features are extracted from the selected sentiment words.
The procedure of sentiment analysis involves text analytics, text mining, linguistics and conventional language processing. Numerous applications across various domains build on these components: e-commerce, marketing [28], healthcare [29], and tourism and travel [30] are well-known application areas of sentiment analysis.
Owing to these numerous application domains, sentiment traffic is set to burst, as the huge volume of online social users overloads information systems. The situation is further complicated because useful patterns of information, in terms of sentiment words, can remain untouched. A large-scale information system is therefore needed that can handle the huge incoming traffic of text and easily recognize the polarity of sentiment keywords.
Like the well-known word sense disambiguation (WSD) task, the main motive is to resolve the polarity of ambiguous words in a definite context. In this research article we address this problem and propose BbNCPD (Bayesian belief Network based Contextual Polarity Disambiguation), which exploits the contextual information of a word in a tweet for polarity disambiguation in sentiment analysis. A Bayesian belief network is adopted that computes the polarity probability of a given sentiment word in a given context. This computation is based on the posterior distribution over the training corpus. Multiple hypotheses are designed over the features that affect the model, and an independence assumption is made among those features.
To evaluate the efficacy of BbNCPD, experiments have been conducted that demonstrate the usefulness of the features in the sentiment-level context and the effectiveness of the Bayesian model. We also compared the proposed approach with several state-of-the-art methods on various data sets: STS (Stanford Twitter Sentiment corpus), HCR (Health Care Reform) and OMD (Obama-McCain Debate). Thus, to understand contextual polarity disambiguation in these data sets with respect to sentiment keywords, this paper focuses on identifying the effect of context and measuring its influence on sentiment.

Related Work
In previous research on sentiment-ambiguous words, dynamic polarity is extracted and analyzed. Two types of evaluation are indicated: those that include contextual information and those that are free of it. The polarity associated with sentiment words is usually dynamic, drawing on different attributes and contextual information to compare and match sentiment words. It has also been noticed that the sentiment polarity score of a word differs across contexts [31]. In recent years, sentiment analysis and classification have gained abundant attention. Element words such as good, bad, perfect and excellent have a direct impact on sentiment prediction. Numerous studies have addressed sentiment word extraction from data sets and the categorization of contextual word polarity. In [14], the authors predict sentiment polarity and intensity for short financial texts, whereas [13] used Twitter posts to forecast stock market indicators, predicting the next day's market behavior by analyzing the behavior of Twitter users. In [32], the authors determine the contextual meaning of news headlines with their appropriate senses; they also proposed a word sense disambiguation method that predicts the EUR/USD exchange rate via sentiment analysis. A feature-oriented methodology for determining opinion is also given there, with PMI-TFIDF used as a similarity measure to determine contextual information in the sentiment domain.
Furthermore, in [15] the authors performed five tasks to predict the nature of sentiment (positive, negative or neutral); the work also extends to finding the contextual polarity of sentiment with respect to a given topic. Similarly, in [16] contextual polarity is identified and a sentiment score is computed.
In [33], user-generated reviews are considered for experimentation to mine opinions using sentiment analysis. The authors combined sentiment with similarity in content-based recommendation methods to suggest products, and claim the approach is helpful in several domains of Amazon product recommendation. In short text messages, contextual information is hard to determine, especially when a user employs slang or a shortened form of an important sentiment keyword. Therefore, in [17] contextual information in short text is determined using various features with supervised machine learning. In [7], the authors proposed an SVM-based system to explore and predict sentiment scores. Polarity and contextual information are determined in [18], where SentiWordNet is used to understand slang and shortened forms of sentiment words and to determine lexicon polarity with implicit aspect pointers. The accuracy of sentiment analysis and contextual polarity can be improved using word senses; therefore, [34] proposes a WSD-based approach to increase the performance of sentiment analysis. A lexicon-based dictionary is designed to predict and determine word senses for each particular domain. The authors extract data related to product reviews, analyze valuable patterns and features from that text, and determine the polarity of the particular domain, including the words used to express opinion. The polarity of the words is obtained, and WordNet is used to obtain the exact word. The reported accuracy of the approach is 93.2% on an experiment with 2,500 reviews.
Regrettably, however, sentiment word polarity disambiguation has not attracted much research attention, although many authors have developed manual methodologies to extract knowledgeable patterns from collected text and classify contextual polarity. Aspect-based sentiment analysis, on the other hand, has had a good impact; its main motive is to identify the sentiment expressed for every single aspect.
Aspect-based polarity disambiguation is explained in [19] [20], where the sentiment words are of high quality and yield relatively high precision over a set of sentiment features, though recall is slightly low. Similarly, considering aspects, [21] presents a novel aspect-based sentiment analysis methodology that uses CRF and lexical features.
Furthermore, a threefold methodology for sentiment word polarity classification in Twitter text is given in [22]. In the first step, pre-processing is applied to the collected tweets to remove unnecessary information. In the next phase, features are extracted and selected, and in the last step classification is performed with the help of an SVM.
In sentiment analysis we sometimes find that semantic features also play an important role when measuring the polarity of sentiment words. Taking care of semantic information, the authors of [23] explored semantic features in various Twitter datasets; the most important features covered are unigrams, POS and sentiment topics. To determine sentiment polarity, different approaches such as replacement, augmentation and interpolation for Twitter sentiment classification are considered. To determine polarity at the phrase level, an experiment is implemented that helps disambiguate the polarity of polar language [24].
From the literature it is thus very clear that most research methods are based on disambiguating a single sentiment word (usually a noun). In our proposed research work, we therefore consider a Bayesian belief model for sentiment polarity disambiguation. We also introduce a trust factor based on the sentiment score, which has not yet been touched on or discussed in previous research.

Problem Formulation and Proposed Methodology
In this section we describe the problem formulation for sentiment classification and contextual polarity disambiguation. The design methodology is also introduced here, providing refined information for extracting the exact sentiment polarity score.

Problem Definition
In this section we define the problem statement in the domain of sentiment analysis and polarity disambiguation over various corpora. A major challenge in social text is noise: such text includes unwanted content, slang, emoticons and shortened forms of words. Therefore, the first motive of our research is to understand the nature of the fields available in the corpus. For this purpose, we pre-process the text to remove unwanted information; the normalized text acts as input for the perceived sentiments. Next, we discuss the proposed methodology with the essential terminology. The main motive is to design a plan for a real-time system, so we designed a methodology to pre-process the text. We carried out our experiment on the basis of this designed methodology, analyzing text generated and posted by numerous users expressing their views as sentiments across different social media environments.
Further, in the current scenario, contextual information has a great impact on sentiment polarity disambiguation.
Thus, a Bayesian belief network is used for analyzing the polarity of sentiment in the different corpora. A Bayesian network is a mechanism for representing and reasoning with uncertain beliefs. It contains two parts: a qualitative component (usually a directed acyclic graph) and a quantitative component (usually conditional probabilities). Using a Bayesian network, features can be represented in little space and probabilistic inference among them can be computed in an adequate amount of time. This computation helps to analyze the contextual information of sentiment words along with their polarity. The graphical representation is generated for all sentiment words using the TurboParser, and its graphical nature provides a more intuitive hold on the relationships between the features.
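As a minimal illustration of the two components, the sketch below encodes a toy network for polarity. The structure, the feature names (modifier and negation presence) and all probability values are illustrative assumptions, not the paper's learned parameters:

```python
# Toy Bayesian belief network: Polarity -> {ModifierPresent, NegationPresent}.
# Qualitative part: the DAG (parents); quantitative part: the CPTs.

# Prior over the polarity of an ambiguous sentiment word (assumed values).
p_polarity = {"pos": 0.6, "neg": 0.4}

# CPTs: P(feature is present | polarity), assumed for illustration.
p_modifier = {"pos": 0.7, "neg": 0.3}
p_negation = {"pos": 0.1, "neg": 0.5}

def posterior(modifier: bool, negation: bool) -> dict:
    """P(polarity | evidence) via the BN factorization and Bayes' rule."""
    joint = {}
    for s in p_polarity:
        pm = p_modifier[s] if modifier else 1 - p_modifier[s]
        pn = p_negation[s] if negation else 1 - p_negation[s]
        joint[s] = p_polarity[s] * pm * pn
    z = sum(joint.values())
    return {s: v / z for s, v in joint.items()}

print(posterior(modifier=True, negation=False))
```

Observing a modifier without a negation pushes the posterior toward the positive class under these toy numbers, which is the kind of contextual evidence the network is meant to exploit.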
In previous sentiment classification tasks, contextual information is missing, due to which many important keywords associated with sentiment can be ignored; a word related to a sentiment may occur in another form, and to date this problem has not been addressed. Our proposed approach therefore also includes contextual information for sentiment words. We have also defined a trust factor based on the sentiment score, which will guide social users who are seeking information. Through this we achieve an effective and relatively compressed input of text and an effective filtering technique capable of handling the long-term dependency of contextual sentiment polarity.
In this research work, we assume that the designed approach is appropriate for various real-world situations and provides a better way to perform sentiment analysis and polarity disambiguation. The approach is interesting and relevant for short text messages on Twitter and for any other short-message blogs.

Designing Methodology
Sentiment analysis in social text corpora is very popular. Sentiment keywords are extracted from the corpus, which is normalized before further processing, since the corpus text is noisy and contains various unwanted symbols, abbreviations and shortened forms. Text normalization is therefore applied to clean the corpus; for this purpose we use the methodology described in [9]. For classification and for identifying the polarity of sentiment words, a Bayesian belief network is implemented, as it is quite challenging to analyze and combine the features of the text available in the corpus. To handle this challenge, contextual information is attached to polarity to build up information relative to the text corpus, which has already been classified and normalized so that sentiment polarity scores can be assigned accordingly. To make this concept understandable, we describe a few definitions that provide the necessary background. Polarity disambiguation of sentiment words in an explicit context is a generic issue. Consider a sentiment-ambiguous word in corpus C and a certain context, commonly a sentence. Word polarity disambiguation then tries to predict a polarity score; if a probabilistic model is considered, it is computed as in equation 1, in which the polarity can be positive or negative.
Step 1: Apply the learning algorithm to (w1, w2, w3, ...), where each w in C is a sentiment keyword with its features, and the learned parameter helps distribute the sentiment keyword features into different boundaries according to the polarity score.
Step 2-Step 3: Calculate the threshold for every feature in the set of all sentiment features.
Step 4: Similarly, compute the threshold for all other sentiment features over the different corpora of words, where each threshold gives the value for one feature of a sentiment keyword and the features are then combined together.
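The threshold steps above can be sketched as follows. The feature scores and the choice of the mean as the threshold rule are assumptions made for illustration, since the text does not fix a specific rule:

```python
# Sketch of Steps 3-4: a per-feature threshold computed from training
# scores, later used to split keywords into polarity boundaries.
training_scores = {
    "modifier": [0.2, 0.6, 0.8, 0.4],   # assumed feature scores
    "negation": [0.1, 0.3, 0.5],
}

def thresholds(scores_by_feature):
    """One simple threshold rule: the mean score of each feature."""
    return {f: sum(v) / len(v) for f, v in scores_by_feature.items()}

theta = thresholds(training_scores)
print(theta)
```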

Text pre-processing
Pre-processing is essential to eliminate undesirable text from the corpus. It helps to reduce noise, effusive words and slang, which need to be converted into genuine words. Generally, normalization removes stop-list words [8], tags (# tags and @ tags) and URLs so as to discover meaningful patterns in the collected text stream. We use the methodology proposed in [11] for text pre-processing, which is effective in taming noisy text, including emoticons, misspelled or incorrect words, folksonomies and slang. This normalization approach observes the outcome of pre-processing on the Twitter text stream for the fortification of event detection and classification, mainly in terms of actual event words [12] [11]. The result is a collection of refined, pre-processed tweets belonging to the text corpus C, compared against a WordNet dictionary.
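The paper follows the normalization methodology of [11]; the snippet below is only a minimal regex-based sketch of the removal steps named above (the tiny stop list and the patterns are assumptions):

```python
import re

STOP_WORDS = {"the", "is", "a", "an", "to", "for"}  # tiny assumed stop list

def normalize(tweet: str) -> list:
    """Remove URLs, strip @/# markers, lowercase, and drop stop words."""
    tweet = re.sub(r"https?://\S+", " ", tweet)   # URLs
    tweet = re.sub(r"[@#](\w+)", r"\1", tweet)    # keep tag text, drop marker
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(normalize("The movie is #awesome http://t.co/x @bob loved it"))
# ['movie', 'awesome', 'bob', 'loved', 'it']
```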

3: while (not end of text corpus):
      compute the indicator dataset of the text corpus corresponding to i
      i ← i + 1
      obtain the next keyword w from corpus C
      apply step 2 to the collection of refined pre-processed tweets, which belong to the Twitter text stream S, for the newly arriving tweets

4: for each cluster ...

The lexical variations in social media and users' language affect the whole representation of the corpus. Text representation plays a major role in the fundamental task of sentiment analysis: analyzing a text fragment and the social users' attitude toward the sentiment topic. The large number of words, often irregular, that the computational techniques must handle creates a problem of data sparsity. To give close attention to this problem, we examine the three text corpora (STS, HCR, OMD) in terms of vocabulary growth. Figure 1 shows the vocabulary growth rate of the three corpora. From the graph it is clear that the vocabulary of the STS data set grows smoothly, whereas the vocabulary of the OMD corpus first moves sharply upward and then becomes smooth. Vocabulary is usually measured by the total number of words; in our experiment we calculate it from the total number of sentiment words extracted after pre-processing.
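Vocabulary growth as plotted in Figure 1 can be measured as the number of distinct words seen after each tweet; the tiny corpus below is a stand-in for the pre-processed sentiment words:

```python
def vocabulary_growth(token_streams):
    """Cumulative count of distinct words after each document."""
    seen, growth = set(), []
    for tokens in token_streams:
        seen.update(tokens)
        growth.append(len(seen))
    return growth

corpus = [["good", "great"], ["good", "bad"], ["awful", "bad", "sad"]]
print(vocabulary_growth(corpus))  # [2, 3, 5]
```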

Proposed Methodology
The major objective of the proposed methodology is to use Bayesian network learning, as these networks are graphical models that not only encode relationships among the features but also prove an efficient way to obtain a quantitative measure of those relationships. A Bayesian network deals with uncertainty more precisely than simple Bayesian learning (Eq. 3). Another main reason is its joint probability distribution, which is capable of storing contextual information.
Simply, Bayes' rule is represented as eq. 3, whereas the Bayesian network representation is given in eq. 4:

P(A | B) = P(B | A) P(A) / P(B)    (3)

P(x1, x2, ..., xn) = ∏ P(xi | Parents(xi))    (4)

Here w' is any word in the sub-tree of word w, and S is any sub-tree in the dependency graph.
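To make the distinction between the two factorizations concrete, the snippet below contrasts them on two binary features; all probability values are assumed toy numbers, not learned parameters:

```python
# Eq. 3 style (simple Bayesian learning): features independent given s.
# Eq. 4 style (belief network): x2 may additionally depend on x1.

p_s = 0.6                        # P(s = positive), assumed
p_x1_s = 0.7                     # P(x1 = 1 | s)
p_x2_s = 0.2                     # independence:  P(x2 = 1 | s)
p_x2_s_x1 = {0: 0.1, 1: 0.4}     # network: P(x2 = 1 | s, x1)

naive_joint = p_s * p_x1_s * p_x2_s            # P(s, x1=1, x2=1)
network_joint = p_s * p_x1_s * p_x2_s_x1[1]    # x1 is a parent of x2

print(naive_joint, network_joint)
```

The extra parent lets the network store how the features interact, which is exactly the contextual information the independence assumption discards.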

Output:
SentiValue for all the sentiment keywords from corpus C.

Begin
Process each word w in the dependency graph from the lowest level to the highest level.

END
In the above-mentioned steps, a sub-tree is generated that can be further used to compute the SentiValue, defined below. Conversion of the context representation is necessary to classify the full context in the dependency tree, and the sentiment value is then computed in that context. Thus, in sentiment analysis and polarity disambiguation, treebanks play a vital role in evaluating the dependency of sentiment words using a dependency parser. Dependency structures can be created directly for a corpus or by automatic parsing. Usually, treebanks are created for morphologically rich languages, whereas English dependency treebanks have mainly been extracted from existing resources such as the Wall Street Journal sections of the Penn Treebank described in [27]. Most dependency treebank construction consists of two phases: identifying all the dependent relations in the structure and categorizing the accurate dependency relation for each one. Information extracted from dependency-based analysis can be used directly for sentiment classification based on semantic relationships. Treebanks for sentiment classification are based on graph theory and use maximum spanning tree methods to generate the dependency structure. Besides providing the data required to train a sentiment classification system, most dependency treebanks are generated either by human annotators or by programmed transformation from phrase-structure treebanks.
Therefore, in our proposed approach the first step is to learn the network, which helps to understand the conditional probabilities within the text. It also provides the embedded contexts used to classify words. A dependency parsing method is used to identify the dependencies among the contextual sentiment words [25]. Directed graphs are generated to analyze the dependency structure as G = (V, A), which comprises a set of vertices V and a set of pairs of vertices A, referred to as arcs. In this dependency formalism, the set of vertices V corresponds precisely to the words present in the sentence, whereas the arcs A capture the head-dependent and linguistic-function relations among the elements of V. There is a single designated source node with no incoming arcs, every other vertex has exactly one incoming arc, and there is a unique route from the source node to every vertex in V [48], [26]. It is thus clear that in a sentiment word tree every single word has a single head with which its dependency is associated, and a unique source node exists from which an exclusive directed route leads to every single word in the sentence. Given this structure of the dependency graph, a word-combination treebank is generated, which plays an acute part in the development and assessment of dependency parsers.
A deterministic process is used to convert a currently existing constituency-based treebank into dependency trees. Head rules are used for the whole conversion process, which depends on analyzing the head-dependency associations in the structure and classifying the precise dependency relations [26].
From the treebank, we can identify the dependency of sentiment with respect to contextual polarity, compute the SentiValue score, and in the next step disambiguate the polarity.
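A minimal bottom-up pass over a dependency sub-tree might look like the sketch below. The toy parse, the base scores, and the flip-on-negation rule are illustrative assumptions, not the paper's learned model:

```python
# Toy dependency tree for "not good movie": children map head -> dependents.
children = {"good": ["not"], "movie": ["good"]}   # assumed parse
base = {"not": 0.0, "good": 0.8, "movie": 0.0}    # assumed prior scores

def senti_value(word):
    """Combine a word's score with its sub-tree, lowest level first."""
    score = base[word]
    for child in children.get(word, []):
        child_score = senti_value(child)
        if child == "not":          # a negation child flips the head's sign
            score = -score
        else:
            score += child_score
    return score

print(senti_value("movie"))  # -0.8 under these toy assumptions
```

Processing sub-trees from the lowest level to the highest lets local context (here, the negation) influence the score propagated to the root.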

Sentiment polarity disambiguation
In text analytics, a word having several alternative linguistic readings is known as an ambiguous word. Thus, in sentiment analysis, sentiment words are ambiguous if they admit numerous alternative linguistic structures. Consider the example "I park her car in block", which can be represented with different forms and meanings and is ambiguous at several levels. Sentences with different meanings can be written as:
i. "I blocked the traffic on the road for her."
ii. "I stopped playing and turned her into the residence block."
iii. "I parked her car in the lawn."
In this example, the different senses are affected by a number of ambiguities. In terms of part-of-speech (POS), the words "park" and "her" are morphologically or syntactically ambiguous: "park" can be a verb or a noun, whereas "her" can be a dative pronoun or a possessive pronoun. Secondly, the word "block" is syntactically ambiguous in various ways: it can be transitive, i.e., taking a single direct object (i), or ditransitive, i.e., taking two objects (ii), meaning that the first object (her) got made into the second object (block). Likewise, "park" can take a direct object (iii), meaning that the object (her) got caused to undergo the verbal action (park). There is thus an even deeper kind of ambiguity, where one sentence could be read as another. Resolving ambiguity is therefore most important in sentiment analysis. Word sense disambiguation and part-of-speech (POS) tagging play important roles and can be addressed by various approaches; an extensive variety of tasks can be framed as lexical disambiguation problems.
Words used to evaluate sentiment polarity are extracted from the text corpus. First, we categorize the sentiment words over all occurrences of sentiment words present in the corpus. We then remove those words that are judged to carry a distinctive polarity in all analyses; after removing them, it is easy to find the polarity of the remaining words in the corpus [9]. A number of words occur many times across the different domains of the data set; these are mutual keywords and are clear enough. For polarity disambiguation we use the following equations, in which h is the opinion holder, t is the opinion target, w is the sentiment word (to be disambiguated in terms of polarity), m is a modifying word, n is a negation word, and i is an indicative word.
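As a hedged sketch of how these intra-opinion features could score the two polarities of an ambiguous word w, the snippet below applies the network's conditional probabilities under an independence assumption; all probability values are assumed toy numbers, not the paper's learned parameters:

```python
# Evidence about ambiguous word w: which intra-opinion features
# (opinion holder h, target t, modifier m, negation n, indicative
# word i) are observed in its context (1 = present, 0 = absent).
evidence = {"h": 1, "t": 1, "m": 1, "n": 0, "i": 1}

# Assumed P(feature is present | polarity of w).
cpt = {
    "pos": {"h": 0.6, "t": 0.7, "m": 0.8, "n": 0.2, "i": 0.7},
    "neg": {"h": 0.6, "t": 0.7, "m": 0.3, "n": 0.6, "i": 0.4},
}
prior = {"pos": 0.5, "neg": 0.5}

def disambiguate(evidence):
    """Pick the polarity with the highest (unnormalized) posterior."""
    scores = {}
    for s in prior:
        p = prior[s]
        for f, v in evidence.items():
            p *= cpt[s][f] if v else 1 - cpt[s][f]
        scores[s] = p
    return max(scores, key=scores.get)

print(disambiguate(evidence))  # 'pos' under these assumed values
```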

Experimental Study and Training Corpus
The proposed methodology is evaluated through experiments designed to reveal its effectiveness and efficiency.
For this purpose, we compare the proposed approach with three state-of-the-art approaches on three different data sets. In this section we discuss the existing feature-based approaches, namely the unigram feature, the POS feature and the sentiment-topic feature. We also describe the data sets used in this research: STS (Stanford Twitter Sentiment), HCR (Health Care Reform) and OMD (Obama-McCain Debate).
• Unigram feature: Due to its simplicity and ease of use, the unigram feature is widely used for sentiment classification. In [5], a model is trained using word unigrams and applied with various classifiers, and it is shown to perform well by a decent margin of 20%. The results section describes the various data sets used in our research work for classifier training.
• Bigram feature: The bigram model approximates the probability of a word given all the previous words by conditioning only on the immediately preceding word.
• Part-of-speech features: POS features are common features used widely in the literature for sentiment classification on Twitter; they are extracted by a tagger trained specifically on tweets.
We use a combined version of the unigram and POS features as a baseline model. A treebank is used for extracting word combinations and for identifying the dependencies among sentiment words; it helps to identify many kinds of out-of-vocabulary (OOV) words, emoticons and abbreviations.
• Sentiment-topic features: These features are extracted from tweets to find the actual topic related to each tweet. Each word in a tweet is labeled with a sentiment label and the sentiment topic with which it is related. According to the sentiment topic, the dataset can be divided into positive and negative features. The main objective is to collect similar sentiment topics in the same cluster to minimize data sparsity, which helps to improve sentiment classification and sentiment polarity disambiguation as well as accuracy.
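The n-gram features in the baselines above can be extracted with a short helper; the sample tokens are a stand-in for a pre-processed tweet:

```python
def ngrams(tokens, n):
    """Extract n-gram features from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = ["service", "was", "not", "good"]
print(ngrams(tokens, 1))  # unigram features
print(ngrams(tokens, 2))  # bigram features, e.g. ('not', 'good')
```

Note how the bigram ('not', 'good') retains local context that the unigram features lose, which is why the combined models in [40] [21] [24] tend to do better.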

Data Acquisition:
In a text corpus, extraction of NEs (named entities) from the collected tweets is an important step. We chose various text corpora for sentiment analysis and contextual polarity disambiguation. In our research work, we process the tweets as described above in the pre-processing task, which allows the extraction of various fields and entities from the text corpus along with their semantic information. Including this semantic information, we designed the data sets used for further processing in this research task.
Furthermore, in the next phase we construct a collection of sentiment annotations that provides information on all the attributes of the text corpus. Sometimes the sentiment keywords give clear information from which we can classify them into the positive, negative or neutral class, but it becomes difficult when complex or unidentified keywords are found during sentiment classification. To deal with such situations, we categorize those keywords into mixed and other categories. The data sets used in our research are described in the following section, which provides basic information including the training and test sets.
• Stanford Twitter Sentiment Test Set (STS-Test): Described by Go et al. [35], it is divided into test and training sets. In the STS training set, a total of 1.6 million tweets are automatically labeled as positive or negative based on emoticons. In this corpus, emoticons are also addressed and categorized into positive and negative sentiment classes. In our experimentation, we focus on the corpora that have been manually annotated. The STS-Test corpus is manually annotated and includes 182 positive, 177 negative and 139 neutral tweets. Even though STS-Test is quite small, it has been used extensively by various authors for multiple evaluation tasks; for this reason we also adopt it in our experimentation to evaluate polarity classification.
• The remaining data sets were used in [39] for polarity disambiguation, which is why we also use them for sentiment polarity disambiguation.

Evaluation Metrics
To evaluate the proposed method, our main aim is to compare its performance against the various existing features. In our comparison we include the unigram, POS and sentiment-topic features and compute the relevance of the proposed approach.
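The comparison rests on classification accuracy over the labeled tweets; a minimal version of the metric is sketched below with stand-in labels:

```python
def accuracy(gold, predicted):
    """Fraction of tweets whose predicted polarity matches the gold label."""
    assert len(gold) == len(predicted)
    hits = sum(g == p for g, p in zip(gold, predicted))
    return hits / len(gold)

gold = ["pos", "neg", "neu", "pos"]
pred = ["pos", "neg", "pos", "pos"]
print(accuracy(gold, pred))  # 0.75
```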

Results and discussions
In this section, Table 3 presents the comparison of the proposed approach with the existing features; we find that our proposed methodology gives better results.

Accuracy of Proposed approach BbNCPD
The comparative analysis of the results obtained with the proposed approach on the various data sets described above, against the various existing approaches, is shown in figure 3. In previous research works, authors have mainly used machine learning algorithms to analyze the polarity of sentiment keywords. From the literature we find that authors have also used SVM-based methods for contextual polarity disambiguation. Other methods compute contextual polarity using n-gram approaches of unigrams, bigrams, and the combination of unigrams and bigrams [40] [21] [24], and show that such combinations provide better efficiency. A Bayesian model is also adopted in [46] for sentiment polarity disambiguation, and the accuracy obtained in terms of polarity disambiguation is better in various respects. In our research task we use several feature-based models, whose accuracy is shown in figure 3; the figure clearly indicates a sharp decrease in accuracy as the number of sentiment keywords increases. As a result, we can say that the proposed Bayesian-belief-based contextual polarity disambiguation approach is considerably better than the existing approaches.
From this accuracy we observe that when accuracy is higher and the experiment is able to handle contextual polarity, the approach is trustable in terms of the trust factor Tf; in social media sentiment analysis, online social users will trust those sentiments that have high accuracy and a high trust factor. Therefore, we can conclude that the proposed approach is both accurate and trustworthy.

Conclusion & Future Scope
We proposed Bayesian belief Network based Contextual Polarity Disambiguation (BbNCPD) to resolve polarity ambiguity. Specifically, we investigate intra-opinion features and their relationships with each other; such features include the opinion target, modifying words and indicative words, and rely on cues such as correlative words in the sentence, discourse and application. We adopt a Bayesian network model to deal with the word polarity disambiguation task in a probabilistic manner. The parameters and structure of this network are learned from the contextual dependency of the features on each other; this is done using semantic parsing together with our proposed learning algorithm. The utility of this scheme is twofold: many useful priors can be computed beforehand (from the corpus), which speeds up the overall process, and dynamic context encoding is performed based on the local features associated with the ambiguous word. Experiments on the various text corpora show that BbNCPD can make a significant contribution to word polarity disambiguation across different domains on the basis of the trust factor. The performance of this scheme under multi-lingual training data remains to be seen.