Islamophobia has become a widely discussed topic and poses a challenge to people around the world. It affects many Muslims, particularly those who live as minorities in specific locations. Western intellectuals coined the term "Islamophobia" to denote anti-Islamic sentiment and prejudice [12]. It is a shorthand term for fear or mockery of the Islamic religion, as well as hostility toward most or all Muslims [13].
Social media is a tool for interacting and communicating online over the internet [14]–[16]. It is frequently used to form social relationships with others and to share personal activities or real-life experiences. One of the factors fueling the growth of social media is its high mobility and ease of use. The community of social media users produces data in real time, in a wide range of unstructured formats and languages, expressing thoughts and attitudes [17]. The availability of such vast, unstructured data is the concern of text mining, one of the topics in data mining, which provides an organized approach to analyzing it.
Social media platforms are very diverse in type, which allows people to choose the community they want. One such platform is Twitter, which lets its users interpret, convey, and share posts of up to 280 characters, better known as tweets. The platform is accessible via mobile devices, instant messaging, and a website interface, and has 326 million monthly active users [18]. By attaching hashtags, users can share information very quickly and make it easy to find for those searching for it. Twitter is regarded as a fast social network in terms of information exchange because of its ease of use and high mobility [19], [20].
The pre-processing stage prepares the data before it is processed by specialized algorithms to be categorized, classified, or visualized as required. This stage is critical because it determines the quality of the data to be used. Some of the processes are governed by rules that the researchers specify. Pre-processing helps to avoid imperfect, noisy, and inconsistent data. It is especially important in sentiment analysis of social media, where informal and unstructured words or sentences abound, along with a lot of noise [21]–[23].
The pre-processing stages most often carried out are case folding, punctuation removal, tokenizing, and stop word removal. Case folding converts all text to the same letter case, typically lowercase. Punctuation removal strips punctuation marks, numbers, links, and similar elements. Conversation data often contains punctuation marks such as periods (.) and commas (,), as well as links; these are not needed for the analysis, so they are removed. Tokenizing divides sentences into words and forms word vectors. Stop word removal eliminates irrelevant words, reducing the repetition of words that do not contribute to the opinion being expressed [24].
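The four stages described above can be illustrated with a minimal Python sketch that uses only the standard library. It is not the implementation used in this study: the function names, the URL pattern, and the very short English stop word list are assumptions made for illustration, and a real pipeline would use a fuller stop word list suited to the language of the tweets.

```python
import re
import string
from typing import Iterable, List

# Hypothetical, deliberately tiny stop word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "or", "to", "of", "in", "it", "this"}

# Assumed pattern for links; it matches http(s) URLs and "www." addresses.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

# Translation table that deletes ASCII punctuation (including '#' and '@').
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def case_fold(text: str) -> str:
    """Case folding: convert all letters to lowercase."""
    return text.lower()


def remove_punctuation(text: str) -> str:
    """Remove links, numbers, and ASCII punctuation marks."""
    text = URL_PATTERN.sub(" ", text)   # drop links before stripping punctuation
    text = re.sub(r"\d+", " ", text)    # drop numbers
    return text.translate(PUNCT_TABLE)  # drop punctuation marks


def tokenize(text: str) -> List[str]:
    """Tokenizing: split the text into words on whitespace."""
    return text.split()


def remove_stop_words(tokens: Iterable[str]) -> List[str]:
    """Stop word removal: keep only tokens that are not in STOP_WORDS."""
    return [token for token in tokens if token not in STOP_WORDS]


def preprocess(text: str) -> List[str]:
    """Apply the four stages in the order given in the text."""
    return remove_stop_words(tokenize(remove_punctuation(case_fold(text))))


if __name__ == "__main__":
    tweet = "This is a Tweet, with 2 links: https://example.com and #Hashtag!"
    print(preprocess(tweet))  # ['tweet', 'with', 'links', 'hashtag']
```

Links are removed before punctuation so that a URL is dropped as a whole rather than being left behind as fragments. Whitespace splitting is the simplest possible tokenizer; it is used here only to keep the sketch self-contained.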