After data preprocessing, we applied deep learning and several transfer learning models to our dataset. First, we tried multiple conventional algorithms such as NB, SVC, KNN, ETC, RF, LR, BgC, XGB, AdaBoost, and GBDT. Because their performance fell short, we developed a hybrid bidirectional LSTM model, which is our proposed model. A step-by-step breakdown of the model is given below:
5.1 Embedding Layer (embedding 5)
This Embedding layer takes input sequences of length 85 and maps each word to a 100-dimensional vector. Its 1,063,400 trainable parameters are fine-tuned during training to best capture semantic relationships between words.
5.2 Bidirectional LSTM Layer (bidirectional 10)
This Bidirectional LSTM layer handles sequences of length 85, using 512 units in each direction for a 1,024-dimensional output at each time step. Its 2,510,848 trainable parameters are adjusted during training, allowing it to capture intricate patterns in sequential data.
5.3 Bidirectional LSTM Layer (bidirectional 11)
This second Bidirectional LSTM layer also processes sequences. With an output shape of (None, 512), it uses 256 units per direction, bi-directionality doubling the output width. Its 2,623,488 trainable parameters are fine-tuned during training, helping the network capture temporal patterns and deepen its understanding of sequential data.
5.4 Dense Layer (dense 10)
This layer, a Dense layer with 64 units, transforms and consolidates features with an output shape of (None, 64). It contains 32,832 trainable parameters, adjusted during training for optimal predictive performance.
5.5 Dense Layer (dense 11)
This Dense layer has an output shape of (None, 1), producing the single value used for binary classification. Its 65 trainable parameters are fine-tuned during training, giving the network its final decision output.
5.6 Total Parameters
Total trainable parameters in the entire network: 6,230,633 (23.77 MB). All parameters are trainable, so the entire network is updated during training.
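The per-layer counts above can be reproduced arithmetically. The sketch below (plain Python; the vocabulary size of 10,634 is inferred from the embedding parameter count and is an assumption, not stated in the text) checks each figure:

```python
# Sanity-check of the per-layer trainable-parameter counts quoted above.
# Vocabulary size 10,634 is inferred from 1,063,400 embedding params / 100 dims.

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * (units * (input_dim + units) + units)

embedding = 10_634 * 100                   # 1,063,400 (embedding_5)
bilstm_1  = 2 * lstm_params(100, 512)      # 2,510,848 (bidirectional_10)
bilstm_2  = 2 * lstm_params(2 * 512, 256)  # 2,623,488 (bidirectional_11)
dense_1   = (2 * 256) * 64 + 64            # 32,832    (dense_10)
dense_2   = 64 * 1 + 1                     # 65        (dense_11)

total = embedding + bilstm_1 + bilstm_2 + dense_1 + dense_2
print(f"{total:,}")  # 6,230,633
```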
5.7 Non-trainable Parameters
None. All parameters in this network are trainable; there are no fixed or non-trainable parameters.
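Assembling the layers described in Sections 5.1 through 5.5 yields a model along the following lines. This is a reconstruction for illustration, not the exact original code: the vocabulary size is inferred from the embedding parameter count, and the ReLU and sigmoid activations are assumptions.

```python
# Hedged Keras reconstruction of the architecture described above.
# VOCAB_SIZE is inferred (1,063,400 embedding params / 100 dims);
# the activation functions are assumptions, not stated in the text.
from tensorflow.keras import layers, models

VOCAB_SIZE = 10_634
SEQ_LEN = 85

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN,)),
    layers.Embedding(VOCAB_SIZE, 100),                              # embedding_5
    layers.Bidirectional(layers.LSTM(512, return_sequences=True)),  # bidirectional_10
    layers.Bidirectional(layers.LSTM(256)),                         # bidirectional_11
    layers.Dense(64, activation="relu"),                            # dense_10
    layers.Dense(1, activation="sigmoid"),                          # dense_11
])
print(model.count_params())  # 6230633
```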
The neural network model presented has a structured architecture tailored for sequential data processing, as arises in natural language or time-series analysis. Its key components, the Embedding and Bidirectional LSTM layers, efficiently capture intricate patterns within input sequences, while the Dense layers consolidate features and produce predictions. The parameters, distributed across the layers, give the model a balanced complexity that lets it learn and generalize effectively during training. Figure 2 above illustrates the architecture, which is well suited to tasks requiring an understanding of sequential dependencies, such as the binary classification indicated by the final Dense layer's single output.
In contrast to the traditional one-directional LSTM, the BiLSTM comprises two separate LSTM structures that perform feature learning on the input sequence in both forward and reverse order. This lets the model be trained both from input to output and from output to input, which effectively strengthens the dependencies the model can exploit and improves its predictive accuracy [21]. Figure 3 presents a more precise depiction of the LSTM model. An LSTM cell is composed of three gates: an input gate, a forget gate, and an output gate, denoted at time t as i_t, f_t, and o_t, respectively.
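For reference, in the standard LSTM formulation (common notation, not tied to a specific source) the gate activations and state updates at time $t$ are:

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Here $\sigma$ is the sigmoid function, $\odot$ is element-wise multiplication, and $c_t$, $h_t$ are the cell and hidden states.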
Bidirectional LSTM networks work by presenting each training sequence both forwards and backwards to two independent LSTM networks connected to the same output layer [22]. This means that for every point in a given sequence, the Bi-LSTM holds complete, sequential information about all points before and after it. Figure 4 illustrates the architecture of the Bi-LSTM model.
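The pairing of forward and backward states can be illustrated with a toy sketch (plain Python, not the actual implementation): each output position concatenates the forward state at that position with the realigned backward state, so every output depends on the whole sequence.

```python
# Toy illustration of bidirectional sequence processing: run a step function
# left-to-right and right-to-left, then concatenate states per position.

def bidirectional(sequence, step, units):
    """Apply `step` forwards and backwards; concatenate states at each position."""
    fwd, h = [], [0.0] * units
    for x in sequence:                 # forward pass: left to right
        h = step(x, h)
        fwd.append(h)
    bwd, h = [], [0.0] * units
    for x in reversed(sequence):       # backward pass: right to left
        h = step(x, h)
        bwd.append(h)
    bwd.reverse()                      # realign backward states with positions
    return [f + b for f, b in zip(fwd, bwd)]   # output width = 2 * units

# A stand-in "cell": any function mapping (input, state) -> new state.
toy_step = lambda x, h: [x + s for s in h]

out = bidirectional([1.0, 2.0, 3.0], toy_step, units=2)
print(len(out), len(out[0]))  # 3 4  (3 timesteps, each of width 2 * 2)
```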
To predict the amount of energy used by the residential and commercial sectors, a hybrid model called CNN-BiLSTM, combining a CNN, a BiLSTM, and a connection layer, has been presented. The input first enters the CNN layer, where convolution and max pooling are performed to create a new feature matrix. The BiLSTM then takes this feature matrix as its input and extracts the hidden output. The hidden output is passed to the connection layer, a linear layer, which produces the final prediction results.
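A minimal Keras sketch of that CNN-BiLSTM pipeline might look as follows. The sequence length, filter count, and unit sizes here are illustrative assumptions, not values from the cited work:

```python
# Hedged sketch of a CNN-BiLSTM pipeline: Conv1D + max pooling build a
# feature matrix, a Bi-LSTM extracts the hidden output, and a final linear
# Dense layer acts as the connection layer. All sizes are illustrative.
from tensorflow.keras import layers, models

cnn_bilstm = models.Sequential([
    layers.Input(shape=(24, 1)),                      # e.g. 24 hourly readings
    layers.Conv1D(32, kernel_size=3, padding="same", activation="relu"),
    layers.MaxPooling1D(pool_size=2),                 # new feature matrix
    layers.Bidirectional(layers.LSTM(64)),            # hidden output
    layers.Dense(1),                                  # connection (linear) layer
])
print(cnn_bilstm.output_shape)  # (None, 1)
```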