Two-model active learning approach for inappropriate information classification in social networks

The work process of specialists in protection from information consists of many time-consuming tasks, including data collection, dataset formation, and manual data labelling. In this paper, we attempt to help such specialists with a two-model approach based on the iterative online training of binary classifiers. This approach is used for inappropriate information detection and is applied to text posts from the VKontakte social network. The first model detects text posts that correspond to the selected topic and is trained on the data labelled positively and negatively by experts as well as on random text data. The second model is used to improve the accuracy of the first model and is trained only on the data labelled by the experts. The novelty of the approach lies in the constantly growing dataset, while the classifier training process takes place during the operator's work. The approach works with texts of any size and content and is applicable to Russian social networks. The research contribution lies in the original approach to inappropriate information detection. The practical significance of the approach lies in the automation of routine tasks to reduce the burden on specialists in the area of protection from information. The experimental evaluation of the approach focuses on its iterative retraining part. For the experiment, text posts on different topics were collected from the VKontakte social network and labelled. Those topics include: Aggression, Dangerous conspiracy theories, Radicalism, Gambling, Prostitution, and Sects. After that, we evaluated the precision, recall, F-measure, and ROC-AUC metrics for classifiers trained on random subsamples of different sizes and different topics. Those metrics were evaluated for both one-model and two-model implementations of the approach, using the following classifiers: linear support vector machine, passive-aggressive classifier, and multilayer perceptron.
Moreover, the advantages and disadvantages of the approach, as well as future work directions, were indicated.


Introduction
There are many approaches for the detection of inappropriate information on the Internet, including those that use artificial intelligence (AI) methods [1,2]. As a rule, these approaches are applied to social network content and are associated with protection from information [3]. At the same time, modern solutions are aimed at identifying information of a certain type (images, video, text, etc.), topic (racism, justification of violence, etc.), and sentiment (aggression, insults, etc.) [4]. It means that for each task to be solved, it is required to collect data and form training and test datasets, while the data tend to be labelled manually by experts. This makes protection from information a time-consuming task. Note that if the task to be solved is in demand, it is likely that a labelled dataset created by the scientific community can be found [5]. However, if the task is specific or delicate, in most cases there are no freely accessible datasets containing the inappropriate data itself.
The workflow of a typical specialist in the area of protection from information consists of the following stages: 1. Data collection; 2. Data labelling; 3. Dataset building; 4. Experimental evaluation of the AI methods; 5. Selection of the AI model.
In this paper, we attempted to help such specialists with a two-model approach based on the iterative online training of binary classifiers. Within the proposed approach, the operator's work is monitored to stop the classifiers' iterative training once the detection of text posts of a given topic can be performed with the required accuracy.
As a source of data for the experiments, we used the VKontakte social network. It is the prevailing social network in Russia and Russian-speaking countries, with over 700 million registered users. Posts on VKontakte can include text, photos, videos, audio files, and other multimedia content, and can be written in several languages, including Russian, English, Belarusian, and others. However, Russian is the most common language used on VKontakte. Users can add emojis to their posts, comments, and messages, and the site supports a wide range of emojis, including VKontakte-specific ones. Post characteristics can vary depending on the individual user and the context of the post.
This approach was initially presented at the 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP 2022) [6]. This paper is an extension and improvement of that work. We extended the approach with additional steps, improved the related work analysis, and detailed the experimental evaluation.
Usually, an active or online approach to classifier training is used when the dataset does not fit entirely into the main memory. In our case, an active approach to classifier training is used to make the training possible in principle. Our task is to determine the minimum size of the training dataset needed to solve the binary classification problem with an accuracy not lower than a certain specified threshold, since data labelling in the case of the protection from information task is an expensive and time-consuming procedure.
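This stopping criterion can be expressed as a simple iterative training loop. The following is a minimal sketch under our own assumptions: the function and variable names are ours, `batches` stands in for the stream of posts labelled by the operator, and the 0.7 threshold mirrors the desirable threshold used later in the experiments.

```python
import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import f1_score

def train_until_threshold(batches, X_test, y_test, threshold=0.7):
    """Iteratively train on labelled batches; stop once the F-measure on
    a held-out labelled set reaches the threshold. Returns the classifier,
    the number of training instances consumed, and the final score
    (None if the threshold was never reached)."""
    clf = PassiveAggressiveClassifier(random_state=42)
    seen = 0
    for X_batch, y_batch in batches:
        clf.partial_fit(X_batch, y_batch, classes=np.array([0, 1]))
        seen += len(y_batch)
        score = f1_score(y_test, clf.predict(X_test))
        if score >= threshold:
            return clf, seen, score  # minimal dataset size found
    return clf, seen, None
```

The number of instances consumed before the loop stops is exactly the sought minimum size of the training dataset for the given threshold.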
The novelty of the approach lies in the constantly growing dataset, while the classifier training process takes place during the operator's work. The approach works with texts of any size and content and is applicable to Russian social networks. The research contribution lies in the original approach of classification with active learning for inappropriate information detection, while the practical significance lies in the automation of routine tasks to reduce the burden on specialists in the area of protection from information.
It should be noted that the proposed two-model approach is just one part of the general pipeline of the system for information attack detection and evaluation. The recognition of messages that have inappropriate content (and the sense of inappropriateness can change for each new attack to be detected) will be the starting element of the pipeline, which will also contain bot detection, group activity evaluation, countermeasure generation, etc.
The paper is organised as follows: In Sect. 2, the state of the art in the area of inappropriate information detection is considered. Section 3 describes the new iterative classification approach for inappropriate information detection. In Sect. 4, an experimental evaluation of the developed approach is presented. In Sect. 5, the advantages and disadvantages of the approach are considered. Section 6 contains general conclusions and future work directions.

Related work
The process of detection of inappropriate information on the Internet combines such research fields as information security and social networks [7,8]. During the investigation of social networks, the following objects are analysed: profiles, posts, likes, reposts, friends, groups, photos, videos, audio, etc. [9,10]. Those objects can vary depending on the chosen social network, but in the end their analysis can be reduced to the investigation of text, image, video, and audio data [11,12]. To analyse those types of data, statistical and artificial intelligence methods are the most commonly used ones [13,14]. Let us consider several examples.
The paper [15] addresses the problem of automatic fake news detection on Facebook. For that purpose, the authors analyse multiple features associated with news characteristics and Facebook user activities. The authors collected a dataset with more than 15,000 news items from 5,000 different profiles, including fake and real news, and experimentally evaluated several classical classifiers (such as KNN, SVM, or logistic regression) along with LSTM neural networks. The results showed that the machine learning models provide effective detection of fake news when using news characteristics together with the specific features of Facebook user accounts proposed by the authors.
In [16], the issue of phishing attacks through social networks is discussed. According to the authors, those attacks have evolved, and phishers are able to use similar methods to entice social network users to click on malicious links. The goal of the research was to examine the effect of the relationship between the Big Five personality model and the heuristic-systematic model of information processing. The results showed that conscientiousness has a negative influence on heuristic processing, making conscientious users less susceptible to phishing on social networks. The study also confirmed that heuristic processing increases susceptibility to phishing, thus supporting prior studies in this area.
The issue of public opinion manipulation is considered in [17]. The authors analysed social network actors, environment, scene, manipulation, and ethical framework. As a result, the impact on public opinion was divided into six main categories: Bot, Botnet, Troll, Manipulate real people and events, Cyborg, and Hacked or stolen. Moreover, this impact is mainly focused on incident and public opinion reconnaissance, sentiment analysis, and active intervention, while the main area of application is political propaganda.
The paper [18] attempts to improve the detection of spam profiles in social networks, utilising machine learning and publicly available language-independent features of Twitter profiles. Within the classical flow of natural language processing, the authors evaluate a default set of machine learning algorithms (i.e. decision trees, K-nearest neighbours, multilayer perceptron neural networks, and random forest) on five datasets (in Arabic, English, Korean, Spanish, and a multilingual one), showing that the proposed language-independent features (e.g. suspicious words, number of followers, and number of following) significantly improve the quality of spam detection regardless of the ML algorithm used. Moreover, the authors compared the selected features' importance among the feature selection methods and observed the relations and the importance of the selected features across all datasets.
In [19], the negative effects of using online social networks were analysed. The authors identified 46 negative effects and then proposed a taxonomy of six themes: cost of social exchange, cyberbullying, low performance, annoying content, privacy concerns, and security threats. After that study, interviews with experts were conducted to justify the proposed taxonomy.
The challenge of the detection of fake news in social networks is considered in [20]. The authors described what can be considered fake news, its importance, its overall impact on different areas, and different ways to detect it on social media, together with existing detection algorithms. Moreover, an approach for fake news detection based on the combination of data-driven and engineered knowledge was suggested. In addition, the impact of fake news on society was investigated.
What unites all research in this area is the quality evaluation metrics [21,22]. The most commonly used ones are accuracy, precision, recall, F-measure, and ROC-AUC. If the approaches share the same goal, for example, assuming the topic of the text data, then their metrics can be compared on different datasets to conclude which of them performs better in different scenarios and is robust and not overtuned [23,24]. The main issues of all these approaches lie in the limited number of datasets labelled by the scientific community, as well as in the disadvantages of datasets labelled manually by a number of experts. It is common that experts cannot agree with each other on some sensitive topics.
One of the promising solutions to those issues lies in the use of the active learning approach [25,26]. As a rule, such approaches are used for large datasets that do not fit entirely into the main memory. Let us consider several examples.
The paper [27] describes an intelligent tool utilising an active learning approach to aid the network security expert in the task of labelling network data. Modern network intrusion detection systems now widely use machine learning and other artificial intelligence methods, thus requiring labelled datasets of network traffic traces with a variety of intrusions or attacks included for training and evaluation. Meanwhile, the lack of appropriate public datasets of such nature is still an essential problem, because labelling requires a major effort of highly paid trained experts in cybersecurity. As mentioned above, the paper addresses this problem by offering an intelligent visual analytics application, RiskID, that trains a classifier (Random Forest) on the subset of already labelled instances of a dataset in active learning mode and then uses the classifier output to help the user in the label decision process.
To evaluate the performance of the developed system, the authors used the publicly available dataset from the Malware Capture Facility Project (MCFP) [28] and compared the true-positive rate (TPR), false-positive rate, F1, area under the receiver operating characteristic curve, and Equalized Loss of Accuracy metrics with those of the current state-of-the-art system, ILAB [29]. The authors state that the predictive module of their RiskID system beats the ILAB system in several aspects; thus, they recommend an active learning approach for solving similar labelling tasks.
The paper [30] proposes to use multi-label active learning to solve the problem of mobile app user review classification (MAREVA). It is claimed that the results of such a classification can help developers improve and maintain their apps according to the users' needs. The main problem solved here, once again, is the difficult, time-consuming, and expensive process of obtaining labelled datasets to train a classifier. The authors claim that, with active learning, it is possible to achieve similar (or greater) performance using a smaller labelled dataset, thus saving money and time for developers. They prove this statement with experimental research performed with a logistic regression classifier and a support vector machine on a publicly available dataset collected by them: MAREVA outperforms well-known classifiers (i.e. decision trees, K-nearest neighbours, multilayer perceptron neural networks, and random forest) in terms of performance without the need to label the whole dataset. This is another justification of the active learning approach's efficiency.
The paper [31] compares the active learning approach with random learning (another popular alternative to classical model training on a whole dataset) in the task of vision-based monitoring on construction worksites using deep neural networks. As reported in the experiments conducted by the authors, the proposed deep active learning approach achieves performance similar to traditional deep learning methods with a 90% reduction of the time needed to label training data and, overall, manages to beat the traditional deep learning approach in performance.
In our case, we are trying to solve a different issue. We decided to investigate the applicability of this approach for determining the minimum size of the dataset suited to train classifiers for inappropriate information categorisation in social networks. To the best of our knowledge, such approaches are currently not presented for information security purposes, while they are actively used in other research fields. Our view on such an approach is presented in the following section.

Classification approach
The two-model approach for inappropriate information detection is based on the work of the operator and consists of 9 steps, see Fig. 1: 1. Selection of the topic of interest; 2. Manual labelling of the text posts; 3. Training of classifiers using the labelled and random data; 4. Training of classifiers using only labelled data; 5. Generation of search queries based on the labelled data; 6. Extraction of the text posts from the social network; 7. Fully automated labelling of the text posts; 8. Partially automated labelling of the text posts; 9. Checking whether the operator is satisfied with the efficiency. Let us consider each step in more detail.
Step 1. The selection of the topic of interest. It is assumed that the operator is searching for text posts of one particular topic in the social network. Because the approach is not able to predict the topic at the start, the operator must provide it. After that, any text post that the operator manually checks and labels is stored in the database as corresponding or not corresponding to the selected topic.
Step 2. The manual labelling of the text posts. At this step, the operator manually browses the text posts of the social network and labels them as corresponding or not to the topic selected at the previous step. All decisions of the operator are stored in the database.
Step 3. The training of classifiers using the labelled and random data. After the operator labels a considerable amount of data, the first binary classifier (Model 1) can be trained to detect the text posts of the selected topic. Note that during the training, the labelled data are mixed with random text posts from the social network. After that, the efficiency of Model 1 is evaluated on the already labelled data.
Step 4. The training of classifiers using only labelled data. Additionally, after the operator labels a considerable amount of data, the second binary classifier (Model 2) can be trained to detect text posts of the selected topic. The difference is that the second binary classifier is trained only on the labelled data (posts that correspond and do not correspond to the selected topic, without the addition of random text posts). Once again, the efficiency of Model 2 is evaluated on the already labelled data.
Step 5. The generation of search queries based on the labelled data. Moreover, after the operator labels a considerable amount of data, it is possible to extract keywords that are specific to the text posts of the selected topic. Such keywords are used for the generation of search queries that allow the extraction of text posts of the selected topic with high probability.
Step 6. The extraction of the text posts from the social network. Search queries from the previous step are used to extract text posts from the social network automatically. As a rule, it is done with the help of the social network API (application programming interface) provided for scientific purposes. With correctly formed search queries, the extracted text posts most likely represent the selected topic of interest.
Step 7. The fully automated labelling of the text posts. In this step, the trained classifier from step 3 is used to label the text posts that were extracted in step 6. Each post that is labelled as corresponding to the topic is passed to the following step.
Step 8. The partially automated labelling of the text posts. In this step, the trained classifier from step 4 is used to label the text posts that were labelled as corresponding to the topic in step 7. Each post that is labelled as corresponding to the selected topic is shown to the operator. It means that the second classifier is used to improve the accuracy of the first classifier, while the operator helps to improve the accuracy of the second classifier. After that, the operator decides which text posts were labelled correctly by the model. This enlarges the amount of labelled data, while the information about false and true positives is used during the retraining of the model.
Step 9. Checking whether the operator is satisfied with the efficiency. In this step, the operator decides if the classifier is good enough to extract the text posts of the selected topic. If not, steps 3 to 9 are repeated.
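Steps 3 and 4 can be sketched as a small toy example. The helper below is ours and not the paper's code: the vectorizer settings follow the experimental section, while the several training passes are an assumption added to stabilise this miniature illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier

vectorizer = HashingVectorizer(n_features=2**10)

def train_models(pos_posts, neg_posts, random_posts, epochs=10):
    """Model 1 (step 3): labelled data mixed with random posts.
    Model 2 (step 4): labelled data only."""
    model1 = PassiveAggressiveClassifier(random_state=0)
    X1 = vectorizer.transform(pos_posts + neg_posts + random_posts)
    y1 = np.array([1] * len(pos_posts) + [0] * (len(neg_posts) + len(random_posts)))
    model2 = PassiveAggressiveClassifier(random_state=0)
    X2 = vectorizer.transform(pos_posts + neg_posts)
    y2 = np.array([1] * len(pos_posts) + [0] * len(neg_posts))
    for _ in range(epochs):  # a few passes over the small labelled pool
        model1.partial_fit(X1, y1, classes=[0, 1])
        model2.partial_fit(X2, y2, classes=[0, 1])
    return model1, model2
```

In the real workflow, each iteration of steps 3-9 would call `partial_fit` again with the newly labelled posts instead of retraining from scratch.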
We assume that such an approach will help an operator during inappropriate information detection because of the automation of the following routine tasks: extraction of keywords from the text posts, formation of the search queries that are specific to the selected topic, and extraction of the corresponding text posts from the social network. The experimental evaluation of the developed approach is presented in the following section.


Experimental evaluation

For the experiment, text posts potentially related to each selected topic were identified. Further, such potentially malicious posts on each topic were offered for evaluation to the experts, who checked the text information for compliance with the specified topic and marked a post as "YEAH" if the expert thought it corresponded to the specified topic, and "NOPE" if the expert thought it did not.

Dataset
The final experimental set includes texts from VKontakte social network posts with a length of 50 to 3000 symbols, marked by experts as corresponding or not corresponding to the following potentially malicious topics: Aggression, Dangerous conspiracy theories, Radicalism, Gambling, Prostitution, and Sects. In addition, 5365 random posts from the VKontakte social network, also ranging in length from 50 to 3000 symbols, were collected for the experimental study. Table 1 shows the distribution of the number of collected posts by selected topics and posts collected randomly (Other).
To pre-process texts and extract features, a standard "bag-of-words" scheme was used, namely lemmatisation and deletion of standard and dataset-specific stop words were performed. All hashtags, links, and mentions of VKontakte users were removed from the texts. To vectorize the texts, HashingVectorizer (n_features=2^10 and stop words for English from the nltk library) from the Sklearn Python library was used.
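A minimal sketch of this pre-processing, assuming simple regex-based cleaning (a real pipeline would add lemmatisation, e.g. with pymorphy2 for Russian, and the full stop-word lists; the abbreviated list below is illustrative only):

```python
import re
from sklearn.feature_extraction.text import HashingVectorizer

# Abbreviated illustrative stop-word list; the paper uses standard and
# dataset-specific stop words plus the English list from nltk.
STOP_WORDS = {"и", "в", "на", "не", "the", "a", "of"}

def clean(text):
    text = re.sub(r"https?://\S+", " ", text)        # remove links
    text = re.sub(r"#\w+", " ", text)                # remove hashtags
    text = re.sub(r"\[id\d+\|[^\]]*\]", " ", text)   # remove VK user mentions
    tokens = [t for t in re.findall(r"[а-яёa-z]+", text.lower())
              if t not in STOP_WORDS]
    return " ".join(tokens)

vectorizer = HashingVectorizer(n_features=2**10)  # 1024-dimensional vectors
X = vectorizer.transform([clean("Смотри https://vk.com/x [id1|Ivan] #казино и выиграй")])
```

HashingVectorizer is stateless, which fits the iterative setting: no vocabulary has to be refitted when new labelled posts arrive.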
The dataset for each topic, as well as for the "YEAH" and "NOPE" subsets, was stratified into two parts in a ratio of 80% to 20%. On the first part of the sample, active iterative training of the classifiers under study was performed, and on the second part, the classifiers trained in partial mode on subsamples of different sizes were evaluated using precision, recall, F-measure, and ROC-AUC, which are used in the majority of works devoted to solving similar problems.

The course of the experiment
To emulate the approach described in Sect. 3, for each potentially malicious topic listed in Table 1, two separate binary classification tasks were solved. At the first step, the posts related to a topic and labelled as "YEAH" by an expert were considered objects of the positive class, while randomly selected posts from the Other group together with the posts presumably related to a topic but labelled as "NOPE" by an expert were considered objects of the negative class. At the second step, the posts presumably related to a topic but labelled as "NOPE" by an expert were considered objects of the positive class, and the posts related to a topic and labelled as "YEAH" by an expert were considered objects of the negative class. A total of 6 × 2 binary classification tasks were solved. Samples for partial training of each of the 12 binary classifiers for the 6 topics were formed as follows: • The training subsample of texts on each potentially malicious topic labelled as "YEAH" by experts was divided into N parts of 10 instances each (the positive class for each task), X^+_1, X^+_2, ..., X^+_N; at the first step, each partition was treated as the positive class, at the second step, as the negative class; • The training subsample of texts presumably related to a malicious topic but labelled as "NOPE" by experts was divided into N parts of 10 instances each (a part of the negative class for each task), X^{-*}_1, X^{-*}_2, ..., X^{-*}_N; at the first step, each partition was treated as a part of the negative class, at the second step, as the positive class; • The training subsample of the Other texts was also divided into N fragments of size M/N, where M is the number of texts in the Other group, X^-_1, X^-_2, ..., X^-_N; each partition was treated as a part of the negative class at the first step.
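The partitioning described in the list above can be sketched like this (the function and variable names are ours):

```python
def make_partitions(yeah_posts, nope_posts, other_posts, part_size=10):
    """Cut the training pool into N steps: each step yields 10 "YEAH"
    instances, 10 "NOPE" instances, and an M/N share of the Other group."""
    n = min(len(yeah_posts), len(nope_posts)) // part_size
    m = len(other_posts) // n  # fragment size M/N
    for i in range(n):
        yield (yeah_posts[i * part_size:(i + 1) * part_size],
               nope_posts[i * part_size:(i + 1) * part_size],
               other_posts[i * m:(i + 1) * m])
```

Feeding these parts one by one to a classifier's `partial_fit` reproduces the successive training steps of the experiment.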
Within the experiment, for one positive class instance there were different numbers of negative class instances in each binary classification task for different potentially malicious topics. The pairs of the positive and negative class training subsamples were successively fed to the classifiers trained in active (partial) mode at each step separately. During the evaluation on the test subsample (Fig. 2), each instance was first fed to the classifier trained at the first step of the experiment.
After that, if an instance was classified as negative (not related to a specified malicious topic), then this was the final prediction for that instance. If an instance was classified as positive (related to a specified malicious topic), then it was fed to the classifier trained at the second step of the experiment, and the prediction of this classifier was treated as the final one.
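A sketch of this two-step prediction cascade (the function name is ours; the models are any fitted binary classifiers):

```python
import numpy as np

def cascade_predict(model1, model2, X):
    """The first-step model filters; the second-step model refines
    only the instances the first one labelled as positive."""
    final = model1.predict(X)
    positives = np.flatnonzero(final == 1)
    if positives.size:
        final[positives] = model2.predict(X[positives])
    return final
```

Negatives from the first model are returned unchanged, so the second model can only narrow the set of positives, never widen it.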
As mentioned above, the idea behind this approach is that the classifier trained at the second step of the experiment is able to refine the final predictions. After each iteration, the predictions from both classifiers were used to evaluate the quality with the precision, recall, F-measure, and ROC-AUC metrics. During the experiment, the following publicly available classifiers that support iterative learning and are implemented in the Sklearn library were investigated: linear support vector machine, passive-aggressive classifier, and multilayer perceptron. Each classifier was trained 100 times with a random subsample formed as described above.
Figure 2 shows the process of getting the final set of predictions using the two-step approach. The test subsample for each topic includes 20% of the posts related to the topic and labelled as "YEAH" by an expert, 20% of the posts presumably related to the topic but labelled as "NOPE" by an expert, and 20% of the posts from the Other group.
For the mentioned metrics, we have set two thresholds. The minimum threshold for F-measure and ROC-AUC is 0.65, and the desirable threshold is 0.7. Those thresholds are purely experimental, since for every practical task and possibly every topic of interest, an expert should set the thresholds. The linear SVM classifier (Fig. 3a) was able to reach the desirable experimental threshold of 0.7 for F-measure (which is more than 95% of the maximum F-measure that can be reached for this data by linear SVM) for Conspiracy and Prostitution after around 200 positive examples in the training set. For Radicalism, the desirable threshold of 0.7 for F-measure is reached after around 310 positive examples in the training set, with the maximum F-measure around 0.75. For Sects, the F-measure varies around 0.65 and reaches 0.7 starting from around 350 positive examples in the training set. As for Gambling and Aggression, the linear SVM barely reaches the minimum experimental threshold of 0.65 for F-measure even with all available training data. So the linear SVM in one-step mode is probably not effective enough for classifying texts on these topics, or the quality of the dataset should be improved.
The passive-aggressive classifier (Fig. 6a) tends to be more accurate in terms of the F-measure with fewer examples in the training set. The desirable experimental threshold of 0.7 for the F-measure for Conspiracy and Prostitution can be reached after around 190 positive examples in the training set. The classifier reaches the minimum threshold of 0.65 for Gambling after around 260 examples and almost reaches the desirable threshold of 0.7 after around 390 examples, which was impossible with the linear SVM classifier. For Radicalism and Sects, the accuracy of the passive-aggressive classifier in terms of F-measure also tends to be more stable and exceeds the desirable threshold of 0.7 after around 300 and 400 examples, respectively. However, for Aggression, the passive-aggressive classifier never reaches the minimum experimental threshold of 0.65 for F-measure even with all available training data either. This fact gives us another possible indication of insufficient quality of the training data for these topics.
The perceptron classifier (Fig. 9) seems to be the worst in terms of F-measure and ROC-AUC, as it exceeds the desirable experimental threshold of 0.7 for the F-measure only for Prostitution after more than 450 instances, and for Aggression and Gambling it never reaches even the minimum threshold of 0.65. The values of ROC-AUC tend to be higher both for the linear SVM classifier (Fig. 3b) and for the passive-aggressive classifier (Fig. 6b), and the linear SVM classifier exceeds the desirable threshold of 0.7. As for precision, the linear SVM classifier and the passive-aggressive classifier exceed the 0.8 threshold (Figs. 4a, 7a), except for Sects in the case of linear SVM, when applying only the first-step classifier. But unfortunately, during the second step, the predictions have not been improved. Moreover, we see a significant drop in accuracy in terms of precision for Aggression, Conspiracy, and Radicalism (Figs. 4b, 7b). The quality of the perceptron predictions is too poor in terms of precision as well (Fig. 10).
The same is true for the recall (Figs. 5, 8): the second-step classifier is not able to improve the results.

Discussion
It is important to note that the developed two-model approach does not aim to replace specialists in protection from information. Understandably, specialists will be able to detect inappropriate information at a very high level in most situations. Our goal is to help such specialists by automating the routine tasks corresponding to the selected topic of interest.
The experimental evaluation showed that the developed approach has its advantages and disadvantages.The advantages are as follows: • The labelled dataset grows constantly; • The training process takes place during the work of the operator; • The formation of search queries and extraction of text posts is automated.
While the disadvantages are as follows: • If the operator cannot determine whether a post is related to the selected topic, then the model will not be able to determine this either; • For the approach to work efficiently, the chosen topic must be specific; otherwise, the approach will still be able to speed up the operator's work, but the quality of the extracted posts will not be high; • Not every social network provides an interface for data extraction, while the available interfaces have limitations on the number of requests and the amount of data to be extracted.
It means that the topic of interest should be chosen carefully so that it can be distinguished among the general text posts of the social network. Moreover, the operator should know the topic well enough, so that the approach can learn from the operator's decisions. In the end, the model learns to imitate the actions of the operator being monitored.
To compare the computational cost of each classification model, an additional experiment was conducted to measure the time required for their training and prediction stages, see Tables 2 and 3. Each cell of those tables contains the correlation coefficient between an efficiency metric and the time spent. As a rule, the quality of classification is directly related to the time spent during the training process. Thus, to improve the classification accuracy, more time has to be spent on training. But our experiments showed that the correlation coefficient calculated for the efficiency metrics and the time required for their improvement at the training stage is a small negative value, meaning that additional training time can decrease the accuracy of the classification. This effect requires additional investigation and may be attributed to overfitting. Therefore, a small fraction of posts gives maximal classification accuracy, and an increase in the amount of training data only deteriorates the classification. For the prediction stage, extra time yields a positive effect, but this effect also decreases over time, meaning that there is a threshold of data saturation after which the classification accuracy does not significantly increase.
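For reference, the correlation coefficient in each table cell can be computed as in this sketch; the time and quality series below are made up purely for illustration.

```python
import numpy as np

# Hypothetical measurements: training time grows while F-measure slightly drops
train_time = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
f_measure = np.array([0.71, 0.70, 0.69, 0.69, 0.68])

# Pearson correlation between time spent and the quality metric
r = np.corrcoef(train_time, f_measure)[0, 1]
# a small negative r reproduces the effect reported for the training stage
```

A value of r below zero means that, on these series, spending more training time coincides with lower quality, which is the overfitting-like effect discussed above.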
Nevertheless, we believe that the proposed two-step classifier is a promising approach. The current experimental results can be explained by the nature of the datasets used. These datasets were formed by a group of students who had slightly different understandings of the topics. This works for the first-step classifier, which deals with the general topic, but fails for the second-step classifier, which tries to bring the classification results closer to the understanding of the modelled experts who perform the continuous categorisation of the growing dataset. The second reason for the results is the small number of "NOPE" answers of the experts in the dataset. Thus, the major part of the dataset either belonged to the selected topic or was far from it, while the second-step classifier was aimed at enhancing the quality of classification when there is a topic, but the expert needs only some part of it. During future experiments, we plan to check the quality of the two-step approach by separating the experts into several datasets and by extending the dataset itself.
As for the technical part, the limitations of each social network have to be taken into account separately. The answers to the following questions should be found in the documentation:
• How many requests can be made per second/minute/hour?
• How many times can this data be extracted per day?
• How many accounts can be used to work with the social network?
• Is it legal to use the extracted data for scientific research?
The answers to these questions are the basis for the software implementation of the developed approach.
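Once the documented request limit is known, the simplest way to respect it in the software implementation is a throttle between consecutive API calls. A minimal sketch; the limit of three requests per second is an assumption that must be checked against the specific social network's documentation, and `fetch_post` is a hypothetical request function:

```python
import time

class RateLimiter:
    """Minimal request throttle: enforce a minimum interval between calls.
    The default of 3 requests/second is an assumed limit, not a documented one."""

    def __init__(self, max_per_second: float = 3.0):
        self.min_interval = 1.0 / max_per_second
        self._last_call = 0.0

    def wait(self) -> None:
        # Sleep just long enough so that calls are at least min_interval apart.
        elapsed = time.monotonic() - self._last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last_call = time.monotonic()

limiter = RateLimiter(max_per_second=3.0)
for post_id in range(5):   # stand-in for a sequence of real API requests
    limiter.wait()
    # fetch_post(post_id)  # hypothetical request to the social network API
```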

Conclusion
The paper presents a new two-model active learning approach to the iterative classification of inappropriate information in social networks. Its novelty lies in the constant growth of the labelled dataset, while the training process takes place during the operator's work. Its practical significance lies in the automation of routine tasks to reduce the burden on specialists in protection from information.
We propose to use the so-called active learning approach for training classifiers not to address the challenge of large datasets that do not fit into main memory, but to determine the minimum size of the training dataset that allows solving the binary classification task, as applied to protection from information, with an accuracy not lower than a specified threshold.
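The idea of finding the minimum sufficient training sample can be sketched as a loop that grows a random subsample step by step until the quality threshold is reached. This is a simplified illustration, not the paper's implementation: TF-IDF features, the step size, and the use of `PassiveAggressiveClassifier` (one of the paper's classifiers) are assumptions here.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import f1_score

def minimal_training_size(train_texts, train_labels, test_texts, test_labels,
                          threshold=0.7, step=20, seed=42):
    """Grow a random training subsample step by step and return the first
    size at which F-measure on the held-out set reaches the threshold."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(train_texts))
    vectorizer = TfidfVectorizer()
    X_train = vectorizer.fit_transform(train_texts)
    X_test = vectorizer.transform(test_texts)
    y_train = np.asarray(train_labels)
    for size in range(step, len(train_texts) + 1, step):
        idx = order[:size]
        clf = PassiveAggressiveClassifier(random_state=seed)
        clf.fit(X_train[idx], y_train[idx])
        if f1_score(test_labels, clf.predict(X_test)) >= threshold:
            return size
    return None  # the threshold was never reached on the available data
```

Each random subsample must contain both classes for the fit to succeed; in practice this is ensured by the labelling procedure.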
For the experimental evaluation, the approach was applied to text posts from the VKontakte social network. The data were collected by topics within which text posts often contain inappropriate information. The following topics were selected: Aggression, Dangerous conspiracy theories, Radicalism, Gambling, Prostitution, and Sects. The experts evaluated the correspondence of each text post to one of the selected topics. After that, we evaluated precision, recall, F-measure and ROC-AUC for each topic for both one-model and two-model implementations of the approach on random training samples of different sizes and analysed the results.
It should also be noted that the classification of these categories can be difficult due to such factors as ambiguity, subjectivity, context-dependency, complexity, and a lack of clear criteria. Ambiguity arises when the boundaries between categories are not well defined, while subjectivity may result from different cultural contexts and personal opinions. Context-dependency arises because the interpretation of categories depends on cultural and historical contexts. Complexity may involve multiple dimensions of information, and a lack of clear criteria may result from legal and moral differences across societies. These factors may impact the quality of both expert-based and AI-based classification.
As for the experimental results, the passive-aggressive classifier tends to reach higher accuracy with fewer positive examples in the training set and shows more stable results in terms of precision, recall, F-measure and ROC-AUC than the linear SVM classifier, while the perceptron classifier tends to show the worst results according to the analysed metrics.
It is also worth mentioning that none of the classifiers (passive-aggressive, linear SVM, perceptron) was able to classify the Sects and Aggression texts with the desirable accuracy, so the quality of the data on these topics should probably be improved.
In future work, it is planned to further improve the approach, formalize the verification of its steps, expand the list of topics for analysis, and collect a dataset of greater volume and quality. Moreover, future plans also include expanding the experimental evaluation to the approach as a whole rather than testing its parts separately.

3. Training of classifiers using the labelled and random data.
4. Training of classifiers using only the labelled data.
5. Generation of search queries based on the labelled data.
6. Extraction of the text posts from the social network.
7. Fully automated labelling of the text posts.
8. Partially automated labelling of the text posts.
9. Checking if the operator is satisfied with the accuracy.

Fig. 1 The overview of the two-model active learning approach for inappropriate information classification in social networks
Flowchart: while there are unprocessed instances in the test set, predict the class of each instance with the classifier trained in the first step of the experiment; if the positive class is predicted, predict its class with the classifier trained in the second step; add the prediction for the instance to the final prediction set. First step classes: positive — posts related to a topic and labelled as "YEAH" by an expert; negative — posts from the Other group and posts presumably related to a topic but labelled as "NOPE" by an expert. Second step classes: positive — posts presumably related to a topic but labelled as "NOPE" by an expert; negative — posts related to a topic and labelled as "YEAH" by an expert.

As the classifiers were trained in the iterative learning mode and the final goal was to find the minimal sample size providing acceptable quality, the hyperparameters were not tuned: the default hyperparameters were used for all classifiers. The averaged results with standard deviations for the precision, recall, F-measure and ROC-AUC metrics were used to build the plots below. The experimental results for the passive-aggressive classifier, the linear support vector machine classifier and the multilayer perceptron are shown in Figs. 3, 4, 5, 6, 7, 8, 9, 10 and 11. Lines of different colours correspond to binary classification tasks for different potentially malicious topics. The X-axis shows the size of the positive class at each iteration. The Y-axis shows the value of the corresponding accuracy measure (F-measure, ROC-AUC, precision or recall) calculated on a test sample for a classifier trained at each iteration with a training sample of a different size.
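The iterative learning mode with untuned defaults can be sketched as online updates over growing batches. The data below is synthetic, and `PassiveAggressiveClassifier` stands in for any of the paper's classifiers that support incremental fitting:

```python
import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier

# Synthetic linearly separable data standing in for vectorised text posts.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = PassiveAggressiveClassifier()   # default hyperparameters, as in the paper
classes = np.array([0, 1])
for start in range(0, 200, 40):       # the dataset grows batch by batch
    X_batch, y_batch = X[start:start + 40], y[start:start + 40]
    clf.partial_fit(X_batch, y_batch, classes=classes)

print("train accuracy:", clf.score(X, y))
```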

Fig. 3 F-measure (a) and ROC-AUC (b) calculated on the test set of each malicious topic for the first step of active classification experiment with the linear support vector machine classifier

Fig. 5 Recall calculated for the first step of the active classification experiment (a) and the difference between Recall after the first and after the second steps (b) calculated on the test set of each malicious topic with the linear support vector machine classifier

The linear SVM classifier exceeds the desirable threshold of 0.7 for Prostitution, Conspiracy, and Gambling after around 100 examples in the training set, and for Radicalism and Sects after around 260 examples. The passive-aggressive classifier exceeds the desirable threshold of 0.7 for Prostitution, Conspiracy, and Gambling after around 110 examples, and for Radicalism and Sects after around 240 examples. So for the passive-aggressive classifier, it is possible to reach higher accuracy if the probability threshold of the classifier is tuned.
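Threshold tuning for the passive-aggressive classifier can be sketched via its signed decision score (it exposes `decision_function` rather than calibrated probabilities). The data below is synthetic, and selecting the threshold by F-measure on a validation split is one possible choice, not the paper's procedure:

```python
import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import f1_score

# Synthetic data with a minority positive class, standing in for real posts.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = (X[:, 0] - 0.5 > 0).astype(int)

clf = PassiveAggressiveClassifier().fit(X[:200], y[:200])
scores = clf.decision_function(X[200:])   # signed distances, default cut at 0
y_val = y[200:]

# Sweep candidate thresholds and keep the one with the best F-measure.
default_f1 = f1_score(y_val, (scores > 0).astype(int))
best_t, best_f1 = 0.0, default_f1
for t in np.linspace(scores.min(), scores.max(), 50):
    f1 = f1_score(y_val, (scores > t).astype(int))
    if f1 > best_f1:
        best_t, best_f1 = t, f1
print(f"F-measure at default vs tuned threshold: {default_f1:.3f} -> {best_f1:.3f}")
```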

Table 1
Distribution of the number of collected posts by topics

Table 2
Correlation between efficiency metrics and time required for their improvement: training stage of each classification model

Table 3
Correlation between efficiency metrics and time required for their improvement: prediction stage of each classification model