2.1 Overview of LLM models
Generative Pre-trained Transformers (GPTs) are a series of multimodal, advanced artificial intelligence models designed to understand input and generate responses the way humans do. These models have been used in various domains such as translation, content writing, chatbots, image generation, and video generation.
2.1.1 GPT based models
Large language models such as GPT can be used to create fake content and misinformation. The paper (Chen, et al., 2023) discusses whether AI-generated misinformation is more harmful than misinformation written by humans, finding that LLM-generated misinformation is more difficult to detect and investigate. The study also found that AI-generated misinformation is more deceptive, posing a significant challenge to online safety and public trust. “GPT-who” (Venkatraman, et al., 2023), an LLM-generated text detector based on Uniform Information Density (UID), builds on the hypothesis that humans generally distribute information evenly across their communication. GPT-who uses this concept to distinguish human-written from AI-generated text. The paper claims that it outperforms other state-of-the-art detection models by around 20% on many performance metrics and provides more interpretable representations of the text.
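The UID intuition can be illustrated with a simple statistic: if per-token surprisal (negative log probability under a language model) is spread evenly, its variance is low. The sketch below is a minimal illustration of that signal only, not GPT-who's actual feature set, and it assumes the surprisal values have already been computed by some scoring model.

```python
def uid_variance(surprisals):
    """Variance of per-token surprisal (-log p per token).

    Under the UID hypothesis, human writing spreads information evenly,
    so a lower variance is the kind of uniformity signal a UID-based
    detector can exploit. Illustrative only.
    """
    mean = sum(surprisals) / len(surprisals)
    return sum((s - mean) ** 2 for s in surprisals) / len(surprisals)


# Perfectly uniform surprisal gives zero variance; uneven surprisal does not.
print(uid_variance([2.0, 2.0, 2.0]))  # -> 0.0
print(uid_variance([1.0, 3.0]))       # -> 1.0
```

A real detector would combine several such uniformity statistics rather than rely on one variance value.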
2.1.2 Gemini model
Gemini is an advanced chatbot model built by Google DeepMind. The author (Saeidnia & Hamid, 2023) examined the motivation behind the development of Google’s chatbot Gemini and its potential impact on the IT industry. Gemini is designed to provide a better user experience through personalized, relevant information and improved customer service. The model was built to transform how users access and interact with information on the internet. We used Gemini’s API to generate text and also to paraphrase our dataset.
2.1.3 Necessity of LLM generated text detection
The impact of LLM models on academia and student learning in higher education has been examined through a content analysis of 100 articles from Australia, New Zealand, the United States, and the United Kingdom (Sullivan, et al., 2023). The study explores the benefits and risks of using generative AI tools and the need to change teaching styles. Assignments may need to adapt to this new reality, as the performance of ChatGPT relative to human experts in various domains is improving day by day, as shown in (Guo, et al., 2023). The study (Khalil, et al., 2023) evaluates LLMs’ ability to perform theory-of-mind tasks and suggests that, as LLMs improve, their language skills may lead to the emergence of human-like cognitive abilities. (Junchao and Yang, et al., 2023) emphasize the importance of building detectors for LLM-generated text to guard against potential misuse and to protect domains such as artistic expression and social networks. They discuss recent advancements in detection techniques and the challenges faced, such as out-of-distribution problems and data ambiguity. It is necessary not only to detect generated text but also to detect hallucinations and unreliable answers from LLMs.
(Chen, et al., 2023) and (Guo, et al., 2023) discuss how LLMs can be deceived into generating false information, while also noting that LLMs make statistical mistakes that can prove costly in many fields. A robust model for discerning whether information provided by an LLM is reliable is therefore critical for their safe and effective use.
2.1.4 Challenges in detection of LLM generated text
Identifying text written by large language models presents several challenges for researchers and practitioners. First, the out-of-distribution problem results in false negatives: detection models cannot identify novel text, i.e., text unlike anything in their training data. In addition, the text can be deceptively similar in style to human prose (Chen, et al., 2023), making it difficult for both machine-based systems and humans to determine that it was written by a machine. This becomes even more challenging when the misinformation itself is produced by an LLM.
Another challenge was the ineffectiveness of traditional plagiarism-detection systems (Khalil, et al., 2023): they were unable to flag LLM-generated material because it is new and original. Moreover, adversaries can introduce small changes to a text to trick the detection system into misclassifying it, as shown in (Pu, et al., 2023). A limitation noted in (Guo, et al., 2023) is the difficulty of modelling LLM-generated text with logistic regression: content generated by LLMs is complex and varied, so understanding it requires deep insight into the factors that influence performance and the ability to adapt to those factors across contexts. These challenges necessitate highly efficient detection methods that can adapt to the growing capabilities of LLMs.
2.2 Text detection algorithms
2.2.1 Double stream models
The OUTFOX framework (Koike, et al., 2024) improves detection by pairing a detector with an attacker model; the two iteratively learn from each other’s outputs through in-context learning. The detector uses human-written essays, attacker-generated texts, and regular LLM-generated texts as in-context examples to learn to detect the attacker’s output, while the attacker uses the detector’s predictions to learn to generate adversarial text that the detector cannot easily flag. The authors constructed a dataset of 15,400 triplets of essay prompts, human-written essays, and LLM-generated essays to train and evaluate the model.
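The in-context detection step above amounts to assembling labelled demonstrations into a prompt for the detector LLM. The sketch below shows one plausible way to build such a prompt; the wording, field names, and labels are assumptions for illustration, not OUTFOX's exact prompt format.

```python
def build_detection_prompt(examples, target):
    """Assemble in-context demonstrations plus an unlabelled target.

    examples: list of (text, label) pairs, e.g. human-written,
              LLM-generated, and attacker-generated essays.
    target:   the essay the detector LLM should label.
    """
    parts = [f"Essay: {text}\nLabel: {label}" for text, label in examples]
    parts.append(f"Essay: {target}\nLabel:")  # model completes the label
    return "\n\n".join(parts)


demos = [("First sample essay...", "Human"), ("Second sample essay...", "LM")]
prompt = build_detection_prompt(demos, "Essay to classify...")
```

In the adversarial loop, the attacker would receive the detector's outputs on its own generations and use them as demonstrations of what gets caught.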
In recent work, (Abburi, et al., 2023) investigated the tasks of detecting AI-generated text and attributing it to a specific large language model. They proposed an ensemble that passes the input text through pre-trained large language models such as BERT, RoBERTa, and DeBERTa and uses the models’ output probabilities as input features for traditional machine-learning classifiers such as SVM and logistic regression. For the binary detection task, the ensemble achieved a micro F1 score of 0.733 on English data and 0.649 on Spanish data. The approach also performed well on the multiclass task of attributing LLM-generated text to one of six language models, ranking 1st with a micro F1 of 0.625 in English and 0.653 in Spanish.
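The stacking idea can be sketched in a few lines: each base model's predicted probability becomes one feature, and a simple downstream classifier makes the final call. The model names and probability values below are placeholders, and the averaging rule stands in for the SVM or logistic regression the paper actually fits on these features.

```python
def stack_features(prob_by_model):
    """Concatenate each base model's P(AI-generated) into a feature vector.

    Sorting by model name keeps the feature order deterministic, which a
    trained downstream classifier would rely on.
    """
    return [prob_by_model[name] for name in sorted(prob_by_model)]


def ensemble_predict(features, threshold=0.5):
    """Toy stand-in for the traditional classifier: average and threshold."""
    return "ai" if sum(features) / len(features) >= threshold else "human"


# Hypothetical per-model probabilities for one input text.
probs = {"bert": 0.91, "roberta": 0.85, "deberta": 0.78}
print(ensemble_predict(stack_features(probs)))  # -> ai
```

In practice the feature vector would feed `fit`/`predict` of a real classifier trained on labelled data rather than a fixed threshold.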
2.2.2 Multi feature detection
The multi-feature detection approach in (Wu, et al., 2023) combines several signals to improve the accuracy and reliability of zero-shot detection. The model integrates multiple data attributes from several sources and uses advanced algorithms to analyse them and find complex patterns. The zero-shot model uses different statistical measures, such as log-likelihood, log-rank, and entropy, for better interpretation. This approach is particularly valuable in scenarios where a single feature may not be enough, such as in medicine, where multiple patient data points must be considered, or in cybersecurity, where a wide range of indicators must be analysed to find a threat.
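The three statistics named above have standard zero-shot-detection definitions: mean per-token log-likelihood, mean log of the observed token's rank, and the mean entropy of the model's predictive distribution. The sketch below computes them from quantities assumed to be exported by a scoring language model; it is an illustration of the features, not the paper's full pipeline.

```python
import math

def zero_shot_features(token_probs, token_ranks, dists):
    """Compute three common zero-shot detection features.

    token_probs: model probability assigned to each observed token
    token_ranks: rank of each observed token in the model's prediction (1 = top)
    dists:       full predictive distribution at each position
    """
    log_likelihood = sum(math.log(p) for p in token_probs) / len(token_probs)
    log_rank = sum(math.log(r) for r in token_ranks) / len(token_ranks)
    entropy = sum(-sum(p * math.log(p) for p in d if p > 0) for d in dists) / len(dists)
    return log_likelihood, log_rank, entropy
```

Machine-generated text typically shows higher log-likelihood and lower log-rank than human text under the scoring model, which is why combining the features helps when any single one is ambiguous.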
2.2.3 Adversarial learning
RADAR (Hu, et al., 2023) aims to identify machine-generated text using adversarial learning with generative adversarial networks (GANs). As a classifier, RADAR utilizes a diversity-enhancing GAN that employs a unidirectional LSTM as the discriminator. This approach incorporates rewards at both the word and sentence levels during training, treating text generation as a decision-making process: the text generated so far is the state, the next word is the action, and Monte Carlo search is used to gather feedback from the discriminator. Conversely, OUTFOX (Koike, et al., 2024) enhances detection by employing both a detector and an attacker model that learn from each other through in-context learning. The detector learns to distinguish among human-written essays, attacker-generated texts, and regular language-model-generated texts using in-context examples, while the attacker leverages feedback from the detector to create text that evades detection.
The authors of (Mitchell, et al., 2023) introduced a novel method to detect LLM-generated text. They observed a distinctive property of an LLM’s probability function: samples generated by the model tend to lie in regions of negative curvature of the model’s log-probability function. They used this insight to develop a curvature-based criterion called “DetectGPT”. It requires no separate classifier or training dataset; it only computes log probabilities under a pre-trained language model. The results showed that DetectGPT outperformed existing zero-shot models, achieving 0.95 AUROC compared to 0.81 for the zero-shot baseline.
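The curvature criterion reduces to a simple comparison: the log probability of the candidate text against the average log probability of several perturbed versions of it. The sketch below shows that score; the log probabilities are assumed to come from a scoring language model, and the perturbations (in the paper, produced by a mask-filling model) are outside this snippet.

```python
def perturbation_discrepancy(log_p_original, log_p_perturbed):
    """DetectGPT-style score: log p(x) minus the mean log p of perturbations.

    Machine-generated text sits near a local maximum of the model's
    log-probability surface, so perturbing it lowers log p sharply and the
    score comes out large; human text yields a score near zero.
    """
    return log_p_original - sum(log_p_perturbed) / len(log_p_perturbed)


# Hypothetical values: perturbations drop the log probability by ~2 nats.
score = perturbation_discrepancy(-10.0, [-12.0, -11.0, -13.0])
print(score)  # -> 2.0
```

Classification then thresholds this score, with the threshold chosen on a validation set.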
2.2.4 Comparative analysis of different multilingual models
A recent study by (Orenstrakh, et al., 2023) explored methods for detecting LLM-generated text in the context of academic integrity. The authors compiled a dataset of student-written text along with ChatGPT-generated text and ran it through widely used detectors, including CopyLeaks, GPTKit, GLTR, and GPTZero, for a comparative analysis. The most accurate detector was CopyLeaks, shown in Table (I), which had the highest precision in identifying LLM-generated text; GPTKit was the most effective at reducing false positives, and GLTR was the most resilient. However, the study also highlighted concerns regarding the high false-positive rate of GPTZero, as well as the reduced accuracy of all detectors when faced with code, non-English content, and paraphrased submissions.
Table (I): Overall accuracy of LLM-generated text detectors

Detectors          | Human Data | ChatGPT Data
CopyLeaks          | 99.12%     | 95.00%
GPT2Detector       | 98.25%     | 95.00%
CheckForAI         | 98.25%     | 95.00%
GLTR               | 82.46%     | 95.00%
GPTKit             | 100.00%    | 75.00%
OriginalityAI      | 93.86%     | 70.00%
AI Text Classifier | 94.74%     | 60.00%
GPTZero            | 54.39%     | 45.00%