In the present research, we investigated the efficacy of a Large Language Model, specifically FLAN-T5 in its small and base versions, in learning and generalizing the intrinsic linguistic representation of deception across different contexts. To accomplish this, we employed three datasets encompassing genuine or fabricated statements about personal opinions, autobiographical experiences, and future intentions.
Descriptive linguistic analysis was performed to compare the three datasets on linguistic features. Vocabulary uniqueness, measured with the Jaccard index, ranged from 0.34 to 0.46, suggesting that people tend to use different vocabularies for truthful and deceptive statements, especially when expressing opinions and intentions. We also explored differences in the DeCLaRatiVE style, i.e., we analyzed 26 linguistic features drawn from the psychological frameworks of Distancing, Cognitive Load, Reality monitoring, and the VErifiability approach. The linguistic features showing statistically significant differences between truthful and deceptive statements varied across datasets in their total number and type, the magnitude of the effect size, and the direction of the effect. Notably, the effect sizes of verbal cues of truthfulness were consistently larger than those of deception, confirming that the linguistic features associated with truthfulness are more reliable and robust than those associated with deception [5].
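For concreteness, the sketch below shows how such a vocabulary-overlap score can be computed. The whitespace tokenization and function names are illustrative simplifications, not necessarily our exact preprocessing.

```python
# Minimal sketch: vocabulary overlap between truthful and deceptive
# statements via the Jaccard index, J(A, B) = |A ∩ B| / |A ∪ B|.

def vocabulary(statements):
    """Collect the set of unique lowercase tokens across statements."""
    return {token for text in statements for token in text.lower().split()}

def jaccard_index(truthful_statements, deceptive_statements):
    """Jaccard index between the two vocabularies (1.0 = identical)."""
    a = vocabulary(truthful_statements)
    b = vocabulary(deceptive_statements)
    return len(a & b) / len(a | b)

# A lower score (e.g., in the 0.34-0.46 range reported above) indicates
# that truthful and deceptive statements draw on more distinct vocabularies.
```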
Overall, the descriptive linguistic analysis of the three datasets agreed well with existing studies in cognitive and memory-oriented approaches to verbal lie detection.
In line with the cognitive load framework, we observed that truthful opinions and narratives of autobiographical memories were characterized by greater complexity and verbosity, with opinions being stylistically more authentic and memories more analytical [35, 13].
In accordance with the Reality Monitoring (RM) framework [36], which states that truthful memory accounts tend to reflect the perceptual processes involved in experiencing the event, whereas fabricated accounts are constructed through cognitive operations, genuine memories exhibited higher scores in memory-related words, reflecting individuals’ efforts to recollect the event, and in words associated with spatial and temporal information (‘Contextual Embedding’), as well as an overall higher RM score. Conversely, deceptive memories showed higher scores in words related to cognitive processes (e.g., reasoning, insight, causation). Furthermore, in line with the Verifiability Approach, truthful memories contained more verifiable details, as indicated by the greater number of named entities referring to times and locations [22, 41]. The smaller number of named entities in deceptive memories may suggest that deceivers strategically omit potentially incriminating information even in low-stakes scenarios, such as in this study. To gain credibility, however, they may compensate by fostering a sense of social connection through self-references and mentions of other individuals, which may explain the greater number of ‘People’ named entities and self-references we observed. Finally, truthful memories were overall characterized by words with higher concreteness scores, supporting Kleinberg’s truthful concreteness hypothesis [42], which builds on the RM framework and the Verifiability Approach.
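As an illustration of how such verifiable details can be quantified, the sketch below counts time-, location-, and person-related named entities with spaCy. The entity label sets and the en_core_web_sm model are assumptions for illustration, not necessarily the pipeline used in our feature extraction.

```python
# Illustrative named-entity counting for the Verifiability Approach.
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

TIME_LABELS = {"TIME", "DATE"}
LOCATION_LABELS = {"GPE", "LOC", "FAC"}

def verifiable_entity_counts(text):
    """Count named entities that act as proxies for verifiable details."""
    counts = Counter(time=0, location=0, people=0)
    for ent in nlp(text).ents:
        if ent.label_ in TIME_LABELS:
            counts["time"] += 1
        elif ent.label_ in LOCATION_LABELS:
            counts["location"] += 1
        elif ent.label_ == "PERSON":
            counts["people"] += 1
    return counts

print(verifiable_entity_counts("I met Anna in Rome last Saturday at 6 pm."))
```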
However, the opposite trend emerged for truthful and deceptive opinions. Truthful opinions about abstract concepts were characterized by fewer concrete words and more cognition-related words, as previously shown [33], reflecting the reasoning processes that truth-tellers engage in when evaluating the pros and cons of abstract and controversial concepts (e.g., abortion). Conversely, deceivers tended to produce opinions more grounded in reality, as shown by higher scores in word concreteness, contextual details, and reality monitoring.
Finally, in line with previous literature on distancing framework [34, 43] and deceptive opinions [33, 19], deceivers utilized more other-related word classes (‘Other-reference’) and fewer self-related words (‘Self-reference’), confirming that individuals tend to avoid personal involvement when expressing deceptive statements.
Our findings on linguistic indicators of truthful and deceptive intentions are consistent with previous research claiming that genuine intentions contain more ‘how-utterances’, i.e., indicators of careful planning and concrete descriptions of activities, whereas false intentions are characterized by ‘why-utterances’, i.e., explanations and reasons for why someone planned an activity or why it would be carried out in a certain way [41]. Indeed, we observed that true intentions were more likely to provide concrete and distinct information about the intended action, grounding their statements in real-world experiences and providing temporal and spatial references. Additionally, true intentions were characterized by a more analytical style and a greater presence of numerical entities, suggesting that individuals were more involved in a specific and detailed execution of the plan. In contrast, false intentions exhibited more cognition-related words and expressions and were temporally oriented toward the present and past, suggesting that liars were more focused on providing explanations for their planning, likely in order to be believed. Furthermore, we found evidence in line with the claim that liars may over-prepare their statements [41], as indicated by higher verbosity and a greater number of self-references and mentions of people. Taken together, these findings suggest that deceptive individuals may attempt to appear more credible by providing excessive information and creating a sense of social connection through self-references and mentions of people.
In Scenario 1, we fine-tuned FLAN-T5 in its small and base versions to perform lie detection as a classification task. This fine-tuning process yielded promising results when applied to a single dataset (i.e., opinion vs. memory vs. intention), with the base version achieving higher accuracy. Model size influenced performance, likely because a larger model can learn a richer representation of the linguistic patterns of genuine and deceptive narratives.
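A minimal sketch of this text-to-text framing is given below: the statement is the input sequence and the class label is the target sequence. The prompt wording, learning rate, and single-example step are illustrative assumptions, not our exact training configuration.

```python
# Hedged sketch of fine-tuning FLAN-T5 for lie detection as a
# text-to-text classification task.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def training_step(statement, label):
    """One fine-tuning step on a single (statement, label) pair."""
    inputs = tokenizer(statement, return_tensors="pt", truncation=True)
    targets = tokenizer(label, return_tensors="pt").input_ids
    loss = model(**inputs, labels=targets).loss  # standard seq2seq loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

training_step("Is the following statement truthful or deceptive? "
              "I visited Rome last summer.", "truthful")
```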
However, there appear to be no universal rules that the model can learn to distinguish truthful from deceptive statements and thereby generalize the task across different contexts. This was highlighted in Scenario 2, in which the FLAN-T5 model performed at chance level when trained on two datasets and tested on the third (e.g., train: opinion + memory; test: intention). Given that the three datasets differ substantially in the content and linguistic style by which truthful and deceptive narratives are delivered, the model appears to engage in domain-specific learning, tailoring its classification capabilities to the specific domain of deception.
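Concretely, Scenario 2 follows a leave-one-dataset-out scheme, sketched below with toy examples; the data layout is an assumption for illustration.

```python
# Sketch of the leave-one-dataset-out protocol of Scenario 2:
# fine-tune on two contexts, evaluate on the held-out third one.

def leave_one_out_splits(datasets):
    """Yield (train_examples, test_examples, held_out_name) triples."""
    for held_out in datasets:
        train = [example for name, examples in datasets.items()
                 if name != held_out for example in examples]
        yield train, datasets[held_out], held_out

datasets = {  # toy (statement, label) pairs for illustration
    "opinion":   [("I believe abortion is wrong.", "truthful")],
    "memory":    [("Last year I hiked the Alps.", "deceptive")],
    "intention": [("Next week I will visit Paris.", "truthful")],
}
for train, test, held_out in leave_one_out_splits(datasets):
    print(f"train on {len(train)} examples, test on '{held_out}'")
```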
In Scenario 3, we fine-tuned FLAN-T5 on the three aggregated datasets (i.e., opinion + memory + intention). Our findings demonstrate that the base version of the model can effectively classify all the datasets without compromising performance on any individual dataset when compared to Scenario 1. However, the ability to accurately detect deception in a multi-context scenario depended on model size. Specifically, the small model in Scenario 3 exhibited slightly reduced performance on each individual dataset compared to Scenario 1. This disparity may be attributed to the limited capacity of FLAN-T5 small, which hinders learning all the distinctive features of the three datasets simultaneously. Consequently, to classify deception across different contexts, the small model must relinquish certain specialized abilities that benefit specific datasets. Conversely, FLAN-T5 base, with its larger size, can comprehend and integrate the features of the three distinct datasets, thereby maintaining consistent performance across all of them.
Findings from Scenarios 2 and 3 suggest that LLMs, despite having acquired a comprehensive understanding of language patterns, still require exposure to prior examples to accurately classify deceptive texts within different domains. Overall, the results obtained from FLAN-T5 in its small and base versions surpassed the performance of Transformer models previously employed in the literature on the Opinion [28] and Intention [44] datasets.
To improve the explainability of the model’s predictions, we investigated whether the linguistic style that characterizes truthful and deceptive narratives could play a role in the model’s final decision. Overall, truthful and deceptive statements in the misclassified sample did not differ significantly on any linguistic feature extracted with the DeCLaRatiVE stylometry technique. The only exceptions were fold 1 and fold 6, which showed significant differences in text readability and the Reality Monitoring score, respectively. No significant differences in linguistic features were detected in any fold between deceptive statements correctly classified as deceptive (True Negatives) and truthful statements misclassified as deceptive (False Negatives), with the exception of the Contextual Embedding score in fold 7. Finally, truthful statements correctly classified as truthful (True Positives) and deceptive statements misclassified as truthful (False Positives) exhibited significant differences in the Reality Monitoring score in four of the ten folds, suggesting that deceptive statements with higher RM scores may mislead the model into classifying them as truthful.
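The per-fold group comparisons described above can be illustrated as follows. We show a Mann-Whitney U test as an example of a non-parametric two-group comparison; the exact statistical procedure applied in our analysis may differ.

```python
# Illustrative per-fold comparison of one linguistic feature between two
# classification-outcome groups (e.g., True Positives vs. False Positives).
from scipy.stats import mannwhitneyu

def compare_feature(group_a, group_b, alpha=0.05):
    """Return the p-value and whether the two groups differ significantly."""
    _, p_value = mannwhitneyu(group_a, group_b)
    return p_value, p_value < alpha

# Toy Reality Monitoring scores for TPs vs. FPs in one fold.
p, significant = compare_feature([4.2, 3.9, 5.1, 4.4], [4.8, 5.3, 4.9, 5.6])
print(f"p = {p:.3f}, significant = {significant}")
```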
Altogether, most of the analyzed folds showed a complete overlap in the linguistic style of misclassified truthful and deceptive statements, suggesting that the linguistic style characterizing truthful or deceptive narratives is a feature the model may rely on for its final prediction. The model may therefore exhibit poor classification performance when statements possess ambiguous features, such as deceptive statements delivered in a style similar to that of truthful statements.
In contrast, correctly classified statements displayed a cluster of linguistic features associated with the Cognitive Load framework [35] in most of the ten folds, specifically low-level features related to the length, complexity, and analytical style of the texts, which may have enabled the distinction between truthful and deceptive statements. According to this framework, a plausible explanation for these findings is that liars experience increased cognitive load while fabricating fake responses, as they must check their congruency with other fabricated information to maintain credibility and consistency [45], thereby producing shorter and less complex sentences.
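The low-level features in question can be extracted as in the sketch below; the specific readability metric (Flesch reading ease via the textstat package) and the sentence-splitting heuristic are assumptions for illustration.

```python
# Illustrative extraction of low-level Cognitive Load features:
# length (verbosity), complexity, and readability.
import textstat

def cognitive_load_features(text):
    words = text.split()
    sentences = max(text.count(".") + text.count("!") + text.count("?"), 1)
    return {
        "word_count": len(words),                       # verbosity
        "avg_sentence_length": len(words) / sentences,  # complexity proxy
        "readability": textstat.flesch_reading_ease(text),
    }

print(cognitive_load_features("I went to the store. I bought some milk."))
```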
At the time of writing and to the best of our knowledge, this is the first study involving the use of an LLM for a lie-detection task. The main advantage of our approach lies in its applicability to raw text without the need for extensive training or handcrafted features. We highlighted the importance of a diversified dataset for achieving good, generalized performance. We also identified a crucial balance between dataset diversity and LLM size, suggesting that the more diverse the dataset, the larger the model required to achieve high accuracy. Therefore, future work could explore the inclusion of new datasets, different LLMs (e.g., the more recent GPT-4), different sizes (e.g., the FLAN-T5 XXL version), and different fine-tuning strategies to investigate the variance in performance on a lie-detection task. Furthermore, our fine-tuning approach completely erased the model’s previous capabilities; future work should therefore also focus on fine-tuning strategies that do not compromise the model’s original capabilities.
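One such strategy could be parameter-efficient fine-tuning, which freezes the pre-trained weights and trains only small adapter matrices, mitigating the erosion of the model’s original capabilities. The sketch below uses LoRA via the peft library; it is an illustration of the idea, not a method employed in this study, and its hyperparameters are assumptions.

```python
# Illustrative LoRA setup for FLAN-T5 with the `peft` library.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                        # rank of the adapter matrices
    lora_alpha=16,
    target_modules=["q", "v"],  # attention projections in T5 blocks
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction is trainable
```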
Despite the demonstrated success of our model, three significant limitations impact the ecological validity of our findings and their practical application in real-life scenarios.
The first notable limitation pertains to the narrow focus of our study, which concentrated solely on lie detection within three specific contexts: personal opinions, autobiographical memories, and future intentions. This restricted scope limits the model’s ability to accurately classify deceptive texts from other domains. A second limitation is that we exclusively considered datasets developed in experimental set-ups designed to collect genuine and completely fabricated narratives. However, in real-life scenarios individuals frequently employ embedded lies, in which substantial portions of their narratives are true, rather than fabricating an entirely fictitious story. Finally, the datasets employed in this study were collected in experimental low-stakes scenarios, in which participants had little incentive to lie and appear credible. Because of these issues, the application of our model in real-life contexts may be limited, and caution is advised when interpreting the results in such situations. These limitations underscore the need for future research to address these concerns and expand the applicability and generalizability of lie-detection models in real-life settings.