Novel AI to avert the mental health crisis in COVID-19: Novel application of GPT2 in Cognitive Behaviour Therapy

The effect of the COVID-19 pandemic on mental health is substantial. The World Health Organization has called for action to avert an impending mental health crisis. To respond to this call, this paper contributes a novel application of Deep Learning in Natural Language Generation (NLG) to seed healthy thoughts for mental health therapy. For the first time in the literature, a transfer-learning-capable large neural network with more than 100 million parameters is proposed and demonstrated for an NLG based mental health therapy application. This AI is designed to achieve scalable impact for millions of families with a timely health intervention in a privacy-safe approach. To the best of our knowledge, this is the first research paper to apply GPT2 (Generative Pretrained Transformer) to Cognitive Behavior Therapy (CBT). Further, the paper demonstrates the proposed neural network architecture with a lab prototype implementation with reproducible results. This paper demonstrates this AI's ability to generate conditional synthetic human-like text intended to seed a healthy mental outlook. This is accomplished by fine-tuning a pre-trained GPT2 language model. The source code and video demonstration are contributed at https://sites.google.com/view/ai-in-mental-health. Also, for the first time in the literature, a novel idea of NLU (Natural Language Understanding) activated NLG therapy is demonstrated with reproducible results, using a BERT based classifier to activate the GPT2 based therapy. Performance of GPT2 models of three different sizes (124, 355, 774 million parameters) was the same for a very small dataset, thus a small GPT2 model is suggested for on-device AI inference. This AI is a step forward in responding to WHO's call for action to avert the crisis. Towards addressing all the three dimensions of the monumental challenge, the paper designed a novel AI architecture by taking advantage of both BERT & GPT2.
It also demonstrated the feasibility of Transformers-based AI for developing a mental health therapy solution. Further, this paper contributed an open-source AI prototype to support research communities to transform global mental wellness.

1. Introduction
1.1 Need: To avert the looming mental health crisis for millions of families
COVID-19 has a significant consequence on mental health across a vast population as per experts [1].
Mental health and wellness are emerging as a significant and urgent concern for a vast majority of the world population, as per a recently published report by the World Health Organization (WHO) [31].
Experts call for an urgently deployable intervention, as per publications in JAMA [1] and Lancet [2, 32].
As a response to this call for action, this paper contributes an AI to enable research communities to design a solution to avert the impending mental health crisis. Though this paper demonstrates an AI solution and develops a prototype implementation in the English language, future researchers may extend it to other languages, given the spread of COVID-19 across countries.

3 Existing literature and novelty in this paper
a. Novel application of GPT2 for CBT: There are not many research publications on the applications of the latest advancements in AI for mental health therapy. Specifically, the gap is in the application of state of the art NLG such as GPT2.
While there is a lot of work in applications of state of the art NLU for mental health diagnosis [5], there is a gap in applying state of the art NLG for mental health therapy.
While research papers had attempted therapy using NLG, older Deep Learning techniques were applied, and the opportunity to apply the latest Deep Learning techniques remains untapped. The power of the latest advances such as GPT2 opens up the opportunity to create human-like English narratives. More significantly, the possibility of conditioning the generated language using transfer learning allows AI based therapy to help individuals. This breakthrough potential was untapped in the literature. This untapped opportunity offers a clear path to avert the looming mental health crisis from the COVID-19 pandemic. This paper presents this untapped opportunity and reports the conceptual advance by proposing an AI based architecture, its feasibility and a prototype with reproducible results, and an open-source contribution to help fellow researchers.
Very few research publications have explored the application of recent advances in NLG (Natural Language Generation) for mental health therapy. While the advancements in Deep Learning for NLG have been staggering in recent years, the potential uses of these AI advancements to create solutions for mental health remain untapped. An advancement in the year 2019 in the field of Deep Learning based NLG is GPT-2 [25]. GPT-2 (Generative Pretrained Transformer) [25] is a powerful language model built using Attention mechanism [18] based Transformers [28]. GPT-2 is a language model that can perform natural language processing tasks such as question answering, text completion, reading comprehension, and text summarization. GPT2 is capable of generating human-like language. It was demonstrated to create fake news [21], create poetry [15], and generate image captions [20]. This was possible due to the transfer learning capability [13] recently made possible in Transformers architecture [25] based NLG, opening up the "imagenet moment in NLG". However, there is a lack of research papers on the application of GPT-2 for mental health therapy, especially at a time when its application can transform the way the world responds to the titanic challenge of a looming mental health crisis. This paper is an attempt to bridge this gap, which can lead to a breakthrough approach to avert the forthcoming crisis.
A search on Google Scholar for the keyword combination of "Cognitive Therapy" and "GPT2" returns few results. A Google Scholar keyword search on "Cognitive Behaviour therapy Deep Learning GPT2" or "mental health therapy Deep Learning GPT2" or "Cognitive therapy Deep Learning GPT2" or "CBT Transformers GPT2" returns only a few research publications, and hence the gap identified is presented in Table 1.
To the best of our knowledge, this is the first research paper to apply the latest AI approaches such as GPT2 in Cognitive Behavior Therapy [4]. This paper's work on this unexplored area opened the doors to an AI architecture that is capable of averting the future mental health crisis from the COVID-19 pandemic. One of the mental health therapies is CBT or "Cognitive Behaviour Therapy" [6]. CBT is a therapy technique that can help people find new ways to behave by changing their thought patterns [7,8,9]. A novel NLG based Cognitive Behavior Therapy model is proposed in this paper.
There are substantial research publications on NLU (Natural Language Understanding) models such as BERT [10] for diagnosis of sentiment [34] or sensing of emotions [43]. However, there are only very few publications on NLG for mental health therapy. A survey paper [33] that took stock of a decade of studies across 139 papers shows that a lot of effort has gone into diagnosis, but there is significant scope for future research on novel applications of NLG for mental health. A paper in Nature Scientific Reports [34] also shows efforts on the topic of diagnosis. It shows progress in applying Deep Learning to classification problems, to classify emotions or mental health conditions. However, the opportunity to apply Deep Learning to synthesis using NLG is a relatively unexplored research theme in the context of mental health, and even more specifically in the context of CBT.
There are attempts to use NLG for therapy using older Deep Learning techniques such as LSTM. However, not many papers have explored the use of the latest Deep Learning techniques for mental health therapy. Transformers [28] based language models represent the latest advancements in Deep Learning based NLG. While there were earlier attempts to use NLG for therapy, such as in the year 2017 [35], not many research papers explored the latest advancements in NLG. Since GPT2 emerged only in 2019 [25], the opportunity to apply state of the art NLG models such as GPT2 has only now opened up. While 2019 saw the evolution of GPT-2, 2020 saw the introduction of a more powerful language model called GPT-3. Though GPT-3 [43] is a successor to GPT-2, it is not suited for edge AI due to its massive size. Since GPT-3 is only available as a cloud API requiring transmission of the user's thoughts via the internet, it is not suitable to meet the privacy requirements of a mental health solution. Due to the power of transfer learning [13,27] since the 'Imagenet moment in NLP' arrived, it is possible to produce a language model to suit a particular requirement such as poetry [15]. Transformers based NLP architectures are capable of performance close to that of humans. For instance, it may be hard for humans to find out whether an AI generated a piece of fake news [21]. This power of AI to generate short text narratives close to human performance can be tapped into for designing novel health therapy solutions. The application of GPT2 like architectures for generating human-like narratives in mental health therapy solutions is a game-changing idea.
b. More scalable than existing approaches: The need for a pandemic scale solution has been stated earlier.
In contrast to cloud-based crawling of social media posts, the proposed on-device AI architecture opens the possibility of providing mental health solutions to a larger population.
Scalability is a significant need in the context of the pandemic scale mental health crisis from COVID-19 for millions of families across the globe [2]. This paper makes a unique contribution by proposing an approach that is much more scalable than existing approaches. In addition, there is another unique dimension in this paper that enables a significantly larger percentage of the population to be offered mental health support compared to approaches in the existing literature. The architecture proposed in this paper offers a leap in the addressable proportion of the world population. To the best of our knowledge, this is the first paper to propose a novel edge AI architecture to deliver mental health support on the user's smartphone. This edge AI based architecture is presented in Figure 6. Most existing literature [5] focuses on predicting mental health from publicly accessible social media posts, which limits the percentage of the population to those netizens who are actively posting public tweets about personal sensitive thoughts. In contrast, this paper proposes an architecture where AI inference happens on the user's smartphone, thus allowing the opportunity to screen a larger population. Gboard by Google [14] is a downloadable predictive keyboard for smartphone users, as shown in the screenshot of Figure 4. Similar to the AI embedded Google predictive keyboard on Android smartphones [14], which auto-corrects spelling errors as users type in words, the proposed approach of embedding AI on the smartphone will enable significant scalability.

c. A unique idea of intelligent activation of therapy
The conceptual advance in "NLU activated GPT2 based therapy" is another unique aspect of the contributed AI architecture.
Once a BERT classifier detects a sequence of depressing thoughts on the user's personal smartphone, the GPT2 predictive keyboard can be activated to provide self-help to weed out thoughts that can lead to habituation of depression. This is also the first time in the literature that an AI based therapy is activated intelligently by another AI that keeps tabs on the candidate's mental health. This novel AI architecture is presented in Figure 6.

d. More privacy safe than existing approaches
While cloud chatbot based counseling has been attempted, the concern is in sending sensitive data on one's personal thoughts to a cloud based server. In contrast, this paper develops a privacy-safe approach from the ground up, designing the architecture such that one's sensitive data doesn't leave the boundaries of her personal smartphone.
Privacy safety design from the ground up is a crucial requirement for digital mental health counseling solutions. The privacy safety of the proposed approach makes this approach to AI-based therapy further unique in the literature. While current attempts at counseling employ an AI chatbot [11,19,24,41], the user's private data is sent across the public internet to the cloud based chatbot engines. In contrast, in this paper's proposed architecture, the AI inference happens on-device in the personal smartphone. The privacy safety aspect of the approach is illustrated in Figure 6. As per this figure, the user's sensitive personal information doesn't leave the boundaries of her personal smartphone. The proposed solution is privacy safe compared to other approaches, and this is illustrated in the video in URL.

e. A unique design to enable Early Intervention & Protection of sensitive data
Early intervention in mental health for millions is critical during a pandemic.
Handling of sensitive personal data should be done with utmost care.
There are limitations to employing a cloud-based chatbot approach. Cloud chatbots are not conducive to early intervention.
In contrast to the cloud chatbot based approach, on-device AI inference is proposed. Adding a mental health AI to a smartphone's keyboard allows for widespread deployment and early intervention at scale to avert the pandemic scale crisis.
The significance of early intervention in mental health is stated in JAMA [44], and the importance of proactive and early intervention can't be overstated [44]. Chatbots [11] require initiation by the user, which limits early intervention and scalable user adoption. Given the scalability needed to activate mental wellness for millions of families, a chatbot may not yield sufficient user adoption. In the early stages of depression, the affected individuals may not be aware that they need help. This lack of awareness of one's own mental health status in the early days of depression [44] can stop an individual from asking for help from a web-based chatbot.
In contrast to chatbots, approaches such as Gboard [14,16] can enable a systemic proactive early intervention at scale for the global population. The expert review in [4] calls out the cons of collecting personal mental health data. From a privacy point of view, patients' personal data can't be sent over the internet to cloud-based chatbots. In contrast, the proposed approach is an AI that runs on-device. This AI is an NLG based CBT inspired self-help therapy, packaged in the form of a predictive keyboard such as Gboard [14]. The proposed idea is illustrated in Figure 2 and

Contributions:
A call for action to avert the forthcoming mental health crisis was issued recently by the United Nations (UN) [3]. Towards addressing this mental health crisis fuelled by the COVID-19 pandemic [2], the contributions of this paper are as follows.
1. A scalable AI solution for mental health intervention: Given the pandemic's impact, the proposed solution needs to be scalable to millions of families. It also needs to be a timely health intervention for millions of people, and it should be privacy safe. This work proposes and demonstrates an AI-based strategy with a reference solution in response to this call for multi-disciplinary research on mental health [2]. The focus of this paper is applying the latest advancements in AI for therapy, though the mental health solution covers both therapy and screening. The solution to avert the crisis is based on both therapy using NLG (Result #1) and screening using NLU (Result #2). Both therapy and screening can be seen in Figure 1. The proposed mental health solution and the accompanying AI architecture are demonstrated with a working lab prototype with reproducible results. The lab prototype is implemented and open-sourced to enable interested research communities to avert the crisis.
2. Novel application of GPT2 in therapy: This paper proposes and demonstrates a novel AI based approach to provide mental health therapy. The novelty lies in applying recent advances in Deep Learning based Natural Language Generation (NLG) to one of the commonly used mental health therapies, Cognitive Behaviour Therapy (CBT). As described earlier in the literature review section, this paper explored the untapped potential of applying state of the art advances in language models to create a mental health therapy solution. The possibility of conditioning the generated language using transfer learning allows AI based therapy to help individuals develop a positive way of thinking and seeing situations. Like auto-correction of spellings in a smartphone's predictive keyboard, this AI "auto-corrects" the narratives being typed by a mentally depressed person. Natural language generation produces text as the user starts typing his thoughts, so that the user can view the situation through a "positive lens". An NLG enabled keyboard can be downloaded onto the user's smartphone to offer care instantly, enabling early correction of unhealthy thoughts. The proposed AI embedded mental health predictive keyboard is illustrated in Figure 4. Since GPT2 inference computation happens in less than a few seconds, this coaches the user to have healthy thoughts while maintaining privacy. Since few research publications have explored the applications of state of the art AI for language generation in self-help based therapy techniques such as CBT, this paper contributes to filling this gap. It proposed an AI architecture based on Transformers based NLG and novel AI approaches such as NLU activated NLG. It also demonstrated the feasibility of the proposed AI with a prototype implementation. The proposed AI opens the door to developing an AI-based solution to avert the looming pandemic scale crisis for millions of families with a timely health intervention.
It is also designed to ensure privacy through on-device AI inference, even though CBT often involves handling of very sensitive personal information. A pre-trained GPT2 model was fine-tuned to generate human-like English by training on a synthetic dataset. Three different GPT2 models were compared, concluding that a small-sized language model may be a good choice for on-device AI inference. Novel ideas such as NLU activation of GPT2 based therapy ensure timely intervention in a wide deployment for a large population. A prototype of BERT activated GPT2 based therapy is demonstrated. The demonstration shows that the GPT2 based self-help therapy is activated only when appropriate. Experts have alerted us to a crisis arriving sooner or later [40]. Against this backdrop of the forthcoming crisis, this AI demonstrates a significant leap forward in developing an approach to prevent the impending mental health crisis. This work shows it is possible to use AI to prevent the looming mental health crisis.
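The "activated only when appropriate" behaviour of the BERT activated GPT2 therapy can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the keyword check `is_negative` is a hypothetical stand-in for the BERT classifier, the `generate` callable is a hypothetical stand-in for the fine-tuned GPT2 model, and the threshold of consecutive negative messages is an assumed parameter.

```python
from collections import deque

# Hypothetical stand-in for the BERT classifier: a message counts as negative
# when it contains any cue word. A real system would call a fine-tuned model.
NEGATIVE_CUES = {"hopeless", "worthless", "failure", "useless"}


def is_negative(message):
    """Return True if any word of the message is a negative cue word."""
    return any(word.strip(".,!?") in NEGATIVE_CUES for word in message.lower().split())


class TherapyGate:
    """Turn the generator on only after `threshold` negative messages in a row."""

    def __init__(self, threshold=3):
        # Keeps only the most recent `threshold` classification results.
        self.recent = deque(maxlen=threshold)

    def observe(self, message):
        """Record one typed message; return True if therapy should be active."""
        self.recent.append(is_negative(message))
        return len(self.recent) == self.recent.maxlen and all(self.recent)


def respond(gate, message, generate):
    """Call the (hypothetical) generator only while the gate is active."""
    return generate(message) if gate.observe(message) else None
```

Keeping the gate separate from the generator mirrors the idea that one model decides when to intervene and another decides what to suggest, with both running on the device.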
Experts have called for multi-disciplinary teams [2] to get involved in averting the crisis. Accordingly, to encourage research communities to avert the forthcoming mental health crisis, the paper contributes an open-source AI prototype. While this paper focuses on the Deep Learning aspects of the solution, future multi-disciplinary research teams can extend this AI from the mental health perspective. Though this prototype is in English, future AI researchers can extend this AI to many world languages, as many countries need this solution. The idea of applying NLG for therapy is presented in Figure 2. In this idea, an AI resides in the smartphone of the user and helps to improve mental wellness by suggesting flexible language narratives. Similar to the auto-correction of spellings in a smartphone's predictive keyboard, the proposed AI solution "auto-corrects" the narratives being typed on the user's personal smartphone, as illustrated in Figure 4. This paper infuses AI into one of the popular mental health therapy approaches, called Cognitive Behaviour Therapy (CBT) [4]. CBT is a therapy technique to help people find new ways to behave by changing their thought patterns.
With the power of transfer learning and human-like text generation capabilities based on GPT2, the potential to apply NLG in therapy is significant. A lab prototype demonstrates this result, and a short video recording of a demo is presented online at https://sites.google.com/view/ai-in-mental-health.
1.6 Potential impact in future: Enabling research communities
GPT-2's ability to generate conditional synthetic text samples of unprecedented quality, combined with transfer learning, opens up an opportunity for mental health professionals to scale their impact to support millions of people on a continuous online basis. This is the potential that can be unlocked to avert the challenge of mental health impacted by the COVID-19 pandemic. This paper presents this potential by developing and demonstrating state of the art AI.
Researchers can take advantage of this opportunity to apply state of the art AI techniques to improve mental health. Further to presenting novel ideas, the paper also contributes an entire AI prototype solution in open source. The scope of the prototype implemented is specified in Table 2. This is intended to encourage interdisciplinary research communities in their future research.

Organization of the paper
As presented in Table 2, Contribution #1 is around enabling a solution to avert the crisis, and Contribution #2 is around novelty. The paper's results section is organized as follows: The primary result of this paper is Result #1. While Result #1 presents the novel application of AI for mental health, Result #3 expands on the implementation of Result #1 from a Deep Learning viewpoint. NLG in CBT is presented as both Result #1 and Result #3, while NLU in screening is presented as Result #2. Future research by the community can expand this AI into a real-world deployment to avert the crisis.
The next section on Results presents the three results. For each of the three results, the respective methods and discussions of the literature are organized in each sub-section.

Results, Methods, And Discussions
Given the mounting worldwide mental health impact reported recently in Oct 2020 by Lancet's editorial [31], the focus is on designing an AI approach that can address all the 3 dimensions of the challenge: a scalable approach to impact millions through a timely intervention in a privacy-safe way. BERT activated GPT2 for the generation of short narratives for seeding healthy thoughts is demonstrated. The sample narratives generated by AI are shown to look similar to, or as good as, narratives generated by human counselors. To offer CBT like self-help to correct one's outlook towards a situation, a conditional language model is created using transfer learning. Human-like text generation and the synthesis of conditioned language are demonstrated. The sample outputs in Table 4 show that the sentences generated by AI are as good as ones spoken by humans, thus showing the feasibility of employing AI to scale mental health counselors' impact. Mental health experts can scale their productivity by training an AI, which can assist their patients in their absence. This is important given the need to support millions of families and the reported shortage of capacity in mental health services [31,32]. The use of such AI also ensures both early and timely activation of therapy. Further, an NLU activated NLG based therapy ensures activation at the right time. A Human-Computer Interaction approach of a proposed mental health keyboard with on-device AI inference ensures 24x7 mental health assistance for millions of individuals. The on-device AI inference supports the privacy and protection of one's personal information.
Comparing the performance of different GPT2 models shows that a small-sized GPT2 with 124 million parameters is a suitable choice for widespread on-device deployment on smartphones. Future research can extend the performance on smartphones using DistilGPT2.
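A rough size estimate illustrates why the smallest model is attractive for on-device inference. The sketch below uses the three parameter counts named in the abstract (124, 355, and 774 million); the bytes-per-parameter values are assumptions about numeric precision, not figures from the paper, and the estimate covers the weights only, ignoring activations and runtime overhead.

```python
# Parameter counts (in millions) of the three GPT2 sizes named in the abstract.
GPT2_PARAMS_MILLIONS = {"small": 124, "medium": 355, "large": 774}

# Assumed storage cost per parameter for common precisions (not from the paper).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_size_mb(params_millions, dtype="float32"):
    """Approximate size of the weights alone, in megabytes (10^6 bytes).

    (params_millions * 10^6 parameters) * bytes_per_param / 10^6 bytes per MB
    simplifies to params_millions * bytes_per_param.
    """
    return params_millions * BYTES_PER_PARAM[dtype]


for name, millions in GPT2_PARAMS_MILLIONS.items():
    sizes = ", ".join(
        f"{dtype}: {weight_size_mb(millions, dtype)} MB" for dtype in BYTES_PER_PARAM
    )
    print(f"GPT2 {name} ({millions}M parameters) -> {sizes}")
```

Under these assumptions the smallest model needs roughly 0.5 GB for 32-bit weights, while the largest needs about 3.1 GB, which is a substantial share of memory on many smartphones.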
Among the 3 results, Result #1 represents the primary result of this paper. While novel ideas are presented in Result #1 and Result #3, the paper doesn't make any novelty claims for Result #2. The two contributions and the three results are specified in Table 2.

2.1 Result #1: Novel idea of NLG in therapy to avert the crisis: Design of an AI for early intervention in mental healthcare for millions while ensuring safety of sensitive info
Challenges addressed by this result:
1. WHO (World Health Organization) reported the need for action on mental health [3]. The challenge is to design mental health therapy to help the individual look at a given situation from a different perspective, in order to lead to mental wellness [8]. As an illustrative example, in a situation of a person losing a job during the COVID-19 pandemic, the AI should help the individual develop a positive inner belief [6]. The challenge is to design an AI based therapy by applying the state of the art advancements in Deep Learning in NLG.
2. The challenge is designing an approach that can meet the demands of a pandemic scale mental health solution [2]. The need is a scalable solution to serve millions of families with 24 hours of continuous support for each of the families.
3. Early intervention is an implementation challenge for 21st century mental health care as per JAMA [44]. Early intervention can be defined as diagnosis and treatment at the earliest possible point, even pre-symptomatically [44]. Today's challenge in mental healthcare is that treatments are typically deployed late and without the strategic goal of reducing the progression of the illness, as per JAMA [44].
4. Privacy is a concern, as mental health counseling often involves sharing of sensitive personal information with the therapist [24].
5. There is a reported shortage in the capacity of mental health services as per WHO's assessment in the year 2020 [31]. Given the pandemic, tools to significantly multiply mental health professionals' productivity are essential [24]. Going forward, a mental health therapist should be able to care for a significantly larger number of patients.

Methods & Discussions:
a) A response to the call to action to avert the forthcoming crisis: The proposed AI is designed as a response to WHO's call to action on global mental health [3,31]. As per WHO, almost 1 billion people suffer from a mental disorder [38]. Around 1 in 5 of the world's adolescents suffer from a mental disorder, as per WHO facts [38]. The economy loses US$ 1 trillion every year in productivity because of depression or anxiety [38]. The Lancet editorial in Nov 2020 reports the monumental effects of the COVID-19 pandemic on mental health [40]. The editorial [32] raises the following concern: it is unclear how the world will deal with this forthcoming crisis, as the capacity of mental health services to respond at such a large scale doesn't exist today. Hence the need for a scalable approach like the one proposed in this paper becomes significant.
b) The vast untapped potential of Transformers based AI architecture for mental health therapy: Though Deep Learning research is progressing at an amazing pace, the opportunity to apply a Transformers architecture [28] powered text generator [25] to improve mental health is not yet explored in the existing literature. Given the forthcoming crisis, it is imperative to explore the feasibility of employing the latest advances in AI for mental health. In order to avert the titanic crisis, this untapped potential is explored in this paper. This paper presents a novel idea at the intersection of CBT [17] & AI.
As per the May 2020 expert review [4,5], the opportunity to apply a powerful deep neural network of the order of 100 million parameters to mental health therapy solutions, to avert the looming mental health crisis, is less explored in the literature. The benefit of such a neural network is human-like performance in language modelling [21]. The ability to fine-tune the models using the transfer learning technique [13] allows the computer-based generation of language that is as close as possible to that of human counselors [6,24]. One reason is progress in large NLG models such as GPT2 (Generative Pretrained Transformer). GPT-3 [43], introduced in 2020, is too large for on-device AI inference on smartphones. Unlike GPT-2, GPT-3 is only accessible as a cloud API, so GPT-3 can't meet the privacy requirement of on-device AI inference. Hence GPT-3 is not suitable for consideration for this mental health challenge. Using GPT2 to help patients practice Cognitive Behavior Therapy (CBT) to change their negative view, all in real time and with privacy safety, is a new idea. Further, given the scale of the problem, the therapy has to be proactive rather than user-initiated. The idea of a timely health intervention in a privacy safe approach for millions of families by a novel application of GPT2 is made possible now due to the advancements in AI, and this paper proposes this idea and demonstrates it with a working prototype.
c) Digital interventions in mental health approaches such as Cognitive Behavior Therapy (CBT):
Experts have called for the involvement of multi-disciplinary research [2] to avert the forthcoming crisis.
While this paper's primary highlight is on Deep Learning applications, this paragraph introduces mental health therapy concepts such as Cognitive Behaviour Therapy (CBT).
CBT is a popular form of mental health therapy. Cognitive Behavior Therapy (CBT) [17] is a psycho-social intervention that aims to improve mental health. The JAMA article [9] reported on the effect of early intervention using Cognitive therapy. But the number of people who need help is multiple orders of magnitude higher due to the pandemic, hence the JAMA article [1] calls for creative thinking in treatment. The Lancet Psychiatry position paper calls for digital interventions [2]. This delineates the potential of applying Deep Neural network based "Digital Cognitive therapy". From the perspective of an interdisciplinary researcher, the opportunity for AI-based CBT is proposed in this paper.
The concept of AI based CBT is presented in Figure 2. As shown in the figure, the circle of thoughts and inner beliefs of an individual can be 'influenced' by helping the person to change the way he 'perceives' the situation. The figure also illustrates an example where AI helps the individual to perceive his loss of a job from the viewpoint of his strengths. The video in the URL further articulates how AI is able to sow healthy thoughts.
d) AI for transforming Early Intervention in mental health care:
Early intervention is an implementation challenge for 21st century mental healthcare as per JAMA Psychiatry [44]. Early intervention is significant in many healthcare settings, and the same applies to mental healthcare. Early intervention is giving care at the earliest possible point or pre-symptomatically [44].
In contrast to visiting counseling centers physically, AI allows for CBT inspired self-help almost instantly. As of 8th Jan 2021, there is a lack of research publications on applying state of the art AI models such as GPT2 for mental health therapy. To the best of our knowledge, this paper is the first to apply GPT2 for CBT.
Novel application of GPT2 for mental health therapy solution is proposed.
The proposed concept of AI aided therapy is articulated in Figure 2.
The power of the Transformers based NLG architecture allowed for the generation of human-like narratives, where fine-tuning of the language model was performed by transfer learning.
The proposed GPT2 based CBT was demonstrated with a lab prototype implementation.
The feasibility of generating short sentences by AI that resemble human-generated sentences was experimentally demonstrated. The video captures the live demonstration. For reproducible results, the code is shared online.
Fine-tuning a pre-trained GPT2 on a synthetic dataset composed of around 5000 short sentences generated language narratives that help the person look at a situation from a more positive mental outlook.
The source code is contributed as open source. This can enable future work by research communities to avert the forthcoming crisis.
The idea of applying GPT-2 for mental health is rather unique in the research literature. GPT2 based CBT is attempted in this work against the backdrop of a lack of any research publications on GPT2 based CBT. Given the call for multi-disciplinary priority [2] to avert the looming global mental health crisis, contributing to this gap urgently is even more significant. This paper not only addresses this gap, but also encourages research communities to avert the crisis through 2 contributions as specified in Table 2.
The proposed idea of GPT2 based CBT is introduced to address the challenge articulated earlier. The novel concept of GPT2 in CBT is articulated in Figure 2. In the proposed model, a language model listens to the user's situation and helps her frame the narrative. A language model is simply an AI that predicts the next word in the sentence, given the previous set of words in the sentence. The example illustrated in Figure 2 shows how GPT2 helped a person who lost a job to develop a better 'outlook' to perceive the situation. The video demonstration in the URL (https://sites.google.com/view/ai-in-mental-health) shows various scenarios of how GPT2 based NLG can help 'tune' an individual's view or outlook. A fine-tuned GPT2 model is demonstrated to generate human-like text in this video as well as in the screenshots in Figure 2 and Figure 4. This demonstrated the novel concept of employing a fine-tuned GPT2 model towards a solution for mental health therapy. This demonstration of GPT2 providing human-like text to enable therapy is relevant to experts and leaders who aim to prevent the forthcoming mental health crisis caused by the COVID-19 pandemic. Different aspects of the idea are articulated in 4 different pictures: Figure 2, Figure 3, Figure 4 and Figure 6.
The conceptual idea of AI in Cognitive Behavior Therapy is proposed in Figure 2. As illustrated in the figure, the cognitive circle of thoughts and beliefs of an individual is intercepted by self-help based AI therapy. The diagram depicts how AI influences feelings and thoughts. The triangle at the center of the circle represents the person's beliefs, which can be influenced by the AI. Inspired by the CBT technique [17] of getting help by correcting one's beliefs about a situation, the AI offers self-help to correct one's belief for every situation. An example of a situation is a person losing a job due to the pandemic, and then getting depressed. Figure 2 illustrates an instance of how AI can help a person who may feel depressed after losing his job. In the illustrated scenario, he types or speaks his situation into his smartphone as "I lost my job, I am depressed". The GPT2 language model takes this initial phrase as an input and predicts the next words in the sentence. Based on this input, GPT2 generates a narrative such as "I lost my job, I am depressed. Let me keep remembering that I am smart". The screenshot in Figure 2 demonstrates this scenario. More examples that demonstrate the AI based self-help are shown in Table 4. A video demonstration of this AI based self-help therapy can be seen online at the URL. The conditional language generation was fine-tuned in such a way that positive beliefs are gradually sowed.
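The auto-completion behaviour described above can be illustrated with a deliberately tiny stand-in for GPT2. The sketch below is not the prototype's code: it assumes a toy bigram model, and the helper names `train_bigram` and `complete` are hypothetical. It only shows the principle of taking a typed phrase and repeatedly predicting the next word.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows


def complete(follows, prompt, max_new_words=8):
    """Greedily append the most frequent next word until none is known."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        candidates = follows.get(words[-1])  # .get avoids creating new keys
        if not candidates:
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical one-sentence "training set" for illustration only.
model = train_bigram(["let me keep remembering that i am smart"])
print(complete(model, "let me"))
```

A real GPT2 model conditions on the whole preceding context rather than only the last word, which is what allows it to continue a situation such as a job loss with a matching belief.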
Mental health experts compose narratives such as the one generated in Table 4, each containing a situation and a belief, into a training dataset. The training dataset used in this prototype had around 5000 such short sentences containing various situations. Each sentence in the dataset had a situation and the corresponding belief that can help the person to come out of depression. The dataset used in this lab prototype can be accessed at this URL. This dataset is a small dataset synthesized programmatically.
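As a rough illustration of how such a dataset could be synthesized programmatically, the sketch below pairs every situation with every belief. The function name `build_dataset` and the example phrases are assumptions made for illustration; the generator actually used for the prototype is the one shared at the paper's URL.

```python
from itertools import product


def build_dataset(situations, beliefs):
    """Pair every situation with every belief to form short training sentences."""
    return [f"{situation}. {belief}" for situation, belief in product(situations, beliefs)]


# Hypothetical inputs that a mental health expert might configure.
situations = ["I lost my job, I am depressed", "I feel alone at home"]
beliefs = ["Let me keep remembering that I am smart", "Let me remember that this will pass"]

for record in build_dataset(situations, beliefs):
    print(record)
```

With this construction the dataset size grows as the product of the two lists, so a few dozen situations and beliefs are enough to reach thousands of short sentences.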
Mental health professionals compile such a dataset and use it to train the AI to create a fine-tuned GPT2 model. This helps scale the number of patients who can be cared for by every mental health expert.
Thus mental health experts can leverage AI to multiply their productivity to achieve the broader objective of preventing the forthcoming mental health crisis across countries. This addresses the shortage of capacity in mental health professionals [31] to scale to millions of families. Mental health professionals fine-tune a pre-trained GPT2 model to create a new model using transfer learning. Transfer learning [13,22] was performed on OpenAI's pre-trained GPT2 model [25] as shown in Figure 3. This allows for human-like text narratives [21] to be generated by the new model, which, when read, may influence the thoughts, hence enabling the depressed individual to cultivate a positive mental outlook [8]. This Human-Computer Interaction model [33] is proposed to be similar to a user downloadable smartphone keyboard, such as the popular predictive keyboard on Android smartphones, Google Gboard [14]. A visual of the keyboard is shown in Figure 4. The AI model is embedded in the smartphone keyboard, where the AI inference happens locally on the smartphone device running TensorFlow Lite. So, similar to how a predictive keyboard such as Gboard helps auto-correct the spelling of what is being typed, the proposed keyboard helps correct the mental outlook to improve the mental health of the smartphone user. Once a smartphone is enabled with this AI, it equips the individual to think more positively, as shown in Figure 2. The user's thoughts, often in the form of speech or text, are fed into their personal smartphone and then analyzed with a privacy-safe technique. Privacy is enabled as the user's thoughts or text do not leave the smartphone; inference happens locally on the device as presented in Figure 6. Thus a novel conceptual advance in the application of the latest techniques in AI for averting the global mental health crisis has been contributed.
Additionally, the proposed AI design is demonstrated to be technically feasible using a working implementation. For reproducibility of results, the AI is contributed in open source at the website accessible at this URL. In addition, further novel ideas are contributed and demonstrated to evolve an AI solution to address the global mental health crisis. If the AI identifies a trajectory towards depression, conditional language modelling (NLG) is activated, as shown in Figure 6. An NLU model activates the NLG, based on detection of the mental resilience of the individual, as illustrated in Figure 6. At an appropriate instance, a GPT2 neural network based NLG (Natural Language Generation) transforms any depressive thoughts into something with a better outlook.
A method to detect mental resilience is discussed later as part of Result 2. Given a set of words as input, GPT2 outputs the next set of words to auto-complete the sentence. So this AI can be used to lead a stream of thoughts away from depression. This novel idea of AI in Cognitive Behavior Therapy is proposed in Figure 2. AI was demonstrated to generate a narrative that helps influence the inner belief for a situation. Also, the proposed AI architecture is presented in Figure 6.
The online demo of the AI-based prototype solution is at this URL. In the video at this URL, it can be seen how a fine-tuned GPT2 model can be employed by communities to help the individual gain a better outlook by fixing the internal beliefs. The fine-tuned GPT2 model is able to generate a language of gratitude and hope, even when the input is thoughts of loss and depression, as seen in the screenshot. Mental health assistance in real time, where self-correction is facilitated by AI using NLG while another NLU model keeps a tab on the person's mental health, was demonstrated in Figure 4 and Figure 6. The comprehensive NLU activated NLG implementation method is discussed in Result 3. The NLU based mental health detection is discussed later as part of Result 2. In short, Result #1 demonstrated the potential and feasibility of applying GPT2 to offer mental health care almost instantly to a potential candidate. This kind of early intervention in mental healthcare is much needed [44].
To summarize, mental health experts train a GPT2 model to multiply their impact to avert the crisis. The novel concept of AI in CBT shown in Figure 2 was demonstrated with a prototype implementation. The potential of the proposed AI architecture to address the 3 dimensions of the challenge is a notable discussion point. The proposed AI architecture presented in Figure 6 is designed to achieve scalability to millions, achieve early intervention in mental healthcare, and ensure that the sensitive personal information of the individual does not leave her personal device, for privacy safety. Thus this result is a step forward in the roadmap to apply recent advances in AI to avert the looming pandemic scale mental health crisis. The open-sourcing of the AI further encourages research communities.

Result #2: Implementation of a lab prototype of a NLU based detection of the state of mental health
Challenges addressed by this result: 1. Many reports have already established the monumental scale of the mental health challenge in COVID [2]. Given the vast majority of the population across multiple countries, a systematic and scalable strategy for proactive mental health screening and non-intrusive rapid diagnosis is necessary to avert the looming mental health crisis. Figure 1 shows the framework of a solution.
2. An idea of NLU based activation of therapy is later discussed as part of the next result, result #3.
This implementation of a NLU to detect the progression of mental health is discussed in result #2.
This NLU module is later re-purposed as a sub-module as part of result #3.
Methods & Discussions on NLU for detecting mental health: Very large deep neural networks, such as the transformer architecture based language model Google BERT, offer a significant ability to understand the English language, making them an excellent choice to understand what a candidate says using NLU (Natural Language Understanding) [12]. Transfer learning on BERT is proven to be a viable technique for understanding sentences in any domain [13]. There is an abundance of literature on NLU for social media listening [5,34]. Result #2 is about the application of NLU to detect the state of mental health of an individual.
Result #2 assumes significance in the context of the idea of NLU activated NLG, the details of which are described later as part of Result #3.
Result #2 is about implementing an NLU module to detect the progression of an individual's mental health.
The implementation is by application of BERT.
From a literature point of view, the paper doesn't claim any novelty in Result #2 on its own. Result #2 is presented here for two reasons. Firstly, it is re-purposed as a sub-module in Result #3. Secondly, it is part of a contribution to avert the crisis as specified in Table 2.
To address the scale of a pandemic, a very different approach is necessary. Any capability that enables experts to perform large scale screening or rapid pre-diagnosis is valuable to avert the looming crisis. The result obtained is a way to quickly analyze what a candidate has gone through during the past weeks. An ability to screen a large number of people, and to analyze the temporal patterns of every candidate quickly in the form of a visual report, makes rapid diagnosis by mental health professionals possible. An AI solution for large-scale screening is shown in Figure 7 to identify candidates who need diagnosis. For the shortlisted candidates, a visual report, as shown in Figure 5, is generated for each candidate. The visual reports show mental resilience, time-based swings of cognitive behavior and the recovery rate. This report enables rapid diagnosis by mental health therapists. Using transfer learning, a BERT [10] based binary classifier identifies whether the person is showing the language of a depressed person or exhibiting the signs of a person on a recovery path. The time to recover after the loss of a job or a family member is also explored. The mental resilience of a candidate can be understood by seeing the trend over a period of time. The proposed architecture for the detection of progression of mental health is shown in Figure 5. In this figure, a topic analyzer is cascaded with a mental health classifier, and then temporal modeling is performed. In short, a quick way for mental health professionals to pre-diagnose is demonstrated by the application of state of the art NLU. Thus the feasibility of screening at scale (Figure 7), along with rapid pre-diagnosis by mental health professionals using a cascade of BERTs based architecture (Figure 5), is demonstrated. An online demo is at this URL, https://sites.google.com/view/ai-in-mental-health/ai-in-diagnosis.
3. Early intervention and 24x7 availability of care
4. Activate assistance only when appropriate: Intelligently choosing when to assist the patient in contrast to enabling her to become self-reliant.

Methods & Discussions:
The opportunity to scale impact via AI based CBT was discussed earlier in Result #1. Here the familiar technique of transfer learning is applied with the dataset to OpenAI's pre-trained GPT2 language model. This transfer learning idea is shown in Figure 8. Transfer learning in GPT2 has successfully produced poetry [15] and fake news [21]. The superiority of GPT2 (Generative Pretrained Transformer 2) [25] over earlier NLG techniques in producing reasonable text has been well established, owing to its attention [18] based neural network architecture with a model capacity above 100 million parameters.
The therapy needs to be started only when appropriate. If the person shows signs of natural recovery from a depressed mental state due to natural resilience, therapy is NOT required. It is therefore important to keep a tab on the progression of the person's mental health. Based on the progression, the decision to activate is taken intelligently by the AI. This concept of detection by NLU, followed by appropriate activation of therapy, is proposed and demonstrated in this paper. The proposed intelligent activation approach is shown in Figure 6. A BERT based model keeps track of a person's mental health. If this model detects that a person is trending towards depression, it triggers activation of the GPT2 based therapy model. This idea of intelligent activation of therapy using an NLU activated NLG architecture, based on how the person is doing over a few days, is unique in the literature of mental health. The result is demonstrated in Figure 11.
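The activation decision described above can be sketched as a simple rule over a series of daily classifier outputs. This is an assumed illustration, not the prototype's logic: the function name `should_activate`, the window of 3 days and the 0.5 threshold are hypothetical, and the scores stand in for depression probabilities that a BERT based classifier might produce.

```python
def should_activate(scores, window=3, threshold=0.5):
    """Decide whether to trigger the NLG therapy module.

    scores: daily depression probabilities from an NLU classifier, oldest first.
    Activation requires the recent average to be high and not improving
    relative to the preceding window (no sign of natural recovery).
    """
    if len(scores) < 2 * window:
        return False  # not enough history to judge a trend
    earlier = sum(scores[-2 * window:-window]) / window
    recent = sum(scores[-window:]) / window
    return recent >= threshold and recent >= earlier


# Worsening trend: therapy is activated.
print(should_activate([0.3, 0.4, 0.5, 0.6, 0.7, 0.8]))
# Natural recovery: therapy is not required.
print(should_activate([0.9, 0.8, 0.7, 0.4, 0.3, 0.2]))
```

Comparing two adjacent windows is only one possible way to capture "trending towards depression"; a deployed system would need a rule chosen and validated by mental health professionals.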
Result & Method: Highlights of Result #3: The idea of NLU activated NLG based therapy was proposed and implemented. A BERT-GPT2 cascade was implemented. In this approach, BERT detects if a person is depressed and, when appropriate, activates the AI therapy. The results of the proposed neural network architecture design presented in Figure 8 are demonstrated in Figure 11. Figure 11 shows two scenarios from a prototype of NLU activated NLG therapy: one where NLU triggered the GPT2 therapy, and another where therapy was NOT required. This demonstrated the proposed AI architecture of NLU triggered NLG.
This intelligent activation architecture was demonstrated. This result is useful for early intervention [44], where the AI is pre-deployed in smartphones and gets automatically activated at the right time for timely activation of therapy to enable early intervention.
For reproducible results, the source code is accessible online at URL.
A pre-trained GPT2 model is fine-tuned using transfer learning on a small synthetic dataset. Three pre-trained GPT2 models of different sizes, small (124M), medium (355M) and large (774M), were fine-tuned, and all three achieved the same accuracy levels (refer to Figure 9). Since the small (124M) and large (774M) GPT2 models achieved the same accuracy levels for a small dataset of 5000 short sentences, a small GPT2 model with 124 million neural network parameters will be appropriate for on-device GPT2 inference.
Though GPT-3 is a successor to GPT-2, GPT-3 is not suitable for on-device inference. This paper experimentally identified that a small GPT2 model is sufficient to provide the performance required. The open source contribution encourages further research by many communities, given that experts are calling for multi-disciplinary research.
To summarize, here are the unique knowledge elements and results contributed:
1. Beam search [26] with GPT-2 decoded the next words in the working prototype. The results of beam search based GPT2 prediction are shown in the screenshot in Figure 4. The prototypes are implemented with Transformers [28, 29] based NLG and NLU using a TensorFlow/Keras library [29] on Google Colab.

2. A novel NLU-triggered-GPT2 is proposed and demonstrated with a prototype implementation. The proposed architecture of NLU triggered NLG to selectively activate the therapy is presented in Figure 6 and in more detail in Figure 8. The simple implementation of the proposed architecture as a lab prototype is demonstrated online at the URL, and a screenshot of it is presented in Figure 11. The screenshots in Figure 11 show two different scenarios: one where the AI therapy is activated, and another where the therapy was NOT required. The activation of GPT2 is performed only when appropriate, as the NLU module keeps a tab on the person's mental health over time. By turning the AI on the smartphone keyboard on or off, families can opt in for self-help based mental wellness. So when the mental health tracking AI and the CBT AI are combined, this allows a pervasive, non-intrusive, privacy-safe approach to provide mental health care for millions. The BERT model detects if a person is depressed, then selectively activates the GPT2 based therapy. The NLU module's ability to detect the progression of mental health over a series of sentences over days was shown earlier in Result #2. Figure 5 earlier showed that the NLU was able to detect if a person was recovering from depression. Hence the idea of NLU triggered NLG allows for activation of therapy only in appropriate circumstances, such as when the person doesn't recover after a loss. While most people recover from a loss naturally after some time, some may get into increasing levels of depression. The proposed idea of NLU activated therapy is able to handle such situations where help is needed.
3. Three pre-trained GPT2 models of different sizes were fine-tuned with the same synthetic dataset. These were the small, medium and large GPT2 neural network models of 124, 355 and 774 million parameters. The performance results of the 3 different GPT2 models are tabulated in Table 3 and plotted in Figure 9 as training loss over training steps. This showed that for a small synthetic dataset, all 3 models achieved the same level of performance. Given the on-device requirement for AI inference, the GPT2 small (124M) model is suggested for smartphone deployment. In future, researchers can explore more memory and compute efficient models for smartphones such as DistilGPT2 [36] to enable deployment into the real world.
4. An idea of conditioning language generation is demonstrated, showing that AI is close to generating narratives like a mental health professional. The resulting conditioned text is shown in Table 4 for a couple of scenarios. It shows how AI can assist in many situations by generating a conditional narrative that frames the words so that the user can look at the situation from a positive mental outlook. Transfer learning on Transformers based language models opens up the "ImageNet moment for NLG", and the potential of GPT2 for mental health therapy was successfully demonstrated. This result is a significant step towards AI based mental healthcare. Transfer learning in GPT2 has the potential to transform the boundaries of what is feasible.
5. The dataset used for transfer learning to train the GPT2 model is a programmatically generated synthetic dataset. The synthetic dataset used in this paper contains 4098 records and can be accessed at the URL, https://sites.google.com/view/ai-in-mental-health/ai-to-seed-good-thoughts. Given that the dataset is programmatically created as per the code shared at this URL, this gives flexibility for mental health therapists to quickly configure the input words to generate a synthetic dataset for many situations. The dataset contains short sentences such as the one shown in Table 4.
Each sentence has a combination of a situation and a belief. The situation and the belief to influence are created by mental health experts. Training on this dataset was completed in less than 5 minutes on a single GPU environment for each of the 3 differently sized GPT2 models shown in Table 3, enabling swift solution deployment by mental health experts at the time of real world deployment.
6. For mental health counselling of a target population whose members are related to each other by similar situations (e.g. a group of nurses overwhelmed by handling COVID-19 patients in a hospital), a federated learning approach can be beneficial. To learn from the community, the aggregate model of Federated Learning [23] can be explored. The federated learning concept is presented in the architecture in Figure 8. Federated learning could identify positive self-help narratives that yield faster healing, based on joint learning from a group of similar patients. This could be a future direction of research.
7. The choice of AI inference on the smartphone vs the cloud was discussed in the context of privacy and the willingness of patients to send sensitive information to the cloud. A smartphone based model with AI inference on the local device enables privacy safety and user willingness to express themselves. The architecture proposed in Figure 6 ensures that the user's personal thoughts do not leave her personal smartphone device, and hence enables privacy.
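Item 6 above mentions the aggregate model of federated learning as a future direction. As a hedged illustration of what such aggregation could look like, the sketch below applies sample-weighted averaging of model parameters in the style of federated averaging. The paper does not implement this; the function name `federated_average` and the flat parameter lists are assumptions for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local dataset size.

    client_weights: list of equal-length parameter lists, one per device.
    client_sizes: number of local training examples on each device.
    Only parameters are shared; the raw text stays on each device.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical devices with 1 and 3 local examples respectively.
print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Because only parameter updates are aggregated, this style of learning is compatible with the on-device privacy goal stated in item 7.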
Based on the results demonstrated here, future researchers can develop a practical real world mental health solution to avert the forthcoming mental health crisis.
With the intelligent activation of AI therapy using a BERT-triggered-GPT2 architecture presented in Figure  6, Figure 11 shows the potential of AI based mental health solution to address the monumental challenge the paper aimed to achieve.
The power of transfer learning capable Natural Language Generation models such as GPT2, when combined with on-device deployment approaches as shown in Figure 8, represents a step forward in responding to the global call for action by WHO and experts.
The design of neural network such as one in Figure 8 with an Open source lab prototype implementation demonstrated is a step forward in the response to the call for action by WHO to prevent the forthcoming monumental crisis.
The source code is available at URL https://sites.google.com/view/ai-in-mental-health/ai-to-seed-goodthoughts. The AI prototype is open-sourced. This encourages research communities for future research, given experts have called for multi-disciplinary priorities to avert the crisis.

Conclusions & Future Directions
This paper contributed a novel AI based on the application of recent advancements in Deep Learning to mental health. By applying the state of the art in Deep Learning based language modelling [22], this paper makes a unique contribution to designing an AI that can help the community prevent the onset of a global mental health crisis. The work contributed novel ideas to the literature and developed a prototype to demonstrate the AI. The results are reproducible and are available online. Further, to encourage the community to help avert the crisis, an AI prototype is contributed in open source. This supports the call to action by WHO [3,31].

Overview of contributions made by this paper
The two aspects this paper developed are as follows:
Learning. It allowed for the development of pre-trained models, typically on massive datasets, by exploiting the scalability of Transformers blocks to train gigantic neural networks. BERT is a Transformers based language model, and it is extensively used in understanding sentences. Typically, BERT [10,34] is applied to build a classification model to detect depression based on words uttered in social media. Though there is substantial literature on such Natural Language Understanding (NLU) applications to diagnose mental health, there is very little literature on applications of Transformers based Natural Language Generation (NLG) for therapy.
1.4 Since the paper argued about the limitations of employing cloud-based chatbots for counseling, an alternative edge-AI based approach for therapy is discussed. The proposed approach is better than cloud chatbots as the proposed AI protects one's personal thoughts from being streamed to cloud servers. The proposed AI is also better than cloud chatbots in addressing the practical challenge of early intervention in mental healthcare.
1.5 Transfer Learning [22,28,29] of language models is a mechanism to fine-tune the generation of text by an NLG model as per the required pattern. This work experimentally demonstrated that a GPT-2 model, once fine-tuned, can generate short sentences as required with Beam Search on GPT2.
1.6 The paper demonstrated that a GPT-2 model [25] of any size (124M, 355M or 774M) could be fine-tuned to generate human-like text narratives. Given the need for on-device AI inference on smartphones, the paper suggests using the small-sized GPT-2 (124M) model or even smaller models such as DistilGPT2. Though there are larger language models such as GPT-3, they are not suitable for on-device inference.
1.7 A novel conceptual advance in GPT-2 based CBT was proposed and demonstrated with a working prototype. In this approach, as the user enters her thoughts into her personal smartphone keyboard, the AI inference can auto-correct her beliefs or outlook so as to improve her mental wellness. The input is accepted by the AI and then transformed into a short sentence that can help the reader view the situation from a different perspective. By fine-tuning the language model, the generated short sentence is designed to convey a positive belief for a given situation. In one example of a situation of job loss, the model generated a short sentence to help the reader remember that he is smart. A GPT-2 small model can be fine-tuned on the cloud, and then downloaded to the smartphone for on-device inference. For a typical small dataset with a size of around 5000 short sentences, a fine-tuned GPT-2 small model yielded human-like short sentences.
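Item 1.5 above refers to beam search for decoding the next words. The following minimal sketch shows the general technique on a hand-written probability table rather than on GPT2; the table, the function name `beam_search` and the default parameters are assumptions for illustration only. It demonstrates why keeping several candidate continuations can find a more probable sentence than a greedy choice.

```python
import math


def beam_search(next_probs, start, beam_width=2, steps=3):
    """Keep the beam_width most probable word sequences at every step.

    next_probs maps a word to a dict {next_word: probability}.
    Returns the words of the highest-scoring sequence found.
    """
    beams = [([start], 0.0)]  # (words so far, cumulative log-probability)
    for _ in range(steps):
        expanded = []
        for words, score in beams:
            options = next_probs.get(words[-1], {})
            if not options:
                expanded.append((words, score))  # finished sequence is carried over
                continue
            for word, prob in options.items():
                expanded.append((words + [word], score + math.log(prob)))
        beams = sorted(expanded, key=lambda beam: beam[1], reverse=True)[:beam_width]
    return beams[0][0]


# Hypothetical next-word probabilities standing in for a language model.
table = {
    "i": {"am": 0.6, "can": 0.4},
    "am": {"sad": 0.55, "tired": 0.45},
    "can": {"recover": 0.9},
}
print(beam_search(table, "i", beam_width=1))  # greedy decoding
print(beam_search(table, "i", beam_width=2))  # beam search with two hypotheses
```

In this toy table the greedy path commits to the locally most likely word first, while a beam of two keeps the alternative alive long enough for its higher overall probability to win.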
The 2nd dimension is the challenge in the implementation of early intervention in 21st-century mental health care. This challenge of early intervention in mental health care is reported in JAMA [44]. Although this AI can be deployed on millions of smartphones, the therapy gets activated only when necessary due to the autonomous intelligent activation. The NLU activated GPT2 system tracks the smartphone user's mental health, and when it detects depression, it automatically activates therapy. So once deployed to millions of smartphones, early intervention in mental healthcare can automatically get activated for the needy individuals among the millions.
The 3rd dimension of the challenge is about adhering to rules to protect the personal information of patients requiring mental health counselling [6,17]. Since a lot of information about the patients' various personal situations and emotions is spoken with the counselor, such information is very sensitive and personal to each individual. It is important to protect this information so that it does not get leaked to anyone. In the approach proposed in this paper, the AI inference happens locally on the smartphone device, so the user's sensitive information never gets sent out of the smartphone to the internet. Since the inference happens on the smartphone, the personal information is deleted as soon as it is processed. Any text typed into the mental health keyboard or spoken by the user is processed locally by the AI within a second and deleted immediately. Thus no personal data is archived, and the proposed AI provides utmost protection of sensitive personal data to millions of families. To the best of our knowledge, there is a lack of well-published research papers on the applications of the latest advancements in Natural Language Generation for mental health therapy. There is untapped potential to explore the use of the latest advances in AI for improving mental health. Over the last few months, there has been a call to action by WHO and experts to avert the forthcoming mental health crisis. So the need for research on this untapped potential was crucial.

Results
Given the mounting evidence of mental health impact due to the COVID-19 pandemic, there is a need to urgently explore the untapped potential of applying the latest progress in AI. This paper made significant headway in exploring this untapped potential.
The result was a conceptual advance in the way the latest progress in AI is utilized to improve global mental health. In addition to proposing a novel application of GPT-2, the paper designed an AI that meets all the dimensions of the challenge.
The proposed novel AI allowed for scalable deployment of AI based self-help to improve global mental wellness for millions of families.
This on-device AI inference of GPT2 for CBT inspired self-help allowed for scalable deployment.
A GPT-2 small model with 124 million neural network parameters offered compelling performance, and hence a compact GPT-2 language model is ideal for edge AI inference on mainstream smartphones.
Amidst the reported shortage of mental health services and the looming pandemic scale impact on mental health, the AI based approach offered a way to boost the productivity of mental health experts.
The mental health professional composed a dataset and trained an AI using transfer learning.
Since the text narratives were demonstrated to be close to human-like narratives due to conditioned language modeling, this opened the doors for early intervention for the masses.
The 21st century challenge of early intervention in mental healthcare was addressed by a novel approach of NLU activated NLG.
Automatic intelligent activation of therapy allowed for early intervention in mental healthcare.
NLU activated NLG ensured timely activation of self-help therapy.
Automatic activation of therapy on time on deployed smartphones opened

Protection of sensitive information
Due to on-device inference in the proposed AI, the information typed/spoken by the user is NOT transmitted to the cloud.
All sensitive information typed by the user is processed by AI and immediately deleted within seconds.

Detailed Results: Demonstration of AI architecture for mental health therapy
The paper demonstrated the application of a Transfer Learning capable neural network for Natural Language Generation for generating human like text narratives towards a mental health therapy approach, speci cally GPT2 based CBT. Inspired by a mental health therapy approach called Cognitive Behaviour Therapy (CBT) [17], the proposed AI explored the possibility of using human like text generated by the AI for self-help to change unhealthy ways of thinking. By conditional language modelling approach, a Transformers architecture-based GPT2 pre-trained on 8 million web pages, was ne-tuned by transfer learning on a programmatically synthesized small dataset of 5000 short sentences to create a new ned tuned GPT2 model.
The fine-tuned GPT2 model is activated only when a Natural Language Understanding based classifier detects the need for therapy. Using recent advances in NLU and NLG, namely BERT and GPT2, the proposed NLU-triggered, NLG-based AI architecture was implemented and successfully demonstrated with a simple working lab prototype. The concept of intelligent activation of AI therapy at an appropriate time, based on each individual's circumstances, allows for proactive deployment and early intervention in mental healthcare for millions. The BERT-classifier-based module was able to detect whether a person becomes increasingly depressed over a period of time. Based on detecting this progression of mental health, intelligent activation of the therapy module was demonstrated. This is much needed given the challenge of early intervention in mental healthcare [44].
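The activation logic described above can be sketched as a small gate that watches a stream of classifier scores and triggers therapy only when the recent scores are both rising and high. This is a sketch under stated assumptions: the scores are taken to be probabilities in [0, 1] produced by an on-device classifier (a fine-tuned BERT model in the paper), and the class name `TherapyGate`, the window size, and the threshold are hypothetical choices for illustration, not values from the paper.

```python
from collections import deque


class TherapyGate:
    """Activate therapy when recent classifier scores trend upward.

    Assumption: each score is a depression probability in [0, 1] from an
    on-device classifier. Window and threshold are hypothetical values.
    """

    def __init__(self, window=4, threshold=0.6):
        self.window = window
        self.threshold = threshold
        self.scores = deque(maxlen=window)  # keeps only the latest scores

    def update(self, score):
        """Record one score and return True if therapy should activate."""
        self.scores.append(score)
        if len(self.scores) < self.window:
            return False  # not enough history yet
        half = self.window // 2
        recent = list(self.scores)
        older_mean = sum(recent[:half]) / half
        newer_mean = sum(recent[half:]) / (self.window - half)
        # Activate only if the scores are worsening and above the threshold.
        return newer_mean > older_mean and newer_mean >= self.threshold


gate = TherapyGate()
decisions = [gate.update(s) for s in [0.2, 0.3, 0.6, 0.8]]
```

In a full pipeline, a `True` decision would hand the user's current text to the fine-tuned generation model; comparing the two halves of the window is only one simple way to approximate "increasingly depressed over a period of time".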
Once activated, the fine-tuned GPT2 model generated text narratives that offered flexible ways to think about a situation. This work experimentally demonstrated the ability of a fine-tuned GPT2 to serve as a potential candidate for AI-based therapy. The human-like text generation capability, when combined with transfer learning, allowed the mental health expert to train and deploy a conditional language model to help millions of families affected by anxiety and depression.
Inspired by Gboard [14], the paper proposed on-device AI inference in the form of a predictive keyboard on the smartphone to improve mental health in a privacy-safe way. Since a small GPT2 model with 124 million parameters was shown to achieve the same performance as a large GPT2 model with 774 million parameters when fine-tuned on the synthetic dataset of 5000 short sentences, the small 124M-parameter GPT2 model was considered appropriate for smartphone-based on-device inference. Though GPT-3 [43] is a successor to GPT-2 [25], it is not suitable for on-device AI inference.
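The practical difference between the model sizes can be made concrete with a back-of-the-envelope estimate of the storage needed for the weights alone. The sketch below assumes 4 bytes per parameter (32-bit floats) and, as a hypothetical alternative, 2 bytes per parameter (16-bit floats); it ignores activations, tokenizer files, and runtime overhead, so the numbers are rough lower bounds rather than measured footprints.

```python
# Parameter counts reported in the paper for the three GPT2 sizes.
GPT2_PARAMS = {
    "small": 124_000_000,
    "medium": 355_000_000,
    "large": 774_000_000,
}


def weight_size_mb(n_params, bytes_per_param=4):
    """Approximate size of the weights alone, in megabytes (10**6 bytes)."""
    return n_params * bytes_per_param / 1_000_000


# Assumption: 32-bit floats (4 bytes) and 16-bit floats (2 bytes).
sizes_fp32 = {name: weight_size_mb(n) for name, n in GPT2_PARAMS.items()}
sizes_fp16 = {
    name: weight_size_mb(n, bytes_per_param=2)
    for name, n in GPT2_PARAMS.items()
}
```

Under these assumptions the small model needs roughly 0.5 GB for its 32-bit weights while the large model needs roughly 3.1 GB, which is consistent with preferring the small model on a smartphone when fine-tuned quality is the same.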
The proposed on-device architecture helps meet the privacy requirement, given that a lot of the patient's personal information is involved. This smartphone-based AI architecture also provides continual assistance, enabling timely correction of an individual's unhealthy thoughts. Thus the architecture presented in this paper is a step forward in the direction of evolving a novel AI-based mental health solution to avert the mental health crisis looming from the COVID-19 pandemic.
This paper addressed the monumental challenge of averting the forthcoming mental health crisis on all three dimensions of the challenge. By proposing an on-device AI solution for mental health based on a state-of-the-art deep learning architecture, the paper demonstrated the potential of Transformer-based neural network architectures to deliver an early and timely health intervention for millions of families across the globe.
Additionally, the proposed AI solution architecture was able to offer privacy safety because the AI inference happened locally on the smartphone. This allowed the AI solution to handle sensitive personal information, such as thoughts and personal situations, within the boundaries of the personal smartphone. A BERT-triggered GPT2 approach further enables such CBT-inspired self-help approaches to correct one's unhealthy thoughts. The fine-tuning of GPT2 created a conditioned language model powerful enough to produce a human-like text narrative that generated and sowed healthy perspectives for a given situation. A key strength of the proposed architecture is its ability to handle an individual's very personal, sensitive thought stream in a privacy-safe way, as the AI inference deletes the spoken or typed messages as soon as they are processed locally on-device. Further, since the AI-based therapy is only activated at an appropriate time by an on-device BERT classifier, the AI ensures timely therapy with privacy safety.

Future directions & enabling research communities
The doors to AI-based mental health therapy with pandemic-level scalability have been unlocked by the ideas and the demonstrated results. For the first time in the reviewed literature, an advanced transfer-learning-capable language model was demonstrated to generate human-like narratives for mental health therapy while addressing the three challenges of the solution: scalability to millions, privacy safety, and timely health intervention. The UN's call to action [3] and the titanic challenge of preventing the forthcoming mental health crisis [32,39,40] are addressed by designing an AI-based solution using state-of-the-art deep learning.
Given the magnitude of the looming crisis, experts have called for accelerated research by multiple research communities [2]. To support this goal of enabling future research, this paper contributes the AI prototypes as open source. While this paper explored a best-of-its-kind language model for the English language, further research is required on more languages to support populations in multiple countries. Further, distillation approaches [36] such as DistilGPT2 and model compaction are required to enable deployment of the AI on commodity smartphones.
More importantly, this paper has attempted a first small step on the roadmap to practical real-world deployment of state-of-the-art AI-based therapy, but a lot of interdisciplinary research is required in the future to bring this AI-based solution to improve mental health for millions of people. While this paper's lab prototype demonstrated the feasibility of applying state-of-the-art AI innovatively for therapy, the architecture proposed in this paper provides an approach that can meet the massive challenges of a large-scale real-world mental health solution. The proposed architecture, along with the working prototype, is capable of meeting the three dimensions of the challenge for a large-scale real-world deployment to avert the looming mental health crisis, namely the ability to reach millions of families with a timely health intervention in a privacy-safe approach. To build real-world impact, multidisciplinary researchers now have the opportunity to avert the forthcoming crisis.

Availability Of Source Code
Source code: Open sourced at https://sites.google.com/view/ai-in-mental-health

AI framework to avert the pandemic-scale mental health crisis during COVID-19. The opportunity to employ AI in both diagnosis and counselling is immense. Though there are many research publications on using NLU for diagnosis, there is a significant opportunity to contribute by applying NLG for therapy. This paper develops a working lab prototype for both mental health screening using BERT and mental health counselling using a combination of BERT & GPT2.

Figure 2
Novel application of AI in mental health therapy. The novelty is in applying Natural Language Generation (NLG) to stimulate a healthy mental outlook as the user types a message about a depressing situation such as losing a job during the pandemic. Like auto-correction of spellings in a smartphone's predictive keyboard, this AI "auto-suggests" the narratives being typed by a mentally depressed person to sow healthy beliefs.

The language model is fine-tuned on a supplied dataset to generate narratives. The screenshot shows that the AI takes in an input sentence and generates an output sentence based on the fine-tuned conditioned language model. It can be seen that the AI generates sentences that have a positive outlook, even though the user started typing depressing narratives. The visualization screenshot shows how the language model has learned to pay attention to words like 'helped'.

Figure 4
Result #1: Conditioned GPT2 for a Mental Health Therapy. This figure relates to Result #1. The screenshot above shows the input and output of the fine-tuned GPT2 model. While the user's situation is depressing, the AI is able to predict a better narrative for a CBT-inspired self-help therapy. A smartphone predictive keyboard for improving the mental wellness of users can embed this AI.

Figure 5
Result #2: Rapid AI pre-diagnosis can be utilized by mental health professionals to amplify their productivity. This figure relates to Result #2.

Figure 6
The AI supports the patient's natural mental resilience development and protects her privacy and private data. Privacy-safe AI is significant progress.

Figure 7
Scalable screening to identify who needs help. This is a supplementary result, which is discussed as Result #2.

Figure 8
Neural network based architecture for Cognitive Behaviour Therapy. The architecture proposed in Figure 5 is further detailed in this visual.

Figure 9
Performance of three different GPT2 models. Therapy gets activated only when appropriate. This picture shows the screenshot of the prototype implementation of the architecture proposed earlier in Figure 8. A fine-tuned BERT model triggers activation of GPT2-based therapy. For reproducing the results, the source code is available online at the given URL.
