This study had two goals: first, to develop a method for estimating diurnal patterns and sleep exclusively from typing activity and to engineer features capturing this information; and second, to evaluate the feasibility of unobtrusively monitoring processing speed and executive functioning using only information gathered through smartphone typing behaviors, and to assess whether the engineered features improve model predictions. A few studies have explored phone activity as a means of estimating sleep duration, all showing approximate agreement between various self-reported sleep measures and phone activity (40–42). To the best of our knowledge, however, this is the first study to incorporate the impact of surrounding days on daily estimates and additionally to examine typing regularity using matrix factorization in a sample outside of a university population. Our objective in approximating diurnal patterns and sleep with typing activity was to balance the advantages and disadvantages of current subjective and objective measures. Typing activity does not directly measure physical movement and is sporadic, so it is less accurate than any form of continuous measurement; however, it does not require active engagement, nor does it require participants to remember to continuously wear an additional device. We also sought only to obtain estimates related to diurnal rhythms, so our analyses did not require the extensive detail that would be obtained with more direct and accurate measures. Typing regularity and extended absences of typing activity in this study served to capture behaviors influenced by diurnal patterns and sleep rather than the patterns and quantities themselves. Nevertheless, using the method outlined in Figure II and conceptually visualized in Figure III, the majority of estimated sleep quantities ranged between 5 and 7 hours. These estimates align with typical sleep durations in adults when considering that many reported at least some disturbances in sleep (69,70).
The utility of our diurnal pattern and sleep approximations was evaluated using models predicting processing speed and executive function in a sample of individuals with and without depressive symptoms (Table III). These aspects of cognitive function were assessed using a digital TMT-B administered through participants’ own smartphones (Figure I), such that the two measures compared in the model were obtained from the same device. Models were constructed to predict adjusted dTMT-B time from typing behaviors hypothesized to be related to aspects of cognition. Practice is a well-known confound in the interpretation of dTMT-B performance because of the probable improvement following repeated task completions (46). As this study included repeated administrations of the task, practice effects were expected. To control for this source of improvement, dTMT-B times were standardized as z-scores within groupings by task number, which removed within-subject differences due to practice across sequentially completed tasks. This normalization technique has been used in previous studies to control for other confounds, such as age and education level (39,71), and was chosen over the alternative of including a practice variable in the model in order to avoid constructing a model that uses task-derived measures to predict task performance. The resulting adjusted dTMT-B times could then be compared across the numbered task groupings to evaluate differences due to factors pertaining to the cognitive abilities of the individuals.
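The standardization by task number described above can be illustrated with a minimal sketch. The record layout, the function name `zscore_by_task_number`, the toy completion times, and the use of the sample standard deviation are all assumptions made for illustration and are not taken from the study's analysis code.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_by_task_number(records):
    """Standardize completion times within each task-number grouping.

    records: list of (participant_id, task_number, seconds).
    Returns a list of (participant_id, task_number, z_score).
    Assumption: the sample standard deviation is used, and every
    grouping has at least two non-identical times.
    """
    times_by_task = defaultdict(list)
    for _, task_number, seconds in records:
        times_by_task[task_number].append(seconds)

    # Mean and standard deviation of each grouping (1st task, 2nd task, ...).
    stats = {k: (mean(v), stdev(v)) for k, v in times_by_task.items()}

    adjusted = []
    for pid, task_number, seconds in records:
        mu, sd = stats[task_number]
        adjusted.append((pid, task_number, (seconds - mu) / sd))
    return adjusted


# Hypothetical data: everyone is faster on task 2 (a practice effect).
records = [
    ("p1", 1, 80.0), ("p2", 1, 100.0), ("p3", 1, 120.0),
    ("p1", 2, 60.0), ("p2", 2, 75.0), ("p3", 2, 90.0),
]
adjusted = zscore_by_task_number(records)
for row in adjusted:
    print(row)
```

In this toy example each participant keeps the same adjusted score on both tasks even though every raw time improved, which is the sense in which the group-wise z-score removes the shared practice effect.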
To predict executive functioning and processing speed as measured by the dTMT-B, feature selection was first performed using a filter-based method to reduce the predictors in the models from the combined list in Tables I and II to those thought to be most relevant. For the first model, which excluded typing regularity and estimated sleep features, the selected variables included typing speed, both the average and within-subject fluctuations, which had previously been found to be significantly associated with dTMT-B performance (26), as well as an equal number of features relating to the number and frequency of phone orientations chosen while typing and features pertaining to typing-specific dynamics (Table IV). Interestingly, grand mean centered variables, which allow participants’ average parameters to be compared with one another, were chosen more often than subject centered features, which examine within-participant fluctuations. The reliance on between-subject differences for model predictions may suggest that one of the biggest predictors was the individual themselves. This notion is supported by studies emphasizing personalized models for enhanced predictions of mood and cognition (15,28,72,73).
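The distinction between grand mean centered and subject centered features can be made concrete with a short sketch. The function name `center_feature`, the toy typing-speed values, and the choice to compute the grand mean as the mean of participant means are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean


def center_feature(observations):
    """Split a feature into between- and within-participant components.

    observations: list of (participant_id, value).
    Returns a list of (participant_id, grand_mean_centered, subject_centered).
    Assumption: the grand mean is the mean of the participant means.
    """
    values_by_pid = defaultdict(list)
    for pid, value in observations:
        values_by_pid[pid].append(value)

    subject_means = {pid: mean(vals) for pid, vals in values_by_pid.items()}
    grand_mean = mean(list(subject_means.values()))

    centered = []
    for pid, value in observations:
        between = subject_means[pid] - grand_mean  # how this person differs from others
        within = value - subject_means[pid]        # how this observation differs from the person's norm
        centered.append((pid, between, within))
    return centered


# Hypothetical typing speeds for two participants.
observations = [("a", 4.0), ("a", 6.0), ("b", 1.0), ("b", 3.0)]
centered = center_feature(observations)
for row in centered:
    print(row)
```

The grand mean centered value is constant within a participant, so a model that leans on it is effectively learning who the participant is, whereas the subject centered value carries only the fluctuation around that participant's own average.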
The features derived from typing regularity and estimated sleep, computed with the method described in Figure II, captured relationships between typing patterns across neighboring days as well as typical gaps in typing activity over time (Table II). Upon inclusion of these features in the list of predictors, many were chosen by the filter-based feature selection method over previously selected typing and orientation features, as shown in Table IV. The shift to these typing regularity features supports the notion that rhythm, or its disruption, is related to mood disorders and associated cognitive impairments and can be passively captured from smartphone typing activity. Furthermore, smartphone typing can, to varying extents and in varying contexts, incorporate interactions with others (e.g., through text messages, emails, and social media posts), so our use of it to evaluate regularity may additionally detect facets of social rhythms, which have also been reported to be disrupted in those with mood disorders (74).
Prior to the addition of predictors relating to diurnal patterns and sleep, the RF model had the lowest RMSE and MAE of the four modeling methods tested, at 0.938 and 0.724, respectively (Table V). Since the predicted values had been z-scored, these errors were in standard deviation units rather than raw values. This suggested that, by analyzing solely information collected from an individual’s naturalistic typing behaviors, aspects of their cognitive performance otherwise measured by the dTMT-B could be estimated within less than one standard deviation of the mean on the dTMT-B. Furthermore, these predictions were unaffected by practice effects, unlike the conventional TMT-B (46). Although analyzing smartphone typing as an assessment of cognitive function is a fairly new concept in the literature, a few other studies conducted in patient populations, such as those with Alzheimer’s disease/mild cognitive impairment and multiple sclerosis, have also concluded that smartphone typing dynamics may have the potential to elucidate impairments in underlying cognitive processes (25,51,75,76).
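For readers less familiar with these error metrics, the following sketch shows how RMSE and MAE are computed on z-scored targets and how an error in standard deviation units maps back to seconds. The toy values, the helper `to_seconds`, and the grouping standard deviation of 20 seconds are hypothetical and are not results from this study.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large misses more heavily."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: the average size of a miss."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def to_seconds(error_in_sd_units, group_sd_seconds):
    """Convert an error on the z-score scale back to seconds for one task grouping."""
    return error_in_sd_units * group_sd_seconds


# Hypothetical z-scored dTMT-B times and model predictions.
y_true = [-1.0, -1.0, 1.0, 1.0]
y_pred = [-0.5, -1.0, 0.5, 1.5]

rmse_z = rmse(y_true, y_pred)
mae_z = mae(y_true, y_pred)
print(rmse_z, mae_z)

# With a hypothetical grouping standard deviation of 20 seconds:
print(to_seconds(mae_z, 20.0))
```

Because the targets were standardized separately for each task grouping, the conversion back to seconds depends on the standard deviation of the grouping in question.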
As hypothesized, the addition of typing regularity features improved our RF model predictions of adjusted dTMT-B times in both the RMSE and MAE (Table V), suggesting that our method of extracting diurnal patterns and estimated sleep from typing activity effectively contributed to model predictions of processing speed and executive function.
Evaluation of the calculated SHAP values for each prediction in the model allowed for determination of feature importance and directionality. Before the addition of typing regularity and estimated sleep features to the model, three features, typing speed (grand mean centered), accelerometer’s median X-axis reading (grand mean centered), and normalized number of cluster transitions (grand mean centered), contributed the most to model predictions (Figure IVa). After the addition of the engineered features, those relating to typing speed and phone orientation and transitions between them remained the highest ranked according to the mean absolute SHAP values (Figure Va). Typing regularity features were observed to be of moderate importance overall for model predictions.
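The ranking by mean absolute SHAP value can be sketched as follows. The sketch assumes the per-prediction SHAP values have already been computed and are available as rows of numbers; the feature names and values shown are hypothetical placeholders, not values from Figures IVa or Va.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute SHAP value across predictions.

    shap_rows: one list of SHAP values per prediction, ordered like feature_names.
    Returns a list of (feature_name, mean_abs_shap), most important first.
    """
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature names and SHAP values for three predictions.
feature_names = ["typing_speed_gmc", "accel_x_median_gmc", "consecutive_day_similarity"]
shap_rows = [
    [0.30, -0.10, 0.05],
    [-0.20, 0.15, -0.05],
    [0.40, -0.05, 0.02],
]

ranked = rank_by_mean_abs_shap(shap_rows, feature_names)
for name, value in ranked:
    print(f"{name}: {value:.3f}")
```

Taking the absolute value before averaging means that a feature pushing predictions strongly in both directions still ranks as important; the sign of the individual values, read against the feature values, is what conveys directionality.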
Trends in feature directionality revealed similarities between the two models, as shown in Figures IVb and Vb. First, examination of trends in typing speed, thought to relate most to the cognitive aspects of dTMT-B performance, suggested that participants who typed more quickly on average than the other participants were predicted to have better dTMT-B performance, which aligned with previous findings (26,77). The observed trend supports the notion that typing speed may be a naturalistic measure of processing speed (25). However, without an isolated measure of processing speed, such as the TMT part A, which consists solely of connecting numbers in ascending order (a digitally adapted version is not included in the BiAffect app), we are unable to confirm this hypothesis in our sample. Further work would be needed to unpack how our features relate to the specific aspects of cognition involved during the dTMT-B task.
A less defined trend for within-subject fluctuations in typing speed in the first RF model revealed an opposite effect, in which participants who typed more slowly than their own average performed better than their average on the dTMT-B. One possible explanation for this contrast could be a speed-accuracy trade-off, as has been seen in smartphone typing between typing speed and correction (autocorrect and backspace) rates (25). If participants were typing more quickly than usual, they might also be tapping more quickly through the dTMT-B but making more errors along the way. The increase in errors would increase the task completion time, thus ultimately causing them to be slower than their average on the task. Without a decipherable measure of typing error in the model, however, we are unable to further investigate this possibility. This effect contributed less overall to the prediction of dTMT-B times, and more work is needed for proper interpretation of this finding and determination of its generalizability.
Furthermore, both models suggested that more transitions between phone orientations during consecutive typing sessions related to better dTMT-B performance, but differing trends in the top predictors relating to the amount of non-upright typing sessions were observed between the two models, as seen in the comparison between the accelerometer’s median x-axis reading in Figure IVb and the fraction nighttime non-upright sessions in Figure Vb. However, nuances in the calculations of the specific features in the two analyses, such as timing-related differences, could contribute to the discrepancies in the interpretation of these results. Indeed, Ning et al. found that performance on the dTMT-B varied by time of day (77), so calculations of typing features based on varying timeframes may lead to confounded results. This is further supported by the fact that the predictor fraction nighttime non-upright sessions showed consistent directionality between the two models. Overall, these findings suggested that more flexibility in the chosen phone orientations while typing throughout the day was related to better processing speed and executive function, which might relate to levels of activity and the corresponding influence of depressive symptoms. Previous studies have found that increased sedentary behavior and limited physical activity are associated with greater depressive symptoms (78), which are known to impact cognitive performance (2).
The examination of the directionality of the typing regularity features in the second model suggested that similarity in typing patterns between consecutive days was associated with better cognitive performance, but similarity in typing patterns between days 4 days apart related to poorer performance. One rationale behind this finding might relate to the weekly schedules that many people have, in which work dominates time on weekdays and weekends allow for more variety in activities. Within this structure, typing patterns may consistently be more similar between consecutive days, while observations 4 days apart allow for more comparisons between weekdays and weekends. In line with this theory, Huber and Ghosh noted 7-day periodicity in smartphone usage in addition to the expected 24-hour cycle in their analysis of behavioral patterns across varying timeframes (79). Disruption to this cycle could indicate the presence of a mood disorder and corresponding cognitive impairment. Indeed, those with major depressive disorder and bipolar disorder have been found to have increased absences from work and an overall loss of work performance (80), which suggests that their daily and weekly rhythms may fluctuate more than those without a mood disorder.
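The weekday and weekend argument can be checked with simple counting. The sketch below assumes a Monday to Friday work week and counts, for each starting day of the week, whether a day and the day a given number of days later fall on opposite sides of the weekday/weekend divide; the function name and the work-week assumption are illustrative only.

```python
# Days are numbered with Monday = 0, so Saturday = 5 and Sunday = 6.
WEEKEND = {5, 6}


def mixed_pair_count(lag):
    """Count the starting days (out of 7) for which a day and the day
    `lag` days later are one weekday and one weekend day."""
    mixed = 0
    for day in range(7):
        other = (day + lag) % 7
        if (day in WEEKEND) != (other in WEEKEND):
            mixed += 1
    return mixed


for lag in (1, 4):
    print(f"lag {lag}: {mixed_pair_count(lag)} of 7 day pairs mix a weekday with a weekend day")
```

Under this assumption, 2 of 7 consecutive-day pairs mix a weekday with a weekend day, compared with 4 of 7 pairs taken 4 days apart, which is consistent with the interpretation that the 4-day comparison is more sensitive to weekday versus weekend differences.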
Finally, we have previously observed that depressive symptoms are significantly associated with dTMT-B performance (26). Therefore, we verified that the selected features in our model also accounted for the effects of depressive symptoms on dTMT-B performance by comparing the RF model performance before and after the addition of participants’ average PHQ score to the model. The addition of a measure of depression did not improve model predictions. Further examination of feature importance revealed that the average PHQ feature made a limited contribution to model predictions, as shown in Figure VI. These findings suggest that variance in the data related to the effect of depressive symptoms on dTMT-B performance was explained by smartphone typing behaviors, and they support the use of a predictive model of processing speed and executive functioning composed entirely of passively derived features.
This analysis comes with limitations. As described before, the proxies for diurnal patterns and sleep obtained solely from typing activity require caution in their interpretation because the activities participants were engaged in when no typing activity was recorded are unknown. Typing regularity was used as a proxy for diurnal patterns; however, although the respective features related to executive functioning as hypothesized, the link between typing regularity and diurnal rhythm itself still needs to be established. Furthermore, estimated sleep labels were based on many assumptions. First, we assumed that participants did not take naps and that they typed on their smartphones right before they went to sleep as well as immediately after they woke up the next morning. Moreover, approximated sleep was based on hourly intervals and was not able to accommodate more precise sleep onset and wake times. Nonetheless, our approximations for the diurnal rhythms of the participants in this study produced coherent relationships with dTMT-B performance in line with the literature. However, future work should investigate the extent to which typing activity can approximate sleep and diurnal patterns in a diverse sample.
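To make these assumptions concrete, the sketch below shows a deliberately simplified stand-in for sleep estimation: it labels the longest run of hours with no typing as sleep. This is not the method of Figure II, which also incorporates surrounding days; the function name, the noon-to-noon window, and the hourly counts are hypothetical.

```python
def longest_inactive_run(hourly_counts):
    """Return (start_index, length) of the longest run of hours with zero keypresses."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, count in enumerate(hourly_counts):
        if count == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


# Hypothetical keypress counts for 24 hourly bins, starting at noon so that
# a typical night is not split across the edges of the window.
WINDOW_START_HOUR = 12
hourly_counts = [3, 0, 5, 2, 0, 0, 4, 6, 1, 7, 2, 0,
                 0, 0, 0, 0, 0, 0, 1, 4, 0, 2, 5, 3]

start_index, sleep_hours = longest_inactive_run(hourly_counts)
onset_hour = (WINDOW_START_HOUR + start_index) % 24
wake_hour = (onset_hour + sleep_hours) % 24
print(f"estimated sleep: {sleep_hours} h, from {onset_hour:02d}:00 to {wake_hour:02d}:00")
```

The sketch makes the limitations visible: a nap or a quiet evening without typing would be indistinguishable from sleep, onset and wake times can only fall on hour boundaries, and a participant who stops typing well before bed would have their sleep overestimated.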
Additionally, in contrast to traditional in-person assessments, the environmental conditions under which the dTMT-B was completed could not be known. The remote administration of the dTMT-Bs, although convenient for the participant, meant that the environment in which they completed the tasks most likely varied between tasks and participants. This may result in better alignment between naturalistic smartphone typing and cognitive assessment in-the-wild and provide more ecological validity to the assessment, but further work would be needed to delineate these effects.
Moreover, the dTMT-B times in this study were adjusted to control for practice effects. The standardization was performed on groups ranging in size between 43 and 77 tasks, which might not have captured the true distribution of dTMT-B times for each sequential task. Furthermore, the standardization method assumed the same learning rate for all participants, which may not have been an accurate adjustment for each individual.
Lastly, TMTs generally comprise two parts: part A and part B. Our study included only part B, which meant that we were unable to separate processing speed from set-shifting (the cognitive process required by the task when switching between numbers and letters) in our analyses. However, one may expect that set-shifting ability is relevant in naturalistic typing (e.g., switching between QWERTY and special character layouts). Nevertheless, further work should be done to determine the applicability of these findings to the clinical population.