Playing HAVOK on the Chaos Caused by Internet Trolls

Trump-supporting Twitter posting activity from right-wing Russian trolls active during the 2016 United States presidential election was analyzed at multiple timescales using a recently developed procedure for separating linear and nonlinear components of time series. Trump-supporting topics were extracted with DynEGA (Dynamic Exploratory Graph Analysis) and analyzed with the Hankel Alternative View of Koopman (HAVOK) procedure. HAVOK is an exploratory and predictive technique that extracts a linear model for the time series and a corresponding nonlinear time series that is used as a forcing term for the linear model. Together, these form a forced linear model that can produce surprisingly accurate reconstructions of nonlinear and chaotic dynamics. Using the R package havok, the Russian troll data yielded well-fitting models at several timescales but not at others, suggesting that only a few timescales were important for representing the dynamics of the troll factory. We identified system features that were timescale-universal versus timescale-specific. Timescale-universal features included cycles inherent to troll factory governance, which identified their work-day and work-week organization, later confirmed from published insider interviews. Cycles were captured by eigenvector basis components resembling Fourier modes, rather than the Legendre polynomials typical for HAVOK. This may be interpreted as the troll factory having intrinsic dynamics that are highly coupled to nearly stationary cycles. Forcing terms were timescale-specific. They represented external events that precipitated major changes in the time series and aligned with major events during the political campaign. HAVOK models specified interactions between the discovered components, making it possible to reverse-engineer the operation of the Russian troll factory. Steps and decision points in the HAVOK analysis are presented and the results are described in detail.


Even when researchers agree on defining trolls by the malevolent nature of their interactions, belonging to the troll class is often determined by the thin lines between the subtypes of expression of malevolence, with hate speech (Davidson, Warmsley, Macy, & Weber, 2017; Kocoń et al., 2021; [...], 2021b) and cyber-aggression (Rosa et al., 2019) classified as phenomena different from trolling (Uyheng et al., 2022). Moreover, there is no clear agreement on the role of automation in the definition of trolls, with some researchers classifying trolls as exclusively human actors (Bastos & Mercea, 2018; Broniatowski et al., 2018), whereas others allow bots into the definition (Paavola et al., 2016). While such inconsistencies exist, research findings are likely to differ due to the study of potentially different phenomena, especially when they are immersed in different contexts (Gorwa & Guilbeault, 2020; Orabi et al., 2020). For instance, bots that might or might not be classified as trolls have been found to be associated with use of abusive language in some contexts (Stella et al., 2018; Uyheng & Carley, 2021a; Uyheng et al., 2022), and to have no correlation with abusive language in other contexts. Even contexts that are similar at the present time will inevitably change with time, in turn also altering the interests of the state and private entity sponsors, who will modify their approaches accordingly (Zannettou, Caulfield, Setzer, et al., 2019; Alizadeh et al., 2020; Alsmadi & O'Brien, 2020; Cresci, 2020; Llewellyn, Cram, Hill, & Favero, 2019).

The definitions and thresholds, even if clearly established, will need to be reconsidered and adjusted accordingly, making it necessary to continue the research of trolling with respect to time.

Although some troll and social bot studies consider select temporal aspects in their analyses (e.g., Llewellyn et al., 2019; Park, Strover, Choi, & Schnell, 2021; Rajtmajer, Simhachalam, Zhao, Bickel, & Griffin, 2020; Zannettou et al., 2020; Zannettou, Caulfield, De Cristofaro, et al., 2019) and some temporal aspects of trolling are utilized in machine learning troll detection algorithms (e.g., Chu et al., 2012, 2018; Engelin & De Silva, 2016; Fornacciari, Mordonini, Poggi, Sani, & Tomaiuolo, 2018; Galán-García, Puerta, Gómez, Santos, & Bringas, 2016), studies that examine trolling activities with respect to both short and long timescales are rare. We have not found any studies that research troll activity across multiple timescales. The current article examines temporal aspects of the Russian trolling activities across multiple timescales simultaneously, with timescales ranging from intraday to monthly events. Considering a continuum of timescales and algorithmically determining the timescales that are the most relevant to the internal governance of the troll institutions and events that incite their activity may yield better results than selecting the timescale of focus by guesswork, habit, or convenience. Modern quantitative techniques designed for continuous time series analysis, such as HAVOK (Hankel Alternative View of Koopman; Brunton, Brunton, Proctor, Kaiser, & Kutz, 2017), can yield insights into temporally relevant communicative strategies across different timescales when qualitative text data is converted into time series of topical word frequencies.

HAVOK analysis was derived from control theory by an interdisciplinary team of researchers at UW Seattle. HAVOK is a powerful tool for modeling nonlinear and chaotic time series by decomposing them into intermittently forced linear systems. A forcing parameter allows HAVOK to demarcate regions where a time series is approximately linear from those that are nonlinear, with the forcing extrema often preceding shifts in the time series and aligning with the contextual events.

Through a few mathematical steps, HAVOK approximates a highly nonlinear system in a state-space representation that consists of a set of components that represent dominant linear dynamics and a forcing term that represents the intermittent contribution of nonlinearities. The forcing term (plotted as V_r in Figure 1) [...] It is noteworthy that forcing term peaks often tend to slightly precede the occurrence of major events in the measured time series.
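For orientation, the general form of the intermittently forced linear model in Brunton et al. (2017) can be written as below. The matrix names A and B and the split of the latent series into v_1, ..., v_{r-1} (linear components) and v_r (forcing) follow that paper's usual notation and are assumptions here, since the passage above does not define them.

```latex
\frac{d}{dt}\,\mathbf{v}(t) = \mathbf{A}\,\mathbf{v}(t) + \mathbf{B}\,v_r(t),
\qquad
\mathbf{v}(t) = \bigl[\,v_1(t),\ \dots,\ v_{r-1}(t)\,\bigr]^{\top}
```

Read this way, A carries the dominant linear dynamics and B describes how the forcing series v_r enters each linear component.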

It is also possible to generalize HAVOK results beyond training data, predict nonlinear events, and predict a system's response to nonlinear events. In the context of the [...]

The data that we used was extracted by Linvill and Warren (2020a) [...] The accounts are classified by Linvill and Warren (2020a) into five categories: right troll, left troll, news feed, hashtag gamer, and fearmonger. Only right troll and left troll accounts were used in our data preprocessing step, with right trolls representing trolls who tend to promote right-wing values, and left trolls representing trolls who tend to promote left-wing values. Only posts (not including retweets) dating from January 2016 to January 2017 were used. Only accounts that contained at least 50 posts were used, resulting in a total of 276,752 posts (78.32% from left trolls). The resulting 236 accounts included in the analysis had a substantial number of followers (median = 877, mean = 4838, 12.37% have more than 10,000 followers), and therefore could be considered influential (Golino et al., 2022; Zannettou, Caulfield, De Cristofaro, et al., 2019).

The data were preprocessed as described in Golino et al. (2022). Data from right- and left-leaning trolls were pre-processed using the text mining package tm from R (Theußl, Feinerer, & Hornik, 2012). The sparsity of the corpus of words was decreased using a sparsity threshold of .9935, resulting in 168 unique words. TMFG network and DynEGA were applied using 10 embedded dimensions. Arguments to 'dynEGA' were set as follows: time-lag τ = 1 and time between successive observations δ = 1. The correlation first derivatives were used to estimate the network. The level of analysis was set to population, resulting in estimated topics that represent the mean structure of the population.

By estimating the topics, DynEGA identified clusters of right-wing and left-wing troll tweet words whose frequencies were changing together in time, identifying dynamic latent topics. HAVOK was applied to the tweet contents that fell into particular latent topics.

Topic words whose network scores were used in the analysis belonged to one of the three [...] (https://github.com/RobertGM111/havok).

How exactly does HAVOK operate? In short, the dynamics of a system exist in a high-dimensional space. We measure the system along one of the dominant axes in time and obtain a measured time series of the system. In psychology, this is what we have and where we start: we measure the time series of a phenomenon of interest. In the current case, we use the average network scores of the 3 Trump-supporting topics in the right-wing troll tweets.

First, we arrange our measured time series into a Hankel matrix, equivalent to the transpose of a time-delay-embedded matrix with lag 1 (Sauer, Yorke, & Casdagli, 1991), where each row contains a segment of the measured time series that is shifted forward by one element from the row above it. Doing this is equivalent to folding up our measured time series and representing it in a higher-dimensional space. It is relevant that the time series segments that we are shifting with each row can be of any arbitrary length, also meaning that our Hankel matrix can have an arbitrary number of rows. The number of rows in the Hankel matrix is one of the hyperparameters in HAVOK, and it is directly related to the timescale on which the model will focus. The higher the number of rows, the lower the frequency of events on which the model will focus, and the smoother the predicted time series will be. (For a more extensive tutorial than we can supply here, please refer to Moulder, Martynova, and Boker, 2021.)
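The row-shifting construction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the havok package's implementation: the function name `hankel_matrix` and the argument `n_rows` are hypothetical names chosen here, with `n_rows` standing in for the row-count hyperparameter.

```python
def hankel_matrix(series, n_rows):
    """Stack shifted segments of `series`: row i starts i samples later than row 0."""
    n_cols = len(series) - n_rows + 1
    if n_rows < 1 or n_cols < 1:
        raise ValueError("n_rows must be between 1 and len(series)")
    return [list(series[i:i + n_cols]) for i in range(n_rows)]


# Toy series, purely for illustration.
x = [1, 2, 3, 4, 5, 6]
H = hankel_matrix(x, n_rows=3)
for row in H:
    print(row)
```

Each row is the row above it shifted forward by one element, so increasing `n_rows` makes every column span a longer stretch of the series, which is what ties the row count to the timescale of the model.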

The next step consists of applying singular value decomposition (SVD) to the Hankel [...] where U is a matrix containing the orthogonal left singular vectors, Σ is a diagonal matrix [...] By plotting the first three of these latent time series on mutually orthogonal axes (v_1, [...] Figure 3) we can reconstruct a three-dimensional projection of the attractor that will be diffeomorphic to the first three dimensions of the true system attractor. In this context, diffeomorphic means that the true attractor is analogous in its behavior patterns at each corresponding time point to the reconstructed attractor.
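As a rough sketch of the decomposition step, the snippet below applies NumPy's `numpy.linalg.svd` to a Hankel matrix built from a synthetic series. The series (two sinusoids), the row count of 30, and all variable names are assumptions made for illustration; they are not the troll data or the settings used in the article.

```python
import numpy as np

# Synthetic stand-in for a measured time series: two superimposed cycles.
t = np.arange(200)
x = np.sin(2 * np.pi * t / 20) + 0.5 * np.sin(2 * np.pi * t / 7)

# Hankel matrix: each row is the series shifted forward by one sample.
n_rows = 30
n_cols = len(x) - n_rows + 1
H = np.array([x[i:i + n_cols] for i in range(n_rows)])

# Economy-size SVD: H = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(H, full_matrices=False)

# Columns of U play the role of the U-modes; rows of Vt are latent time series.
v1, v2, v3 = Vt[0], Vt[1], Vt[2]

# Number of singular values that are not numerically zero.
numerical_rank = int(np.sum(s > 1e-8 * s[0]))
print(U.shape, s.shape, Vt.shape, numerical_rank)
```

Plotting `v1`, `v2`, and `v3` against each other would give the kind of three-dimensional attractor projection described above. For this noise-free toy series only four singular values are non-negligible, because each sinusoid contributes a pair of modes.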

The SVD component matrices are then truncated up to the r number of components that are needed to accurately approximate the modeled system. Model degree r is a hyperparameter in the HAVOK algorithm that helps separate informative model components from noise. So, truncation here also serves as denoising. In theory, the higher the complexity of the system, the higher the model degree required. However, real-life systems are almost always very complex, and the model degree usually reflects how much of that complexity we can explain with our model, given the quality of the data. [...]

(c) Clusters that possessed multiple highly similar or almost identical U-modes.

(d) Clusters that were similar in the structure of their coefficient matrices.

3. Out of the models that satisfied the above criteria, models that had the best numeric and visual fit between the v_1 and v̂_1 time series, as well as the least noisy forcing terms and the most distinctly/unambiguously identified forcing events, were selected.

Four models were then selected from the models that met the above criteria. Each of the models was selected to be representative of a distinct timescale (having a considerably different stackmax from the other selected models), to possess a forcing term that was distinctly different from the forcing terms of the other 3 models, and to belong to the dominant forcing term cluster in the corresponding timescale range. [...] prominent and lower frequency events the HAVOK model will focus on, with longer kernels resulting in more smoothing and more focus on slower timescales.

As can be seen in Figure 4, these four HAVOK models reproduce the major latent [...] in the tweet time series will not be predicted well by the models that focused on slower timescales. Instead, each model will be appropriate for predicting events at its corresponding timescale, which is reflected by the >.9 coefficients between v_1 and v̂_1.

Many well-fitting models were available at different timescales. At some timescales no well-fitting models were available, potentially showing that those particular timescales were not aligned with the timescale of events that forced the system. In the context of the troll data, those would be the timescales of events that did not incite troll tweets. On the other hand, some timescale ranges had multiple well-fitting models available, implying that those timescale ranges align with the frequencies of the most prominent events that affected the system.

Out of these 4 models, the one with the kernel of length 16 days was surrounded by the densest cluster of well-fitting models in the hyperparameter space, implying that this model potentially captures the timescale of the most influential events that incited active troll tweets.

Figure 5 is a close-up view of the model results. Squared correlation was .98 and the forcing term had very sharp and prominent peaks and troughs. But do these high-amplitude peaks align with the political events that had the power to force the right-wing trolls to actively promote Trump on Twitter? Figure 5-b identifies and labels 12 major events during calendar year 2016 that had substantial political relevance.

In this particular system the peaks correspond to the events that forced active Trump promotion on Twitter (e.g., Duke's and Christie's endorsements), while troughs correspond to sharply decreased Trump promotion activity (e.g., after the announcement of the preliminary election results). This interpretation is relevant only to this particular system, and will not necessarily be the case in other systems or even models of the same system. The meaning of the forcing term directions depends on the shape of the U-mode of the forcing term, its relationship to the other U-modes, as well as the structure of the system's attractor.

It is noteworthy that the forcing peaks and troughs exhibit a different alignment pattern during the presidential debates. Forcing is active immediately before and right after each debate, but not during the debates, as might have been expected from the alignment of other events.

Model results can be viewed in higher dimensions in the form of the system attractors pictured in Figure 3. [...] sinusoids lagged by π/2. The cycles that they represent are evident in the attractor plots, with U-modes 2 and 3 generating the quadrangular cycles in Figure 3, and U-modes 4 and 5 generating the hexadecagonal cycles in Figure 6. The quadrangular cycles, and, correspondingly, U-modes 2 and 3, represent daily cycles, as the four vertices in each cycle represent the 4 measurements that were collected per day. Analogously, the hexadecagonal cycles, and, correspondingly, U-modes 4 and 5, represent 4-day cycles, as sixteen vertices, [...]

The 6th U-mode, which represents the kernel of the forcing term, is unique to the selected model (and the neighboring models clustered in the hyperparameter space). Its shape resembles a logistic curve with sinusoidal daily oscillations that oscillate in the opposite direction from the daily sinusoids in U-modes 2 and 3 in the first third of the kernel, in the same direction as the daily sinusoids in U-modes 2 and 3 in the last third of the kernel, and undergo a phase transition between the two in the middle third of the kernel (see Figure 7). This kernel shape likely indicates that events forcing the system at this [...]

The relationships between the latent time series components (v_1, ..., v_6) that correspond to the six U-modes can be viewed in the state-space equation [...] where v_1 through v_6 represent rows of the V-transpose matrix, and d/dt signifies first [...]

Many well-fitting HAVOK models were found, representing different timescales. [...] forcing terms, which implies events forced the system at different timescales. Although some of the differences between the forcing terms could be generated by noise in the forcing term, this seems unlikely since all four models had model degree r << stackmax and their coefficient matrices were sparsified. The reason for this conclusion is that the lower the model degree r relative to stackmax, the more elements of the SVD matrices are truncated, thus eliminating more potentially noisy components from the final HAVOK model. Sparsification of the model coefficient matrix serves as yet another noise filtering step in cases where the truncation was insufficient for eliminating all the noise (Martynova & Boker, In Process). Given that the four models in Figure 4 had 71-93% of their SVD matrices truncated, and 42-87% of their model coefficient matrices sparsified out, the remaining forcing terms are unlikely to be contaminated by a substantial amount of noise. Hence, we believe that the forcing terms presented in the plots predominantly contain information about real-world events that forced the analyzed troll operation system.

The alignment of the extracted forcing events in Model c with the major politically relevant events during the presidential campaign is evident in Figure 5. This alignment of [...] volume declined due to a second forcing event. In Figure 5, the initiation and termination of the troll activity bursts around the 2nd and 3rd presidential debates were likely caused by the evident forcing events. It may be that premeditated strategies by the troll governance determined optimal times for topical shifts between the debates. [...] also distinguish those that are timescale independent from those that are timescale specific.

U-modes and the Russian work week

The U-modes of Model c suggest that day-night and four-day cycles were the most important dynamics, indicating coupling with daily and weekly cycles rather than intrinsic dynamics of the troll postings. That is to say, there was less evidence that one posting led to the next in a deterministic way (intrinsic dynamics) than there was evidence that the postings were coupled to day-night cycles and four-day cycles. Day-night cycles and four-day cycles were present among the U-modes of all four models at the four different timescales pictured in Figure 4. In Model c, U-modes 2 and 3 are aligned with day-night cycles and U-modes 4 and 5 are aligned with four-day cycles. As can be seen in Figure 7, the two cycles are captured by pairs of sinusoidal shapes lagged by π/2, providing evidence that Fourier modes can be detected by HAVOK analysis in addition to sequential derivatives / Legendre polynomials (Hirsh, Ichinaga, Brunton, Kutz, & Brunton, 2021). Due to the pervasive presence of these cycles across dominant models at all timescales, we hypothesize that these cycles have a substantive meaning.
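To see why a pair of sinusoids lagged by π/2 traces a closed cycle with a fixed number of vertices, consider the following sketch. It samples idealized unit-amplitude sinusoids at 4 measurements per day, the sampling rate reported in the article; the function name `cycle_vertices` and the idealized sinusoids are illustrative assumptions, not the fitted U-modes.

```python
import math


def cycle_vertices(samples_per_cycle):
    """Sample one period of a sinusoid pair lagged by pi/2 and return the (x, y) points."""
    points = []
    for k in range(samples_per_cycle):
        phase = 2 * math.pi * k / samples_per_cycle
        x = math.sin(phase)
        y = math.sin(phase + math.pi / 2)  # same sinusoid, lagged by pi/2
        points.append((round(x, 6), round(y, 6)))
    return points


# 4 measurements per day: a one-day cycle has 4 vertices (a quadrangle).
daily = cycle_vertices(4)

# A four-day cycle sampled 4 times per day has 16 vertices (a hexadecagon).
four_day = cycle_vertices(4 * 4)

print(len(set(daily)), len(set(four_day)))
```

Because the two components are a quarter period apart, every sample lands on the unit circle, and the number of distinct points equals the number of measurements per cycle.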

We explored this hypothesis by locating information from troll factory insider interviews. There is an 8 to 12 hour difference in time zones between the Moscow area in Russia and the USA. To be able to realistically align their tweet patterns with waking time in the USA, Russian trolls would need to work at night, which was confirmed by insiders (Volchek, 2021a, 2021b). These reports claim that there were 12-hour work shifts: a night shift and a day shift. These insiders also reported that they had a 2-by-2 work week, which is the most common work structure in Russia when night shifts and no-weekend jobs are involved. In a 2-by-2 work week, people work for 2 days and are then off for 2 days. [...]

The forcing term is out of sync (fluctuating in the opposite direction) with the 3rd U-mode daily cycles for the first third (visible in the U-mode plot and r = -0.42), in sync (containing daily cycles in the same direction) for the last third (visible in the U-mode plot and r = 0.43), and phase transitioning in the middle third (r = 0.01).
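The per-third correlations reported above can be illustrated with a synthetic kernel. The sketch below is built on loud assumptions: the kernel is a made-up logistic trend plus a daily oscillation whose sign flips from negative to positive across the kernel, its length of 64 samples assumes 16 days at 4 samples per day, and `pearson_r` is a plain textbook implementation. It will not reproduce the article's values (r = -0.42, 0.01, 0.43), only the sign pattern.

```python
import math


def pearson_r(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


n = 64  # assumed kernel length: 16 days x 4 samples per day
daily = [math.sin(2 * math.pi * k / 4) for k in range(n)]

# Hypothetical forcing kernel: logistic trend plus a daily oscillation
# whose weight moves linearly from -1 (out of sync) to +1 (in sync).
kernel = []
for k in range(n):
    trend = 1.0 / (1.0 + math.exp(-(k - (n - 1) / 2) / 4.0))
    weight = -1.0 + 2.0 * k / (n - 1)
    kernel.append(trend + 0.3 * weight * daily[k])

third = n // 3
segments = {
    "first": slice(0, third),
    "middle": slice(third, n - third),
    "last": slice(n - third, n),
}
by_third = {name: pearson_r(kernel[s], daily[s]) for name, s in segments.items()}
for name, r in by_third.items():
    print(name, round(r, 2))
```

Splitting the kernel into thirds before correlating is what exposes the flip; a single correlation over the whole kernel would average the opposing segments toward zero.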

The model coefficient of the forcing term that loads on the 3rd U-mode daily cycles is relatively large and negative. This likely implies that the occurrence/onset of prominent political events that force the system is accompanied by a disruptive effect on the regular structure of the daily (within 12-hour work shift) tweet posting cycles. One explanation for this disruptive effect is that the beginning of a normal 12-hour work shift cycle involves little posting. According to the insider interviews, trolls receive folders with information on current events and topics of concern at the beginning of a shift and they take time to read and prepare before starting to actively post. At shift end, these reports say that the workers are tired and post less. However, the occurrence of a forcing event might induce the workers to hurry and post about that event while it is still relevant even if these are not their typical peak productivity hours, causing irregularities in the cycles representing 12-hour shifts. The directional flip in the forcing kernel shape in the second half might indicate the normalization of the regular daily work schedule after a panic posting episode.

The forcing term has no direct loadings onto the four-day cycles in the model, which implies that political events do not directly affect the 2-by-2 work week structure. [...]

This article serves as evidence that HAVOK analysis, when combined with topic modeling using DynEGA, can reverse engineer Russian manipulation strategies from text. Topics used to influence the online debates, forcing events that aligned with politically relevant events and incited changes in troll activity, and troll work cycles (day-night, and 2-by-2 work week) were all detected and interactions between them were explored.

When applied across a range of hyperparameters, HAVOK has been shown to reveal the multi-timescale nature of the troll tweeting system, with evidence that events forced the system at different timescales. HAVOK was able to determine many well-fitting models [...] Russian troll factory operation cycles was confirmed from insider interviews.

Relationships between the cycles, the rates of change, and other basis elements were contained within the HAVOK models as coefficients connecting the latent time series v_1, ..., v_r that respectively correspond to the U-mode bases. Model coefficients also indicated how the external forcing term influenced each linear component of the system, and how its influence propagated throughout the linear components over time. HAVOK allowed visualization of the dynamical process of the troll operation system captured in the models in high-dimensional attractors in state space. Elements of the system, such as cycles, [...] attractors could be determined, distinguishing those that were stable across timescales from those that were timescale-specific.

This insight was achieved by reapplication of the HAVOK analysis across the hyperparameter space. It was surprising to us how reapplication of such a mathematically simple model provided a way to reverse engineer Russian manipulation strategies with impressive detail and alignment with real-world events. These results allow improved theoretical speculation about future Russian troll activity and a method to quantitatively predict how future forcing events will manifest in the troll behavior. HAVOK models allow one to simulate a system's response to hypothetical forcing events. Given that forcing events tend to precede events in the response time series, simulated responses generated by data from events in real time could allow for prediction and/or intervention. Simultaneous short-term forecasting of HAVOK components is also possible by the use of HAVOK forecasting extensions (Martynova, Moulder, & Boker, 2022). All of the above can be executed and analyzed at multiple timescales that possess descriptive models representative of a target system, thereby creating a more complete view of the system and increasing chances of correct predictions.
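The idea of simulating a response to a hypothetical forcing event can be sketched with a toy forced linear system. Everything below is an assumption made for illustration: a two-component damped daily cycle, a simple forward-Euler integrator, arbitrary coefficient values, and a rectangular forcing pulse. It is not a fitted HAVOK model of the troll data, and the function name `simulate_forced_cycle` is hypothetical.

```python
import math


def simulate_forced_cycle(forcing, dt=0.01, omega=2 * math.pi, damping=0.5, gain=1.0):
    """Forward-Euler integration of a damped 2-D rotation driven by `forcing`.

    dv1/dt = -damping * v1 - omega * v2 + gain * f(t)
    dv2/dt =  omega * v1 - damping * v2
    Time is in days, so omega = 2*pi gives a one-day cycle.
    """
    v1, v2 = 0.0, 0.0
    path = []
    for f in forcing:
        dv1 = -damping * v1 - omega * v2 + gain * f
        dv2 = omega * v1 - damping * v2
        v1, v2 = v1 + dt * dv1, v2 + dt * dv2
        path.append((v1, v2))
    return path


# Hypothetical forcing event: a short pulse starting at day 2 of a 10-day window.
forcing = [1.0 if 200 <= k < 220 else 0.0 for k in range(1000)]
path = simulate_forced_cycle(forcing)

amplitude = [math.hypot(a, b) for a, b in path]
peak = max(amplitude)
print(round(peak, 3), round(amplitude[-1], 3))
```

The system stays at rest until the pulse arrives, then cycles daily with an amplitude that decays once the forcing stops. Swapping in a different forcing series shows how the same linear dynamics respond to events of different timing and size.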

Equally impressive results can be achieved by applying HAVOK across hyperparameter space to any intensive time series that meets the requirements listed in