## 2.5.1 Riemannian geometry classifier with TRCA spatial filter

1) TRCA spatial filter

The TRCA algorithm could exhibit maximal similarity among task blocks and extract task-related components by properly weighing the observed signals[20]. Assuming task-related signal \(s(t) \in R\) and task-unrelated signal \(n(t) \in R\), the observed multichannel signals \(x(t) \in {R^{{N_c}}}\) are composed of the following (5):

$${x_j}(t)={a_{1,j}}s(t)+{a_{2,j}}n(t),j=1,2, \ldots ,N_{c}^{{}}$$

5

where \(j\) denotes the channel index, \({N}_{c}\) is the number of channels, and \({a}_{1,j}\) and \({a}_{2,j}\) are the mixing coefficients of the j-th channel. Then, the task-related components, \(s(t)\), can be recovered from a linear sum of the observed signal, \(x(t)\). To ensure that the task-related components exhibit maximal temporal similarity among task trials, inter-trial covariance maximization should be converted. If \({x^{(h)}}(t)\) is the h-th trial of the observed signal, and \({y^{(h)}}(t)\) (\(t=1,2, \ldots {N_t}\)) represents the estimated task-related component, then the covariance of all possible combinations of trials is summed as Formula (6):

$$\begin{gathered} \sum\limits_{{{h_1},{h_2}=1,{h_1} \ne {h_2}}}^{{{N_t}}} {{C_{{h_1}{h_2}}}} =\sum\limits_{{{h_1},{h_2}=1,{h_1} \ne {h_2}}}^{{{N_t}}} {\sum\limits_{{{j_1},{j_2}=1}}^{{{N_c}}} {{w_{{j_1}}}{w_{{j_2}}}Cov(x_{{{j_1}}}^{{({h_1})}}} } (t),x_{{{j_2}}}^{{({h_2})}}(t)) \hfill \\ ={w^T}Sw \hfill \\ \end{gathered}$$

6

where the symmetric matrix \(S\) is defined as follows:

$${S_{{j_1}{j_2}}}=\sum\limits_{{{h_1},{h_2}=1,{h_1} \ne {h_2}}}^{{{N_t}}} {Cov({x_{{j_1}}}^{{({h_1})}}(t),} {x_{{j_2}}}^{{({h_2})}}(t))$$

7

To bound the coefficients, the variance of \(y(t)\) is restricted as follows:

\(\begin{gathered} \operatorname{var} (y(t))=\sum\limits_{{{j_1},{j_2}=1}}^{{{N_c}}} {{w_{{j_1}}}{w_{{j_2}}}} Cov({x_{{j_1}}}(t),{x_{{j_2}}}(t)) \hfill \\ ={w^T}Qw=1 \hfill \\ \end{gathered}\) (8)

The restricted optimization can be solved using Formula (9):

$$\hat {w}=\mathop {\arg \hbox{max} }\limits_{w} \mathop {\frac{{{w^T}Sw}}{{{w^T}Qw}}}\limits_{{}}$$

9

The optimal coefficient vector is obtained as the eigenvector of the matrix \({Q^{ - 1}}S\). The eigenvalues of matrix \({Q^{ - 1}}S\), which can be arranged in descending order, imply task consistency among the experimental trials[21]. In our study, we used the TRCA method to design spatial filters during task interval (0–3 s, where 0 s represented the start of the MI task) to obtain the SSSEP task-related components. In addition, this method can reduce the feature dimensions for Riemannian geometry measurement classification.

2) Data partitioning, multi-trial reference, and feature extraction

Considering there can be no overlap between the training set and the test set for constructing the spatial filter and task classification, the MI-SSSEP task interval signals were first divided into a training set and a test set according to the 10×10-fold cross validation method. Subsequently, the training set was filtered by 27–29 and 32–34 Hz sub-bands, and then the filtered multiple training trial data \({X}_{train}^{28},{X}_{train}^{33}\) were used to build TRCA spatial filters. The first two eigenvectors were employed to obtain the averaged task-related component, \({y}_{ref}\), which was used as a reference after averaging for multiple training trials. Afterwards, the training and test trials were filtered by TRCA filters separately, and the task-related components \({y}_{train}\) and \({y}_{test}\) were covaried with the reference components, whose results were used as feature vectors for the Riemannian geometry measurement.

3) Riemannian geometry measurement

The Riemannian distance can be used to classify left-right foot motor intentions by calculating the Riemannian distance for each class. Here, given two samples, \({x}_{i}\) and \({x}_{j}\), the Riemannian distance between them can be computed, as shown in Formula (10)[22]:

$$\begin{gathered} {\delta _R}(\sum {_{i}} ,\sum {_{j}} )={\left\| {\log ({{\sum {_{i}} }^{ - 1/2}}\sum {_{j}{{\sum {_{i}} }^{ - 1/2}})} } \right\|_F} \hfill \\ ={\left[ {\sum {_{{c=1}}^{C}{{\log }^2}{\lambda _c}} } \right]^{1/2}} \hfill \\ \end{gathered}$$

10

where \(\sum {_{i}}\) represents the covariance matrix of \({x}_{i}\), \(\sum {_{j}}\)denotes the covariance matrix of \({x}_{j}\), \({\lambda _c}(c=1, \ldots ,C)\) are the real eigenvalues of \({\sum {_{i}} ^{ - 1/2}}\sum {_{j}{{\sum {_{i}} }^{ - 1/2}}}\), and is the number of channels. The Riemannian geometric mean of M covariance matrices is defined by the matrix that minimizes the sum of the squared Riemannian distances, as shown in Formula (11):

𝔊\((\sum {_{{1, \ldots ,}}} \sum {_{M}} )=\mathop {\arg \hbox{min} }\limits_{{\sum { \in P(C)} }} \sum\limits_{{i=1}}^{M} {{\delta _R}^{2}(\sum {,\sum {_{i})} } }\) (11)

4) Classification

Given a training set and the corresponding labels, \({r_i} \in \left\{ {1,2} \right\}\) (1: left-foot motor intention; 2: right-foot motor intention), the training of the classifier estimates the Riemannian geometric mean for each class as follows (12):

\(\sum\limits_{{}}^{ - } {_{K}=}\) 𝔊\(\left. {(\sum {_{i}} } \right|{r_i}=K)\) (12)

where \(K \in \left\{ {1,2} \right\}\) denotes the class label. The classification of a test trial can be obtained by computing the minimum Riemannian distance to each class, as shown in Formula (13):

\(\hat {r}=\mathop {\arg \hbox{min} }\limits_{K} {\delta _R}(\sum\limits_{{}}^{ - } {_{K},\sum ) }\) (13)

where \(\hat {r}\) denotes the predictive class. To ensure the reliability of the classification results, a 10×10-fold cross validation was performed to identify the left or right foot motor intention. Figure 4 shows diagrams of the proposed method.

## 2.5.2 Baseline classification algorithm

An FBCSP based on multiple frequency spatial filtering is selected as the baseline classification algorithm under different experimental paradigms, as it is a popular method in BCI applications[14, 23]. The composition of the FBCSP algorithm includes the following steps: first, the original EEG data were filtered through the N bandpass filters to form multiple sub-band EEG data. For each sub-band dataset, a binary CSP algorithm was used to design a spatial filter based on the training set, and the calculation formula of the CSP features in a single trial was obtained as Formula (14):

$${v_{i,b}}=w_{i}^{T}{x_{i,b}}$$

14

where \({x}_{i,b}\) is the EEG data after bandpass filtering of the b-th trial and the i-th frequency band, \({w}_{i}\) is the spatial filter matrix of the corresponding frequency band, T represents the matrix transpose operation, and \({v}_{i,b}\) represents the feature matrix of the b-th trial and the i-th frequency band for each class. The first \(m\)maximum and minimum eigenvalues were selected to form a spatial filter; therefore, 2×\(m\) eigenvectors were obtained in each sub-band of every trial under two unilateral tasks. Subsequently, all the feature vectors in the n-th sub-bands were combined, and then the motor intention was decoded by the SVM classifier, which is also a classical method in BCI pattern recognition[24, 25].

According to the FBCSP algorithm process, the training-set data were filtered by different sub-bands, which mainly included alpha and beta bands, as well as the first- and second-harmonic bands of the SSSEP stimulation frequency. The specific frequency bands were occupied at 8–13, 13–26, 27–29, 32–34, 55–57, and 65–67 Hz. These six sub-bands were then selected to construct a spatial filter. After bandpass filtering of each sub-band in the training process, the FBCSP algorithm was used to build a spatial filter model, and the first two maximum and minimum eigenvalues were selected to form a spatial filter in our study. Afterwards, the spatial filter model was used in the bandpass filtered data of the test set to obtain the feature vectors of each sub-band. After combining the features of all sub-bands, an SVM classifier was used to obtain the classification accuracies. To ensure the reliability of the classification results, a 10×10-fold cross validation was performed, and only training data were involved in forming a spatial filter or SVM classifier model during the process. The results of decoding accuracy recognition among the two experimental paradigms were analyzed by the Wilcoxon signed-rank test to explore whether the accuracy of the MI-SSSEP paradigm was significantly higher than that of the MI paradigms. Moreover, the classification performances were evaluated using different features (i.e., ERD/SSSEP/hybrid features) for the MI-SSSEP, and the one-way repeated measurement ANOVA with Bonferroni post-hoc test was conducted with the classification results to explore whether the modulated SSSEP feature played a crucial role under the MI-SSSEP paradigm in improving the classification results. In addition, to explore the optimal classification methods for the modulated SSSEP features, comparisons of the proposed TRCA-based Riemannian classification method with another three methods, including the baseline method, were further investigated.