EEG Recording Device
Although a large body of research exists on human speech detection and recognition modeling, it remains challenging to acquire EEG signal data from the human brain with a wireless EEG device and to transmit it wirelessly to the computer interface on which a speech recognition model is built. Data acquisition from wireless devices is always difficult owing to inferior signal conditions. Wireless devices nevertheless offer benefits such as easy connection, easy data transmission, low price, and ease of mounting on the head.
In this research work, the Epoc Signal Server is used extensively to stream the raw EEG data into Simulink, where mathematical signal processing is performed for feature extraction and classification ranking. In addition, the Emotiv Control Panel is used to check the connectivity strength of the electrodes before recording and training begin. The Emotiv Testbench and the Emotiv Brain Activity Map are used for visual analysis alongside the Simulink-recorded data, providing a better strategy for analyzing the data efficiently.
The EEG device collects the data emitted by the cerebral cortex of the brain. It has 16 electrodes, which are placed according to the 10-20 system. The device communicates wirelessly with the laptop, so additional components are required to support its functions. The EEG device used is the Emotiv EPOC+ [12], as shown in Figure 3.
Data Acquisition
The whole system starts with the EEG device. There are 16 sensors on the EEG device, and their locations are fixed according to the 10-20 system. Two of the 16 sensors are used as reference points and are placed on the mastoid bones behind the ears. The sensors are located at Fp1, Fp2, F3, F4, F7, F8, C3, C4, T3, T4, P3, P4, T5, T6, O1, and O2. The data collected by each sensor is treated as a separate channel. The sensors measure the potential difference of the electrical signals fired by the neurons in the brain, in units of microvolts.
The mobile EEG device collects data at a sampling rate of 128 Hz. This sampling rate produces fewer samples per recording, so the computational power and time required to train and test the classifiers are reduced drastically. Each recorded activity contains 256 samples. Several variables and arrays were initialized, including the 14 selected EEG channels; these 14 channels were selected because the Emotiv EPOC device used in this research work reads data from 14 channels. The number of participants was then checked, and a loop opened all the raw EEG files so that the EEG signals could be analyzed channel by channel.
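The channel-selection step above can be sketched as follows (in Python rather than MATLAB, as an illustrative sketch only; the column positions of the 14 EEG channels inside the 25-column raw matrix are an assumption, since the raw file layout is not specified in the text):

```python
import numpy as np

FS = 128                  # sampling rate (Hz)
SAMPLES = 2 * FS          # 256 samples per 2-second recording
EEG_COLS = slice(3, 17)   # hypothetical positions of the 14 EEG columns
                          # inside the 25-column raw matrix

def select_eeg_channels(raw):
    """Keep only the 14 EEG channels from a 256 x 25 raw recording."""
    assert raw.shape == (SAMPLES, 25), "unexpected recording shape"
    return raw[:, EEG_COLS]           # -> shape (256, 14)
```

In practice this slicing would be applied inside the loop over each participant's raw EEG files before any channel-by-channel analysis.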
The computing device receives data from the EEG device over a Bluetooth connection. The received data has 25 channels, nine more than the sensor channels; these nine additional channels carry other data such as the timestamp, counter, marker signal, synchronization signal, and gyroscope values. A sample is collected every 0.0078125 s, based on the 128 Hz sampling rate, so at the end of a two-second recording 256 samples have been collected and tabulated in matrix form, giving a matrix of dimensions 256 × 25. A high-pass filter with a cut-off frequency of 5 Hz is then applied to reduce the effects of DC offset and to filter out low-frequency noise that may exist in the signal. The data is saved on the computing device in matrix format, allowing the files to be accessed in MATLAB during classifier training.
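The 5 Hz high-pass step can be sketched like this (a minimal Python sketch; the text specifies only the 5 Hz cut-off, so the 4th-order Butterworth design and zero-phase filtering are assumptions):

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 128        # sampling rate (Hz)
CUTOFF = 5.0    # high-pass cut-off frequency from the text (Hz)

def highpass(eeg, order=4):
    """Remove DC offset and low-frequency drift from each channel.

    `eeg` is a (samples, channels) array. The Butterworth order and
    the zero-phase filtfilt application are illustrative choices.
    """
    b, a = butter(order, CUTOFF / (FS / 2), btype="highpass")
    return filtfilt(b, a, eeg, axis=0)
```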
Data Acquisition Protocol
First, the participants were briefed on the experiment to be carried out and informed that the acquired data would be used purely for this research work, following the code-of-ethics guidelines of Anna University. A data acquisition protocol was developed and explained to the participants, all of whom agreed to the instructions for the recording. A total of 10 participants were involved, and each was tested separately with two different CNN models. The participants were asked to imagine 10 words in sequential order, paced by a stopwatch clock placed in front of them. Each imagined word was recorded for a duration of 3 s, followed by a 5 s gap of silence, so the 10 imagined words were recorded over a period of 75 s for each participant.
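The timing of this protocol can be checked with a short sketch (the word list is taken from the classifier outputs described later; the 3 s record / 5 s gap pattern, with no gap after the last word, reproduces the stated 75 s total):

```python
# Timing sketch of the acquisition protocol: 10 imagined words, each
# recorded for 3 s and followed by a 5 s silent gap (no gap after the last).
WORDS = ["left", "right", "up", "down", "front",
         "back", "stop", "pick", "red", "blue"]
RECORD_S, GAP_S = 3, 5

schedule = []
t = 0
for i, w in enumerate(WORDS):
    schedule.append((w, t, t + RECORD_S))          # (word, start, end) in seconds
    t += RECORD_S + (GAP_S if i < len(WORDS) - 1 else 0)

total = schedule[-1][2]                            # 75 s, matching the text
```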
Experiment
A Convolutional Neural Network (CNN) is used to predict the imagined word: the Continuous Wavelet Transform (CWT) converts the EEG signals into scalogram images, and a CNN is able to capture the time-related and spatial dependencies of an image when the relevant filters are applied. The method is shown in Figure 4 and explained in sequence as follows:
Pre-processing using Morlet Continuous Wavelet Transform
To pre-process the dataset and normalize the label values, the dataset size and format were first checked by loading the dataset. The CWT uses a window function that is shifted and scaled against the mother wavelet during the conversion. This allows windowing over a longer time interval at low frequencies and over a shorter time interval at high frequencies. Because the window size varies, the CWT provides a highly effective analysis of both the low- and high-frequency content of the non-stationary EEG signal. The spectral analysis was performed with the Morlet-wavelet CWT (MCWT), as it is well suited to non-stationary EEG signals.
The MCWT can be represented mathematically by equation 1, defined as
$${W}_{x}\left(s,{\tau }\right)=\frac{1}{\sqrt{s}}\int _{-{\infty }}^{+{\infty }}x\left(t\right){{\psi }}^{*}\left(\frac{t-{\tau }}{s}\right)dt \quad (1)$$
where
\({W}_{x}\left(s,{\tau }\right)\) = wavelet coefficient
\(x\left(t\right)\) = time-domain signal
\({{\psi }}^{*}\left(t\right)\) = complex conjugate of the wavelet function
\(s\) = scale parameter
\({\tau }\) = translation (position) parameter
The MCWT produced one scalogram per channel; since each trial has 14 channels, the 14 scalograms were combined into one image, and this combined image is used as the input to the CNN.
For example, the location [1, 1, 1:8064] represents the data of the first electrode in the first of the 40 trials, and the 8064 entries are the samples recorded, since the sampling frequency is 128 Hz and the data length is 63 seconds. After obtaining the data from one electrode, the MCWT is applied with a sampling frequency of 128 Hz to convert the data into a scalogram.
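The per-channel scalogram computation can be sketched with the PyWavelets Morlet CWT (a Python sketch; the number of scales is an illustrative choice, as the text does not state the scale range used):

```python
import numpy as np
import pywt

FS = 128  # sampling rate (Hz)

def morlet_scalogram(signal, n_scales=64):
    """Compute a Morlet-CWT scalogram (|coefficients|) for one channel.

    `signal` is a 1-D array of samples; the 64-scale range is assumed
    for illustration only.
    """
    scales = np.arange(1, n_scales + 1)
    coef, freqs = pywt.cwt(signal, scales, "morl", sampling_period=1 / FS)
    return np.abs(coef)               # shape (n_scales, len(signal))
```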
The scalogram produced is shown in Figure 5; however, the generated image has a label and a white bar covering part of the image, which increases the training duration and reduces the accuracy of the CNN.
Finally, the 14 scalograms, which represent the 14 electrodes recording the EEG signal at the same instant, are combined into one image, as shown in Figure 6. This eases label matching and allows the CNN to learn the direct relationships and differences between the scalograms of the same instant when changes appear. The combined image is then saved into its designated folder.
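Combining the 14 per-channel scalograms into one image can be sketched as tiling them into a grid (the 2 × 7 layout is an assumption; the text only says the 14 scalograms are combined into a single image):

```python
import numpy as np

def combine_scalograms(scalograms, rows=2, cols=7):
    """Tile 14 per-channel scalograms into a single image.

    `scalograms` is a list of equally sized 2-D arrays; the 2 x 7 grid
    layout is an illustrative assumption.
    """
    assert len(scalograms) == rows * cols
    h, w = scalograms[0].shape
    grid = np.zeros((rows * h, cols * w))
    for i, s in enumerate(scalograms):
        r, c = divmod(i, cols)
        grid[r * h:(r + 1) * h, c * w:(c + 1) * w] = s
    return grid
```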
Normalisation
The dataset label values were normalized to 1 (High) and 0 (Low), where 0 indicates a value in the range 1-5 and 1 indicates a value in the range 6-9. This normalization is required so that the accuracy of the system can be increased by reducing the wide range of the parameter values.
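A minimal sketch of this label normalization, assuming a 1-9 rating scale with 6-9 treated as "High":

```python
def normalize_label(rating):
    """Map a 1-9 rating to a binary label: 0 (Low) for 1-5, 1 (High) for 6-9."""
    return 0 if rating <= 5 else 1
```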
Feature Extraction
Based on the literature, it was decided to extract 7 features for the first round of training the two models and then to reduce them to 4 features for the second round. The seven features extracted from each signal are:
1. Mean is defined as the average value of the frame, given as
$$\mu =\frac{1}{N}\sum _{n=1}^{N}x\left(n\right) \quad (2)$$
2. Standard deviation is defined as,
$$\sigma =\sqrt{\frac{1}{N-1}\sum _{n=1}^{N}{\left(x\left(n\right)-\mu \right)}^{2}} \quad (3)$$
3. Skewness is defined as the asymmetry of the distribution about its mean value and is given as,
$$s=\frac{1}{N{\sigma }^{3}}\sum _{n=1}^{N}{\left(x\left(n\right)-\mu \right)}^{3} \quad (4)$$
4. Kurtosis is defined as the 4th order central moment of the distribution in the given frame and is given as,
$$k=\frac{1}{N{\sigma }^{4}}\sum _{n=1}^{N}{\left(x\left(n\right)-\mu \right)}^{4} \quad (5)$$
5. Band power is defined as the average power of the signal and is given as,
$$P=\frac{1}{N}\sum _{n=1}^{N}{\left|x\left(n\right)\right|}^{2} \quad (6)$$
6. Root Mean Square is defined as,
$$RMS=\sqrt{\frac{1}{N}\sum _{n=1}^{N}{x\left(n\right)}^{2}} \quad (7)$$
7. Shannon entropy is defined as a measure of the spectral distribution of the signal, or the amount of information it carries, and is given as,
$$SE=-\sum _{n=1}^{N}x\left(n\right){log}_{2}x\left(n\right) \quad (8)$$
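The seven features above can be computed per frame with a short sketch (a Python sketch using SciPy; normalizing the squared amplitudes before the entropy computation is an implementation choice the text leaves open):

```python
import numpy as np
from scipy.stats import skew, kurtosis

def extract_features(x):
    """Compute the seven statistical features (equations 2-8) for one frame.

    `x` is a 1-D array of samples. SciPy's moment normalizations differ
    slightly from equations 4-5, so this is an approximate sketch.
    """
    mu = np.mean(x)                                   # eq. 2, mean
    sigma = np.std(x, ddof=1)                         # eq. 3, standard deviation
    s = skew(x)                                       # eq. 4, skewness
    k = kurtosis(x, fisher=False)                     # eq. 5, kurtosis
    power = np.mean(np.abs(x) ** 2)                   # eq. 6, average power
    rms = np.sqrt(np.mean(x ** 2))                    # eq. 7, root mean square
    p = np.abs(x) ** 2 / np.sum(np.abs(x) ** 2)       # normalized "probabilities"
    entropy = -np.sum(p * np.log2(p + 1e-12))         # eq. 8, Shannon entropy
    return np.array([mu, sigma, s, k, power, rms, entropy])
```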
Training of CNN
The CNN is trained using the pre-processed dataset and the normalized dataset labels. The CNN can be divided into two parts: the first is the feature-learning layer, which extracts features from the input; the second is the classification layer, where the extracted features are flattened into a column vector for the feed-forward neural network to perform training.
ReLU activation function
The Rectified Linear Unit (ReLU) was used because of its computational efficiency: it does not restrict the upper range of the activation, whose output spans 0 to infinity. It also helps avoid overfitting issues and long training durations. The ReLU function can be represented mathematically as shown in equation 9,
$$f\left(x\right)=max\left(0,x\right) \quad (9)$$
AlexNet is an 8-layer-deep convolutional neural network capable of classifying images into 1000 object categories, having been trained on more than a million images. Its fully connected layer is used to classify the input, and the number of classification outputs of both the fully connected layer and the output layer is 1000. However, the classification output required in this work is 10 categories.
The backpropagation method was used to adjust the weights and biases in every iteration over a series of epochs until the fully connected layer was able to perform the classification.
Finally, the output of the fully connected layer is sent to the softmax classification layer for the final classification into a label, and the trained CNN model is then used for testing. The number of softmax outputs equals the desired number of classes, which in this case is 10 (left, right, up, down, front, back, stop, pick, red, blue).
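The final softmax step can be sketched as follows (a NumPy sketch of the standard softmax; the label ordering is taken from the list above):

```python
import numpy as np

LABELS = ["left", "right", "up", "down", "front",
          "back", "stop", "pick", "red", "blue"]

def softmax_predict(logits):
    """Turn the fully connected layer's 10 logits into a predicted word."""
    z = logits - np.max(logits)            # subtract max for numerical stability
    probs = np.exp(z) / np.sum(np.exp(z))  # probabilities summing to 1
    return LABELS[int(np.argmax(probs))], probs
```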