According to the World Health Organization (WHO), COVID-19 led to approximately 15 million deaths in 2020 and 2021 [1]. Moreover, COVID-19 has devastated healthcare systems around the world [2]. Because COVID-19 is highly contagious, a discreet, non-contact, and easy-to-use remote diagnosis tool is required.
Researchers have successfully detected respiratory diseases using audio signals [3]. If the audio sensor is placed at a stable position and a certain distance from the subject's nose, a smartphone microphone can pick up both faint and loud sounds [4]. Audio signals have been used to monitor sleep and to diagnose diverse diseases, including pulmonary tract infection, pneumonia, and chronic obstructive pulmonary disease (COPD) [5]. The audio signal captures the coughing sound, which carries useful information for diagnosing respiratory disease, pharyngitis, laryngitis, sinusitis, otitis media, influenza, and the common cold [6]. In particular, COVID-19 affects the sound of a person's coughing, breathing, and voice since it infects the respiratory system [7].
There have been studies that detect respiratory diseases using hardware such as external microphones [8], electronic patches [9], and mobile phones [10]. However, such additional hardware requires setup or installation, and users must purchase it. In contrast, software-based methods detect respiratory diseases using time-/frequency-domain features [4, 5]. For example, in [11], the Mel-scaled spectrogram, Mel-frequency cepstral coefficients (MFCC), tonal centroid, chromagram, and spectral contrast were extracted, and these features were fed into an ensemble of machine learning models to detect COVID-19. As another example of software-based COVID-19 detection, in [12], MFCC, first-order MFCC, second-order MFCC, and zero crossing rate (ZCR) features are extracted from breathing, coughing, and speech signals and then used as input to a neural network for COVID-19 detection. In [13], raw audio signals are converted into spectrogram images, which are used to train the proposed machine learning algorithm to detect COVID-19.
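To make the time-domain features mentioned above concrete, the following is a minimal sketch of frame-wise ZCR and a first-order difference of a feature sequence. It is not the implementation used in [12]; the function names, the frame length, and the hop size are assumptions chosen for illustration, and the plain frame-to-frame difference is only a simple stand-in for the first-order (delta) features computed by audio libraries.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def first_order_delta(features):
    """Frame-to-frame difference along time (axis 0); the first row is zero."""
    return np.diff(features, axis=0, prepend=features[:1])

# Illustrative input: one second of a 1 kHz tone sampled at 8 kHz (assumed values).
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
zcr = zero_crossing_rate(frame_signal(tone, frame_len=400, hop=200))
```

A higher-pitched or noisier signal changes sign more often, so its ZCR is larger; this is why ZCR is a cheap indicator of the spectral content of a cough or breath segment.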
Machine learning algorithms have been proposed to detect COVID-19. In [11], an ensemble-based multi-criteria decision-making (MCDM) approach was proposed to classify COVID-19 and healthy subjects. Here, MCDM is adopted to select features based on order preference or similarity. Logistic regression (LR), K-nearest neighbor (KNN), support vector machine (SVM), and decision tree algorithms were used as classifiers to detect COVID-19 in [12]. In [13], an ensemble-based deep neural network is adopted to identify COVID-19 and is tested on a breathing and coughing dataset. An ensemble-based model is also adopted in [14] and trained with spectrogram images from raw coughing data.
Among machine learning algorithms, the recurrent neural network (RNN) is well suited to sequential or time-series data [15]. Traditional neural networks treat each data sample as independent [16], while an RNN can predict the upcoming data sequence based on previous samples. Specifically, the RNN combines the output from the previous step of the sequence with the next input [16]. In [12], an RNN was implemented to differentiate between COVID-positive and healthy people. However, the RNN struggles to learn long-term dependencies and is challenging to train on longer sequences due to vanishing and exploding gradients [17]. Long short-term memory (LSTM) is a particular type of RNN with memory cells, which addresses these challenges. Owing to this advantage over the plain RNN, LSTM has been widely used for regression and classification of time-series signals [12].
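The recurrence and the vanishing-gradient issue described above can be illustrated with a small sketch. It assumes a standard tanh RNN, h_t = tanh(W_x x_t + W_h h_{t-1} + b), with randomly chosen weights and a recurrent matrix deliberately scaled to spectral norm 0.5; the names, sizes, and scaling are illustrative assumptions and are not taken from the cited works.

```python
import numpy as np

def rnn_forward(xs, W_x, W_h, b):
    """Run a tanh RNN over a sequence; each state mixes the new input with the previous state."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in xs:
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

def long_range_jacobian_norm(states, W_h):
    """Spectral norm of d h_T / d h_1, built from the per-step Jacobians diag(1 - h_t^2) W_h."""
    J = np.eye(W_h.shape[0])
    for h_t in states[1:]:
        J = (np.diag(1.0 - h_t ** 2) @ W_h) @ J
    return np.linalg.norm(J, 2)

# Assumed toy dimensions: 50 time steps, 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
xs = rng.standard_normal((50, 3))
W_x = rng.standard_normal((4, 3))
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
W_h = 0.5 * Q  # orthogonal matrix scaled so every singular value is 0.5
b = np.zeros(4)

states = rnn_forward(xs, W_x, W_h, b)
short_range = long_range_jacobian_norm(states[:5], W_h)
long_range = long_range_jacobian_norm(states, W_h)
```

With this scaling, each step multiplies the gradient by a factor of at most 0.5, so `long_range` is many orders of magnitude smaller than `short_range`: the early inputs barely influence the gradient at the end of a long sequence. A recurrent matrix with large singular values can produce the opposite, exploding, behavior. The gated memory cell of the LSTM is designed to mitigate this problem.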
Open COVID-19 coughing datasets are available online, including COUGHVID [18], Coswara [19], Virufy [20], and NoCoCoDa [21]. COUGHVID has 150 patients’ signals divided into four categories (COVID-19, asthma, bronchitis, and healthy), which are further simplified and separated into COVID-19 and healthy subjects. The data collected from these patients comprise 30,000 audio segments and 328 cough sounds. In [14], the coughing signals from the COUGHVID dataset and their spectrogram images are fed into VGG-13, a convolutional recurrent neural network (CRNN), and a gated convolutional neural network (GCNN).
In this paper, we propose an LSTM-based COVID-19 detection method that uses the raw coughing signal itself and its features. We adopt the raw coughing signals from the Virufy dataset [20]. Features are extracted from the raw signal in the time and frequency domains. We divide the coughing signal dataset into training and testing sets. Then, the coughing signals and their features from the training set are used to train the LSTM network. The rest of this paper is organized as follows: Section 2 describes the dataset and pre-processing used in this paper. The proposed algorithm is presented in Section 3. Section 4 describes our experimental results. Finally, Section 5 concludes this paper.