3.1 Architecture
The architecture of the IoT-based [20]ECG monitoring system is illustrated in Fig. 1. It consists of three main parts: the ECG sensing network, the IoT cloud, and the GUI.
The ECG sensing network is the foundation of the entire system. It is responsible for collecting physiological data from the body surface and transmitting these data to the IoT cloud through a wireless channel. Wearable ECG sensors are usually adopted in this system because they have little impact on the user’s daily life, so ECG data can be recorded over long hours or even days. The recorded ECG values are then passed to a deep learning (DL) model to detect arrhythmia and its type.
The patient is alerted, on the basis of the ECG readings, as to whether arrhythmia is present and, if so, which type.
♣ Data storage
ECG data play a vital role in the diagnosis of heart disease, so historical data need to be stored in a database for further analysis. The ECG data typically include a timestamp and the digitized signal amplitude. In addition, at least one copy of the data needs to be stored for disaster recovery. ECG data are high-dimensional and require substantial storage. Continuous monitoring also produces a streaming, nearly unbounded data set, which poses a significant challenge. Using current data engineering and management technologies, we will store this bulky but sensitive data either in on-premise Hadoop systems or in a cloud IoT database service.
♣ Data analysis
Data is one of the most important parts of the IoT system. Therefore, the IoT cloud often provides a facility to export the data for analysis, so that useful information can be extracted from the ECG signal. Here we use Convolutional Neural Networks to detect arrhythmia and its type.
♣ Disease Warning
The system detects arrhythmia on the basis of the patient’s ECG readings and then identifies the type of arrhythmia present in that patient.
The GUI is responsible for data visualization, management, and alerts. It alerts health professionals and other key stakeholders if arrhythmia is detected. It also presents the collected data so that they can be analyzed, visualized, and tracked daily, keeping the patient and their family informed and ensuring that key fluctuations in the ECG signal are not overlooked.
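The alerting step described above can be sketched as a small decision function over the classifier's output probabilities. This is only an illustrative sketch: the class labels in CLASS_NAMES and the function name build_alert are hypothetical, since the text does not list the arrhythmia classes or the alert format.

```python
# Hypothetical class labels; the text does not enumerate the arrhythmia types.
CLASS_NAMES = ["normal", "type_A", "type_B"]


def build_alert(probabilities, class_names=CLASS_NAMES, normal_label="normal"):
    """Return an alert record if the most probable class is not the normal one.

    `probabilities` is one model output (e.g. a SoftMax vector), aligned
    with `class_names`. Returns None when no alert is needed.
    """
    if len(probabilities) != len(class_names):
        raise ValueError("probabilities and class_names must have equal length")
    # Index of the most probable class.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    label = class_names[best]
    if label == normal_label:
        return None
    return {"arrhythmia": True, "type": label, "confidence": probabilities[best]}
```

A returned record could then be shown in the GUI or forwarded to health professionals; how it is delivered is outside the scope of this sketch.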
3.2 Hardware Description
- Arduino Uno: The Arduino Uno is an open-source microcontroller board based on the Microchip ATmega328P microcontroller and developed by Arduino.cc. The board is equipped with sets of digital and analog input/output (I/O) pins that may be interfaced to various expansion boards (shields) and other circuits.
- LCD: A liquid-crystal display (LCD) is a flat-panel display or other electronically modulated optical device that uses the light-modulating properties of liquid crystals combined with polarizers. Liquid crystals do not emit light directly, instead using a backlight or reflector to produce images in color or monochrome.
- ECG Sensor: Electrocardiography (ECG) is a technique for recording the electrical signals generated by the human heart. An AD8232 sensor is used to measure the electrical activity of the heart. It is a small chip, and the electrical activity it measures can be charted as an electrocardiogram.
- LM35 Temperature Sensor: The LM35 series are precision integrated-circuit temperature devices with an output voltage linearly proportional to the Centigrade temperature. The LM35 device is rated to operate over a −55°C to 150°C temperature range, while the LM35C device is rated for a −40°C to 110°C range (−10° with improved accuracy).
- [17]ESP8266 Wi-Fi Module: The ESP8266 is a low-cost Wi-Fi microchip, with a full TCP/IP stack and microcontroller capability, produced by Espressif Systems in Shanghai, China. The chip first came to the attention of Western makers in August 2014 with the ESP-01 module, made by a third-party manufacturer, Ai-Thinker.
- Arduino cable, jumper wires, potentiometer, resistor, light-emitting diodes.
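Because the LM35 output voltage is linearly proportional to temperature, a raw ADC reading can be converted to degrees Celsius with simple arithmetic. The sketch below assumes the LM35 scale factor of 10 mV/°C and a 10-bit ADC with a 5 V reference, as on a default Arduino Uno setup; neither value is stated in the text, and the function name is hypothetical.

```python
def lm35_celsius(adc_reading, vref_volts=5.0, adc_steps=1024, mv_per_degree=10.0):
    """Convert a raw ADC reading of the LM35 output to degrees Celsius.

    Assumptions (not from the text): 10-bit ADC (1024 steps), 5 V reference,
    and an LM35 scale factor of 10 mV per degree Celsius.
    """
    if not 0 <= adc_reading < adc_steps:
        raise ValueError("adc_reading out of range for the given ADC resolution")
    # Reading -> millivolts at the ADC pin.
    millivolts = adc_reading * vref_volts * 1000.0 / adc_steps
    # Millivolts -> degrees Celsius via the linear scale factor.
    return millivolts / mv_per_degree
```

For example, under these assumptions a reading of 64 corresponds to 312.5 mV, i.e. 31.25 °C.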
3.3 IoT Cloud Platform Syncing
The Arduino collects the ECG readings and transmits them over the internet, via the Wi-Fi module, to [11][12]ThingSpeak.com, an IoT cloud database service.
ThingSpeak is an open-source Internet of Things (IoT) application and API for storing and retrieving data from things using the HTTP and MQTT protocols over the Internet or via a Local Area Network.
ThingSpeak enables the creation of sensor logging applications, location tracking applications, and a social network of things with status updates.
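As an illustration of the HTTP route, the sketch below builds a request URL for ThingSpeak's channel update endpoint. The mapping of the ECG value to field1 and of temperature to field2 is an assumption about the channel configuration, and the function name and the key "KEY" are placeholders. The code only constructs the URL and performs no network call.

```python
from urllib.parse import urlencode

# ThingSpeak's HTTP endpoint for writing a new entry to a channel.
THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"


def build_update_url(write_api_key, ecg_value, temperature_c=None):
    """Build the GET URL that writes one reading to a ThingSpeak channel.

    Assumption: the channel stores the ECG value in field1 and, optionally,
    the temperature in field2. The actual field layout is not given in the text.
    """
    params = {"api_key": write_api_key, "field1": ecg_value}
    if temperature_c is not None:
        params["field2"] = temperature_c
    return f"{THINGSPEAK_UPDATE_URL}?{urlencode(params)}"
```

The resulting URL can be requested by any HTTP client, for example by the Wi-Fi module on the device side.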
3.4 Neural Networks
A neural network is a series of algorithms that endeavours to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates. In this sense, neural networks refer to systems of neurons, either organic or artificial in nature.
3.5 Convolutional Neural Network
A [18]Convolutional Neural Network (ConvNet/CNN) is a Deep Learning algorithm which can take in an input image, assign importance (learnable weights and biases) to various aspects/objects in the image and be able to differentiate one from the other. The pre-processing required in a ConvNet is much lower as compared to other classification algorithms. While in primitive methods filters are hand-engineered, with enough training, ConvNets have the ability to learn these filters/characteristics.
Convolutional Neural Networks, or CNNs, were designed to map image data to an output variable. They have proven so effective that they are the go-to method for prediction problems involving image data as input. The benefit of using CNNs is their ability to develop an internal representation of a two-dimensional image. This allows the model to learn position- and scale-invariant structures in the data, which is important when working with images. The CNN input is traditionally two-dimensional (a field or matrix), but it can also be one-dimensional, allowing the network to develop an internal representation of a one-dimensional sequence. This allows the CNN to be used more generally on other types of data that have a spatial or ordered relationship. For example, there is an ordered relationship between the words in a text document and between the time steps of a time series. Although not specifically developed for non-image data, CNNs have achieved state-of-the-art results on problems such as document classification used in sentiment analysis and related problems.
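To make the one-dimensional case concrete, the following sketch slides a small filter over a toy sequence. It is a plain-Python illustration of the operation a 1D convolutional layer performs (without padding, stride 1, and without flipping the kernel, as is common in deep learning libraries); the signal and kernel values are made up for the example.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` with stride 1 and no padding.

    Each output value is the weighted sum of one window of the signal,
    so the output has len(signal) - len(kernel) + 1 entries.
    """
    k = len(kernel)
    if k == 0 or k > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Toy sequence with a single peak, and a simple difference filter.
toy_signal = [0, 0, 1, 5, 1, 0, 0]
difference_kernel = [-1, 0, 1]
features = conv1d_valid(toy_signal, difference_kernel)  # [1, 5, 0, -5, -1]
```

The filter responds positively on the rising edge of the peak and negatively on the falling edge. In a trained CNN the kernel weights are learned rather than hand-chosen.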
3.6 Model Preparation Dataset Description
The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979. Twenty-three recordings were chosen at random from a set of 4000 24-hour ambulatory ECG recordings collected from a mixed population of inpatients (about 60%) and outpatients (about 40%) at Boston's Beth Israel Hospital; the remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias that would not be well-represented in a small random sample.
The recordings were digitized at 360 samples per second per channel with 11-bit resolution over a 10 mV range. Two or more cardiologists independently annotated each record; disagreements were resolved to obtain the computer-readable reference annotations for each beat (approximately 110,000 annotations in all) included with the database.
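The digitization figures above imply the following sizes. The sketch is simple arithmetic on the stated values (360 samples per second, 11 bits, 10 mV range, half-hour excerpts) and treats the converter as ideal; the gain recorded in the database's own header files may differ slightly from this nominal step size.

```python
SAMPLE_RATE_HZ = 360      # samples per second per channel (from the text)
ADC_BITS = 11             # resolution in bits (from the text)
INPUT_RANGE_MV = 10.0     # full-scale input range in millivolts (from the text)
EXCERPT_MINUTES = 30      # each excerpt is half an hour long (from the text)


def samples_per_channel(minutes=EXCERPT_MINUTES, rate_hz=SAMPLE_RATE_HZ):
    """Number of samples in one channel of one excerpt."""
    return minutes * 60 * rate_hz


def millivolts_per_step(range_mv=INPUT_RANGE_MV, bits=ADC_BITS):
    """Nominal amplitude step of an ideal ADC with the stated range and resolution."""
    return range_mv / (2 ** bits)
```

Under these assumptions each channel of a record holds 648,000 samples, and one quantization step corresponds to roughly 0.0049 mV.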
3.7 Data Visualizations
Re-sampling is performed so that no class dominates the others; class imbalance can reduce the model’s accuracy and its effectiveness in classifying the under-represented classes.
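One common way to re-sample is to bring every class to the same number of examples, down-sampling large classes and up-sampling small ones with replacement. The text does not say which re-sampling scheme or target size was used, so the function below, its name, and its target_per_class parameter are illustrative assumptions.

```python
import random
from collections import defaultdict


def balance_classes(samples, labels, target_per_class, seed=0):
    """Return (samples, labels) with exactly `target_per_class` items per class.

    Classes with more items are down-sampled without replacement; classes
    with fewer items keep all originals and are topped up by sampling
    with replacement.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    out_samples, out_labels = [], []
    for y in sorted(by_class):
        group = by_class[y]
        if len(group) >= target_per_class:
            chosen = rng.sample(group, target_per_class)
        else:
            extra = rng.choices(group, k=target_per_class - len(group))
            chosen = group + extra
        out_samples.extend(chosen)
        out_labels.extend([y] * target_per_class)
    return out_samples, out_labels
```

Re-sampling is normally applied to the training data only, so that the test set still reflects the original class distribution.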
3.8 Convolutional Neural Network Architecture
We split the dataset into training and test sets. The target variable is then encoded so that it can be used for classification.
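A minimal sketch of these two steps is given below: one-hot encoding of the class labels and a shuffled split of the example indices. The 20% test fraction, the fixed seed, and the function names are assumptions for illustration; the text does not state the split ratio or the encoding scheme.

```python
import random


def one_hot(labels):
    """Encode labels as one-hot vectors; returns (sorted class list, encoded labels)."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    encoded = [
        [1 if index[y] == i else 0 for i in range(len(classes))]
        for y in labels
    ]
    return classes, encoded


def train_test_split_indices(n_examples, test_fraction=0.2, seed=0):
    """Shuffle the indices 0..n_examples-1 and split them into (train, test)."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_examples * test_fraction))
    return indices[n_test:], indices[:n_test]
```

One-hot targets pair naturally with a SoftMax output layer, where each output unit corresponds to one class.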
To build the best model, we applied a Convolutional Neural Network with 6 hidden layers, including 5 convolutional layers, using techniques such as dropout, batch normalization, and pooling (max, min, average), followed by 1 output layer.
The output of the convolutional layers, a multi-dimensional feature map, is flattened, because the purpose of the network is classification rather than other CNN tasks such as object detection or face detection. The SoftMax activation function is applied at the output layer because our model performs multiclass classification. The Adam optimizer is used for gradient descent because it typically converges quickly and is generally regarded as less prone to stalling in poor local minima. As the data size grows, efficient model building becomes increasingly important for creating and tuning the best possible models.
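For reference, the SoftMax function turns the raw scores of the output layer into class probabilities that sum to 1. The following is a small, numerically stable sketch in plain Python, not the implementation used inside any particular deep learning library.

```python
import math


def softmax(logits):
    """Map raw scores to probabilities that are positive and sum to 1."""
    # Subtracting the maximum avoids overflow in exp() without changing the result.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The class with the largest score always receives the largest probability, which is why the predicted class is taken as the index of the maximum output.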
The layers and activation functions used for the 5 convolutional layers are as follows:
1st Layer - Convolution1D with ‘elu’ activation function with Batch Normalization and MaxPool1D.
2nd Layer - Convolution1D with ‘elu’ activation function with Batch Normalization.
3rd Layer - Convolution1D with ‘elu’ activation function with Batch Normalization.
4th Layer - Convolution1D with ‘elu’ activation function with Batch Normalization.
5th Layer - Convolution1D with ‘elu’ activation function with Batch Normalization and MaxPool1D.
Here, ‘elu’ stands for Exponential Linear Unit.
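The sketch below shows the ELU activation and traces how the sequence length changes through the five layers listed above (pooling after the 1st and 5th layers). The kernel size, pooling size, unpadded convolutions, and the input length of 187 samples are hypothetical values chosen only for illustration; the text does not specify them.

```python
import math


def elu(x, alpha=1.0):
    """Exponential Linear Unit: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


# (layer name, followed by max pooling?) in the order given in the text.
CONV_LAYERS = [
    ("conv1", True),
    ("conv2", False),
    ("conv3", False),
    ("conv4", False),
    ("conv5", True),
]


def trace_lengths(input_length, kernel_size=3, pool_size=2):
    """Sequence length after each layer, for unpadded stride-1 convolutions.

    kernel_size and pool_size are hypothetical; pooling uses a stride equal
    to the pool size and drops any incomplete final window.
    """
    lengths = []
    n = input_length
    for _name, has_pool in CONV_LAYERS:
        n = n - kernel_size + 1          # unpadded convolution
        if has_pool:
            n = n // pool_size           # max pooling
        lengths.append(n)
    return lengths
```

With these hypothetical settings, an input of 187 samples shrinks to lengths 92, 90, 88, 86, and 42 after the five layers. Batch normalization and ELU do not change the sequence length.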
Output Layer - After trying various configurations, the best result was obtained with a flattened dense layer followed by a final layer with the SoftMax activation function, which gives the final class probabilities. The whole network was trained using the Adam optimizer. To evaluate the training error, we trained for various numbers of epochs. We found that with more epochs the model overfitted the training set, so the optimum number of epochs turned out to be around 10.
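One common way to choose the number of epochs is to monitor the loss on held-out validation data and stop once it no longer improves. The sketch below illustrates that idea only; it is not necessarily how the value of about 10 epochs was chosen here, and the patience parameter and function name are assumptions.

```python
def best_epoch(val_losses, patience=3):
    """Return the 1-based epoch with the lowest validation loss seen before stopping.

    Training is considered stopped once `patience` consecutive epochs
    have passed without a new best validation loss.
    """
    best_index, best_loss = None, float("inf")
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_index, best_loss = i, loss
        elif best_index is not None and i - best_index >= patience:
            break  # no improvement for `patience` epochs: stop early
    if best_index is None:
        raise ValueError("val_losses must contain at least one finite value")
    return best_index + 1
```

A validation loss that rises while the training loss keeps falling is the usual sign of the overfitting described above.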
3.9 Evaluation Metrics
[13]Precision quantifies the fraction of positive class predictions that actually belong to the positive class, i.e., Precision = TP / (TP + FP), where TP and FP are the numbers of true positives and false positives. Precision should ideally be 1 (high) for a good classifier. Precision becomes 1 only when the numerator and denominator are equal, i.e., TP = TP + FP, which means FP is zero. As FP increases, the denominator becomes greater than the numerator and precision decreases (which we don’t want).
[14]Recall quantifies the fraction of all positive examples in the dataset that are predicted as positive, i.e., Recall = TP / (TP + FN), where FN is the number of false negatives. Recall should ideally be 1 (high) for a good classifier. Recall becomes 1 only when the numerator and denominator are equal, i.e., TP = TP + FN, which means FN is zero. As FN increases, the denominator becomes greater than the numerator and recall decreases (which we don’t want).
So ideally, in a good classifier, we want both precision and recall to be 1, which also means FP and FN are zero. We therefore need a metric that takes both precision and recall into account; the F1 score is such a metric.
[15]F-Measure provides a single score that balances both the concerns of precision and recall in one number. The F1 score is the harmonic mean of precision and recall, F1 = 2 · Precision · Recall / (Precision + Recall). It becomes 1 only when precision and recall are both 1, and it is high only when both are high. On imbalanced data it is often a more informative measure than accuracy. A high F1 score means that there are few false positives and few false negatives, so real threats are correctly identified and false alarms are rare. An F1 score is considered perfect when it is 1, while the model is a total failure when it is 0.
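These three metrics can be computed directly from the counts of true positives (TP), false positives (FP), and false negatives (FN). The sketch below uses made-up counts for illustration and returns 0.0 when a denominator is zero, which is one common convention rather than the only one.

```python
def precision(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of actual positives that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Illustrative counts only (not results from this work).
p = precision(tp=6, fp=2)   # 0.75
r = recall(tp=6, fn=4)      # 0.6
f1 = f1_score(p, r)         # 2/3, pulled toward the lower of the two values
```

For multiclass problems such as arrhythmia typing, these values are usually computed per class and then averaged.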
In the field of machine learning and specifically the problem of statistical classification, a [16]confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). Each row of the matrix represents the instances in a predicted class, while each column represents the instances in an actual class (or vice versa). The name stems from the fact that it makes it easy to see whether the system is confusing two classes (i.e., commonly mislabeling one as another).
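A confusion matrix can be built by counting (actual, predicted) pairs. The sketch below follows the convention stated above, with rows for predicted classes and columns for actual classes; the class labels and example predictions are hypothetical.

```python
def confusion_matrix(actual, predicted, classes):
    """Count predictions per class pair.

    matrix[i][j] is the number of examples predicted as classes[i]
    whose actual class is classes[j] (rows: predicted, columns: actual).
    """
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[p]][index[a]] += 1
    return matrix


# Hypothetical labels and predictions for illustration.
example_classes = ["N", "S", "V"]
example_actual = ["N", "N", "V", "V", "S"]
example_predicted = ["N", "V", "V", "V", "N"]
example_matrix = confusion_matrix(example_actual, example_predicted, example_classes)
# [[1, 1, 0],
#  [0, 0, 0],
#  [1, 0, 2]]
```

Off-diagonal entries show which classes are confused: here one actual "S" beat was predicted as "N", and one actual "N" beat was predicted as "V".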