3.1 General insight of COVID-19 confirmed cases and deaths in Malaysia
This study visualised the trend of cumulative COVID-19 confirmed cases and deaths in Malaysia from July 2020 to June 2021 (Fig. 1). A marked increasing trend in cumulative confirmed cases was observed starting from Day 70 (8th September 2020), when a triple-digit number of 100 cases was recorded (Fig. 1a). A similar increasing pattern was observed in cumulative confirmed deaths from Day 84 (22nd September 2020), with 3 deaths (Fig. 1b). The deaths confirmed on 22nd September 2020 involved two individuals in Sabah, aged 48 and 54, who showed symptoms of COVID-19 on Day 76 (14th September 2020) and Day 80 (18th September 2020), respectively. The other confirmed death involved an asymptomatic 72-year-old individual from Alor Setar who had tested positive on 19th August 2020 (Day 50). In general, the signs of COVID-19 appear after an incubation period of 1 – 14 days, most commonly after about five days [13]. Based on estimates of the serial interval and incubation period, 44% of transmission was estimated to have occurred before symptoms appeared [14, 15]. Previous studies also reported a significant relationship between viral load and incubation period, in which the viral load begins to increase 5 to 6 days before the first symptoms appear [14, 16, 17]. The incubation period becomes shorter when viral loads are high, which corresponds to low Cycle Threshold (Ct) values. Since the viral loads evolve, high viral loads are probably the primary cause of transmission [16, 17].
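The day numbering used throughout this section can be reproduced with a short date calculation. The sketch below assumes that Day 1 corresponds to 1st July 2020, which is inferred from the Day/date pairs quoted in the text rather than stated explicitly; the names STUDY_START, day_index and day_to_date are illustrative only.

```python
from datetime import date, timedelta

# Assumption: Day 1 is 1 July 2020 (inferred from the Day/date pairs in the text).
STUDY_START = date(2020, 7, 1)


def day_index(d: date) -> int:
    """Return the study day number for a calendar date (Day 1 = STUDY_START)."""
    return (d - STUDY_START).days + 1


def day_to_date(n: int) -> date:
    """Return the calendar date for a study day number."""
    return STUDY_START + timedelta(days=n - 1)


if __name__ == "__main__":
    print(day_index(date(2020, 9, 8)))   # 70
    print(day_index(date(2020, 9, 22)))  # 84
    print(day_to_date(365))              # 2021-06-30
```

Under this assumption, the Day/date pairs reported for 2020 and 2021 in this section are mutually consistent, and Day 365 falls on 30th June 2021.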
Daily confirmed cases and deaths in Malaysia increased significantly (p-value < 0.05) from July 2020 to June 2021 (Table 1). A strong positive correlation between the number of confirmed cases and deaths was also observed (0.907, p-value < 0.05), indicating that high case numbers were accompanied by high mortality (Table 1). Malaysia had previously curbed the first and second waves of the outbreak, keeping confirmed cases below 100 per day from July until early September 2020 (Day 1 – Day 69) (Fig. 1a) [1]. However, confirmed cases rose in the fourth week of September 2020 (Day 85 – Day 92), marking the start of the third epidemic wave in Malaysia [1]. The increase occurred right after the state election in Sabah on 26th September 2020 (Day 88) [1]. Many cases were associated with high-risk areas in Sabah: 29 clusters were located in Sabah, and 26 clusters had an index case with a travel history to Sabah, mainly to the east of the state, including the Lahad Datu, Semporna, Tawau, Kunak and Sandakan areas [18]. Despite the rising number of confirmed cases in Sabah, movement across the country was not restricted. Swab tests were not mandatory before travelling between states, and confirmed cases continued to escalate from single digits to thousands per day [19].
Furthermore, the situation in Sabah worsened owing to a lack of awareness about COVID-19 and its symptoms, especially among people living in rural areas, failure to comply with the instructions of health officers, and a shortage of healthcare workers in Sabah's hospitals, which caused a backlog of 10 400 COVID-19 test samples [20]. According to the Department of Statistics Malaysia Official Portal in 2021, Sabah was one of the three most populous states, accounting for 11.7% of the population, behind Selangor with 20.1% and ahead of Johor with 11.6% [21]. However, the population density of 99 people per square kilometre in Sabah (52/km2) is not as high as in Federal Territory (FT) Kuala Lumpur (7188/km2), FT Putrajaya (2354/km2), Selangor (674/km2), and Johor (174/km2) [19, 21]. Although Sabah is less densely populated than Peninsular Malaysia, the majority of its 3.83 million people are settled along the coastline rather than in the mountainous interior, which caused the spike of COVID-19 cases in those areas after the Sabah state election [22]. In addition, the presence of irregular and undocumented migrants made COVID-19 screening and contact tracing in Sabah more challenging, since they were at risk of detention or deportation if found, which made robust and reliable data difficult to obtain [23].
Based on Figure 1(a), the increase in confirmed cases during Day 70 – Day 215 (8th September 2020 – 31st January 2021) was not as steep as that during Day 280 – Day 340 (6th April 2021 – 5th June 2021). The onset of triple-digit COVID-19 cases was observed on Day 70 (8th September 2020) during the recovery movement control order (RMCO), and cases later increased exponentially during the conditional movement control order 2.0 (CMCO 2.0) in Day 106 – Day 196 (14th October 2020 – 12th January 2021) [24]. The exponential increase during CMCO 2.0 was due to the emergence of new clusters right after the Sabah state election held on 26th September 2020. The Malaysian government then decided to implement the movement control order 2.0 (MCO 2.0) on 13th January 2021 (Day 197) after observing worrying COVID-19 numbers that reached thousands per day [25]. During Day 197 – Day 247 (13th January 2021 – 4th March 2021), daily COVID-19 cases showed a decreasing trend under MCO 2.0. However, MCO 2.0 did not last long. The government announced the third CMCO on 5th March 2021 to safeguard Malaysia's economy [26]. Although MCO 2.0 was less strict than MCO 1.0 and allowed most businesses to operate, Malaysia still recorded a loss of RM 600 million per day, since most businesses were struggling in the recovery phase and investors remained pessimistic [27].
During CMCO 3.0 and MCO 3.0 (Day 280 – Day 340), the spike in COVID-19 cases was higher than during CMCO 2.0 and MCO 2.0 (Day 70 – Day 215) owing to mass testing in Selangor and Penang, the public's failure to comply with the standard operating procedures (SOPs), and the emergence of new coronavirus variants with higher infection rates, namely the United Kingdom variant (Alpha, B.1.1.7), the South African variant (Beta, B.1.351), and the Indian variant (Delta, B.1.617.2) [28]. Social gatherings and the concentration of people in crowded spaces are the primary causes of COVID-19 spread, reflecting the public's difficulty in complying with the SOPs. In Selangor, the state government decided to fully utilise the antigen rapid test kit (RTK-Antigen) method for mass testing, since results can be obtained on the same day, whereas results from the reverse transcription-polymerase chain reaction (RT-PCR) method can take up to three days and cause a backlog [29]. The purpose of mass testing using RTK-Antigen was to promptly detect and isolate silent carriers and to better understand the positivity rate and hotspots. Therefore, a higher spike in COVID-19 cases than in the previous period was not surprising. The increasing number of COVID-19 cases also overburdened the healthcare system, particularly in highly affected states such as Selangor, Sarawak, Penang, Kelantan and FT Kuala Lumpur, leading to an escalation of COVID-19 deaths [18].
3.2 Correlation among states using network analysis
In the current study, a network analysis was carried out to determine the relationships among states in Malaysia in terms of confirmed COVID-19 cases. COVID-19 pandemic risk can be assessed and visualised using correlation and network analysis. States that are densely connected with others exhibit a denser set of edges in the network graph, suggesting that they are critical centres of virus transmission throughout the network [8]. In this study, Spearman's rank correlation coefficient was used to measure the polarity (-1 to 1) of the correlation between states based on daily confirmed cases. A positive value of the Spearman rank correlation represents co-existence, whereas a negative value indicates opposition between two states. The study period was divided into four quarters: quarter 3 (Q3) of 2020 (July – September), quarter 4 (Q4) of 2020 (October – December), quarter 1 (Q1) of 2021 (January – March), and quarter 2 (Q2) of 2021 (April – June). Because daily confirmed cases fluctuated, this study investigated the correlations between states that accompanied the spikes in case numbers. Table 2 summarises the number of nodes and edges and the analysis time for each quarter. Only statistically significant correlations (p-value < 0.05) are discussed in this section.
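As an illustration of how a correlation network of this kind can be assembled, the following minimal sketch computes Spearman's rank correlation for every pair of series and keeps an edge only when the correlation is significant at p < 0.05. It is not the software or procedure used in this study: the significance test shown is a permutation test (the study does not state how its p-values were obtained), the series labelled A to D are hypothetical numbers rather than Malaysian case counts, and all function names are illustrative.

```python
import math
import random
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for Spearman's rho (assumed test)."""
    rng = random.Random(seed)
    rx, ry = ranks(x), ranks(y)
    observed = abs(pearson(rx, ry))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ry)
        if abs(pearson(rx, ry)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def build_network(series, alpha=0.05):
    """Map each significantly correlated pair of nodes to its rho."""
    edges = {}
    for a, b in combinations(sorted(series), 2):
        if permutation_p(series[a], series[b]) < alpha:
            edges[(a, b)] = spearman(series[a], series[b])
    return edges


def degrees(series, edges):
    """Number of significant edges attached to each node."""
    deg = {name: 0 for name in series}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Hypothetical daily counts for four regions (not real data).
daily = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "B": [2, 4, 5, 9, 11, 12, 15, 18, 20, 25],
    "C": [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
    "D": [5, 1, 9, 3, 7, 2, 10, 4, 8, 6],
}
edges = build_network(daily)
node_degree = degrees(daily, edges)
```

In this toy example, A and B rise together (positive edge), C falls while A and B rise (negative edges), and D is only weakly related to the others and therefore receives no edge, so its degree is zero. The degree of a node corresponds to the "degree of interaction" discussed below.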
In Q3 of 2020, Sabah and Kedah showed the highest correlation (r = 0.329), although this was still a weak positive correlation and only slightly above those of Malacca with Selangor (r = 0.326) and Perak with Perlis (r = 0.322) (Figure 2a). Sabah and Kedah had 1505 and 270 confirmed cases, respectively, over the entire quarter, yet no reports linked COVID-19 transmission between these two states. Sabah reported the first cluster on 1st September at the Lahad Datu District Police Headquarters lock-up, which accounted for 74.7% of the total new cases between 7th and 13th September 2020 [23]. In Kedah, the earliest positive COVID-19 cases came from the PUI Sivagangga cluster, which spread to Perlis and Penang. Factors linked to COVID-19 transmission in Kedah included a lack of physical distancing, family gatherings that flouted the standard operating procedures (SOPs), and hospital visits [18]. More generally, the MoH expressed concern about the spread of COVID-19 because most viral respiratory tract infections in tropical regions are reported during rainy seasons [30]. Wan Nik et al. (2019) described two monsoon seasons with high wind speeds in Malaysia, the Southwest monsoon from late May to September and the Northeast monsoon from November to March, which might contribute to the transmission of COVID-19 within this time frame.
Surprisingly, the total number of confirmed COVID-19 cases increased from 2594 in Q3 to 101786 in Q4 of 2020. Network analysis revealed nine significantly correlated states, namely FT Kuala Lumpur, Johor, Perak, Selangor, Kelantan, Pahang, Negeri Sembilan, Penang and Sabah, among which Johor and FT Kuala Lumpur had the highest degree of interaction based on the visualisation (Figure 2b). Of the nine states, FT Kuala Lumpur and Selangor had the strongest positive correlation (r = 0.765), followed by Johor and Selangor (r = 0.756). The increasing number of COVID-19 cases might be attributable to geographical factors such as high population density and population movement, especially in urban centres [32]. Manufacturing industries also contributed to the confirmed COVID-19 cases in FT Kuala Lumpur and Selangor [33]. Johor was also positively correlated with FT Kuala Lumpur (r = 0.755), Pahang (r = 0.674), Perak (r = 0.607), and Kelantan (r = 0.595). Other correlations (r values and degree of interaction) are summarised in Supplementary 1.
However, Johor and Sabah showed a negative correlation (r = -0.530), suggesting that strategic measures implemented in Sabah might have reduced the spread of COVID-19 to Johor. Several comprehensive measures were implemented in Sabah, including limits on non-essential services, the Targeted Enhanced Movement Control Order (TEMCO), increased capacity of healthcare equipment (beds, ventilators, etc.), mobilisation of medical personnel, point-of-entry testing, maximised daily RT-PCR testing capacity, mandatory 14-day quarantine at designated centres, quarantine centres for undocumented migrants, and stringent border control, among others. Meanwhile, Johor was placed under the Conditional Movement Control Order (CMCO) and MCO, with places of worship closed, COVID-19 Quarantine and Low-risk Treatment Centres opened, and SOPs enforced [34].
In Q1 of 2021 (January to March), Figure 2c showed a more complex network in which 11 states formed 42 significant strong positive correlations. This finding is consistent with the shift of the National Transmission Stage Assessment from Stage 2 (Localised Community Transmission) to Stage 3 (Large-scale Community Transmission – low confidence). Based on the visualisation, Johor, Kedah, Sabah, Selangor, Terengganu, and FT Putrajaya shared the same, highest degree of interaction associated with COVID-19 transmission. Among these states, Sabah and FT Putrajaya had the strongest positive correlation (r = 0.834), followed by Johor and FT Kuala Lumpur (r = 0.754). Johor also exhibited positive correlations with Selangor (r = 0.715), Sabah (r = 0.646), Terengganu (r = 0.637), FT Putrajaya (r = 0.574), Malacca (r = 0.566), Kedah (r = 0.552), Pahang (r = 0.536), and Negeri Sembilan (r = 0.535). Other correlations (r values and degree of interaction) are summarised in Supplementary 2.
The increased spread of COVID-19 was potentially due to inter-state travel during holiday celebrations, mainly in FT Kuala Lumpur, Selangor, Johor, Penang, Sabah, Kedah, Perak, Negeri Sembilan, and Malacca [18]. Several festive occasions in Q1 of 2021 applied to these states, including New Year's Day (1st January 2021), Thaipusam (28th January 2021), and Chinese New Year (12th – 13th February 2021), and might have increased population movement within the time frame. In addition, data from the Google Mobility Report indicated a surge in cumulative population movement (workplaces, retail and recreation, parks, grocery and pharmacy, and transit stations) for Johor, Kedah, Sabah, Selangor, Terengganu, and FT Putrajaya within the quarter (Supplementary 4) [35], suggesting a potential factor in the spread of COVID-19 [36].
The second quarter (Q2) of 2021 revealed the most complex network in the current findings (Figure 2d). All 16 states and federal territories were significantly correlated in COVID-19 transmission nationwide, and the number of confirmed cases and deaths increased exponentially (Figure 1). Four states, namely Selangor, Pahang, Malacca, and Kedah, had the highest degree of interaction (12 edges). The National Transmission Stage Assessment changed twice within this quarter, from Stage 3 (Large-Scale Community Transmission – low confidence) to Stage 3 (Large-Scale Community Transmission – moderate confidence) effective 26th April 2021, and then to Stage 3 (Large-Scale Community Transmission – high confidence) on 10th May 2021. Kedah and Selangor remained the states with the highest degree of interaction, rising from 9 correlations (edges) in Q1 to 12 in Q2 of 2021. During this time frame, the surge in cases in Kedah and Selangor was linked to densely populated areas and to people who contracted the virus at factories [37].
Additionally, Selangor and FT Kuala Lumpur had a strong positive correlation (r = 0.886). This was followed by the correlations of Malacca with Selangor (r = 0.883), Negeri Sembilan (r = 0.860), and Pahang (r = 0.854). Alongside Sarawak, Penang, Johor, and Kelantan, both Selangor and FT Kuala Lumpur consistently reported a high proportion of confirmed cases, which burdened the healthcare system. Moreover, multiple hospitals across FT Kuala Lumpur and Selangor struggled with a surge in admissions of critically ill COVID-19 patients requiring oxygen support during this period [38]. Other correlations (r values and degree of interaction) are summarised in Supplementary 3.
A total of 132673 and 11873 confirmed COVID-19 cases were reported in Selangor and Malacca, respectively, in this quarter. However, no reports linking transmission between Malacca and Selangor were found despite their strong positive correlation (r = 0.883); we infer that the transmission might be due to inter-state travel and the rapid spread of COVID-19 within the local community, educational institutions, and places of worship. Considering the rise in population movement (residential, grocery and pharmacy) (Supplementary 5), asymptomatic carriers and the emergence of new COVID-19 variants in Q1 of 2021 could potentially have made the virus more transmissible across the states [39]. In addition, several national festive occasions in Q2 of 2021 (April – June 2021), including Labour Day (1st May 2021), Eid al-Fitr (13th – 14th May 2021) and Wesak Day (26th May 2021), might be linked to the increase in population movement.
3.3 Prediction of confirmed cases and deaths in Malaysia using support vector regression model
In this study, SVR was employed to assess its reliability in predicting the future number of confirmed cases and deaths in Malaysia. The SVR models constructed using the 70% training set for confirmed cases vs days, confirmed deaths vs days, and confirmed cases vs confirmed deaths obtained R2 values of 0.846, 0.859 and 0.829, respectively (Table 3). High R2 values (close to 1) together with low MSE and RMSE values (close to zero) indicate that all SVR models can be considered excellent and reliable predictive models [40]. Low MSE and RMSE values also reflect the high accuracy of the SVR models. Meanwhile, the R2 values of the 30% testing set for confirmed cases vs days, confirmed deaths vs days and confirmed deaths vs confirmed cases in Figure 3a, Figure 3b and Figure 3c are 0.855, 0.909 and 0.836, respectively (Table 3).
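The evaluation scheme described above (a 70%/30% split followed by R2, MSE and RMSE on each set) can be sketched independently of the regression model. The example below is only an illustration under stated assumptions: it uses a simple least-squares line as a stand-in for the SVR model, a random split with a fixed seed (the study does not state whether the split was random or chronological), and synthetic data rather than the Malaysian case counts. All function names are hypothetical.

```python
import math
import random


def train_test_split(pairs, train_frac=0.7, seed=0):
    """Shuffle (x, y) pairs and split them into training and testing sets."""
    rng = random.Random(seed)
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    cut = round(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def fit_line(train):
    """Least-squares line; a stand-in for the SVR model, not SVR itself."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxy = sum((x - mx) * (y - my) for x, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sxy / sxx
    return slope, my - slope * mx


def metrics(actual, predicted):
    """Return R2, MSE and RMSE for paired actual and predicted values."""
    n = len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mean_a = sum(actual) / n
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    mse = ss_res / n
    return {"R2": 1 - ss_res / ss_tot, "MSE": mse, "RMSE": math.sqrt(mse)}


# Synthetic data: day number vs. a count with a small deterministic wobble.
data = [(d, 3.0 * d + 10 + ((d * 7) % 5 - 2)) for d in range(1, 101)]

train, test = train_test_split(data)
slope, intercept = fit_line(train)

train_scores = metrics([y for _, y in train],
                       [slope * x + intercept for x, _ in train])
test_scores = metrics([y for _, y in test],
                      [slope * x + intercept for x, _ in test])
```

R2 close to 1 means the model explains most of the variance in the observed values, while MSE and RMSE measure the average squared and root-squared prediction error in the units of the data; comparing the training and testing scores shows how well the fitted model generalises to unseen days.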
Based on Figure 3a and Figure 3b, the predicted values of daily confirmed cases and deaths from Day 1 until Day 365 (July 2020 to June 2021) were lower than, but close to, the actual reported values. This finding indicated that SVR was a reliable and robust method for predicting the impending number of daily infections and deaths. Nevertheless, the SVR prediction in this study was based solely on historical data and did not take into account the reproduction number (R0). R0 is the estimated number of new cases that one infected individual generates among individuals who are not yet infected, and it is used to determine the potential for a disease to spread in a population [6]. Determining R0 is vital since it indicates how readily the outbreak spreads among individuals [41]. Since our current aim was only to observe the SVR model's reliability in predicting forthcoming COVID-19 cases, combining R0 with the SVR model is proposed for future study. Figure 4 shows the forecast of daily confirmed cases and deaths from July 2021 until December 2021. The number of confirmed cases and deaths in Malaysia was predicted to spike around July to August 2021, with a downward trend expected to start in September 2021 (Figure 4), provided that the MoH and the Malaysian government maintain similar interventions to curb COVID-19 transmission. However, this forecast was based only on daily confirmed cases and deaths, and more variables are needed to assess other influences on the COVID-19 trend in Malaysia.