Our results show a high rate of under-reporting, and consequently low ascertainment, of road deaths and injuries in Ntcheu district. Ascertainment was estimated at 11% and 14% for the police and hospital data sources respectively, compared with the 80% ascertainment regarded by Bhalla et al. (2009) [21] as high-level completeness. In our study, more road deaths were reported by the police (72 deaths, 42% ascertainment) than by the hospital (25 deaths, 14% ascertainment). The estimates we have reported are similar to those of Chokotho et al., who reported in 2011 in South Africa [10], and Razzak et al., who reported in 1998 in Pakistan [27]; these authors found 50.6% and 56.7% under-reporting of road deaths respectively. Furthermore, our estimates correspond closely to the 2018 Global Status Report on Road Safety which, using binomial regression models, estimated 80% under-reporting of road deaths in Malawi, based on the 1122 officially reported deaths in 2016 and an estimated total of 5601 road deaths in the same year [3]. In other countries including China, Hu et al. [28] reported in 2010 that police recorded deaths at twice the level recorded by death registration data. Our results are therefore consistent with internationally observed rates of under-reporting, and further show that the figures reported at district level in Ntcheu are well below the level that can be trusted to represent the true road trauma burden in the district [29].
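The national figure quoted above can be checked with simple arithmetic. The Python sketch below is illustrative only: it uses the 1122 officially reported and 5601 estimated road deaths cited from the Global Status Report, and the function names `ascertainment` and `under_reporting` are our own labels for this example, not part of any library or of the report's method.

```python
def ascertainment(reported: int, estimated_total: float) -> float:
    """Share of the estimated true total that a data source captured."""
    if estimated_total <= 0:
        raise ValueError("estimated_total must be positive")
    return reported / estimated_total


def under_reporting(reported: int, estimated_total: float) -> float:
    """Complement of ascertainment: share of events the source missed."""
    return 1.0 - ascertainment(reported, estimated_total)


# Figures quoted in the text for Malawi, 2016.
reported_2016 = 1122
estimated_2016 = 5601

print(f"ascertainment:   {ascertainment(reported_2016, estimated_2016):.1%}")
print(f"under-reporting: {under_reporting(reported_2016, estimated_2016):.1%}")
```

With these inputs, ascertainment is about 20% and under-reporting about 80%, which matches the figure cited from the report.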
In this study, the two databases differed greatly in their ratio of fatal to non-fatal events. The hospital database had a much lower fatal:non-fatal ratio (25:101) than the police database (72:130). Furthermore, the police data captured more than twice as many road deaths as the hospital data source. Several reasons might explain this. For example, most deaths at the scene of the crash are never reported to the hospital and thus are never registered by the health authorities. In addition, people with non-fatal injuries mostly seek health care directly rather than contacting the police, and hence are more likely to be recorded by the health services. The missing matching variables in both the police and hospital data sources are also an important consideration. A small number of matches leads to a much larger estimated road trauma burden for the district, so that, when compared with the reported cases, the ascertainment of each database is highly diluted. Chokotho et al. (2011) refer to other factors that could reduce the completeness of road death data, including misclassification of road deaths by health services, especially deaths that occur a considerable time after the road crash [10].
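The dilution effect described above can be illustrated with a two-source capture-recapture calculation. The sketch below is a minimal example under stated assumptions: it uses Chapman's estimator, which may not be the exact estimator applied in this study, and the matched counts in the loop are hypothetical values chosen only to show the direction of the effect. Only the 72 police-recorded and 25 hospital-recorded deaths come from the text.

```python
def chapman_estimate(n_police: int, n_hospital: int, n_matched: int) -> float:
    """Chapman's two-source capture-recapture estimate of the total count."""
    if n_matched < 0 or n_matched > min(n_police, n_hospital):
        raise ValueError("n_matched must lie between 0 and the smaller source count")
    return (n_police + 1) * (n_hospital + 1) / (n_matched + 1) - 1


police_deaths = 72    # deaths in the police database (from the text)
hospital_deaths = 25  # deaths in the hospital database (from the text)

# HYPOTHETICAL matched counts, for illustration only.
for matched in (20, 10, 5):
    total = chapman_estimate(police_deaths, hospital_deaths, matched)
    print(
        f"matches={matched:2d}  estimated deaths={total:6.1f}  "
        f"police ascertainment={police_deaths / total:.0%}  "
        f"hospital ascertainment={hospital_deaths / total:.0%}"
    )
```

As the number of matched records falls, the estimated total rises and the ascertainment of both sources falls. This is why records that cannot be matched because of missing variables can inflate the estimated burden.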
The 2010 PLoS Medicine editorial [30] highlighted the need for better data, while pointing to clear deficiencies in the data that are collected on road crashes. This study bears that out, as both data sources showed clear deficiencies in the key variables of age, gender, time of injury and type of road user. Our findings further support the PLoS editorial in that there was massive under-reporting in both the police and hospital databases. This means that there is no reliable picture of how many pedestrians, cyclists, motor vehicle users and other road users are dying in Ntcheu district. We suspect that this may be the situation in many, if not all, districts in Malawi.
The kind of poor data quality observed in this study has been shown to affect the relevance and usefulness of such data in informing injury prevention initiatives [30][31]. Relevance of injury data is defined as the ability of the collected data to: (a) identify new and/or emerging injury mechanisms; (b) monitor injury trends over time; and (c) describe key characteristics of the injured population (i.e. using the WHO's core minimum data set for injury surveillance) [31]. Mitchel et al. (2009) further discuss data usefulness in terms of representation of the whole injury burden, as non-representative data may focus prevention efforts on populations that are not truly at risk, which could result in a misdirection of resources [31]. The poor-quality data reported in this study highlight the urgent need to improve the quality and completeness of road trauma data in Ntcheu district, and most likely in the rest of Malawi.
The quality of the reported data analysed in this study is worrying, especially considering that these data are used to produce official reports for policy consideration and programme planning. Low-quality data make it difficult to target road safety interventions to groups at risk, to high-crash road sections, and to times of the day when road crashes are more likely to occur. In support of this, evidence from Sri Lanka suggests significant differences in age, road user type and injury severity between events reported and those not reported (33%) [32].
The Health Sector Strategic Plan (HSSP) 2017–2022 for Malawi [33] recognizes the huge burden of road trauma in Malawi, using an estimate of 35 deaths per 100,000 people. This number is well above the African regional average of 26.6 deaths per 100,000 people, and twice the global average of 17.4 deaths per 100,000 people. Despite these unsettling statistics, public authorities have not invested adequately in highlighting this important public health and development problem in Malawi. The lack of accurate data may be one of the reasons that road safety is consistently understated as a major public health challenge, and why it has been under-resourced in national budgets.
Due to the poor quality of the data analysed in this study, we were not able to conduct a risk factor assessment for road trauma victims in Ntcheu district. Both the hospital and the police data sources were marred by extensive missing data. Risk factor assessments are possible with good-quality data. For example, Loo et al. (2013) reported that road crash victims in Hong Kong over the age of 16 were most likely to be cyclists, pedestrians and back seat passengers [34]. They further reported that females involved in a crash were more likely than males to report it to the police. In our study, among the trauma cases with complete data (42%), pedestrians (21.5%, n = 47) were the most affected of all road users, and men were more affected than women, accounting for about 65.8% (n = 144) of the road trauma burden among the reported events. Even though our data suggest massive under-reporting, this result affirms locally the global road safety concern for vulnerable road users and men as high-risk groups [35][1].
Our results further show that the police data source is more likely than the hospital data source to capture road fatalities, whereas a larger share of hospital records were non-fatal cases. Farmer (2002) urges caution in the use of police data sources, as there is a high chance of misclassification, and recommends complementing police-based data with hospital data [36]. In a study conducted in Hyderabad comparing hospital-based and police-based data, people were more likely to report to the hospital than to the police, especially in hit-and-run situations [37].
Study limitations
This study was based on a dynamic population, so some assumptions of the capture-recapture method were not met, which may affect interpretation of the reported estimates. However, the method has been used reliably in dynamic populations such as ours. Furthermore, deaths and injuries recorded in Ntcheu may include persons not resident in Ntcheu, while the denominator used to calculate estimates is the local Ntcheu population. The estimates may also have been affected by the very high levels of missing data observed in the reviewed files. We therefore encourage readers and users of this report to interpret these estimates with caution and with due consideration of their potential limitations. Nevertheless, our results are consistent with estimates reported by WHO, in which binomial regression methods were used to estimate the true burden and level of under-reporting.