Quality of Road Trauma Data in Malawi: The Case of Road Trauma Surveillance in Ntcheu District, Malawi.

Background
Road trauma represents a major but neglected public health challenge in Malawi that requires concerted efforts for effective and sustainable prevention. To make the road traffic system safer, it is important to understand the whole system and its elements, including vehicles, roads and road users, along with their physical, social and environmental circumstances. In this study, we measured data quality in terms of database and variable completeness of road trauma case reports to determine whether the data are sufficient for informing, monitoring and evaluating road safety interventions.
Methods
This was a quantitative retrospective study based in Ntcheu district, Malawi. Data were collected for the year 2018 from both police and hospital sources. Categorical data from these two sources were explored using frequency distribution tables. Continuous data were summarized using means and standard deviations. Data source completeness was assessed using the capture-recapture methodology, while variable completeness was assessed using a checklist developed from the World Health Organization's minimum injury surveillance core data set.
Results
The hospital data source was incomplete in the areas of road user type, time of injury, mechanism of injury and place of injury. Thirteen case matches were identified between the two databases. Using the capture-recapture methods, the estimated number of road trauma events in Ntcheu district for the year 2018 was 954 (95% CI: 457, 1451), with an estimated 173 deaths (95% CI: 89, 257) in the same year. These estimates indicate 11% and 14% ascertainment of fatalities for the police and hospital data sources respectively.
Discussion and conclusion
There is significant underreporting of road deaths and injuries in Ntcheu district, which means there are critical data quality challenges in the respective data sources. It is therefore imperative for the road safety agencies and partners to resolve these data reporting and acquisition challenges.

were adhered to and data collection was done under the supervision of the responsible officers at each institution, to ensure that appropriate procedures were followed.

Data sources

1. Hospital data source
The hospital data were collected from the Health Management Information System (HMIS), set up by the government to manage health data in all health facilities in the country, triangulating data from the general casualty unit, operating theater and the mortuary department. The study investigator, supported by the emergency clinician and the HMIS officer for the facility, collected the data using a structured checklist prepared for the purpose of this study. All injuries and deaths resulting from road crashes reported to the facility from January 1, 2018 to December 31, 2018 were included. The HMIS is a sentinel type of surveillance for health data [19]. It includes data for patients as they show up at a health facility and is managed by an HMIS officer based at the facility. Kasambara et al. (2017) [20] highlight that data management through this platform is unsatisfactory in terms of accuracy, completeness, consistency and timeliness, making it unreliable for informing program planning and decision making. Despite these data quality problems, the HMIS is one of the critical and most widely used sources of road trauma data in Malawi. Information on age, sex, road user type, time of the incident and injury disposition was gathered from this source.

2. The traffic police data source
The police are an important source of data for road injuries and deaths in Malawi. Assisted by the Station Traffic Officer (STO), the study investigator collected the road crash injury and death data for the year 2018. Fatalities include deaths that happened at the scene of the crash and those declared within one month of the incident. While only the police report deaths and injuries that occur outside the hospital, not all events are reported to them. This means that under-reporting by police-based registries is a widespread problem [11] [21]. In their reports of road deaths in Malawi, Schlottmann et al. (2017) [22] and Manyozo et al. (2018) [23] agree that police-based data tend to underestimate the true burden of road deaths in Malawi. We collected data on age, sex, road user type, time of the incident and injury disposition, similar to the information collected from the hospital data source.

Matching Criteria
The capture-recapture methods evaluate the degree of overlap between the two data sources. This overlap allows estimation of a corrected number of events, which is then used to estimate the degree of completeness of the sources being assessed [13]. In this study, the degree of overlap was determined through case matching. A match was made when gender, age (within five years), injury mechanism, location, and time (within three hours) matched, with one missing variable allowed as long as the other parameters were met. This case matching method was used in a similar study conducted in Malawi and was reported to be sensitive [24]. For crash location matching, the data collectors had sufficient knowledge of the studied road network and the district in general.
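The matching rule described above can be illustrated with a minimal Python sketch. The record structure, field names and exact-equality comparison for mechanism and location are assumptions made for illustration only; the study itself relied on the data collectors' knowledge of the road network rather than on automated string comparison.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CaseRecord:
    """Hypothetical record layout; field names are not from the study."""
    sex: Optional[str] = None
    age: Optional[float] = None
    mechanism: Optional[str] = None
    location: Optional[str] = None
    when: Optional[datetime] = None


def _agree(a, b, tolerance=None):
    """Return None if either value is missing, otherwise True or False."""
    if a is None or b is None:
        return None
    if tolerance is None:
        return a == b
    return abs(a - b) <= tolerance


def is_match(police: CaseRecord, hospital: CaseRecord, max_missing: int = 1) -> bool:
    """Apply the stated rule: all available variables agree, at most one missing."""
    checks = [
        _agree(police.sex, hospital.sex),
        _agree(police.age, hospital.age, tolerance=5),  # within five years
        _agree(police.mechanism, hospital.mechanism),
        _agree(police.location, hospital.location),
        _agree(police.when, hospital.when, tolerance=timedelta(hours=3)),
    ]
    if any(c is False for c in checks):
        return False  # any observed disagreement rules out a match
    return sum(c is None for c in checks) <= max_missing
```

A record pair counts as missing on a variable when either source lacks it, so a pair with two or more such variables is never matched under this sketch.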
Simple capture-recapture analysis
Let y be the number of events recorded in the hospital registry, x the number of events recorded in the police accident reports, and z the number of events reported in both the trauma registry and the police accident reports. The number of events, N, was estimated from x, y and z, together with the variance and 95% confidence interval of the estimate. The estimated completeness of each database was calculated by dividing the number of road trauma events in each database by the ascertainment-corrected number.
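As an illustration of a two-source estimate of this kind, the sketch below uses the Chapman bias-corrected estimator with its standard variance formula and a normal-approximation confidence interval. This is an assumption: the text does not state which estimator the study applied, and the counts in the usage lines are hypothetical rather than study data.

```python
import math


def chapman_estimate(x: int, y: int, z: int, z_crit: float = 1.96):
    """Two-source capture-recapture estimate (Chapman estimator, assumed here).

    x: events in the police source, y: events in the hospital source,
    z: events matched in both sources.
    Returns (n_hat, variance, (ci_low, ci_high)).
    """
    n_hat = (x + 1) * (y + 1) / (z + 1) - 1
    variance = ((x + 1) * (y + 1) * (x - z) * (y - z)) / ((z + 1) ** 2 * (z + 2))
    half_width = z_crit * math.sqrt(variance)
    return n_hat, variance, (n_hat - half_width, n_hat + half_width)


def completeness(n_recorded: int, n_hat: float) -> float:
    """Share of the estimated total captured by one source."""
    return n_recorded / n_hat


# Hypothetical counts for illustration only.
n_hat, variance, ci = chapman_estimate(x=100, y=80, z=20)
police_completeness = completeness(100, n_hat)
```

With few matched cases the variance grows quickly, which is why a small overlap between sources produces a wide confidence interval around the estimated total.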

Data analysis
All quantitative data were verified to make sure that all cases captured were related to road traffic crashes in Ntcheu district. Data from the two sources were explored separately before comparing quality between datasets. Categorical variables were explored using frequency tables. Continuous variables were summarized using means and standard deviations. The extent of non-reporting for each database was assessed using the capture-recapture methodology, while variable completeness was assessed using a checklist developed from the World Health Organization's injury surveillance core data set. All analyses were done using Stata 14 [25].
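Variable completeness of the kind assessed with the checklist can be sketched as the percentage of records with a non-missing value for each variable. The field names and the treatment of empty strings as missing are assumptions for illustration, not the study's actual checklist.

```python
def variable_completeness(records, variables):
    """Percentage of records with a non-missing value, per variable.

    records: list of dicts, one per road trauma case (hypothetical layout).
    variables: names of the checklist variables to assess.
    """
    total = len(records)
    result = {}
    for var in variables:
        present = sum(1 for r in records if r.get(var) not in (None, ""))
        result[var] = 100.0 * present / total if total else 0.0
    return result


# Hypothetical records for illustration only.
sample = [
    {"age": 34, "sex": "M", "road_user_type": "pedestrian"},
    {"age": None, "sex": "F", "road_user_type": ""},
    {"age": 19, "sex": "M"},
    {"age": 52, "sex": None, "road_user_type": "cyclist"},
]
summary = variable_completeness(sample, ["age", "sex", "road_user_type"])
```

Missingness for a variable is then 100 minus its completeness percentage.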
1. Data quality assessment
Both the police and hospital data sources had varying levels of missing data points, as presented in Table 1. The hospital data source lacked information on road user type, place of injury, time of injury and mechanism of injury. Table 2 shows that victim age was missing in 28% of the police records and 15% of the hospital records of road trauma; in addition, gender was missing for 9% of hospital road trauma cases compared to 1% in the police records.
For cases whose data were available across the two databases, the majority of the road trauma victims were in the age range of 15 to 44 years. The majority of road crash victims were male, accounting for 69.3% in the police records and 63.1% in the hospital records. Pedestrians were the most affected road user category, contributing just over half of all road injuries and deaths (n = 51, 50.5%) reported by police in Ntcheu district.
In the combined data set, the lack of data on type of road user and time of the injury in the hospital database meant that these variables had higher missingness rates, 58% and 58.4%, respectively. The age group 30 to 44 years accounted for a third of all road injuries (66/231), and two-thirds of the total victims were men (n = 144, 65.8%). Table 3 presents a summary of the distribution of road trauma victims from the combined databases.
Using the district population [26] from the recent population and housing census, the estimated district-level road trauma fatality rates were 11/100,000 and 4/100,000 for the police and hospital sources respectively. The fatality rate was estimated by dividing the number of all deaths reported following a road crash in each database by the Ntcheu district population reported in 2018.
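The fatality rate calculation described above amounts to a simple rate per 100,000 population, sketched below. The death counts are taken from the fatal to non-fatal ratios reported in the text for each database and treated here as counts, which is an assumption; the population value is a round placeholder, not the census figure.

```python
def fatality_rate_per_100k(deaths: int, population: int) -> float:
    """Deaths per 100,000 population for one data source."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100_000


# Placeholder population for illustration only (not the census figure).
ASSUMED_POPULATION = 650_000

police_rate = fatality_rate_per_100k(72, ASSUMED_POPULATION)
hospital_rate = fatality_rate_per_100k(25, ASSUMED_POPULATION)
```

Because the denominator is the resident population while some victims may be non-residents, rates computed this way are best read as approximate.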
When combined (see Table 4), the police and hospital records captured 23% of all estimated road trauma cases and 50% of all estimated road deaths in Ntcheu district in the year 2018.  [3]. In other countries including China, Hu et al. [28] reported in 2010 that police recorded deaths at twice the level recorded by death registration data. Our results are therefore consistent with internationally observed rates of under-reporting, and further show that the figures reported at district level in Ntcheu are well below what would be needed to represent the true road trauma burden in the district [29].
In this study, the two databases differed greatly with respect to the ratio of fatal to non-fatal events. The hospital database had a much lower fatal:non-fatal ratio (25:101) compared to the police database (72:130). Furthermore, the police data captured twice as many road deaths as the hospital data source.
Several reasons might explain this phenomenon. For example, most deaths at the scene of the crash are never reported to the hospital and thus are never registered by the health authorities. In addition, people with non-fatal injuries mostly go straight to seek health care rather than contacting the police, and hence are more likely to be reported by the health services. Furthermore, the missing matching variables in both the police and hospital data sources are also an important consideration. The small number of matches could lead to a much larger estimated road trauma burden for the district, such that when compared with the reported cases, the ascertainment of each database would be highly diluted. Chokotho et al. (2011) refer to other factors that could contribute to the reduced level of completeness of road death data, including misclassification of road deaths by health services, especially deaths that occur a considerable time after the road crash [10].
The 2010 PLoS Medicine editorial report [30] highlighted the need e, while highlighting the clear deficiencies in the data that is collected regarding road crashes. This has been clearly shown in this study, as the data sources demonstrated clear deficiencies in the key data variables of age, gender, time of injury and type of road user. The findings from this study further support the PLoS editorial in that there was massive under-reporting in both the police and hospital databases. This means that there is not a reliable picture of how many pedestrians, cyclists, motor vehicle users and other road users are dying in Ntcheu district. We suspect that this may be the situation in many, if not all, districts in Malawi.
The type of poor data quality observed in this study has been shown to affect the relevance and usefulness of such data in informing injury prevention initiatives [30] [31]. Relevance of injury data is defined as the ability within the collected data to: (a) identify new and/or emerging injury mechanisms; (b) monitor injury trends over time; and (c) describe key characteristics of the injured population (i.e. using the WHO's core minimum data set for injury surveillance) [31]. Mitchel et al. (2009) further discuss data usefulness in terms of representation of the whole injury burden, as non-representative data may focus prevention efforts on populations that are not truly at risk, which could result in a misdirection of resources [31]. The poor-quality data reported in this study highlight the urgent need to improve the quality and completeness of road trauma data in Ntcheu district and, by extension, in the rest of Malawi.
The quality of reported data analysed in this study is worrying, especially considering that these data are used to produce official reports for policy consideration and program planning. Low-quality data make it difficult to target road safety interventions to groups at risk, to high-crash road sections, and to times of the day when road crashes are more likely to occur. To support this, evidence from Sri Lanka suggests significant differences in age, road user type and injury severity between events reported and those not reported (33%) [32].
The health sector strategic plan (HSSP) 2017-2022 for Malawi [33] recognizes the huge burden of road trauma in Malawi, using an estimate of 35 deaths per 100,000 people. This number is well above the African regional average of 26.6 deaths per 100,000 people, and twice the global average of 17.4 deaths per 100,000 people. Despite these unsettling statistics, there has not been an adequate investment by public authorities to highlight this important public health and development problem in Malawi. The lack of accurate data may be one of the reasons that road safety is constantly understated as a major public health challenge, and why it has been under-resourced in national budgets.
Due to the poor quality of the data analysed in this study, we were not able to conduct a risk factor assessment for road trauma victims in Ntcheu district. Data from both the hospital and the police were marred by extensive missing data. Risk factor assessments are possible with good-quality data. For example, Loo et al. (2013) reported that road crash victims in Hong Kong over the age of 16 were most likely to be cyclists, pedestrians and back seat passengers [34]. They further reported that females involved in a crash were more likely than men to report it to the police. In our study, among the trauma cases with complete data (42%), pedestrians (21.5%, n = 47) were the most affected amongst all road users, and men were more affected, contributing about 65.8% (n = 144) of the road trauma burden among the reported events. Even though our data suggest massive underreporting, this result affirms locally the global concern for the safety of vulnerable road users and of men as high-risk groups [35][1].
Our results further show that police sources are more likely to report road fatalities than the hospital data sources, which reported more non-fatal cases. Farmer (2002) cautions against relying on police data sources alone, as there is a high chance of misclassification, and recommends complementing police-based data with data from hospitals [36]. In a study conducted in Hyderabad comparing hospital-based and police-based data, people were more likely to report to the hospital than to the police, especially in hit-and-run situations [37].

Study limitations
This study was based on a dynamic population, and so some assumptions for the use of capture-recapture methods were not met. This may affect interpretation of the estimates reported. However, the method has been used reliably in dynamic populations such as ours. Furthermore, deaths and injuries recorded in Ntcheu may include persons not resident in Ntcheu, while the denominator used to calculate estimates is the Ntcheu local population. The estimates in this study may also have been affected by the very high levels of missing data observed in the reviewed files. We therefore encourage readers and users of this report to interpret these estimates with caution and with due consideration of their potential limitations. Nevertheless, our results are consistent with estimates reported by WHO, in which binomial regression methods were used to estimate the true burden and level of underreporting.

Conclusions
The road safety situation in Malawi appears to be worsening with the passing of each year [38] [33]. Thacker (2007) [39] emphasized that what gets measured also gets done, and conversely that what is not measured is likely not to be done. The poor quality and incompleteness of the data observed in this study may be the reason why there has not been a corresponding significant investment in interventions to address the road safety situation in Ntcheu district specifically and Malawi generally, despite Malawi being one of the most affected nations globally. It is therefore clear from this study that road traffic injury data in Ntcheu are in great need of improvement if they are to be used to inform prevention and progress evaluation. Stakeholders in road safety need to be aware that high-quality data are a critical prerequisite for assessing progress in road safety in Malawi, and that in their absence, any reported changes in road safety indicators may be misleading. However, the two data sources provide an opportunity for data linkage to improve data quality for road trauma in Malawi. The national registration for Malawians provides a unique opportunity for data linkage between the police and the hospital data sources, as it provides a unique identification for each individual [40]. Stakeholders will need to review the operationalization of this strategy in future programs.

Recommendations
As a country, we have an opportunity to improve safety on our roads. Stakeholders working in road safety need to recognize the critical role of quality data in informing road safety initiatives, in understanding risk factors and in targeted program planning. They must therefore work to improve reporting, analysis and use of the road crash data. This study has provided an outline of critical data quality challenges in the relevant data sources. It is imperative for the responsible agencies and partners to work to resolve these data reporting and acquisition challenges. As indicated above, the national registration provides a unique identifier, which offers an opportunity for data linkage between multiple sources to improve data acquisition and therefore the quality of road crash data in Malawi.

Declarations
Author contributions
SM conceptualized the project; SM and RM designed the methods for the study. Data collection was led by SM. SM, with support from RM, analyzed the data. SM, RM and WS contributed to the development and revision of the manuscript and approved the final version.