We compared the consistency of vital events data collected by CHWs over a one-year period using two different data collection methods. Programmatic data captured more births from maternal records than a census-based birth history (646 versus 593). Each approach captured some births that the other did not – 120 births were captured by the programmatic data alone, and 67 by the census alone. Programmatic data also captured more birth records from maternal and child records combined than the census (723 versus 595). Among records identified by both methods, we observed very high consistency (> 95%) in the classification of vital events. Although further research is needed to draw more inferential conclusions about data completeness and quality, our findings seem to favor tracking vital events through CHWs’ routine household visits for program monitoring and evaluation, especially in resource-limited settings.
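These counts imply that 526 births were identified by both methods (646 − 120 = 593 − 67). The sketch below is a minimal illustration of this overlap arithmetic and of the percent agreement calculation underlying the consistency figures reported here and in the paragraphs that follow; the variable names and the example concordant count are ours, for illustration only.

```python
# Two-source overlap implied by the counts reported above.
programmatic_births = 646   # births from maternal records (programmatic data)
census_births = 593         # births from the census-based birth history
programmatic_only = 120     # captured by programmatic data alone
census_only = 67            # captured by the census alone

# Births identified by both methods, derived two ways; they must agree.
both = programmatic_births - programmatic_only
assert both == census_births - census_only  # 526 in each case

def percent_agreement(concordant: int, matched: int) -> float:
    """Consistency among matched records: the share classified the
    same way in both sources, expressed as a percentage."""
    return 100.0 * concordant / matched

# e.g. ">95% consistency" means more than 500 of the 526 matched
# records were classified identically (500/526 ≈ 95.1%).
print(round(percent_agreement(500, both), 1))
```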
One reason fewer births were identified in the census was that 43 women who had consented to enroll in routine care through the broader RMNCH study declined to provide a birth history during the 2018 census. In this setting, high rates of in- and out-migration pose challenges for care delivery and may explain why some events were missing from the programmatic data (25, 26). This is consistent with other studies that have identified migration as a factor affecting vital events recording by CHWs (22). Future studies, especially those using qualitative methods, could help explain why some events were missed by the programmatic data and why some women declined to provide birth histories during the census.
Although we observed high consistency in birth location classification (institutional versus non-institutional) among births (from maternal records) identified in both sources (95.8%), consistency was lower in the sensitivity analysis (72%, Supplementary Table 1). However, since non-missing records from both sources tended to be classified as “institutional” births, the missing records would likely have been classified the same way, which would raise the consistency. Of note, we used broad categories and did not compare more granular birth locations within them (e.g. “hospital” versus “health post” among institutional births). Further, the high consistency in birth outcome classification (stillbirth, living, and death) for births (from maternal and child records combined) identified in both methods was driven almost entirely by the much larger “living” category. Although each approach missed some adverse infant outcomes that the other captured, programmatic data identified more deaths and stillbirths than the census. In retrospective data collection, it is common for past events to go unreported or to be reported inaccurately over time (27). Recall bias may be one reason events were missing from the census, alongside other potential reasons including declining to participate and age heaping (28, 29).
Similar studies in other low-resource settings that validated routine data collected by CHWs against other methods have shown varied findings (7). These studies were conducted as part of the “Real-time Monitoring of Under-Five Mortality” (RMM) project in several African countries (14). In Mali, the team validated routine data collected by lay volunteer community-based workers against household census-based full birth history survey data collected by the same workers. Their study spanned 20 villages with a catchment population of approximately 32,000. Two full-time field coordinators conducted supervision, data verification, and data reviews that fed back to the community-based workers. The vital events data these workers reported were comparable with the census data and produced similar estimates of under-five mortality (30).

In Malawi, the team compared birth and death estimates from routine data collected by health surveillance assistants (HSAs) with estimates from rigorous household surveys that collected complete birth histories from women aged 15–49 in approximately 24,000 households. Two studies were conducted in sequential phases, with enhanced supervision and data quality management in the second phase. Each HSA had a supervisor at a health center who was responsible for field assessment and data quality reviews (22, 31). HSAs severely underreported births and deaths in both phases despite the increased supervision and data quality measures in the second phase: on average, HSAs underreported births by 44% and under-five deaths by 49% over the study period. Joos et al. (2016) cited the challenges of the existing government health systems and high turnover among HSAs as potential reasons for the poor data quality (31).

In Ethiopia, a validation study was conducted in two rural zones covering a total population of about 4.4 million. The team compared vital events data collected by a professionalized home-visiting cadre, health extension workers (HEWs), with data from a household mortality survey. The survey data were collected using a stratified two-stage cluster sampling design, as part of a larger evaluation that reached approximately 28,000 households. This study found severe underreporting of vital events relative to the household survey estimates – HEWs reported only 30% of births and 21% of under-five deaths, leading to underestimated mortality rates. The researchers cited high workload and the challenges of supervision in remote areas as potential reasons for the low-quality data reported by HEWs (3).
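To make the underreporting figures cited above concrete, they follow the usual completeness definition (the notation here is ours):

\[
\text{completeness} = \frac{R}{E}, \qquad \text{underreporting} = 1 - \frac{R}{E},
\]

where \(R\) is the number of events in the routine CHW data and \(E\) is the corresponding survey-based estimate. Thus 44% underreporting of births in Malawi corresponds to HSAs capturing 56% of the expected births, and HEWs in Ethiopia reporting 30% of births corresponds to 70% underreporting.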
Our findings were most consistent with those of the Mali study, in which CHWs identified the majority of vital events with both approaches. The Mali study’s approach was also similar to ours – both were smaller-scale studies comparing routine data with census-based birth history data collected by CHWs. As in our study, CHWs in Mali identified more events through the routine data collection method than through the census (30). While CHWs in all the studies received some level of supervision and training for data collection, their incentives varied by setting: CHWs in our setting and in Ethiopia were salaried, whereas those in Malawi and Mali received some incentives for data collection (3, 22, 30, 31). Even so, the salaried HEWs in Ethiopia were unable to report complete, high-quality data because of other local challenges (3). These findings suggest that a combination of factors can affect the quality of data reported by CHWs. One key difference between our study and the others was our use of a mobile platform instead of paper-based tools or registers. Although CHWs in Ethiopia were part of a salaried, professionalized cadre like those in our study, they used paper-based tools, and research assistants later entered the data into a database manually (3). In contrast, CHWs in our study used a mobile platform with built-in data validation, eliminating incomplete form submissions and the need for manual data entry. In our experience, this was a more efficient and less resource-intensive way to ensure data quality. However, further studies are needed to investigate the complex factors that affect the quality and reliability of CHW routine data collection.
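As an illustration of the kind of built-in validation described above, the sketch below blocks submission of incomplete or implausible birth records. The field names and rules are hypothetical, standing in for checks that, in practice, are configured within the mobile platform’s form logic rather than written as application code.

```python
from datetime import date

# Hypothetical field names and rules; a form with any errors cannot be
# submitted, which is what prevents incomplete records from entering
# the database.
REQUIRED_FIELDS = ("mother_id", "delivery_date", "birth_location", "outcome")
VALID_LOCATIONS = {"institutional", "non-institutional"}
VALID_OUTCOMES = {"living", "death", "stillbirth"}

def validate_birth_form(form: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the
    form may be submitted."""
    errors = [f"missing field: {f}" for f in REQUIRED_FIELDS if not form.get(f)]
    if form.get("birth_location") and form["birth_location"] not in VALID_LOCATIONS:
        errors.append("birth_location must be institutional or non-institutional")
    if form.get("outcome") and form["outcome"] not in VALID_OUTCOMES:
        errors.append("outcome must be one of: living, death, stillbirth")
    if isinstance(form.get("delivery_date"), date) and form["delivery_date"] > date.today():
        errors.append("delivery_date cannot be in the future")
    return errors

# A complete, plausible record passes:
assert validate_birth_form({
    "mother_id": "M-014", "delivery_date": date(2018, 5, 2),
    "birth_location": "institutional", "outcome": "living",
}) == []
```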
Differences in data quality assurance processes could also have contributed to some of the inconsistencies we observed between the two data sources. We began implementing regular data quality checks at the start of the census in 2018. Because programmatic data collection preceded both the retrospective census and these more extensive quality checks, the quality of the programmatic data may have been affected to some extent, although built-in data validations in the CommCare forms reduced anticipated errors during programmatic data collection. Furthermore, because household data were being updated during the census, some records retained both old and new household IDs, which introduced discrepancies when merging data during analysis; this appeared to affect only a few records.
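One way such records can be reconciled during analysis is to remap superseded IDs through a crosswalk before deduplicating. The sketch below is ours, with hypothetical column names; it is not the study’s actual pipeline.

```python
import pandas as pd

def reconcile_household_ids(records: pd.DataFrame,
                            crosswalk: pd.DataFrame) -> pd.DataFrame:
    """Map superseded household IDs to their current IDs, then drop the
    exact duplicates created when a record was retained under both IDs.

    records   : any table with a 'household_id' column (hypothetical name)
    crosswalk : two columns, 'old_id' and 'new_id', pairing each
                superseded ID with its replacement
    """
    id_map = dict(zip(crosswalk["old_id"], crosswalk["new_id"]))
    remapped = records.assign(household_id=records["household_id"].replace(id_map))
    # A record kept under both its old and new ID now collapses to one row.
    return remapped.drop_duplicates()
```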
There are several limitations to our study. One key limitation is the small study site, comprising 14 wards. Infant death is a rare event, and the small numbers we observed limit our ability to make robust inferences from the mortality data. Factors such as differences in CHWs’ length of employment, educational level, competing work priorities, and training may also affect individual data collection. Since the same CHW collected data for her ward using both methodologies, we likely mitigated the effects of these factors when comparing the two approaches. However, this can also be a limitation: because census data collection followed programmatic data collection, CHWs may have drawn on memories of delivering care to the same women, which could have influenced the data they reported in the census. As noted in other studies, CHWs may face a potential conflict of interest: they work to improve health outcomes in their communities yet are also asked to report data on adverse outcomes such as infant deaths (22). This seems unlikely to explain our results, however, as the census captured fewer births overall, not just fewer unfavorable outcomes, compared to the programmatic data.
In our experience, conducting the census on top of CHWs’ existing workload was resource- and time-intensive. Although pregnancy screening was temporarily halted during the census, CHWs continued to deliver services such as antenatal, postnatal, and early childhood care, and to collect the associated data. Census data collection took longer than the expected four months, and the community may have experienced survey fatigue. Frequent migration in the setting also made it difficult to complete census data collection within a shorter time period.
Our methods also lacked rigor in collecting mortality and stillbirth data. Although birth histories are commonly used in many LMICs to collect mortality data, they are not always reliable for capturing stillbirths and neonatal deaths, as self-reported stillbirths and neonatal deaths are prone to misclassification with this approach (32). Additionally, stillbirths and miscarriages are often tied to religious and cultural beliefs in these communities (33), so women may not openly disclose such events in a birth history. This could have caused misclassification or underreporting of deaths and stillbirths in the programmatic data as well, although this limitation may have been partially mitigated because CHWs belong to the same community as the women they serve and have built trust with them through continued engagement during care delivery (34). Future studies should use more in-depth methods, such as verbal and social autopsy and participatory analytic approaches, and strengthen linkages with government reporting systems to identify stillbirths and deaths with greater accuracy (8, 13).