Mixed methods process evaluation of the Sundara Grama intervention promoting latrine use and safe disposal in rural Odisha, India

Process evaluations of public health programs are critical to understand if programs were delivered as intended and to identify improvements for future implementations. The aim of this paper is to describe a mixed methods process evaluation of the Sundara Grama intervention, which sought to improve latrine use and safe child feces disposal among latrine-owning households in rural Odisha, India. Methods: The Sundara Grama intervention was delivered to 36 villages in Puri district by a grassroots NGO and included eight activities: palla performance (folk theater ‘edutainment’), transect walk, community meeting, community wall painting, mother’s meeting on child feces disposal, positive deviant household recognition, household visit, and latrine repairs. The process evaluation quantitatively assessed fidelity, dose delivered, and reach, and qualitatively examined recruitment, context, and satisfaction. Quantitative data collection included an activity observation survey, activity record, and endline trial survey. Qualitative data collection included an activity observation debrief and in-depth interviews with NGO mobilizers. Data collection took place during three time periods: during intervention delivery, immediately after, and several months post-implementation. For the quantitative data, a ‘delivery score’ was calculated for each activity, as well as the proportion of target participants in attendance. Qualitative data was analyzed using thematic analysis.

implementers' perceptions of the intervention. Given that Sundara Grama had an impact on behavior, the process evaluation findings enable understanding of what facets of the intervention should be replicated, adapted, or omitted in future iterations, as well as what contextual factors influenced delivery. Further, lessons from this process evaluation can inform the delivery of other community-wide behavior change interventions, especially those focused on sanitation in India.
The goals of this paper are two-fold: to describe the mixed methods approach used to assess delivery of the Sundara Grama intervention and to report results and lessons learned from the process evaluation.

Approach
This process evaluation was informed by the Saunders et al. framework for assessing health promotion programs, and evaluated these components: fidelity, dose delivered, reach, recruitment, context, and both participant and implementer satisfaction (Supplemental Table 1) (13). We also conducted a cost analysis of intervention delivery. Our process evaluation aimed to answer five key questions:
1. Was the intervention implemented as planned? (fidelity, dose delivered)
2. Who was reached and how were participants recruited? (reach, recruitment)
3. What external factors impacted delivery? (context)
4. What did participants think of the intervention? (participant satisfaction)
5. What were the experiences of implementers in delivering the intervention activities? (implementer satisfaction)

Context and Sample
The Sundara Grama intervention was delivered to 36 villages in Puri district, Odisha state between January 2018 and February 2019. Puri is approximately 70% rural, and government sanitation campaigns have been implemented for many years (14,15). Thirty-three villages were engaged in the cluster randomized-controlled trial (CRT) to assess intervention impact (66 total villages), and three villages were engaged only in qualitative research, to assess village member perceptions of the intervention and possible spillover (six total qualitative villages). Information on the trial design, setting, and randomization procedures is reported elsewhere (16). This process evaluation leverages data from all 36 villages that received the intervention and 19 interviews with members of the intervention delivery team.

Sundara Grama Sanitation Intervention
The Sundara Grama intervention included a multi-level communication approach with activities delivered at the community, group, and household levels, reiterating the motto 'Moro Swacha, Sustha, Sundara Grama' (My Clean, Healthy, Beautiful Village). Each activity was designed to target specific behavioral factors identified through formative research to influence latrine use and/or safe child feces disposal. Community-level activities included an adapted palla theater performance with sanitation skits (this is a traditional entertainment art form of Odisha that includes skits, songs, and poetry with witty elements (17)); an early morning transect walk to re-evaluate the village's state of open defecation; a community meeting to discuss sanitation problems, create an action plan to address those problems, and identify positive deviant households (households where all members used a latrine all the time); and a community wall painting that showed both the decided-upon action plan, and a map of the village that indicated which households were positive deviants. The group-level activity was a mother's meeting with caregivers of children <5 years old to provide action knowledge and hardware (potties and scoops) to aid safe child feces disposal. Household-level activities included either provision of a celebratory poster to positive-deviant households or household visits with non-users to encourage commitment towards all members using the latrine. Based on observation of latrine conditions during the baseline trial survey, a subset of households were identified to receive a comprehensive assessment of their latrine's condition.
Those in need of minor repairs (e.g. missing door, broken slab) were subsequently selected to receive repairs to ensure latrine functionality and privacy.
Rural Welfare Institute (RWI), a local grassroots NGO, was the implementer. RWI engaged four teams of five (one supervisor and four community mobilizers) to lead community meetings, transect walks, mother's meetings, and household visits (June-July 2018). The palla performances (June-July 2018), community wall paintings (completed after the rainy season in September-October 2018), and latrine repairs (completed between November 2018 and January 2019) were carried out by different local artisan groups. See Table 1 for a detailed description of intervention activities and Figure 1 for a timeline.

Data Collection
We used a mixed methods approach to quantitatively assess fidelity, dose delivered, and reach, and qualitatively examine recruitment, context, and satisfaction. Data collection took place from June 2018 to February 2019 during three time periods: during intervention delivery, immediately after, and four to six months post-implementation (Figure 1). Data collection tools and approaches are described in Table 2.

Quantitative Data Collection
Quantitative data collection included an activity observation survey, activity record, and endline trial survey. Activity observation surveys and endline trial surveys were conducted by a team of Odia-speaking Emory enumerators who were engaged in the CRT, underwent a multi-day training, and pilot tested the tools. Activity records were filled out by RWI mobilizers who led activities and were trained on the tool by author PR.

Activity observation survey
An Emory enumerator completed an activity observation survey during each palla, transect walk, community meeting, and mother's meeting. This structured, checklist-style survey mirrored the activity guides used by RWI mobilizers and palla troupes. To assess fidelity and dose delivered, the survey included questions to confirm if intended components were conducted. Example fidelity items included completion of preparation steps, stakeholders in attendance, use of program materials, and components delivered in correct order. Dose delivered items included completion of each activity step and key messages delivered. The survey also included a question about issues that may have hindered the activity and two Likert questions to capture enumerator perception of the activity quality and level of participant engagement. To assess village population reach, the enumerator recorded the number of village members in attendance by age group (adult vs. under 18 years old) and sex. Attendance was taken at a specific time point during each activity and enumerators used a tally counter device to aid their counting.
Surveys were completed using paper and pen to enable the enumerators to easily move through the survey. Responses were later entered into a digital version of the survey using ODK Collect (available from https://opendatakit.org/) on an Android phone.
Activity record
RWI mobilizers filled out an activity record to confirm they completed each positive deviant recognition and household visit activity. To assess fidelity and dose delivered, the activity record confirmed whether or not key activity steps took place and if a banner or poster was given. To assess reach, the activity record documented the number of household members in attendance. RWI mobilizers also filled out an activity record for each mother's meeting, which mostly acted as an attendance sheet documenting how many caregivers were in attendance, if their child(ren) also attended, and if the caregiver was given a potty and scoop. RWI submitted all activity records to the Emory research team, where they were double-entered into an Excel database for analysis. Emory enumerators did not complete these activity records as their involvement in household visits and attendance-taking during the mother's meetings might have disrupted those activities or caused the enumerators to be conflated with the RWI implementing team.

Endline trial survey
During the endline trial survey, the Emory enumerator team asked questions about the activities to respondents from latrine-owning households in the intervention villages. To assess dose delivered, respondents were asked if they received certain program materials (i.e. potty, scoop, poster). Among households that were selected to receive latrine repairs, dose delivered was assessed by asking the respondents to confirm whether or not the repairs took place. To assess household-level reach, respondents were asked which activities the household had attended, including whether or not they had seen the community wall painting. All trial households, regardless of latrine-owning status, were censused during endline data collection to determine village size and latrine coverage, and thus inform reach calculations (described in detail below). Accordingly, reach measures could not be calculated for activities conducted in the three villages engaged only in qualitative research.

Qualitative Data Collection
Qualitative data collection included activity observation debriefs completed by Emory enumerators and implementer in-depth interviews (IDIs) conducted by author SU.

Activity observation debrief
An activity observation debrief was completed by an Emory enumerator after each palla, transect walk, community meeting and mother's meeting. To capture contextual factors and participant satisfaction, the debrief form included sections on factors that hindered or aided delivery and how village members reacted to and participated in the activity. The debrief forms were completed using paper and pen and were subsequently transcribed and translated into English by the field supervisors.

Implementer interviews
Implementer IDIs were conducted immediately post-implementation to explore mobilizers' perceptions of and satisfaction with recruitment and delivery. Interview topics included successes and challenges with recruitment and delivery approaches, perceptions of participant satisfaction, and recommended changes to the activities. While all IDIs covered a set of common topics, only one intervention activity was explored deeply in each interview. Author SU aimed to interview all 20 RWI mobilizers but because of schedule limitations only interviewed 19. Interviews were conducted in Odia and audio recorded. SU re-listened to recordings and summarized responses by topic in English.

Quantitative Data Analysis
Delivery Score
We calculated a 'delivery score' to assess fidelity and dose delivered based on relevant indicators from the activity observation survey or activity record. The maximum possible score was based on the number of indicators assessed for that specific activity, with 1 or 2 points possible for each indicator. A maximum delivery score meant the activity was delivered as intended (fidelity) and in its entirety (dose delivered). Common fidelity indicators across activities included: attendance by a key stakeholder, adequate length of activity, pre-activity preparations completed, and components delivered in correct order. Dose delivered indicators included completion of each activity component. For example, the palla performance included an introduction, six sanitation skits, specific messages on latrine use and open defecation, and a closing. Each of these components was assessed based on one to several questions in the activity observation survey and could receive 0, 0.5, or 1 point depending on how completely the component was delivered. For the positive deviant recognition and household visit, the delivery score was calculated based on relevant indicators from the activity record. Delivery scores were converted to percentages and calculated for each intervention village. An average delivery score for a given activity was also calculated for each of the four RWI implementing teams, by averaging scores across their assigned villages. Supplemental Tables 2-5 outline the scoring criteria for each activity.
Fidelity and dose delivered were assessed for the community wall painting by author PR, who reviewed photos of each painting and confirmed all components were present. Lastly, for the latrine repairs, dose delivered was assessed based on household confirmation in the endline trial survey that repairs were completed.
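The delivery score calculation described above can be illustrated with a minimal sketch. The indicator values, the village scores, and the function name `delivery_score` are hypothetical and chosen only for illustration; the actual indicators and point allocations are those in Supplemental Tables 2-5, which are not reproduced here.

```python
from statistics import mean


def delivery_score(points_awarded, max_points):
    """Return the delivery score for one activity in one village, as a percentage.

    points_awarded: points earned per indicator (e.g. 0, 0.5, 1, or 2)
    max_points:     maximum points possible per indicator (1 or 2)
    """
    if len(points_awarded) != len(max_points):
        raise ValueError("one awarded value is needed per indicator")
    for earned, cap in zip(points_awarded, max_points):
        if not 0 <= earned <= cap:
            raise ValueError("awarded points must lie between 0 and the indicator maximum")
    return 100.0 * sum(points_awarded) / sum(max_points)


# Hypothetical observation: five indicators, one of them worth up to 2 points.
max_points = [1, 1, 1, 2, 1]
points_awarded = [1, 0.5, 1, 2, 0]
village_score = delivery_score(points_awarded, max_points)  # 4.5 of 6 possible points

# Hypothetical team average: mean of the scores in the villages assigned to one team.
team_average = mean([village_score, 100.0, 50.0])
```

A score of 100% in this sketch corresponds to every indicator receiving its maximum points, which matches the paper's reading of a maximum score as delivery as intended and in its entirety.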

Reach
We determined village population reach and latrine-owning household reach. Village population reach was only calculated for activities implemented at the community-level and was determined by dividing the number of people in attendance at the activity (recorded in the activity observation survey) by the village population (determined from the trial endline survey data).
Latrine-owning household reach was calculated for all activities, except latrine repairs, by dividing the number of households that reported attending each activity by the total number of latrine-owning households in the village (both determined from the trial endline survey data). For reach at the mother's meeting, the denominator was total number of latrine-owning households with a child less than 6 years old in the village. For the positive deviant recognition and household visit activities, a combined reach was calculated since latrine-owning households were meant to receive one or the other of these two activities.
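The two reach measures above are simple proportions, sketched below under stated assumptions. All counts and household identifiers are hypothetical, and `reach_percent` is an illustrative helper rather than a function from the study's analysis files.

```python
def reach_percent(reached, eligible):
    """Return reach as a percentage of the eligible population or households."""
    if eligible <= 0:
        raise ValueError("the denominator must be positive")
    return 100.0 * reached / eligible


# Village population reach (community-level activities only):
# attendees counted in the activity observation survey / village population from the endline census.
population_reach = reach_percent(reached=120, eligible=500)

# Combined household reach for positive deviant recognition and household visits:
# a latrine-owning household counts as reached if it reported receiving either activity.
latrine_owning = {f"hh{i:02d}" for i in range(1, 11)}  # 10 hypothetical households
recognized = {"hh01", "hh02"}
visited = {"hh03", "hh04", "hh05"}
combined_reach = reach_percent(
    reached=len((recognized | visited) & latrine_owning),
    eligible=len(latrine_owning),
)
```

Using a set union for the combined measure avoids double counting a household that reports both activities, which is one reasonable way to handle that case; the paper does not state how such cases were treated.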
One-way ANOVA
To examine consistency of program delivery and reach across the four RWI implementing teams, the delivery score and reach means of each respective team were compared using one-way analysis of variance (ANOVA) in IBM SPSS 26 statistics software.
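For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed by hand as the ratio of between-group to within-group mean squares. The sketch below uses only the Python standard library and hypothetical delivery scores for four teams; it is not the SPSS procedure the authors ran and it does not compute a p-value, which requires the F distribution with the returned degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(x for g in groups for x in g) / n
    group_means = [mean(g) for g in groups]
    # Between-group sum of squares: group size times squared distance of each group mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: squared distance of each value from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical village-level delivery scores (%) for four implementing teams.
team_scores = [
    [70, 80, 90],
    [60, 70, 80],
    [75, 85, 95],
    [65, 75, 85],
]
f_stat, df_between, df_within = one_way_anova_f(team_scores)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom would suggest that at least one team's mean differs from the others.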

Qualitative Data Analysis
The activity observation debriefs and implementer IDI responses were analyzed to uncover themes related to recruitment, satisfaction, and context. Author GDS read through the observation debriefs and created memos on emerging themes by process evaluation component. GDS then read through the data again and synthesized the predominant themes based on common memos. This process was repeated for the implementer IDI responses, with an additional examination of satisfaction themes from the implementer perspective. Findings from both analyses were compared to identify common themes, as well as themes unique to either data source (enumerator or RWI mobilizer).

Cost Analysis
The Sundara Grama intervention was designed to cost an average of 20 US dollars (USD) or less per target household, a funder requirement to ensure the intervention was policy-relevant and financially feasible at scale. We report the total cost, in USD, of implementing Sundara Grama across the 33 trial intervention villages. The research team documented expenses related to intervention inputs and latrine repairs, and RWI provided their human resource costs to the research team. Training and overhead expenses are not included in the total cost. Cost per latrine-owning household reached was calculated by dividing the total delivery cost by the number of latrine-owning households that reported attending at least one of the activities in the endline trial survey.

Results
Was the intervention implemented as planned? (fidelity, dose delivered)
Average delivery scores were high for household visit activities (general visit: M = 97%, range = 87-100%; positive deviant visit: M = 96%, range = 78-100%) and mother's meeting (M = 81%; range = 56-100%), indicating the activities were often conducted as intended (Table 3). Both the palla performance and transect walk had a lower average score of 77% (palla range = 52-100%; transect walk range = 50-91%), while the community meeting had the lowest average score at 60% (range = 40-83%). The activity observation survey data for these three activities showed steps were often followed only 'somewhat' in the correct order. For the community meeting, none of the meetings were sex-segregated as intended, few were conducted in a private place, most did not have participants introduce themselves, and three activity steps were rarely completed in full (recognize sanitation challenges, identify positive deviants, and discuss becoming a model village). There was no statistically significant difference in average delivery score between the four implementing teams for any of the activities.
Enumerator-reported quality of the activities, rated on a 4-point Likert scale, aligned with the delivery scores, with 81% of mother's meetings, 78% of palla performances, 69% of transect walks, and 64% of community meetings being rated good or very good. Only two transect walks and one mother's meeting were rated poor.
All 36 community wall paintings included the required components. Lastly, among the 358 households selected to receive latrine repairs and surveyed at endline, 75% reported receiving the repairs.
Who was reached and how were participants recruited? (reach, recruitment)
For community-level activities, palla performances reached, on average, almost a quarter (M = 24%; range = 6-50%) of the village population, while far fewer were reached on average for the community meetings (M = 8%; range = 3-17%) and transect walks (M = 5%; range = 1-14%). For village population reach, there was a statistically significant difference between the four implementing teams for the transect walk (F = 3.08, p = 0.04).
The gender and age of those in attendance varied depending on activity. On average, about half of the audience members at the palla performances were women (M = 49% women; range = 27-77%); slightly fewer women on average attended the transect walk (M = 45% women; range = 0-77%), with no women in attendance in two villages; and more women on average attended the community meeting (M = 57% women; range = 7-87%). The community meeting was also attended by more adults on average (M = 81% adults; range = 64-100%) compared to the palla performance (M = 59% adults; range = 31-80%) and transect walk (M = 57% adults; range = 18-100%). Almost a third of transect walks (N = 10) had a majority of boys and girls <18 years old in attendance. For the mother's meeting, 20% of all participants did not have a household latrine and 41% brought their child.

Recruitment
Several factors were identified that aided activity recruitment, while others hindered recruitment (Table 4).
RWI mobilizers described the 'pre-intervention visits', which were designed to build rapport with village stakeholders and plan activity logistics, as a very successful strategy that later aided recruitment of village members to intervention activities. Additionally, mobilizers sometimes received recruitment assistance from village members and stakeholders who would help go door-to-door to invite village members to the activities. Specifically, Anganwadi workers (teachers for government-run preschool centers) often helped with mother's meeting recruitment.
In contrast, the recruitment strategy used during the transect walk, where mobilizers went around beating a bell early in the morning, sometimes led to confusion and irritation. Some thought the bell was signaling a call to prayer or that someone had died, while others objected to hearing the bell so early in the morning or felt it disturbed their morning routine. RWI mobilizers also explained it was sometimes difficult to convince people to attend activities, especially for the community and mother's meetings, since an incentive was often expected and not provided. As some community members would tell them, "If there is no eating, there is no meeting."
What external factors impacted delivery? (context)
Delivery of the Sundara Grama activities was impacted by three contextual factors: stakeholder support, inclement weather, and social dynamics (Table 4).
Village stakeholders positively impacted delivery by providing the RWI mobilizers with additional assistance. According to enumerator observation debriefs, village stakeholders participated in the activities, helped prepare activity locations in advance, and even managed tensions or conflicts that arose during the community meetings, specifically around government latrine subsidies and construction quality. Based on the activity observation survey data, at least one stakeholder provided support in 92% of all activities observed. The most common stakeholders providing support included Ward members (45%), Anganwadi workers (43%), village heads (36%), and Accredited Social Health Activist (ASHA) community workers (30%).
Inclement weather negatively impacted delivery. Rain, and in a few cases severe heat, led to activities starting late, fewer participants being in attendance, participants leaving early, and the need to shift activity locations to seek better shelter. Activity observation survey data showed weather was an issue in 26% of all activities observed.
Social dynamics related to caste, gender, and age also hindered aspects of delivery and reach. According to both implementer IDIs and enumerator observation debriefs, caste divisions affected activities in three villages. In one village, caste divisions compelled RWI mobilizers to organize two separate palla performances and also led to issues organizing the community meeting. In a second village, one caste group was not able to attend the palla because it was held near the village temple from which they were prohibited. In a third village, one caste group refused to attend the transect walk in the presence of another caste group.
In interviews, RWI mobilizers described how younger mothers were sometimes not able to attend the mother's meeting and that older women from their households would attend in their place. The mother's meeting activity record data confirmed this observation; 48% of participants across all the meetings were 40 years old or older with 36% older than 45 years, indicating these participants were likely not the mother of the child <6 years old but potentially the grandmother.
What did participants think of the intervention? (satisfaction)
According to both implementer IDIs and enumerator observation debriefs, the palla performances were positively received by village members, but the transect walk and meetings experienced some negative reactions.
Village members enjoyed the palla performances: they laughed at the jokes, "listened mindfully," praised the performance for bringing awareness to their village, and commented on how it was both educational and entertaining.
The transect walk elicited mixed reactions. It was well received by children in particular and all participants greatly enjoyed the handwashing demonstration at the end of the walk. However, village members often had negative reactions to visiting open defecation (OD) sites in the village and marking feces with colored powder, the main component of the activity. Many village members refused to visit the OD sites, while others expressed anger, irritation, disgust, and shame towards the act. In some cases, the RWI mobilizers were scolded for leading such an activity. Despite negative reactions, RWI mobilizers explained in interviews that a few participants felt the transect walk would positively impact their village.
Both the community and mother's meetings experienced frequent upsets. Many community meetings were disrupted by participants voicing their frustration at the poor quality of their government-provided latrine or not having received their latrine subsidy; activity observation survey data showed poor latrine construction came up in 75% (N = 27) of the meetings and latrine subsidies came up in 53% (N = 19). In one meeting, participants attempted a 'mass exit' over these issues. Poor latrine construction was also mentioned by participants in 33% of the transect walks (N = 12).
In several mother's meetings, the distribution of potties and scoops caused upsets. RWI mobilizers were trained to provide participants with a potty and scoop at the end of the meeting once all information was covered. Some caregivers who left the meeting early or had not attended at all became upset over not receiving the hardware. Sometimes their husbands came and demanded the hardware.
What was the experience of implementers in delivering the intervention activities?
In interviews, RWI mobilizers provided feedback on aspects of intervention delivery that were successful, such as the pre-intervention visits, palla performances, and household visits, and aspects that were challenging, such as travelling to their assigned villages and being misconstrued as government officials.
Mobilizers explained that the pre-intervention visits were critical to building rapport with village stakeholders from the start and that the palla performance was an ideal introductory activity since it was well received and helped mobilizers continue to build a positive relationship. Mobilizers also viewed the household visits as especially effective as they could directly engage with participants and reach members who were not able to attend the other activities, such as newly-married and younger women.
The biggest challenges mobilizers faced were with travel and the misbelief among community members that they were actually government officials. Mobilizers lived far from their assigned villages and some villages were not easily reached by public transportation, on which female staff in particular relied as they did not own a motorbike like male staff. Many mobilizers described how they were repeatedly misidentified as government officials, with community members believing they had come to cancel ration cards for those who were not using their government latrine. This misbelief and the issues related to latrine construction and subsidies caused many mobilizers to experience verbal attacks during activities, which were sometimes difficult to manage.
Mobilizers offered several recommended changes to the intervention: less repetitive activity messages, less prescriptive activity guides to allow for flexibility in how messages are conveyed, snacks or other incentives at the meetings since they are expected and could make recruitment easier, later starts to activities (not in the early morning) to make logistics easier, and more time for household visits.
Finally, female mobilizers often had challenging experiences when delivering the intervention. These women, many of whom were young and in their first job, reported being catcalled and shamed by community members as their presence defied social norms restricting the mobility of young women. One woman's father began accompanying her to the villages because he was worried for her safety. Another mobilizer was scolded by her parents for leaving the house so early for work, as it was not socially appropriate. However, these same mobilizers explained that they gained the respect of community members over time and that they had become more confident by the end, no longer shy in front of others and more comfortable with public speaking.

Cost Analysis
Delivery of the Sundara Grama intervention in 33 villages cost a total of 36,171.72 USD, with an average cost of 1,096.11 USD per village (Table 5). Payments to the palla troupes, latrine repair contractors, and wall painting artisans, including cost of materials, accounted for 43.6% of the total delivery cost (average of 477.96 USD per village); RWI staff salaries and transportation stipends accounted for 43.5% (average of 476.58 USD per village); and consumables, such as banners, posters, potties and scoops, accounted for 12.9% (average of 141.58 USD per village). Based on the endline trial survey, 1,956 latrine-owning households reported having attended at least one of the activities, making the cost per latrine-owning household reached 18.49 USD.
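The per-village and per-household figures above follow directly from the reported totals, as the short check below illustrates. The numeric inputs are taken from the text; the category labels and variable names are shorthand introduced here for illustration.

```python
# Reported totals from the cost analysis.
total_cost_usd = 36171.72
villages = 33
households_reached = 1956  # latrine-owning households reporting at least one activity

# Reported average cost per village by category (labels are shorthand for this sketch).
per_village_by_category = {
    "artisan_payments_and_materials": 477.96,
    "rwi_salaries_and_transport": 476.58,
    "consumables": 141.58,
}

cost_per_village = round(total_cost_usd / villages, 2)
cost_per_household = round(total_cost_usd / households_reached, 2)

# Share of the delivery cost attributable to each category, in percent.
category_shares = {
    name: round(100 * value / cost_per_village, 1)
    for name, value in per_village_by_category.items()
}
```

Under these inputs the check gives 1,096.11 USD per village and 18.49 USD per household reached, and the category shares round to 43.6%, 43.5%, and 12.9%, consistent with the reported values.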

Discussion
We conducted a mixed methods process evaluation of the Sundara Grama behavior change intervention that sought to improve latrine use and safe child feces disposal in 36 villages in rural Odisha, India. The intervention activities reached a substantial portion of the target population at a cost of 18.49 USD per latrine-owning household reached. Activities were implemented with moderate to high delity, except for the community meeting, which often had several components missed, and were delivered consistently across the four mobilizer teams. Both participants and mobilizers praised the palla performance, but provided mixed reactions to other activities. Pre-intervention rapport building visits and village stakeholder support aided delivery, while inclement weather, certain recruitment strategies, and social dynamics hindered delivery. This process evaluation provides insights into what did and did not contribute to intervention success, and highlights the need for community-wide programs to identify and assess strategies that consider the social and political context, and impacts from past programming.
Two speci c components of the Sundara Grama intervention were critical and provide insights for other behavior change programs: 'edutainment' and multi-level activity delivery. Public health programs often use education-entertainment, or 'edutainment,' strategies to transfer knowledge and skills. However, a recent review of the literature on broadcast media interventions describes how these edutainment approaches can go beyond education alone and deliver messages that shift norms and attitudes to catalyze health behavior change (18). Theater performances, such as road shows, have also been used in this way successfully (19). The palla performance adds to this body of research; each skit was embedded with a variety of sanitation behavioral messages that touched upon motivations and social norms, as well as action knowledge and the health risks of open defecation. Our results show this kind of multifaceted folk theater performance can be delivered with quality, reach a large audience, and be positively received. Moreover, there are many bene ts to using traditional entertainment art forms, like the palla, compared to mass media: audience members experience the messaging as a collective which may bolster its acceptance, it is often better suited for hard-to-reach communities, and it can help revitalize a traditional art form (17).
We also found our multi-level approach with activities at the community, group, and household-level ensured all types of village members-men, women, children-were reached. This may explain the trial results, which reported modest increases in latrine use across both sexes and different age groups (12). The variety of activities provided multiple opportunities to communicate and reiterate behavioral messages across populations. This may be one reason why the trial results found a signi cant increase in safe child feces disposal despite mostly older women, likely grandmothers, attending the mother's group meeting: mothers were still receiving safe disposal messaging through other activities like the palla and household visit (12). Other water, sanitation and hygiene (WASH) programs that seek to improve the WASH behaviors for all types of community members should consider this kind of multi-level communication approach.
We also identified aspects of Sundara Grama that did not work well. The community meeting had the lowest delivery score, likely because it requires more skillful facilitation and participatory engagement, and may need more intensive implementer training for full delivery. The wall painting had the lowest reach and thus could be omitted from any future delivery, given that impact was achieved without it being noticed.
Social dynamics influenced the implementation of Sundara Grama, offering lessons for future community-wide programs and emphasizing the importance of assessing social dynamics when evaluating delivery. In at least three intervention villages, casteism negatively impacted the ability of all village members to attend and engage in the palla performance and/or community meeting, a finding expanded upon in a separate qualitative paper on community perceptions of Sundara Grama, which discusses how social divisions hindered intervention delivery (20). Similarly, caste issues were documented in a qualitative process evaluation of a government sanitation program implemented in Puri between 2013 and 2014; lower-caste groups were sometimes forced to sit in a separate area during community meetings or were not invited at all (15). In contrast, a process evaluation of a community-level handwashing behavior change program in rural Andhra Pradesh, 'SuperAmma,' quantitatively examined exposure to intervention activities and found no difference between caste groups (21). Such analyses by social groups are essential and should be more commonplace. However, as this and other studies demonstrate, qualitative explorations are also needed to capture participants' perceptions of their ability to fully engage and feel a part of activities. Future delivery of community-level programs in rural India should be mindful of caste divisions and identify, enact, and assess strategies, using both quantitative and qualitative approaches, to ensure equitable reach and engagement.
Social dynamics also negatively affected the experience of RWI mobilizers in implementing Sundara Grama. In order to deliver activities, the female mobilizers had to subvert gender norms that restrict young women's movement and engagement outside the home. As a result, female mobilizers were subjected to social backlash, including catcalling and public shaming. Other studies in India and Pakistan have also documented how restrictive gender norms limit women's ability to both participate in and carry out public health programs (22-24). Program evaluations do not always consider the implementer experience, but implementers must operate within the same cultural norms as participants, and it is vital to understand how those norms may affect their role or lead to unintended harm. Moreover, these findings demonstrate the need for NGOs and other implementers to establish safeguards and adequately prepare staff, often young women, who will have to challenge cultural norms as part of their work. Strategies may include establishing protocols to ensure safety and wellbeing, helping staff mentally prepare for negative social reactions, creating opportunities for staff to comfortably share their experiences and concerns, and equipping other staff who are not at risk of undermining a norm, and are thus in a position of social power, with strategies for supporting their fellow team members. We caution implementers against altogether refraining from hiring staff who may face social backlash, because doing so prevents individuals from making their own choices and taking on new opportunities; as several of the female RWI mobilizers explained, over time they gained the respect of community members and experienced a newfound self-confidence.
In addition to social dynamics, understanding and being able to respond to the political and historical context in which a program takes place is invaluable for implementation success. The delivery of Sundara Grama was marred by the dissatisfaction and distrust village members felt toward past government-led sanitation programs. During the community meetings and transect walks, village members often disrupted the activity to voice their frustration at the poor construction quality of their government latrine and the unfulfilled promise of a latrine subsidy. Several studies have documented the same issues with the various government sanitation campaigns rolled out between 2011 and 2018, indicating these issues are not new and are quite persistent (7,15,25). The mistaken belief held by village members that RWI staff were actually government officials who had come to force them to stop defecating in the open also impeded activity delivery. While this reaction was unexpected, it is not unfounded; coercive tactics authorized by local government officials, including harassment, public humiliation, fines, and the threat or actual loss of public benefits, are well documented during the latest sanitation campaign, SBM (7, 26-28). When designing interventions, the political context and experience of past programs should be considered; community members, stakeholders, and even implementing staff can be engaged from the start on how to address these issues head-on, as they are sure to arise. As one RWI mobilizer suggested, the Sundara Grama program could have included training on the latrine subsidy reimbursement process so mobilizers could offer some form of support to village members.
This process evaluation has many strengths. The study was framework-driven, systematically assessed delivery and reach across all villages, employed both quantitative and qualitative methodologies to appropriately evaluate each process evaluation component and triangulate findings, and explored both the participant and implementer experience. We also note a few limitations. While most of the data was collected by our separate evaluation team, the RWI mobilizers documented their own delivery of the household activities, which could have led to biased data. In addition, since reach of latrine-owning households was assessed in the endline trial survey, which took place 4 to 6 months after implementation, it is possible household members had forgotten about the activities by that time, although this would lead to a more conservative reach assessment.

Conclusion
Using mixed methods and a framework-driven process evaluation, we found the Sundara Grama sanitation intervention was implemented as intended and achieved good reach. The edutainment palla performance and multi-level activity delivery were particularly salient approaches that could be applied to other WASH programs that aim for community-wide behavior change. We also uncovered lessons on the need for process evaluations to examine the social, political, and historical context in which a program takes place, as well as the implementer experience, to ensure successful and equitable delivery and prevent unintended harm.

Interactive meeting with facilitated group discussion on sanitation problems and solutions, creation of and commitment to a community action plan for sanitation goal, and celebration of 'positive deviant' households whose members are already exclusively using a latrine for defecation.

· Stakeholder support during activity*: Village stakeholders were sometimes observed to provide support during activities by preparing the location, participating themselves and, during some community meetings, helping calm upset participants.
· Inclement weather: Many activities were disrupted by rain and, in a few cases, severe heat. This led to activities starting late, fewer participants in attendance, participants leaving early, and the need to shift activity locations to seek better shelter.

· Caste divisions: In a few villages, caste issues arose. These issues either prevented village members of a given caste from attending the activity or required separate activities to be conducted, one for each caste group.
· Gender and social norms+: In both the palla and community meeting, RWI mobilizers explained how women and men sometimes had to sit in separate areas and younger, unmarried women were typically not allowed to attend activities. In the mother's group meeting, sometimes younger mothers were not able to attend when an older woman from their household was in attendance.

+Includes palla troupe performance fee, materials, and travel stipend for troupe
*Two performances were required in five villages and an additional two performances were done for practice
±Banners were used in the palla performance, community meeting, and mother's meeting
**Represents the number of intervention households surveyed at endline who reported their household has a latrine and attended at least one Sundara Grama activity

Figures
Figure 1 Timeline of process evaluation data collection (above arrow) and Sundara Grama implementation (below arrow)
