Three districts in three ecologically different provinces were purposively selected as study sites based on the timing of campaigns, malaria epidemiology, and environmental factors. Angoche district (Nampula province) is a coastal district in the Northern region with high malaria transmission and a population of 347,176 (based on the 2017 census). Changara district (Tete province) is located inland in the Central region with moderate to high malaria transmission and has a population of 123,056. Jangamo district (Inhambane province), located in the Southern region with moderate to high malaria transmission, is coastal and has a population of 105,306. Climate and ecology differ across the three sites: temperatures are higher in the north and humidity is higher in the coastal region. Throughout the country, the rainy season lasts from November to April, and average annual rainfall is around 1,000 mm in Nampula, 800 mm in Inhambane and 650 mm in Tete. All three districts are mainly agricultural, with subsistence farming and some fishing in the coastal communities of Angoche and Jangamo. The mass campaign for which this durability monitoring was carried out took place in Tete in May 2015 using the MAGNet® LLIN brand and in Inhambane and Nampula in October 2015, both distributing the Royal Sentry® LLIN brand. An additional LLIN mass campaign was carried out in Nampula in September/October 2016. In addition, all three sites were included in the national 2017 LLIN mass campaign, which took place between the 24- and 36-month surveys.
This was a prospective study of a representative cohort of LLINs distributed during the 2015 mass campaign and followed up for three years. The design was based on the guidance from the U.S. President’s Malaria Initiative for LLIN durability monitoring (www.durabilitymonitoring.org), and in this case the study compared the durability of two different LLIN brands with the same characteristics (150 denier polyethylene LLIN incorporating alphacypermethrin, in blue colour). Both products list a loading dose of 5.8 g/kg, obtained full WHOPES recommendations in 2011, and were converted to WHO prequalification in late 2017 (Royal Sentry, #003-001) and early 2018 (MAGNet, #014-001). Within six months of the respective mass distribution campaigns, LLIN were sampled and then followed up after 12, 24, and 36 months through household surveys. At each time point, measures of physical durability (attrition and integrity) were assessed using a household questionnaire and net damage assessment tools. For all data points after baseline, 30 campaign LLIN per site were sampled and retrieved for assessment of insecticidal effectiveness (bio-assay) and chemical analysis of the active ingredient.
Sample size and sampling
Applying a design effect of 2.0 and a 5% household non-response rate, the required sample of LLIN after three years was 631 per site in order to detect a 12 percentage-point difference between sites or to estimate median survival of LLIN with a precision of ± 0.5 years (at an alpha error of 0.05 and a beta error of 0.2). Taking into account the expected net attrition rates, a sample of 782 LLIN was estimated to be needed at baseline and, based on the expected number of LLIN distributed per household (2.5), 340 households needed to be sampled per site. These were sampled from 20 clusters (communities) with 17 households selected per cluster.
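The household arithmetic above can be checked with a short sketch. The variable names are illustrative, and the rounding step is an assumption: the text gives the figures (782 nets, 2.5 nets per household, 20 clusters of 17 households) but not the exact steps that led from the minimum number of households to 340.

```python
import math

# Figures taken from the text above.
nets_needed_at_baseline = 782
nets_per_household = 2.5
clusters = 20
households_per_cluster = 17

# Minimum number of households needed to expect 782 campaign nets
# (rounding up is an assumption made for this sketch).
min_households = math.ceil(nets_needed_at_baseline / nets_per_household)

# Households actually planned and the number of nets they are expected to yield.
planned_households = clusters * households_per_cluster
expected_nets = planned_households * nets_per_household

print(min_households, planned_households, expected_nets)
```

The planned 340 households exceed the computed minimum, which leaves a margin for households that turn out to hold fewer campaign nets than expected.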
Clusters (communities) were sampled with probability proportionate to size using the campaign registration lists as sampling frame. Households within clusters were selected using simple random sampling from lists of eligible households prepared by the field teams on the day of the survey. For communities with more than 200 households a segmentation approach was used and only the selected segment was sampled. Up to five replacement households were sampled per cluster to substitute in case a sampled household had not received LLINs from the campaign or did not consent to participate. Within each household, all LLINs identified as from the campaign by brand, colour and report by the respondent were labelled with a unique ID number and bar-code for future follow-up, even when they were still in the package at the time of the baseline survey.
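Selection of clusters with probability proportionate to size can be sketched as follows. This is a minimal illustration that assumes a systematic selection along the cumulative household count, which is one common way to implement such sampling; the text does not state which variant was used, and the function name and the community sizes in the usage example are hypothetical.

```python
import random

def pps_systematic(sizes, n_clusters, rng):
    """Return indices of communities selected with probability proportionate to size.

    sizes: household counts per community (hypothetical sampling frame).
    A community larger than the sampling interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n_clusters
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + k * interval for k in range(n_clusters)]

    selected = []
    cumulative = 0
    idx = 0
    for target in targets:
        # Advance until the cumulative household count covers the target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected

# Usage with made-up community sizes.
frame = [120, 80, 300, 50, 450, 200, 100, 700]
print(pps_systematic(frame, 4, random.Random(1)))
```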
Campaign LLIN for bio-assay testing were sampled from the cohort (two LLIN per cluster) only at the final survey, using simple random sampling. For the 12- and 24-month surveys, campaign LLINs were sampled from neighbouring households as follows: within each cluster, two index households were randomly identified from the cohort and, when the field teams reached these households, they went left to the next neighbour that had campaign LLIN and consented to give them up for the study. A brief questionnaire on use and washing was completed for these LLIN. New replacement LLIN were given for all LLIN collected for bio-assay.
An implementation team of nine individuals was established per site, with one overall site coordinator and two field teams each consisting of one supervisor and three interviewers. Activities in the field were overseen by staff of NMCP and NIH. Interviewers and supervisors were carefully selected so that they were culturally acceptable, had good knowledge of the local languages and experience in conducting household surveys.
A five-day training was held at baseline, with three-day refresher trainings before each follow-up survey. Special emphasis was put on the standardized assessment of net damage, using a template to identify hole size categories and tallying hole counts with an application on the digital devices used for data entry. The questionnaire had three main modules: one for the household respondent, a second for the cohort campaign LLIN (including LLIN lost between campaign and baseline survey), and a third for other LLIN owned by the household at each time point. In addition, a list of household members and assets was obtained at baseline and at the final survey. GPS coordinates were recorded at baseline and used to track households during follow-up. If households moved within the cluster, the new homes were identified; if they moved outside the cluster, they were considered lost to follow-up.
The baseline assessment took place in October-November 2015 in all three sites. The 12-month survey was done in Tete in June 2016 and in Nampula and Inhambane in August 2016. The 24-month follow-up took place in May 2017 in Tete and in August 2017 in Inhambane and Nampula. The 36-month final follow-up was done in May 2018 in Tete and in July/August in Inhambane and Nampula.
Outcomes of insecticidal effectiveness were based on bio-assay results using the standard WHO cone test, carried out at the Mozambique National Institutes of Health in Maputo. A pyrethroid-sensitive strain of Anopheles arabiensis was used with 10 mosquitoes per cone, five locations tested on each net (four sides and roof) and four replicates per location (20 cone tests with 200 mosquitoes per net in total). The 60-minute knock-down (KD60) and 24-hour mortality were recorded and then combined as optimal insecticidal effectiveness (KD60 ≥ 95% or functional mortality ≥ 80%), minimal effectiveness (KD60 ≥ 75% or functional mortality ≥ 50%), or failure (not reaching the minimal effectiveness criteria). Chemical residue analysis was done at the Centers for Disease Control and Prevention in Atlanta, Georgia. Five pieces of netting were tested per net from the same locations as for the bio-assays, and the fabric weight per surface area was recorded. The five samples were then cut into 10 cm × 10 cm squares and pooled to obtain a homogeneous sample per net. The active ingredient (AI) incorporated into the filaments was extracted by heating under reflux for 30 minutes with xylene in the presence of citric acid, with dioctyl phthalate added as internal standard, and determined by gas chromatography with flame ionization detection (GC-FID) following the CIPAC method 454/LN/M/3.2.
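The three-level classification of bio-assay results can be expressed as a small function. This is a sketch of the thresholds stated above; the function name and the assumption that both inputs are net-level percentages are illustrative.

```python
def classify_effectiveness(kd60_pct, mortality_pct):
    """Classify one net from its KD60 and 24-hour mortality (both in percent)."""
    # Optimal: either criterion reaches the higher threshold.
    if kd60_pct >= 95 or mortality_pct >= 80:
        return "optimal"
    # Minimal: either criterion reaches the lower threshold.
    if kd60_pct >= 75 or mortality_pct >= 50:
        return "minimal"
    # Failure: neither minimal criterion is met.
    return "failure"

# Usage with made-up values.
print(classify_effectiveness(96, 40))
print(classify_effectiveness(80, 30))
print(classify_effectiveness(60, 30))
```

Because the criteria are joined by "or", a net with low knock-down but high mortality is still classified by the better of the two measures.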
For data collection, tablet PCs (Samsung Galaxy Tab 5) were used, installed with the Open Data Kit (ODK) software for the questionnaire and Open Street Map for Android (OSMAND) for household tracking. Data from each field team were collected daily and either uploaded directly to a secure database if internet was available or stored on a local storage device by the coordinator until they could be transferred. Data were converted from ODK to comma-delimited data files using the ODK Briefcase tool for inspection of incoming data, and daily feedback was provided to the teams. For each survey round, updated lists were compiled from the household and cohort net master files and preloaded on the ODK system, including all households and cohort LLIN for which no definite outcome was available to date. After completion of the surveys, datasets were transferred to Stata version 14.2 (Stata, Texas, USA) for further aggregation, consistency checks and preparation for analysis. Stata do-files (macros) developed by the PMI VectorWorks project were applied and adjusted as needed. For the final analysis, datasets from all four surveys were merged and a duration-format dataset was prepared for survival analysis.
Definition of outcomes
The primary outcome measure was physical net survival, defined as the proportion of cohort LLIN received from the LLIN campaign still in serviceable physical condition (definition provided below). Physical net survival incorporates both net attrition and net integrity, which were calculated as follows:
Net attrition rate due to wear and tear was defined as the proportion of originally received LLIN which were lost due to wear and tear (thrown away, destroyed or used for other purposes) at the time of assessment. LLIN received but given away for use by others or stolen were excluded from the denominator. Similarly, LLIN with unknown outcome were excluded.
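The denominator rule for the attrition rate can be illustrated with a short sketch. The outcome labels and the function name are hypothetical; only the exclusion of nets given away, stolen, or with unknown outcome comes from the definition above.

```python
def attrition_rate(outcomes):
    """Proportion of cohort nets lost to wear and tear.

    outcomes: one label per net, drawn from the hypothetical set
    {"present", "wear_and_tear", "given_away", "stolen", "unknown"}.
    """
    excluded = {"given_away", "stolen", "unknown"}
    # Nets given away, stolen, or with unknown outcome leave the denominator.
    eligible = [o for o in outcomes if o not in excluded]
    if not eligible:
        return None  # no nets left to assess
    lost = sum(1 for o in eligible if o == "wear_and_tear")
    return lost / len(eligible)

# Usage with a made-up cohort of 11 nets.
cohort = ["present"] * 6 + ["wear_and_tear"] * 2 + ["given_away", "stolen", "unknown"]
print(attrition_rate(cohort))
```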
Net integrity was measured first by the proportionate Hole Index (pHI) as recommended by WHO. Holes in cohort LLIN were counted and categorized into four sizes: size 1: 0.5-2 cm, size 2: 2-10 cm, size 3: 10-25 cm and size 4: larger than 25 cm in diameter. The pHI for each net was then calculated as the sum of the number of holes in each size category multiplied by the category weight suggested by WHO. Based on the pHI, each net was then categorized as “good”, “damaged”, “serviceable” or “torn” as follows:
Good: total hole surface area <0.01 m² or pHI ≤64
Damaged: total hole surface area 0.01-0.1 m² or pHI 65-642
Torn: total hole surface area >0.1 m² or pHI >642
Serviceable: total hole surface area ≤0.1 m² or pHI ≤642 (good or damaged)
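The pHI calculation and the resulting categories can be sketched as below. The weights 1, 23, 196 and 576 for hole sizes 1 to 4 are the values usually cited from the WHO guidance and are an assumption here, since the text refers to them without listing them. A pHI of exactly 64 is treated as "good". The function names and the example hole counts are illustrative.

```python
# Assumed WHO size-category weights (not listed in the text above).
WEIGHTS = {1: 1, 2: 23, 3: 196, 4: 576}

def proportionate_hole_index(hole_counts):
    """hole_counts maps size category (1-4) to the number of holes counted."""
    return sum(WEIGHTS[size] * count for size, count in hole_counts.items())

def categorize(phi):
    """Map a pHI value to the integrity category."""
    if phi <= 64:
        return "good"
    if phi <= 642:
        return "damaged"
    return "torn"

def is_serviceable(phi):
    """Serviceable nets are those that are good or damaged."""
    return phi <= 642

# Usage: five size-1 holes, two size-2 holes and one size-3 hole.
phi = proportionate_hole_index({1: 5, 2: 2, 3: 1, 4: 0})
print(phi, categorize(phi), is_serviceable(phi))
```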
To allow comparison of physical survival measured at different time points, median net survival was estimated, defined as the time in years until 50% of the originally distributed LLIN were no longer serviceable. Two approaches were used to estimate median survival. At each time point, the proportion surviving in serviceable condition was plotted against the hypothetical survival curves with defined median survival (Additional File 1), and the median survival was taken as the relative position of the data point on a horizontal line between the two adjacent median survival curves. After the final survey, median net survival was calculated from the last two time points, provided both were below 85% (when the hypothetical curves are linear), using the following formula:
$$t_m = t_1 + \frac{(t_2 - t_1)\,(p_1 - 50)}{p_1 - p_2}$$
where $t_m$ is the median survival time, $t_1$ and $t_2$ are the first and second time points in years, and $p_1$ and $p_2$ are the proportions surviving to the first and second time points, respectively, in percent. Confidence intervals for this estimate were calculated by projecting the 95% CI from the survival estimates in the same way as described above.
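The two-point estimate amounts to a linear interpolation to the 50% mark. A minimal sketch, assuming a straight line between the last two survey points; the function name and the example values are hypothetical, not study results.

```python
def median_survival(t1, t2, p1, p2):
    """Interpolate the time (years) at which survival crosses 50%.

    t1, t2: time points in years; p1, p2: percent surviving at t1 and t2.
    """
    if p1 >= 85 or p2 >= 85:
        raise ValueError("both proportions must be below 85%")
    if p1 <= p2:
        raise ValueError("survival must decline between the two time points")
    # Straight line through (t1, p1) and (t2, p2), solved for p = 50.
    return t1 + (t2 - t1) * (p1 - 50) / (p1 - p2)

# Usage with made-up proportions: 70% at 2 years, 45% at 3 years.
print(median_survival(2.0, 3.0, 70.0, 45.0))
```

If both proportions are still above 50%, the same line is extrapolated beyond the second time point, so such estimates lie outside the observed period.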
Explanatory variable preparation
Overall household attitudes towards net care and repair were measured using a set of Likert-scale questions in which a statement was read to the respondent (head of household or spouse) and the level of agreement recorded. These were analysed by recoding the four-level Likert scale to a value of -2 for “strongly disagree”, -1 for “disagree”, +1 for “agree” and +2 for “strongly agree”. The attitude scores for each respondent were then summed and divided by the number of statements to calculate an average household attitude score, for which 0 represents a neutral result and positive values a positive result. For each site, the proportion of households with a score above 1 (very positive attitude) was calculated at each survey. Further aggregation was done across all four surveys to determine whether a household was never found to have a very positive attitude score, or was found to have one once, or twice or more. Reported behaviours were aggregated across all four surveys as follows: “never” = responded with “never” in all surveys in which the household participated; “at times” = reported the behaviour as “sometimes” in at least one survey round or gave conflicting statements; “always” = responded with “always” in all surveys in which the household participated. Exposure and attitude were aggregated similarly, i.e. “once” = reported exposure or positive attitude score at one of the four survey rounds; “twice or more” = at two or more survey rounds. The same procedure was used for other household and net risk factors for durability.
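The attitude scoring and its aggregation across survey rounds can be sketched as follows. The function names and the example responses are illustrative; the recoding values and the threshold of 1 are taken from the description above.

```python
# Recoding of the four-level Likert scale as described in the text.
LIKERT = {"strongly disagree": -2, "disagree": -1, "agree": 1, "strongly agree": 2}

def attitude_score(responses):
    """Average recoded score over all statements read to one respondent."""
    return sum(LIKERT[r] for r in responses) / len(responses)

def is_very_positive(score):
    """A score above 1 counts as a very positive attitude."""
    return score > 1

def aggregate_rounds(flags):
    """Summarize per-round flags (True/False) across the survey rounds attended."""
    count = sum(1 for f in flags if f)
    if count == 0:
        return "never"
    if count == 1:
        return "once"
    return "twice or more"

# Usage with made-up responses to four statements.
score = attitude_score(["agree", "strongly agree", "strongly agree", "agree"])
print(score, is_very_positive(score))
print(aggregate_rounds([False, True, False, True]))
```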
A wealth index was calculated for the baseline dataset using the basic household assets and a principal component analysis, with the first component used as the index. Households were then grouped into tertiles. The full household data collection and the wealth index calculation were repeated at the final survey. However, at 12 and 24 months no specific household or member data were collected.
For continuous variables, arithmetic means were used to describe the central tendency and the t-test was used to compare groups for normally distributed data. Otherwise, the median and the Kruskal-Wallis test were used. Proportions were compared using contingency tables, and the Chi-squared test was used to test for differences in proportions. For the calculation of confidence intervals around estimates, the intra- and between-cluster correlation was taken into account.
Survival analysis was done using an intention-to-treat approach, i.e. risk of failure was considered to start on the day of distribution irrespective of whether or when the net was hung and used. Failure was defined as a net being lost to wear and tear or “too torn” based on physical assessment. Nets that were given away or had an unknown outcome were censored. The time of failure was calculated directly from the time of loss reported by the respondent or, if unknown, taken as the mid-point between the last two surveys. A secondary analysis used a per-protocol approach in which the risk of damage was considered to begin only when a net was first hung. Determinants of survival were explored using Cox proportional hazards models. Final model fit was tested using a link test, and Schoenfeld residuals were used to check the proportional hazards assumption.
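The assignment of failure times and censoring can be sketched as a small helper that builds one duration record per net. The outcome labels, the function name and the use of months since distribution as the time unit are assumptions made for illustration; the mid-point rule and the censoring of nets given away or with unknown outcome follow the description above.

```python
def duration_record(outcome, reported_time, previous_survey, current_survey):
    """Return (time, event) for one net, with times in months since distribution.

    outcome: hypothetical label, one of
        "wear_and_tear", "too_torn", "given_away", "unknown", "serviceable".
    reported_time: respondent-reported time of loss, or None if unknown.
    event is 1 for failure and 0 for a censored observation.
    """
    failed = outcome in ("wear_and_tear", "too_torn")
    if reported_time is not None:
        time = reported_time
    elif failed:
        # Unknown time of failure: mid-point between the last two surveys.
        time = (previous_survey + current_survey) / 2
    else:
        # Censored nets without a reported time: last survey reached
        # (an assumption of this sketch).
        time = current_survey
    return time, 1 if failed else 0

# Usage with made-up nets assessed at the 24-month survey.
print(duration_record("wear_and_tear", None, 12, 24))
print(duration_record("wear_and_tear", 15, 12, 24))
print(duration_record("given_away", None, 12, 24))
```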