Study design and setting
A cross-sectional study was conducted to detect geographical clusters of early age sexual initiation and to determine its associated factors among reproductive-age women, using data from the 2016 Ethiopian Demographic and Health Survey (DHS). The study was conducted in Ethiopia (3°–14°N and 33°–48°E), located in the Horn of Africa. Administratively, the country is divided into nine regional states and two city administrations [11]. Each region is sub-divided into zones, districts, towns, and kebeles (the smallest administrative units) (Fig. 1).
Data
The data for this study were retrieved from the DHS Program database after authorization. The survey is usually conducted at five-year intervals in a country. Ethiopia has undertaken four consecutive DHS surveys (2000, 2005, 2011, and 2016). The Ethiopian DHS was designed to provide estimates for the nine regional states and two city administrations. Geographical location data (latitude and longitude coordinates) were also taken for the selected enumeration areas. The survey data sets and location data were obtained through the web page of the international DHS Program after registration and approval as an authorized user.
Sampling technique and sample size
A stratified two-stage cluster sampling technique was employed in this nationally representative, population-based survey. The data were collected from 645 enumeration areas (EAs) (202 urban and 443 rural), selected independently in each stratum using systematic sampling with probability proportional to size. After applying sampling weights, a total of 12,033 women of reproductive age (15–49 years) who had had sexual intercourse at least once were included in the analysis. Spatial cluster detection and autocorrelation analysis were also done to discover the patterns of early age sexual initiation (Fig. 2).
Key variables and Measurements
Dependent variable
The study variables were grouped into dependent and independent variables. The dependent variable was early age sexual initiation, coded as a dichotomous (“Yes/No”) variable. Respondents who had sexual intercourse before the age of 18 were categorized as “Yes” and those who had not as “No.”
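The coding rule above can be illustrated with a minimal Python sketch. The function name, the example ages, and the handling of unreported ages (passed through as missing) are assumptions made for illustration; they are not taken from the survey files or from the authors' own processing.

```python
from typing import Optional

EARLY_AGE_CUTOFF = 18  # years, per the definition given in the text


def early_initiation(age_at_first_sex: Optional[int]) -> Optional[str]:
    """Return "Yes" if first intercourse occurred before age 18, else "No".

    An unreported age (None) is passed through as None rather than guessed;
    this treatment of missing values is an assumption of the sketch.
    """
    if age_at_first_sex is None:
        return None
    return "Yes" if age_at_first_sex < EARLY_AGE_CUTOFF else "No"


# Hypothetical ages at first intercourse, for illustration only.
ages = [15, 17, 18, 22, None]
coded = [early_initiation(a) for a in ages]
```

Note that a respondent whose first intercourse occurred at exactly 18 years falls in the “No” category under this rule.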
Independent variables
- Socio-demographic variables: respondent’s current age, residence, region, religion, wealth index, women’s education, working status, marital status, age at first marriage
- Sexual and reproductive history variables: age at first sex, age at first birth, current pregnancy status, ever had a termination of pregnancy
Data collection procedures and period
Data collection procedures have been published elsewhere. Briefly, data were collected from January 18, 2016, to June 27, 2016, by visiting households and conducting face-to-face interviews to obtain information on demographic characteristics, socioeconomic status, and sexual and reproductive history [11].
Operational definition
Early age sexual initiation was defined as the experience of first intercourse before 18 years of age [9, 11].
Statistical analyses
Descriptive and inferential analysis
All variables in the DHS data used for this analysis were weighted to adjust for differences in the probability of selection and for non-response, so as to produce representative estimates; the analysis was done using SPSS version 20. Descriptive statistics such as frequencies, percentages, and measures of central tendency were computed. Bivariable analysis was carried out to examine the association between the dependent variable and each independent variable. All independent variables with a p-value < 0.2 in the bivariable analysis were included in the multivariable logistic regression model to control for possible confounding. Adjusted odds ratios with 95% confidence intervals were estimated using the enter method, and variables with a p-value < 0.05 in the multivariable model were considered statistically significant.
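The three steps described above (weighting, screening at p < 0.2, and reporting odds ratios with 95% confidence intervals) can be sketched in Python. This is only an illustration and not the SPSS workflow used in the study; the function names, the example weights, the p-values, and the coefficient are all hypothetical, and the weights are assumed to have been rescaled already.

```python
import math


def weighted_percentage(flags, weights):
    """Weighted percentage of records whose flag is True."""
    total = sum(weights)
    return 100.0 * sum(w for f, w in zip(flags, weights) if f) / total


def screen_candidates(p_values, threshold=0.2):
    """Keep variables whose bivariable p-value is below the threshold."""
    return [name for name, p in p_values.items() if p < threshold]


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logistic coefficient and its standard error to an odds
    ratio with an approximate 95% Wald confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical inputs, for illustration only.
pct = weighted_percentage([True, False, True, False], [1.5, 0.5, 1.0, 1.0])
kept = screen_candidates({"residence": 0.03, "religion": 0.45, "education": 0.19})
or_, lower, upper = odds_ratio_ci(beta=0.0, se=0.1)
```

A coefficient of zero corresponds to an odds ratio of 1, so its interval spans 1 and the variable would not be considered significant at the 0.05 level.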
Spatial analysis
Geographic information system software (ArcGIS version 10.4) was used to visualize maps and compute spatial statistics. Global and local spatial autocorrelation analyses were applied to explore the presence of clustering in the study area and to detect the geographical location of clusters of early age sexual initiation. Moran’s I is a correlation coefficient that measures the degree of association of a single variable with itself at different points in space, as a function of the distance between points. The global Moran’s I statistic was used to measure geographical clustering across the nation, whereas the local Moran’s I statistic was used to construct a localized measure of autocorrelation [12].
Spatial autocorrelation analysis
According to Tobler’s first law of geography, “everything is related to everything else, but near things are more related than distant things” [13, 14]. The global spatial autocorrelation statistic (Global Moran’s I) was used to evaluate whether cases were dispersed, clustered, or randomly distributed in the study area. Its values range from −1 to 1: values close to −1 indicate negative spatial autocorrelation (cases dispersed), values close to 1 indicate positive spatial autocorrelation (cases clustered), and values near zero indicate spatial randomness [15].
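To make the interpretation of the statistic concrete, the following Python sketch computes Global Moran’s I for a toy example. It assumes a binary contiguity weight matrix for four locations along a line; the weighting scheme actually used in ArcGIS for this study is not stated in the text, so the weights, values, and function names here are illustrative assumptions.

```python
def global_morans_i(values, weights):
    """Global Moran's I for values at n locations and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    m2 = sum(d * d for d in dev)
    return (n / s0) * (cross / m2)


def chain_weights(n):
    """Binary contiguity for n locations on a line (assumed toy layout)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


w = chain_weights(4)
clustered = global_morans_i([1, 1, 0, 0], w)  # similar values are adjacent
dispersed = global_morans_i([1, 0, 1, 0], w)  # values alternate
```

In this toy layout the clustered pattern yields a positive index and the alternating pattern yields an index of −1, matching the interpretation given above.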
Anselin Local Moran’s I was used to investigate the existence of local clusters of early sexual initiation. It identifies positively correlated clusters (high-high and low-low) and negatively correlated locations (high-low and low-high), which are called outliers [16].
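The four cluster and outlier types can be illustrated with a simplified Python sketch. It uses one common form of the local statistic (deviation at a location multiplied by the weighted sum of neighbouring deviations, scaled by the variance) and labels each location from the signs of those two terms. Significance testing, which the ArcGIS tool performs before reporting a cluster, is omitted; the weights, values, and function names are assumptions for illustration.

```python
def local_morans_i(values, weights):
    """Return (local I, label) per location; labels are HH, LL, HL, LH or none.

    No permutation test is done here, so labels describe sign patterns only.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n  # variance used as the scaling term
    results = []
    for i in range(n):
        lag = sum(weights[i][j] * dev[j] for j in range(n) if j != i)
        local_i = dev[i] * lag / m2
        if dev[i] > 0 and lag > 0:
            label = "HH"  # high value among high neighbours
        elif dev[i] < 0 and lag < 0:
            label = "LL"  # low value among low neighbours
        elif dev[i] > 0 and lag < 0:
            label = "HL"  # high outlier among low neighbours
        elif dev[i] < 0 and lag > 0:
            label = "LH"  # low outlier among high neighbours
        else:
            label = "none"
        results.append((local_i, label))
    return results


# Binary contiguity for four locations on a line (assumed toy layout).
w = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(4)] for i in range(4)]
cluster_labels = [lab for _, lab in local_morans_i([0.8, 0.7, 0.2, 0.1], w)]
outlier_labels = [lab for _, lab in local_morans_i([0.9, 0.1, 0.9, 0.1], w)]
```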
Hot spot analysis (Getis-Ord Gi* statistic)
The hot spot analysis tool calculates the Getis-Ord Gi* statistic, which produces z-scores and p-values that indicate whether the clustering of features with high or low values is statistically significant; a significance level of 0.05 was used. Z-scores were used to assess the statistical significance of geographic clustering of early sexual initiation. A high (positive) z-score with a small p-value indicates a significant hot spot, whereas a low (negative) z-score with a small p-value indicates a significant cold spot; the larger the absolute z-score, the stronger the clustering, and a z-score near zero means no apparent spatial clustering [17].
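As an illustration of how the z-scores arise, the sketch below computes the Gi* statistic in its standard z-score form for five hypothetical locations on a line, where each location's neighbourhood includes itself and its immediate neighbours. The neighbourhood definition, the values, and the function name are assumptions; the conceptualization of spatial relationships used in the study is not stated in the text.

```python
import math


def getis_ord_gi_star(values, weights):
    """Gi* z-score per location; weights[i][i] should be non-zero (self included)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    z_scores = []
    for i in range(n):
        row = weights[i]
        sum_w = sum(row)
        sum_w2 = sum(w * w for w in row)
        numerator = sum(w * x for w, x in zip(row, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w * sum_w) / (n - 1))
        # Undefined when all values are equal or every location is a neighbour.
        z_scores.append(numerator / denominator if denominator else float("nan"))
    return z_scores


# Hypothetical values; neighbourhood = self plus adjacent locations on a line.
values = [10.0, 9.0, 5.0, 1.0, 0.0]
w = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(5)] for i in range(5)]
z = getis_ord_gi_star(values, w)
```

In this toy example the end with high values receives a positive z-score, the end with low values a negative one, and the middle location a z-score of zero.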
Spatial interpolation
A spatial interpolation technique was applied to predict values at unknown (non-sampled) locations using values at the measured (sampled) locations [18, 19]. The kriging interpolation method was applied to make these predictions and to produce a smooth surface of early age sexual initiation.
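The idea of predicting a value at a non-sampled location as a weighted combination of sampled values can be sketched as follows. The text does not specify the kriging variant or variogram model, so this sketch assumes ordinary kriging with an exponential variogram without nugget, planar coordinates, and hypothetical sample values; it is not the ArcGIS procedure used in the study.

```python
import math


def exponential_variogram(h, sill=1.0, rng=1.0):
    """Assumed semivariance model without nugget; gamma(0) = 0."""
    return sill * (1.0 - math.exp(-h / rng))


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ordinary_kriging(points, values, target, variogram=exponential_variogram):
    """Return (prediction, weights) at target from sampled points and values."""
    n = len(points)
    # Kriging system: semivariances between samples plus the unbiasedness row.
    a = [
        [variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
        for i in range(n)
    ]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    kriging_weights = solve_linear(a, b)[:n]  # last entry is the multiplier
    prediction = sum(w * z for w, z in zip(kriging_weights, values))
    return prediction, kriging_weights


# Hypothetical proportions at the corners of a unit square.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
vals = [0.2, 0.4, 0.6, 0.8]
centre_prediction, centre_weights = ordinary_kriging(pts, vals, (0.5, 0.5))
corner_prediction, _ = ordinary_kriging(pts, vals, (0.0, 0.0))
```

Two properties of ordinary kriging are visible here: the weights sum to one, and with no nugget the prediction at a sampled location reproduces the observed value.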
Cluster detection and spatial scan statistical analysis
The spatial scan statistic is widely recommended because it performs well in detecting local clusters [20]. It was used to test for statistically significant spatial clusters (hotspots) of early sexual initiation with Kulldorff’s SaTScan version 9.4 software. The method uses a scanning window that moves across the study area; women who started sexual intercourse before the age of 18 were considered cases and those who started at age 18 or above were considered controls to fit the Bernoulli model [12, 21].
A maximum spatial cluster size of < 25% of the population was used as the upper limit, which allowed both small and large clusters to be detected while ignoring clusters that exceeded this limit. For each potential cluster, a likelihood ratio test statistic was used to determine whether the observed number of women with early initiation of sexual intercourse within the potential cluster was significantly higher than expected. The primary and secondary clusters were identified, ranked by their log likelihood ratio, and assigned p-values on the basis of 999 Monte Carlo replications [12, 21].
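The likelihood ratio test and the Monte Carlo p-value can be illustrated for a single candidate window. The sketch below uses the standard Bernoulli-model log likelihood ratio for high-rate clusters and a rank-based p-value; it does not reproduce SaTScan itself, and the counts, the simulated values, and the function names are hypothetical.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def _loglik(cases, total, p):
    """Bernoulli log likelihood of `cases` among `total` at probability p."""
    return _xlogy(cases, p) + _xlogy(total - cases, 1.0 - p)


def bernoulli_llr(cases_in, total_in, cases_all, total_all):
    """Log likelihood ratio for one window; assumes 0 < total_in < total_all.

    Returns 0 unless the case proportion inside exceeds that outside,
    since only high-rate clusters are of interest here.
    """
    cases_out = cases_all - cases_in
    total_out = total_all - total_in
    rate_in = cases_in / total_in
    rate_out = cases_out / total_out
    if rate_in <= rate_out:
        return 0.0
    rate_all = cases_all / total_all
    return (
        _loglik(cases_in, total_in, rate_in)
        + _loglik(cases_out, total_out, rate_out)
        - _loglik(cases_all, total_all, rate_all)
    )


def monte_carlo_p(observed, simulated):
    """Rank-based p-value from the maximum LLRs of random replications."""
    at_least = sum(1 for s in simulated if s >= observed)
    return (1 + at_least) / (1 + len(simulated))


# Hypothetical counts: 300 cases among 1000 women, window of 100 women.
llr_high = bernoulli_llr(60, 100, 300, 1000)  # 60% cases inside the window
llr_same = bernoulli_llr(30, 100, 300, 1000)  # same proportion as outside
p_value = monte_carlo_p(llr_high, [1.0] * 999)  # placeholder replications
```

With 999 replications the smallest attainable p-value is 0.001, which occurs when no replication produces a log likelihood ratio as large as the observed one.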
Ethical consideration
This study was based on an analysis of existing survey data from which all information that could identify particular individuals had been removed. Written consent to use the datasets was obtained from the Measure DHS International Program.