## Numbers of flies collected and infected

Wing vein length was measured in 2,375 *G. m. morsitans*, and 2,464 flies were dissected and examined for trypanosome infection. Of these, 2,195 flies (1,491 males and 704 females) were used in the multilevel binary logistic regression and principal component analyses (Table 1).

## Wing length and prevalence of trypanosome infection

As shown in Figure 1, the Lusandwa site recorded the longest mean wing length in females (1.75 mm), whilst the Chisulo site had the longest mean wing length in males (1.58 mm). Females had significantly longer mean wing length than males at Lusandwa (t = 27.41) and Zinaka (t = 31.77), with p < 0.0001 at both sites. No significant difference between the mean wing length of females and males was observed at Chisulo (t = 0.39, p = 0.717).

With regard to prevalence of trypanosomes, female flies had higher prevalence than male flies at all study sites and for all sampling methods, except in the trap catches at Zinaka (red box) (Figure 2). Prevalence in male and female flies caught on fly rounds at Zinaka differed significantly from that in flies of the same sex at Chisulo and Lusandwa, as evidenced by the non-overlapping 95% confidence intervals.

## Multilevel binary logistic regression analysis

The multilevel binary logistic regression results showed that wing length was an important predictor of the prevalence of trypanosomes in *G. m. morsitans*. However, wing length had a significant influence on prevalence only in some permutation models of the whole data set, and none of those models included sex as a predictor. In the permutation model where wing length had the strongest influence on prevalence, i.e. the one with method, season and wing length as predictor variables (model 1c, Table 2), the log odds for the prevalence of trypanosomes increased significantly by 0.123 per unit increase in wing length (p = 0.032). The model with season and wing length as the only predictors showed the weakest but still significant influence: per unit increase in wing length, the log odds increased by 0.106 (p = 0.037, data not shown). In the permutation models for the one-method data set, i.e. the fly round data set, the log odds also increased per unit increase in wing length, by 0.124 in the representative model 2c, but this increase was not significant (p = 0.069). No logistic regression analysis was run on the trap-only data because the sample size of 224 was lower than the minimum of 250 required for the regression analysis. Wing length did not show a significant influence on prevalence of trypanosomes in any of the other permutation models of the other data sets that had sufficient sample sizes for analysis.
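The log odds coefficients reported above can be easier to read as odds ratios, obtained by exponentiating each coefficient. The short sketch below does only that conversion for the three wing length coefficients quoted in the text; the function name is illustrative, and the unit of wing length used in the models is taken as given because it is not stated here.

```python
import math


def log_odds_to_odds_ratio(beta: float) -> float:
    """Convert a logistic regression coefficient (log odds) to an odds ratio."""
    return math.exp(beta)


# Wing length coefficients as reported in the text (log odds per unit increase).
reported_log_odds = {
    "model 1c (method + season + wing length)": 0.123,
    "season + wing length": 0.106,
    "model 2c (fly round only)": 0.124,
}

odds_ratios = {
    name: log_odds_to_odds_ratio(beta) for name, beta in reported_log_odds.items()
}

for name, ratio in odds_ratios.items():
    # An odds ratio above 1 means the odds of infection rise with wing length.
    print(f"{name}: odds ratio = {ratio:.3f}")
```

Read this way, a coefficient of 0.123 corresponds to roughly a 13% increase in the odds of infection per unit increase in wing length.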

The sex variable also showed a significant influence on prevalence of trypanosomes, including when the fly round data set was analyzed (models 1b and 2b, Table 2). Moving from females to males, the log odds for prevalence of trypanosomes reduced significantly by 0.305 (p = 0.010) in the model with the weakest but significant influence on prevalence for the whole data set, and by 0.283 (p = 0.024) in the model with the weakest influence on prevalence for the one-method (fly round) data set. Further, females had higher trypanosome prevalence than males, though not always significantly so, as indicated by the negative log odds values of the sex variable (Table 2). Moreover, a linear regression analysis of wing length on the whole data set, with sampling method, season, sex and study site as predictor variables, showed that moving from females to males, wing length reduced significantly by 0.150 (p < 0.0001). For the fly round and trap data sets, moving from females to males, wing length reduced significantly by 0.150 and 0.142, respectively (p < 0.0001).

In the analysis of the “females-only” data, across all permutation models of the method, season, wing length and ovarian category predictor variables, only the ovarian age category variable showed a significant influence on prevalence of trypanosomes (p = 0.030). When the “males-only” data were analyzed using the same predictors, with the ovarian category variable replaced by the wing fray variable, no variable showed a significant influence on prevalence in any permutation model.

Results of the likelihood ratio test carried out on permutation models for the whole data set showed that the sex and wing length variables together significantly improved the fit to the data (p = 0.024), with the AIC falling from 2104.4 for the model without the two variables to 2100.9 for the model with the two variables, i.e. the full model (Table 3). However, this was not the case in similar tests carried out on the fly round-only, females-only and males-only data sets.
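The relation between the two AIC values and the likelihood ratio test can be illustrated with a small calculation. The sketch below assumes that AIC is defined as 2k − 2 ln L and that the full model has exactly two more estimated parameters than the reduced model (one each for sex and wing length); neither assumption is stated in the text, and the function names are illustrative. Because the AIC values are reported to one decimal place, the result is approximate.

```python
import math


def lr_statistic_from_aic(aic_reduced: float, aic_full: float, extra_params: int) -> float:
    """Likelihood ratio statistic implied by two AIC values of nested models.

    With AIC = 2k - 2 ln L, the statistic 2 (ln L_full - ln L_reduced)
    equals (AIC_reduced - AIC_full) + 2 * extra_params.
    """
    return (aic_reduced - aic_full) + 2 * extra_params


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# AIC values reported in the text; two extra parameters is an assumption.
lr_stat = lr_statistic_from_aic(aic_reduced=2104.4, aic_full=2100.9, extra_params=2)
p_value = chi2_sf_even_df(lr_stat, df=2)

print(f"LR statistic = {lr_stat:.2f}, p = {p_value:.3f}")
```

Under these assumptions the implied p-value is close to the reported p = 0.024, which suggests the reported AIC values and the test result are mutually consistent.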

For the whole data set, the model with the lowest AIC (2095.3) was the one where sex was the only predictor variable (significant influence, p = 0.011), followed by the one where sex and wing length were the only predictors (AIC = 2097.0; neither showed a significant influence, p = 0.077 and 0.580, respectively). The model with sex but not wing length among the predictors, model 1b, fitted the data better (AIC = 2099.7) than the one with wing length but not sex among the predictors, model 1c (AIC = 2101.8). When the one-method data sets (fly round and traps) were analyzed, the model with sex as the only predictor had the lowest AIC for both methods, 1860.6 and 245.2, respectively.

## Variance inflation factor

Results of tests for multicollinearity on the full models of all data sets showed that all variance inflation factors (VIF) were less than 5 (Table 4). The wing length variable in the whole data set had the highest VIF (1.80), and the season variable in the fly round-only data set had the lowest (1.01).
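To give a sense of scale for these values, a VIF can be translated into the share of a predictor's variance explained by the other predictors, using the standard definition VIF = 1 / (1 − R²). The sketch below applies that identity to the two extreme values reported; it assumes the standard (not generalized) VIF was used, which the text does not specify, and the function name is illustrative.

```python
def r_squared_from_vif(vif: float) -> float:
    """R-squared of a predictor regressed on the other predictors, from its VIF.

    Uses the standard definition VIF = 1 / (1 - R^2).
    """
    if vif < 1.0:
        raise ValueError("a VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif


# Highest and lowest VIF values reported in the text.
reported_vif = {
    "wing length (whole data set)": 1.80,
    "season (fly round-only data set)": 1.01,
}

implied_r2 = {name: r_squared_from_vif(v) for name, v in reported_vif.items()}

for name, r2 in implied_r2.items():
    print(f"{name}: implied R-squared = {r2:.3f}")
```

On this reading, a VIF of 1.80 means that under half of the variance in wing length is shared with the other predictors, well below the level implied by the threshold of 5 (R² = 0.80).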

## Principal component analysis

Results of the principal component analysis showed that the wing length variable contributed the highest variance to the first principal component (PC1). However, apart from the trap-only data set, it did so only in data sets where the variable “method” was among those used in the analysis (Table 5). These were the whole, females-only and males-only data sets, with contributions of 39.15%, 37.79% and 33.09%, respectively. In the analysis of the one-method data sets that included both sexes, the wing length variable contributed the highest variance to PC1 (33.22%) for the trap data set, while the sex variable contributed the highest variance for the fly round data set (45.76%, almost equal to the 45.70% for the wing length variable). In the analyses of the one-sex data sets from one method, the site variable contributed the highest variance to PC1 for both methods, except for the males-only fly round data set, where its contribution was similar to that of the wing length variable.
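One common way to express the percentage contribution of each variable to a principal component is to square its loading on that component and divide by the sum of squared loadings. The sketch below shows that calculation, on the assumption that the percentages in Table 5 were derived in this way, which the text does not confirm. The loadings are hypothetical values chosen purely for illustration and are not taken from the study.

```python
from typing import Dict


def pc_contributions(loadings: Dict[str, float]) -> Dict[str, float]:
    """Percentage contribution of each variable to one principal component.

    Each contribution is the squared loading divided by the sum of squared
    loadings, so the sign of a loading does not affect the result.
    """
    total = sum(value * value for value in loadings.values())
    if total == 0.0:
        raise ValueError("at least one loading must be non-zero")
    return {name: 100.0 * value * value / total for name, value in loadings.items()}


# Hypothetical PC1 loadings, for illustration only (not data from the study).
hypothetical_pc1 = {
    "wing_length": 0.62,
    "sex": -0.58,
    "site": 0.40,
    "season": 0.35,
}

contributions = pc_contributions(hypothetical_pc1)
top_variable = max(contributions, key=contributions.get)

for name, share in contributions.items():
    print(f"{name}: {share:.2f}%")
print(f"largest contributor to PC1: {top_variable}")
```

Because the contributions sum to 100% within a component, two variables with near-identical shares, as reported for sex and wing length in the fly round data set, carry almost the same weight on that component.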