A total of forty-eight of the 58 species of Polythoridae were included for all reconstructions presented here. All the taxa from Sanchez et al. 2018, including outgroups of other related Calopterygoid taxa (Philogangidae, Euphaeidae, and Pseudolestidae), are included within the analyses, with an additional 12 polythorid species new to these analyses. Geographic origin, collector details, and Genbank Accession Numbers for all specimens are summarized in Supplementary Table S6.
DNA amplification, sequencing, and alignment
For the 12 species new to this analysis we extracted DNA from either the legs or ¼ of the pterothorax using a DNeasy Tissue Kit (QIAGEN) from each specimen following the manufacturer’s protocol. We amplified three mitochondrial and three nuclear fragments: Cytochrome Oxidase I (~799bp), NADH subunit I dehydrogenase (~548 bp), 16S, (~340bp), Elongation Factor (~900 bp), 28S (~340bp) and 18S (~600bp) (see Supplementary Table S7 for a list of primers used). All gene fragments were amplified using PCR conditions as described in the associated publications for each pair of primers (Supplementary Table S7) Macrogen USA Inc. laboratories (NY) performed the purification protocol for the PCR products (15μl final volume for each primer) and the Sanger DNA sequencing. Primer contig assembly, peak chromatogram verification and the generation of per-individual consensus sequences were done using Geneious v 8. All fragments were aligned using MAFFT  and then manually aligned in Mesquite. Ribosomal genes were aligned manually with reference to secondary structure using the methods described in Kjer  and Kjer et al.. Finally, all genes were concatenated using Mesquite for the overall analyses.
Time Divergence Analysis
A relaxed-clock molecular dating analysis on the partitioned dataset was run using BEAST v 1.8.4 . Specifically, we partitioned the gene fragments as follows: (i) We linked the sites and clock models for all mtDNA fragments and (ii) unlinked all nuclear ones from their clock and site models. We implemented the appropriate model selection for each partition: HKY + G4 for all mtDNA, JC for 18S and PMRT, 28S and for EF1 we set the model to GTR + G, models obtained using the model selection tool of IQTree. We used lognormal relaxed clock models for all partitions, under a Yule speciation model tree prior. We generated a best ML phylogram with proportional branch lengths using the appropriate partition models using IQTree as the starting tree for the analysis. Most of the fossil calibrations prior distributions were uniform considering the oldest fossil age as the maximum and the youngest fossil age as the minimum bound. For the Euphaedidae outgroup calibration we created a Lognormal distribution of all the fossil ages that accounted for all the age variation. Table 3 shows the calibrated nodes (4), stem fossils (10) and prior distributions selected for the analyses. We ran four independent analyses to ensure convergence of the MCMC; convergence was checked using Tracer 1.7 . Finally, the independent runs for each treatment were combined using LogCombiner v 1.8.3. The dated ultrametric tree was obtained using TreeAnnotator v 1.8.3 and visualized using Figtree v 1.4 (http://tree.bio.ed.ac.uk/software/figtree/) and the R package ggtree. With the best ultrametric tree we obtain a lineage through time plot using the R package APE .
The ancestral range estimation was performed with the following software: R package Biogeobears v.0.2.1 and RASP (Reconstruct Ancestral State in Phylogenies). Both software packages allowed us to customize dispersal rates matrices and time stratification events, as well as infer areas among the following historical biogeography frameworks: Dispersal-Vicariance Analysis (DIVA), Statistical Dispersal-Vicariance (S-DIVA), Dispersal-Extinction Cladogenesis (DEC), and Bayesian inference of historical biogeography for discrete areas (BayArea). In addition, Matzke included a new parameter in all the models accounting for what he calls the founder-event speciation. These are described as “jumping dispersal events (J)”, which are rare events that occur when a new population colonizes a new area . In particular, BioGeoBEARS has implemented model selection among six different historical biogeographic scenarios (DEC, DEC+J, DIVALIKE, DIVALIKE + J, BayArea, BayArea + J; ). Although, Ree and Sanmartín  recently highlighted there are conceptual and statistical issues with the DEC + J model implemented in BioGeoBEARS, which can sometimes favour unparsimonious numbers for “jumping dispersal events”; and as a result it will not reflect a more close approximation of the “true” model of range evolution. For our analyses, we designated nine distinct geological areas based on the geological literature [1, 41, 42], which include; (A) Central America, (B) Tumbes-Choco-Magdalena, (C) North Western Andes, (D) North Eastern Andes, (E) Venezuela Highlands, (F) Central Andes, (G) Amazon Basin, (H) Guiana Shield, and (I) Brazilian Shield (See Figure 2, S4). The extant species distributions for each of our species at the tips of our time calibrated tree (Fig 2) were compiled from our locality data, and specimen records information from the following collections: Florida State Arthropod Collection (FSCA), U.S National Entomological Collection (USNM), Andes Museum of Natural History (ANDES), Entomological Collection of the Universidad de Antioquia, Colombia (CEUA), and the Rutgers Newark Entomological Collection (RUN_ODO). Ranges were restricted to be comprised of at most three different areas as there are no extant species that occupy a range made up of more than three of the determined areas. Impossible adjacency range combinations were manually removed from the ranges list used by BioGeoBEARS during the inference of ancestral states. Details on the model parameters, areas allowed, dispersal probabilities and time stratification schemes for each of the three scenarios (Control, P&AS and ANDES) are explained in the Supplementary Appendix S8. Each of these scenarios were subject to the six available historical biogeographic models of BioGeoBEARS, from the best-fit model based on the corrected Akaike Information Criterion (AICc) weights. However, following the recommendations of Ree and Sanmartín we assessed model consistency by comparing among the selected BioGeoBEARS models and carefully examined the reconstructions performed for those models in both BioGeoBEARS and RASP. Afterwards, we favored the model based on best fit for the empirical information on Polythoridae and method consistency. The best selected reconstructed areas models, for each scenario were mapped over the best time calibrated phylogeny and the directionality of the dispersal and vicariance events was represented using maps for each major clade [59, 65].
Time Diversification Analyses
To investigate the patterns of diversification and extinction rate variation through time and across lineages, we chose RevBayes[16, 17] to test the best diversification models and rate shifts, due to the high uncertainty associated with phylogenetic tree estimation. We calculated the Bayes Factors for each pair of candidate models estimating the marginal Likelihood (mlnL) using two sampling algorithms (stepping-stone sampling and path-sampling) among the following models: Yule Pure-Birth Model, Birth-Death Constant Model and Episodic Birth-Death Models with multiple time intervals (4, 10 and 100 ). Once the best model through time was selected we calculated the speciation (λ), extinction (μ), net diversification (λ-μ) and relative extinction (λ/μ) for the best model; all plots were generated using the RevGadgets package in R. In addition, we used the Branch Specific Diversification Model implemented in RevBayes[16, 17] to detect rate shifts across lineages. For all the models implemented in RevBayes we assumed a uniform taxon sampling and an incomplete sampling fraction of 48/57. For the model selection, we ran a total of 5000 MCMC generations with a burnin of 1000 generations, all the parameter outputs were checked in Tracer to assess the ESS, prior and posterior probability, and markov chain proper behavior. We estimated the mlnL of the models sampling the power posterior of each model for 1000 MCMC generations with a burnin of 10%. For the branch specific estimation, we assumed the heterogeneous model and we test several rate categories (N=1(constant), 4 and 10) across the lineages, for each rate we obtained the marginal probabilities and Bayes Factors as explained above. For the best selected model, we wrote a file with all the estimated parameters (λ avg, μ avg, λ-μ avg, λ/μ avg and its 95% confidence ranges) for each branch that can be mapped over our BEAST time divergence tree and was visualized using FigTree (http://tree.bio.ed.ac.uk/software/figtree/).