Identification and annotation of chicken ERVs
The chicken genome (Gallus_gallus-5.0) was used as the input for ChERV identification with two pipelines (Fig. 1). In the first pipeline, ChERVs were screened in the chicken genome using LTRharvest and LTRdigest. LTRharvest was used to screen for pairs of putative LTRs that were separated by 1 to 15 kb and flanked by target site duplications (TSDs). The LTR nucleotide similarity threshold in LTRharvest was set to greater than 80%, and all other parameters were left at their defaults. Internal retroviral features of the ERV candidates, including protein domains, polypurine tracts (PPT), and primer-binding sites (PBS), were annotated using LTRdigest with default parameters. In the second pipeline, MGEScan-LTR was used to identify ChERVs with default parameters. From the LTRharvest and MGEScan-LTR results, only candidates containing at least three of the five canonical retroviral protein domains (Gag, PR, RT, IN, and RH) were retained. Finally, the ERV candidates identified by the two pipelines were merged to form the ChERV library.
RT domain sequences from ChERVs and from known exogenous and endogenous retroviruses were aligned using MUSCLE (v3.8.31). A neighbor-joining phylogeny was built from the RT domain alignment using MEGA6 with 1,000 bootstrap replicates. The putative ChERV families were defined based on their support in the phylogenetic trees.
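The domain-based retention rule can be illustrated with a minimal Python sketch. It is not the authors' script: the input layout (a mapping from a hypothetical candidate identifier to its annotated domain names) and the function names are assumptions, and the merge shown is a plain union, because the text does not describe how overlapping candidates from the two pipelines were reconciled.

```python
# Canonical retroviral protein domains named in the text.
CANONICAL_DOMAINS = {"Gag", "PR", "RT", "IN", "RH"}


def keep_candidate(domains, min_domains=3):
    """Return True if at least `min_domains` distinct canonical domains are present."""
    return len(CANONICAL_DOMAINS & set(domains)) >= min_domains


def build_library(ltrharvest_hits, mgescan_hits):
    """Filter both hit sets and merge them by candidate id.

    Both arguments are hypothetical dicts: candidate id -> list of domain names.
    The union by id is an assumption made for illustration only.
    """
    library = {}
    for hits in (ltrharvest_hits, mgescan_hits):
        for cand_id, domains in hits.items():
            if keep_candidate(domains):
                library[cand_id] = sorted(CANONICAL_DOMAINS & set(domains))
    return library
```

Repeated annotations of the same domain count once, so a candidate annotated with Gag, RT, and RT would not pass the three-domain threshold.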
Identification of genes located in the neighborhood of ERVs in the chicken genome
To show the chromosomal distribution of ChERVs, we mapped the ChERVs onto each chicken chromosome. In addition, the UCSC Genome Bioinformatics tool was used to screen for genes located within 100 kb upstream or downstream of ChERVs (Fig. 1). Target genes were defined as genes with annotated exons (UTR and CDS) within this 100 kb window. Blast2GO and WEGO were used to perform GO classification. The GO terms covered molecular function, cellular component and biological process. For the pathway enrichment analysis, the genes were mapped to the KEGG database. The interaction networks of ChERVs and neighboring immune-related genes were imported into Cytoscape for visualization.
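The 100 kb neighborhood rule amounts to an interval-overlap test. The sketch below is illustrative only: the tuple layouts, the function name, and the choice to include the ERV body itself in the window are assumptions, not details taken from the UCSC-based procedure described above.

```python
FLANK = 100_000  # 100 kb upstream and downstream, as stated in the text


def neighboring_genes(erv, exons, flank=FLANK):
    """Return sorted gene names with at least one exon inside the ERV window.

    erv   : (chrom, start, end), hypothetical layout
    exons : iterable of (gene, chrom, start, end), hypothetical layout
    Coordinates are treated as closed intervals on the same scale.
    """
    chrom, start, end = erv
    win_start, win_end = max(0, start - flank), end + flank
    hits = set()
    for gene, ex_chrom, ex_start, ex_end in exons:
        # Two closed intervals overlap when each starts before the other ends.
        if ex_chrom == chrom and ex_start <= win_end and ex_end >= win_start:
            hits.add(gene)
    return sorted(hits)
```

A gene is reported once even if several of its exons fall inside the window, which matches the exon-based definition of a target gene.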
Transcriptome raw data from chickens infected with four pathogenic microorganisms
To analyze the ChERV transcriptome in chickens infected with pathogenic microorganisms, including AIV, ALV-J, MDV and APEC (Fig. 1), we downloaded the raw reads from NCBI, with details as follows.
From the study that provided the AIV infection transcriptome raw data, H5N1-infected chicken ileum samples and PBS-inoculated control ileum samples (three samples per group at 1 dpi) were selected for analysis of the ChERV transcriptome. These six cDNA libraries were designated GGA_HP_ileum_1_rep1 (SRA ID: ERR597332), GGA_HP_ileum_1_rep2 (ERR597329), GGA_HP_ileum_1_rep3 (ERR597323), GGA_con_ileum_1_rep1 (ERR597325), GGA_con_ileum_1_rep2 (ERR597328), and GGA_con_ileum_1_rep3 (ERR597319).
The transcriptome raw data of the ALV-J-infected spleen sample and the control sample came from our previous study and were downloaded from GEO (accession: GSE63226). The two cDNA libraries, from ALV-J-infected and uninfected 140-day-old female White Recessive Rock chickens, were designated WRR+ (GSM1544045) and WRR- (GSM1544046).
The published splenic transcriptome raw data of MDV-infected and uninfected samples at 14 dpi were selected for ChERV transcriptome analysis. The two cDNA libraries of MDV-infected and uninfected samples were designated CH-14dpi (SRA: SRX2425016) and CH-14d (SRA: SRX2425017).
The published bursa of Fabricius transcriptome raw data of APEC-infected and uninfected samples at 5 dpi were selected for ChERV transcriptome analysis. The seven cDNA libraries from chickens of the susceptible phenotype (four infected and three uninfected) were designated bursa_D5_S_rep1 (GSM1724128), bursa_D5_S_rep2 (GSM1724129), bursa_D5_S_rep3 (GSM1724130), bursa_D5_S_rep4 (GSM1724131), bursa_D5_NC_rep1 (GSM1724113), bursa_D5_NC_rep2 (GSM1724114) and bursa_D5_NC_rep3 (GSM1724115).
Analysis of transcriptome raw data
To obtain high-quality clean reads, the raw reads were filtered by removing adapter-containing reads and low-quality reads. The remaining clean reads were mapped to the chicken genome assembly (Gallus_gallus-5.0) using TopHat2 (version 126.96.36.199). The mapped reads of each sample were assembled using TopHat2, Cufflinks and Cuffmerge. Gene and ChERV abundances were quantified using RSEM, and expression levels were normalized as FPKM (Fragments Per Kilobase of transcript per Million mapped reads).
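For reference, the standard FPKM definition can be written out as a short Python sketch. This is the textbook formula, not a reimplementation of RSEM, which reports FPKM itself and bases it on an effective transcript length; the function and argument names are illustrative assumptions.

```python
def fpkm(fragment_count, length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    fragment_count         : fragments assigned to the gene or ChERV
    length_bp              : feature length in base pairs
    total_mapped_fragments : all mapped fragments in the library
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Scale by 1e3 for the per-kilobase term and 1e6 for the per-million term.
    return fragment_count * 1e9 / (length_bp * total_mapped_fragments)


# 100 fragments on a 2 kb feature in a library of 10 million mapped fragments.
example = fpkm(100, 2_000, 10_000_000)  # 5.0
```

Normalizing by both length and library size is what allows ChERV and gene abundances to be compared across features and across samples.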
To identify differentially expressed ChERVs (DEEs) and differentially expressed genes (DEGs), the edgeR package (http://www.rproject.org/) was used. ChERVs and genes with |log2FC| ≥ 1 (FC: fold change) and a false discovery rate (FDR) < 0.05 were considered DEEs and DEGs, respectively.
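The two cutoffs can be applied to a results table as in the following sketch. It assumes the log2 fold changes and FDR values have already been computed (by edgeR in the study); the tuple layout and function names are hypothetical.

```python
def is_differentially_expressed(log2_fc, fdr, min_abs_log2fc=1.0, max_fdr=0.05):
    """Apply the thresholds from the text: |log2FC| >= 1 and FDR < 0.05."""
    return abs(log2_fc) >= min_abs_log2fc and fdr < max_fdr


def select_differential(results):
    """Return names of features passing both thresholds.

    results: iterable of (name, log2_fc, fdr) tuples, a hypothetical layout.
    """
    return [name for name, log2_fc, fdr in results
            if is_differentially_expressed(log2_fc, fdr)]
```

Note that the fold-change bound is inclusive while the FDR bound is strict, so a feature with log2FC of exactly 1 passes but one with FDR of exactly 0.05 does not.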
Effect of ALV-J infection on ChERV-3 expression
To determine the effect of ALV-J infection on ChERV-3 expression, quantitative real-time PCR (qPCR) was used to detect endogenous expression of the ChERV-3 env gene in primary chicken embryo fibroblasts (CEF) at 24 and 48 hpi after ALV-J infection (SCAU-HN06, 10^5 TCID50/mL). Uninfected CEF were used as the control. The qPCR primers for ChERV-3 env were designed using the NCBI Primer-BLAST program (F: TGTCAGCGGATGTTGTGGAA; R: CATCCAGGTGTGAGGTGCTT). The GAPDH gene was used as an internal control. qPCR was performed on a Bio-Rad CFX96 Real-Time Detection System using iTaq™ Universal SYBR® Green Supermix (Bio-Rad, CA, USA) according to the manufacturer’s specifications. Data were analyzed using the 2^−ΔΔCt method.
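The relative quantification used for the qPCR data can be made explicit with a small Python sketch of the standard ΔΔCt calculation. The Ct values in the example are invented for illustration, and the function name is an assumption; GAPDH plays the role of the reference gene as stated above.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the ΔΔCt method (2 ** -ΔΔCt).

    "treated" is the infected sample and "control" the uninfected one;
    "ref" is the internal control gene (GAPDH in the text).
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2 ** -delta_delta_ct


# Invented Ct values: target 24 vs 26, reference 18 in both samples.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)  # 4.0
```

A ΔΔCt of −2 corresponds to a fourfold increase, since each cycle represents a doubling under the method's assumption of near-100% amplification efficiency.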
To further explore whether ALV-J infection affects the exogenous expression of the ChERV-3 envelope protein, the 3xflag-ERV3env sequence (Supplementary file 1) was produced by gene synthesis and inserted into the pcDNA3.1 vector between the 5′ KpnI and 3′ BamHI sites to construct the pcDNA3.1-ERV3env plasmid. DF1 cells, a chicken embryo fibroblast cell line known to be susceptible only to exogenous ALV, were obtained from ATCC (Manassas, USA). DF1 cells were cultured in 24-well plates and transfected with 0.75 µg of pcDNA3.1-ERV3env plasmid using Lipofectamine 3000 reagent (ThermoFisher, USA). Twenty-four hours later, the transfected DF1 cells were infected with 10^5 TCID50/mL of ALV-J strain SCAU-HN06. Uninfected DF1 cells transfected with pcDNA3.1-ERV3env plasmid were used as the control. At 24 and 48 hpi, exogenous expression of ChERV-3 env mRNA was analyzed by qPCR, and the envelope protein was analyzed by Western blot using a flag antibody (Beyotime, China). Data are representative of three independent experiments, each performed in triplicate.
Overexpression of ChERV-3 env gene and measurement of ALV-J infection in DF1 cells
DF1 cells were transfected with 0.75 µg of either pcDNA3.1-ERV3env plasmid or pcDNA3.1 empty vector (NC). At 24 h post transfection, the transfected DF1 cells were infected with 10^5 TCID50/mL of ALV-J strain SCAU-HN06. qPCR with primers specific to the ALV-J gp85 gene was used to measure ALV-J production at the mRNA level. The cell supernatants were tested for ALV group-specific antigen (p27) using the Avian Leukosis Virus Antigen ELISA Test Kit (Zoetis, USA) according to the manufacturer’s instructions. The results were expressed as s/p ratios, where s/p = (Sample Mean − Kit Negative Control Mean) / (Kit Positive Control Mean − Kit Negative Control Mean). All experiments were performed in triplicate.
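The s/p ratio defined above is a simple normalization of the sample reading against the kit controls. The sketch below restates that formula in Python; the function name and the optical density values used in the example are invented for illustration.

```python
from statistics import mean


def sp_ratio(sample_ods, negative_ods, positive_ods):
    """s/p = (sample mean - negative mean) / (positive mean - negative mean).

    Each argument is a list of replicate optical density readings.
    """
    neg = mean(negative_ods)
    pos = mean(positive_ods)
    if pos == neg:
        raise ValueError("positive and negative control means must differ")
    return (mean(sample_ods) - neg) / (pos - neg)


# Invented readings: sample mean 0.6, negative mean 0.1, positive mean 1.1.
ratio = sp_ratio([0.5, 0.7], [0.1, 0.1], [1.1, 1.1])  # about 0.5
```

Expressing results this way puts readings from different plates on a common scale, with 0 at the negative control and 1 at the positive control.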
Analysis of interferon-stimulated genes (ISGs) by qPCR
To analyze the effect of the ChERV-3 envelope protein on host antiviral factors, DF1 cells were transfected with 0.75 µg of either pcDNA3.1-ERV3env plasmid or pcDNA3.1 empty vector using Lipofectamine 3000 (Invitrogen, USA). Expression of chicken interferon-stimulated genes (ISGs), including STAT1, EIF2AK2 (PKR), ISG12, Mx and CH25H, was measured by qPCR at 24 h post transfection. The GAPDH gene was used as an internal control. All experiments were performed in triplicate.
Statistical comparisons were performed using GraphPad Prism 5 (GraphPad Software Inc., USA). Results are presented as means ± SEM, and statistical significance was assessed at P < 0.05 (*), P < 0.01 (**), or P < 0.001 (***).
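As an illustration of the reporting convention, the following Python sketch computes a mean with its standard error and maps a P value to the asterisk notation. It does not reproduce GraphPad Prism or any particular statistical test; the function names and the "ns" label for non-significant results are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for the SEM")
    return mean(values), stdev(values) / sqrt(len(values))


def significance_label(p_value):
    """Map a P value to the asterisk notation used in the text."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # assumed label; the text does not name one
```

With triplicate measurements, as used throughout these experiments, the SEM is the sample standard deviation divided by the square root of three.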