Characteristics of the dogs in the study
Details of the animals used in the study are presented in Table 1. There was no significant difference in age between the two diseased groups, but both diseased groups were significantly older than the normal group (P<0.001). Both sexes were represented in all groups and gender did not have an effect on gene expression, based on principal components analysis and network analysis (Figure 1A). There were six valves with Grade 3 and 4 MMVD from CKCS and five Grade 3 and 4 valves were from other breeds (terrier breeds and Border Collies) (Figure 1 B and C). The normal dogs were Beagles and mixed breeds (Table 1).
Network analysis of samples
To visualise the expression data, we used the network visualisation and analysis tool BioLayout (http://biolayout.org). Initially we wished to ascertain whether the samples could be grouped on the basis of disease status or breed. We therefore created a three dimensional sample-to-sample weighted network graph, based on a pairwise matrix of Pearson correlation coefficients. To minimise noise we filtered out low expression genes (normalised relative expression < 15). The graph was laid out at a Pearson correlation coefficient threshold of 0.98 (see Methods). This showed that expression patterns of all grade 3 and 4 samples were separated from the expression patterns of the normal samples (Figure 1B), and there was also a clear separation of CKCS diseased valve samples away from the non-CKCS diseased samples, with the non-CKCS diseased valve samples closer to the normal valves in the network than were the CKCS diseased valves (Figure 1C). This suggested that the diseased valves from non-CKCS and CKCS dogs had distinct patterns of gene expression.
To explore the transcriptomic differences between diseased valves from CKCS and other breeds and from normal valves, we used BioLayout to construct a gene co-expression network (GCN)  from the same expression data, using a threshold correlation coefficient of 0.9. The analysis included 7,247 genes making 20,612 edges. Nodes within the network were then clustered based on the similarity of expression pattern using the Markov Clustering (MCL) algorithm at an inflation value of 1.7, as implemented in BioLayout. This is a hypothesis-free approach, where the number of clusters is not constrained. In this analysis we reviewed all clusters with at least 5 nodes (probesets with expression patterns correlated at r ≥ 0.9). As we have seen in other studies [29, 30] there was considerable individual dog-specific variation and many clusters were driven by high expression of a subset of genes in a single or small number of individuals, independent of disease status. These clusters were not analysed further. However, there were also clusters showing overall high or low expression in the majority of diseased valves and a small number where expression was high or low only in CKCS. This was consistent with the sample-to-sample analysis and suggested that there were indeed sets of differentially expressed genes that distinguished the normal and diseased valves. The largest component of the network graph is shown in Figure 2A and clusters that demonstrate differences between diseased and normal valves are shown in Figure 2B. Histograms indicate the average expression of genes in the clusters; gene lists for these clusters and enlarged images of the histograms are presented in Additional File 1.
Having identified clusters that showed expression patterns associated with disease status (Figure 2B), we searched them for proliferation genes and found that there was no association of cell division markers  with disease or breed status, suggesting that perturbed patterns of cell division may not be a consistent feature of MMVD in either CKCS or other breeds. In addition, mitochondrial genes [32, 33] were spread amongst a number of clusters and showed no association with breed or disease status, suggesting that mitochondrial dysfunction does not make a major contribution to MMVD. Since immune response had been identified as altered in previous studies [18, 19] we also looked for clusters associated with changes in expression of immune/inflammatory genes . Consistent with reports of an immune involvement in MMVD, a small number of macrophage and T-cell genes were in cluster011 (higher in all diseased valves; Figure 2B and Supplementary File 1). However, most of the genes associated with immune function were found in different clusters and showed no correlation with disease or breed status.
As mentioned above, the majority of the clusters had expression patterns driven by a single sample, and would not reveal functions likely to be informative for MMVD aetiology in general. We therefore subjected only the genes within the disease associated clusters shown in Figure 2B to enrichment analysis using DAVID (see Methods). Cluster002, which was a large group of genes apparently down-regulated in CKCS, was analysed separately, while the genes in other down-regulated clusters and the up-regulated clusters were each pooled because they showed similar average expression patterns and were close to each other in the network. This ensured that the groups were of sufficient size for meaningful analysis. The gene lists for the three analyses are given in Additional file 1, Table S1 which also shows the histograms of average gene expression for each cluster.
For Cluster002, which contained genes that were lower in CKCS valves than those from other diseased dogs or normal dogs, there was enrichment of Biological Process (BP) terms related to muscle structure and activity, cardiac conduction and calcium ion release. For Cellular Component (CC), terms related to sarcolemma were enriched. The top ten GO terms are presented in Additional file 1, Table S2. For the group of clusters in which genes were generally down-regulated in all diseased valves, DAVID GO enrichment analysis found enrichment for extracellular matrix and cell surface terms. For the clusters which contained genes that were up-regulated in diseased valves, there was enrichment of GO terms related to inflammatory response, monocyte chemotaxis and extracellular exosome.
Comparison of CKCS diseased valve transcriptome with normal valve transcriptome
The BioLayout analysis identified groups of genes that appeared to be up- or down-regulated in diseased valves, as well as a cluster of genes that were down-regulated only in CKCS valves. The analysis revealed that there were many genes showing idiosyncratic sample-specific expression patterns, which may have concealed some meaningful differences and resulted in the paucity of significant GO term enhancement. We therefore generated lists of differentially expressed genes (DEG) with the Affymetrix Transcription Analysis Console, using a one-way between-subject ANOVA (unpaired), which takes into account variance among samples in calculating the p-value and determines a false discovery rate (FDR) to allow for multiple comparisons. A threshold FDR q-value of 0.05 was used in the analyses.
To confirm the findings of the GCN analysis, an initial comparison was made between CKCS diseased valves and normal valves. Using a fold change of at least 1.5 in either direction and an FDR q-value < 0.05, transcripts detected by 755 probesets were differentially expressed, representing 599 annotated genes and a number of unannotated probesets. 271 annotated genes were higher in CKCS (up-regulated) and 328 genes were lower in CKCS (down-regulated) (Figure 3A, full list of genes in Additional file 2, Table S3). These 599 genes were analysed using DAVID GO enrichment analysis. For the down-regulated genes, there was enrichment of similar GO terms to those found for Cluster002 of the GCN analysis, as shown for the top ten enriched GO terms in Additional file 2, Table S4. The enriched GO terms were related to cardiac muscle cell function and structure and calcium channel activity. For up-regulated genes terms related to immune response and ERK1/ERK2 activity were listed.
Comparison of CKCS diseased valve transcriptome with transcriptome of diseased valves from other breeds
The sample-to-sample analysis suggested that gene expression in CKCS diseased valves was different from that in diseased valves from other breeds. Therefore we next compared the two sets of diseased valves. 161 annotated genes were differentially expressed, 27 with higher expression in CKCS valves than other diseased valves and 134 with lower expression in CKCS valves (Figure 3B, Additional file 3, Table S5). For the genes that were lower in CKCS diseased valves than other breed diseased valves, the GO terms were similar to those distinguishing CKCS from normal valves, with an emphasis on cardiac muscle structure and function (Additional file 3, Table S6). A single GO term was enriched for the genes that were higher in CKCS valves than other breed diseased valves.
Eighty-one genes were downregulated in CKCS diseased valves compared both to normal valves and to diseased valves from other breeds, suggesting that these represent a breed-specific effect, which may or may not be related to the disease. Similar GO terms were enriched in this set of down-regulated genes, primarily terms associated with cardiac development and function (Additional file 4, Table S7 and Table S8). This suggests that CKCS have abnormalities of expression of genes involved in cardiac development and function which may be related to the early onset of MMVD. Nineteen genes were up-regulated in CKCS compared with both normal valves and diseased valves from other breeds. The genes are shown in Additional file 4, Table S7. A single GO term was enriched, as for the comparison between CKCS and other breed diseased valves (Additional file 4, Table S8).
Comparison of diseased valve transcriptome with normal valve transcriptome
The BioLayout sample-to-sample network showed that the non-CKCS diseased valves were close in gene expression pattern to the normal valves (Figure 1C), and there were no significant DEGs when comparing these two groups at the stringency used for the other comparisons (FDR q-value < 0.05). Consistent with this, the volcano plot for this comparison showed that the fold changes were lower, the p-values were higher and the genes were less scattered than in the other plots, supporting the observation that the normal diseased and non-CKCS diseased valves were closer in gene expression pattern than either were to the CKCS valves. Relaxing the stringency of the analysis (unadjusted p < 0.05, no FDR correction; Figure 3C) showed that there were 278 differentially expressed genes, of which 166 were higher in the diseased valves and 112 were lower. For this lower stringency set, there were several enriched GO terms that overlapped with the set produced comparing CKCS and normal valves (Additional file 6, Table S11).
The network cluster analysis showed that there were a number of genes where the majority of diseased valves were different from the normal valves. To examine further whether there were genes that were differentially expressed in both sets of diseased valves, we generated a list of DEGs comparing all diseased valves with normal valves (FDR q-value < 0.05, fold change at least 1.5 in either direction) (Additional file 5,Table S9). 106 genes were differentially expressed, 50 with lower expression and 56 with higher expression in diseased valves than normal valves (Figure 3D). Enrichment of GO terms for the down- and up-regulated genes is shown in Additional file 5, Table S10. For the up-regulated genes, terms related to skeletal system and mesenchyme migration were found, consistent with the increased EndoMT in the diseased valve. For the downregulated genes, the term calcium ion binding was enriched, supporting the idea that calcium homeostasis is perturbed in MMVD.
Pathway analysis of differentially expressed genes
The online Ingenuity Pathway Analysis tool (IPA) includes a database of genes/proteins in pathways built mostly from human and laboratory animal data. Analysis with IPA identified 77 canonical pathways when comparing the transcriptome of normal valves to CKCS valves, 28 when comparing CKCS valves to non-CKCS diseased valves and 56 when comparing all diseased valves with normal valves. The top three pathways for each comparison are shown in Table 3 and the top four upstream regulators for each dataset comparison, with their associated Z-score indicated predicted activation or inhibition, are shown in Table 4. Of particular note are changes in calcium signaling in the CKCS compared to both normal and non-CKCSs datasets (Tables 4 and 5). For calcium signaling there were 23 DEGs (17 lower in CKCS) comparing CKCS with normal, 10 DEGs (all lower in CKCS) comparing CKCS and non-CKCS and 5 DEGs (1 lower in diseased valves) comparing all diseased valves to normal. Shared DEGs included genes associated with calcium homeostasis, non-canonical TGFβ signalling pathways (ERK1/2, IP3, RhoGTPase), cytoskeleton, muscle contraction (including cardiomyocytes), and the calcineurin/NFAT (nuclear factor of activated T cells) pathway. IPA analysis also found that hepatic fibrosis/hepatic stellate cell activation was changed in CKCS and all diseased valves compared to normal. This association was also found in the lower stringency analysis of normal valves with non-CKCS diseased valves. In the hepatic fibrosis/hepatic stellate cell activation gene dataset DEGs included genes associated with collagen homeostasis, cytoskeleton and cell growth and differentiation. Fibroblasts from different organs have different phenotypes and the relevance to mitral VICs is unclear.
A curious observation identified was comparing CKCS to normal IPA predicted up-stream regulator analysis tended to be activated (positive activation Z-score in IPA), but the predicted regulators tended to be inhibited in CKCS relative to non-CKCS (negative activation Z-score). However, these differences in themselves are not the main biological interest, but which regulators are predicted to be affected and whether activated or inhibited. For CKCSs compared to normal there were 1831 molecules with F2 (prothrombin) signalling pathway having the strongest association, but also TNF and TGFβ1 signalling featured in the top four upstream regulators. Comparing the two disease datasets, 377 molecules were associated, with the top upstream regulator being myocyte-specific factor 2c (MEF2C). The down-stream effects, as predicted by IPA, for TGFβ1, MEF2C and F2 (prothrombin) are shown in Additional file 6, Figure S1 A-C.
The top five disease and function annotations assigned to the datasets by IPA and an illustrative graphical representation are shown in the Additional file 7, Table S11 and Figure S2. Comparing the datasets, these annotations generally matched to the same general themes, but identifying the subtle differences required examination of the graphic representations, with the illustrative example for Skeletal and Muscular Disorders and Developmental Disorder and Hereditary Disorder, comparing CKCS and non-CKCS (Additional file 7, Figure S2).