From a coating design perspective, knowledge discovery in databases can provide useful guidance for materials selection. To identify trends or clustering in materials property data, we constructed a database of 71 perovskite compounds (28 oxides, 7 chlorides, 8 bromides, 25 fluorides and 3 iodides) and 15 descriptors, including the ionic radii (rA, rB, rX), the lattice parameter (a), the elastic constants (Cij), the bulk modulus (B), the shear modulus (G), the hardness (H), the Cauchy pressure (Cp), the Pugh modulus ratio (B/G) and the fracture toughness (Kic). Table S1 contains the dataset used.
“t” is the tolerance factor proposed by Goldschmidt, given by:
$$t = \frac{r_A + r_X}{\sqrt{2}\,(r_B + r_X)} \qquad (4)$$
“µ” is the octahedral factor µ = rB/rX
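The two structural descriptors defined above can be evaluated directly from the ionic radii. The following is a minimal sketch; the function names are hypothetical, and the radii used for SrTiO3 are assumed Shannon values chosen for illustration, not values taken from Table S1.

```python
import math


def tolerance_factor(r_a: float, r_b: float, r_x: float) -> float:
    """Goldschmidt tolerance factor t = (rA + rX) / (sqrt(2) * (rB + rX))."""
    return (r_a + r_x) / (math.sqrt(2.0) * (r_b + r_x))


def octahedral_factor(r_b: float, r_x: float) -> float:
    """Octahedral factor mu = rB / rX."""
    return r_b / r_x


# Illustrative ionic radii in angstrom for SrTiO3 (assumed, not from Table S1).
r_sr, r_ti, r_o = 1.44, 0.605, 1.40

t = tolerance_factor(r_sr, r_ti, r_o)
mu = octahedral_factor(r_ti, r_o)
print(f"t = {t:.3f}, mu = {mu:.3f}")
```

A tolerance factor close to 1 is usually associated with an ideal cubic perovskite, which is why "t" and "µ" are used as structural descriptors.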
The Cauchy pressure is given by:
$$C_p = C_{12} - C_{44} \qquad (5)$$
The hardness is calculated according to the model of Chen et al.:
$$H = 2\left(\frac{G^{3}}{B^{2}}\right)^{0.585} - 3 \qquad (6)$$
The fracture toughness, which measures the resistance of a material against crack propagation, is calculated according to the model of Niu et al.:
$$K_{ic} = V_0^{1/6}\, G \left(\frac{B}{G}\right)^{1/2} \qquad (7)$$
where V0 is the volume per atom (in m³), and B and G are in MPa.
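The derived mechanical descriptors can be computed from the elastic data in a few lines. The sketch below assumes the Chen et al. hardness takes G and B in GPa, and that the Niu et al. fracture toughness has the form Kic = V0^(1/6) · G · (B/G)^(1/2) with V0 in m³ and G, B in MPa, which yields MPa·m^(1/2). All function names and input values are hypothetical and are not taken from Table S1.

```python
def cauchy_pressure(c12: float, c44: float) -> float:
    """Cauchy pressure Cp = C12 - C44 (same units as the inputs)."""
    return c12 - c44


def pugh_ratio(b: float, g: float) -> float:
    """Pugh modulus ratio B/G (dimensionless)."""
    return b / g


def chen_hardness(g_gpa: float, b_gpa: float) -> float:
    """H = 2 (G^3 / B^2)^0.585 - 3, with G and B assumed in GPa."""
    return 2.0 * (g_gpa**3 / b_gpa**2) ** 0.585 - 3.0


def niu_fracture_toughness(v0_m3: float, g_mpa: float, b_mpa: float) -> float:
    """Kic = V0^(1/6) * G * (B/G)^(1/2), with V0 in m^3 and G, B in MPa."""
    return v0_m3 ** (1.0 / 6.0) * g_mpa * (b_mpa / g_mpa) ** 0.5


# Hypothetical inputs for a single compound (not from Table S1).
c12, c44 = 110.0, 100.0      # GPa
b_gpa, g_gpa = 150.0, 100.0  # GPa
v0 = 12.0e-30                # m^3 per atom (12 cubic angstrom)

cp = cauchy_pressure(c12, c44)
ratio = pugh_ratio(b_gpa, g_gpa)
hardness = chen_hardness(g_gpa, b_gpa)
k_ic = niu_fracture_toughness(v0, g_gpa * 1e3, b_gpa * 1e3)  # GPa -> MPa

print(f"Cp = {cp:.1f}, B/G = {ratio:.2f}, H = {hardness:.2f}, Kic = {k_ic:.2f}")
```

Keeping the unit conversion explicit at the call site avoids mixing GPa and MPa between the hardness and fracture-toughness models.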
Principal component analysis (PCA) is used to assess the correlation between each of the descriptors input into the regression analyses and the stability of the compounds. The results of these analyses can then be compared with the predictive models to understand the physics and limitations of the models. The principal components (PCs) do not necessarily have an obvious physical meaning; rather, each is a combination of descriptors that explains the largest variation in the data. The advantage of PCA is that, since each PC uniquely captures the effect of a certain combination of relevant descriptors, a few PCs are typically sufficient for describing a system.
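As an illustration of the procedure, the sketch below performs a PCA on standardized descriptors using a singular value decomposition. It is not the authors' implementation: the data matrix is random and only mimics the shape of the dataset (71 compounds, 15 descriptors), and the loadings are assumed to be correlation-type loadings (eigenvector scaled by the square root of its eigenvalue).

```python
import numpy as np


def pca(data: np.ndarray, n_components: int = 2):
    """Return scores, loadings and explained-variance ratios of a standardized PCA."""
    # Standardize each descriptor to zero mean and unit variance.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # The SVD of the standardized matrix gives the principal directions in vt.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    eigenvalues = s**2 / (z.shape[0] - 1)
    explained = eigenvalues / eigenvalues.sum()
    # Scores: coordinates of each compound in PC space.
    scores = u[:, :n_components] * s[:n_components]
    # Loadings: correlation between each descriptor and each PC.
    loadings = vt[:n_components].T * np.sqrt(eigenvalues[:n_components])
    return scores, loadings, explained


# Hypothetical stand-in for the dataset: 71 compounds x 15 correlated descriptors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(71, 3))
mixing = rng.normal(size=(3, 15))
data = latent @ mixing + 0.3 * rng.normal(size=(71, 15))

scores, loadings, explained = pca(data)
print("variance captured by PC1 and PC2:", explained[:2].round(3))
```

The rows of `scores` are the points of a scores plot and the rows of `loadings` are the descriptor vectors of a loadings plot, so both can be drawn on the same pair of axes.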
The first analysis examined whether PCA captures the differences between the different perovskites. The resulting scores plot is shown in Fig. 1a; in this plot, the sign of each principal component has only relational meaning. PC1 (F1) captures 55.02% of the variance, whereas PC2 (F2) captures 18.15%, so the two PCs together capture ~ 73% of the variance of the data in Table S1. Therefore, a dataset of n dimensions (15 initial descriptors in this case) can be reduced to two dimensions (2 PCs) while retaining ~ 73% of the original information. This reduction in dimensionality allows trends and correlations that are “hidden” in the data to be easily visualized and described in PC space, as can be seen in Fig. 1a.
Two main clusters appear in this figure: one belonging to the oxide perovskites and one to the halide perovskites. Furthermore, within the oxide region, we observe a clear separation between the lanthanides and the transition metals. As PC1 increases, the shear modulus (G) and the fracture toughness (Kic) increase; on the other hand, as PC2 increases, B/G and H increase (see Table S1). Therefore, a simple scores plot can serve as a tool to identify compounds with interesting mechanical and structural properties.
The loadings plot corresponds to the scores plot but represents the variance among the descriptors. Figure 1b shows the loadings plot corresponding to the samples shown in Fig. 1a. The axes of the scores plot and the loadings plot are the same, so the information in the two plots can be compared directly. The angles between the vectors tell us how the descriptors correlate with one another. When two vectors are close, forming a small angle, the two variables they represent are positively correlated. If they meet at 90°, they are not likely to be correlated. When they diverge and form a large angle (close to 180°), they are negatively correlated.
The impact of a descriptor increases with its distance from the origin. Fig. 1b shows two different clusters: descriptors with a negative PC1 (a, rA, rB, µ and rX) and those with a positive PC1 (t, C12, B, C44, C11, G, H, Cp, Kic). Globally, “a” is inversely correlated with all the mechanical properties. In particular, “a” and “B” are clearly inversely correlated: as “a” increases, “B” decreases. B/G does not appear to be correlated with H (~ 90°). B/G is also inversely correlated with the octahedral factor “µ”; therefore, perovskites with a low “µ” should have a large B/G and could be more ductile. The tolerance factor “t”, in turn, is correlated with “Cp”, which means that the Cauchy pressure is highly sensitive to the crystalline structure of the perovskites.
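The angle rule can be checked numerically from the (PC1, PC2) coefficients reported for the perovskites in Eqs. (8) and (9). The sketch below is only an approximation of the full correlation, since it uses the two-dimensional projection of each descriptor; the helper name `angle_deg` is hypothetical.

```python
import math

# (PC1, PC2) loadings of selected descriptors, copied from Eqs. (8) and (9).
loadings = {
    "a": (-0.857, -0.014),
    "B": (0.977, 0.006),
    "B/G": (0.198, -0.368),
    "H": (0.585, 0.542),
    "t": (0.458, 0.543),
    "Cp": (0.590, 0.542),
}


def angle_deg(u, v):
    """Angle in degrees between two loading vectors in the PC1-PC2 plane."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(*u) * math.hypot(*v)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cosine))


for first, second in [("a", "B"), ("B/G", "H"), ("t", "Cp")]:
    angle = angle_deg(loadings[first], loadings[second])
    print(f"{first} vs {second}: {angle:.1f} degrees")
```

A small angle indicates a positive correlation, an angle near 90° indicates little correlation, and an angle near 180° indicates an inverse correlation.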
Since the relative impact of each descriptor in a loadings plot is identified by its absolute distance from the origin, we give below the PC equations as derived from the eigenvalue analysis:
$$\mathrm{PC1} = -0.717\,r_A - 0.657\,r_B - 0.688\,r_X - 0.857\,a + 0.458\,t - 0.178\,\mu + 0.905\,C_{11} + 0.922\,C_{12} + 0.882\,C_{44} + 0.977\,B + 0.901\,G + 0.198\,(B/G) + 0.590\,C_p + 0.585\,H + 0.957\,K_{ic} \qquad (8)$$
$$\mathrm{PC2} = -0.033\,r_A - 0.568\,r_B + 0.240\,r_X - 0.014\,a + 0.543\,t - 0.685\,\mu - 0.181\,C_{11} - 0.206\,C_{12} - 0.103\,C_{44} + 0.006\,B + 0.255\,G - 0.368\,(B/G) + 0.542\,C_p + 0.542\,H - 0.218\,K_{ic} \qquad (9)$$
For PC1, the coefficients of C11, C12, C44, B, G and Kic are the most important (~ 0.9), whereas for PC2, µ, H and B/G have the highest weighting (up to ~ 0.7 in absolute value). These results confirm the observations made on the scores plot of Fig. 1a.
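The dominant descriptors of a principal component can be identified by sorting its coefficients by absolute value. The sketch below does this for the PC1 coefficients of Eq. (8); the function name `rank_by_weight` is hypothetical, and the same call can be applied to any of the other PC equations.

```python
# PC1 coefficients for the perovskites, copied from Eq. (8).
pc1 = {
    "rA": -0.717, "rB": -0.657, "rX": -0.688, "a": -0.857, "t": 0.458,
    "mu": -0.178, "C11": 0.905, "C12": 0.922, "C44": 0.882, "B": 0.977,
    "G": 0.901, "B/G": 0.198, "Cp": 0.590, "H": 0.585, "Kic": 0.957,
}


def rank_by_weight(coefficients, top):
    """Return the `top` descriptor names with the largest absolute coefficient."""
    ordered = sorted(coefficients, key=lambda name: abs(coefficients[name]), reverse=True)
    return ordered[:top]


top6 = rank_by_weight(pc1, 6)
print(top6)
```

Sorting by absolute value matters because a large negative coefficient, such as that of the lattice parameter "a", contributes as strongly as a large positive one.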
Properties with similar PC values are highly correlated, while opposite PC values indicate inverse correlations. Globally, “a” and the ionic radii are inversely correlated with almost all the mechanical properties. On the other hand, C11, C44 and Kic behave in the same manner (their loadings lie very close to one another).
We have also performed a PCA calculation for 58 inverse perovskites. The resulting scores plot is shown in Fig. 2a. PC1 (F1) captures 40.47% of the variance, whereas PC2 (F2) captures 27.45%; the two PCs together capture ~ 68% of the variance of the data in Table S2. We notice three regions. Region “A” corresponds to column 2 of the periodic table (Ca3, Sr3, Ba3); within it, as PC1 decreases, the ionic radius of X increases. Region “B” corresponds to column 3 (Sc3); here, as PC1 decreases, the radius of ion A decreases (Tl, In, Ga, Al). Finally, region “C” corresponds to the other columns. On the other hand, as PC2 increases, G and H increase, whereas as PC1 increases, B/G and Kic increase. These behaviors are completely different from those observed for the perovskites.
Figure 2b displays the loadings for the inverse perovskites. “a” and “B” are clearly inversely correlated, as in the oxide perovskites. B/G appears to be inversely correlated with H (~ 180°), whereas the tolerance factor “t” is correlated with “B”. The PC equations derived from the eigenvalue analysis are:
$$\mathrm{PC1} = -0.963\,r_A + 0.213\,r_B + 0.414\,r_X - 0.899\,a + 0.460\,t - 0.148\,\mu + 0.824\,C_{11} + 0.840\,C_{12} + 0.362\,C_{44} + 0.977\,B + 0.476\,G + 0.281\,(B/G) + 0.591\,C_p - 0.310\,H + 0.783\,K_{ic} \qquad (10)$$
$$\mathrm{PC2} = -0.038\,r_A + 0.011\,r_B + 0.403\,r_X + 0.112\,a - 0.064\,t + 0.127\,\mu + 0.241\,C_{11} - 0.358\,C_{12} + 0.762\,C_{44} - 0.101\,B + 0.864\,G - 0.793\,(B/G) - 0.782\,C_p + 0.899\,H + 0.593\,K_{ic} \qquad (11)$$
For PC1, rA, a, C11, C12, C44 and B are the most important descriptors (~ 0.85), whereas for PC2, C44, G and H have the highest weighting (~ 0.8). These results confirm the observations made on the scores plot of Fig. 2a.
In this paper we are mainly interested in the ability of perovskites and inverse perovskites to deform (ductility) or to fracture (brittleness). It is well known that some of these compounds have superior mechanical properties, but almost all are brittle. Ductility arises when atoms slide past one another in a bulk solid through the motion of dislocations.
There are two independent engineering elastic moduli: the shear modulus (G) and the bulk modulus (B). These quantities can be connected to the single-crystal elastic constants using different averaging techniques. The shear modulus is an indicator of the mechanical hardness H, whereas the bulk modulus is a measure of the average bond strength of the atoms in the crystal and is proportional to the cohesive energy.
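For a cubic crystal, one common averaging technique is the Voigt-Reuss-Hill scheme. The text does not state which scheme was used, so the sketch below is an assumption for illustration only; the function names and the elastic constants are hypothetical.

```python
def cubic_bulk_modulus(c11: float, c12: float) -> float:
    """Bulk modulus of a cubic crystal, B = (C11 + 2 C12) / 3."""
    return (c11 + 2.0 * c12) / 3.0


def cubic_shear_modulus_hill(c11: float, c12: float, c44: float) -> float:
    """Hill shear modulus: arithmetic mean of the Voigt and Reuss bounds."""
    g_voigt = (c11 - c12 + 3.0 * c44) / 5.0
    g_reuss = 5.0 * (c11 - c12) * c44 / (4.0 * c44 + 3.0 * (c11 - c12))
    return 0.5 * (g_voigt + g_reuss)


# Hypothetical cubic elastic constants in GPa (not from Table S1).
c11, c12, c44 = 300.0, 100.0, 120.0

b = cubic_bulk_modulus(c11, c12)
g = cubic_shear_modulus_hill(c11, c12, c44)
print(f"B = {b:.1f} GPa, G = {g:.1f} GPa, B/G = {b / g:.2f}")
```

For a cubic crystal the Voigt and Reuss bulk moduli coincide, so only the shear modulus requires averaging.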
We present in Fig. 3 the variation of “B” versus “G” in order to reveal the ductility trend, as indicated by the “ductility” arrow. SrUO3, SrTiO3 and SrVO3 among the oxide perovskites, and NbCPt3 and SnCPt3 among the inverse perovskites, have a large “B” and “G”; this behavior is also clearly seen in the PCA results (Fig. 1a and Fig. 2a, the arrow).
In Fig. 4 we display the Cauchy pressure “C12 - C44” versus “B/G”, since the ductility trend of certain cubic materials depends on the degree of angular character of the chemical bonding. As a general observation, ductile materials have positive values of the Cauchy pressure, which correspond to more isotropic metallic bonding, whereas brittle materials exhibit negative values, which result from a more angular character of the bonding. The ratio “B/G” is likewise considered a measure of the ductile versus brittle behavior of solids: ductility is characterized by a high “B/G” ratio (> 1.75), while a low “B/G” is representative of brittleness. We observe that CaCrO3 and SbNNi3, for the oxide perovskites and inverse perovskites respectively, have a large “B/G” and “Cp”. This is also clearly seen in the PCA results, since these compounds are isolated from all the other materials (Fig. 1a and Fig. 2a).
We also display in Fig. 5 the variation of the hardness versus the fracture toughness. The compounds that appear to combine high hardness H and high fracture toughness Kic (Fig. 5) are SnCPt3 and SrVO3 for the inverse perovskites and perovskites, respectively. However, even though CaCrO3 and SbNNi3 have a high bulk modulus (Fig. 6), they have poor hardness and fracture toughness. Based on these results, we may conclude that the mechanical properties of perovskites and inverse perovskites can be predicted from the results of the principal component analysis.
Based on these intrinsic properties of the compounds, we introduce a new criterion “B/G & Cp & H & Kic & B” (where & is the logical AND) in order to predict the most interesting compounds that could be used as thermal barrier coatings (TBC). We propose that any compound which satisfies these five conditions jointly (Cp > 0, B/G > 1.75, H > 2, Kic > 2 and B > 140) falls within a minimal “green light” region of potential TBC compounds. The perovskites and inverse perovskites which may be good TBC candidates are listed in Table S3. We notice that the inverse perovskites seem to be more reliable candidates than the oxide perovskites.
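The joint criterion amounts to a logical AND over five thresholds and can be applied as a simple filter. In the sketch below the compound names and property values are invented for illustration and are expressed in the same units as the thresholds quoted in the text; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    cauchy_pressure: float     # Cp
    pugh_ratio: float          # B/G
    hardness: float            # H
    fracture_toughness: float  # Kic
    bulk_modulus: float        # B


def is_tbc_candidate(c: Compound) -> bool:
    """True only if all five conditions of the criterion hold jointly."""
    return (
        c.cauchy_pressure > 0
        and c.pugh_ratio > 1.75
        and c.hardness > 2
        and c.fracture_toughness > 2
        and c.bulk_modulus > 140
    )


# Hypothetical entries (not from Tables S1-S3).
compounds = [
    Compound("X1", 20.0, 2.0, 5.0, 2.5, 180.0),    # meets all five conditions
    Compound("X2", -15.0, 1.5, 12.0, 3.0, 200.0),  # brittle: Cp < 0, low B/G
    Compound("X3", 30.0, 2.4, 1.5, 2.2, 160.0),    # ductile but too soft
]

candidates = [c.name for c in compounds if is_tbc_candidate(c)]
print(candidates)
```

Because the conditions are combined with AND, a compound that is very hard but brittle, or ductile but soft, is rejected.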
The remaining question is whether we can select, from the results of the principal component analysis, materials that can potentially meet the property requirements. One needs to concentrate on the region where the materials exhibit a combination of relatively high hardness and ductility. High hardness and high fracture toughness correspond to region “A” in the scores plot, whereas region “B” represents ductile compounds. The situation is reversed for the inverse perovskites. Therefore, interesting materials could lie at the frontier between these two regions.
In this article we are also interested in the possibility of making artificial materials with high hardness and fracture toughness through a thin-coating approach. This approach obtains high-hardness coatings by depositing onto the surface a repeating layered structure of two materials with nanometer-scale dimensions. These structures are called "superlattices". As studied by several authors [47–49], superlattices are characterized by the distance between each successive pair of layers, “d”, which is known as the "bilayer repeat period". Xi Chu et al. have demonstrated that the hardening effect of the interfaces is reduced when the layers are narrow. They explained that the decrease in hardness at large “d” is due to dislocations moving within individual layers, since they are not able to cross the interfaces.
In this work we use the approach introduced by Koehler, who first suggested in 1970 that a high-strength material could be obtained by fabricating a layered structure of two materials with the same crystal structure. The interfaces between the layers act as barriers to the motion of dislocations, and restricting this motion strengthens the material. If a dislocation moves into a layer with a higher shear modulus, the strain energy increases. In a superlattice (A/B), this induces a repulsive force that increases as dislocations in a layer “B” with a smaller modulus, GB, approach the interface with a layer “A” with a larger modulus, GA. According to Koehler's model, the critical stress required to move a dislocation across an abrupt interface is proportional to:
$$Q = \frac{G_A - G_B}{G_A + G_B} \qquad (12)$$
Therefore, a superlattice in which the difference in shear modulus between the two layers, “ΔG”, is large will have a large critical stress and thus a large hardness enhancement.
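Koehler's factor of Eq. (12) can be used to rank candidate layer pairs. The following sketch uses invented shear moduli and placeholder material names (M1, M2, M3), none of which come from Table S4; the function name `koehler_q` is hypothetical.

```python
from itertools import combinations


def koehler_q(g_a: float, g_b: float) -> float:
    """Q = (GA - GB) / (GA + GB), taking GA as the larger shear modulus."""
    g_high, g_low = max(g_a, g_b), min(g_a, g_b)
    return (g_high - g_low) / (g_high + g_low)


# Hypothetical shear moduli in GPa for three placeholder materials.
shear_moduli = {"M1": 150.0, "M2": 50.0, "M3": 140.0}

# Rank every pair by decreasing Q: the larger Q, the larger the expected
# critical stress for moving a dislocation across the interface.
ranked = sorted(
    ((koehler_q(shear_moduli[a], shear_moduli[b]), a, b)
     for a, b in combinations(shear_moduli, 2)),
    reverse=True,
)

for q, a, b in ranked:
    print(f"{a}/{b}: Q = {q:.3f}")
```

Pairs with nearly equal shear moduli, such as M1/M3 in this example, give a Q close to zero and would be expected to show little hardness enhancement.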
Since superlattices based on perovskites or inverse perovskites remain almost unexplored, we may ask the following question: do perovskites or inverse perovskites offer a better material combination for superlattice (SL) coatings?
The information presented in Figs. 3–5 is used to design superlattice hard coatings. As discussed, pairs of materials with a large “Q” should be considered for synthesizing superlattice coatings in order to achieve effective hardness enhancement. Applying this criterion to the calculated perovskites and inverse perovskites, we may anticipate from Table S4 that the superlattices CaZrO3/SrNbO3, CaZrO3/BaRuO3, CaZrO3/CaMoO3, NbCPt3/AlCPd3, NbCPt3/GaNNi3, NbCPt3/ZnNNi3 and NbCPt3/ScCPt3 have good potential to offer enhanced hardness; in contrast, superlattices based on halides would show marginal hardness enhancement due to their small ΔG. These perovskites and inverse perovskites all have a cubic structure according to the values of their tolerance and octahedral factors (Table S4). However, we observe a large lattice mismatch for CaZrO3/CaMoO3, NbCPt3/GaNNi3 and NbCPt3/ZnNNi3, which may further enhance the hardness of these superlattices. These analyses are purely predictive and should be supported experimentally.
The PCA results can also be used to anticipate potential materials for superlattice hard coatings. Any combination of a material from region A as template and a material from region B as substrate could give interesting superlattices, e.g. CaCrO3/SrVO3 or SnCPt3/SbNNi3. From the different calculations of ΔG and the position of each compound in the PCA scores plots, we notice that the distance between two materials from different clusters is correlated with ΔG: as this distance increases, ΔG increases.
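One way to check such a relationship on a given dataset is to compute the Euclidean distance between compounds in the scores plot and correlate it with ΔG. The sketch below uses entirely invented scores and shear moduli, constructed so that G follows PC1; it shows how the check could be done and is not evidence for the correlation itself.

```python
import numpy as np

# Hypothetical (PC1, PC2) scores and shear moduli G in GPa for two clusters.
region_a = {"A1": ((3.0, 0.5), 150.0), "A2": ((4.0, -0.2), 175.0), "A3": ((5.0, 0.8), 210.0)}
region_b = {"B1": ((-3.0, 0.1), 30.0), "B2": ((-2.0, -0.5), 55.0), "B3": ((-1.0, 0.4), 80.0)}

distances = []
delta_g = []
for score_a, g_a in region_a.values():
    for score_b, g_b in region_b.values():
        # Distance between the two compounds in the PC1-PC2 plane.
        distances.append(float(np.hypot(score_a[0] - score_b[0], score_a[1] - score_b[1])))
        # Difference in shear modulus between the two candidate layers.
        delta_g.append(abs(g_a - g_b))

# Pearson correlation between score-plot distance and shear-modulus difference.
r = np.corrcoef(distances, delta_g)[0, 1]
print(f"Pearson r between distance and delta G: {r:.3f}")
```

On real data, a correlation coefficient close to 1 would support reading ΔG directly from the separation of two compounds in the scores plot.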
Therefore, the logic presented here can be applied to any system with any number of samples and descriptors. Combining informatics with calculated data and physical properties will allow for the greatest understanding of structure-property relationships. With that knowledge, materials can then be engineered to maximize the desired properties. The use of PCA here demonstrates how informatics can be used to screen information to determine what is necessary and useful, and then to use that knowledge in experimental, computational, and materials design.
In this work we have analyzed perovskite and inverse perovskite compounds using multivariate analysis. This work helps to develop a method for visually interpreting a PCA plot based on the distances between the different perovskite compounds in the different plots. We have explained, through the correlation between the PCA results and the variation of ΔG, how to design better superlattice hard-coating materials. Thus, we expect that a simple visual inspection of the PCA plots, with respect to the position of any perovskite compound in these plots, will give insight into the variation of ΔG.