Feature extraction of LRLO charge curve using Principal Component Analysis. Principal Component Analysis (PCA), an unsupervised learning algorithm known for reducing multidimensional data into fewer dimensions while minimizing information loss, was applied to analyze a dataset of over 20,000 charge cycles from LRLO cells, each cycle with unique physical characteristics and varying states of performance degradation (Fig. 1b). By reducing correlations between components, PCA successfully identifies a set of orthogonal, or independent, features within the data. [34, 35] This approach is based on the premise that the electrochemical properties of materials are controlled by a limited number of important independent factors. [36] By applying PCA, we aimed to simplify the complex dataset into a more interpretable form by highlighting the features that most significantly explain the variance in the charge curves.
Figure 1a illustrates the process of extracting principal components and their weights from the charge curves and then reconstructing the original curves from these components. The principal components that most efficiently capture the variance in the charge curves are shown in Fig. 1c. These components, which resemble V-I curves, denote charge variations across voltage ranges, similar to the dQ/dV curves often used in battery research. [37, 38, 39] This similarity can help researchers intuitively understand each component in terms of redox potentials. The demonstrated independence of these components, as shown in Fig. 1d, ensures an accurate representation of the electrochemical behavior of the dataset by reducing correlations.
Using the top seven components as the shared features of LRLO cells enabled accurate reconstruction of the original charge curves from their corresponding weights, with an average error of 0.16 mAh/g (and 0.32 mAh/g for 95% of the data), as shown in Fig. 1e (see Supplementary Section 1, Figures S3 and S4, and Table S1 for the accuracy as a function of the number of components used). This reconstruction accuracy demonstrates the efficiency of PCA in capturing the essential dynamics of LRLO cell cycles. In the following section, the physical significance of these principal components will be explored to further understand their influence on the electrochemical properties of LRLO cells.
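The extraction and reconstruction steps described above can be sketched with a singular value decomposition. This is a minimal illustration under stated assumptions, not the implementation used in this work: the synthetic sigmoid-shaped curves, the voltage grid, and the function names fit_pca, project, and reconstruct are hypothetical and introduced here only for demonstration.

```python
import numpy as np

def fit_pca(curves, n_components=7):
    """Return (mean_curve, components) for curves of shape (n_cycles, n_points)."""
    mean_curve = curves.mean(axis=0)
    # SVD of the centered data; the rows of vt are orthonormal principal components.
    _, _, vt = np.linalg.svd(curves - mean_curve, full_matrices=False)
    return mean_curve, vt[:n_components]

def project(curves, mean_curve, components):
    """Weights of each curve on each component."""
    return (curves - mean_curve) @ components.T

def reconstruct(weights, mean_curve, components):
    """Rebuild curves from their weights."""
    return mean_curve + weights @ components

# Synthetic stand-in for measured charge curves (assumption, not real data).
rng = np.random.default_rng(0)
voltage = np.linspace(3.0, 4.5, 151)                      # V
n_cycles = 200
centers = rng.uniform(3.6, 4.0, size=(n_cycles, 1))       # V
widths = rng.uniform(0.08, 0.20, size=(n_cycles, 1))      # V
capacity = rng.uniform(150.0, 250.0, size=(n_cycles, 1))  # mAh/g
curves = capacity / (1.0 + np.exp(-(voltage - centers) / widths))

mean_curve, components = fit_pca(curves, n_components=7)
weights = project(curves, mean_curve, components)
recon = reconstruct(weights, mean_curve, components)
mae = np.abs(recon - curves).mean()
print(f"mean absolute reconstruction error: {mae:.4f} mAh/g")
```

Because the components are orthonormal, projection and reconstruction are plain matrix products, and each curve is summarized by seven weights.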
Physical interpretation of principal components. A comprehensive correlation analysis was performed to elucidate the relationship between PCA-derived features and the electrochemical performance of LRLO materials. Principal components identified without prior consideration of electrochemical properties, such as capacity and average voltage, showed significant correlations with these electrochemical performance metrics (Fig. 2a). In particular, the first principal component (PC1) showed a strong correlation with both capacity and average voltage, while the second principal component (PC2) showed a weak correlation with average voltage (see Supplementary Section 2 for detailed relations). Together, these two principal components account for 97% of the explanatory power, indicating that PC1 and PC2 capture most of the characteristics of the electrochemical behavior (Figure S5).
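A correlation analysis of this kind can be sketched as follows. The example is illustrative only: the synthetic curves, the definition of average voltage as the charge-weighted mean voltage, and the helper names average_voltage and pearson_with_metrics are assumptions made here, not details taken from this work.

```python
import numpy as np

def average_voltage(voltage, charge):
    """Charge-weighted mean voltage, sum(V dQ) / sum(dQ), for each curve."""
    dq = np.diff(charge, axis=-1)
    v_mid = 0.5 * (voltage[1:] + voltage[:-1])
    return (v_mid * dq).sum(axis=-1) / dq.sum(axis=-1)

def pearson_with_metrics(weights, metrics):
    """Pearson r between each PC weight (rows) and each metric (columns)."""
    n_pc = weights.shape[1]
    corr = np.corrcoef(np.hstack([weights, metrics]), rowvar=False)
    return corr[:n_pc, n_pc:]

# Synthetic stand-in for measured charge curves (assumption, not real data).
rng = np.random.default_rng(1)
voltage = np.linspace(3.0, 4.5, 151)
centers = rng.uniform(3.6, 4.0, size=(300, 1))
scale = rng.uniform(150.0, 250.0, size=(300, 1))
curves = scale / (1.0 + np.exp(-(voltage - centers) / 0.12))

# PCA via SVD; weights on the first two components.
centered = curves - curves.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
weights = u[:, :2] * s[:2]
explained = s**2 / np.sum(s**2)   # explained variance ratio per component

capacity = curves[:, -1] - curves[:, 0]
avg_v = average_voltage(voltage, curves)
table = pearson_with_metrics(weights, np.column_stack([capacity, avg_v]))

print("explained variance, PC1 + PC2:", explained[:2].sum())
print("rows = PC1, PC2; columns = capacity, average voltage")
print(table)
```

The sign of each correlation depends on the arbitrary sign of the corresponding component, so only the magnitude is meaningful without further convention.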
The correlation between principal components and electrochemical performance was found to be remarkably consistent throughout the cycling phases (Fig. 2b, c, S6). This was further validated by detailed correlation analyses comparing materials at cycle 10 (minimal performance degradation) and cycle 50 (extensive performance degradation) (Figure S7-9). Contrary to the expectation that the factors determining the performance of initial and degraded samples might be different,[40] the correlations of PC1 with capacity and average voltage at these two cycle stages were surprisingly similar. This finding suggests a fundamental underlying physics governing both the initial performance and the degradation process of lithium-rich materials. The cycle-independent correlation between key components and key performance metrics challenges the conventional understanding of the performance degradation of lithium-ion materials. This consistency implies that the same physical principles affect both the initial electrochemical performance and the response to degradation. Such insights are valuable for LRLO material design, suggesting that optimizing materials for initial performance may inherently provide benefits with respect to degradation.
To understand the physical meaning of the principal components, we examined how different weights assigned to these components affected certain characteristics observed in our data. Specifically, we examined the relationship between the weight of the first principal component (PC1) and battery capacity, as shown in Fig. 2d. This figure shows that a lower PC1 weight is associated with an increase in charge capacity and vice versa. This correlation suggests that variations in charge capacity are associated with changes in battery behavior at intermediate voltages, particularly around 3.8 V. Further analysis of the derivative of charge with respect to voltage (dQ/dV) curves, shown in Fig. 2e, allows us to identify the factors that cause shifts in the positions of the reactions that occur at intermediate voltages. We observe that an increase in the weight of PC1 not only shifts these reaction peaks to higher voltages, but also diminishes reactions at lower voltages, around 3.4 V, contributing to a decrease in overall capacity. These shifts can indicate increased resistance within the cell. [41, 42] The increase in resistance indicated by these voltage shifts adversely affects the oxygen redox reaction. This reaction, which is particularly limited by its kinetic properties, is further impaired as resistance increases, resulting in a noticeable decrease in capacity. [43, 44] The underlying factors of this increased resistance include material properties such as surface area [45, 46] and conductivity, [47] as well as degradation factors such as the formation of a solid-electrolyte interphase (SEI) layer and the occurrence of side reactions.[48, 49]
Charging curves with the same PC1 weight but significantly different PC2 weights differ in average voltage rather than capacity, as illustrated in Fig. 2f. The dQ/dV curves in Fig. 2g show that the PC2 weight is related to the amounts of low-voltage reaction and intermediate-voltage reaction. Specifically, a lower PC2 weight results in a smaller low-voltage reaction and a larger intermediate-voltage reaction, while a higher PC2 weight results in the opposite. The inverse movement of the low-voltage and intermediate-voltage reaction heights suggests their relevance to average voltage rather than capacity.
The variation of low-voltage and intermediate-voltage reaction heights in opposite directions with changes in PC2 weight could be related to the difference in cation/anion redox ratios or Mn reduction. Material factors could include composition,[50, 51] while degradation factors could include the redox involvement of Mn due to oxygen gas evolution.[6, 9, 52, 53] Extending these findings, the first and second principal components provide a lens through which specific physical phenomena affecting battery performance can be distinguished, particularly the thermodynamic and kinetic properties associated with oxygen and manganese redox in the low-voltage region. The charging process of LRLO cells typically involves three reaction domains: an intermediate-voltage reaction associated with the oxidation-reduction of Ni and Co, a high-voltage reaction driven primarily by oxygen redox, and a more complex low-voltage reaction involving both oxygen and Mn3+/4+ redox. [43] Separation of oxygen and Mn redox in the low-voltage reaction region usually requires careful absorption spectrum analysis. [54] The principal components derived in this study, which capture correlations between reactions in the low-voltage region and those in other voltage regions, can help isolate the low-voltage reaction. The increase/decrease in low-voltage reaction by PC1 is paired with the upward/downward shift of the intermediate-voltage reaction, while the increase/decrease in low-voltage reaction by PC2 is paired with the weakening/strengthening of the intermediate-voltage reaction intensity. Matching these observations with known facts, the changes in low-voltage reaction caused by PC1 are associated with the deactivation of oxygen redox due to morphological degradation. In contrast, the changes in low-voltage reaction caused by PC2 are attributed to the activation of Mn redox by TM migration.
This detailed analysis confirms that the changes in size, position, and height of intermediate-voltage and low-voltage reaction affected by the weights of PC1 and PC2 play a crucial role in defining charge capacity and average voltage. This strengthens the link between the data-driven PCA model and specific physical phenomena governing battery performance, and improves our understanding of electrochemical kinetics in LRLO electrodes.
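The way a component weight translates into a change in the dQ/dV curve can be illustrated with a small sketch. Everything here is hypothetical: the sigmoid mean curve and the hand-built component shift_component are stand-ins chosen so that a larger weight moves the reaction peak to higher voltage, mimicking the qualitative PC1 behavior described above; they are not the components obtained in this work.

```python
import numpy as np

voltage = np.linspace(3.0, 4.5, 151)  # V

# Hypothetical mean charge curve with one reaction centered at 3.8 V.
mean_curve = 200.0 / (1.0 + np.exp(-(voltage - 3.8) / 0.1))  # mAh/g

# Hypothetical unit-norm component: to first order, adding it to the mean
# curve shifts the curve (and its dQ/dV peak) toward higher voltage.
shift_component = -np.gradient(mean_curve, voltage)
shift_component /= np.linalg.norm(shift_component)

def dqdv(voltage, mean_curve, component, weight):
    """dQ/dV of the curve rebuilt from the mean and one weighted component."""
    return np.gradient(mean_curve + weight * component, voltage)

def peak_voltage(voltage, mean_curve, component, weight):
    """Voltage at which dQ/dV is largest for the given weight."""
    return voltage[np.argmax(dqdv(voltage, mean_curve, component, weight))]

for w in (-100.0, 0.0, 100.0):
    v_peak = peak_voltage(voltage, mean_curve, shift_component, w)
    print(f"weight {w:+7.1f} -> dQ/dV peak at {v_peak:.2f} V")
```

Since differentiation is linear, the dQ/dV curve for any weight is the dQ/dV of the mean curve plus the weight times the derivative of the component, which is why each component can be read as a characteristic change in the dQ/dV profile.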
Deciphering degradation behavior of LRLO cells using PCA. Tracking the key components of the degradation process allows understanding and explanation of the degradation behavior. X-ray-based diffraction and absorption analyses are instrumental in elucidating the degradation mechanisms of electrode materials and play a crucial role in revealing the mechanisms of LRLO materials. However, tracking the continuous degradation process of LRLO materials poses significant challenges due to the complex crystal structures, domains, and composite structures of the materials, which often obscure clear analytical insights. [55, 56] To overcome these limitations, PCA was used as a model that can explain the underlying physical phenomena of electrode degradation without relying directly on analytical techniques. As shown in Fig. 3a, the PCA model reveals distinct patterns of cyclic changes in PC1 and PC2 in different cells.
Most LRLO cells exhibit similar degradation patterns, with the behavior shown in Fig. 3b serving as a representative example. This pattern, characterized by an initial increase in PC2 during step i, followed by a simultaneous increase in both PC1 and PC2 in step ii, is consistent with findings from previous studies supported by controlled experiments and advanced analyses. [57] The increase in PC2, as shown in step i of Fig. 3c, corresponds to a decrease in average voltage. This decline indicates a decrease in redox peak heights coupled with an increase in low-voltage reaction, as shown in Fig. 3d. This change suggests an increase in manganese redox activities, along with a decrease in nickel and cobalt redox activities, due to structural degradation. [6, 9, 52, 53] Furthermore, the simultaneous increase in PC1 and PC2 in step ii, suggesting an increase in redox peak positions, as shown in Fig. 3e, is indicative of an increase in material resistance. [41]
Cases where only PC2 increases (Fig. S10) and where both PC1 and PC2 increase from the beginning (Fig. S11) are also observed, which basically reflect the typical pattern, but with subtle differences that are inferred to be due to the characteristics of the material. [58] These subtle differences are speculated to complicate mechanism research. Due to incomplete data measurement and labeling for all samples, comparing mechanisms based on the physical and chemical properties of the samples is beyond the scope of the current research and remains a direction for future studies.
In addition, Figure S12 highlights a rare pattern characterized by a decrease in both PC1 and PC2 values. This pattern, associated with an increase in charge capacity, can be interpreted as an activation process of anion redox. Despite efforts to exclude the initial 9 cycles involved in activation from the dataset, it appears that data reflecting prolonged activation have been included. [59] This behavior suggests that certain materials can have long-lasting activation, and such materials can be detected by PCA. In addition, the changes in PC1 and PC2 values during the activation process, which are opposite to those during the degradation process, indicate that activation and degradation are electrochemically opposite phenomena.
Predictive Modeling of LRLO Charge Curves with PCA. Figure 4a illustrates a methodological framework for applying principal components to accurately predict the charge curves of unknown LRLO cells beyond the scope of the initial training dataset. The utilization of linear regression to estimate the weights of the components from the PCA model facilitates the prediction of complete charge curves based on these specific components. The accuracy of the prediction is affected by the voltage window selected for prediction, as well as the number of PCA components used. The use of up to the first 7 principal components was found to be optimal for prediction accuracy, as detailed in Table S2. This finding is consistent with the earlier assertion that the electrochemical properties of materials are governed by a limited number of significant independent factors. Using an excessive number of components leads to over-fitting of the model to the training data, thereby reducing the prediction accuracy of the model.
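One plausible reading of this framework is an ordinary least-squares fit of the component weights using only the points inside the measured voltage window, followed by reconstruction over the full range. The sketch below follows that reading as an assumption; the synthetic curves, the choice of three components, the 3.3 V to 3.9 V window, and the names make_curves, fit_pca, and predict_full_curve are hypothetical. It also assumes that the partial charge values share the absolute scale of the training curves.

```python
import numpy as np

def make_curves(rng, n, voltage):
    """Synthetic sigmoid charge curves (stand-in for measured data)."""
    centers = rng.uniform(3.6, 4.0, size=(n, 1))
    scale = rng.uniform(150.0, 250.0, size=(n, 1))
    return scale / (1.0 + np.exp(-(voltage - centers) / 0.12))

def fit_pca(curves, n_components):
    mean_curve = curves.mean(axis=0)
    _, _, vt = np.linalg.svd(curves - mean_curve, full_matrices=False)
    return mean_curve, vt[:n_components]

def predict_full_curve(partial_charge, window_mask, mean_curve, components):
    """Fit component weights on the window only, then rebuild the full curve."""
    design = components[:, window_mask].T             # (n_window, n_components)
    target = partial_charge - mean_curve[window_mask]
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return mean_curve + weights @ components, weights

rng = np.random.default_rng(2)
voltage = np.linspace(3.0, 4.5, 151)
train = make_curves(rng, 200, voltage)
test = make_curves(rng, 50, voltage)                  # cells not seen in training

mean_curve, components = fit_pca(train, n_components=3)
window = (voltage >= 3.3) & (voltage <= 3.9)          # 40% of the voltage range

preds = np.array([
    predict_full_curve(curve[window], window, mean_curve, components)[0]
    for curve in test
])
rmse = np.sqrt(((preds - test) ** 2).mean())
print(f"full-curve RMSE from the 3.3-3.9 V window: {rmse:.3f} mAh/g")
```

When the window covers the full range, the least-squares weights coincide with the ordinary PCA projection because the components are orthonormal. With narrower windows the fit becomes less well conditioned as more components are added, which is one way the over-fitting noted above can arise.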
Figure 4b shows the variation in charge curve prediction accuracy based on the size and position of the selected voltage windows. Utilizing the full charge curve range from 3.0 V to 4.5 V, covering the entire 100% length, allows predictions with an average RMSE of 0.38 mAh/g. This accuracy is not only due to the extensive data input, but also emphasizes the model’s effectiveness in accurately depicting the fundamental patterns of LRLO cells.
Charge curve prediction accuracy varies significantly depending on the size and location of the selected segment of the voltage range. In particular, a segment covering 40% of the voltage range, from 3.3 V to 3.9 V, is sufficient to make accurate charge curve predictions with an RMSE of less than 3 mAh/g, as shown in Fig. 4c. In contrast, excluding data below 3.8 V significantly reduces the prediction accuracy. Specifically, a 40% voltage window from 3.8 V to 4.4 V fails to ensure accurate charge curve predictions, likely due to the dominance of reactions occurring around 3.8 V (Figure S13).
The ability of the PCA model to generalize LRLO material behaviors is further verified by comparing the PC1-PC2 plots of the test set with those of the training set, highlighting its effectiveness in dimensional reduction for materials different from those in the training set, as shown in Figs. 4d and S14. The reconstruction accuracy of the test set is very close to that of the training set (Table S3 and Figure S15). This result highlights the model’s ability to identify outlier scenarios, that is, data points in the test set characterized by significantly large weights in both PC1 and PC2. These outliers, associated with significantly low capacity and average voltage, exhibit phase separation in XRD analyses, as shown in the inset of Fig. 4d, consistent with previous reports. [60] The PCA model’s ability to simplify the complexity of electrochemical data and easily identify anomalies underscores its practical utility.
The unique adaptability of the PCA model becomes even more apparent when test datasets, which include abnormal data, are incorporated into training. Adding data beyond the range of the existing training set does not fundamentally alter the structure of the principal components, demonstrating the model’s resilience to the incorporation of outlier data (see Supplementary Section 3 and Figures S16 and S17 for details). This resilience reduces the need to retrain the model when new data are added, thus ensuring sustained high performance. [61, 62] By seamlessly integrating additional data, including instances of phase separation identified by significant PC1 and PC2 weights, the PCA model not only maintains its core framework, but also enhances its predictive accuracy and efficiency in a dynamically evolving research environment.
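One simple way to quantify how much the principal components change when new data are added is the absolute cosine similarity between matching components before and after refitting. The sketch below is an assumed check of this kind, not the procedure of Supplementary Section 3; the synthetic base and outlier curves and the names fit_components and component_similarity are hypothetical.

```python
import numpy as np

def fit_components(curves, n_components):
    """Orthonormal principal components (rows) of the centered curves."""
    centered = curves - curves.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components]

def component_similarity(a, b):
    """Absolute cosine similarity of matching unit-norm components.

    The absolute value is used because the sign of a component is arbitrary.
    """
    return np.abs(np.sum(a * b, axis=1))

def make_curves(rng, n, voltage, center_range, scale_range):
    """Synthetic sigmoid charge curves (stand-in for measured data)."""
    centers = rng.uniform(*center_range, size=(n, 1))
    scale = rng.uniform(*scale_range, size=(n, 1))
    return scale / (1.0 + np.exp(-(voltage - centers) / 0.12))

rng = np.random.default_rng(3)
voltage = np.linspace(3.0, 4.5, 151)
base = make_curves(rng, 300, voltage, (3.6, 4.0), (150.0, 250.0))
# Hypothetical outliers: low capacity and reactions shifted to higher voltage.
outliers = make_curves(rng, 15, voltage, (4.0, 4.2), (80.0, 120.0))

before = fit_components(base, n_components=2)
after = fit_components(np.vstack([base, outliers]), n_components=2)
similarity = component_similarity(before, after)
print("similarity of PC1, PC2 before vs. after adding outliers:", similarity)
```

Values close to 1 indicate that the component shapes are essentially unchanged by the added data, while values well below 1 would signal that the added data reshaped the component.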