Serum lipidomics as diagnostic potential for metabolic-associated hepatocellular carcinoma

Metabolic-associated fatty liver disease (MAFLD) affects a growing number of people worldwide. MAFLD is associated with a spectrum of metabolic dysfunctions that can progress to hepatocellular carcinoma (MAFLD-HCC). This progression can occur in a non-cirrhotic liver and thus often escapes clinical surveillance. Using comprehensive ultra-high-performance liquid chromatography mass spectrometry, we investigated 1,295 metabolites in serum from 249 Caucasian liver patients. Here we show that MAFLD-HCC is characterized by a complete rearrangement of the serum lipidome, which distinguishes MAFLD-HCC from other HCC patients. We used machine learning to build a diagnostic model for MAFLD to MAFLD-HCC progression. We quantified predictive metabolites and adjusted the model into the MAFLD-HCC Diagnostic Score, which shows superior diagnostic potential compared to the current marker alpha-fetoprotein (AFP). The metabolic landscape shows a progressive depletion of unsaturated fatty acids and acylcarnitines during transformation. Therefore, serum metabolomics may provide valuable insight for monitoring patients at risk, including morbidly obese, diabetic, and MAFLD patients.


Introduction
Non-alcoholic fatty liver disease (NAFLD) involves a spectrum of metabolic diseases affecting 24% of the population globally 1. NAFLD is associated with obesity and type 2 diabetes 2, and represents a spectrum of non-malignant conditions that range from hepatic steatosis to non-alcoholic steatohepatitis (NASH) with fibrosis and ultimately cirrhosis 3. NAFLD is emerging as a leading risk factor for hepatocellular carcinoma (HCC) in both men and women (hazard ratio ~17) 4, but tools to detect progression of NAFLD to HCC remain inadequate. Recently, it was suggested that the definition of NAFLD be changed to metabolic-associated fatty liver disease (MAFLD), recognizing that this disease is associated with metabolic dysfunction 5 and acknowledging its multiple overlapping risk factors.
HCC is the fourth most common cause of cancer-related death worldwide 6, with rapidly increasing incidence and mortality rates 7,8. The incidence of MAFLD-related HCC (MAFLD-HCC) varies between 0.5% and 19.5%, depending on the disease state (presence or absence of cirrhosis) and geography 8-10. Indeed, the prevalence of MAFLD-HCC is increasing compared to alcohol- and viral-related HCC (AV-HCC) 9,11. Importantly, accumulating evidence suggests that MAFLD-HCC may develop in a non-cirrhotic background 10. This presents a major clinical challenge, as non-cirrhotic MAFLD patients are currently not under surveillance for HCC 8. Therefore, non-invasive biomarkers are urgently needed for monitoring MAFLD and its progression to HCC.
Serum metabolomics provides insight into holistic metabolic changes and can potentially predict pathological lipid abundance in the liver 12. However, metabolomic profiles are highly dependent on ethnicity and on risk factors such as obesity and diabetes 13-16. The biology underlying MAFLD-to-HCC progression is still poorly understood. Therefore, investigating the metabolic shifts in the liver during malignant progression is crucial for the development of new diagnostics and therapeutic modalities 17.
Here, we used a comprehensive serum metabolomics-based approach to identify unique MAFLD-HCC biomarkers, allowing us to distinguish MAFLD-HCC from patients with MAFLD or HCC on a background of alcohol or viral hepatitis (AV-HCC) in the Caucasian population.

Patients And Methods
Study population: A total of 249 patient samples were collected in this international, multicenter study. Our study cohort was divided into a discovery set and a validation set. The discovery set included serum samples from 196 patients from Spain and France: 27 patients with MAFLD-HCC, 32 patients with AV-HCC, 93 morbidly obese MAFLD patients undergoing bariatric surgery (OB-MAFLD), and 44 controls (CTRLs) that included 35 healthy subjects described previously 18 and 9 obese bariatric surgery patients with a non-alcoholic fatty liver disease (NAFLD) activity score (NAS) less than 3 and a liver fibrosis score <2. The validation set included serum samples from 37 MAFLD patients from Chile and Spain and plasma samples from 16 MAFLD-HCC patients in Brazil and Denmark. MAFLD patients from the validation set were overweight (body mass index (BMI) >25) or obese (BMI >30), but significantly leaner than patients in the OB-MAFLD group (p<0.0001). All MAFLD patients had biopsy-proven non-alcoholic fatty liver disease (NAFLD). The MAFLD-HCC group consisted of patients who were diagnosed with non-alcoholic steatohepatitis (NASH) based on liver biopsy, self-reported alcohol consumption (<20 g/day) and hepatitis B or C serology (hepatitis B surface antigen, hepatitis B surface antibody, hepatitis B core antibody, and hepatitis C antibody). All patients qualified for curative liver resection. The clinical and biochemical characteristics of the study population are presented in Table 1. The study was performed following individual patient consent, local institutional review board (IRB) approval, and approval from the Committees on Health and Research Ethics for the Capital Region of Denmark for use of archival material no. 17029679. All patient data sets were anonymized.
Metabolomic and statistical analyses: Comprehensive metabolomics covering 1,295 metabolites was performed on serum samples with three platforms, as described in detail in the supplementary materials. The absolute quantification of selected metabolites was performed using calibration curves and the addition of heavy isotope-labeled standards. Metabolites with a missing signal in >50% of samples were excluded from analysis. Otherwise, missing values were estimated by the k-nearest neighbor method with Metaboanalyst 4.0 19. Data were quantile-normalized and log-transformed before analysis (SPSS 25.0.0). Additional details, including statistical analyses, can be found in the supplementary material.
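The preprocessing steps described above can be illustrated with a minimal sketch. It is not the pipeline used in the study (which relied on Metaboanalyst 4.0 and SPSS); the function names, the data layout, and the choice of log base 10 are assumptions made for illustration, and the k-nearest neighbor imputation step is deliberately omitted.

```python
import math

def filter_missing(table, max_missing=0.5):
    """Keep metabolites whose fraction of missing values (None) does not exceed max_missing.

    table: {metabolite_name: [value or None, one entry per sample]} (hypothetical layout).
    """
    kept = {}
    for name, values in table.items():
        missing = sum(v is None for v in values) / len(values)
        if missing <= max_missing:  # excluded only when missing in >50% of samples
            kept[name] = values
    return kept

def quantile_normalize(samples):
    """Quantile-normalize complete samples (equal-length lists of metabolite values).

    Each value is replaced by the mean, across samples, of the values holding the same rank.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # indices from lowest to highest value
        row = [0.0] * n
        for rank, idx in enumerate(order):
            row[idx] = rank_means[rank]
        normalized.append(row)
    return normalized

def log_transform(samples):
    """Log10-transform strictly positive values (the log base is an assumption)."""
    return [[math.log10(v) for v in s] for s in samples]
```

In this sketch, missing values that survive the filter would still need to be imputed before `quantile_normalize` is called, since that function expects complete samples.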

Clinical and biochemical features of studied populations
We analyzed the metabolic composition of serum obtained from 249 Caucasian patients divided into a discovery set comprising patients with MAFLD-HCC with pre-diagnosed NASH (n=27), patients with alcohol- or viral-related HCC (AV-HCC) (n=32), and non-cancerous controls (n=137). These controls included 44 healthy individuals (35 healthy subjects reported previously 18 and 9 bariatric surgery patients with NAS <3 and fibrosis score <2) and 93 morbidly obese MAFLD patients awaiting bariatric surgery (OB-MAFLD).
Furthermore, we independently validated our findings in serum from 37 patients with MAFLD and in plasma from 16 patients with MAFLD-HCC. Importantly, none of the MAFLD-HCC patients had a prior history of viral hepatitis or excessive alcohol consumption. The clinical, pathological, and biochemical features of all patients are summarized in Table 1 and Supp. Fig. 1. We observed significant differences in mean age (MAFLD-HCC patients were significantly older than CTRL, OB-MAFLD and AV-HCC patients), BMI (significantly higher in OB-MAFLD than in CTRL, AV-HCC and MAFLD-HCC), and male:female ratio (significantly higher prevalence of HCC among males, but equal between MAFLD-HCC and AV-HCC). Nevertheless, unsupervised principal component analysis (PCA) showed that the metabolic profiles grouped independently of these covariates (Supp. Fig. 2). Additionally, the presence of underlying diabetes, cirrhosis, and the level of fibrosis were not confounders. Further, alpha-fetoprotein (AFP) and other liver biochemical features used to follow liver function (alanine aminotransferase (ALT), gamma-glutamyl transferase (GGT), alkaline phosphatase (AP), bilirubin, albumin and prothrombin activity) showed extensive variability already in non-cancerous patients, but generally remained within the reference ranges (Supp. Fig. 1). As such, these markers presented limited potential for diagnosing MAFLD-HCC (Table 2). To investigate whether metabolomic changes in MAFLD-HCC are etiology-specific, we compared the profiles of MAFLD-HCC patients and 32 patients with AV-HCC. The metabolite profiles of patients with AV-HCC showed significant overlap, but a clear separation from the profiles of patients with MAFLD-HCC (Supp. Fig. 2). Compared to patients with AV-HCC, MAFLD-HCC patients were older (p<0.05) and less likely to develop HCC on a cirrhotic background (90% of AV-HCC compared to 30% of MAFLD-HCC, p=0.035). However, no significant differences were observed between MAFLD-HCC and AV-HCC in measurements of liver function (AFP, AP, ALT, GGT, and bilirubin) or in diabetes. Interestingly, tumors obtained from MAFLD-HCC patients were larger than tumors from patients with AV-HCC (p=0.0006) and more frequently displayed microvascular invasion (41% MAFLD-HCC compared to 20% AV-HCC, p=0.016).

MAFLD-HCC patients present a disparate serum metabolome
To establish a serum-based metabolomic landscape of MAFLD-HCC, we performed detailed metabolomics using a comprehensive library of 1,295 metabolites covering amino acids (AA), glycerophospholipids, fatty acyls, sterols, sphingolipids, and glycerolipids. In total, we detected 470 metabolites, of which 43 were excluded after correcting for age, gender, and BMI, before univariate and multivariate analyses.
Sparse partial least squares discriminant analysis (sPLS-DA) revealed that MAFLD-HCC patients are metabolically the most distinct group compared to controls (CTRL and OB-MAFLD) and patients with AV-HCC (Fig. 1A). In addition, MAFLD-HCC patients were the most dissimilar group in unsupervised hierarchical clustering (Fig. 1B). Interestingly, the AV-HCC and MAFLD-HCC subgroups showed low similarity and clustered far apart (Fig. 1A), suggesting that unique metabolic programs can be driven by different etiologies.

Unique metabolomic profile of MAFLD-HCC patients
To investigate differences in the metabolic profiles between MAFLD-HCC patients and the other groups, we performed a series of pair-wise tests. Orthogonal partial least squares discriminant analysis (oPLS-DA) showed substantial separation of MAFLD-HCC patients from CTRL (R²X=0.178, R²Y=0.751, Q²=0.723) (Fig. 2A). The differential expression analysis identified 274 significantly different (false-discovery rate (FDR) corrected p<0.05) metabolites (DEMs), among which 152 metabolites were depleted and 122 metabolites were increased (Fig. 2B, Supp. Table 1A). Next, we performed pathway overrepresentation analysis of the depleted metabolites using integrated molecular pathway level analysis (IMPaLA) 20 and identified linoleic acid metabolism and G protein-coupled receptor (GPCR) signaling as the most impaired networks (Supp. Table 1B). In contrast, cholesterol synthesis, membrane fluidity and trafficking, and glycerophospholipid metabolism were among the most upregulated pathways (Supp. Table 1C). In addition to relating individual metabolites to processes, we compared unique classes of metabolites with the same chemical characteristics. We thereby defined a unique depletion of acylcarnitines (AC), sterol lipids (ST), and fatty acids (FA; especially oxidized fatty acids (oxFA) and omega-6 FA) in MAFLD-HCC, while saturated triglycerides (TG) were upregulated (Fig. 2C, Supp. Fig. 3). Furthermore, we utilized Bioinformatics Methodology For Pathway Analysis (BioPAN) 21 for lipid pathway enrichment analysis. We observed a significant activation of reactions converting sphingomyelins (SM) to ceramides (Cer), a process catalyzed by sphingomyelin phosphodiesterases (SMPD2 and SMPD3), as well as phosphatidylcholines (PC) to diglycerides (DG), which is catalyzed by the sphingomyelin synthases (SGMS1 and SGMS2) (Supp. Fig. 4A).
Similarly, significant alterations were revealed in the activity of FA desaturases and elongases, with specific activation of fatty acid desaturase 1 (FADS1) and impairment of FADS2, stearoyl-CoA desaturase 1 (SCD1) and elongation of very long chain fatty acid (ELOVL) elongases (ELOVL2 and ELOVL5) (Supp. Fig. 6A).
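The FDR correction applied to the differential analysis can be sketched as follows. The text does not state which FDR procedure was used; the Benjamini-Hochberg step-up procedure shown here is an assumption, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the source reports 'FDR corrected p<0.05' without naming the method;
    BH is used here only as a common default.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def significant(pvalues, alpha=0.05):
    """Indices of tests whose adjusted p-value falls below alpha."""
    return [i for i, q in enumerate(benjamini_hochberg(pvalues)) if q < alpha]
```

Under this sketch, a metabolite would count as a DEM when its adjusted p-value is below 0.05, matching the threshold quoted in the text.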
Lastly, we compared the metabolomes of MAFLD-HCC to HCCs with alcohol and/or viral etiology (AV-HCC).
Taken together, the serum of MAFLD-HCC patients is characterized by a significant depletion of FA, reflective of a significantly lower FA biosynthesis with decreased FA desaturase and elongase activities. A depletion in both AC and ST with concurrent higher TG and Cer abundance is suggestive of a unique metabolic reprogramming in MAFLD-HCC patients. The altered SM:Cer ratio could be the result of an increased activity of the enzymes SMPD2 and SMPD3 or a reduced activity of SGMS1, SGMS2, and CERT1 in MAFLD-HCC patients. A simplified association between the lipid classes and their deregulation in MAFLD-HCC is presented in Fig. 2H.
Diagnostic potential of serum metabolomics
Serum metabolomics has been successfully used as a diagnostic tool to discriminate liver diseases 17,18. Here, we investigated the potential of distinguishing MAFLD-HCC not only from healthy individuals and MAFLD patients, but also from AV-HCC. To generate a predictive metabolite signature, we first used receiver operating characteristic (ROC) curves and calculated the area under the curve (AUC) for each metabolite as a contrast test between MAFLD-HCC and the respective comparative groups (CTRL, OB-MAFLD, and AV-HCC). In total, 89 metabolites presented an AUC>0.75 distinguishing MAFLD-HCC patients from the other control patients (healthy or diseased). Among the metabolites in the DEM signature, 14 metabolites presented a superior AUC>0.9 in all contrast tests (Table 2). These metabolites and their fold change compared to CTRL are presented in Fig. 3A. Importantly, these 14 metabolites individually present
Next, we assessed whether a combination of the 14 metabolites would increase the diagnostic potential. We employed support vector machine (SVM) modeling and determined that a panel of 5 metabolites yielded the optimal predictive accuracy (Supp. Fig. 6A). Indeed, the model based on the 5 metabolites reached an AUC>0.98 (Fig. 3B) for any of the contrasts (compared to all controls and AV-HCC) with a predictive accuracy greater than 90% (Fig. 3C). Importantly, the accuracy of the diagnostic panel was confirmed in the validation set, with a model performance of AUC 0.91, including when matching MAFLD-HCC patients according to BMI (Supp. Fig. 6B).
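The per-metabolite ROC screening can be illustrated with a small sketch. It computes the AUC through its rank interpretation (the probability that a randomly chosen case has a higher value than a randomly chosen control) and keeps metabolites that exceed a threshold in every contrast. The function names and data layout are hypothetical, and treating the AUC as direction-agnostic (so that depleted metabolites also score highly) is an assumption about how the screening was done.

```python
def roc_auc(cases, controls):
    """AUC as P(case > control) + 0.5 * P(case == control), over all case/control pairs."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def discriminative_auc(cases, controls):
    """Direction-agnostic AUC, so depleted and elevated metabolites are treated alike."""
    auc = roc_auc(cases, controls)
    return max(auc, 1.0 - auc)

def screen_metabolites(case_values, control_groups, threshold=0.9):
    """Return metabolites whose AUC exceeds the threshold in every contrast.

    case_values: {metabolite: [values in MAFLD-HCC]}
    control_groups: {group_name: {metabolite: [values]}}, e.g. CTRL, OB-MAFLD, AV-HCC.
    """
    selected = []
    for name, cases in case_values.items():
        aucs = [discriminative_auc(cases, group[name]) for group in control_groups.values()]
        if min(aucs) > threshold:
            selected.append(name)
    return selected
```

The multivariate SVM panel described in the text is a separate step and is not reproduced here.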

Validation and quantification of diagnostic metabolites
To reinforce the clinical relevance of the metabolite panel in the diagnosis of MAFLD-HCC patients, we sought to validate and quantify the abundance of each metabolite and to establish reference concentration ranges. Among the 14 metabolites with the greatest diagnostic value, only 10 have commercially available standards. In addition, each metabolite needs to be detected within the linear range, which allows for absolute quantification. On this basis, we established the abundance of metabolites in the validation set (Fig. 4A). All metabolites except PC(0:0/22:5) showed a similar trend as in the discovery set, but overall with great variability. Moreover, we utilized the quantification to generate linear regression models estimating the concentration levels in the discovery set. The linear dependence and estimated concentrations are presented in Supp. Table 4. Lastly, among the validated metabolites in the panel, we selected an optimal set (linoleic acid, osbond acid, monounsaturated fatty acid MUFA (14:1n-5trans), and PC(18:2/0:0)) to build the MAFLD-HCC Diagnostic Score (MHDS). The MHDS depends on the absolute concentration of each metabolite in the score. The MHDS performed with an AUC>0.75 for any given contrast (Fig. 4B-C). Finally, in the combined (discovery and validation) cohort, we established the odds ratio and relative risk for the MHDS (at cut-off value 0) based on the clinical and biochemical features.
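As an illustration of how an odds ratio and a relative risk follow from a score dichotomized at a cut-off, consider the sketch below. It assumes that scores above the cut-off count as test-positive, which the text does not specify, and all names are hypothetical.

```python
def two_by_two(scores, has_hcc, cutoff=0.0):
    """Build a 2x2 table from scores and HCC status.

    Assumption: a score above the cut-off is test-positive (direction not stated in the source).
    Returns (a, b, c, d):
      a = positive with HCC, b = positive without HCC,
      c = negative with HCC, d = negative without HCC.
    """
    a = b = c = d = 0
    for score, hcc in zip(scores, has_hcc):
        if score > cutoff:
            if hcc:
                a += 1
            else:
                b += 1
        else:
            if hcc:
                c += 1
            else:
                d += 1
    return a, b, c, d

def odds_ratio(a, b, c, d):
    """Odds of HCC among test-positives divided by odds among test-negatives."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Risk of HCC among test-positives divided by risk among test-negatives."""
    return (a / (a + b)) / (c / (c + d))
```

Both ratios are undefined when a cell used in a denominator is zero; a practical analysis would need a continuity correction or an exact method in that case.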
The serum lipidome landscape reflects the progression to MAFLD-HCC
We investigated whether the serum lipidome reflects the progressive nature of MAFLD-HCC development. To that end, we first performed pair-wise contrasts between each group (Fig. 5A) to establish the trajectory of disease progression. Then, we employed a pattern hunter approach with Pearson correlation to designate the metabolic rearrangements in the development of the disease (Fig. 5B). A total of 412 lipids were detected in the discovery and validation MAFLD samples. We independently compared OB-MAFLD (bariatric surgery) and MAFLD (non-bariatric surgery) patients with healthy CTRLs, resulting in 257 DEMs distinguishing OB-MAFLD from CTRLs (Fig. 5A). Likewise, in MAFLD, we defined a total of 266 DEMs compared to the CTRL group (Fig. 5B), suggesting that both OB-MAFLD and MAFLD patients experience a significant metabolic rearrangement compared to healthy individuals. Indeed, a total of 138 metabolites were significantly different from CTRL and shared similar directionality, showing a significant depletion of AC, ChoE, and LPC (Fig. 5C). In contrast, DG and TG were among the metabolites progressively upregulated in the MAFLD patient groups. Interestingly, omega-6 FA and Cer were among the deregulated metabolite classes.

Consequently, we next compared OB-MAFLD and MAFLD patients with MAFLD-HCC, detecting 297 and 136 DEMs, respectively (Fig. 5D-E). We thus noticed a second significant metabolic shift from MAFLD to MAFLD-HCC. Notably, OB-MAFLD samples were significantly different, suggesting that the disease progression is CTRL→OB-MAFLD→MAFLD→MAFLD-HCC. Following this disease progression, we defined a total of 83 metabolites significantly different from MAFLD-HCC that shared the same directionality. This metabolic shift included an increase in PC and BA levels as well as depletion of AC, FA, and SM levels (Fig. 5F). Interestingly, TG were higher in the serum of MAFLD-HCC when compared to OB-MAFLD (patients awaiting bariatric surgery), but diminished when compared to MAFLD patients, who were significantly leaner.
To further explore the differences between OB-MAFLD and MAFLD, we directly compared these groups. First, the MAFLD patients were significantly older and had a lower BMI than OB-MAFLD patients (Supp. Fig. 1). Also, 76% of MAFLD patients were classified as NASH (biopsy proven) compared to only 12% among OB-MAFLD (Table 1, Fisher's test p<0.0001). We detected 323 DEMs, reflecting a complete change in the lipidome landscape of these patient groups (Supp. Fig. 7). FA, AC, and ST metabolites were significantly diminished in MAFLD compared to OB-MAFLD. In contrast, SM, Cer, DG, TG, PE, and PC were all significantly higher in MAFLD. Interestingly, phosphatidylethanolamine N-methyltransferase (PEMT), the only enzyme catalyzing the reaction chain PE→PC→DG, and choline/ethanolamine phosphotransferase 1 (CEPT1), catalyzing the conversion of DG to PE, are the only enzymes deregulated between the major lipid subclasses. As expected, we also observed significant differences in the FA reaction chain (Supp. Fig. 8).
Finally, we used pattern hunting to identify metabolites significantly associated with the full progression from healthy individuals to HCC (CTRL→OB-MAFLD→MAFLD→MAFLD-HCC axis) (Fig. 5G). We found a total of 169 lipids progressively altered (FDR p<0.05, r>0.3; 100 DEMs negatively and 69 positively) along this axis (Supp. Table 5). The negatively correlated metabolites include the AC, ChoE, PUFA, LPC, and ST subclasses. Among the positively correlated metabolites, we found an overrepresentation of TG (47 out of 69 metabolites). Importantly, 18 TG significantly correlated with increasing NAS and fibrosis scores in OB-MAFLD patients, suggesting their importance in the progressive deterioration of the liver (Fig. 5H). Lastly, we tested whether any of these metabolites correlated with tumor size, recurrence, microvascular invasion or liver cirrhosis in MAFLD-HCC patients; however, none of the metabolites reached statistical significance. This suggests that these metabolites are associated with MAFLD-HCC, but not directly involved in the progression axis.
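The pattern-hunting step can be sketched as a Pearson correlation between each metabolite and an ordinal template encoding the progression axis. The numeric coding of the stages (1 to 4) and the function names are assumptions made for illustration; the source specifies only that Pearson correlation was used with a threshold of r>0.3 and FDR p<0.05.

```python
import math

# Hypothetical ordinal template for the CTRL→OB-MAFLD→MAFLD→MAFLD-HCC axis.
STAGES = {"CTRL": 1, "OB-MAFLD": 2, "MAFLD": 3, "MAFLD-HCC": 4}

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def pattern_correlation(values, groups):
    """Correlate one metabolite's values with the stage template of each sample's group."""
    template = [STAGES[g] for g in groups]
    return pearson(values, template)

def progressive_metabolites(table, groups, min_abs_r=0.3):
    """Split metabolites into positively and negatively correlated sets by |r| threshold.

    table: {metabolite: [value per sample]}; groups: [group label per sample].
    Significance testing and FDR correction are not included in this sketch.
    """
    positive, negative = [], []
    for name, values in table.items():
        r = pattern_correlation(values, groups)
        if r > min_abs_r:
            positive.append(name)
        elif r < -min_abs_r:
            negative.append(name)
    return positive, negative
```

A metabolite that rises steadily along the axis yields a correlation near +1, and one that is progressively depleted yields a correlation near -1.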

Discussion
A sedentary lifestyle and overnutrition have led to an epidemic of obesity, diabetes, and MAFLD that may soon become leading causes of HCC development. Metabolic reprogramming is at the core of MAFLD progression to HCC 17. However, due to the need for invasive techniques for MAFLD diagnosis, our understanding of MAFLD-HCC is limited and the underlying metabolomic landscape remains elusive. The present study provides the first comprehensive analysis covering 22 classes of metabolites and identifying complex relationships in the gradual progression of MAFLD to MAFLD-HCC. Importantly, this study was performed in patients with biopsy-proven liver histology before therapy. To our knowledge, it is also the first study aiming to differentially diagnose MAFLD-HCC from HCC of other etiologies.
The diagnosis of MAFLD-HCC is challenging as patients with metabolic syndrome are not screened for HCC 8 . Indeed, often only patients with chronic liver disease of viral etiology are offered HCC screening.
Increasing levels of liver enzymes are suggestive of liver damage but are not specific to hepatocarcinogenesis. Additionally, the diagnostic capacity of AFP 22 for HCC is limited and was particularly low for MAFLD-HCC patients (Fig. 3G, H, Supp. Fig. 1). We therefore applied machine learning approaches to pinpoint a combination of metabolites that offers the best diagnostic potential for MAFLD-HCC patients. We found that a combination of 5 metabolites accurately distinguishes MAFLD-HCC patients from healthy individuals (AUC=0.989), OB-MAFLD patients (AUC=0.997), and patients with AV-HCC (AUC=0.999). This model performed well against a validation set of MAFLD patients (AUC=0.905).
Furthermore, to bring metabolomics to clinical practice, it is crucial to establish methods for absolute quantification of metabolites as well as their reference ranges. Here, we were able to validate 10 metabolites using commercially available internal standards and measure their concentrations. On this basis, we built the MHDS from the absolute concentrations of 4 metabolites (3 unsaturated FA and 1 phosphatidylcholine); it can be applied to both serum and plasma measurements and outperforms AFP and GGT (Fig. 3E-H).
The unsaturated fatty acids (mono- (MUFA) and polyunsaturated (PUFA)) differed significantly between MAFLD-HCC and the other groups (Fig. 1C, 2B, D, F & Fig. 3A), with lower levels in MAFLD-HCC. The essential omega-6 FA linoleic acid is a precursor of long-chain metabolites, including arachidonic acid, a substrate in prostaglandin synthesis; its depletion may therefore cause altered signaling and inflammatory response 23. The role of linoleic acid in carcinogenesis remains controversial. On the one hand, linoleic acid and its derivatives have shown a tumor-suppressive role in colorectal cancer 24. On the other hand, linoleic acid accumulation (in a murine model) increases oxidative stress and causes a selective loss of intrahepatic CD4+ cells, leading to MAFLD-mediated hepatocarcinogenesis 25. Interestingly, serum linoleic acid was heavily depleted in MAFLD-HCC patients (Fig. 3A), suggesting that the murine model was not able to fully mimic human disease and further emphasizing the potential limitation of murine studies in this field. Indeed, increased metabolism of linoleic acid and arachidonic acid has been implicated in viral-associated HCC in the Korean population 26. However, the different ethnicity (Korean compared to European) might affect the serum metabolome, and ethnicity or culturally driven dietary differences should be considered 15,27. For example, sphingolipid and acylcarnitine profiles differ between European, African, and South Asian patients with diabetes, showing an increase in acylcarnitines in individuals from Suriname 15. Targeted acylcarnitine profiles were recently reported as upregulated in Japanese MAFLD-HCC individuals 28, which is opposite to what we observed in our study (Fig. 2C).
Furthermore, acylcarnitine profiles might be influenced by the prevalence of diabetes, as the authors did not control for this covariate 28. Accordingly, our study was limited to patients of Caucasian origin, who are overrepresented in countries significantly affected by the prevalence of MAFLD.
HCC is a progressive disease developing over decades. We observed a gradual attenuation of acylcarnitines, lysophosphatidylcholines, and unsaturated FA and an increase in phosphatidylcholines and triglycerides during the progression of liver disease. Interestingly, some of the triglycerides that were found to be augmented during progression to MAFLD-HCC were previously associated with NAS and fibrosis scores in NASH 29 (Fig. 5H) but were not associated with tumor size. Although we have presented a successful metric, demonstrating the ability of the metabolite panel to diagnose patients along the CTRL→MAFLD→MAFLD-HCC axis and to distinguish these patients from AV-HCC, the utility of the MHDS needs to be shown in an appropriate clinical assay, in a large cohort, and in ethnically diverse patients. However, stratification of HCC based on the metabolome may in the future be an approach to evaluate cryptogenic HCC patients. It is notable that MAFLD-HCC patients were significantly older than the other groups, which limited the power to detect metabolic differences.
In conclusion, the depletion of unsaturated FA, and the increase of triglycerides are at the core of deregulated metabolic networks in MAFLD-HCC, leading to altered signaling and likely different nutrient utilization by cancer cells. These changes can be exploited for non-invasive surveillance of the 'at risk' population for early HCC detection in the background of metabolic syndrome.

Data availability
The data used in this study will, in accordance with the IRBs, be made available upon request if the requestor has an approved protocol.

(discovery and validation) sets. AFP and GGT were available only in the discovery set. E. The forest plot presenting the relative risk of the MHDS at the established cut-off value, as well as clinical and biochemical characteristics, for the pooled (discovery and validation) sets. AFP and GGT were available only in the discovery set.

Figure 5
The progressive metabolic perturbations in the CTRL to MAFLD to MAFLD-HCC trajectory. A. The pair-wise comparison between OB-MAFLD and CTRL. B. The pair-wise comparison between MAFLD patients in the validation set (MAFLD(val)) and CTRL. C. The Venn diagram presenting common (significant and same