1 Introduction
Current clinical practice, involving disease diagnosis and treatment decisions, relies largely on blood plasma–based tests combined with patient history, physical checkups, and other clinical laboratory examinations. As one of the most complex proteomes in the human body, the plasma contains various components, including dissolved proteins, small-molecule metabolites, and electrolytes, which are secreted by organs and tissues [
1], making it a favorable substrate for comprehensive biomarker discovery and effective treatment evaluation. Moreover, the biological systems in the human body are connected by multilayered networks operating at the cellular and molecular levels [
2], and thus a comprehensive understanding of the plasma facilitates network-level analysis of the healthy or pathologic states at the tissue or organ levels [
3].
Recent progress in omics technologies have allowed the global profiling of proteins and metabolites under normal and disease conditions. In particular, proteomic and metabolic profiling studies have opened doors for the analysis of tissue microenvironments, which contain many circulatory proteins essential for proximal and distant signal transduction and systemic immune responses [
4,
5]. Plasma protein or metabolite analysis has demonstrated potential diagnostic value because it provides insight into the occurrence and development of human diseases, including solid tumors and leukemia [
6]. Protein or metabolite markers closely related to disease burden and therapeutic response can be identified, providing a potential basis for treatment evaluation [
7,
8]. Previously, we constructed a discriminant equation by using six serum metabolites with marked disturbances related to glucose metabolism in patients with acute myeloid leukemia (AML), showing prognostic value for AML [
9]. In addition, serum oncogenic metabolite 2-HG concentrations of AML patients with IDH1 or IDH2 mutations are 50- to 100-fold higher than those of patients without these mutations [
10,
11]. Therefore, changes in small-molecule metabolites in leukemia can sensitively reflect the pathophysiological changes in the body. Moreover, by integrating plasma multi-omics data into mechanistic models, researchers not only can identify correlations among various measured analytes but also can characterize the underlying relationships of these analytes and the biology on which they depend [
12,
13].
The most successful model of cancer differentiation therapy is the development of synergistic targeted therapy of all-trans retinoic acid (ATRA) and arsenic trioxide (ATO) for acute promyelocytic leukemia (APL) [
14]. Recently, by combining clinical parameters, such as white blood cell (WBC) count and molecular data, including
NRAS mutation and transcriptome-based score system, we divided APL cases into standard-risk (72% of the cases) and high-risk (28% of the cases) subgroups, which had 5-year disease-free survival rates of 98.1% and 88.4%, respectively [
14]. Building upon the model as a whole, a system view of not only the genomic, transcriptomic, and epigenetic features of APL cells [
15,
16] but also the plasma composition of the tumor mass and its modulation upon effective treatment is of great significance. In this study, to explore factors related to the pathogenesis of APL and pharmacological responses, we comprehensively profiled changes in various factors in APL plasma. Through our newly introduced cross-sectional correlation network framework, we integrated clinical parameters, plasma proteins, and metabolites, together with previously published RNA sequencing (RNA-seq) data of APL cells to cohesively translate the results into actionable information and knowledge source that could be used by physicians and healthcare systems.
2 Materials and methods
2.1 Samples from patients with APL
A total of 169 patients with APL and primary diagnosis (DX) and paired self-controlled complete remission (CR) paired plasma samples were obtained from a randomized multicenter clinical study (NCT01987297) conducted from December 6, 2012 to December 31, 2017 [
15]. A total of 25 paired patients were used for proteomics study. Plasma samples of sex- and age-matched controls from 106 healthy volunteers were included in healthy control (HC) groups. The collection and preservation of the samples were approved by the institutional review boards of all participating centers, and the collection of written informed consent for sample collection and research complied with the
Declaration of Helsinki [
17].
Apart from local data, we downloaded the following public data: (1) 323 primary APL RNA-seq data from our previously published paper (GSE172057), of which 169 were matched to proteomics or metabolomics samples [
15]; (2) RNA-seq data, including samples from 9 primary APLs, 49 other AML subtypes, and 19 HC bone marrow mononuclear cell (BMMC) samples (BeatAML cohort) [
18]; (3) public microarray data, which include bone marrow samples from 19 primary APLs, 62 other AML subtypes, and 5 normal promyelocytes (normal bone marrow promyelocytes sorted by flow cytometry) from healthy donors (GSE12662) [
19]; (4) PML/RARA chromatin immunoprecipitation-sequencing (ChIP-seq) binding peaks from our previous research [
16]. Genes with adjusted
P values < 0.05 and absolute concentration fold changes > 1.5 were considered significantly differentially expressed genes.
2.2 Clinical laboratory tests
A total of 88 clinical laboratory test results were obtained: (1) the percentage of bone marrow blasts in patients with primary APL; (2) 20 routine blood tests; (3) 6 coagulation-related tests; (4) 15 biochemical analyses; (5) 12 blood lipid tests; (6) 6 cytokines; (7) 18 additional biochemical analyses and 10 endocrine tests. The basic demographic and clinical parameters are shown in Table S1. The test results used for these experiments were collected according to availability, and whenever possible, each sample was used for multiple tests. No data or samples were excluded during data analysis to minimize sample selection bias. The detailed clinical parameters are provided in Supplemental Methods.
2.3 Plasma proteome and metabolome analysis
Procedures for the preparation of plasma samples, Olink proximity extension assays (PEA), data quality controls (QCs), differential protein analysis, and functional enrichment analysis are provided in the Supplemental Methods. The chemical standards for untargeted metabolomics, preparation of plasma samples, LC-MS chromatographic conditions, data acquisition, metabolic feature detection, retention time correction, metabolic feature identification, metabolic data processing, differential metabolite analysis, and pathway enrichment analysis processes are described in Supplemental Methods.
2.4 Correlation network and community analysis
We prepared three types of multi-omics predictors by using the clinical parameters, plasma proteomics, and plasma metabolomics data sets generated in this study, including (1) data from 88 clinical laboratory tests (Clin), such as routine blood tests, coagulation, biochemistry tests, immunity tests, and endocrine tests; (2) 991 unique plasma proteins (Prot) identified by Olink proximity extension assay (PEA); (3) 443 plasma metabolomic predictors (Meta) detected trough untargeted high-performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS). For each data set, all predictors were log2 transformed, median centered, and scaled by the standard deviation to a comparable scale for statistical analysis.
Cross-sectional correlation networks were constructed by using data from different layers for the calculation of Spearman correlation coefficients at various levels (for example, clinical parameters vs. proteomics, clinical parameters vs. metabolomics, and proteomics vs. metabolomics). Spearman correlation coefficients and
P values were calculated using Hmisc (version 4.5.0) R packages. The
P values were then adjusted for multiple-hypothesis testing by using the method of Benjamini and Hochberg [
20]. We selected an adjusted
P value < 0.05 and absolute Spearman’s ρ > 0.4 as thresholds for our significance level. We built the correlation network for the significantly correlated data pairs by using the igraph (version 1.2.6) [
21] R package and a multilevel optimization of modularity clustering algorithm to analyze the community structure [
22]. The correlation network was constructed and visualized with ggraph (version 2.0.4) [
23], visNetwork (version 2.0.9) [
24] R packages, and Cytoscape (version 3.8.2) [
25] software. Community-derived proteins were enriched through GO and KEGG pathways enrichment analyses performed using the clusterProfiler (version 3.16.1) [
26] R package. Community-derived metabolites were enriched by KEGG pathways by using the MetaboAnalystR (version 3.0.3) [
27] R package.
2.5 Functional enrichment analysis
Gene set enrichment analysis (GSEA) was performed using the GSEA (version 4.2.1) [
28] with MSigDB-hallmark (H) [
29] and MSigDB-curated (C5) gene sets [
30]. All comparisons were conducted with standard parameters. Given the need for exploratory analysis, we set a false discovery rate < 0.25 as a significant pathway, as suggested on the GSEA official website. GO functional and KEGG pathway analyses were used for protein enrichment analysis. The KEGG and SMPDB (small-molecule pathway database) pathway analyses were used for metabolites enrichment analysis as described in their respective method sections.
2.6 Protein–protein interaction (PPI) network analysis
We used the STRING database [
31,
32] to analyze the protein–protein interaction (PPI) relationship. We uploaded a list of proteins into the STRING database and downloaded the PPIs from the website. Only interactions with a medium combined confidence score greater than 0.4 were retained. Then, we used Cytoscape to visualize the network [
25].
2.7 Screening of nearest gene to PML/RARA binding sites
Genome-wide PML/RARA binding sites identified by ChIP-seq were downloaded from our previous research [
16]. To associate the nearest gene to PML/RARA for each binding peak, we used BEDTools closest command [
33].
2.8 Statistical analysis and visualization
All data types used in the cross-sectional correlation networks were normalized. Comparisons of categorical variables were ascertained by Fisher’s exact tests or Pearson’s chi-square tests. Paired two-sided Student
t-tests were used to compare the significance of paired samples. Wilcoxon tests were performed to compare non-normally distributed continuous variables. The enrichment results were visualized with the ggplot2 (version 3.3.2) R packages. The distance was calculated according to the Euclidean or Pearson’s correlation coefficient in the gplots package (version 3.0.1.1) [
34], and a heatmap was plotted by the ComplexHeatmap (version 2.4.3) [
35] R package. R version 4.1.0 was used for all statistical analyses. Venn diagrams were generated by the VennDiagram R package (version 1.6.20) [
36] for the visualization of the intersections among multiple groups.
2.9 Data availability
The plasma proteomic and metabolomic data reported in this study were deposited in OMIX database, China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences (Accession Nos. OMIX002537 and OMIX002538). The remaining data are available in the article and Supplemental Material.
3 Results
3.1 Overview of the study design
To explore the plasma factors associated with the pathogenesis and treatment of APL, we collected paired plasma samples obtained from 169 patients with APL and PML/RARA fusion (aged 19–64 years; 58% males, 42% females) at the time of diagnosis (DX) and time of the first hematological CR after standard frontline therapy in the APL2012 trial [
37]. Plasma samples from 106 HCs (sex- and age-matched healthy individuals aged 18–65 years; 50% males, 50% females) served as reference controls (Fig.1). The bone marrow mononucleated samples of all these patients at the DX stage were previously subjected to RNA-seq analysis [
37].
We obtained available clinical laboratory test results of 88, 65, and 35 bioanalytics parameters for DX, CR, and HC samples, respectively. The proteomic data set included 991 plasma protein levels measured on 25 available paired APL samples (DX and CR) and 16 available HC samples with the Olink technology. For metabolome data, 443 metabolites were quantified from all 169 APL plasma samples and 106 HC samples with untargeted HPLC-MS/MS (Fig.1). Subsequently, single-layer exploratory analyses were performed on each data type, followed by a correlation network framework analysis (see Methods for details) that integrated all three layers of data to identify key cellular and molecular modalities associated with the pathophysiological states of the plasma composition of APL before and after treatment (Fig.1).
3.2 Clinical features and single-omics characterizations of APL plasma composition
3.2.1 Clinical laboratory tests
The demographic data of the APL patient cohorts are shown in Table S1. The patients were categorized into high-risk and standard-risk groups at diagnosis according to the risk classification systems we defined previously [
15]. The clinical laboratory tests included routine blood tests, coagulation tests, biochemical analyses, blood lipid tests, and endocrine tests (Table S1). The median bone marrow blasts percentage in patients with APL at diagnosis was 87.5%, and the median level of WBC counts, red blood cell (RBC), and platelet (PLT) counts were 3.32 × 10
9/L, 2.50 × 10
12/L, and 29 × 10
9/L, respectively (Fig. S1A; Table S1). At the CR stage, the median WBC, RBC, and PLT counts of the patients were 4.22 × 10
9/L, 3.36 × 10
12/L, and 248 × 10
9/L, respectively, which were close to those of HC (WBC, 5.60 × 10
9/L; RBC, 4.74 × 10
12/L; PLT, 211 × 10
9/L). In addition, the DX samples were characterized by using the indicators of coagulation abnormalities, such as decreased fibrinogen (Fg, 1.40 g/L; reference range, 1.8–3.5 g/L) and increased fibrin degradation product (45.95 mg/L; reference range, < 5 mg/L) and D-dimer (DD, 11.49 mg/L; reference range < 0.55 mg/L; Fig. S1B; Table S1). Regarding routine biochemical tests, although DX significantly differed from CR or HC, most of the tests’ interquartile ranges were within the reference limits, except the elevated levels of lactate dehydrogenase (LDH, 288 IU/L; reference range, 98–192 IU/L), ferritin (FERRIT, 619.25 ng/mL; reference range 11–306.8 ng/mL), and vitamin b12 (1335.50 pg/mL; reference range, 180–914 pg/mL; Fig. S1C; Table S1). Furthermore, the WBC counts were significantly higher, the PLT counts were significantly lower, and the coagulation indicators were more aberrant (lower Fg, longer thrombin time, and higher D-dimer) in the high-risk patients than in the standard-risk patients at the DX stage, consistent with their definitions and reflecting a heavier tumor load and more severe coagulation abnormalities in the former group vs. the latter one (Table S1).
3.2.2 Proteome
To profile plasma proteins, we conducted Olink PEAs to measure the levels of 991 plasma proteins from 25 paired DX-CR APL samples and 16 HCs (Fig.1). The overlapped QC samples from the PCA plot showed that the plasma protein data were of high quality. The DX samples were generally heterogeneous, whereas the CR samples were notably different from the HCs (Fig. S2A). A comparison of DX and CR samples identified 303 proteins with different abundance plasma levels (PDPs): 88 increased and 215 decreased in the DX group (Fig.2 and S2B; Table S2). An independent comparison between DX and HC samples identified 402 PDPs, and most of them (317/402, 78.9%) increased (Fig. S2B; Table S2). The PDPs in both comparisons were highly correlated (R = 0.63 and P < 0.00001; Fig. S2C), indicating most APL-related plasma proteins returned to normal levels at the CR stage. Meanwhile, most of the PDPs (239/255, 93.7%) in the high-risk vs. standard-risk patients (DX stage) increased (Table S2) and were related to cytokine signaling, leukocyte activity and apoptosis process, reflecting severe disease states in high-risk patients (Fig. S2D).
We further investigated the functional roles of the above mentioned 303 PDPs between DX and CR by dividing them into three clusters through hierarchical clustering. Cluster 1 consisted of 88 proteins with specifically elevated levels in the DX plasma. These proteins were significantly associated with PI3K-AKT and ERK signaling pathways (Fig.2 upper-right panel; Table S3), which were previously reported to be implicated in metabolic reprogramming and survival of leukemia stem cells [
38,
39]. By contrast, the levels of plasma proteins in cluster 2 (including 102 proteins) were specifically high in CR plasma and included proteins associated with interferon-gamma (IFN-γ) responses, oxidative stress, and intrinsic apoptotic signaling (Fig.2, upper-right panel; Table S3). The plasma level of ARG1, a key IFN-γ pathway–related factor, was verified by ELISA (Fig. S2E) and exhibited consistent trends as detected by Olink. Notably, the CR samples were collected immediately after remission induction therapy (the first hematological CR), and thus cluster 2 plasma proteins might mainly reflect the impact of therapeutics on the plasma composition at this stage. Indeed, compared with DX and HC plasma, the CR plasma had higher representation of proteins that were generally categorized as intracellular proteins (i.e., in cytoplasm and organelles; Fig.2, lower panel). This phenomenon can be attributed to the therapy-induced apoptosis or necrosis of APL blasts and the ensuing release of intracellular components. Cluster 3 proteins (113 proteins) were at similar levels in the CR and HC plasma but were lower in the DX plasma, and the biological functions of these proteins were enriched in leukocyte chemotaxis, coagulation, platelet activation, and T cell activation pathway (Fig.2; Table S3). The return to normal levels of these factors could result from the correction of coagulation abnormalities and hemorrhagic complications as well as impaired immune cell functions after treatment.
3.2.3 Metabolome
To profile plasma metabolites, we used high-performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS) to compare paired DX-CR samples from 169 APL cases and 106 HCs (Fig.1). A total of 443 metabolites were identified in 444 plasma samples (Fig. S2F). After systematic error removal using random forest normalization [
40], an average Spearman correlation coefficient of 0.86 was observed in all QC samples (Fig. S2G), indicating the robustness of the metabolomics data. Pairwise comparisons among sample groups demonstrated that metabolic changes were significant (Fig.2 and S2H). According to the criteria of absolute fold change > 1.5 and adjusted
P value < 0.05, a total of 69 differential metabolites were identified from the comparison between DX and CR samples (Fig.2 and S2G; Table S4). Most increased metabolites at DX belonged to the organic acid category (e.g., phenylalanine/tyrosine/tryptophan biosynthesis and synthesis/degradation of ketone bodies), whereas most metabolites with decreased levels were involved in nucleoside or nucleotide biosynthesis (e.g., purine metabolism and pyrimidine metabolism); taurine and hypotaurine metabolism; and the TCA cycle (Fig.2; Table S5). At the DX stage, the high-risk cases had significantly higher levels of nicotinamide and niacinamide than in the standard-risk cases (Table S4), consistent with the suggestion that targeting nicotinamide metabolic activity might improve the therapeutic efficacy against leukemia stem cells in relapsed or refractory AML [
41]. Nevertheless, at the CR stage, only a slight increase in alanine aminotransferase and aspartate aminotransferase levels and decrease in lymphocyte counts were observed in the high-risk patients, indicating that differences among the subgroups at disease onset were mostly eliminated at CR (Tables S1, S2, and S4).
3.3 Multidimension correlational network of APL plasma composition
While the aforementioned tier-wise assays (i.e. clinical laboratory tests, plasma proteins, and plasma metabolites) each provided a layer of bio-analytes that informed the pathophysiological states of APL, a holistic view of the plasma by correlating various measurements, as previously suggested by Yurkovich
et al. [
3], could reveal coordinately changed factors illuminating specific aspects of disease progression/regression states. Building upon this concept, we constructed cross-sectional correlation networks for the DX, CR, and HC samples by using the laboratory test, plasma proteome, and metabolome data (Fig.3, Step 1). The correlation network of DX plasma was dense and complex, having a total of 973 nodes and 2527 statistically significant Spearman correlations (R > 0.4; adjusted
P value < 0.05) between 46 clinical parameters, 583 plasma proteins, and 308 metabolites (Fig.3). By contrast, the correlation networks of CR and HC plasma were much sparser, having only 12 clinical parameters, 84 plasma proteins, and 93 plasma metabolites for the CR network (189 vertices and 164 edges; Fig.3) and 13 clinical parameters, 11 plasma proteins, and 54 metabolites for the HC network (78 vertices and 136 edges; Fig.3 and 3E). Therefore, the dense connections observed in the DX correlation network might be indicative of large-scale disturbances of the plasma composition resulting from both biomolecular release/absorption of leukemic cell mass and the organism-level response of the disease individual. In the CR stage, the coordinated changes in characterizing APL’s pathophysiological state might be curbed by ATRA/ATO therapy, contributing to the mode of action of this effective treatment.
Notably, most edges of the above networks were positively correlated (Fig.3; Table S6), suggesting that many plasma proteins and metabolites were functionally associated with clinical parameters during APL pathogenesis and treatment. This finding motivated us to extract communities (densely connected subgraphs) from the DX correlation networks with a short random walks approach (Fig.3; steps 2 and 3). A total of 12 communities with a degree no less than 10 or containing at least 12 vertices were identified (Fig.4). We then used the factors of these communities for functional annotation and pathway enrichment analysis and identified hub nodes (top-ranked data points in terms of centrality) within each community. Although these communities were predominantly populated with plasma proteins and metabolites (Fig. S3A and S3B), six well-established clinical parameters of APL were found to be hub nodes for three top communities. In particular, the hub–node bioanalytics (WBC, LDH, prothrombin time, and mean cell hemoglobin) in the top community (Fig.4, C1 community) were indicators of the pathophysiological state in a number of diseases, including leukemia. Consistent with this result, C1 proteins were significantly enriched in biological processes characterized by strong granulocytic cell activities in APL (e.g., leukocyte migration, neutrophil activation, and neutrophil degranulation; Fig.4), and C1 metabolites were involved in purine and pyrimidine metabolism, which were indicative of the active DNA synthesis activity of leukemic blasts (Fig. S3C and S3D, C1 community). The other two clinical parameters (creatinine in C8; FERRIT in C12) were commonly used as indicators of kidney malfunction and irregular body fluid ion concentrations, respectively (Fig. S3C and S3D). The centrality of these clinical parameters provided additional support for their clinical relevance and suggested additional plasma proteins or metabolites for the assessment of body homeostatic states.
To systematically explore the DX plasma correlation network, we analyzed GO/KEGG (Table S7) and metabolite set enrichment analysis terms (Table S8) significantly associated with community proteins and metabolites, respectively. This analysis indicated that plasma proteins implicated in leukocyte migration were present in several top communities, such as C1, C3–7, C9, and C10 (Fig.4 and S3A), whereas other immunity response–related GO terms were associated with individual communities, such as neutrophil-mediated immunity (C1), IFN-γ response (C5), and lymphocyte-mediated immunity (C9 and C10; Fig.4 and S3A). Furthermore, metabolites related to amino acid metabolisms were most well-connected with several top communities (e.g., C1–4, C6–8, C9, and C12; Fig.4 and S3B), while lipid metabolism was associated with C2, C4, C5, and C7 (Fig.4 and S3B).
3.4 PML/RARA might exert influence on the landscape of APL plasma
Though plasma components may be derived from a variety of tissues and cells in the body, a large number of studies have shown that tumor cells are major contributors to altered plasma factors in human cancer. For example, tumor cells may actively secrete or release signaling molecules to the plasma and thereby influence other cells [
42,
43]; alternatively, tumor cells may aggressively absorb metabolites from the microenvironment and fuel their aberrant metabolic activities [
44]. Thus, we hypothesized that the aberrant composition of the DX plasma was partially derived from APL blast cells. To support of this hypothesis, a comparison between the gene expression profiles of APL bone marrow samples and normal promyelocytes (PM, normal bone marrow promyelocytes sorted by flow cytometry corresponding to APL blasts) was performed [
19]. The plasma proteome of APL in the comparison between DX and HC suggested that both data sets were significantly associated (R = 0.28,
P < 0.00001; Fig.5). This analysis yielded 48 APL expression
high/plasma protein
high (Fig.5, upper-right quadrant) and 20 expression
low/plasma protein
low (Fig.5, lower-left quadrant) factors that were enriched in immunity-related processes. However, the former appeared to be preferentially associated with cellular immune responses, whereas the latter were mainly associated with innate and humoral immune responses (Fig.5). In addition, 32 APL expression
high/plasma protein
high factors and six expression
low/plasma protein
low factors were included in the top 12 communities in the DX plasma correlation network, suggesting them may be the critical regulators of the plasma composition (Fig. S4A; Table S9).
Given that aberrant gene expression programs of APL blasts are driven by the
trans-regulatory activity of PML/RARA [
16], we investigated whether some DX plasma proteome changes can be attributed to the protein products of PML/RARA-target genes in APL cells. An analysis pipeline was used to screen potential PML/RARA-target genes that (1) were differentially transcribed in APL (APL vs. PM by microarray data and APL vs. normal healthy BMMCs by RNA-seq data) [
18,
19], (2) showed aberrant levels of protein products in the comparison between DX plasma and HC or CR samples, and (3) possessed well-defined PML/RARA binding peaks near their promoters (nearest gene to PML/RARA binding sites) [
16] (Fig.5 and 5D). After these filtering steps, 12 proteins encoded by candidate PML/RARA-target genes remained, including seven with increased levels (FLT3, IL1RL1, IGFBP2, ADA, ITGA5, AZU1 and SELPLG) and five with decreased levels (CDKN1A, FGR, OLR1, MMP9 and CCL5) in the DX vs. CR comparison (Fig. S4B; Table S2).
Furthermore, we examined a previously reported PML/RARA ChIP-seq data set [
16] and found that 16.6% of proteins from the DX correlation network (97/583) were products of potential PML/RARA target genes and 80 of them were located in the top 12 communities (C1 to C12). Enrichment analysis with hypergeometric tests showed that these PML/RARA targets were mainly distributed in C1 (49/251;
P = 0.041) and C4 (8/30;
P = 0.045). PPI network analysis showed that these 80 proteins can interact with each other (Fig. S4C and S4D), indicating their possible involvement in APL perturbed disease network. Additional analysis of ChIP-seq data and gene expression data of the comparison between APL and other AML subtypes suggested that some of these targets were relatively specifically expressed in APL (Fig. S4A). Examples of APL-specific genes included
ITGA5 (Fig.5) and
FGR (Fig.5), whereas non-APL-specific genes included
ADA (Fig.5). Moreover, we confirmed the PML/RARA-dependent upregulation of ADA expression and activity in the NB4 and PR9 cell lines (Fig. S5A–S5C). Overall, these observations indicated that PML/RARA might be responsible, in part, for the aberrant regulation of the expression of some molecules within APL blasts, which in turn might affect the plasma composition.
Unlike plasma proteins, all metabolites are usually the building blocks or end products of metabolic pathways, and their levels are normally not under the direct control of transcription factors, such as PML/RARA. Moreover, a large variety of sources of plasma metabolites are available, which may be acquired through food consumption or may be absorbed or released from cells, and thus confound the assessment of their relationships with the physiologic states of particular cell types. Nevertheless, we noticed that APL-DX plasma contained significantly low levels of metabolites of purine, pyrimidine, taurine, and hypotaurine metabolism, which are normally involved in nucleic acid and amino acid synthesis and taken up by proliferating cells through circulation. We thus hypothesized that APL blasts might contribute, at least partially, to the reduced levels of some plasma metabolites. Consistent with this idea, a large number of upregulated genes in APL cells encode enzymes catalyzing the biosynthesis of nucleotides, amino acids, and fatty acids (Fig.6 and S6). Moreover, 13.6% (24/177) of these enzyme genes were previously described as the nearest genes to PML/RARA binding site (highlighted in red) [
16], including
IDH1, GLUL, NIT2, PCCA, ACADM, ASS1, CKB, and
SAT1, for carbohydrate and amino acids metabolism;
ACADM, ECI1, ACSL1, GLUL, GRHPR, HSP90AA1, CEL, GABARAPL1, and
PCBD1, for fatty acid metabolism; and
AK4, ADA, NT5M, ADCY9, ADK, PDE3A, and
POLD1, for nucleotide metabolism (Fig.6–6I). These results agreed with the assumption that PML/RARA drives the cellular machinery for nucleic acid and amino acid synthesis, results in the aggressive uptake of relevant metabolites, and reduces their levels in the plasma.
3.5 Features in the plasma composition post frontline therapy
The CR and HC correlation networks were sparse and simple, and few communities were found (Fig. S7A and S7B), suggesting APL disease perturbation features were effectively suppressed by therapy. Nevertheless, two aspects differed between CR and HC. First, the CR plasma contained a high representation of proteins that belonged to neither the secretome nor membrane proteins, which are two categories that usually compose the bulk of proteins detected in the plasma. A careful comparison between CR and HC plasma proteomes identified 364 PDPs, 96.2% of which (354/364) were present at higher levels in the CR plasma (Table S2; Fig.7). GO analysis indicated that they were proteins normally located within the lumens and granules of intracellular organelles, involved in a variety of biological functions, such as T cell activation, IFN-γ signaling, response to oxidative stress, and apoptosis (Fig.7; Table S10). These results were consistent with the findings that ATO can exert its antileukemia effect by raising oxidative stress and subsequently induce the apoptosis of APL cells [
45,
46] and treatment with ATRA/ATO significantly stimulates the interferon pathway and induces the apoptosis of APL cells [
47]. Furthermore, the relationship between ATRA and the immune system, especially IFN signaling, has been recently further explored, and
RIG-1 gene and RARA/STAT1 pathway were proposed to explain how ATRA enhances IFN target genes [
48]. In the plasma metabolites, 162 differential metabolites still existed in CR vs. HC comparison, the levels of 66% (107/162) of these metabolites increased in CR (Fig.7; Table S4), and the level of arachidonic acid metabolism notably increased (Fig.2; Table S5).
4 Discussion
Omics-based biomarkers arising at the genome, transcriptome, proteome, and metabolome levels have been widely used in disease diagnosis, treatment effect evaluation, disease stratification, risk assessment, and prognosis evaluation in leukemia settings [
15,
18,
47]. Despite that these studies provide an in-depth view of the disease states within leukemic blasts, the plasma environment supporting or accommodating the etiology and progression of leukemia remains poorly understood. Here we performed an integrated analysis of a series of clinical laboratory tests, proteomes, and metabolomes of plasma samples from a cohort of patients with APL at two critical clinical points. One at the time of diagnosis (DX) when the disease manifestation became apparent, and the other after frontline therapy when the first CR was achieved. Through multidimension correlation network analysis, we investigated bio-analytes with measurements significantly correlated at these two states and compared them with a similarly constructed correlation network characterizing HC plasma. The functional annotations of these networks shed light on how PML/RARA, likely by regulating the genes encoding signaling proteins and metabolic enzymes, might contribute to the establishment of the plasma composition of APL. Additionally, increased IFN-γ activities emerged from plasma proteomic analysis of the first CR stage, suggesting the activation of the immune system and a tumor-suppression microenvironment post-ATRA/ATO therapy. These findings expanded the current understanding of the pathophysiology and treatment responses of APL.
In reflection of the systems-level breakdown of the immune system in leukemia patients, a striking feature of the DX plasma correlation network was the appearance of many densely connected cellular and molecular bio-analytes that were independent of CR or HC states. Such connections can be simply interpreted as the consequences of the abnormal proliferation of APL blasts and the ensuing release or absorption of proteins and metabolites. However, our analysis indicated that most of these changes centered on a small group of “hub–node” factors, which were enriched with factors implicated in APL pathogenesis [
49–
52]. By incorporating publicly available data sets of gene expression profiles of APL cells [
18,
19], some “hub–nodes” factors were uncovered to be aberrantly expressed in APL cells at high levels. Importantly, genes encoding 12 plasma proteins were previously identified as
bona fide targets of PML/RARA, a fusion protein product of chromosomal translocation t(15;17) that causes APL [
53]. These observations suggested that PML/RARA might play dual roles in establishing the pathological states of APL: (i) it acts as a factor in the
trans-regulatory complex that drives leukemia blasts to autonomously establish an intracellular oncogenic gene expression program; (ii) activation or inhibition of some of its target genes might change protein (e.g., cytokines and chemokines) or metabolite products levels that can be secreted or leaked from the APL blasts to the blood circulation, allowing an influence to be exerted on surrounding cells (paracrine effect) or even leukemic cells themselves (autocrine effect) in bone marrow or plasma composition.
Another feature of the DX correlation network contains many small-molecule metabolites known for their involvement in nucleotide synthesis. Compared with CR and HC samples, the levels of metabolites in DX plasma were low, especially those fueling nucleic acid anabolism and energy consumption. The differential presence of these categories of metabolites was consistent with the expectation that active proliferating leukemia cells must rewire their metabolic programs. Thus, the APL blasts may be aggressively taking up nucleotide monomers to satisfy the anabolic demands of macromolecular biosynthesis (DNA synthesis) and the maintenance of redox homeostasis.
Though the majority of the correlation network features of DX (communities and hub nodes) disappeared in the plasma of the same patients at the first CR stage, the proteome and metabolome profiles of the latter were still different from those of the HC plasma samples. This result is perhaps not unexpected given that, at the first hematological CR, though normal hematopoiesis is restoring, minimal residual leukemia elements, especially some APL cells at the end stage of granulocytic differentiation, still need to be cleared off upon direct effect of ATRA/ATO from patients’ blood and bone marrow. Indeed, the time point of the first CR did not necessarily represent the same steady-state as that of HCs. The immune cells induced by the drugs performed a cytotoxic function against residual APL cells. Thus, the special patterns of correlating networks in the CR plasma composition, compared with those in HCs, might be indicative of a para-normal status that can feature the rebuilding of the physiologic hematopoietic system and the eradication of residual blasts by regaining normal immune functions. Further studies shall be directed at distinguishing these two scenarios. Additionally, further investigation is warranted to explore the potential impact of specific effects of ATRA/ATO during maintenance therapy. These considerations can provide a deeper understanding of the complex interplay between treatment regimens and alterations in the plasma proteome or metabolome, which may ultimately inform personalized treatment strategies for patients with acute promyelocytic leukemia.
Altogether, our study demonstrates that the multidimensional correlation network of the plasma proteomes and metabolomes, in conjunction with tissue- or cell-type specific gene expression profiles, facilitates the comprehensive analysis of the leukemia microenvironment and discovery of mechanistic linkages between factors at cellular and molecular levels. This analysis pipeline may be applied to the study of other types of leukemia.