|Year : 2021 | Volume : 17
| Issue : 3 | Page : 756-763
Multiregion sequencing and subclonal analysis reveal intratumoral heterogeneity in esophageal squamous cell carcinoma
Dongni Gao1, Zicheng Zhang2, Qiwei Yang1, Baosheng Li3
1 Department of Clinical Medicine, Cheeloo College of Medicine, Shandong University; Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong, China
2 Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong; Department of Radiation Oncology, Shenzhen Traditional Chinese Medicine Hospital, The Fourth Clinical Medical College of Guangzhou University of Chinese Medicine, Shenzhen, Guangdong, China
3 Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong, China
|Date of Submission||17-Feb-2021|
|Date of Acceptance||17-Mar-2021|
|Date of Web Publication||9-Jul-2021|
440 Jiyan Road, Huaiyin District, Jinan 250117, Shandong
Source of Support: None, Conflict of Interest: None
Purpose: The aim of this study was to investigate intratumoral genomic heterogeneity and subclonal structure of esophageal squamous cell carcinoma (ESCC).
Materials and Methods: Multiregion whole-exome sequencing was performed on 24 surgically acquired tumor samples from five untreated ESCC patients collected in 2019 to determine the heterogeneity of mutational landscape within tumors. Phylogenetic analysis and mutation process analysis were used to explore the distribution and dynamic changes of mutation spectrum, and subclone analysis was used to explore the subclonal composition and spatial structure of ESCC.
Results: An average of 60.2% of mutations were found to be heterogeneous. TP53 and NOTCH1 mutations were confirmed to be early events, and mutations unique to different tumor regions showed a pattern of branching evolution. A large proportion of mutations were associated with abnormal activity of the apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC) family, and significant differences in mutation types between trunk and branch variants were found. Subclonal structure exhibited spatial correspondence and spatial limitation, and close and distant clones showed different genomic features.
Conclusions: There is significant intratumoral genomic heterogeneity in the five ESCCs, and their subclonal structure is related to spatial locations.
Keywords: Esophageal squamous cell carcinoma, intratumoral heterogeneity, tumor evolution
|How to cite this article:|
Gao D, Zhang Z, Yang Q, Li B. Multiregion sequencing and subclonal analysis reveal intratumoral heterogeneity in esophageal squamous cell carcinoma. J Can Res Ther 2021;17:756-63
|How to cite this URL:|
Gao D, Zhang Z, Yang Q, Li B. Multiregion sequencing and subclonal analysis reveal intratumoral heterogeneity in esophageal squamous cell carcinoma. J Can Res Ther [serial online] 2021 [cited 2021 Aug 5];17:756-63. Available from: https://www.cancerjournal.net/text.asp?2021/17/3/756/321027
| > Introduction|| |
Esophageal carcinoma is a tumor with high morbidity and mortality worldwide, with a 5-year survival rate of about 20%. Unlike in western countries, squamous cell carcinoma is the most prevalent pathological type in China. For inoperable ESCCs, local radiotherapy and systemic therapy are the main treatment options, and intratumoral heterogeneity has a considerable impact on treatment outcomes. Previous studies have made substantial efforts to characterize the genomic features of ESCC, and a series of significantly mutated genes such as TP53 and ZNF750 have been identified. However, most of these studies are based on single-sample sequencing and ignore the potential impact of intratumoral heterogeneity.
In recent years, the intratumoral heterogeneity of ESCC has gradually gained attention. The work of Hao et al. assessed the heterogeneity of the ESCC genome and epigenome. The study of Chen et al. revealed the evolutionary relationship between esophageal cancer and intraepithelial neoplasia. However, the spatial distribution of tumor heterogeneity has not been interpreted in detail, and tumor subclonal composition has not been discussed.
To further investigate intratumoral heterogeneity, five ESCC tumors were sequenced in multiple regions, and heterogeneity was assessed from three perspectives (mutation profile, mutational process, and subclonal structure) in both spatial and temporal dimensions. Our study shows that the genomes of the five ESCCs differ significantly between tumor regions, thus providing theoretical support for further understanding of intratumoral heterogeneity in ESCC.
| > Materials and Methods|| |
Patient selection and sample preparation
Samples were collected between May and July 2019 at Shandong Cancer Hospital and Institute. The project and procedure were approved by the ethics committee of the hospital, and written informed consent was obtained from the patients. All of the patients were diagnosed with esophageal squamous cell carcinoma by endoscopic biopsy and underwent surgery. None of them had received any treatment before surgery. The pathological tumor-node-metastasis stage was determined according to the AJCC 8th edition. Samples were collected immediately after surgical resection. Four to five pieces of tissue with a volume of 16–25 mm³ were carefully obtained from separate regions of the tumor, and matching normal tissues were obtained from at least 5 cm away from the tumor. All of the samples were snap-frozen in liquid nitrogen and stored at −80°C, together with peripheral blood samples taken during surgery. Hematoxylin-eosin-stained slides of the samples were assessed independently by two senior pathologists to ensure sufficient tumor cell content.
Genomic DNA was extracted using a QIAamp DNA Mini Kit (QIAGEN); samples that passed the quality test were used for library construction. The sequencing library was constructed using the Agilent SureSelect Human All Exon V6 kit (Agilent Technologies) following the manufacturer's protocol. The posthybridization amplification product was quality checked and sequenced on Illumina HiSeq Xten instruments (2 × 150-bp paired-end sequencing).
Mutation calling, filtering, annotation, and driver gene identification
Reads containing sequencing adapters and low-quality reads were removed, and the remaining reads were aligned to the NCBI reference genome (hg38) using BWA (v0.7.17) with default parameters. SAMtools (v1.4) was used to remove amplification duplicates. GATK (v4.1) was used to perform local realignment and further qualification. The variation files were filtered and annotated with ANNOVAR (2019Oct24). Putative driver genes were derived by considering both the mutations documented in the Catalogue of Somatic Mutations in Cancer (COSMIC) cancer mutation census and the recurrent ESCC mutations reported in previous studies.
Construction of phylogenetic tree
All the mutations with variant allele frequency ≥2% were extracted from variation files and converted into a binary table; the R package phangorn (v2.5.5) was used to infer the phylogenetic trees using the parsimony ratchet method. E016D was removed for low purity. Branch lengths were determined according to the variation counts.
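The binary table described above can be sketched as follows. This is only an illustration in Python, not the authors' pipeline (which used the R package phangorn for tree inference); the region names, mutation identifiers, and VAF values are hypothetical, and the ubiquitous/shared/private labels follow the definitions given in the Results.

```python
from typing import Dict

VAF_THRESHOLD = 0.02  # a mutation is scored as present when VAF >= 2%

# Hypothetical input: VAF of each mutation in each tumor region.
example_vaf = {
    "mut1": {"A": 0.30, "B": 0.25, "C": 0.28},
    "mut2": {"A": 0.10, "B": 0.12, "C": 0.00},
    "mut3": {"A": 0.00, "B": 0.00, "C": 0.05},
    "mut4": {"A": 0.01, "B": 0.00, "C": 0.00},  # below threshold everywhere
}


def binary_table(vaf: Dict[str, Dict[str, float]],
                 threshold: float = VAF_THRESHOLD) -> Dict[str, Dict[str, int]]:
    """Convert per-region VAFs into a presence (1) / absence (0) table."""
    return {
        mut: {region: int(value >= threshold) for region, value in regions.items()}
        for mut, regions in vaf.items()
    }


def classify(row: Dict[str, int]) -> str:
    """Label a mutation by the number of tumor regions that carry it."""
    n_present = sum(row.values())
    if n_present == 0:
        return "absent"
    if n_present == len(row):
        return "ubiquitous"  # trunk mutation
    if n_present == 1:
        return "private"     # branch mutation, single region
    return "shared"          # branch mutation, several regions


def percent_heterogeneous(table: Dict[str, Dict[str, int]]) -> float:
    """Percentage of detected mutations that are not shared by all regions."""
    labels = [classify(row) for row in table.values()]
    detected = [label for label in labels if label != "absent"]
    heterogeneous = [label for label in detected if label != "ubiquitous"]
    return 100.0 * len(heterogeneous) / len(detected)


table = binary_table(example_vaf)
labels = {mut: classify(row) for mut, row in table.items()}
```

A table of this shape (mutations by regions, with an all-zero normal sample as the root) is the kind of input a parsimony method would take; the tree search itself is not reproduced here.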
Mutational signature analysis
The types of single base substitutions were extracted from variation files and converted into six categories. The R package Sigminer was applied to decompose the mutational signatures with the distribution of 96 substitution types and to make a comparison with the COSMIC documented single-base substitution (SBS) signatures (2020, v3.1).
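As a rough illustration of the six substitution categories and the 96 substitution types mentioned above, the sketch below collapses each substitution onto the pyrimidine strand, which is the usual convention for SBS catalogues. It is written in Python for illustration only (the study used the R package Sigminer), and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SIX_CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbs_class(ref: str, alt: str) -> str:
    """Collapse a substitution into one of the six pyrimidine-centred classes."""
    if ref in "AG":  # purine reference: report the change on the opposite strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def sbs96_context(trinucleotide: str, alt: str) -> str:
    """Return a 96-channel label such as 'T[C>T]A'.

    `trinucleotide` is the reference base with its 5' and 3' neighbours.
    """
    if trinucleotide[1] in "AG":
        trinucleotide, alt = revcomp(trinucleotide), COMPLEMENT[alt]
    five_prime, ref, three_prime = trinucleotide
    return f"{five_prime}[{ref}>{alt}]{three_prime}"


# Enumerate every possible substitution to confirm that exactly 96 channels arise.
ALL_CHANNELS = {
    sbs96_context(a + ref + b, alt)
    for a in "ACGT" for ref in "ACGT" for b in "ACGT"
    for alt in "ACGT" if alt != ref
}
```

For example, a G > A change in the context TGA is reported as `T[C>T]A`, a TpC-site C > T substitution of the kind attributed to APOBEC activity later in the text.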
Cancer cell fraction and clone phylogeny estimate
The cancer cell fraction of each mutation was calculated using ABSOLUTE by taking into account the copy numbers and sample purity. CloneFinder (v. 0.1.1) (https://github.com/gstecher/CloneFinderAPI) was used to infer subclonal composition and clone phylogeny of each patient by grouping mutations into clusters. The input cancer cell fractions (CCFs) were converted into the SNV read count table according to the instruction. Only SNVs on diploid heterozygous sites, and with the total read count ≥20, mutant read count >2, and variant frequency >0.02 were included to ensure the reliability of subclone estimation. Then the inferred subclonal phylogenetic tree and subclone frequency were visualized.
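The SNV inclusion criteria above can be written as a short filter. The sketch below is a hedged Python illustration: the `Snv` record and its fields are hypothetical, a total copy number of 2 is used as a stand-in for "diploid heterozygous sites", and the `approx_ccf` helper is a common simplification for such sites (CCF ≈ 2 × VAF / purity), not the probabilistic estimate that ABSOLUTE produces.

```python
from dataclasses import dataclass


@dataclass
class Snv:
    ref_reads: int
    alt_reads: int
    copy_number: int  # hypothetical field: total local copy number

    @property
    def total_reads(self) -> int:
        return self.ref_reads + self.alt_reads

    @property
    def vaf(self) -> float:
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def passes_filter(snv: Snv) -> bool:
    """Apply the thresholds stated in the text."""
    return (
        snv.copy_number == 2       # assumption: proxy for a diploid heterozygous site
        and snv.total_reads >= 20  # total read count >= 20
        and snv.alt_reads > 2      # mutant read count > 2
        and snv.vaf > 0.02         # variant frequency > 0.02
    )


def approx_ccf(snv: Snv, purity: float) -> float:
    """Simplified CCF for a heterozygous SNV at a diploid site, capped at 1."""
    return min(1.0, 2.0 * snv.vaf / purity)
```

Under these assumptions, an SNV with 20 mutant reads out of 100 in a sample of 50% purity passes the filter and has an approximate CCF of 0.8.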
Statistical analysis was performed by the R software (v4.0.3). Spearman's correlation test was used to analyze the correlation between mutant-allele tumor heterogeneity (MATH) and clinicopathological factors. Wilcoxon test was used to compare contributions of mutation signatures between branch and trunk mutations. Fisher's exact test was used to compare proportions of mutation types between branch and trunk mutations.
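For readers unfamiliar with MATH, the score is commonly defined as 100 × MAD / median of the variant allele frequencies, where MAD is the median absolute deviation scaled by 1.4826. The Python sketch below follows that general definition; how the VAFs of multiple regions were pooled in this study is not stated, so the input here is simply a hypothetical list of VAFs.

```python
from statistics import median
from typing import Sequence

MAD_SCALE = 1.4826  # makes the MAD comparable to a standard deviation under normality


def math_score(vafs: Sequence[float]) -> float:
    """Mutant-allele tumor heterogeneity: 100 * scaled MAD / median of the VAFs."""
    center = median(vafs)
    mad = MAD_SCALE * median(abs(v - center) for v in vafs)
    return 100.0 * mad / center


# Hypothetical VAFs: a wider spread gives a higher score.
narrow = [0.28, 0.30, 0.30, 0.31, 0.32]
wide = [0.10, 0.20, 0.30, 0.40, 0.50]
```

A tumor in which all mutations have similar VAFs scores low, while a broad VAF distribution, as expected when several subclones coexist, scores high.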
| > Results|| |
Multiregion sequencing reflects heterogeneity
To explore intratumoral mutational heterogeneity of ESCC, multiregion whole-exome sequencing was performed on 24 tumor samples from 5 ESCC patients, paired with matched normal tissues and peripheral blood samples collected during surgery. All of the patients were diagnosed with squamous cell carcinoma by endoscopic biopsy, confirmed by pathological analysis after surgery, and none of them had received any treatment before surgery. Details of the clinicopathological information are listed in [Supplementary Table 1].
The mean sequencing coverage depth was 205×. An average of 542 somatic mutations were detected per patient (range 252–873) [Figure 1] and [Supplementary Figure 1], and on average 60.2% of mutations were heterogeneous, that is, not shared by all tumor regions [Figure 2], indicating significant regional divergence. MATH values were also calculated to quantify the mutational heterogeneity of our multiregional samples [Supplementary Figure 2]. Studies of breast and head and neck cancers have indicated that a higher MATH is prognostic of worse overall survival, though this has not been emphasized in ESCC. The median multiregional MATH of the 5 patients was 70.2, much higher than the value derived from single samplings. A tendency toward higher MATH in patients with higher tumor grades was observed, though this was not statistically significant.
|Figure 1: Mutational landscapes in multiregion samples of esophageal squamous cell carcinoma patients. The top panel showed the proportions of six mutation types. Bar plot (middle) indicates the number of coding mutations per sample. The heatmap displays genes with coding mutations, ranked based on the mutation frequency. The bottom panel represents the clinical and pathological parameters of the patients. Bar plot (right), number of samples with mutation|
|Figure 2: Intratumoral heterogeneity of mutations and phylogenetic trees of esophageal squamous cell carcinoma patients. Venn diagram showed the number of shared and private mutations of multiple regions. Heatmap displays the presence (blue) or absence (gray) of somatic mutations in every tumor region. Phylogenetic trees were constructed from all of the mutations by phangorn, putative driver mutations were indicated on the tree. The blue, orange, and green edges represent ubiquitous, shared, and private mutations, respectively. Proportions of heterogenous mutations were indicated|
Phylogenetic analysis defines early and late events
To further characterize mutational heterogeneity between regions, phylogenetic trees were constructed based on the presence or absence of all the variants from different tumor regions, taking normal tissues as the root [Figure 2]. According to the graph theory of tumor phylogenetic trees, the trunk of the tree represents mutations that are ubiquitous in all regions, indicating early events in tumor evolution, while the branches represent mutations shared between some of the regions or private mutations unique to a single region. Such mutations were possibly acquired late and potentially have roles in promoting tumor progression. Branch lengths were proportional to the number of variants, and the nodes stand for the most recent common ancestors of descendant branches. Consistent with previous ESCC studies, the phylogenetic trees of all five patients had relatively long trunks, with an average truncal length of 221 mutations, which means that a large proportion of the mutations were acquired early. As indicated by Chen et al., driver mutations prevalent in ESCC, such as those in TP53 and ZNF750, can also be detected in squamous dysplasia.
According to the COSMIC cancer mutation census (v92) and previous genomic studies of ESCC, a total of 32 putative driver genes with different confidence levels were identified in our patients [Supplementary Table 2]. Nineteen of them were shared by ≥3 regions of the tumor, and 10 were single-region specific. No significant difference was observed between the truncal and branch distributions of oncogenes and tumor suppressor genes, as discussed in other studies, possibly due to the limited number of genes present.
Significantly mutated genes of ESCC such as TP53 and NOTCH1 were always on the trunk, while A1CF mutations were detected in individual regions of two patients (E022E and E023D, respectively) [Figure 2] and [Supplementary Figure 3]. Details of the mutations were also inspected, and mutations unique to different regions of the tumor often had different functions. For example, the ERCC3 mutation unique to E023A involves the DNA repair and mRNA processing pathways, while the EPAS1 mutation unique to E023D involves lipogenesis and may be associated with metabolic abnormalities, supporting a pattern of branched evolution.
Temporal shift of mutation patterns
To explore factors contributing to the genomic heterogeneity of ESCC, the mutational processes of all the mutations were analyzed. The counts of six types of base substitutions were derived from information of mutation types and were converted into 96 types of mutational context to make a comparison with COSMIC SBS signatures (2020, v3.1) [Figure 3]a, [Figure 3]b, [Figure 3]c. In addition, trunk and branch variations were examined separately to explore the dynamics of mutational processes during tumor evolution.
|Figure 3: Mutation processes of esophageal squamous cell carcinoma patients. (a) Trunk and branch proportions of six mutation types of the individual patient and all cases, asterisk represents significance of P values. (b) The counts of 96 contexts of mutation types of trunk and branch mutations. (c) Heatmap displays the hierarchical clusters of trunk and branch mutations of patients according to the enrichment of SBS signatures. (d) The change of SBS enrichment from trunk to branch, the P value of SBS90 was 0.012, enrichment change of SBS15 was not statistically significant. SBS = Single-base substitution|
For SBSs, the most prevalent type across patients was the C > T transition, followed by C > G, C > A, and T > A variations [Figure 3]a. C > T and C > G mutations at TpC sites are associated with mutagenesis mediated by the APOBEC family of cytidine deaminases, an abnormality common in most human cancers that leads to an increased DNA replication error rate. C > A transversions are associated with tobacco smoking. The proportions of all six substitution types changed significantly from trunk to branch (Fisher's exact test, P < 0.05 in all cases). Among them, the prevalence of C > T and C > A mutations decreased, in contrast to T > A mutations, suggesting that a shift in the leading mutagenic factors may have occurred during evolution.
Although individual decomposition of mutational context did not yield prominent signatures, analysis across samples did reveal higher contributions of SBS2 and SBS13 [Figure 3]c, especially in E018 and E023. Both signatures represent abnormal activity of the APOBEC family. Moreover, the contribution of SBS15, a signature correlated with defective DNA mismatch repair, was also highly ranked, as seen in both trunk and branch mutations of E016. A chemotherapy-associated signature, SBS90, was the only signature that changed significantly from trunk to branch, yet no reasonable explanation could be made [Figure 3]d and [Supplementary Table 3].
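The trunk-versus-branch comparison of one mutation type reduces to a 2 × 2 table (this type versus all other types, trunk versus branch). The sketch below implements a two-sided Fisher's exact test with the Python standard library only, as an illustration of the test named in the text; the study itself used R, and the counts shown are hypothetical, not taken from the paper.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    p_value = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = prob(x)
        if p <= p_observed * (1 + 1e-9):  # small tolerance for float ties
            p_value += p
    return min(1.0, p_value)


# Hypothetical counts: C>T vs. other substitutions on the trunk and on the branches.
trunk_ct, trunk_other = 120, 80
branch_ct, branch_other = 60, 140
p_ct = fisher_exact_two_sided(trunk_ct, trunk_other, branch_ct, branch_other)
```

With these made-up counts the C > T proportion falls from 60% on the trunk to 30% on the branches, and the resulting P value is far below 0.05.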
Subclonal structure and spatial limitation
Using the published algorithm CloneFinder, mutated genes across the samples were clustered according to their CCFs, and subclonal composition and clone phylogeny were inferred. An average of 6.2 subclones were decomposed from the tumors [Figure 4]a. Due to the limited number of cases, underlying factors affecting the number of subclones could not be confirmed. Nevertheless, the spatial locations of sampling corresponded well with the subclonal compositions, as samples with the same subclones were generally located in adjacent anatomical positions [Figure 4]b. For example, samples E018A, B, and D were obtained from the right margin of the ulcerated tumor, and Clone4 was detected in all of them, in contrast to E018E, which was obtained from the contralateral side; the same was true for E021. For patient E022, an isolated tissue fragment, E022E, which had spontaneously detached from the tumor and therefore lacked a record of anatomic location, was also sequenced. Subclonal analysis revealed a close relationship with E022A, indicating that it was likely shed from a position near E022A.
|Figure 4: Subclone composition and phylogeny of esophageal squamous cell carcinoma patients. (a) Subclones and clone phylogeny inferred by CloneFinder. The close and distant subclones were indicated in bold. (b) Visualization of subclonal distribution across tumor regions. (c) Summed frequency and regions of occurrence of subclones. (d) Difference of proportions of mutation types between close and distant subclones. (e) Difference of enrichment of SBS signatures between close and distant subclones. (f) GO analysis of genes shared between or private to close and distant subclones. Genes private to distant subclone were enriched in the ion channel-binding category (P adjusted = 0.041). SBS = Single-base substitution|
To further understand subclone structures on both spatial and temporal scales, the two most separated subclones were manually chosen from the clone phylogeny trees as the close and distant clones, respectively. Similar to the molecular phylogenetic trees constructed in the former section, the close clones tended to be more widely distributed, occupying 3.4 regions per patient on average, while the distant clones in three out of five patients were unique to a single region [Figure 4]c, indicating a pattern of spatial limitation as suggested by Yates et al. No obvious difference in subclone distribution was observed between central and peripheral sampling locations.
To characterize the underlying mechanisms of subclonal phylogeny of the five ESCCs, GO analysis was performed on coding genes shared between and private to close and distant clones [Figure 4]f. Genes unique to distant clones were enriched in the term of ion channel binding (adjusted P = 0.041). The mutational process was also analyzed to explore potential mechanisms driving the dynamic evolution from close to distant clones [Figure 4]d. Of the six mutation types assessed, the proportions of T > C and T > A mutations decreased significantly from close to distant clones, while C > G mutations increased [P values are indicated in [Figure 4]d]. Signature analysis revealed an increased proportion of SBS2 and SBS13 signatures [Figure 4]e, suggesting potential roles of the APOBEC family in driving the subclonal evolution of the five ESCCs.
| > Discussion|| |
Radiation therapy and chemotherapy are the foundational treatment measures for ESCC patients who are not suitable for surgery. However, the effect of intratumoral heterogeneity on nonsurgical treatment outcomes has always been a troubling problem, and our understanding of ESCC intratumoral heterogeneity is far from sufficient. For this reason, multiregion sequencing was performed on 24 tumor samples from 5 ESCC patients. Among the 542 somatic mutations detected per patient on average, 60.2% were heterogeneous, with a tendency toward a positive correlation between the degree of heterogeneity and tumor grade. Such percentages vary among studies and may be confounded by the quality of sequencing, yet they still suggest that large numbers of mutations might be missed by the single samplings used in most studies of the cancer genome.
Phylogenetic analysis provided a more intuitive perspective. Roles of driver mutations could be inferred directly from their positions on the tree. Despite the heterogeneity, a significant number of mutations were located on the trunk, possibly accumulated through tumorigenesis. Among the 32 putative drivers, 19 were shared by most of the tumor regions compared with 10 region-specific genes, though a larger sample size is needed to draw a convincing conclusion. While drivers such as TP53 and NOTCH1 were consistently located on the trunk, A1CF mutations were identified in individual regions from two patients, and whether this gene has an effect on ESCC progression remains to be determined. The graph theory of the phylogenetic tree also provides clues to patterns of evolution. In patients E018 and E023, private drivers of different regions involve distinct molecular pathways and thus affect different functional phenotypes. This is so-called branched evolution, a pattern that has been confirmed in a growing number of other cancers.
Mutational signatures can be decomposed from the types of nucleotide variation using an algorithm called nonnegative matrix factorization, from which the mutagenic origin can be inferred. The major mutation types on the trunk and branches of all patients were C > T and C > G, which are associated with APOBEC activity. Nevertheless, a dynamic change of mutation types from trunk to branch could still be observed, although this change cannot be fully explained from the perspective of mutational signatures.
Tumor subclone analysis is gradually gaining attention. Malignant cells in tumors basically exist as subclones, with a subset of tumor cells originating from the same most recent ancestor having relatively uniform genotypes and possibly similar phenotypes. Based on this hypothesis, a variety of algorithms have been designed to analyze the subclonal composition and evolution of tumors using sequencing data. Although the precision and richness of the variant information provided by sequencing are still critical factors affecting the conclusions, the subclonal structures and clone phylogeny of the 5 ESCCs were decomposed for preliminary exploration. More importantly, the potential relationship between subclonal composition, clone phylogeny, and spatial location was assessed for the first time in ESCC. In the current study, the subclonal composition inferred by the algorithm corresponds well to the spatial position. Samples with similar subclonal composition are often located in adjacent rather than distant regions, and the recently generated subclones tend to stay in fixed regions. This is important for cancers like ESCC that depend heavily on local treatment measures such as radiotherapy, as the spatial limitation of tumor subclones is helpful for making treatment plans according to the anatomic location.
Due to the limitation of technology and sample size, the current research has not been translated into significant clinical indications, and the functional interpretation of subclonal evolution also needs supplementary information from other omics. Even so, as a proof-of-principle study, our research takes a further step toward understanding the intratumoral heterogeneity of ESCC.
Financial support and sponsorship
- Taishan Scholar Construction Project, Grant/Award Number: ts20120505
- National Natural Science Foundation of China, Grant/Award Number: 81874224
- Academic promotion program of Shandong First Medical University, Grant/Award Number: 2019LJ004.
Conflicts of interest
There are no conflicts of interest.
| > References|| |
Zeng H, Zheng R, Zhang S, Zuo T, Xia C, Zou X, et al. Esophageal cancer statistics in China, 2011: Estimates based on 177 cancer registries. Thorac Cancer 2016;7:232-7.
Zeng H, Zheng R, Guo Y, Zhang S, Zou X, Wang N, et al. Cancer survival in China, 2003-2005: A population-based study. Int J Cancer 2015;136:1921-30.
Gao YB, Chen ZL, Li JG, Hu XD, Shi XJ, Sun ZM, et al. Genetic landscape of esophageal squamous cell carcinoma. Nat Genet 2014;46:1097-102.
Cheng C, Zhou Y, Li H, Xiong T, Li S, Bi Y, et al. Whole-genome sequencing reveals diverse models of structural variations in esophageal squamous cell carcinoma. Am J Hum Genet 2016;98:256-74.
Hao JJ, Lin DC, Dinh HQ, Mayakonda A, Jiang YY, Chang C, et al. Spatial intratumoral heterogeneity and temporal clonal evolution in esophageal squamous cell carcinoma. Nat Genet 2016;48:1500-7.
Chen XX, Zhong Q, Liu Y, Yan SM, Chen ZH, Jin SZ, et al. Genomic comparison of esophageal squamous cell carcinoma and its precursor lesions by multi-region whole-exome sequencing. Nat Commun 2017;8:524.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009;25:1754-60.
Schliep KP. Phangorn: Phylogenetic analysis in R. Bioinformatics 2011;27:592-3.
Wang S, Li H, Song M, Tao Z, Wu T, He Z, et al. Copy number signature analysis tool and its application in prostate cancer reveals distinct mutational processes and clinical outcomes. PLoS Genet 2021;17:1-23.
Carter SL, Cibulskis K, Helman E, McKenna A, Shen H, Zack T, et al. Absolute quantification of somatic DNA alterations in human cancer. Nat Biotechnol 2012;30:413-21.
Mroz EA, Patel KB, Rocco JW. Intratumor heterogeneity could inform the use and type of postoperative adjuvant therapy in patients with head and neck squamous cell carcinoma. Cancer 2020;126:1895-904.
McDonald KA, Kawaguchi T, Qi Q, Peng X, Asaoka M, Young J, et al. Tumor heterogeneity correlates with less immune response and worse survival in breast cancer patients. Ann Surg Oncol 2019;26:2191-9.
Swanton C, McGranahan N, Starrett GJ, Harris RS. APOBEC enzymes: Mutagenic fuel for cancer evolution and heterogeneity. Cancer Discov 2015;5:704-12.
Pfeifer GP, Hainaut P. On the origin of G → T transversions in lung cancer. Mutat Res 2003;526:39-43.
Miura S, Gomez K, Murillo O, Huuki LA, Vu T, Buturla T, et al. Predicting clone genotypes from tumor bulk sequencing of multiple samples. Bioinformatics 2018;34:4017-26.
Davis A, Gao R, Navin N. Tumor evolution: Linear, branching, neutral or punctuated? Biochim Biophys Acta Rev Cancer 2017;1867:151-61.
Yates LR, Gerstung M, Knappskog S, Desmedt C, Gundem G, Van Loo P, et al. Subclonal diversification of primary breast cancer revealed by multiregion sequencing. Nat Med 2015;21:751-9.
Miura S, Vu T, Deng J, Buturla T, Oladeinde O, Choi J, et al. Power and pitfalls of computational methods for inferring clone phylogenies and mutation orders from bulk sequencing data. Sci Rep 2020;10:3498.