Early Detection Of Lung Cancer
By Characterizing Circulating Rare Cells
Using Peripheral Blood Liquid Biopsy
by
Ji Youn Seo
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
August 2024
Copyright 2024 Ji Youn Seo
Acknowledgements
First and foremost, I thank God for His guidance and blessings throughout this journey. His love, strength
and wisdom have supported me every day. I am grateful for His presence in all moments.
My deepest gratitude to Professors Peter Kuhn and James Hicks for providing an environment and
opportunities for valuable research to improve patients' lives. Thank you for your encouragement and
guidance through this challenging journey.
To the CSI-Cancer team and all the mentors, this whole research would not have been possible without you.
Working as a team toward the same goal has been powerful and meaningful. I am grateful for all your
support, guidance and contributions.
To all the collaborators, I would like to thank you for all the shared knowledge and for teaching me the
importance of communication. Your contributions have enriched my research experience immensely.
I am grateful to my committee professors for their invaluable feedback and guidance, helping me organize
and progress in this journey as a scientist.
I am sincerely grateful to Thuy Thanh Truong and the Schlegel Family for providing financial
support to enable this research.
To my family and friends, thank you for your endless love and encouragement. Your support and
understanding have been crucial in finding the meaning and value in my life and this journey.
Finally, to all the patients who contributed to this research, you are the reason and purpose of this thesis.
Without you, this thesis would have no meaning. I deeply appreciate your contribution and hope this
research contributes to solving the challenges of lung cancer and improving patients' lives.
TABLE OF CONTENTS
Acknowledgements
List of Tables
List of Figures
Abstract
Introduction
Chapter 1: Characterization of Circulating Rare Cells in Small Cell Lung Cancer
1.1. Introduction
1.2. Materials and Methods
1.2.1. Liquid Biopsy Samples
1.2.2. Immunofluorescence Staining and Imaging
1.2.3. Detection and Classification of Circulating Rare Cells
1.2.4. Patient-level Classification Modeling
1.2.5. Whole Genome Single Cell Copy Number Alteration
1.2.6. Single-Cell Alteration Classification Model
1.2.7. Statistical Analysis
1.3. Results
1.3.1. Patient-level Classification Modeling
1.3.2. Characterization of Circulating Rare Cells
1.3.3. Genome-Wide Single-Cell Copy Number Analysis
1.3.4. Single-Cell Alteration Classification Model
1.4. Discussion
Chapter 2: Characterization of Circulating Rare Cells in Non Small Cell Lung Cancer
2.1. Introduction
2.2. Materials and Methods
2.2.1. Liquid Biopsy Samples
2.2.2. Immunofluorescence Staining and Imaging
2.2.3. Detection and Classification of Circulating Rare Events
2.2.4. Whole Genome Single Cell Copy Number Alteration
2.2.5. Targeted Proteomics By Imaging Mass Cytometry
2.2.6. Statistical Analysis
2.3. Results
2.3.1. Characterization of CTCs in NSCLC using HDSCA3.0
2.3.2. Genome-Wide Single-Cell Copy Number Analysis
2.3.3. Proteomic Characterization of Circulating Rare Cells
2.4. Discussion
Chapter 3: Utilization of Circulating Rare Cells in Lung Cancer Screening
3.1. Introduction
3.2. Materials and Methods
3.2.1. Liquid Biopsy Samples
3.2.2. Patient-Level Classification Modeling
3.2.3. Statistical Analysis
3.3. Results
3.3.1. Characterization of CTCs in Lung Cancer Patients, Benign and High-risk Individuals Using HDSCA3.0
3.3.2. Case Study of Early Lung Cancer Detection
3.3.3. Patient-level Classification Model
3.4. Discussion
Overall Discussion
Supplementary Information
Bibliography
List of Tables
Table 1. Patient collection information of SCLC
Table 2. Patient collection information of NSCLC, SCLC and High Risk Individuals
Table S1. Feature naming conventions in EBImage
List of Figures
Figure 1. HDSCA3.0 Workflow
Figure 2. Patient-level classification model results using HDSCA3.0 liquid biopsy data
Figure 3. CRCs identified in the PB from SCLC patient samples using HDSCA3.0 with the Landscape assay
Figure 4. Detected cells from UM-001 using the EpCAM assay
Figure 5. Single-cell CNA profiling of rare cells detected in SCLC PB samples
Figure 6. Single-cell alteration classification model performance
Figure 7. Circulating rare events identified in the PB from NSCLC patients
Figure 8. Circulating rare events identified in the PB from NSCLC patients by HDSCA1.0 and HDSCA3.0
Figure 9. Circulating rare events identified in the PB from NSCLC patients, benign individuals and ND
Figure 10. Proteomic analysis of CRCs in NSCLC
Figure 11. Circulating rare events identified in the PB from lung cancer patients, benign and high-risk individuals
Figure 12. Circulating rare events identified in the PB from NSCLC patients and ND
Figure 13. A case study of a NSCLC patient who was initially at high risk without cancer but diagnosed a year later after the screening
Figure 14. Patient-level classification model results using enumeration data of channel-based rare cell groups from lung cancer patients, high-risk individuals and NDs
Figure 15. Patient-level classification model results using comprehensive phenotypic data of CRCs from lung cancer patients, high-risk individuals and NDs
Figure S1. Representative cell images of the top important clusters…
Figure S2. Circulating rare cells identified in the peripheral blood from SCLC patient samples using HDSCA3.0 with the Landscape assay
Figure S3. Single cell CNA profiling of rare cells detected in PB samples from 14 SCLC patients
Figure S4. Feature importance of the features for the random forest model
Figure S5. Single cell CNA profiling of rare cells detected in PB samples from 5 NSCLC patients
Figure S6. Feature importance of the features for the patient-level model classifying lung cancer patients and NDs
Figure S7. Feature importance of the features for the patient-level model classifying high risk individuals and NDs
Abstract
Lung cancer, the second most common cancer in men and women and the leading cause of cancer
mortality in the United States, has a significantly higher survival rate when detected early. Current
screening methods, such as low-dose computed tomography (LDCT), have limitations including high
false-positive rates and overdiagnosis. Liquid biopsy through peripheral blood (PB) analysis presents a
promising complementary approach for early detection by identifying biomarkers, including circulating
rare cells (CRCs) such as circulating tumor cells (CTCs). This study employs the high-definition
single-cell assay (HDSCA) workflow, an enrichment-independent method for improving patient outcomes
through the early detection and comprehensive analysis of CRCs. The third-generation HDSCA
incorporates advanced immunofluorescence staining and an unsupervised rare event detection algorithm
to capture a comprehensive population of CRCs. The results indicate that HDSCA3.0 successfully
identifies a diverse range of circulating rare events, providing valuable insights into the heterogeneity of
lung cancer. This thesis characterizes the phenotypic heterogeneity of CRCs in
SCLC and NSCLC patients and evaluates the feasibility of HDSCA3.0 for lung cancer screening in
high-risk individuals, demonstrating that this approach can enhance diagnostic precision and potentially improve patient outcomes.
Introduction
Lung cancer is the second most common cancer and the leading cause of cancer mortality,
accounting for almost 25% of all cancer-related deaths in the United States [1]. Lung cancer is
histologically divided into two main types: small cell lung cancer (SCLC) and non-small cell lung cancer
(NSCLC). SCLC comprises about 14% of lung cancer diagnoses while NSCLC comprises approximately
80% [2]. The survival rate is drastically higher when lung cancers are detected early at the localized stage
(63% five-year survival rate), compared to when detected at a metastatic stage (8% five-year survival
rate) [3]. However, lung cancer symptoms typically do not appear until the disease has reached a
metastatic stage, making early detection challenging and significantly impacting survival rates [4].
Therefore, annual lung cancer screening by low-dose computed tomography (LDCT) scan has been
recommended by the U.S. Preventive Services Task Force (USPSTF) for people who are at high risk of
developing lung cancer to detect the disease earlier [5]. High-risk individuals are defined as those who
have a smoking history of 20 pack-years or more, who currently smoke or have quit within the past 15 years, and who
are between 50 and 80 years old.
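The eligibility criteria above can be written as a simple rule. The following is a minimal sketch in Python, assuming that pack-years are computed as the average number of packs smoked per day multiplied by the number of years smoked, and that the age limits and the 15-year quit window are inclusive. The class and function names are hypothetical and serve only as an illustration; they are not part of the HDSCA workflow or of any screening software.

```python
from dataclasses import dataclass


@dataclass
class SmokingHistory:
    """Hypothetical record of the fields needed for the eligibility rule."""
    age_years: int
    packs_per_day: float           # average packs smoked per day
    years_smoked: float            # total number of years of smoking
    currently_smokes: bool
    years_since_quit: float = 0.0  # ignored when currently_smokes is True


def pack_years(history: SmokingHistory) -> float:
    """Pack-years = average packs per day multiplied by years smoked."""
    return history.packs_per_day * history.years_smoked


def is_high_risk(history: SmokingHistory) -> bool:
    """Apply the three criteria quoted in the text (bounds assumed inclusive)."""
    in_age_range = 50 <= history.age_years <= 80
    enough_exposure = pack_years(history) >= 20
    recent_smoker = history.currently_smokes or history.years_since_quit <= 15
    return in_age_range and enough_exposure and recent_smoker
```

For example, under these assumptions a 65-year-old who smoked one pack per day for 30 years and quit 10 years ago meets all three criteria, whereas the same person who quit 20 years ago does not.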
However, it is important to note the risks associated with LDCT. One of the biggest concerns with
LDCT is its high false-positive rate, which can lead to unnecessary additional tests and invasive
procedures. According to studies in the US, false-positive rates for LDCT screening range from 9.6% to
28.9% for first-ever CT scans [6]. The National Lung Screening Trial (NLST) reported that 26.3%
of all LDCT screening tests yielded false-positive results and that 2% of those led to
subsequent invasive procedures after further diagnostic methods, including PET scans [7]. Furthermore,
overdiagnosis is another risk of screening. Overdiagnosis refers to the detection of cancers that
progress so slowly that they would not cause symptoms or harm during the individual's lifetime.
A recent study showed that 49% of cancers detected by screening may be overdiagnosed (95%
CI 11%–87%), leading to unnecessary treatments such as surgery, which can be distressing and carry
significant risks [8]. Therefore, there is a need for a complementary approach to conventional
radiological screening that can provide more specific evidence for a cancer diagnosis and malignancy.
Liquid biopsy is minimally invasive and is known to be able to detect clinically meaningful
biomarkers for early detection of cancer [11]-[15]. Peripheral blood (PB) is the most commonly tested
liquid biopsy due to its high sensitivity in diagnosing cancer and revealing tumor characteristics.
Sensitivity is crucial because it reduces the chance of missing cancer cases, helping to ensure early detection and
prompt treatment [13], [14], [15]. Therefore, liquid biopsy is often discussed as a complementary strategy
to current screening tests, with some methods such as cell-free DNA (cfDNA) already clinically validated for
lung cancer. PB samples are commonly used to detect cancer through various analytes, each serving a
specific role in cancer detection and characterization. CTCs are intact cells that originate from primary or
metastatic tumors, detected in the bloodstream, providing crucial insights into cancer progression and
metastasis. CTCs serve as valuable diagnostic and prognostic markers due to their ability to inform on the
status and evolution of the disease [16], [17], [18]. Circulating tumor DNA (ctDNA) and cfDNA are
fragments of DNA shed from tumors into the bloodstream, offering important prognostic and predictive
information on cancer [19]. While cfDNA has been widely recognized as a valuable biomarker for lung
cancer detection, it has certain limitations. The concentration of cfDNA can be extremely low, especially
in early-stage cancers, leading to potential false negatives. Additionally, cfDNA primarily provides
genetic information, lacking the ability to capture the phenotypic and functional characteristics of cancer
cells. This is where the analysis of CRCs, including CTCs, becomes critical. Unlike ctDNA and cfDNA,
CTC analysis allows for comprehensive studies of whole cells, enabling detailed molecular profiling
through DNA, RNA, and protein analysis, which is essential for understanding cancer biology and
developing personalized treatment strategies [19], [20].
The first U.S. Food and Drug Administration (FDA)-approved technology for CTC detection in
metastatic breast, colorectal, and prostate cancer is the CellSearch platform, which uses an enrichment
approach based on magnetic beads coated with epithelial cell-adhesion molecule (EpCAM) antibodies.
However, one significant hurdle for CTC analysis in NSCLC, the most common type of lung cancer, is
the low detection rate or low prevalence of CTCs in these patients [8], [12], [13]. This challenge has
prevented the clinical adoption of CTC analysis for NSCLC. Another major drawback of the
EpCAM-based approach is its inability to detect CTC subpopulations with low or no EpCAM expression,
including cells undergoing epithelial-to-mesenchymal transition (EMT) or mesenchymal CTCs. These
CTC subpopulations are crucial, as they contribute to tumor progression, metastasis,
and intercellular communication [21]. Therefore, EpCAM alone is insufficient to capture the full diversity
of CTCs, which exhibit significant plasticity and phenotypic alterations [22], [23], [24]. Several
studies have applied EpCAM-independent enrichment approaches, such as the size-based
isolation technique (ISET), which indicated the presence of EpCAM-negative CTC
subpopulations. However, this methodology still has limitations, including low sensitivity, a lack of tumor
specificity, and the loss of tumor cells or extracellular vesicles, which play significant roles
in tumor progression, metastasis, and intercellular communication by carrying tumor-specific biomarkers
[16]. Given these limitations, there is a critical need for non-enrichment approaches that offer higher
sensitivity and specificity to isolate and characterize CTCs in lung cancer. Such advancements would
enhance our ability to understand the clinical relevance of CTCs as diagnostic and prognostic biomarkers,
ultimately improving early detection and treatment strategies for lung cancer.
This study applies the high-definition single-cell assay (HDSCA) workflow which does not rely
on an enrichment strategy for isolating rare cells in PB. Instead, it employs a non-enrichment approach
and advanced molecular profiling techniques to analyze and characterize circulating rare events within a
sample. We have previously identified CTCs in prostate, breast, colorectal, pancreatic, and lung cancer
patients applying HDSCA [25], [26], [27], [28]. The commercialized version of the HDSCA has
previously demonstrated its clinical utility in advanced prostate and breast cancer [29]. Expanding upon
the findings of these studies, we have introduced the third-generation HDSCA (HDSCA3.0) with an
advanced implementation of both an immunofluorescence staining assay and the rare event detection
algorithm to observe a broader spectrum of circulating rare events. To enhance our detection capabilities,
we have added vimentin and CD31 to the previously validated HDSCA1.0 panel, which includes DAPI,
Cytokeratin (CK), and CD45. This allows us to identify cells with endothelial or mesenchymal
characteristics in addition to epithelial and immune cells. Furthermore, the HDSCA3.0 workflow employs
an advanced detection approach with an unsupervised rare event detection algorithm, capable of detecting
every nucleated cell as well as oncosomes, a type of large extracellular vesicle released by cancer cells.
Taken together, the HDSCA3.0 workflow may provide the opportunity to capture the heterogeneity of
circulating rare events, including CTCs, and to present a more comprehensive landscape of
these events in lung cancer. This is crucial for early detection, as understanding the molecular and cellular
diversity of the tumor can lead to more accurate diagnosis and better-informed treatment decisions.
In this thesis, I aimed to identify and characterize circulating rare events in the blood of patients
diagnosed with lung cancer using the HDSCA3.0 workflow. Chapter 1 demonstrates the phenotypic and
genotypic heterogeneity of CRCs in SCLC patients. Chapter 2 exhibits the phenotypic, genotypic, and
proteomic heterogeneity of circulating rare events in NSCLC patients. In Chapter 3, we demonstrate the
feasibility of utilizing the HDSCA3.0 workflow for lung cancer screening. The circulating rare events
detected in individuals at high risk of developing lung cancer were investigated. Along with the findings
from Chapters 1 and 2, patient-level classifiers were developed to stratify lung cancer patients from the
high-risk individuals and normal blood donors (NDs) with no known pathology. This series of studies
explores the potential of a liquid biopsy approach to complement current screening
methods in the early detection of lung cancer.
Chapter 1: Characterization of Circulating Rare Cells in Small Cell Lung Cancer
This study is published in Seo, J., Kumar, M., Mason, J. et al. Plasticity of circulating tumor cells in small
cell lung cancer. Sci Rep 13, 11775 (2023). https://doi.org/10.1038/s41598-023-38881-5
Scientific Reports has granted reprint approval for use in this dissertation.
1.1. Introduction
Small cell lung cancer (SCLC) is an aggressive and rapidly metastasizing malignancy with an
average 5-year survival rate of 7% [10]. SCLC comprises about 10-15% of all lung cancers with the
remainder classified as non-small cell lung cancer (NSCLC) [10]. Despite affecting a smaller fraction of
patients, the survival rates of patients with SCLC are significantly lower than those with NSCLC. For
SCLC patients with limited disease, the 5-year survival rate is 20%, while the 5-year survival rate for
those with extensive SCLC is less than 1%. Given the large decrease in survival rate for extensive disease
compared to limited disease, it is critical to develop diagnostic and prognostic tools to help detect and
monitor the disease at its earliest stage to make treatment most effective.
Despite the aggressive nature of SCLC, the standard of care for first- and second-line therapies
has not appreciably changed for nearly four decades [11]. While SCLC initially responds to chemotherapy
and radiation therapy, there is a rapid emergence of relapse and drug resistance leading to short survival
times in the majority of patients [12], [13], [15]. Novel treatment approaches have shown promise,
especially with immune checkpoint inhibitors combined with platinum and etoposide [12]. However,
robust and sustained responses to those regimens have not yet been demonstrated. This supports the need for
new approaches to better monitor treatment efficacy and to better characterize the disease for improved
management of metastatic SCLC.
Recent studies in SCLC focus on the molecular characterization of the tumor landscape,
highlighting tumor heterogeneity and cancer cell plasticity in order to identify novel biomarkers as
therapeutic targets [6], [8], [15], [30]. Phenotype-determined cellular plasticity is an important
phenomenon in cancer progression, emerging as a contributor to therapy evasion [2], [8], [14]. Cellular
plasticity allows a cancer cell to change phenotype without additional genetic mutations, which may be
independent of therapeutic pressure. Studies in various cancer types have shown that a neoplastic cell can
hijack developmental processes as a way to adapt to environmental stressors [16], [31].
Epithelial-to-mesenchymal transition (EMT) is one of the best-known examples of
plasticity and consists of both morphological and molecular changes. Vascular mimicry (VM) is
another example where cancer cells trans-differentiate and acquire endothelial cell behavior. We have
previously shown that VM cells represented within circulating tumor cells (CTCs) in SCLC enable de
novo generation of vascular networks which could contribute to dissemination and metastasis [32].
Furthermore, cellular plasticity allows the conversion of cells between four defined subtypes of SCLC
cells with distinct therapeutic vulnerabilities previously classified based on the differential expression of
four biomarkers: ASCL1, NEUROD1, POU2F3, and YAP1 [33], [34], [35]. The ability to profile a tumor
at this deep molecular level will deliver more efficient and personalized treatment options for this
devastating cancer.
One major barrier to understanding SCLC has been the limited access to tissue samples for a
comprehensive analysis of the disease. This is because SCLC patients rarely undergo surgical resection,
and even then only limited numbers of cells are available [36]. A liquid biopsy approach can provide a
minimally invasive route to repeatedly detect clinically relevant analytes [36], [37], [38]. CTCs detected
in the liquid biopsy of SCLC patients have been confirmed as a prognostic biomarker with the potential to
improve therapeutic strategies [32], [39], [40]. The liquid biopsy can be used to characterize the disease at
the single-cell level and resolve the limitations of tissue biopsy and allow for routine, non-invasive
sampling.
In this study, we apply a third-generation advanced direct imaging platform (high-definition
single-cell assay; HDSCA3.0) to analyze the liquid biopsy [27] and identify the phenotypic and genotypic
heterogeneity of CRCs in SCLC patients. We investigated various cell groups that are differentially
observed in SCLC patients allowing for stratification from non-cancerous donors (NDs). Single-cell
genomic analysis revealed the clonal populations and confirmed the SCLC cellular plasticity of CTCs
detected by the liquid biopsy. The data presented provide the molecular characterization of a wide
spectrum of CTC subtypes detected in SCLC patient samples.
1.2. Materials and Methods
1.2.1. Liquid Biopsy Samples
Patients with histological or cytological confirmation of chemotherapy-naive SCLC were
recruited and consented at The Christie NHS Foundation Trust according to an ethically approved
protocol (NHS Northwest 9 Research Ethical Committee). Informed written consent was obtained from
all subjects. Experimental protocols were approved by the Institutional Review Board of the Clinical and
Experimental Pharmacology Laboratory at the Paterson Institute for Cancer Research in Manchester. All
methods were conducted in compliance with the guidelines of the International Conference on
Harmonisation Good Clinical Practice (ICH Harmonised Tripartite Guideline E6: Note for Guidance on
Good Clinical Practice (CPMP/ICH/135/95) Step 5), the Declaration of Helsinki, and in accordance with
applicable regulations on the use of human tissue for research. Peripheral blood samples of up to 10 ml
were collected at the entry to the study in blood collection tubes (Cell-free DNA, Streck, La Vista, NE,
USA) and processed by the Convergent Science Institute in Cancer (CSI-Cancer) at the University of
Southern California within 48 h as previously described [41]. In brief, after red blood cell lysis, nucleated
cells were attached as a monolayer on custom-made glass slides (Marienfeld, Lauda, Germany) and
cryopreserved until analysis. Peripheral blood samples from 10 NDs with no known pathology were
collected from Scripps Research Institute and processed according to standard operating procedures.
1.2.2. Immunofluorescence Staining and Imaging
Immunofluorescence staining was performed with the use of an IntelliPATH FLX™ autostainer
(Biocare Medical LLC, Irvine, CA, USA) in batches of 50 slides of approximately 3 million nucleated
cells as previously described [27]. Two assays were utilized in this study and are described below.
Following staining, slides were imaged using a custom-made fluorescent scanning microscope.
Landscape assay [27]
Two slides from each patient were thawed and stained using IntelliPATH FLX™ autostainer
(Biocare Medical LLC, Irvine, CA, USA). Fixation was conducted for 20 minutes with 2%
neutral buffered formalin solution (VWR, San Dimas, CA), followed by blocking nonspecific
binding with 10% goat serum (Millipore, Billerica, MA) for 20 minutes. The slides were then
incubated with a conjugate of a mouse anti-human CD31 Alexa Fluor 647 monoclonal antibody
(mAb) (clone: WM59, MCA1738A647, BioRad, Hercules, CA) and preincubated with goat
anti-mouse IgG monoclonal fragment antigen binding (Fab) fragments (115-007-003, Jackson
ImmunoResearch, West Grove, PA) for 4 hours. Subsequently, the cells were permeabilized with
methanol and incubated with an Ab cocktail of the following mAbs: mouse anti-human
pan-cytokeratin (PanCK) mAbs of CK 1,4,5,6,8,10,13,18 (clones: C-11, PCK-26, CY-90,
KS-1A3, M20, A53-B/A2, C2562, Sigma, St. Louis, MO) and CK 19 (clone: RCK108, GA61561-2,
Dako, Carpinteria, CA) to identify epithelial cells; mouse anti-human CD45 Alexa Fluor®
647 mAb (clone: F10-89-4, MCA87A647, AbD Serotec, Raleigh, NC) marking endothelial and
immune cells; and rabbit anti-human vimentin (Vim) mAb (clone: D21H3, 9854BC, Cell
Signaling, Danvers, MA) to identify mesenchymal cells. Slides were subsequently incubated
with Alexa Fluor® 555 goat anti-mouse IgG1 secondary antibody (A21127, Invitrogen, Carlsbad,
CA) and counterstained with 4′,6-diamidino-2-phenylindole (DAPI; D1306, Thermo Fisher
Scientific). Slides were then mounted with a glycerol-based aqueous mounting medium before
adding coverslips to maintain cell integrity for further downstream analysis.
EpCAM-targeted assay
Two slides from patient UM-001 were stained with PanCK and CD45 antibodies as described
above with DAPI [32] complemented with a monoclonal EpCAM antibody (1:250, 324202,
Biolegend, San Diego, CA; Alexa Fluor® 488 goat anti-mouse IgG2b secondary antibody, 1:500,
A21141, Invitrogen, Carlsbad, CA) at Epic Sciences. CTCs were identified with standard image
analysis protocols [26], [27], [41], [42].
1.2.3. Detection and Classification of CRCs
Detection of rare events from the imaging data set is conducted using an unsupervised clustering
algorithm that clusters rare cells using extracted quantitative morphologic features, as previously reported
[27]. The CRCs detected were further classified into eight channel-based cell classifications defined by
the fluorescence signal intensities and distribution of four different channel markers. For the Landscape
assay, these groups include CK-positive only CTCs, CK|Vim-positive CTCs, CK|(CD45/CD31)-positive
cells, CK|Vim|(CD45/CD31)-positive cells, DAPI-positive only cells, (CD45/CD31)-positive only cells,
Vim-positive only cells, and Vim|(CD45/CD31)-positive cells. For the EpCAM-targeted assay, these
groups include CK-positive only CTCs, CK|EpCAM-positive CTCs, CK|CD45-positive cells,
EpCAM|CD45-positive cells, CK|EpCAM|CD45-positive cells, DAPI-positive only cells, CD45-positive
only cells, and EpCAM-positive only cells. Rare cell channel-based classification was validated by a
second trained analyst. The frequency of rare cells in each classification was reported as the
abundance of rare cells per mL. This was calculated from the total number of nucleated cells on the
two slides and the total white blood cell count per mL of the received sample.
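The per-mL conversion described above can be sketched as follows. This is a minimal illustration under one assumed reading of the text: the blood volume represented by the analyzed slides is taken to be the number of nucleated cells on the slides divided by the white blood cell count per mL. The function name and the example numbers are hypothetical.

```python
def rare_cells_per_ml(rare_cell_count, nucleated_cells_on_slides, wbc_per_ml):
    """Convert a raw rare-cell count into cells per mL of blood.

    Assumed reading of the text: the blood volume represented by the analyzed
    slides equals (nucleated cells on the slides) / (WBC count per mL).
    """
    if nucleated_cells_on_slides <= 0 or wbc_per_ml <= 0:
        raise ValueError("cell counts must be positive")
    equivalent_volume_ml = nucleated_cells_on_slides / wbc_per_ml
    return rare_cell_count / equivalent_volume_ml


# Hypothetical sample: 12 CK CTCs found on two slides holding 4.5 million
# nucleated cells, from blood with 6 million white blood cells per mL.
example = rare_cells_per_ml(12, 4_500_000, 6_000_000)
```

In this hypothetical case the two slides represent 0.75 mL of blood, so 12 detected cells correspond to 16 cells per mL.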
1.2.4. Patient-level classification modeling
The morphologic features of the rare cells identified were used to map these events onto a pre-constructed
t-SNE of previously identified rare cells from various cancer types and NDs. Based on the nearest
representative cell within the multi-dimensional t-SNE space, the cells were then assigned a cell identifier.
Using the morphological hierarchy of the representative cells, the identified rare events for the SCLC
patients and NDs can be clustered into similar groups. Using a top-down approach and starting with two
clusters, the counts of cells per mL of each group are calculated for both SCLC patients and NDs. These
counts are used as input data for a random forest classification model of 1000 decision trees to predict
whether the distribution depicts SCLC patients or NDs. This process is repeated with 3 clusters up until a
predefined stopping criterion (e.g., 100 clusters). The optimal number of cell clusters was determined by
the minimum out-of-bag (OOB) error rate of the random forest model. The OOB error rate is calculated during
training: each decision tree is fit to a bootstrap sample of the input data, and the samples left out of that
bootstrap serve as test data for that tree. This method of measuring model
performance was chosen due to the small number of SCLC patients and NDs available. Using the feature
importance of the optimal random forest model, the cell clusters were ordered to identify those that
contribute the most to correctly stratifying the classes. Next, starting with the top two most important
clusters, random forest models were incrementally recreated to determine the best model with the lowest
error rate as a means of pruning the final input dataset (i.e., feature reduction). To more easily visualize
the cells within each cluster, we organize the events by their morphological hierarchy and display the
events in a grid-like fashion.
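The out-of-bag selection of the cluster count can be illustrated with a small, dependency-free sketch. It is not the model used in this study: bagged one-feature threshold classifiers (decision stumps) stand in for the 1000-tree random forest, 200 estimators are used instead of 1000, and the count tables are toy data. All function and variable names are hypothetical.

```python
import random


def fit_stump(rows, labels):
    """Return the (feature, threshold) pair with the fewest training errors.

    The stump predicts class 1 (SCLC) when the feature exceeds the threshold.
    """
    best = (len(rows) + 1, 0, 0.0)
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            errors = sum((row[j] > threshold) != bool(y)
                         for row, y in zip(rows, labels))
            if errors < best[0]:
                best = (errors, j, threshold)
    return best[1], best[2]


def oob_error(rows, labels, n_estimators=200, seed=0):
    """Out-of-bag error rate of a bagged ensemble of stumps."""
    rng = random.Random(seed)
    n = len(rows)
    votes = [[0, 0] for _ in range(n)]  # OOB votes per sample: [class 0, class 1]
    for _ in range(n_estimators):
        bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(bag)
        feature, threshold = fit_stump([rows[i] for i in bag],
                                       [labels[i] for i in bag])
        for i in range(n):
            if i not in in_bag:  # sample was held out from this estimator
                votes[i][int(rows[i][feature] > threshold)] += 1
    scored = [(v, y) for v, y in zip(votes, labels) if sum(v) > 0]
    wrong = sum((v[1] > v[0]) != bool(y) for v, y in scored)
    return wrong / len(scored)


def best_cluster_count(tables_by_k, labels):
    """Pick the cluster count whose count table has the lowest OOB error."""
    errors = {k: oob_error(rows, labels) for k, rows in tables_by_k.items()}
    return min(errors, key=errors.get), errors


# Toy data: 6 NDs (label 0) followed by 6 SCLC patients (label 1).
labels = [0] * 6 + [1] * 6
tables_by_k = {
    # Two clusters whose counts carry no signal.
    2: [[1.0, 1.0] for _ in range(12)],
    # Three clusters; the first one is populated only in SCLC patients.
    3: [[0.0, 1.0, 1.0] for _ in range(6)]
       + [[c, 1.0, 1.0] for c in (5.0, 6.0, 7.0, 8.0, 9.0, 10.0)],
}
best_k, errors = best_cluster_count(tables_by_k, labels)
```

Because every estimator is scored only on the samples it never saw, no separate test set is needed, which is the property that makes OOB error attractive for a cohort this small.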
1.2.5. Whole Genome Single Cell Copy Number Alteration
Rare cell relocation, re-imaging, isolation, next-generation sequencing (NGS), and copy number
alteration (CNA) analysis were conducted as previously reported [43]. In brief, cells of interest were
relocated using registered coordinates and imaged with a 40x objective. Subsequently, individual cells
were extracted from slides using a robotic micromanipulator system followed by single-cell whole
genome amplification (WGA; Sigma-Aldrich; Cat# WGA4). Libraries were constructed using the DNA
Ultra Library Prep Kit (New England Biolabs; Cat# E7370) and sequenced using Illumina NextSeq 500 at
USC Genomics Core or at Fulgent Genomics (Temple City, CA). The copy number profile of each
individual cell was reconstructed from the frequency of unique reads mapped to the human genome
(hg19). Only cells with more than 50,000 total reads, a total alignment rate above 50%, minimal
noise, an in-house quality score of at least 2.5, and reads across the whole genome (no
apoptosis-induced alterations) were included in the analysis.
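The numeric inclusion criteria can be expressed as a simple filter. The sketch below covers only the three thresholds stated above; the noise and genome-wide coverage criteria are qualitative in the text and are not modeled. The record layout and the example cells are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    cell_id: str
    total_reads: int
    alignment_rate: float  # fraction of reads aligned, 0.0 to 1.0
    quality_score: float   # in-house score; its definition is not given here


def passes_qc(cell, min_reads=50_000, min_alignment=0.50, min_quality=2.5):
    """Apply the three numeric thresholds stated in the text."""
    return (cell.total_reads > min_reads
            and cell.alignment_rate > min_alignment
            and cell.quality_score >= min_quality)


# Hypothetical cells: only the first one clears all three thresholds.
cells = [
    CellQC("cell_a", 120_000, 0.82, 3.0),
    CellQC("cell_b", 40_000, 0.90, 3.5),   # too few reads
    CellQC("cell_c", 200_000, 0.45, 4.0),  # alignment rate too low
    CellQC("cell_d", 90_000, 0.70, 2.0),   # quality score too low
]
kept = [cell.cell_id for cell in cells if passes_qc(cell)]
```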
1.2.6. Single-Cell Alteration Classification Model
To investigate the relationship between phenotype and genotype in SCLC, multiple predictive
models (e.g. random forest, naïve Bayes, and support vector machine) were implemented. The
morphometric features from the HDSCA image data of the Landscape assay-stained cells were the input
parameters. A binary “Clonally Altered” vs. “Not Clonally Altered” designation for each individual cell
genomic profile was the target output. A trained genomic analyst guided the selection of training inputs
based on genomic instability, the exclusion of genomic profiles exhibiting technical artifacts,
and clonality, defined as the presence of more than two cells from the same sample sharing at least three
alterations. Feature selection was conducted to prevent overfitting of the data and
issues with multicollinearity, as well as to further optimize the model. Features with a correlation above
0.9 were grouped together and the feature with the highest variance was selected to represent the subset,
resulting in a final set of 56 features.
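The correlation-based feature reduction can be sketched as follows. Two details are assumptions, since the text does not specify them: the absolute Pearson correlation is used, and features are grouped transitively (a feature joins a group if it correlates above the threshold with any member). The feature names and values are hypothetical.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(columns, threshold=0.9):
    """Group correlated features and keep the highest-variance one per group.

    columns maps a feature name to its list of per-cell values.
    """
    names = list(columns)
    group_of = {name: i for i, name in enumerate(names)}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) > threshold:
                old, new = group_of[b], group_of[a]
                for name in names:  # merge b's whole group into a's group
                    if group_of[name] == old:
                        group_of[name] = new
    groups = {}
    for name in names:
        groups.setdefault(group_of[name], []).append(name)
    return [max(members, key=lambda n: pvariance(columns[n]))
            for members in groups.values()]


# Hypothetical morphometric features measured on five cells.
columns = {
    "ck_mean": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ck_max": [2.0, 4.0, 6.0, 8.0, 10.5],   # nearly collinear with ck_mean
    "vim_mean": [5.0, 1.0, 4.0, 2.0, 3.0],  # weakly correlated with both
}
selected = select_features(columns)
```

Here the two CK features fall into one group and the higher-variance one represents it, while the Vim feature is kept on its own.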
1.2.7. Statistical Analysis
Statistical analyses and visualization were performed with R (version 4.1.2). Statistical
significance was determined at a p-value ≤ 0.05. Mann-Whitney U test was conducted to observe the
statistical differences between SCLC patients and NDs. Prediction accuracy was measured by the Area
Under the receiver operating characteristic Curve (AUC) score.
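The two statistics named above are closely related: the AUC of a classifier score equals the Mann-Whitney U statistic of the positive versus negative scores divided by the number of positive-negative pairs. The sketch below illustrates that identity only. It does not compute a p-value, and it is not the R code used in this study; the scores are hypothetical.

```python
def mann_whitney_u(x, y):
    """U statistic of sample x versus y: pairs with x > y, ties counted as 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in x for b in y)


def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    u = mann_whitney_u(positive_scores, negative_scores)
    return u / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores for altered (positive) and unaltered cells.
positives = [0.9, 0.8, 0.4]
negatives = [0.7, 0.3, 0.2]
u_stat = mann_whitney_u(positives, negatives)
auc = auc_from_scores(positives, negatives)
```

Eight of the nine positive-negative pairs are ordered correctly in this toy example, giving an AUC of 8/9.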
1.3. Results
This study consists of 24 PB samples collected from 14 chemo-naive SCLC patients and 10 NDs.
One test of a sample consists of two slides being analyzed, therefore a total of 28 slides (2,304,659
cells/slide on average) from SCLC patients and 20 slides (2,116,477 cells/slide on average) from NDs
were used.
1.3.1. Patient-level classification modeling
We first investigated whether the liquid biopsy could differentiate between SCLC patients and
NDs by exploring the differential cellular populations. A patient-level classification model was
constructed using the rare cell populations detected by HDSCA3.0. Figure 1 shows a schematic of the
overall data science pipeline. It is important to note that NDs often exhibit small numbers of rare cells
across the range of channel types, making a rigorous statistical treatment necessary to distinguish SCLC
from ND.
Figure 1. HDSCA3.0 Workflow. 1) Peripheral blood samples are plated onto slides. 2) Slides are
immunofluorescence stained by Landscape assay and automatically scanned. 3) Detection of rare cells from
the imaging data set is conducted using an unsupervised clustering algorithm that clusters rare cells using
extracted quantitative morphologic features. Further downstream analysis of single-cell CNA on those
detected rare cells is performed. 4) The CRCs detected are further automatically classified into
channel-based cell classifications defined by the fluorescence signal intensities. 5) The channel-based cell
classification of each cell is validated by a trained analyst. 6) Each rare cell type is enumerated per
sample.
The patient-level classifier for the given prediction problem of differentiation between SCLC
patients and NDs showed perfect concordance (100%) with correct predictions across all patients (Fig.
2A). This verifies that the rare cells in SCLC detected by HDSCA3.0 have a significant influence in
differentiating SCLC patients from NDs. The top-most influential cellular clusters that had the highest
impact on stratification (Fig. 2B) were further investigated to examine what phenotypic cellular
populations comprise them. Figure 2C shows the distributions of rare cells for each cluster. The top 14
groups all showed higher counts within the SCLC group as compared to the NDs. Interestingly, the most
important clusters consisted of various cellular phenotypes including the pan-CK positive CTC candidates
previously described in SCLC patients and also cell phenotypes that have not previously been described
in SCLC (Fig. S1). This patient-level classification modeling of the liquid biopsy indicates that (1) a PB
sample can stratify SCLC from ND, and (2) a heterogeneous population of rare cells exists in SCLC
which are influential in differentiating between SCLC patients and NDs.
Figure 2. Patient-level classification model results using HDSCA3.0 liquid biopsy data. (A) Confusion
matrix of predicting NDs and SCLC patients. (B) Feature importance ranking of the top influential clusters.
The ranking depicts the relative contribution of each event cluster to making correct predictions. (C) Box plots
showing the distribution of the top 20 influential clusters separated by each classification. The plot is
shown on a logarithmic scale depicting cells per million (Cell per M).
1.3.2. Characterization of CRCs
We further examined the phenotypic cellular populations in each patient using manual
channel-type classifications. Figures 3A and S2 show representative single-cell images of each
channel-based classification detected in the liquid biopsy of SCLC patients. The signal distribution of CK,
Vim, and CD45/CD31 immunofluorescent markers for channel-based cells is shown in Fig. 3B. The
enumeration and proportion of the eight different channel-based cell groups in each SCLC patient and the
NDs are shown in Fig. 3C and D. CK CTCs were significantly more abundant in SCLC patients
(mean: 411.19 cells/mL, range: 5–3402.05 cells/mL) than in the NDs (mean: 0.35 cells/mL, range:
0–3.77 cells/mL, p-value < 0.0001). CK CTCs accounted for over 50% of total CRCs in 43% of
SCLC patients (n = 6). Interestingly, CK|Vim CTCs were also significantly more abundant
in SCLC (mean: 23.82 cells/mL, range: 0–178.69 cells/mL) than in the NDs (mean: 1.03 cells/mL,
range: 0–11.47 cells/mL, p-value = 0.046), as were most of the other CRC groups (Fig. 3E). Overall,
the total rare cell count was significantly greater in the SCLC patient cohort (mean: 602.39
cells/mL) than in the ND cohort (mean: 65.67 cells/mL, p-value < 0.0001).
Figure 3. CRCs identified in the PB from SCLC patient samples using HDSCA3.0 with the
Landscape assay. A. Representative cell images of each cell group. Each row shows a composite image
plus each of the four channels separately; DAPI in blue, CK in red, Vim in white, (CD45/CD31) in green.
B. Signal distribution of immunofluorescence markers for channel-based cells. Each cell group is
correspondingly annotated to A with eight different colors. C. The number of total rare cells per mL is
calculated for each SCLC patient and ND. Each rare cell group is annotated with eight different colors
which are shown in the bottom middle of the figure. The number of rare cells of the 10 NDs was averaged. D.
Proportion plot of the rare cell groups of each SCLC patient and averaged NDs. E. Box plots comparing
cell counts per ml of each cell group between NDs and SCLC patient samples. *: p-value ≤ 0.05, **:
p-value ≤ 0.01,***: p-value ≤ 0.001,****: p-value ≤ 0.0001.
To further investigate the modulation of EpCAM expression in the CTC population in SCLC, the
HDSCA workflow was conducted using the EpCAM-targeted assay on one SCLC patient (UM-001).
Figure 4A shows the images of representative cells of each channel-based classification in which the
majority of cells were identified as CK CTCs and CK|EpCAM CTCs (71.9%, 128 out of 178 cells).
Interestingly, we detected modulation of EpCAM expression within the CTCs, ranging from EpCAM-positive to
EpCAM-negative cells. A total of 69 CK|EpCAM CTCs and 59 CK CTCs were
detected. We further detected the presence of CK|CD45, and CK|EpCAM|CD45 cells (Fig. 4B). Together,
this case study demonstrates the presence of a heterogeneous phenotypic population of CTCs in SCLC.
Figure 4. Detected cells from UM-001 using the EpCAM assay. A. Representative cell images of each
cell group. Each row shows a composite image plus each of the four channels separately: DAPI in blue, CK
in red, CD45 in green, and EpCAM in white. B. The number of total rare cells per mL is provided. Each
rare cell group is annotated with four different colors which are shown in the bottom right of the figure.
1.3.3. Genome-Wide Single-Cell Copy Number Analysis
Single-cell genomic analysis was conducted to determine whether the detected CTC candidates
and other rare cell types exhibit clonal genomic alterations characteristic of SCLC. Single-cell copy
number alteration (CNA) profiling was performed on 309 cells in total from 14 SCLC patients: 139 CK
CTCs, 11 CK|Vim CTCs, 26 CK|Vim|(CD45/CD31) cells, 27 CK|(CD45/CD31) cells, 40 Vim cells, 24
Vim|(CD45/CD31) cells, 14 CK|EpCAM CTCs, and 21 DAPI cells. White blood cells were also isolated
from each patient as internal controls. Figure S3 shows a heatmap of single-cell CNA profiles of isolated
rare cells from each of the 14 SCLC patients, clustered by each channel-based classification. Interestingly,
genomic alterations were observed in seven types of CTC candidates with various phenotypic
combinations of epithelial, endothelial, and mesenchymal biomarker expression, not only in the CK
CTCs.
Furthermore, the presence of a genetically clonal CTC population that is highly phenotypically
variable confirmed cellular plasticity (Fig. 5A). CK CTCs, CK|(CD45/CD31), and CK|Vim|(CD45/CD31)
cells in Patient 8 harbored clonal gene losses in tumor suppressor genes such as RB1, TP53, and PTEN
that are known as the most frequently altered genes in SCLC [44], [45]. Loss of one copy of chromosome
3p is one of the most frequent and early events in human cancer [46]. Gains were observed in 8q, which
contains the MYC gene, identified as an oncogenic driver in SCLC [46], as well as in the RICTOR gene, which
encodes a subunit of the mTORC2 complex, and in the IL7R gene. CK CTCs, CK|Vim cells,
and CK|Vim|(CD45/CD31) cells in Patient 6 also harbored clonal alterations in SCLC-associated genes
(Figs. 5A and S3), including the tumor suppressor genes, PTEN, RB1, and TP53. The heatmaps for all of
the patients are shown in Fig. S3.
Figure 5. Single-cell CNA profiling of rare cells detected in SCLC PB samples. A. CNA profile of
representative single rare cell from each SCLC patient and their Landscape-stained cell images; CK in red,
Vim in white, CD45/CD31 in green, and DAPI in blue. The rare cell type of each cell is annotated with
color labels at the top right. B. CNA profiles of sequenced cells from UM-001 stained by Landscape assay.
C. CNA profiles of sequenced cells from UM-001 stained by EpCAM-targeted assay. EpCAM expression
for each cell is shown at the top of the heatmap. The rare cell group of each cell in B and C is annotated
with different color labels at the top left of each heatmap. CNA gains are shown in red, neutrals in white,
and losses in blue.
In addition to the Landscape assay, Patient 1 was further analyzed for EpCAM. Figure 5B and C
show Patient 1 with analysis of cells isolated from both the Landscape and EpCAM-targeted assays. The
clonal population had losses in the 3p, 10q, 13q, and 17p regions corresponding to RASSF1, PTEN, RB1,
and TP53, respectively. The gains associated with the clonal population of cells were identified in 1q, 3q,
and 5p regions corresponding to BCL9, MUC1 (1q), PIK3CA (3q26), p63 (3q28), and TERT (5p15).
Recent studies have confirmed that these CNAs are recurrent in SCLC [44], [45], [46]. The
clonal alterations were observed in 16 out of 20 (80%) CK CTCs, one CK|Vim cell, one
CK|Vim|(CD45/CD31) cell, one Vim cell, and one Vim|(CD45/CD31) cell from the Landscape assay. The
images of those cells from the Landscape staining are shown in Fig. 5A. Across the range of CK and
EpCAM expression, 11 out of 12 (92%) CK|EpCAM CTCs and 10 out of 13 (77%) CK CTCs, which exhibit low
or no expression of EpCAM, shared clonal alterations.
Together, CNA analysis (1) verifies that the CK and CK|EpCAM CTCs have clonal alterations,
(2) displays CTC heterogeneity, and (3) confirms CTC plasticity.
1.3.4. Single-Cell Alteration Classification Model
An association between phenotypic characteristics and genomic alteration was investigated to
assess the significance of phenotypic variability in SCLC. A random forest classification model was trained on
the phenotypic features of the rare cells to classify each rare cell as clonally altered or not.
The quantitatively extracted cellular phenotypic features include the intensity of the immunofluorescence
markers and morphometric characteristics. The classifier showed a high performance of 0.86 Area Under
the receiver operating characteristic Curve (AUC) score (Fig. 6A). The importance of the features was
calculated to investigate the significance of different phenotypic features in predicting genomic alterations
(Fig. 6B).
Figure 6. Single-cell alteration classification model performance A. ROC Curve of the random forest
classification model. B. Feature importance of the top 20 features used in the random forest model. The
whole list of the features is shown in Figure S4.
Twelve of the top 20 features for the model were CK-related features that help the model identify the
genomically clonal, altered SCLC cells (Fig. 6B). Such features included the ratio of CK to CD45/CD31
positivity and the shape or size of the CK positivity within the cell. In addition to the CK-related
influential features, Vim-related phenotypic features also strongly affected the
classification of clonal SCLC cells (7 out of the top 20 features). Overall, our classification model of
single-cell alteration showed a robust performance utilizing the cellular phenotypic features and the
characteristics of intensity or morphologies of CK and Vim expression were the strongest predictors.
1.4. Discussion
In this study, we describe several important findings in SCLC: (1) SCLC patients and NDs can be
stratified using a liquid biopsy, (2) a heterogeneous population of CTCs can be detected, and (3)
SCLC cellular plasticity can be characterized.
A patient-level classification model was able to stratify the SCLC patients from NDs with perfect
concordance using the rare cells detected by HDSCA3.0, confirming the abundance of CRCs as a
clinically useful analyte for SCLC patients. Furthermore, the rare cells comprising the most important
clusters included not only the CK-positive cells previously identified as CTCs but also a phenotypically
distinct cellular population not previously described for stratifying SCLC patients from NDs. This
emphasizes the power of a rare cell framework in detecting ultra-rare CTCs and their potential utility as a
complementary tool to current methods of imaging and pathology tests for the diagnosis of SCLC.
We investigated the phenotypic heterogeneity of CTCs in SCLC by using
multiple assays to characterize a wide spectrum of rare cells. By combining multiple epithelial
biomarkers in the EpCAM-targeted assay, we observed a wide range of CK and EpCAM expression in
the CTC population, in which 46% of the CTCs did not express EpCAM. We have previously
shown that the HDSCA platform can detect a high abundance of CTCs without EpCAM expression that
could not be detected by CellSearch in SCLC [32]. Previous studies have reported similar findings
of phenotypic variability among CTCs in patients with SCLC, with a specific subpopulation of CTCs
being clinically relevant [40]. Furthermore, the single-cell prediction model supports the hypothesis of
CTC phenotypic variability within SCLC. We hypothesized that the single-cell prediction model would
use primarily CK expression as the main predictor input, but the Vim expression was also a top predictor
supporting the importance of CTC heterogeneity in SCLC. As we have shown in SCLC [35] and
in other cancer types [27], [47], unbiased liquid biopsy approaches such as the HDSCA
platform can isolate CTCs more efficiently and detect ultra-rare CTCs from SCLC
patients.
Tumor plasticity enables a subset of cancer cells to transition between different cell states that
accelerate tumor progression and metastasis [34], [48], [49]. The single-cell sequencing results confirm
the existence of tumor cell plasticity by indicating that a phenotypically heterogeneous population of cells
can be genomically stable. Cancer stem-like cells are a subset of cancer cells that have the ability to
generate the intra-tumor heterogeneity of different cell phenotypes from differentiation [50], [51]. EMT is
one demonstration of tumor plasticity, with the intermediate states between the epithelial and
mesenchymal phenotypes being associated with poor patient survival and chemotherapy resistance [51],
[52]. The CK CTCs and the CK|Vim CTCs harboring clonal alterations detected in this study potentially
indicate the presence of EMT. Notably, other phenotypic rare cells were also identified with clonal
alterations suggesting further dynamic cellular plasticity. Cellular plasticity is fundamental to SCLC
tumorigenesis, thus requiring longitudinal prognostic tools to properly characterize the dynamic cell state.
We have shown that a minimally invasive liquid biopsy which allows for repeated sampling can address
the challenges associated with the detection of variable cell states with the evidence of clinical utility. Our
results highlight the heterogeneity of CRCs by the liquid biopsy approach with the identification of tumor
cell plasticity. Further investigation is warranted to overcome the limitations of this study regarding the
small sample size of enrolled patients and the number of characterized single cells.
In conclusion, in this study we establish the validity of CRC detection by stratifying SCLC
patients and NDs with a high degree of accuracy using a classification model. Further, the data presented
here provide evidence for cellular phenotypic plasticity, through the detection of heterogeneous CRCs
carrying the clonal tumor genotype with hallmarks of SCLC. Although matching tumor tissue was not
available for direct genomic comparison, work by ourselves and others [53], [54] in other cancers has
shown that clonal CTC populations closely reflect the genomics of clonal cells in the tumor. This provides
new information for the potential stratification of treatments and the development of targeted therapeutics.
This study demonstrates that liquid biopsy can provide a non-invasive route of tissue sampling with the
opportunity for clinical monitoring and the development of better-stratified targeted therapies.
Chapter 2: Characterization of circulating rare cells in non-small cell
lung cancer
2.1. Introduction
NSCLC is the most frequent type of lung cancer, accounting for about 85% of all diagnoses [2],
[6]. Unfortunately, NSCLC diagnoses are mostly made (approximately 75%) in metastatic stages when
cancer has already spread and the prognosis is poor. In this respect, biomarker investigation that might
allow for early detection of NSCLC is critical to improving the prognosis of this disease.
CTCs are cells that have detached from the primary tumor or a metastasis and entered the blood, where
they can serve as an indicator of disease. Examination of CTCs can provide clinically relevant information
representing the state of the disease [55]. Accordingly, recent studies have shown that CTCs can be used in
cancer diagnosis or screening, in monitoring for metastatic relapse, and in guiding therapeutic
strategies [18]. However, there is no FDA-approved CTC detection technology for lung cancer yet. CTC
detection in NSCLC is challenging due to significant tumor heterogeneity, leading to variability in CTC
characteristics and complicating their identification [56], [57]. Additionally, NSCLC patients typically
have a low concentration of CTCs in their bloodstream, requiring highly sensitive detection technologies
to distinguish them from millions of other blood cells [56], [58]. CellSearch, which is the only
FDA-approved CTC detection method for breast, colorectal, and prostate cancer, isolates
EpCAM-positive CTCs but generally shows a lower detection rate in NSCLC compared to other tumor
entities [59]. CTC positivity rates of 21-36% have been reported in metastatic NSCLC using CellSearch,
whereas early-stage NSCLC (stages I, II, and IIIA) was either CTC negative or showed an extremely low
positivity rate of less than 4%.
Here we apply a non-enrichment-based HDSCA workflow, with which we previously observed a
high prevalence of CTCs (67%) in NSCLC patients of every stage using the first generation of the
workflow [25]. In this study, we used the HDSCA3.0 workflow, which combines an advanced
immunofluorescence staining technique with an unsupervised rare event detection algorithm to identify the
circulating rare events associated with NSCLC. We conducted a four-channel immunofluorescence assay
consisting of CK, Vim, CD45/CD31, and DAPI. The unsupervised rare event detection algorithm
in the HDSCA3.0 workflow allows every nucleated cell and oncosome to be
investigated. We characterized circulating rare events by multiplexed targeted proteomics
based on imaging mass cytometry (IMC) and by genomic analysis using single-cell copy number profiling. We
aim to use the extended capabilities of HDSCA3.0 to build a comprehensive
overview of circulating rare events that can provide clinically valuable biomarkers in NSCLC from a
liquid biopsy.
2.2. Materials and Methods
2.2.1. Liquid Biopsy Samples
Patients with chemotherapy-naive NSCLC were enrolled (Table 1). The patients were recruited at
multiple sites: Stanford University, the University of California San Diego, The Billings Clinic,
California Pacific Medical Center, and the VA Palo Alto Health Care System.
All PB samples of up to 10 mL were collected in blood collection tubes (Streck, La Vista, NE,
USA) and processed as previously described [41]. PB samples from NDs with no known pathology were
collected from Scripps Research Institute and Epic Sciences and processed according to standard
operating procedures.
Table 1. Patient collection information
Location of collection | Patient cohort | Collection criteria
Stanford, the University of California San Diego, The Billings Clinic, California Pacific Medical Center, and the VA Palo Alto Health Care System | 145 chemo-naive NSCLC patients (87 stage I, 13 stage II, 24 stage III, and 21 stage IV patients); 56 benign individuals (benignity of tumors was defined by the extracting physician after reviewing the medical record) | Patients with a lung nodule or mass who were enrolled for diagnostic workup and underwent 18F-FDG PET-CT imaging for lung cancer evaluation.
Scripps Research Institute, Epic Sciences | 60 NDs | Individuals with no known pathology were included.
2.2.2. Immunofluorescence Staining
Two immunofluorescence staining assays were utilized in this study: the HD-CTC assay in the HDSCA1.0
workflow and the Landscape assay in the HDSCA3.0 workflow. The HD-CTC assay in HDSCA1.0 was
performed as described in our previous studies [25], [41].
In brief, two slides from each sample were thawed and stained using IntelliPATH FLX™
autostainer (Biocare Medical LLC, Irvine, CA, USA). After fixation by paraformaldehyde (PFA) and
permeabilization by methanol, cells were incubated with pan anti-CK antibodies recognizing CK 1, 4, 5,
6, 7, 8, 10, 13, 18 (clones: C-11, PCK-26, CY-90, KS-1A3, M20, A53-B/A2, C2562, Sigma, St. Louis,
MO) and 19 (clone: RCK108, GA61561-2, Dako, Carpinteria, CA) and a conjugated anti-CD45 antibody
(clone: F10-89-4, MCA87A647, AbD serotec, Raleigh, NC) followed by incubation with an Alexa
555-conjugated secondary antibody (A21127, Invitrogen, Carlsbad, CA) and DAPI (D1306, Thermo
Fisher Scientific) as a nuclear stain.
Landscape assay in HDSCA3.0 was performed as previously described in 1.2.2. Pan-CK to
identify epithelial cells, a combination of CD31 and CD45 marking endothelial and immune cells, Vim to
identify mesenchymal cells, and DAPI marking the nucleus of the cell were used.
2.2.3. Detection and Classification of Circulating Rare Events
Slides stained with the HD-CTC assay (HDSCA1.0 workflow) or the Landscape assay (HDSCA3.0
workflow) were scanned, producing 2304 frames per slide, as previously described. Cells were
segmented and their quantitative phenotypic characteristics were extracted using the image processing
package “EBImage” in R. Candidate cells were then classified using different approaches in the HDSCA1.0
and HDSCA3.0 workflows, as follows.
Cells detected by the HDSCA1.0 workflow were classified as candidate HD-CTCs only if they
were CK positive and CD45 negative, contained an intact DAPI nucleus without identifiable apoptotic
changes (blebbing, degenerated appearance) or a disrupted appearance, and were morphologically distinct
from surrounding white blood cells with manual validation by a trained analyst.
The events detected by the HDSCA3.0 workflow were classified as previously described in 1.2.3.
using an unsupervised clustering algorithm that can capture every circulating rare event in blood for
further analysis. In addition to the eight channel-based cell classifications described previously,
every DAPI-negative circulating rare event with CK positivity and a spherical shape is classified as an
oncosome. Therefore, nine groups of rare event types are used for the classification: CK CTCs,
CK|Vim-positive CTCs, CK|(CD45/CD31)-positive cells, CK|Vim|(CD45/CD31)-positive cells,
DAPI-positive only cells, (CD45/CD31)-positive only cells, Vim-positive only cells,
Vim|(CD45/CD31)-positive cells, and oncosomes.
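The nine-group scheme can be written as a lookup on binary channel calls. This is a sketch only: the thresholds that turn signal intensities into positive or negative calls are not given in this section, so the inputs are assumed to be booleans already, and the "unclassified" fallback is a hypothetical label rather than a group from the text.

```python
# Keys are (CK, Vim, CD45/CD31) positivity calls for DAPI-positive events.
NUCLEATED_GROUPS = {
    (True, False, False): "CK CTC",
    (True, True, False): "CK|Vim CTC",
    (True, False, True): "CK|(CD45/CD31) cell",
    (True, True, True): "CK|Vim|(CD45/CD31) cell",
    (False, False, False): "DAPI only cell",
    (False, False, True): "(CD45/CD31) only cell",
    (False, True, False): "Vim only cell",
    (False, True, True): "Vim|(CD45/CD31) cell",
}


def classify_event(dapi, ck, vim, cd45_cd31, spherical=False):
    """Assign one rare event to a channel-based group."""
    if dapi:
        return NUCLEATED_GROUPS[(ck, vim, cd45_cd31)]
    if ck and spherical:
        # DAPI-negative, CK-positive, spherical events are oncosomes.
        return "oncosome"
    return "unclassified"  # hypothetical fallback, not a group from the text


example_groups = [
    classify_event(True, True, False, False),
    classify_event(True, False, True, True),
    classify_event(False, True, False, False, spherical=True),
]
```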
2.2.4. Whole Genome Single Cell Copy Number Alteration
Single-cell copy number profiling was performed as previously described in 1.2.5. Individual
cells were extracted using a robotic micromanipulator system and underwent single-cell WGA. DNA
libraries were constructed using the DNA Ultra Library Prep Kit and sequenced on an Illumina NextSeq
500. Copy number profiles were reconstructed from unique reads mapped to the human genome (hg19),
including only cells with total reads above 50,000 per cell, a total alignment rate above 50%, minimal
noise, an in-house quality score of at least 2.5, and reads covering the entire genome.
2.2.5. Targeted Proteomics By Imaging Mass Cytometry
Cells of interest were subjected to targeted proteomic analyses with the use of the CyTOF Helios
imaging mass cytometer (Fluidigm) as previously described [42], [60]. In brief, regions of interest (ROI)
on the slides are scanned in situ with a highly focused, pulsed laser, such that each pulse vaporizes a 1
µm² block of the sample and the resulting ions are introduced into the inductively coupled plasma
time-of-flight mass spectrometer (ICP-TOF-MS) with helium as a carrier gas. The ion counts for each
pulse can then be measured and collected into a protein expression image with a resolution of 1 µm²
across the ROI. First, the slides that had been stained for immunofluorescence were re-stained with
a cocktail of isotope-labeled antibodies: CK7 (Fluidigm; Cat# 3164020A; Clone: RCK105; Dilution: 1:400),
CK8/18 (Fluidigm; Cat# 3174014A; Clone: C51; Dilution: 1:200), Vim (Abcam; Cat# ab193555; Clone:
EPR3776; Dilution: 1:300), EpCAM (Fluidigm; Cat# 3144026D; Clone: 9C4; Dilution: 1:200), TWIST1
(Fluidigm; Cat# 3999999-2; Clone: rabbit polyclonal; Dilution: 1:50), CD59 (Abcam; Cat# 3047012B;
Clone: H19; Dilution: 1:100), β-catenin (Fluidigm; Cat# 3147005A; Clone: D10A8; Dilution: 1:300),
VE-Cadherin (Fluidigm; Cat# 3155010A; Clone: D87F2; Dilution: 1:100), CD68 (Fluidigm; Cat#
3159035D; Clone: KP1; Dilution: 1:50), CD3 (Fluidigm; Cat# 3170001B; Clone: UCHT1; Dilution:
1:100), CD45 (Fluidigm; Cat# 3089003B; Clone: HI30; Dilution: 1:200), CD31 (Fluidigm; Cat#
3145004B; Clone: WM59; Dilution: 1:100) and DNA intercalator.
A region of interest (ROI) of ~400 µm × 400 µm centered on each cell of interest was then ablated
with a 1 µm diameter pulsed laser, followed by ionization and quantification in the CyTOF Helios
instrument. The pulsed laser sequentially ablates the cells of interest along with the surrounding ~300
leukocytes in the defined ROIs. Ion count values are measured at each of the 160,000 acquired pulses in
the ROI for each isotope and are stored as a text file for further analysis. Multi-dimensional images of the
ROI showing the expression of each isotope-labeled antibody at a resolution of 1 µm² are generated
according to the ion counts. The range of ion count values is internally normalized within the ROI for
each isotope where the brightest pixel represents the 98th percentile of the cumulative ion count signal.
Segmentation of cells and calculation of ion count per cell were processed using the CellProfiler
(ver.3.15) and Ilastik (ver.1.3.3) programs. HistoCAT was used to visualize and analyze the IMC data
interactively.
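The pulse count and the within-ROI normalization described above can be illustrated with a short sketch. It reads the 98th-percentile rule as a per-pixel percentile cap applied to each isotope channel, which is an assumption about the exact procedure; the function names are hypothetical and the input values are made up.

```python
ROI_SIDE_UM = 400                      # ROI of about 400 µm × 400 µm
N_PULSES = ROI_SIDE_UM * ROI_SIDE_UM   # one pulse per 1 µm² pixel


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def normalize_channel(counts, q=98.0):
    """Scale one isotope channel to [0, 1], saturating at the q-th percentile."""
    cap = percentile(counts, q)
    if cap <= 0:
        return [0.0 for _ in counts]
    return [min(c / cap, 1.0) for c in counts]


# Made-up ion counts for one channel (0, 1, ..., 100).
demo = list(range(101))
scaled = normalize_channel(demo)
print(N_PULSES, scaled[49], scaled[100])
```

Capping at a high percentile rather than at the maximum keeps a few very bright pixels from compressing the displayed range of the remaining pixels.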
2.2.6. Statistical Analysis
Statistical analyses and visualization were performed with R (version 4.1.2). Statistical
significance was determined at a p-value ≤ 0.05. The Mann-Whitney U test was used to assess
statistical differences between clinical cohorts.
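For readers unfamiliar with the test, the following sketch computes the Mann-Whitney U statistic and a two-sided p-value from a tie-corrected normal approximation. It is an illustration in Python rather than the R code used for the analyses, and it will not match R's output exactly where R uses the exact distribution or a continuity correction. The function name and the example values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for x, p-value). Tied values receive midranks and
    the variance is tie-corrected; no continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                       # size of this group of tied values
        midrank = (i + 1 + j) / 2.0     # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    return u1, math.erfc(abs(z) / math.sqrt(2.0))


# Made-up counts (cells/ml) for two cohorts, for illustration only.
u, p = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(u, p)
```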
2.3. Results
This study consists of PB samples collected from 145 NSCLC patients, 56 benign individuals,
and 60 NDs. Two slides from each sample have been analyzed; 290 slides (2,455,234 cells/slide on
average) from NSCLC patients, 112 slides (2,143,140 cells/slide on average) from benign individuals, and
120 slides (2,174,311 cells/slide on average) from NDs were used.
2.3.1. Characterization of CTCs in NSCLC using HDSCA3.0
We previously analyzed the cellular populations in patients with NSCLC using the HDSCA1.0
workflow, which utilized the HD-CTC assay with three immunofluorescence markers: DAPI, CK, and
CD45. This analysis identified CK-positive CTCs but was limited in its ability to detect a broader spectrum of
circulating rare events. Detecting this broader spectrum is important because it allows the identification of diverse phenotypic
subtypes of CTCs and other rare cells, providing a more comprehensive understanding of tumor
heterogeneity and aiding the development of more effective diagnostic and therapeutic strategies.
We found that the HDSCA3.0 workflow, which includes additional biomarkers (Vim and CD31)
and an unsupervised rare event detection algorithm, significantly enhances the detection of a diverse
range of CTCs and other rare cell types. Specifically, the HDSCA3.0 identified CK only CTCs, CK|Vim
CTCs, CK|(CD45/CD31) cells, CK|Vim|(CD45/CD31) cells, Vim only cells, Vim|(CD45/CD31) cells,
DAPI only cells, and (CD45/CD31) only cells (Fig. 7A). The signal distribution of CK, Vim, and
CD45/CD31 immunofluorescence markers in the classified groups is shown in Figure 7B. Figure 7C
shows the eccentricity distribution, representing the irregularity of the shape of each type of rare event.
Figure 7. Circulating rare events identified in the PB from NSCLC patients. A. Representative cell
images of each channel-based cell group. Each row shows a composite image plus each of the four
channels separately; DAPI in blue, CK in red, Vim in white, (CD45/CD31) in green. B. Signal distribution
of immunofluorescent markers for channel-based cell groups. Each cell group is correspondingly annotated
to A with nine different colors. C. Eccentricity distribution of channel-based cell groups.
To investigate the potential of HDSCA3.0 in detecting previously undiscovered rare cell
populations, we compared the rare cell populations with those detected by the HDSCA1.0 workflow.
Sister slides from the same 100 NSCLC patients previously analyzed with the HDSCA1.0 workflow were
analyzed using the HDSCA3.0 workflow for comparison. Figure 8A shows representative rare cells
observed from one of the NSCLC patients detected by HDSCA1.0 and HDSCA3.0. Notably, not only the
CK-only CTCs but also the CRCs including CK|Vim cells, CK|(CD45/CD31) cells, and
CK|Vim|(CD45/CD31) cells were identified by the addition of Vim and CD31 markers. Furthermore, the
unsupervised clustering algorithm enabled the identification of other CRCs without CK expression,
including Vim cells, Vim|(CD45/CD31) cells, DAPI only cells, (CD45/CD31) only cells, and even
oncosomes, which are CK-positive rare events negative for DAPI.
Figure 8. Circulating rare events identified in the PB from NSCLC patients by HDSCA1.0 and
HDSCA3.0. A. Representative cell images of CRCs and oncosomes observed in an NSCLC patient
detected by HDSCA1.0 (left) and HDSCA3.0 (right). Each row shows a composite image and each of the
channels separately; DAPI in blue, CK in red, Vim in white, and CD45 or (CD45/CD31) in green. B. The
number of rare events per ml is enumerated for each NSCLC patient (x-axis) from both HDSCA1.0 and
HDSCA3.0. The upper and lower graphs show the enumeration of the rare events detected by HDSCA3.0
and HDSCA1.0, respectively. The color assigned to each rare cell group is shown at the right
side of the figure.
The total number of detected rare events was dramatically increased in HDSCA3.0 with the more
specific classifications of the CK-positive rare cells. The total abundance of CK-positive only CTCs,
CK|Vim-positive cells, CK|(CD45/CD31)-positive cells, and CK|Vim|(CD45/CD31)-positive cells detected by
HDSCA3.0 was significantly higher than the incidence of CK-positive CTCs classified by HDSCA1.0.
This indicates that the CK-positive CTC population detected by HDSCA1.0 could still be detected in
HDSCA3.0, and the addition of Vim and CD31 enabled further subdivision into four CK-positive rare cell
subtypes. Moreover, every patient had (CD45/CD31)-positive only rare cells, Vim-positive only
cells, Vim|(CD45/CD31)-positive cells, or oncosomes that were not detectable by HDSCA1.0,
demonstrating the enhanced sensitivity and comprehensive detection capabilities of the HDSCA3.0
workflow.
Figure 9. Circulating rare events identified in the PB from NSCLC patients, benign individuals and
ND. A. (left) Logarithmic scaled number of total rare cells is enumerated for each NSCLC patient and ND.
(right) Proportion plot of the rare cell groups of each NSCLC patient and NDs. Each rare cell group is
annotated with nine different colors which are shown at the right bottom of the figure. B. Logarithmic
scaled box plots comparing the number of cells of each cell group between NSCLC patients, benign
individuals, and ND samples. *: p-value ≤ 0.05, **: p-value ≤ 0.01.
We further examined whether the number of detected circulating rare events in the blood of
patients with NSCLC by HDSCA3.0 is associated with the malignant lesion in the individuals' bodies.
The enumeration and the proportion of circulating rare events classified into nine different channel-based
groups in each NSCLC patient, benign individual, and ND are shown in Figure 9A. The number of
circulating rare events was compared between the NSCLC patients, benign individuals, and NDs (Fig.
9B). The CK-positive only CTCs were significantly more abundant in NSCLC patients (mean: 5.97,
SD: 1.32, range: 0 - 65.07 cells/ml) compared to the benign individuals (mean: 2.54, SD: 0.54, range: 0 - 5.43
cells/ml, p-value = 0.0002) and NDs (mean: 1.02, SD: 0.55, range: 0 - 3.17 cells/ml, p-value = 0.00004).
Notably, not only the CK-positive only CTCs but also CK|Vim CTCs were detected, in 15% of NSCLC
patients (n = 22), and were significantly more abundant in NSCLC (mean: 5.53, SD: 2.38, range: 0 - 181.27
cells/ml) compared to the benign individuals (mean: 4.42, SD: 0.98, range: 0 - 5.12 cells/10⁶ cells, p-value =
0.005) and NDs (mean: 0.39, SD: 0.73, range: 0 - 3.44 cells/ml, p-value = 0.004). Other candidate
CTCs including CK|(CD45/CD31) and CK|Vim|(CD45/CD31) cells were found in 30.7% (n = 46) and
73.1% (n = 106) of NSCLC patients, respectively. Significantly higher prevalences of those CTC
candidates were observed in NSCLC patients (mean: 8.98, SD: 6.79, range: 0 - 302.93 cells/ml; mean: 9.08,
SD: 19.48, range: 0 - 759.96 cells/ml, respectively) compared to the NDs (mean: 1.82, SD: 2.80, range: 0 -
13.98 cells/ml, p-value = 0.0019; mean: 4.21, SD: 5.04, range: 0 - 23.22 cells/ml, p-value = 0.045,
respectively).
Additional CRCs including Vim-positive only cells and Vim|(CD45/CD31) cells were detected.
Also, rare cells that are morphologically distinct from immune cells were detected, including
(CD45/CD31) positive only cells and DAPI positive only cells. A significant difference was also observed
with the oncosomes between the NSCLC patients (mean: 9.87, SD: 12.51, range: 0 - 711.57 events/ml),
benign individuals (mean: 8.22, SD: 24.30, range: 0 - 316.95 events/ml, p-value = 0.045), and the NDs
(mean: 4.38, SD: 52.32, range: 0 - 158.18 events/ml, p-value = 0.03). Furthermore, the CK-positive only
CTCs, CK|Vim CTCs, CK|(CD45/CD31) cells, CK|Vim|(CD45/CD31) cells, and oncosomes were
significantly more prevalent in benign individuals compared to NDs (p-values are 0.018, 0.037, 0.028,
0.016 respectively). Together, our results reveal and characterize the circulating rare events that are
differentially present in NSCLC patients compared to benign individuals and NDs.
2.3.2. Genome-Wide Single-Cell Copy Number Analysis
Single-cell CNA profiling was performed on 70 cells isolated from the same 5 NSCLC patients: 2
CK|Vim cells, 5 CK|(CD45/CD31) cells, 19 CK|Vim|(CD45/CD31) cells, 2 Vim cells, 25
Vim|(CD45/CD31) cells, 9 DAPI cells, and 8 white blood cells isolated from the patients as internal
controls. None of the analyzed cells exhibited copy number alterations. Figure
S5 shows the heatmaps of single-cell CNA profiles of isolated rare cells from each of the 5 NSCLC
patients, clustered by each channel-based classification. This finding indicates that their association with
NSCLC may be driven by non-genetic factors. This underscores the need to explore alternative
mechanisms of cancer and the complex interactions within the tumor microenvironment.
2.3.3. Proteomic characterization of CRCs
Proteomic analysis was conducted by imaging mass cytometry (IMC) to further examine the CTC
candidates and other circulating rare events detected in NSCLC. Proteomic analysis was conducted on
three of the eight phenotypes, specifically on 20 cells of interest from the same 5 NSCLC patients who
underwent genomic analysis in Section 2.3.2: 6 CK|(CD45/CD31) cells, 9 CK|Vim|(CD45/CD31) cells,
and 5 Vim|(CD45/CD31) cells. Also, 4 cell clusters consisting of more than 5 CK|Vim|(CD45/CD31) cells
were analyzed. White blood cells were also isolated from each patient as internal controls. The results of
targeted proteomic analysis using IMC are shown in Figure 10A as a heatmap of standardized ion counts
per cell of interest. The IMC images of cells of interest expressing each type of labeled antibody are
shown in Figure 10B, with immunofluorescence images of the cells in the first row.
In concordance with the immunofluorescence assay, CK7 and CK8/18 were abundantly expressed
in CK|(CD45/CD31) cells and CK|Vim|(CD45/CD31) cells, which were negative for EpCAM, confirming the
presence of CTC candidates that are EpCAM-negative but CK-positive. β-Catenin, a multifunctional
protein involved in cell adhesion and signaling pathways and also an epithelial marker [61], [62], was
expressed in some of the CK-positive cells. Its expression indicates the potential involvement of Wnt
signaling pathways in these cells, which is significant for understanding their tumorigenic properties. Vim,
which is known to be expressed in mesenchymal and endothelial cells, was present in
CK|Vim|(CD45/CD31) cells and Vim|(CD45/CD31) cells, echoing the results from the
immunofluorescence assay. CD31, an endothelial marker, was also expressed in all CK|(CD45/CD31)
cells, CK|Vim|(CD45/CD31) cells, and Vim|(CD45/CD31) cells, as shown by the immunofluorescence
assay. VE-Cadherin, which is also expressed in endothelial cells, was observed in some of the
CK|Vim|(CD45/CD31) cells. CD59, which belongs to the membrane complement regulatory proteins
(mCRPs) and inhibits complement cytolytic activity, is known to help malignant cells escape
immunologic surveillance and complement-mediated cytolysis [63]. Overexpression of CD59 has been
demonstrated in many types of solid cancers including NSCLC [63]. We observed that CD59 was
expressed in every CK|(CD45/CD31) cell and in some of the CK|Vim|(CD45/CD31) cells (3 out of 9 cells) and
Vim|(CD45/CD31) cells (3 out of 5 cells), while it was negative in white blood cells, indicating their
characteristic as cancer-related CRCs.
Cell clusters consisting of more than five CK|Vim|(CD45/CD31) cells showed proteomic
expression patterns similar to those of the individual CK|Vim|(CD45/CD31) cells, abundantly expressing CK7, CK8/18,
β-Catenin, Vim, VE-Cadherin, CD31, and CD59 (Fig. 10C). Additionally, cell clusters showed expression
of TWIST1, which is a transcription factor of the basic helix-loop-helix class and is known to induce
EMT in a variety of tumors [9], [64], [65], [66]. CD68, a marker for macrophages and monocytes, was
also expressed in the subgroups of cells in the clusters.
Overall, the targeted proteomic analysis was concordant with
immunofluorescence staining and verified the phenotypic heterogeneity of CRCs in NSCLC.
Figure 10. Proteomic analysis of CRCs in NSCLC. A. Heatmap of proteomic expression in CRCs in
NSCLC. B. IMC image of each marker expressed in each CRC. The composite immunofluorescence assay
image of each cell is shown in the top row; DAPI in blue, CK in red, Vim in white, (CD45/CD31) in green.
C. IMC images of the CRC clusters.
2.4. Discussion
In this study, we describe the following key findings: 1) demonstration of the extended capabilities
of HDSCA3.0 compared to HDSCA1.0 in providing a comprehensive overview of circulating rare events in
NSCLC from a liquid biopsy; 2) detection of a heterogeneous population of CTCs in NSCLC; 3)
identification of cancer-associated CTCs and circulating rare events that are differentially prevalent in
NSCLC patients compared to benign individuals and NDs; 4) confirmation of phenotypic heterogeneity
by targeted proteomics; and 5) identification of the lack of significant copy number alterations in the
analyzed CRCs, highlighting the importance of understanding the dynamics of cancer beyond genetic
alterations.
Early detection of NSCLC is crucial for achieving successful treatment and improving
overall survival, as most patients are diagnosed at advanced stages. Thoracic imaging, particularly LDCT,
has traditionally been used for early detection, showing a 20% reduction in lung cancer mortality.
However, concerns remain regarding false positives, overdiagnosis, and invasive procedures. Tissue
biopsies, while essential for diagnosis, are limited by tumor heterogeneity and by their inability to
represent the entire tumor or to assess genetic abnormalities comprehensively.
Liquid biopsy offers a non-invasive and repeatable approach to detecting tumor cells or
tumor-related events in body fluids, overcoming the limitations of current strategies for early detection. It
can complement existing screenings by reducing false positives and overdiagnosis with its non-invasive
nature. Liquid biopsy also addresses the limitations of tissue biopsies and has the potential to become a
complementary diagnostic tool. Assessing tumor-derived elements in PB holds clinical significance and
shows promise as potential biomarkers. Therefore, investigating biomarkers through a liquid biopsy for
early detection of NSCLC is critical in improving prognosis by overcoming the limitations of current
strategies and offering a comprehensive assessment of the disease.
In our study, we provide a comprehensive overview of heterogeneous circulating rare
events, including CTCs, in NSCLC. This overview reflects tumor heterogeneity and enables extensive tumor
characterization, which is crucial for making therapeutic decisions and defining prognosis.
In previous studies, the most commonly used CTC detection methodology has been CellSearch,
which is the only FDA-approved CTC detection method for breast, colorectal, and prostate cancer [10],
[67], [68]. CellSearch isolates CTCs using an enrichment analysis based on EpCAM expression.
However, the low detection rate of CTCs in early NSCLC patients, together with the fact that NSCLC
patients can harbor different CTC subpopulations, including EpCAM-positive and EpCAM-negative cells,
has shown the limitations of the EpCAM-based enrichment approach [69]. Previous studies have also
detected CTCs using EpCAM-independent, size-based
isolation techniques [69], [70], [71]. However, this methodology still faces limitations in terms of low
sensitivity, absence of tumor specificity, and the inability to capture smaller-sized tumor cells or
extracellular vesicles.
Here we apply the HDSCA workflow which is a non-enrichment approach. In our previous work,
the first generation of HDSCA verified the high detection rate of CTCs in stage IV NSCLC patients [72].
In this study, we used the third generation of HDSCA which can detect a broader spectrum of CRCs and
oncosomes compared to the first generation. Additional biomarkers in the immunofluorescence assay
enabled the investigation of circulating rare events of epithelial, mesenchymal, and endothelial origin. Through
this comprehensive profiling, we were able to detect phenotypic heterogeneity of CRCs that had not
been revealed in NSCLC by our previous generations of HDSCA. CTCs detected by our previous
version of HDSCA were the CK+|CD45- cells that have nuclei without identifiable apoptotic changes or a
disrupted appearance and are morphologically distinct from surrounding white blood cells. In this study of
early and late-stage NSCLC patients, using HDSCA3.0, we could detect CK+|(CD45/CD31)-
CTCs in NSCLC patients in accordance with our previous study. In addition, the CK|Vim CTCs and the
CK|(CD45/CD31) and CK|Vim|(CD45/CD31) CTC candidates were observed in 15%, 30.7%, and 70.7% of the
samples, respectively. These cells were also found to be significantly more abundant in NSCLC patients
compared to NDs, indicating their association with NSCLC. Interestingly, a significantly higher prevalence of
CK-positive only CTCs, CK|Vim CTCs, and oncosomes was found in NSCLC patients compared to
benign individuals. Our non-enrichment investigation not only enabled the comprehensive
characterization of circulating rare events in NSCLC but also provided possible biomarkers to stratify
disease malignancy.
Furthermore, the unsupervised event detection approach enabled the identification of a broader
spectrum of tumor-associated rare events. Oncosomes are vehicles that are shed from tumor cells for
crosstalk between both tumor cells and cells of the surrounding microenvironment. Oncosomes have
proposed implications as diagnostic and prognostic biomarkers, as well as therapeutic targets [73], [74],
[75]. In NSCLC, tumor-derived oncosomes are known to accelerate angiogenesis and tumor growth and
have the potential to promote metastasis [76], [77]. However, only a few studies have explored the
potential of oncosomes as diagnostic biomarkers in NSCLC and were conducted with small-sized cohorts
[76], [77]. In this study, we could observe a significantly higher abundance of oncosomes in NSCLC
patients compared to benign individuals and NDs. Our results showed not only a robust detection rate but
also reproducible patterns of oncosomes in NSCLC, indicating the potential utility of oncosomes as a
diagnostic biomarker in lung cancer. Additional investigations are needed to fully elucidate the
mechanism of oncosome association with tumor cells in lung cancer.
As multiple previous studies have suggested [14], [78], investigating subpopulations of
tumor-associated circulating cells is critical, because understanding tumor heterogeneity and exploring
biomarkers are necessary for enabling early detection of NSCLC and thereby improving the prognosis of
this disease. Our results demonstrate that both the heterogeneous CRC populations and the oncosomes
are differentially detected in NSCLC, which provides opportunities to find
candidate cancer biomarkers.
Targeted proteomic analysis using IMC enabled further validation and confirmation of phenotypic
heterogeneity of the CRCs. In this study, CK|(CD45/CD31) cells, CK|Vim|(CD45/CD31) cells, and
Vim|(CD45/CD31) cells from 5 NSCLC patients underwent proteomic validation. Our IMC
results on the CK|(CD45/CD31) cells and CK|Vim|(CD45/CD31) cells showed concordance with the
immunofluorescence assay and further confirmed the epithelial and mesenchymal features, with expression of
CK7, CK8/18, β-Catenin, Vim, and also TWIST1 in the CK|Vim|(CD45/CD31) cell clusters, indicating
tumor microenvironment heterogeneity or the possibility of EMT. Furthermore, every
CK|(CD45/CD31) cell, some of the CK|Vim|(CD45/CD31) cells (3 out of 9 cells), and some of the Vim|(CD45/CD31)
cells (3 out of 5 cells) were observed to express CD59, which is known to be associated with metastasis
and overexpressed in many types of solid cancers including NSCLC. Together, CK|(CD45/CD31) cells,
which are cytokeratin positive and express markers typically found on endothelial cells, might be
tumor-associated endothelial cells. The CK|Vim|(CD45/CD31) cells, expressing both epithelial and
mesenchymal markers, may represent a highly plastic and heterogeneous population likely involved in EMT
and interactions with the tumor microenvironment, potentially reflecting increased migratory and invasive
capabilities that are essential for metastasis. Lastly, the Vim|(CD45/CD31) cells might be mesenchymal or
endothelial cells within the tumor microenvironment. Thus, our results not only confirm the phenotypic
diversity of CRCs using targeted proteomic analysis but also show their association with cancer as part of the
tumor microenvironment (TME).
The TME consists of a heterogeneous population of cells, including tumor cells and various
recruited nearby cells. The interaction between the tumor cells and the surrounding TME plays a crucial role in
extracellular matrix remodeling, tumorigenesis, angiogenesis, invasion, migration, metabolism, and
proliferation [79]. In our study, we identified CRCs and oncosomes associated with NSCLC. These
included not only CTCs but also TME cells expressing cancer-associated proteins. Our single-cell
genomic analysis revealed that these cells do not have genomic alterations, yet previous studies have
shown that such TME cells can induce cancer progression or promotion even without genetic alterations,
by producing various growth factors, chemokines, and cytokines [79], [80], [81]. Furthermore, previous
studies [77] showed that tumor-derived extracellular vesicles (EVs) including oncosomes could be a part
of the communication between the cancer cells and the TME. Artem et al. showed that EVs are critical in
transforming normal stromal cells into tumor-promoting cells by transferring molecules like RNA and
proteins. In NSCLC patients, we detected a significantly higher abundance of oncosomes, suggesting these vesicles
might have driven the transformation of normal cells surrounding the tumor into cancer-associated cells,
enhancing the tumor-supportive microenvironment. Further investigation is warranted to clearly
demonstrate the mechanisms by which TME cells play a role in cancer.
Taken together, our results demonstrate the feasibility of liquid biopsy to detect and identify
potential biomarkers including CTCs and circulating rare events of TME associated with NSCLC. While
our findings are promising in providing a comprehensive understanding of the biology
behind molecular heterogeneity in lung cancer and in demonstrating potential biomarkers, their clinical
utility in the screening or diagnosis of NSCLC needs to be further investigated. The
combination of biomarkers, as well as their integration with other diagnostic tools like imaging
techniques, presents a promising strategy in the field, while confirmatory tissue biopsy is still required.
Identifying the optimal strategy for routine clinical practice in NSCLC screening and diagnosis will
be necessary.
Chapter 3: Utilization of circulating rare cells in lung cancer screening
3.1. Introduction
Lung cancer is the most common type of cancer and the leading cause of cancer-related mortality,
and NSCLC accounts for about 85% of all lung cancers [2], [82], [83], [84]. Identification of NSCLC at an early
stage (stage I) offers a favorable prognosis through surgical resection, with 5-year survival
rates of 70–90% [85]. However, more than 75% of patients have advanced disease (stage III or IV) at the time
of diagnosis, and the prognosis remains poor, with a survival rate of around 15–19% [85]. SCLC is a more
aggressive disease than NSCLC, with a worse prognosis: the overall 5-year survival is 5%. More than 90%
of SCLC patients are diagnosed at stage III or IV. Therefore, for both SCLC and NSCLC, there is an
urgent need for the early detection of lung cancer. LDCT has been a promising screening method for the
early diagnosis of lung cancer and has been the only USPSTF-recommended screening test. However, as
we discussed earlier, LDCT has a high false-positive rate and potential drawbacks including radiation
exposure and overdiagnosis.
Multiple studies have suggested various machine learning-based approaches to complement this current screening
method [85]-[89]. In recent years, machine learning has gained prominence as a powerful tool for
resolving intricate challenges across various domains. In particular, machine learning-based diagnosis
promises to revolutionize healthcare by leveraging abundant patient data to provide precise and
personalized diagnoses [86], [91], [92]. Most diagnostic models using machine learning
in lung cancer have utilized image data from CT scans or X-rays [92]. Although those methods
could successfully classify lung malignancies, the underlying problems of CT screening itself remain.
Only a few studies have investigated machine learning-based diagnostic approaches using analytes
other than CT images. Hyunku Shin et al. performed diagnosis prediction using circulating exosomes by
deep learning-based spectroscopic analysis [93] but did not use lung cancer-specific markers
and had low specificity. Ying Xie et al. investigated metabolic biomarkers for early-stage lung cancer
diagnosis using various machine learning techniques [94], but the key limitation of low
specificity remained, because global metabolic changes cannot differentiate
cancer from other diseases with systemic metabolic alterations.
The discovery of CTCs and the feasibility of using them for disease detection have been
demonstrated in several cancer studies [16], [17]. In lung cancer, several studies have addressed the
potential of CTC identification in the diagnosis of the disease using various CTC detection approaches
[12], [14]. However, major hurdles for existing methods are the reproducibility and sensitivity of
CTC detection. Automated systems for CTC isolation and identification such as CellSearch
standardized the procedure but had low sensitivity. Furthermore, to investigate CTCs as a diagnostic
marker in lung cancer, it is necessary to demonstrate the capability of the approach to discriminate
between lung cancer patients, healthy subjects, and also the cohort who are at high risk but not diagnosed
with lung cancer. However, only a few studies have examined the association of CTCs in the high-risk
group.
Our previous work has demonstrated the feasibility of using the HDSCA3.0 to investigate and
identify potential biomarkers of circulating rare events, including CTCs and other rare cells of epithelial,
mesenchymal, endothelial, or hematological origin.
In this aim, we demonstrate the capability of HDSCA3.0 to serve as a complementary
diagnostic tool to the current screening method. We constructed a patient-level classification model based on
machine learning techniques stratifying lung cancer patients from the NDs and the cohort who are at high
risk of having lung cancer. Our results provide an opportunity for liquid biopsy to be utilized as a
complementary diagnostic approach to enhance early lung cancer detection and improve screening and
stratification of patients across different risk groups.
3.2. Materials and Methods
3.2.1. Liquid Biopsy Samples
All PB samples of up to 10 ml were collected in blood collection tubes (Streck, La Vista, NE,
USA) and processed as previously described [41].
Table 2. Patient collection information of NSCLC, SCLC and High Risk Individuals
Location of collection | Patient cohort | Collection criteria
Manchester, UK | 7 chemo-naive NSCLC patients; 2 chemo-naive SCLC patients; 43 not-cancer individuals (6 CT-positive, 8 indeterminate, 29 CT-negative patients classified from the LDCT scans; reported by National Health Service (NHS) consultant radiologists with an interest in thoracic radiology) | Patients were enrolled for the second annual screening round (T1) of Manchester’s ‘Lung Health Check’ (LHC) pilot of community-based lung cancer screening [95]. Individuals aged 55–74 with a history of smoking at participating general practices were invited to an LHC based on a 6-year lung cancer risk calculation (PLCOm2012), and individuals at higher risk (defined as ≥1.51% over 6 years) were offered annual LDCT screening.
Scripps Research Institute | 10 NDs | Individuals with no known pathology were included.
Epic Sciences | 50 NDs | Individuals with no known pathology were included.
3.2.2. Patient-Level Classification Modeling
A patient-level classification model was developed using XGBoost to distinguish between four different cohorts:
lung cancer patients, benign cases, high-risk individuals, and normal donors (NDs).
The dataset, consisting of approximately 700 continuous morphologic features of patient cells,
was divided into training (70%) and test (30%) sets, ensuring a balanced representation of all classes
(cancer, benign, high-risk, ND) in each set. To address the significant class imbalance, the Synthetic
Minority Over-sampling Technique (SMOTE) was employed. SMOTE generates synthetic samples by
interpolating between existing minority class samples, effectively balancing the dataset by increasing the
number of minority class samples [96], [97].
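The interpolation step that SMOTE performs can be sketched in a few lines. This is a simplified illustration rather than the implementation used in this work: the function name, the default of five nearest neighbours, the Euclidean distance, and the toy feature vectors are all assumptions. A common practice, also assumed here, is to oversample only the training split so that the test set contains real samples only.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from minority-class feature vectors.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy two-feature minority class, for illustration only.
minority = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
new_points = smote(minority, n_new=4)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, it always stays within the range spanned by the observed minority class.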
Among the machine learning models tested, including random forest and SVM, the XGBoost model showed
the best patient-level classification performance and was therefore chosen as the final model for our
analysis. The XGBoost classifier, as described by Chen and Guestrin [98], was also selected for its
efficiency and ability to handle high-dimensional data. XGBoost builds an ensemble of decision trees
sequentially, where each tree is trained to correct the errors of the previous trees. Feature selection
was performed by first training the XGBoost model with all features and ranking them by the importance
scores provided by the model. Features with importance scores below a cross-validated threshold were
removed to retain only the most informative features for the final training process. Model parameters
such as learning rate, maximum depth, number of estimators, and minimum child weight were optimized
through a grid search with cross-validation. The trained model was then used to make predictions on the
test set samples and was evaluated using accuracy and precision, which indicate how reliably the model
classifies individuals.
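
The idea that each tree corrects the errors of the previous trees can be illustrated with a small sketch. The code below is plain gradient boosting on squared error with one-split trees (stumps), written for illustration only. It is not XGBoost itself, which additionally uses second-order gradient information and regularization [98], and the names fit_stump and boost as well as the default settings are assumptions of this example.

```python
def fit_stump(X, residuals):
    """Find the single feature/threshold split that minimises squared error
    on the residuals. Assumes at least one feature has two distinct values."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    _, j, t, lm, rm = best
    return lambda row: lm if row[j] <= t else rm


def boost(X, y, n_trees=20, learning_rate=0.3):
    """Fit stumps one after another, each on the residuals left by the
    current ensemble, and return the combined prediction function."""
    base = sum(y) / len(y)
    trees = []
    preds = [base] * len(y)
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        tree = fit_stump(X, residuals)
        trees.append(tree)
        preds = [pi + learning_rate * tree(row) for pi, row in zip(preds, X)]
    return lambda row: base + learning_rate * sum(tree(row) for tree in trees)
```

The learning rate and the number of trees in this sketch play the same role as the learning rate and number of estimators tuned by the grid search.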
3.2.3. Statistical Analysis
Statistical analyses and visualization were performed with R (version 4.1.2). Statistical
significance was determined at a p-value ≤ 0.05. The Mann-Whitney U test was used to assess
statistical differences between clinical cohorts.
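
For readers unfamiliar with the test, the following sketch shows how the Mann-Whitney U statistic and a two-sided p-value can be computed. It is written in Python for illustration and is not the R code used in this study. It assumes the normal approximation with tie and continuity corrections, so for very small samples the p-value can differ from an exact test.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Variance of U with a correction for tied values.
    n = n1 + n2
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = max(abs(u1 - mean_u) - 0.5, 0.0) / math.sqrt(var_u)  # continuity correction
    return u1, min(1.0, math.erfc(z / math.sqrt(2)))
```

A call such as mann_whitney_u(cancer_counts, nd_counts), where both arguments are hypothetical lists of per-patient event frequencies, returns the statistic and p-value to compare against the 0.05 threshold.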
3.3. Results
This study consists of PB samples collected from 7 NSCLC patients, 2 SCLC patients, 14 benign
individuals, 29 high-risk individuals and 60 NDs. Two slides from each sample have been analyzed; 14
slides (2,405,711 cells/slide on average) from NSCLC patients, 4 slides (2,217,952 cells/slide on average)
from SCLC patients, 28 slides (2,252,315 cells/slide on average) from benign individuals, 58 slides
(2,459,783 cells/slide on average) from high-risk individuals, and 120 slides (2,174,311 cells/slide on
average) from NDs were used.
3.3.1. Characterization of CTCs in lung cancer patients, benign and high-risk individuals using
HDSCA3.0
We initially analyzed the cellular populations in lung cancer patients, as well as benign and
high-risk individuals, using HDSCA3.0. The phenotypic cellular populations were identified and
classified into nine distinct groups based on the positivity of four immunofluorescence staining markers:
DAPI, CK, Vim, and CD45/CD31. Figure 11A presents representative cells for each channel-based
classification group detected in lung cancer patients, benign, and high-risk individuals. The distribution of
signals from the CK, Vim, and CD45/CD31 immunofluorescence markers within these classified groups
is depicted in Figure 11B. Figure 11C illustrates the eccentricity distribution, which represents the shape
irregularity of each type of rare event.
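
The channel-based grouping can be pictured as a lookup on marker positivity. The sketch below is illustrative only: it takes positivity calls as given booleans (the thresholds behind them are not shown here), and it assumes that oncosomes are the DAPI-negative, CK-positive events, which is an assumption of this example rather than a definition stated in this section.

```python
def channel_group(dapi, ck, vim, cd):
    """Map marker positivity flags to a channel-based group label.

    dapi, ck, vim, cd are booleans for DAPI, CK, Vim and CD45/CD31 positivity.
    """
    if not dapi:
        # Assumption: DAPI-negative, CK-positive events are treated as oncosomes.
        return "oncosome" if ck else "unclassified"
    markers = (("CK", ck), ("Vim", vim), ("(CD45/CD31)", cd))
    positive = [name for name, flag in markers if flag]
    return "|".join(positive) if positive else "DAPI only"
```

With three binary markers on DAPI-positive events this gives eight cell groups, and the oncosome label brings the total to nine.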
Figure 11. Circulating rare events identified in the PB from lung cancer patients, benign and
high-risk individuals. A. Representative cell images of each channel-based cell group. Each row shows a
composite image plus each of the four channels separately; DAPI in blue, CK in red, Vim in white,
(CD45/CD31) in green. B. Signal distribution of immunofluorescent markers for channel-based cell groups.
Each cell group is correspondingly annotated to A with nine different colors. C. Eccentricity distribution of
channel-based cell groups.
We examined whether the number of detected circulating rare events by HDSCA3.0 is associated
with lung cancer patients, benign individuals, and high-risk individuals. The enumeration and proportion
of circulating rare events classified into nine different channel-based groups in each cohort are shown in
Figure 12A. The number of circulating rare events was compared between the lung cancer patients,
benign individuals, high-risk individuals, and NDs (Fig. 12B).
Across four different clinical groups, each CRC type with CK expression, including CK, CK|Vim,
CK|(CD45/CD31), CK|Vim|(CD45/CD31) cells, and oncosomes, showed statistically different
prevalences in lung cancer patients. The CK only CTCs were significantly more abundant in lung cancer
patients (mean: 1.7, range: 0 - 3.05 /10^6 cells) compared to benign individuals (mean: 0.65, range: 0 -
0.80 /10^6 cells), high-risk individuals (mean: 0.47, range: 0 - 1.33 /10^6 cells), and NDs (mean: 0.51,
range: 0 - 2.88 /10^6 cells). Additionally, CK|Vim CTCs were significantly more prevalent in lung cancer
patients (mean: 0.78, range: 0 - 1.24 /10^6 cells) compared to benign individuals (mean: 0.17, range: 0 -
0.10 /10^6 cells), high-risk individuals (mean: 0.05, range: 0 - 0.07 /10^6 cells), and NDs (mean: 0.006,
range: 0 - 0.009 /10^6 cells). Other candidate CTCs, including CK|(CD45/CD31) and CK|Vim|(CD45/CD31)
cells, showed significantly higher prevalences in lung cancer patients (mean: 0.89 and 2.87 /10^6 cells,
respectively) compared to benign individuals (mean: 0.76 and 1.68 /10^6 cells), high-risk individuals
(mean: 0.62 and 1.82 /10^6 cells), and NDs (mean: 0.58 and 0.89 /10^6 cells, respectively). Oncosomes
also showed significantly higher prevalences in lung cancer patients (mean: 3.98 events/10^6 cells)
compared to benign individuals (mean: 2.69 events/10^6 cells), high-risk individuals (mean: 2.18
events/10^6 cells), and NDs (mean: 0.85 events/10^6 cells). Additional CRCs, including Vim-positive only
cells, and rare cells
morphologically distinct from immune cells, such as Vim|(CD45/CD31) cells, (CD45/CD31)-positive
only cells, and DAPI-positive only cells, were also detected. Among these, Vim|(CD45/CD31) cells and
Vim-positive only cells were found to be significantly more abundant in lung cancer patients compared to
the other cohorts. Together, our results reveal and characterize the circulating rare events that are
differentially present in lung cancer patients compared to benign individuals, high-risk individuals, and
NDs.
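
The frequencies above are expressed per 10^6 cells. A minimal sketch of that normalization is shown below. The function name and the example count are hypothetical, and whether counts are normalized per slide or pooled across both slides of a sample is not specified here, so the per-slide usage is an assumption.

```python
def per_million(event_count, total_cells):
    """Convert a raw event count into events per 10^6 analyzed cells."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return event_count / total_cells * 1_000_000


# Hypothetical example: 4 events on a slide with 2,405,711 cells
# (the average NSCLC slide size reported in this section).
rate = per_million(4, 2_405_711)
```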
Figure 12. Circulating rare events identified in the PB from NSCLC patients and ND. A. (left)
Logarithmic scaled number of total rare cells is enumerated for each NSCLC patient and ND. (right)
Proportion plot of the rare cell groups of each NSCLC patient and NDs. Each rare cell group is annotated
with nine different colors which are shown at the right bottom of the figure. B. Logarithmic scaled box
plots comparing the number of cells of each cell group between lung cancer patients, benign individuals,
high-risk individuals and ND samples. *: p-value ≤ 0.05, **: p-value ≤ 0.01.
3.3.2. Case Study of Early Lung Cancer Detection
One of the high risk individuals without cancer who had a relatively higher prevalence of
CK|Vim|(CD45/CD31), Vim|(CD45/CD31) cells, oncosomes, and a higher total count of rare events
within the high risk cohort was diagnosed with lung cancer a year after the screening. This case study
suggests the potential of using liquid biopsy to detect cancer-associated CRCs and oncosomes for early
lung cancer detection.
A significant observation in our study was a high-risk individual who was initially screened
without a cancer diagnosis but was subsequently diagnosed with lung cancer a year later. This patient was
enrolled in this study as part of the high-risk cohort according to the PLCOm2012 calculation and was
offered annual LDCT screening. During the initial screening, this patient was not diagnosed with cancer
but exhibited an elevated number of CK|Vim|(CD45/CD31) cells, Vim|(CD45/CD31) cells, and oncosomes
compared to the average counts observed in other high-risk individuals. The total count of rare events was
also higher in this patient.
In Figure 13A, the different types of cells detected in this patient are shown along with their
representative cell images. On average, high-risk individuals had approximately 1.82
CK|Vim|(CD45/CD31) cells /10^6 cells, but this patient had 2.21 cells (Fig. 13B). Similarly, the average
number of Vim|(CD45/CD31) cells in high-risk individuals was 0.18 cells/10^6 cells, whereas this patient
had 0.82 cells /10^6 cells. For oncosomes, the average count in high-risk individuals was 1.74
events/10^6 cells, while this patient had 9.28 events/10^6 cells. Overall, the total count of rare events
in this patient was significantly higher than the average of 4.89 events/10^6 cells in other high-risk
individuals, with this patient having 14.06 events/10^6 cells.
Upon follow-up, this patient was diagnosed with NSCLC a year after the initial screening. This
case underscores the potential of using liquid biopsy to detect cancer-associated CRCs and oncosomes for
early lung cancer detection.
Figure 13. A case study of an NSCLC patient who was initially classified as high-risk without cancer
but was diagnosed a year after the screening. A. Representative cell images of each channel-based cell
group. Each row shows a composite image plus each of the four channels separately; DAPI in blue, CK in
red, Vim in white, (CD45/CD31) in green. B. The number of each type of rare event per 10^6 cells. Each
rare cell group is annotated with different colors which are shown at the right bottom of the figure.
3.3.3. Patient-level classification model
We also investigated whether the lung cancer patients and high-risk individuals could be
differentiated by applying HDSCA3.0 using a patient-level classification model. The patient-level
classification model was developed to distinguish between four different cohorts: lung cancer patients,
benign cases, high-risk individuals, and NDs.
Initially, the model was trained using the enumeration of channel-based rare cell groups as
features. For the classification between cancer and ND, the confusion matrix in Figure 14A shows an
accuracy of 66%, indicating suboptimal model performance. The feature importance ranking in
Figure 14B shows that CRC types with CK expression, including oncosomes, CK|Vim, CK-positive only,
and CK|(CD45/CD31) cells, were the most important features, indicating their role in distinguishing
cancer from ND. Similarly, for the classification between high-risk individuals and NDs, Figure 14C
shows an accuracy of 62.5%, and Figure 14D highlights CK-related CRC groups as top features, suggesting
their importance in differentiating high-risk individuals from ND.
Figure 14. Patient-level classification model results using enumeration data of channel-based rare cell
groups from lung cancer patients, high-risk individuals and NDs. A. Confusion matrix result of a model
classifying lung cancer patients and NDs. B. Feature importance of the CRC groups. C. Confusion matrix
result of a model classifying high-risk individuals and NDs. D. Feature importance of the CRC groups.
To improve performance, the model was then trained on all 761 quantitative cellular and nuclear
parameters, including the intensity of immunofluorescence markers and various morphometric
characteristics. For the classification between lung cancer patients and NDs, Figure 15A shows a high
accuracy of 87.5%, with the feature importance ranking in Figure 15B revealing CK and Vim-related
morphologic features as the most important. These features include not just intensity but also other
morphological characteristics, indicating that a comprehensive approach to cell profiling enhances
classification accuracy. All features are shown in Figure S6 ranked by importance. Similarly, for the
classification between high-risk individuals and NDs, Figure 15C shows a high accuracy of 83.3%, and
Figure 15D ranks CK and Vim-related morphologic features at the top, alongside other features like
CD45/CD31 and cell eccentricity, highlighting the complexity and multidimensional aspects required for
effective classification. All features ranked by importance are shown in Figure S7.
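
The ranking and filtering of features by importance can be sketched as follows. The feature names and scores in the usage comment are hypothetical, and the threshold is passed in as a plain number. In the actual workflow it was chosen by cross-validation, which this sketch does not reproduce.

```python
def select_features(importances, threshold):
    """Rank features by importance (highest first) and keep those at or
    above the threshold.

    importances: dict mapping feature name to importance score.
    """
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]


# Hypothetical usage:
# select_features({"f_a": 0.40, "f_b": 0.05, "f_c": 0.30}, threshold=0.10)
```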
Taken together, the HDSCA3.0 shows significant potential in differentiating lung cancer patients
and high-risk individuals from NDs using a patient-level classification model. By employing a
multidimensional approach that extends beyond channel intensity-based rare cell groups, the performance
of the patient-level classification model is significantly enhanced, utilizing comprehensive phenotypic
features together. Notably, the quantitative cellular and morphometric parameters such as cell eccentricity
have been crucial in improving diagnostic accuracy. These parameters offer detailed insights into cell
characteristics, which significantly enhance the ability of the model to distinguish between lung cancer
patients, high-risk individuals, and NDs. This highlights the necessity of comprehensive profiling and the
importance of employing non-enrichment analysis.
Figure 15. Patient-level classification model results using comprehensive phenotypic data of CRCs
from lung cancer patients, high-risk individuals and NDs. A. Confusion matrix result of a model
classifying lung cancer patients and NDs. B. Top 10 most important features ranked by importance. C.
Confusion matrix result of a model classifying high-risk individuals and NDs. D. Top 10 most important
features ranked by importance. (B,D) Characteristics of each feature are color labeled at the bottom right of
the plots.
3.4. Discussion
In this study, we demonstrated the following findings: 1) Identification of heterogeneous
circulating rare events, including CTCs, other CRCs and oncosomes in lung cancer patients, benign and
high-risk individuals. 2) Observation of cancer-associated CTCs, CRCs, and oncosomes highly prevalent
in lung cancer patients compared to benign and high-risk individuals, and NDs. 3) Confirmation of the
association of these cells with lung cancer from a case study of a high-risk individual diagnosed with lung
cancer afterwards. 4) Validation of the significant potential of the HDSCA3.0 platform in stratifying lung
cancer patients, high-risk individuals, and NDs using a patient-level classification model based on
investigation. Our findings highlight the utility of liquid biopsy for early lung cancer detection and patient
stratification, addressing the limitations of current screening methods.
Our initial analyses focused on characterizing the cellular populations in not only lung cancer
patients and benign cases, but also in the high-risk individuals. Even though LDCT is the only
recommended screening method for the early detection of lung cancer, a high false-positive rate of 96.4%
from the LDCT indicates an urgent need to find better or complementary approaches to discriminate
between healthy subjects, lung cancer patients, and high-risk individuals without lung cancer. Only
a few studies have investigated the association between CTCs and high-risk status for lung
cancer. Ilie et al. [99] demonstrated that CTCs detected in patients with chronic obstructive
pulmonary disease (COPD) using the ISET method, which isolates CTCs based on their larger size and
phenotypic cellular features, are associated with lung cancer development. The presence of
CTCs in pre-cancerous conditions such as COPD may result from chronic inflammation and
epithelial-mesenchymal transition (EMT), which cause pre-malignant cells to enter the bloodstream.
These mechanisms facilitate the early dissemination of cells that may later develop into cancer. Ilie et al.
showed that COPD patients with CTCs had lung nodules detected by CT scan 1-4 years after
CTC detection. This indicates the significance of CTC investigation in the high-risk group and
implies that differentiation from NDs is crucial for the early detection of lung cancer.
We utilized five different biomarkers to identify rare circulating events of epithelial,
mesenchymal, endothelial, or hematological origin. Our results cover a broader spectrum of
high-risk cohorts than the previous study [99], including patients with benign lung tumors
and ever-smokers aged 55 to 74 with a PLCOm2012 risk score of at least 1.51%.
This score is calculated based on demographics and clinical history, enhancing
the accuracy of lung cancer risk prediction and aiding in the identification of high-risk individuals for
effective screening. We demonstrated significant heterogeneity of CRCs in lung cancer patients,
benign and high-risk individuals. CK-positive CRCs, including CK-positive only cells, CK|Vim,
CK|(CD45/CD31), and CK|Vim|(CD45/CD31) cells, as well as oncosomes, were
more prevalent in lung cancer patients than in benign and high-risk individuals. Interestingly, those
CRCs and oncosomes were also more prevalent in high-risk individuals than in NDs. This
differentiation suggests the potential of these CRCs as biomarkers for early detection of lung
cancer. Specifically, a high-risk individual with a marked elevation of cancer-associated CRCs and
oncosomes was diagnosed with lung cancer a year after the initial screening. This case study emphasizes
the potential of using liquid biopsy to identify cancer-associated biomarkers for early detection of lung
cancer. The early detection of elevated CK|Vim|(CD45/CD31) cells, Vim|(CD45/CD31) cells, and
oncosomes in this patient suggests the predictive value of these markers and the importance of monitoring
high-risk individuals.
To investigate whether the detected rare events could further differentiate between lung cancer
patients, benign and high-risk cohorts, and NDs, we developed a machine learning-based patient-level
classifier. The model initially used the enumeration of channel-based rare cell groups. While these
results were consistent with previous studies in various cancers [26], [27], [28], indicating the
feasibility of CK CTCs and oncosomes as diagnostic markers in lung cancer, the model showed suboptimal
performance. With a multidimensional approach incorporating a comprehensive set of 761 quantitative
cellular and nuclear parameters, including the intensity of immunofluorescence markers and various
morphometric characteristics, the patient stratification performance improved substantially. The
improvement in model performance with the inclusion of morphometrics highlights the importance of
detailed cellular characterization. Morphometric features such as cell shape, size, and nuclear
characteristics provide a deeper understanding of the phenotypic diversity among the detected cells. These
features enabled more detailed and accurate cell identification, which channel-based classification
methods might have overlooked. This suggests that the channel-based classification needs further
development to fully leverage these detailed morphometric characteristics. The enhanced patient-level
classification model effectively differentiated lung cancer patients from NDs and high-risk individuals
from NDs with high accuracy, reinforcing the clinical utility of CRCs in advancing lung cancer
diagnostics. Furthermore, the classifier model to stratify lung cancer patients from high-risk cohorts
showed a high accuracy of 91.6% and both high specificity and sensitivity (91.6% and 91.6%,
respectively). Given that the current lung cancer screening has a high false positive rate, the stratification
between the high-risk cohorts and the lung cancer patients is critical. Therefore our investigation of rare
events that are influential to the stratification might provide potential feasibility of liquid biopsy to aid in
increasing the performance of current approaches for the early detection of lung cancer. The most
important phenotypic features of the rare events for the stratifications were mostly related to the CK and
Vim expression. Other features, including cell shape eccentricity and CD45/CD31
expression, were also important for stratifying lung cancer patients and high-risk
individuals from NDs. Together, we demonstrated that the rare events detected by HDSCA3.0 contribute
substantially to differentiating lung cancer patients and high-risk individuals.
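
The accuracy, sensitivity, and specificity quoted above are all derived from the cells of a binary confusion matrix. The sketch below shows the arithmetic. The counts in the example are hypothetical values chosen only because they give 11/12 (about 91.6%) for each metric; they are not the confusion matrix of this study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common metrics from binary confusion-matrix counts.

    tp/fn: cancer cases predicted as cancer / not cancer.
    tn/fp: non-cancer cases predicted as not cancer / cancer.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
    }


# Hypothetical counts for illustration only.
example = binary_metrics(tp=11, fp=1, tn=11, fn=1)
```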
While our study presents promising results, there are some limitations to consider. The relatively
small sample size of enrolled patients and characterized single cells may affect the generalizability of our
findings. Additionally, longitudinal studies are needed to validate the clinical efficacy of the
HDSCA3.0 platform in enabling early detection of lung cancer. Nevertheless, our findings underscore the
potential of liquid biopsy and the HDSCA3.0 platform in advancing early detection and stratification of
lung cancer, offering a promising complement to current screening methods.
Overall Discussion
Lung cancer remains a leading cause of cancer mortality, making early detection crucial for
improving survival rates. However, current diagnostic and screening methods face significant challenges.
While LDCT has been effective in reducing lung cancer mortality among high-risk individuals, it
suffers from high false-positive rates and overdiagnosis. Furthermore,
invasive procedures such as tissue biopsies are not only painful and expensive but also carry risks of
complications such as bleeding and infection. These limitations highlight the necessity for complementary
screening and diagnostic methods to enhance early lung cancer detection and management. This thesis
explores the potential of the HDSCA3.0 platform in addressing the challenges through the identification
and characterization of heterogeneous CRCs.
The key findings of my thesis include the following. Firstly, the HDSCA3.0 platform successfully
identified and characterized a broad spectrum of CRCs, including CTCs and oncosomes, not only in
SCLC and NSCLC patients but also in high-risk individuals and those with benign tumors. The platform
enabled further phenotypic and genotypic validation of these rare events, revealing significant
heterogeneity and providing insights into their potential clinical relevance. Importantly, Chapter 1 results
demonstrated cellular plasticity in SCLC, highlighting the dynamic nature of CTCs. Cellular plasticity
contributes to tumorigenesis, metastasis, therapy resistance and poor patient outcomes. The HDSCA3.0
platform’s ability to detect and characterize these phenotypically diverse cells with clonal genomic
alterations provides a deeper understanding of tumor biology and supports the development of more
effective diagnostic strategies. Secondly, the HDSCA3.0 platform demonstrated the significant clinical
utility of CRCs in the context of NSCLC by providing comprehensive profiling of these cells. This
included targeted proteomic validation, which confirmed the phenotypic heterogeneity of CRCs and their
association with lung cancer. The ability to detect and characterize these rare events enhances our
understanding of tumor biology and supports the development of more precise diagnostic strategies.
Lastly, the development and application of machine learning models based on data from the HDSCA3.0
platform significantly enhanced the stratification of lung cancer patients and high-risk individuals from
NDs. These models utilized comprehensive sets of quantitative cellular and morphometric parameters,
improving diagnostic accuracy. A notable case study within the high-risk cohort highlighted the potential
of the HDSCA3.0 platform for early detection. A high-risk individual, who initially exhibited elevated
levels of cancer-associated CRCs detected by our platform, was subsequently diagnosed with lung cancer
a year later. This case emphasizes the predictive value of CRCs and the importance of monitoring
high-risk individuals. Taken together, the results demonstrate the clinical utility of CRCs in providing a
non-invasive, precise method for early lung cancer detection and patient stratification.
In conclusion, the comprehensive, non-invasive assessment provided by the HDSCA3.0 platform
presented in this thesis has the potential to significantly improve patient outcomes by enabling earlier
detection and more precise monitoring of lung cancer. The HDSCA3.0 platform allows for a
comprehensive analysis of CRCs, including CTCs and oncosomes, which are crucial for understanding
the heterogeneity and dynamics of lung cancer. By providing detailed phenotypic profiles of these cells
and their association with both lung cancer and high-risk status of lung cancer, this approach enhances
diagnostic accuracy and helps identify cancer-associated biomarkers that are essential for early detection.
The clinical utility of the HDSCA3.0 platform lies in its potential to enhance the precision and accuracy
of lung cancer diagnostics. By detecting a broad spectrum of CRCs and validating their heterogeneity, the
platform offers insights into the dynamic nature of tumor cells, particularly in SCLC and NSCLC. This
capability is crucial for improving early detection, monitoring disease progression, and personalizing
treatment plans, ultimately leading to better patient outcomes.
Future research should focus on larger cohort studies to validate these findings and establish the
clinical utility of the identified biomarkers. Integrating liquid biopsy methodologies with other diagnostic
tools, such as imaging techniques, can further enhance screening accuracy. Additionally, advancements in
machine learning and data integration will likely provide deeper insights into disease progression and
patient management, moving towards precision medicine in lung cancer care. These advancements are
crucial for better management and prognosis of lung cancer, as they enable us to make more informed
decisions regarding diagnosis. The ability to detect lung cancer at an early stage in high-risk populations
can significantly reduce mortality rates and healthcare costs associated with advanced-stage treatments.
By providing a non-invasive, cost-effective, and accurate diagnostic tool, the HDSCA3.0 platform has the
potential to transform lung cancer screening programs and improve survival rates on a population level.
Ultimately, the integration of the HDSCA3.0 platform into clinical practice has the potential to lead to
improved quality of life and survival rates for lung cancer patients by enabling early detection of the
disease.
Supplementary Information
Figure S1: Representative cell images of the top important clusters. The cells of SCLC patients are shown on
the right side and the cells of NDs are shown on the left side for each cluster.
Figure S2. Circulating rare cells identified in the peripheral blood from SCLC patient samples using
HDSCA3.0 with the Landscape assay. A. Representative cell images of each cell group. Each row shows a
composite image plus each of the four channels separately; DAPI in blue, CK in red, Vim in white, (CD45/CD31) in
green.
Figure S3: Single cell CNA profiling of rare cells detected in PB samples from 14 SCLC patients. CNA
heatmap of all sequenced cells from 14 SCLC patients. The rare cell type of each cell is annotated with color labels
at the bottom of each of the heatmap. The color labels for each rare cell type are described at the bottom right. CNA
gains are shown in red, neutrals in white and losses in blue.
Figure S4: Feature importance of the features for the random forest model. The cellular and nuclear features
used for classification are shown by ranked importance. The colors of the feature names indicate their related
categories: red for CK-related features, green for Vim-related features, and black for other morphologic features
such as cell shape, size, and nuclear characteristics.
Figure S5: Single cell CNA profiling of rare cells detected in PB samples from 5 NSCLC patients. CNA
heatmap of all sequenced cells from 5 NSCLC patients. The rare cell type of each cell is annotated with color
labels at the bottom of each of the heatmap. The color labels for each rare cell type are described at the bottom
right. CNA gains are shown in red, neutrals in white and losses in blue.
Figure S6: Feature importance of the features for the patient-level model classifying lung cancer patients
and NDs. The cellular and nuclear features used for classification are shown by ranked importance. The colors of
the feature names indicate their related categories: red for CK-related features, light blue for Vim-related features,
green for CD45/CD31-related features, and gray for other morphologic features such as cell shape, size, and
nuclear characteristics. Detailed descriptions of the feature names can be found in Table S1.
Figure S7: Feature importance of the features for the patient-level model classifying high risk individuals
and NDs. The cellular and nuclear features used for classification are shown by ranked importance. The colors of
the feature names indicate their related categories: red for CK-related features, light blue for Vim-related features,
green for CD45/CD31-related features, yellow for features related to both CK and Vim, and gray for other
morphologic features such as cell shape, size, and nuclear characteristics. Detailed descriptions of the feature
names are provided in Table S1.
Table S1. Feature Naming Conventions in EBImage [100]
Label | Description
object.layer.feature | Basic structure of feature names where object is the type of object, layer is the reference image layer, and feature is the measurement.
0 | Binary mask layer
a, b, c, ... | Individual reference layers (e.g., different color channels)
Ba | Top-hat transformed reference layer ‘a’
cx, cy | Centroid coordinates (x, y)
majoraxis | Major axis length
eccentricity | Eccentricity of the object
radius.max | Maximum radius
mean | Mean intensity
sd | Standard deviation of intensity
mad | Median absolute deviation of intensity
q001, q005, q05 | Quantiles of intensity (1%, 5%, 50%)
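
As an illustration of the object.layer.feature convention in Table S1, the sketch below splits a feature name into its three parts and describes the layer code. The helper names are hypothetical, and the example names in the usage comment are illustrative strings that follow the pattern; they are not taken from the feature lists in Figures S6 and S7.

```python
def parse_feature_name(name):
    """Split 'object.layer.feature' into its parts.

    Only the first two dots are treated as separators, so a feature part
    that itself contains dots (e.g. 'radius.max') stays intact.
    """
    obj, layer, feature = name.split(".", 2)
    return {"object": obj, "layer": layer, "feature": feature}


def describe_layer(layer):
    """Translate a layer code from Table S1 into a short description."""
    if layer == "0":
        return "binary mask"
    if layer.startswith("B") and len(layer) > 1:
        return f"top-hat transformed reference layer '{layer[1:]}'"
    return f"reference layer '{layer}'"


# Illustrative usage:
# parse_feature_name("x.a.b.mean") -> object 'x', layer 'a', feature 'b.mean'
```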
Reprint Approval
Bibliography
[1] “Cancer Facts & Figures 2023,” American Cancer Society, 2024. [Online]. Available:
https://www.cancer.org/content/dam/cancer-org/research/cancer-facts-and-statistics/annual-cancer-facts-and-figures/2023/2023-cff-special-section-lung-cancer.pdf.
[2] H. Sung et al., “Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality
Worldwide for 36 Cancers in 185 Countries,” CA Cancer J. Clin., vol. 71, no. 3, pp. 209–249, May 2021,
doi: 10.3322/caac.21660.
[3] “Lung Cancer Survival Rates,” American Cancer Society. [Online]. Available:
https://www.cancer.org/cancer/lung-cancer/detection-diagnosis-staging/survival-rates.html.
[4] “Lung Cancer—Patient Version.” National Cancer Institute, 2024. [Online]. Available:
https://www.cancer.gov/types/lung/patient/lung-treatment-pdq.
[5] US Preventive Services Task Force, “Screening for Lung Cancer: US Preventive Services Task Force
Recommendation Statement,” JAMA, vol. 325, no. 10, pp. 962–970, Mar. 2021, doi:
10.1001/jama.2021.1117.
[6] “Lung cancer - Symptoms and causes,” Mayo Clinic. Accessed: Jun. 17, 2024. [Online]. Available:
https://www.mayoclinic.org/diseases-conditions/lung-cancer/symptoms-causes/syc-20374620
[7] D. E. Jonas et al., “Screening for Lung Cancer With Low-Dose Computed Tomography: Updated
Evidence Report and Systematic Review for the US Preventive Services Task Force,” JAMA, vol.
325, no. 10, pp. 971–987, Mar. 2021, doi: 10.1001/jama.2021.0377.
[8] J. Brodersen, T. Voss, F. Martiny, V. Siersma, A. Barratt, and B. Heleno, “Overdiagnosis of lung
cancer with low-dose computed tomography screening: meta-analysis of the randomised clinical
trials,” Breathe, vol. 16, no. 1, Mar. 2020, doi: 10.1183/20734735.0013-2020.
[9] Q.-Q. Zhu, C. Ma, Q. Wang, Y. Song, and T. Lv, “The role of TWIST1 in epithelial-mesenchymal
transition and cancers,” Tumour Biol. J. Int. Soc. Oncodevelopmental Biol. Med., vol. 37, no. 1, pp.
185–197, Jan. 2016, doi: 10.1007/s13277-015-4450-7.
[10]W. J. Allard et al., “Tumor Cells Circulate in the Peripheral Blood of All Major Carcinomas but not
in Healthy Subjects or Patients With Nonmalignant Diseases,” Clin. Cancer Res., vol. 10, no. 20, pp.
6897–6904, Oct. 2004, doi: 10.1158/1078-0432.CCR-04-0378.
[11] V. Hofman et al., “Preoperative circulating tumor cell detection using the isolation by size of
epithelial tumor cell method for patients with lung cancer is a new prognostic biomarker,” Clin.
Cancer Res. Off. J. Am. Assoc. Cancer Res., vol. 17, no. 4, pp. 827–835, Feb. 2011, doi:
10.1158/1078-0432.CCR-10-0445.
[12]F. Tanaka et al., “Circulating tumor cell as a diagnostic marker in primary lung cancer,” Clin. Cancer
Res. Off. J. Am. Assoc. Cancer Res., vol. 15, no. 22, pp. 6980–6986, Nov. 2009, doi:
10.1158/1078-0432.CCR-09-1095.
[13]A. Snow, D. Chen, and J. E. Lang, “The current status of the clinical utility of liquid biopsies in
cancer,” Expert Rev. Mol. Diagn., vol. 19, no. 11, pp. 1031–1041, Nov. 2019, doi:
10.1080/14737159.2019.1664290.
[14] C. Freitas et al., “The Role of Liquid Biopsy in Early Diagnosis of Lung Cancer,” Front. Oncol., vol.
11, p. 634316, Apr. 2021, doi: 10.3389/fonc.2021.634316.
[15]Z. Zhang, N. Ramnath, and S. Nagrath, “Current Status of CTCs as Liquid Biopsy in Lung Cancer
and Future Directions,” Front. Oncol., vol. 5, p. 209, 2015, doi: 10.3389/fonc.2015.00209.
[16]E. Dotan, S. J. Cohen, K. R. Alpaugh, and N. J. Meropol, “Circulating tumor cells: evolving evidence
and future challenges,” The Oncologist, vol. 14, no. 11, pp. 1070–1082, Nov. 2009, doi:
10.1634/theoncologist.2009-0094.
[17]F. Castro-Giner and N. Aceto, “Tracking cancer progression: from circulating tumor cells to
metastasis,” Genome Med., vol. 12, no. 1, p. 31, Mar. 2020, doi: 10.1186/s13073-020-00728-3.
[18]V. Maly, O. Maly, K. Kolostova, and V. Bobek, “Circulating Tumor Cells in Diagnosis and
Treatment of Lung Cancer,” In Vivo, vol. 33, no. 4, pp. 1027–1037, Jul. 2019, doi:
10.21873/invivo.11571.
[19]L. E. Lowes et al., “Circulating Tumor Cells (CTC) and Cell-Free DNA (cfDNA) Workshop 2016:
Scientific Opportunities and Logistics for Cancer Clinical Trial Incorporation,” Int. J. Mol. Sci., vol.
17, no. 9, p. 1505, Sep. 2016, doi: 10.3390/ijms17091505.
[20]S. Calabuig-Fariñas, E. Jantus-Lewintre, A. Herreros-Pomares, and C. Camps, “Circulating tumor
cells versus circulating tumor DNA in lung cancer—which one will win?,” Transl. Lung Cancer Res.,
vol. 5, no. 5, pp. 466–482, Oct. 2016, doi: 10.21037/tlcr.2016.10.02.
[21]A. Hanssen et al., “Characterization of different CTC subpopulations in non-small cell lung cancer,”
Sci. Rep., vol. 6, p. 28010, Jun. 2016, doi: 10.1038/srep28010.
[22]R. Königsberg et al., “Detection of EpCAM positive and negative circulating tumor cells in
metastatic breast cancer patients,” Acta Oncol. Stockh. Swed., vol. 50, no. 5, pp. 700–710, Jun. 2011,
doi: 10.3109/0284186X.2010.549151.
[23]T. Fehm, V. Müller, C. Alix-Panabières, and K. Pantel, “Micrometastatic spread in breast cancer:
detection, molecular characterization and clinical relevance,” Breast Cancer Res. BCR, vol. 10, no.
Suppl 1, p. S1, 2008, doi: 10.1186/bcr1869.
[24]B. Mostert et al., “Detection of circulating tumor cells in breast cancer may improve through
enrichment with anti-CD146,” Breast Cancer Res. Treat., vol. 127, no. 1, pp. 33–41, May 2011, doi:
10.1007/s10549-010-0879-y.
[25]A. Carlsson et al., “Circulating Tumor Microemboli Diagnostics for Patients with Non-Small Cell
Lung Cancer,” J. Thorac. Oncol. Off. Publ. Int. Assoc. Study Lung Cancer, vol. 9, no. 8, pp.
1111–1119, Aug. 2014, doi: 10.1097/JTO.0000000000000235.
[26]S. M. Setayesh et al., “Multianalyte liquid biopsy to aid the diagnostic workup of breast cancer,” Npj
Breast Cancer, vol. 8, no. 1, pp. 1–11, Sep. 2022, doi: 10.1038/s41523-022-00480-4.
[27]S. Chai et al., “Platelet-Coated Circulating Tumor Cells Are a Predictive Biomarker in Patients with
Metastatic Castrate-Resistant Prostate Cancer,” Mol. Cancer Res. MCR, vol. 19, no. 12, pp.
2036–2045, Dec. 2021, doi: 10.1158/1541-7786.MCR-21-0383.
[28]S. N. Shishido et al., “Cancer-related cells and oncosomes in the liquid biopsy of pancreatic cancer
patients undergoing surgery,” Npj Precis. Oncol., vol. 8, no. 1, p. 36, Feb. 2024, doi:
10.1038/s41698-024-00521-0.
[29]H. I. Scher et al., “Association of AR-V7 on Circulating Tumor Cells as a Treatment-Specific
Biomarker With Outcomes and Survival in Castration-Resistant Prostate Cancer,” JAMA Oncol., vol.
2, no. 11, pp. 1441–1449, Nov. 2016, doi: 10.1001/jamaoncol.2016.1828.
[30]A. K. Mattox, C. Bettegowda, S. Zhou, N. Papadopoulos, K. W. Kinzler, and B. Vogelstein,
“Applications of liquid biopsies for cancer,” Sci. Transl. Med., vol. 11, no. 507, p. eaay1984, Aug.
2019, doi: 10.1126/scitranslmed.aay1984.
[31]T. Cowling and H. Loshak, “An Overview of Liquid Biopsy for Screening and Early Detection of
Cancer,” in CADTH Issues in Emerging Health Technologies, in CADTH Horizon Scans. Ottawa
(ON): Canadian Agency for Drugs and Technologies in Health, 2016. Accessed: Jun. 17, 2024.
[Online]. Available: http://www.ncbi.nlm.nih.gov/books/NBK555478/
[32]S. C. Williamson et al., “Vasculogenic mimicry in small cell lung cancer,” Nat. Commun., vol. 7, no.
1, p. 13322, Nov. 2016, doi: 10.1038/ncomms13322.
[33]A. Krohn et al., “Tumor cell heterogeneity in Small Cell Lung Cancer (SCLC): phenotypical and
functional differences associated with Epithelial-Mesenchymal Transition (EMT) and DNA
methylation changes,” PloS One, vol. 9, no. 6, p. e100249, 2014, doi: 10.1371/journal.pone.0100249.
[34]V. da Silva-Diz, L. Lorenzo-Sanz, A. Bernat-Peguera, M. Lopez-Cerda, and P. Muñoz, “Cancer cell
plasticity: Impact on tumor progression and therapy response,” Semin. Cancer Biol., vol. 53, pp.
48–58, Dec. 2018, doi: 10.1016/j.semcancer.2018.08.009.
[35]N. Basumallik and M. Agarwal, “Small Cell Lung Cancer,” in StatPearls, Treasure Island (FL):
StatPearls Publishing, 2024. Accessed: Jun. 17, 2024. [Online]. Available:
http://www.ncbi.nlm.nih.gov/books/NBK482458/
[36]E. G. Pizzutilo et al., “Liquid Biopsy for Small Cell Lung Cancer either De Novo or Transformed:
Systematic Review of Different Applications and Meta-Analysis,” Cancers, vol. 13, no. 9, p. 2265,
May 2021, doi: 10.3390/cancers13092265.
[37]A. E. Revelo et al., “Liquid biopsy for lung cancers: an update on recent developments,” Ann. Transl.
Med., vol. 7, no. 15, p. 349, Aug. 2019, doi: 10.21037/atm.2019.03.28.
[38]C. Rolfo and A. Russo, “Liquid biopsy for early stage lung cancer moves ever closer,” Nat. Rev. Clin.
Oncol., vol. 17, no. 9, pp. 523–524, Sep. 2020, doi: 10.1038/s41571-020-0393-z.
[39]V. Foy, F. Fernandez-Gutierrez, C. Faivre-Finn, C. Dive, and F. Blackhall, “The clinical utility of
circulating tumour cells in patients with small cell lung cancer,” Transl. Lung Cancer Res., vol. 6, no.
4, Aug. 2017, doi: 10.21037/tlcr.2017.07.05.
[40]A. De Luca, M. Gallo, C. Esposito, A. Morabito, and N. Normanno, “Promising Role of Circulating
Tumor Cells in the Management of SCLC,” Cancers, vol. 13, no. 9, p. 2029, Apr. 2021, doi:
10.3390/cancers13092029.
[41]D. Marrinucci et al., “Fluid Biopsy in Patients with Metastatic Prostate, Pancreatic and Breast
Cancers,” Phys. Biol., vol. 9, p. 016003, Feb. 2012, doi: 10.1088/1478-3975/9/1/016003.
[42]S. Chai et al., “Identification of epithelial and mesenchymal circulating tumor cells in clonal lineage
of an aggressive prostate cancer case,” NPJ Precis. Oncol., vol. 6, p. 41, Jun. 2022, doi:
10.1038/s41698-022-00289-1.
[43]T. Baslan et al., “Genome-wide copy number analysis of single cells,” Nat. Protoc., vol. 7, no. 6, pp.
1024–1041, Jun. 2012, doi: 10.1038/nprot.2012.039.
[44]J. George et al., “Comprehensive genomic profiles of small cell lung cancer,” Nature, vol. 524, no.
7563, pp. 47–53, Aug. 2015, doi: 10.1038/nature14664.
[45]J. Hu et al., “Comprehensive genomic profiling of small cell lung cancer in Chinese patients and the
implications for therapeutic potential,” Cancer Med., vol. 8, no. 9, pp. 4338–4347, Jun. 2019, doi:
10.1002/cam4.2199.
[46]G. Sozzi et al., “The FHIT gene 3p14.2 is abnormal in lung cancer,” Cell, vol. 85, no. 1, pp. 17–26,
Apr. 1996, doi: 10.1016/s0092-8674(00)81078-8.
[47]S. N. Shishido et al., “Characterization of Cellular and Acellular Analytes from Pre-Cystectomy
Liquid Biopsies in Patients Newly Diagnosed with Primary Bladder Cancer,” Cancers, vol. 14, no. 3,
p. 758, Jan. 2022, doi: 10.3390/cancers14030758.
[48]S. Shen and J. Clairambault, “Cell plasticity in cancer cell populations,” F1000Research, vol. 9, p.
F1000 Faculty Rev-635, Jun. 2020, doi: 10.12688/f1000research.24803.1.
[49]C. E. Meacham and S. J. Morrison, “Tumour heterogeneity and cancer cell plasticity,” Nature, vol.
501, no. 7467, pp. 328–337, Sep. 2013, doi: 10.1038/nature12624.
[50]G. M. Wahl and B. T. Spike, “Cell state plasticity, stem cells, EMT, and the generation of
intra-tumoral heterogeneity,” Npj Breast Cancer, vol. 3, no. 1, pp. 1–13, Apr. 2017, doi:
10.1038/s41523-017-0012-z.
[51]A. P. Thankamony, K. Saxena, R. Murali, M. K. Jolly, and R. Nair, “Cancer Stem Cell Plasticity - A
Deadly Deal,” Front. Mol. Biosci., vol. 7, p. 79, 2020, doi: 10.3389/fmolb.2020.00079.
[52]B. Bakir, A. M. Chiarella, J. R. Pitarresi, and A. K. Rustgi, “EMT, MET, Plasticity, and Tumor
Metastasis,” Trends Cell Biol., vol. 30, no. 10, pp. 764–776, Oct. 2020, doi:
10.1016/j.tcb.2020.07.003.
[53]L. Welter et al., “Treatment response and tumor evolution: lessons from an extended series of
multianalyte liquid biopsies in a metastatic breast cancer patient,” Cold Spring Harb. Mol. Case
Stud., vol. 6, no. 6, p. a005819, Dec. 2020, doi: 10.1101/mcs.a005819.
[54]C. Riebensahm et al., “Clonality of circulating tumor cells in breast cancer brain metastasis patients,”
Breast Cancer Res., vol. 21, no. 1, p. 101, Sep. 2019, doi: 10.1186/s13058-019-1184-2.
[55]J. Wang, K. Wang, J. Xu, J. Huang, and T. Zhang, “Correction: Prognostic Significance of Circulating
Tumor Cells in Non-Small-Cell Lung Cancer Patients: A Meta-Analysis,” PLoS ONE, vol. 9, no. 1, Jan.
2014, doi: 10.1371/annotation/6633ed7f-a10c-4f6d-9d1d-9c1245822eb7.
[56]S. Ju et al., “Detection of circulating tumor cells: opportunities and challenges,” Biomark. Res., vol.
10, no. 1, p. 58, Aug. 2022, doi: 10.1186/s40364-022-00403-2.
[57]L.-M. Rieckmann et al., “Diagnostic leukapheresis reveals distinct phenotypes of NSCLC circulating
tumor cells,” Mol. Cancer, vol. 23, no. 1, p. 93, May 2024, doi: 10.1186/s12943-024-01984-2.
[58]R. P. L. Neves et al., “Proficiency Testing to Assess Technical Performance for CTC-Processing and
Detection Methods in CANCER-ID,” Clin. Chem., vol. 67, no. 4, pp. 631–641, Apr. 2021, doi:
10.1093/clinchem/hvaa322.
[59]P. Mondelo-Macía et al., “Current Status and Future Perspectives of Liquid Biopsy in Small Cell
Lung Cancer,” Biomedicines, vol. 9, no. 1, p. 48, Jan. 2021, doi: 10.3390/biomedicines9010048.
[60]E. Gerdtsson et al., “Multiplex protein detection on circulating tumor cells from liquid biopsies using
imaging mass cytometry,” Converg. Sci. Phys. Oncol., vol. 4, no. 1, p. 015002, Mar. 2018, doi:
10.1088/2057-1739/aaa013.
[61]S. Howard, T. Deroo, Y. Fujita, and N. Itasaki, “A positive role of cadherin in Wnt/β-catenin
signalling during epithelial-mesenchymal transition,” PloS One, vol. 6, no. 8, p. e23899, 2011, doi:
10.1371/journal.pone.0023899.
[62]T. Brabletz et al., “Invasion and Metastasis in Colorectal Cancer: Epithelial-Mesenchymal Transition,
Mesenchymal-Epithelial Transition, Stem Cells and β-Catenin,” Cells Tissues Organs, vol. 179, no.
1–2, pp. 56–65, Jun. 2005, doi: 10.1159/000084509.
[63]B. Li et al., “CD59 is overexpressed in human lung cancer and regulates apoptosis of human lung
cancer cells,” Int. J. Oncol., vol. 43, no. 3, pp. 850–858, Sep. 2013, doi: 10.3892/ijo.2013.2007.
[64]X. Ding, F. Li, and L. Zhang, “Knockdown of Delta-like 3 restricts lipopolysaccharide-induced
inflammation, migration and invasion of A2058 melanoma cells via blocking Twist1-mediated
epithelial-mesenchymal transition,” Life Sci., vol. 226, pp. 149–155, Jun. 2019, doi:
10.1016/j.lfs.2019.04.024.
[65]T. Liu et al., “The EMT transcription factor, Twist1, as a novel therapeutic target for pulmonary
sarcomatoid carcinomas,” Int. J. Oncol., vol. 56, no. 3, pp. 750–760, Mar. 2020, doi:
10.3892/ijo.2020.4972.
[66]H. Ren et al., “TWIST1 and BMI1 in Cancer Metastasis and Chemoresistance,” J. Cancer, vol. 7, no.
9, p. 1074, 2016, doi: 10.7150/jca.14031.
[67]M. G. Krebs et al., “Evaluation and prognostic significance of circulating tumor cells in patients with
non-small-cell lung cancer,” J. Clin. Oncol. Off. J. Am. Soc. Clin. Oncol., vol. 29, no. 12, pp.
1556–1563, Apr. 2011, doi: 10.1200/JCO.2010.28.7045.
[68]C. H. Huang et al., “A multicenter pilot study examining the role of circulating tumor cells as a
blood-based tumor marker in patients with extensive small-cell lung cancer,” Front. Oncol., vol. 4, p.
271, 2014, doi: 10.3389/fonc.2014.00271.
[69]A. Hanssen, S. Loges, K. Pantel, and H. Wikman, “Detection of Circulating Tumor Cells in
Non-Small Cell Lung Cancer,” Front. Oncol., vol. 5, p. 207, Sep. 2015, doi:
10.3389/fonc.2015.00207.
[70]M. G. Krebs et al., “Analysis of Circulating Tumor Cells in Patients with Non-small Cell Lung
Cancer Using Epithelial Marker-Dependent and -Independent Approaches,” J. Thorac. Oncol., vol. 7,
no. 2, pp. 306–315, Feb. 2012, doi: 10.1097/JTO.0b013e31823c5c16.
[71]V. Hofman et al., “Detection of circulating tumor cells as a prognostic factor in patients undergoing
radical surgery for non-small-cell lung carcinoma: comparison of the efficacy of the CellSearch Assay™
and the isolation by size of epithelial tumor cell method,” Int. J. Cancer, vol. 129, no. 7, pp.
1651–1660, Oct. 2011, doi: 10.1002/ijc.25819.
[72]S. N. Shishido et al., “Circulating tumor cells as a response monitor in stage IV non-small cell lung
cancer,” J. Transl. Med., vol. 17, no. 1, p. 294, Aug. 2019, doi: 10.1186/s12967-019-2035-8.
[73]W.-H. Chang, R. A. Cerione, and M. A. Antonyak, “Extracellular Vesicles and Their Roles in Cancer
Progression,” Methods Mol. Biol. Clifton NJ, vol. 2174, pp. 143–170, 2021, doi:
10.1007/978-1-0716-0759-6_10.
[74]H. Julich-Haertel et al., “Cancer-associated circulating large extracellular vesicles in
cholangiocarcinoma and hepatocellular carcinoma,” J. Hepatol., vol. 67, no. 2, pp. 282–292, Aug.
2017, doi: 10.1016/j.jhep.2017.02.024.
[75]D. C. I. Goberdhan, “Large tumour-derived extracellular vesicles as prognostic indicators of
metastatic cancer patient survival,” Br. J. Cancer, vol. 128, no. 3, pp. 471–473, Feb. 2023, doi:
10.1038/s41416-022-02055-3.
[76]T. Kato, J. V. Vykoukal, J. F. Fahrmann, and S. Hanash, “Extracellular Vesicles in Lung Cancer:
Prospects for Diagnostic and Therapeutic Applications,” Cancers, vol. 13, no. 18, p. 4604, Sep. 2021,
doi: 10.3390/cancers13184604.
[77]B. Sandfeld-Paulsen et al., “Exosomal Proteins as Diagnostic Biomarkers in Lung Cancer,” J.
Thorac. Oncol. Off. Publ. Int. Assoc. Study Lung Cancer, vol. 11, no. 10, pp. 1701–1710, Oct. 2016,
doi: 10.1016/j.jtho.2016.05.034.
[78]C. Pérez-Ramírez, M. Cañadas-Garre, A. I. Robles, M. Á. Molina, M. J. Faus-Dáder, and M. Á.
Calleja-Hernández, “Liquid biopsy in early stage lung cancer,” Transl. Lung Cancer Res., vol. 5, no.
5, pp. 517–524, Oct. 2016, doi: 10.21037/tlcr.2016.10.15.
[79]C. M. Neophytou, M. Panagi, T. Stylianopoulos, and P. Papageorgis, “The Role of Tumor
Microenvironment in Cancer Metastasis: Molecular Mechanisms and Therapeutic Opportunities,”
Cancers, vol. 13, no. 9, p. 2053, Apr. 2021, doi: 10.3390/cancers13092053.
[80]X. Bian et al., “Microvesicles and chemokines in tumor microenvironment: mediators of intercellular
communications in tumor progression,” Mol. Cancer, vol. 18, no. 1, p. 50, Mar. 2019, doi:
10.1186/s12943-019-0973-7.
[81]M. A. Nengroo, A. Verma, and D. Datta, “Cytokine chemokine network in tumor microenvironment:
Impact on CSC properties and therapeutic applications,” Cytokine, vol. 156, p. 155916, Aug. 2022,
doi: 10.1016/j.cyto.2022.155916.
[82]R. L. Siegel, K. D. Miller, N. S. Wagle, and A. Jemal, “Cancer statistics, 2023,” CA. Cancer J. Clin.,
vol. 73, no. 1, pp. 17–48, Jan. 2023, doi: 10.3322/caac.21763.
[83]“SEER Cancer Statistics Review, 1975-2016,” SEER. Accessed: Jun. 17, 2024. [Online]. Available:
https://seer.cancer.gov/csr/1975_2016/index.html
[84]A. N. Giaquinto, K. D. Miller, K. Y. Tossas, R. A. Winn, A. Jemal, and R. L. Siegel, “Cancer
statistics for African American/Black People 2022,” CA. Cancer J. Clin., vol. 72, no. 3, pp. 202–229,
May 2022, doi: 10.3322/caac.21718.
[85]S. Blandin Knight, P. A. Crosbie, H. Balata, J. Chudziak, T. Hussell, and C. Dive, “Progress and
prospects of early detection in lung cancer,” Open Biol., vol. 7, no. 9, p. 170070, Sep. 2017, doi:
10.1098/rsob.170070.
[86]A. Esteva et al., “A guide to deep learning in healthcare,” Nat. Med., vol. 25, no. 1, pp. 24–29, Jan.
2019, doi: 10.1038/s41591-018-0316-z.
[87]S. Bhatia, Y. Sinha, and L. Goel, “Lung Cancer Detection: A Deep Learning Approach,” in Soft
Computing for Problem Solving, J. C. Bansal, K. N. Das, A. Nagar, K. Deep, and A. K. Ojha, Eds.,
Singapore: Springer, 2019, pp. 699–705. doi: 10.1007/978-981-13-1595-4_55.
[88]J. De Fauw et al., “Clinically applicable deep learning for diagnosis and referral in retinal disease,”
Nat. Med., vol. 24, no. 9, pp. 1342–1350, Sep. 2018, doi: 10.1038/s41591-018-0107-6.
[89]G. Litjens et al., “Deep learning as a tool for increased accuracy and efficiency of histopathological
diagnosis,” Sci. Rep., vol. 6, no. 1, p. 26286, May 2016, doi: 10.1038/srep26286.
[90]Z. Zhang et al., “Pathologist-level interpretable whole-slide cancer diagnosis with deep learning,”
Nat. Mach. Intell., vol. 1, no. 5, pp. 236–245, May 2019, doi: 10.1038/s42256-019-0052-1.
[91]K.-H. Yu, A. L. Beam, and I. S. Kohane, “Artificial intelligence in healthcare,” Nat. Biomed. Eng.,
vol. 2, no. 10, pp. 719–731, Oct. 2018, doi: 10.1038/s41551-018-0305-z.
[92]S. Nageswaran et al., “Lung Cancer Classification and Prediction Using Machine Learning and
Image Processing,” BioMed Res. Int., vol. 2022, p. 1755460, 2022, doi: 10.1155/2022/1755460.
[93]H. Shin et al., “Early-Stage Lung Cancer Diagnosis by Deep Learning-Based Spectroscopic Analysis
of Circulating Exosomes,” ACS Nano, vol. 14, no. 5, pp. 5435–5444, May 2020, doi:
10.1021/acsnano.9b09119.
[94]Y. Xie et al., “Early lung cancer diagnostic biomarker discovery by machine learning methods,”
Transl. Oncol., vol. 14, no. 1, p. 100907, Jan. 2021, doi: 10.1016/j.tranon.2020.100907.
[95]P. A. Crosbie et al., “Second round results from the Manchester ‘Lung Health Check’
community-based targeted lung cancer screening pilot,” Thorax, vol. 74, no. 7, pp. 700–704, Jul.
2019, doi: 10.1136/thoraxjnl-2018-212547.
[96]R. Blagus and L. Lusa, “SMOTE for high-dimensional class-imbalanced data,” BMC Bioinformatics,
vol. 14, no. 1, p. 106, Mar. 2013, doi: 10.1186/1471-2105-14-106.
[97]N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic Minority
Over-sampling Technique,” J. Artif. Intell. Res., vol. 16, pp. 321–357, Jun. 2002, doi:
10.1613/jair.953.
[98]T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System,” in Proceedings of the 22nd
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco
California USA: ACM, Aug. 2016, pp. 785–794. doi: 10.1145/2939672.2939785.
[99]M. Ilie et al., “‘Sentinel’ circulating tumor cells allow early diagnosis of lung cancer in patients with
chronic obstructive pulmonary disease,” PloS One, vol. 9, no. 10, p. e111597, 2014, doi:
10.1371/journal.pone.0111597.
[100]“EBImage—an R package for image processing with applications to cellular phenotypes |
Bioinformatics | Oxford Academic.” Accessed: Jul. 05, 2024. [Online]. Available:
https://academic.oup.com/bioinformatics/article/26/7/979/211835
Asset Metadata
Creator
Seo, Ji Youn (author)
Core Title
Early detection of lung cancer by characterizing circulating rare cells using peripheral blood liquid biopsy
Contributor
Electronically uploaded by the author (provenance)
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Computational Biology and Bioinformatics
Degree Conferral Date
2024-08
Publication Date
08/07/2024
Defense Date
06/24/2024
Publisher
Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital)
Tag
circulating rare cell, circulating tumor cell, early detection, liquid biopsy, lung cancer, NSCLC, OAI-PMH Harvest, SCLC
Format
theses (aat)
Language
English
Advisor
Kuhn, Peter (committee chair), Katritch, Vsevolod Seva (committee member), Mason, Jeremy (committee member), Rohs, Remo (committee member), Sun, Fengzhu (committee member)
Creator Email
jiyounse@usc.edu,sjyjoy92@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113998TE1
Unique identifier
UC113998TE1
Identifier
etd-SeoJiYoun-13358.pdf (filename)
Legacy Identifier
etd-SeoJiYoun-13358
Document Type
Thesis
Rights
Seo, Ji Youn
Internet Media Type
application/pdf
Type
texts
Source
20240813-usctheses-batch-1195 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
uscdl@usc.edu
Abstract
Lung cancer, the second most common cancer in men and women and the leading cause of cancer mortality in the United States, has a significantly higher survival rate when detected early. Current screening methods, such as low-dose computed tomography (LDCT), have limitations including high false-positive rates and overdiagnosis. Liquid biopsy through peripheral blood (PB) analysis presents a promising complementary approach for early detection by identifying biomarkers, including circulating rare cells (CRCs) such as circulating tumor cells (CTCs). This study employs the high-definition single-cell assay (HDSCA) workflow, an enrichment-independent method, for the early detection and comprehensive analysis of CRCs, with the aim of improving patient outcomes. The third-generation HDSCA (HDSCA3.0) incorporates advanced immunofluorescence staining and an unsupervised rare event detection algorithm to capture a comprehensive population of CRCs. The results indicate that HDSCA3.0 successfully identifies a diverse range of circulating rare events, providing valuable insights into the heterogeneity of lung cancer. This thesis characterizes the phenotypic heterogeneity of CRCs in small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) patients and evaluates the feasibility of HDSCA3.0 for lung cancer screening in high-risk individuals, demonstrating that this approach can enhance diagnostic precision and potentially improve patient outcomes.