Experimental data for FA model development and validation:
Example of multi-omics analysis of an SCC lesion from one FA
individual
The tumorigenesis process (Fig. 6A) results in a heterogeneous
composition of tumors, i.e., each tumor contains cells in various
stages of the transformation process to aggressively metastasizing
cells. Importantly, tumors are composed not only of malignant
proliferating cells but also of multiple other cell types, making the
tumor mass a complex ecosystem that includes immune cells of multiple
types (B cells, T cells, macrophages, etc.), tumor-associated
fibroblasts, endothelial cells (60) and even microbes, including
bacteria and fungi (61). At the same time, a tumor is composed not only
of cells but also of extracellular matrix and secreted factors that
carry signals between cells (60). If malignant tumors from individuals
with FA are to be characterized and this information used for accurate
model building, all these factors must be accounted for. In this
respect, high-throughput multi-omics technologies can profile the
components of the tumor of interest, generating data in multiple
modalities that need to be integrated and can potentially be exploited
to discover novel biomarkers and therapeutic targets for individuals
with FA.
Of note, classical DNA sequencing, RNA sequencing (RNA-seq), and protein
detection technologies cannot deconvolute the complex tumor composition
described above, since they use the bulk content of the tumor or tissue
and are, therefore, constrained to detecting the mean expression of
molecules or the presence of a predominant DNA sequence, thus losing
information on minor cell populations or incipient cellular clones
(62). However, we are
witnessing the appearance, development, and refinement of multiple
technologies with the capacity to resolve the cellular heterogeneity of
tumors. Among these technologies, one of the most popular is single-cell
RNA-seq (scRNAseq), which has given rise to a growing number of
datasets from liquid and solid tumors (after tumor dissociation), as
well as healthy tissue, leading to a compendium of
single-cell-resolution gene expression atlases of multiple tissues and
organs (63). Although scRNAseq is a technology that has revolutionized
the resolution at which we analyze cell populations and tissues, it
still lacks a critical component, i.e., preservation of tissue
architecture in its original context (62).
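
The loss of minor populations in bulk measurements can be illustrated with a small numerical sketch. All values below are hypothetical and chosen only for illustration (a marker gene, 980 cells of a dominant population, 20 cells of an emerging clone, and an arbitrary threshold); they are not data from the tumor described in this section.

```python
from statistics import mean

# Hypothetical expression values (arbitrary units) for one marker gene.
# 980 cells of the dominant population express it at a low level,
# 20 cells of an emerging clone express it at a high level.
dominant = [1.0] * 980
minor_clone = [20.0] * 20
cells = dominant + minor_clone

# Bulk measurement: a single number, the mean over all cells.
bulk_signal = mean(cells)

# Single-cell measurement: every cell is kept, so the clone stays visible.
threshold = 10.0  # illustrative cut-off, not a validated value
high_cells = [value for value in cells if value > threshold]
minor_fraction = len(high_cells) / len(cells)

print(f"bulk mean expression: {bulk_signal:.2f}")
print(f"fraction of high-expressing cells: {minor_fraction:.1%}")
```

In this toy case the bulk value is only slightly above the level of the dominant population, whereas the per-cell values reveal a distinct subpopulation making up 2% of the cells.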
In the context of FA cancer, we are interested in implementing
multi-omics technologies that generate single-cell-resolution data
without tissue disaggregation and, therefore, maintain tissue
architecture. The latter implies the
preservation of cellular neighborhoods and cell-cell interactions, which
are lost when the tissue is disaggregated. These technologies are known
as spatial omics and include spatial transcriptomics, spatial
proteomics, and spatial genomics, which combine molecular
characterization with spatial resolution (64). The aim of these
spatially resolved technologies is to assign omics information to
locations in the tissue, reaching cellular and subcellular resolution.
Spatial genomics assigns DNA sequencing information, including
copy-number variants and somatic mutations, to tissue locations;
spatial transcriptomics provides the number of transcripts of a given
gene per region; and spatial proteomics provides relative protein
abundances (64). The data obtained by these multi-omics technologies
are high-dimensional and require powerful computational tools
for their analysis. Although intense research is underway to improve
all spatial omics technologies, the most developed are spatial
transcriptomics and spatial proteomics. These technologies will allow
for the detection and quantification of cell populations of interest,
the discovery of new cell populations, the comparison of the abundance
of cell populations across the carcinogenic progression and the
quantitative and qualitative description of infiltrating immune cells
(65, 66). These technologies have the capacity to compensate for the
lack of resolution of bulk sequencing analyses, which has hampered the
detection of premalignant clones at early stages in FA (67).
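
As a minimal sketch of what preserving cellular neighborhoods means computationally, the example below stores each cell with its position and assigned type and counts the cell types found within a fixed radius. The coordinates, cell types, radius, and the helper `neighborhood` are all hypothetical and introduced only for illustration; they do not correspond to a specific spatial omics platform or software package.

```python
from collections import Counter
from math import dist

# Hypothetical cells: (x, y) position in micrometres and an assigned cell type.
cells = [
    {"id": 0, "xy": (0.0, 0.0), "type": "tumor"},
    {"id": 1, "xy": (5.0, 0.0), "type": "tumor"},
    {"id": 2, "xy": (0.0, 6.0), "type": "T cell"},
    {"id": 3, "xy": (40.0, 40.0), "type": "macrophage"},
    {"id": 4, "xy": (44.0, 41.0), "type": "T cell"},
]


def neighborhood(cell, all_cells, radius):
    """Count cell types within `radius` of `cell` (the cell itself excluded)."""
    return Counter(
        other["type"]
        for other in all_cells
        if other["id"] != cell["id"] and dist(cell["xy"], other["xy"]) <= radius
    )


radius_um = 10.0  # illustrative neighborhood radius
for cell in cells:
    print(cell["id"], cell["type"], dict(neighborhood(cell, cells, radius_um)))
```

Such a neighborhood table can only be built when coordinates are retained; after tissue dissociation the `xy` field no longer exists and the question of which cells were adjacent cannot be answered.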
Here, we use as an example a hypopharyngeal cancer from a 41-year-old woman
with FA. The hematoxylin- and eosin-stained tumor sample shows
multistage carcinogenesis, ranging from low-grade dysplasia (yellow) to
high-grade dysplasia (orange) and invasive carcinoma (red)
(Fig. 6B). This type of formalin-fixed paraffin-embedded (FFPE)
sample can be used for exploration and information retrieval using one
or more of the multi-omics technologies discussed here (see also
Box 2). If, for example, tissue-based cyclic immunofluorescence
(t-CycIF) is used, multiple sequential pictures of the tissue stained
with fluorescent antibodies will be acquired and stitched. The composite
image that is generated must first be segmented using artificial
intelligence-based programs, such as ASHLAR (66), which recognize every
cell nucleus and apply single-cell-level segmentation of the tumor
(Fig. 6C, upper left panel). For every cell, we can
extract the expression of every marker of interest as features and then
apply unsupervised machine learning-based algorithms, such as uniform
manifold approximation and projection (UMAP) (Fig. 6C, upper
right panel), which group cells based on the similarity
of their expressed markers. This allows the separate visualization of
cell populations (68), such as cancer cells and tumor-infiltrating
immune cells. After feature extraction, we can explore the expression of
markers of interest in every tumor population or across the tumor
progression, for example, the proportion of proliferating cells
(Fig. 6C, lower left panel), the relative expression of p53
(Fig. 6C, lower right panel) or other markers of interest.
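
The feature-extraction and quantification steps described above can be sketched on toy data. The segmentation mask, the marker image, the assignment of cells to histological regions, the positivity threshold, and both helper functions are hypothetical; this is a simplified illustration under those assumptions, not the pipeline used to generate Fig. 6C.

```python
from collections import defaultdict

# Toy 4x6 segmentation mask: 0 = background, 1..3 = cell labels (hypothetical).
mask = [
    [1, 1, 0, 2, 2, 0],
    [1, 1, 0, 2, 2, 0],
    [0, 0, 0, 0, 3, 3],
    [0, 0, 0, 0, 3, 3],
]
# Toy intensity image of one proliferation marker, same shape as the mask.
marker = [
    [9, 8, 0, 1, 2, 0],
    [7, 8, 0, 1, 0, 0],
    [0, 0, 0, 0, 6, 7],
    [0, 0, 0, 0, 5, 6],
]
# Hypothetical annotation of each segmented cell with a histological region.
region_of_cell = {
    1: "invasive carcinoma",
    2: "low-grade dysplasia",
    3: "invasive carcinoma",
}


def mean_intensity_per_cell(mask, image):
    """Feature extraction: mean marker intensity over the pixels of each cell."""
    sums, counts = defaultdict(float), defaultdict(int)
    for mask_row, image_row in zip(mask, image):
        for label, value in zip(mask_row, image_row):
            if label != 0:  # skip background pixels
                sums[label] += value
                counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def positive_fraction_per_region(features, region_of_cell, threshold):
    """Fraction of cells per region whose mean intensity exceeds `threshold`."""
    positive, total = defaultdict(int), defaultdict(int)
    for label, value in features.items():
        region = region_of_cell[label]
        total[region] += 1
        positive[region] += value > threshold
    return {region: positive[region] / total[region] for region in total}


features = mean_intensity_per_cell(mask, marker)
fractions = positive_fraction_per_region(features, region_of_cell, threshold=4.0)
print(features)
print(fractions)
```

With many markers instead of one, the per-cell feature table produced by the first step would be the kind of input passed to a dimensionality reduction method such as UMAP, while the second step corresponds to summaries such as the proportion of proliferating cells per stage.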