Experimental data for FA model development and validation: Example of multi-omic analysis of an SCC lesion from one FA individual

The tumorigenesis process (Fig. 6A) results in tumors of heterogeneous composition, i.e., each tumor contains cells at various stages of transformation toward aggressively metastasizing cells. Importantly, tumors are composed not only of malignant proliferating cells but also of multiple other cell types, making the tumor mass a complex ecosystem that includes immune cells of multiple types (B cells, T cells, macrophages, etc.), tumor-associated fibroblasts, endothelial cells (60) and even microbes, including bacteria and fungi (61). Moreover, a tumor comprises not only cells but also extracellular matrix and secreted factors that carry signals between cells (60). If malignant tumors from individuals with FA are to be characterized and this information used for accurate model building, all these factors must be accounted for. In this respect, high-throughput multi-omics technologies can profile the components of the tumor of interest, generating data in multiple modalities that need to be integrated and can potentially be exploited to discover novel biomarkers and therapeutic targets for individuals with FA.
Of note, classical DNA sequencing, RNA sequencing (RNA-seq), and protein detection technologies cannot deconvolute the complex tumor composition described above. Because they use the bulk content of the tumor or tissue, they are constrained to detecting the mean expression of molecules or the presence of a predominant DNA sequence, and thus lose information on minor cell populations and incipient cellular clones (62). However, we are witnessing the appearance, development, and refinement of multiple technologies with the capacity to resolve the cellular heterogeneity of tumors. Among these technologies, one of the most popular is single-cell RNA-seq (scRNAseq), which has given rise to a growing number of datasets from liquid and solid tumors (after tumor dissociation), as well as healthy tissue, leading to a compendium of single-cell-resolution gene expression atlases of multiple tissues and organs (63). Although scRNAseq has revolutionized the resolution at which we analyze cell populations and tissues, it still lacks a critical component, i.e., preservation of tissue architecture in its original context (62).
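To illustrate why a bulk measurement can hide a minor cell population, the following minimal sketch compares the mean signal of a mixed sample with a per-cell view of the same sample. All values are hypothetical (a 2% clone with tenfold higher expression of one marker, and an arbitrary threshold of 5.0); it is a toy calculation, not a model of any real assay.

```python
from statistics import mean

# Hypothetical per-cell expression of one marker gene (arbitrary units).
majority = [1.0] * 980      # dominant cell population
minor_clone = [10.0] * 20   # incipient clone, 2% of all cells
cells = majority + minor_clone

# A bulk assay reports roughly the average over all cells in the sample.
bulk_signal = mean(cells)

# A single-cell assay keeps one value per cell, so the clone can be counted.
THRESHOLD = 5.0  # illustrative cutoff, not taken from the source
high_cells = [x for x in cells if x > THRESHOLD]
minor_fraction = len(high_cells) / len(cells)

print(f"bulk mean expression: {bulk_signal:.2f}")
print(f"cells above threshold: {len(high_cells)} ({minor_fraction:.1%})")
```

In this toy sample the bulk mean rises only from 1.00 to 1.18, a shift that could easily be attributed to noise, whereas the per-cell values identify the 20 clone cells directly.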
In the context of FA cancer, we are interested in implementing technologies that, in a multi-omics fashion, generate single-cell-resolution data without tissue disaggregation and, therefore, maintain tissue architecture. The latter implies the preservation of cellular neighborhoods and cell-cell interactions, which are lost when the tissue is disaggregated. These technologies are known as spatial omics and include spatial transcriptomics, spatial proteomics, and spatial genomics, which combine molecular characterization with spatial resolution (64). The aim of these spatial resolution technologies is to assign omics information to spatial locations in the tissue, reaching cellular and subcellular resolution. Spatial genomics assigns DNA sequencing information, including copy-number variants and somatic mutations; spatial transcriptomics provides the number of transcripts of a given gene per region; and spatial proteomics provides relative protein abundances (64). The data obtained by these multi-omics technologies are high-dimensional and require powerful computational tools for their analysis. Although intense research is underway to improve all spatial omics technologies, the most developed are spatial transcriptomics and spatial proteomics. These technologies will allow the detection and quantification of cell populations of interest, the discovery of new cell populations, the comparison of the abundance of cell populations across carcinogenic progression, and the quantitative and qualitative description of infiltrating immune cells (65, 66). They can also compensate for the lack of resolution of bulk sequencing analyses, which has hampered the detection of premalignant clones at early stages in FA (67).
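As an illustration of what preserved cellular neighborhoods make possible, the sketch below counts, for each cell, the types of the cells lying within a fixed radius of it. The coordinates, cell-type labels, the 15 µm radius, and the helper name `neighborhood` are all hypothetical choices made for this example; real spatial omics pipelines operate on many thousands of cells and typically use spatial indexing rather than the all-pairs comparison shown here.

```python
from collections import Counter
from math import dist

# Hypothetical segmented cells: centroid (x, y) in micrometres and an assigned type.
cells = [
    {"id": 0, "xy": (0.0, 0.0), "type": "tumor"},
    {"id": 1, "xy": (8.0, 0.0), "type": "T cell"},
    {"id": 2, "xy": (0.0, 9.0), "type": "tumor"},
    {"id": 3, "xy": (50.0, 50.0), "type": "macrophage"},
    {"id": 4, "xy": (55.0, 52.0), "type": "T cell"},
]

def neighborhood(cell, all_cells, radius):
    """Count cell types within `radius` of `cell`, excluding the cell itself."""
    return Counter(
        other["type"]
        for other in all_cells
        if other["id"] != cell["id"] and dist(cell["xy"], other["xy"]) <= radius
    )

RADIUS_UM = 15.0  # illustrative cutoff, not taken from the source
profiles = {c["id"]: neighborhood(c, cells, RADIUS_UM) for c in cells}

for cell_id, profile in profiles.items():
    print(cell_id, dict(profile))
```

Once the tissue is dissociated the centroids are no longer available, so a neighborhood profile of this kind cannot be computed from scRNAseq data alone.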
Here, we use as an example a hypopharyngeal cancer from a 41-year-old woman with FA. The hematoxylin- and eosin-stained tumor sample shows multistage carcinogenesis, ranging from low-grade dysplasia (yellow) to high-grade dysplasia (orange) and invasive carcinoma (red) (Fig. 6B). This type of formalin-fixed paraffin-embedded (FFPE) sample can be used for exploration and information retrieval with one or more of the multi-omics technologies discussed here (see also Box 2). If, for example, tissue-cyclic immunofluorescence (t-CycIF) is used, multiple sequential images of the tissue stained with fluorescent antibodies are acquired and stitched. The resulting composite image must first be segmented using artificial intelligence-based programs, such as ASHLAR (66), which recognize every cell nucleus and apply single-cell-level segmentation of the tumor (Fig. 6C, upper left panel). For every cell, we can extract the expression of every marker of interest as features and then apply unsupervised machine learning-based algorithms, such as uniform manifold approximation and projection (UMAP) (Fig. 6C, upper right panel), which group cells according to the similarity of their expressed markers. This allows the separate visualization of cell populations (68), such as cancer cells and tumor-infiltrating immune cells. After feature extraction, we can explore the expression of markers of interest in every tumor population or across tumor progression, for example the proportion of proliferating cells (Fig. 6C, lower left panel), the relative expression of p53 (Fig. 6C, lower right panel) or other markers of interest.
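The feature-extraction and quantification steps described above can be sketched on a toy example. The snippet assumes that segmentation has already produced a label mask in which each cell carries an integer ID, and it computes the mean intensity of one proliferation marker per cell, followed by the proportion of marker-positive cells per histological region. The mask, intensities, region annotations, and positivity threshold are hypothetical, and the embedding step (UMAP) is omitted; this is not the pipeline used to generate Fig. 6C.

```python
from collections import defaultdict

# Hypothetical 4x6 label mask from segmentation: 0 is background, other integers are cell IDs.
label_mask = [
    [1, 1, 0, 2, 2, 0],
    [1, 1, 0, 2, 2, 0],
    [0, 0, 0, 0, 3, 3],
    [4, 4, 0, 0, 3, 3],
]
# Hypothetical intensity image of one proliferation marker (same shape as the mask).
marker = [
    [9, 8, 0, 1, 2, 0],
    [7, 8, 0, 1, 0, 0],
    [0, 0, 0, 0, 6, 7],
    [2, 2, 0, 0, 9, 6],
]
# Hypothetical histological annotation of each cell.
region = {
    1: "invasive carcinoma",
    2: "low-grade dysplasia",
    3: "invasive carcinoma",
    4: "low-grade dysplasia",
}

# Feature extraction: mean marker intensity over the pixels of each cell.
sums, areas = defaultdict(float), defaultdict(int)
for mask_row, marker_row in zip(label_mask, marker):
    for cell_id, value in zip(mask_row, marker_row):
        if cell_id:  # skip background pixels
            sums[cell_id] += value
            areas[cell_id] += 1
mean_intensity = {cid: sums[cid] / areas[cid] for cid in sums}

# Call a cell marker-positive above an illustrative threshold.
THRESHOLD = 5.0
positive = {cid: m > THRESHOLD for cid, m in mean_intensity.items()}

# Proportion of positive cells per annotated region.
counts = defaultdict(lambda: [0, 0])  # region -> [positive cells, total cells]
for cid, is_pos in positive.items():
    counts[region[cid]][0] += is_pos
    counts[region[cid]][1] += 1
proportion = {r: pos / total for r, (pos, total) in counts.items()}

print(mean_intensity)
print(proportion)
```

In practice the per-cell feature table would contain one column per antibody, and that table, rather than the image itself, would be the input to an embedding such as UMAP.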