Figure Legends
Figure 1. Schematic representation of three principal approaches
for preparing barcoded libraries. (A, C and D) Molecular barcodes can be
introduced into a template by ligating sequencing adaptors (A),
hybridizing molecular inversion probes (C), or PCR amplification with
target-specific primers (D). Of note, both forward and reverse adaptors
were synthesized, followed by adaptor extension and A-tailing.
(B) Schematic representation of the different applications of
molecular barcodes and sample barcodes. Molecular
barcodes used in pooled sample 1 aim to correct sequencing errors: a
misread nucleotide, for example guanosine (G), is corrected in the final
consensus sequences. Molecular barcodes used in pooled sample 2 aim to
identify true mutations (marked by red triangles); a false mutation
(marked by a blue triangle) is removed from the final consensus
sequences. Panel (A) is modified from Figure 1 in Schmitt et al.
(2012)[19]; panel (C) is modified from Figure 1 in Hiatt et al.
(2013)[23].
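The error-correction logic of panel (B) can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline of any cited study: the function name `consensus_by_barcode`, the `min_family_size` threshold, and the toy reads are all hypothetical, and the reads within a barcode family are assumed to be aligned and of equal length.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_family_size=3):
    """Collapse reads that share a molecular barcode into one consensus.

    reads: iterable of (barcode, sequence) pairs. Sequences in the same
    barcode family are assumed (for this sketch) to be aligned and of
    equal length.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to out-vote a sequencing error
        # Majority vote at each position across the family.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Hypothetical toy data: two template molecules, three reads each.
reads = [
    ("AACG", "ACGTAC"),
    ("AACG", "ACGTAC"),
    ("AACG", "ACGGAC"),  # sequencing error: T misread as G in one read
    ("TTGA", "ACCTAC"),
    ("TTGA", "ACCTAC"),
    ("TTGA", "ACCTAC"),  # C at position 3 in every read: a true variant
]
result = consensus_by_barcode(reads)
```

In this toy case the single misread G is out-voted within its barcode family, whereas the variant present in every read of the second family is retained in the consensus.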
Figure 2. Schematic representation of the molecular barcoding
strategies used in population-scale testing to screen for
SARS-CoV-2-infected individuals. (A) A 10-base-pair DNA barcode, named
the LAMP barcode, was incorporated into the FIP primer during RT-LAMP.
PCR barcodes adjacent to the Illumina P5 and P7 sequences flanked both
ends of the library. The annotated amplicon sequence is modified from
Figure 1b in Ludwig et al. (2021)[27].
(B) Two sets of unique barcodes, the i5 and i7 sample barcodes, were
placed adjacent to the P5 and P7 adaptors in the Illumina sequencing
primers during PCR amplification. The illustration is modified from
Figure 1b in Bloom et al. (2021)[51].
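How a pair of sample barcodes assigns pooled reads back to individuals can be sketched as follows. This is an illustrative assumption rather than the method of the cited studies: the function `demultiplex`, the index sequences, and the sample names are hypothetical, and only exact barcode matches are considered.

```python
def demultiplex(reads, sample_sheet):
    """Assign reads to samples by their (i5, i7) barcode pair.

    reads: iterable of (i5, i7, sequence) tuples.
    sample_sheet: dict mapping (i5, i7) pairs to sample names.
    Reads whose pair is not in the sample sheet are returned separately.
    """
    by_sample = {}
    unassigned = []
    for i5, i7, seq in reads:
        sample = sample_sheet.get((i5, i7))
        if sample is None:
            unassigned.append(seq)
        else:
            by_sample.setdefault(sample, []).append(seq)
    return by_sample, unassigned


# Hypothetical index sequences and sample names.
sample_sheet = {
    ("ACGTAC", "TTGCAA"): "individual_1",
    ("GGATCC", "CATGAT"): "individual_2",
}
reads = [
    ("ACGTAC", "TTGCAA", "read_a"),
    ("GGATCC", "CATGAT", "read_b"),
    ("ACGTAC", "CATGAT", "read_c"),  # unexpected pairing, left unassigned
]
by_sample, unassigned = demultiplex(reads, sample_sheet)
```

Requiring both barcodes to match a known pair is what lets an unexpected i5/i7 combination be set aside instead of being attributed to the wrong individual.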
Figure 3. Schematic representation of the molecular barcoding
strategies applied to HIV and SIV. (A) A swarm of cDNA synthesis
primers, each containing a string of eight degenerate nucleotides (the
Primer ID) and a three-nucleotide sample barcode, was used to PCR
amplify the HIV-1 protease (pro) gene. (B) A 20-nucleotide molecular
barcode was used to tag the region downstream of the HIV 5’ long
terminal repeat in the HIV-based vector. After infection, viral DNA
containing molecular barcodes was inserted into the host genome. Inverse
PCR performed on genomic DNA isolated from the infected cells identifies
provirus insertion sites; RT-PCR performed on mRNA of barcoded
proviruses measures viral transcription driven by the HIV 5’ long
terminal repeat.
(C) The genomic characterization of nearly full-length HIV and the
composition of the molecular barcode sequence. A 21-nucleotide barcode
sequence was inserted into a non-expressed region upstream of the HA tag
overlapping the HIV vpr gene in a nearly full-length,
replication-competent HIV genome. The original sequence of the barcode
region is shown in this illustration. Every third nucleotide of the
barcode sequence was replaced by a thymidine. The pink bar marks the
region where a molecular barcode is inserted. The illustration is
modified from Figure 1A in Marsden et al. (2020)[61].
(D) The genomic characterization of SIV and the composition of the
molecular barcode sequence. Molecular barcodes of 10 random nucleotides
were inserted between the stop codon of the SIV vpx gene and the start
codon of the SIV vpr gene in the SIVmac239 plasmid. The illustration is
modified from Figure 1A in Fennessey et al. (2017)[63].
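The barcode lengths given in this legend set an upper bound on how many distinct tags each design can provide. The sketch below computes that bound and generates one barcode with a thymidine at every third position, as described for panel (C). The helper names are hypothetical, and the count for panel (C) assumes that the 14 positions not fixed as thymidine are fully random, which the legend does not state explicitly.

```python
import random


def theoretical_diversity(n_random_positions, alphabet_size=4):
    """Upper bound on distinct barcodes with the given number of random positions."""
    return alphabet_size ** n_random_positions


def barcode_with_fixed_thymidines(length=21, rng=random):
    """Random barcode with T at every third position (3, 6, ..., length)."""
    return "".join(
        "T" if (i + 1) % 3 == 0 else rng.choice("ACGT")
        for i in range(length)
    )


primer_id_space = theoretical_diversity(8)     # panel (A): 8 degenerate nt
siv_barcode_space = theoretical_diversity(10)  # panel (D): 10 random nt
# Panel (C): 21 nt with 7 fixed T leaves 14 positions, assumed fully random.
hiv_barcode_space = theoretical_diversity(14)
example = barcode_with_fixed_thymidines()
```

These are theoretical maxima only; the number of distinct barcodes actually present in a primer pool or virus stock can be lower.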
Figure 4. Schematic representation of the molecular barcoding
strategies applied to Influenza A virus. (A) Molecular barcodes of 22
nucleotides were carried by an shRNA library and amplified[66].
Amplified products encompassing molecular barcodes were inserted between
the Influenza A virus genes encoding NS1 and NEP, which had been
manipulated in the authors' previous work[65]. The illustration is
modified from Figure 1A in Varble et al. (2014)[24].
(B) Three types of barcodes (cell barcodes, UMIs and viral barcodes)
were applied to viral mRNA to measure single mRNA transcripts in cells
infected with Influenza A viruses. The annotated amplicon is illustrated
based on Figure 1C in Russell et al. (2018)[67].
Figure 5. Schematic representation of the genomic
characterization of Zika virus and the composition of the molecular
barcode sequence. A molecular barcode consisting of eight degenerate
codons (24 nucleotides) was embedded into the gene encoding the NS2A
protein. The pink bar marks the region where a molecular barcode is
inserted. Nucleotides shown in pink are degenerate nucleotides.
Figure 6. Schematic representation of the rational design of the
experimental evolution models proposed in this review article. The
proposed models are composed of two parts: a reservoir of natural host
cells/animals used to experimentally generate a swarm of
laboratory-produced variant strains, and a validation stage, in which
we will predict phenotypes caused by genotypic changes and survey their
impacts on human cells at a single-sequence level.