SARS-CoV-2
SARS-CoV-2 continues to threaten our lives more than two years after the
start of the COVID-19 outbreak. To date, more than 500 million cases of
COVID-19 have been reported, including more than six million deaths
worldwide, and these numbers increase daily[39].
A major challenge in suppressing the local spread of SARS-CoV-2 is that
an estimated 33% of people with SARS-CoV-2 infection are
asymptomatic[40].
Population-scale testing is therefore important for detecting people
with SARS-CoV-2 infection effectively and placing them in quarantine
rapidly[41–43]. The success of population-scale testing relies on
how many individual samples can be tested simultaneously. To tackle this
issue, several groups have applied barcoding strategies on
high-throughput sequencing platforms to maximize their testing
capacity. Ludwig and colleagues reported a high-throughput
technology named LAMP-seq to sequence tens of millions of individual
samples at the same time[27].
LAMP-seq is derived from Reverse-Transcription Loop-mediated Isothermal
Amplification (RT-LAMP)[44,45] by employing molecular barcodes specific
to each sample. LAMP-seq starts from an unpurified swab sample[46],
followed by a single heating step, extensive sample pooling, massively
parallel RT-LAMP, and standard computational analysis[27] to identify
infected people. Six different LAMP primers [F3, B3,
forward inner primer (FIP), backward inner primer (BIP), LF, LB] were
used in an RT-LAMP reaction to target the SARS-CoV-2 N gene; a
10-base-pair DNA barcode was incorporated in the FIP primer (primer
sequence TCTGGCCCAGTTCCTAGGTAGTNNNNNNNNNNCCAGACGAATTCGTGGTGG, where
the Ns represent a unique barcode sequence)[27,47,48] (Figure 2A). The
authors further suggested using a combination of
unique barcodes per sample to form a compressed barcode space, in which
up to five barcodes can be present per sample, to enlarge the capacity
for sample pooling, provided that only a small fraction of samples is
expected to be positive during population-scale testing. The
characterization of the sequencing library is illustrated in Figure 2A.
The authors imposed a minimum Levenshtein distance between any pair of
barcodes in a barcode set to ensure that one to two insertion,
deletion, or substitution errors were detectable, indicating that the
molecular barcode system employed in LAMP-seq was robust. It is
important to note that the
complexity of the molecular barcodes (sets of 1,000 to 10,000 barcodes)
in LAMP-seq is sufficient to cover the dynamic range of input viral
loads, and that the presence of molecular barcodes does not affect LAMP
sensitivity, product amounts, or downstream PCR amplification. Finally,
this molecular barcoding system was validated by employing a modified
Bloom filter[49] based on a pool of 10,000 barcodes to estimate the
probabilities of false positives and false negatives generated by
LAMP-seq. After computational simulation, the authors concluded that,
when five barcodes are used per sample, three barcodes are detectable.
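To make the compressed barcode space concrete, the following Python sketch shows how a barcode could be placed in the FIP template quoted above and how pooled samples might be decoded from the set of observed barcodes. It is an illustration only, not the authors' modified Bloom filter: the function names, the random assignment of barcodes, the independent per-barcode dropout model, and the rule of calling a sample positive when at least three of its five barcodes are observed are all assumptions made here.

```python
import random
import re

# FIP primer template quoted in the text; the run of Ns marks the barcode slot.
FIP_TEMPLATE = "TCTGGCCCAGTTCCTAGGTAGTNNNNNNNNNNCCAGACGAATTCGTGGTGG"
FIP_PREFIX, FIP_SUFFIX = re.split("N+", FIP_TEMPLATE)


def barcoded_fip(barcode):
    """Insert a 10-nt sample barcode into the FIP template."""
    if len(barcode) != 10 or set(barcode) - set("ACGT"):
        raise ValueError("barcode must be 10 nt drawn from A/C/G/T")
    return FIP_PREFIX + barcode + FIP_SUFFIX


def assign_barcodes(n_samples, pool_size=10_000, k=5, seed=0):
    """Give every sample k distinct barcode indices from a shared pool.

    Random assignment is an assumption of this sketch.
    """
    rng = random.Random(seed)
    return [frozenset(rng.sample(range(pool_size), k)) for _ in range(n_samples)]


def call_positives(assignments, observed, min_hits=3):
    """Call a sample positive if at least min_hits of its barcodes were observed.

    The threshold of 3 is an assumed decoding rule, not taken from the source.
    """
    return {i for i, codes in enumerate(assignments)
            if len(codes & observed) >= min_hits}


def simulate(n_samples=1000, n_positive=10, dropout=0.0, seed=0):
    """Simulate one pooled run and return (true positives, called positives).

    Each barcode of a positive sample is lost independently with
    probability `dropout` (a hypothetical error model).
    """
    rng = random.Random(seed)
    assignments = assign_barcodes(n_samples, seed=seed)
    truth = set(rng.sample(range(n_samples), n_positive))
    observed = {b for i in truth for b in assignments[i]
                if rng.random() >= dropout}
    return truth, call_positives(assignments, observed)


truth, called = simulate(dropout=0.2)
print("false negatives:", sorted(truth - called))
print("false positives:", sorted(called - truth))
```

A false positive arises in this sketch when the barcodes of truly positive samples happen to cover enough barcodes of a negative sample, which is why the scheme depends on a low fraction of positive samples in the pool.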
Bloom and colleagues proposed another high-throughput method, named
Swab-Seq, that employs molecular barcodes for population-scale
testing[50,51].
Two sets of 1,536 unique barcodes were placed adjacent to the P5 and P7
adaptors in the Illumina sequencing primers (the so-called i5 and i7
sample barcodes, respectively), so that each amplicon carries two
independent barcodes (Figure 2B). Barcoded primers were designed to
amplify the SARS-CoV-2 S gene. Every barcode is 10 base pairs in length
and does not contain homopolymer repeats longer than 2 nucleotides. At
least three nucleotides (a minimum Levenshtein distance of 3)[52] are
different between two barcodes present in the same amplicon,
allowing for demultiplexing even in the face of sequencing errors.
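The design rules above (10-nt barcodes, no homopolymer run longer than 2 nucleotides, minimum Levenshtein distance of 3) can be sketched in Python as follows. This is a minimal illustration rather than the Swab-Seq authors' procedure: the greedy random selection, the function names, and the assumption that the distance constraint applies to every pair of barcodes in a set are ours. With a minimum distance of 3, a read carrying a single error lies within one edit of its true barcode and at least two edits from any other, so it can still be assigned unambiguously.

```python
import itertools
import random
import re


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def has_long_homopolymer(seq, max_run=2):
    """True if seq contains a run of identical bases longer than max_run."""
    return re.search(r"(.)\1{" + str(max_run) + r",}", seq) is not None


def pick_barcodes(n, length=10, min_dist=3, seed=0, max_attempts=100_000):
    """Greedily collect random barcodes that satisfy both design rules."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(max_attempts):
        if len(chosen) == n:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        if has_long_homopolymer(cand):
            continue
        if all(levenshtein(cand, c) >= min_dist for c in chosen):
            chosen.append(cand)
    return chosen


def min_pairwise_distance(barcodes):
    """Smallest Levenshtein distance over all pairs in the set."""
    return min(levenshtein(a, b)
               for a, b in itertools.combinations(barcodes, 2))


def assign_read(read, barcodes, max_dist=1):
    """Return the unique barcode within max_dist edits of read, else None."""
    hits = [b for b in barcodes if levenshtein(read, b) <= max_dist]
    return hits[0] if len(hits) == 1 else None


barcodes = pick_barcodes(20)
print(len(barcodes), "barcodes, minimum pairwise distance",
      min_pairwise_distance(barcodes))
```

The greedy search is adequate for a small illustrative set; building sets as large as the 1,536 barcodes mentioned above would likely call for a more efficient construction.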
Swab-Seq has been validated on the bench using purified RNA from
nasopharyngeal samples and extraction-free saliva specimens, showing
extremely high sensitivity and specificity for the detection of viral
RNA[50,53].
Swab-Seq has been scaled up since 2021 to support asymptomatic
screening. To date, over 80,000 tests have been performed with
this high-throughput protocol, at a scale of ~10,000
samples per week. More methods[53–55] that employ a similar molecular
barcoding strategy for sample multiplexing, and thereby make
population-scale testing feasible, are expected to emerge. At present,
the molecular barcoding strategies used in SARS-CoV-2 studies mainly
focus on scaling up testing capacity efficiently. How molecular
barcodes can be used to gain more insight into SARS-CoV-2 virology
merits further investigation.