SARS-CoV-2
SARS-CoV-2 continues to threaten lives more than two years after the COVID-19 outbreak began. To date, more than 500 million cases of COVID-19 have been reported worldwide, including more than six million deaths, and these numbers increase daily[39]. A major obstacle to suppressing the local spread of SARS-CoV-2 is that an estimated 33% of infected people are asymptomatic[40]. Population-scale testing is therefore important for detecting people with SARS-CoV-2 infection effectively and placing them in quarantine rapidly[41–43]. The success of population-scale testing depends on how many individual samples can be tested in parallel. To address this, several groups have applied barcoding strategies on high-throughput sequencing platforms to maximize testing capacity. Ludwig and colleagues reported a high-throughput technology named LAMP-seq that can sequence tens of millions of individual samples at the same time[27]. LAMP-seq is derived from Reverse-Transcription Loop-mediated Isothermal Amplification (RT-LAMP)[44,45] and adds a molecular barcode specific to each sample. LAMP-seq starts from an unpurified swab sample[46], followed by a single heating step, extensive sample pooling, massively parallel RT-LAMP, and standard computational analysis[27] to identify infected people. Six different LAMP primers [F3, B3, forward inner primer (FIP), backward inner primer (BIP), LF, LB] were used in an RT-LAMP reaction to target the SARS-CoV-2 N gene, and a 10-base-pair DNA barcode was incorporated into the FIP (primer sequence TCTGGCCCAGTTCCTAGGTAGTNNNNNNNNNNCCAGACGAATTCGTGGTGG, where the Ns represent a unique barcode sequence)[27,47,48] (Figure 2A).
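The placement of the barcode inside the FIP can be made concrete with a minimal Python sketch. The two flanking sequences are taken from the primer quoted above; the function names `barcoded_fip` and `extract_barcode` are hypothetical, and the read model (the primer appears in forward orientation with exact flank matches) is a simplifying assumption, not the published LAMP-seq analysis pipeline.

```python
# Flanks of the FIP primer quoted in the text; the 10 Ns sit between them.
FIP_LEFT = "TCTGGCCCAGTTCCTAGGTAGT"
FIP_RIGHT = "CCAGACGAATTCGTGGTGG"
BARCODE_LEN = 10


def barcoded_fip(barcode: str) -> str:
    """Return the FIP primer with the N placeholder replaced by `barcode`."""
    if len(barcode) != BARCODE_LEN or set(barcode) - set("ACGT"):
        raise ValueError("barcode must be 10 bases drawn from A, C, G, T")
    return FIP_LEFT + barcode + FIP_RIGHT


def extract_barcode(read: str):
    """Recover the barcode from a read, or None if the flanks are not found.

    Assumption: the read contains the primer in forward orientation and both
    flanks match exactly; real reads would need error-tolerant matching.
    """
    start = read.find(FIP_LEFT)
    if start == -1:
        return None
    b_start = start + len(FIP_LEFT)
    b_end = b_start + BARCODE_LEN
    if read[b_end:b_end + len(FIP_RIGHT)] != FIP_RIGHT:
        return None
    return read[b_start:b_end]


primer = barcoded_fip("ACGTACGTAC")
print(primer)
print(extract_barcode("GG" + primer + "TT"))
```

Because the barcode travels inside the primer, every amplicon from a given sample carries that sample's identifier, which is what allows samples to be pooled before sequencing.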
The authors further suggested assigning a combination of unique barcodes to each sample to form a compressed barcode space, in which up to five barcodes can be present per sample. This enlarges the capacity for sample pooling, provided that only a small fraction of samples is expected to be positive during population-scale testing. The characterization of the sequencing library is illustrated in Figure 2A. The authors calculated the minimum Levenshtein distance between any pair of barcodes in each set to ensure that one to two insertion, deletion, or substitution errors were detectable, indicating that the molecular barcode system employed in LAMP-seq is robust. Importantly, the complexity of the molecular barcodes in LAMP-seq (sets of 1,000 to 10,000 barcodes) is sufficient to cover the dynamic range of input viral loads, and the presence of molecular barcodes does not affect LAMP sensitivity, product amounts, or downstream PCR amplification. Finally, the molecular barcoding system was validated with a modified Bloom filter[49] based on a pool of 10,000 barcodes to estimate the probability of false positives and false negatives generated by LAMP-seq. After computational simulation, the authors concluded that when 5 barcodes are used per sample, 3 barcodes are detectable.
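The idea of a compressed barcode space can be illustrated with a small Bloom-filter-style sketch in Python. Everything here is an assumption made for illustration: the function names `assign_barcodes` and `call_positives` are hypothetical, barcodes are represented as integer indices into a pool of 10,000, assignment is uniformly random, and the rule that a sample is called positive when at least 3 of its 5 barcodes are observed is only loosely motivated by the numbers quoted above, not a confirmed LAMP-seq parameter.

```python
import random


def assign_barcodes(sample_ids, pool_size=10_000, k=5, seed=0):
    """Give each sample k distinct barcode indices drawn from a shared pool.

    Different samples may share individual barcodes; only the combination is
    meant to identify a sample (assumed random assignment, for illustration).
    """
    rng = random.Random(seed)
    return {s: frozenset(rng.sample(range(pool_size), k)) for s in sample_ids}


def call_positives(assignment, detected, min_hits=3):
    """Call a sample positive if at least min_hits of its barcodes were seen.

    min_hits < k tolerates barcode dropout (fewer false negatives) at the
    cost of a higher chance that barcodes from other positive samples add up
    to a false positive.
    """
    detected = set(detected)
    return {s for s, codes in assignment.items()
            if len(codes & detected) >= min_hits}


assignment = assign_barcodes(range(1000))
truly_positive = {3, 42}
# Barcodes observed in the pooled sequencing run: those of the positives.
detected = set().union(*(assignment[s] for s in truly_positive))
print(sorted(call_positives(assignment, detected)))
```

The scheme only works when positives are rare: as more samples contribute barcodes to the detected set, the chance that an uninfected sample reaches the threshold by coincidence grows, which is the false-positive behavior a Bloom-filter analysis quantifies.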
Bloom and colleagues proposed another high-throughput method, named Swab-Seq, that employs molecular barcodes for population-scale testing[50,51]. Two sets of 1,536 unique barcodes were placed adjacent to the P5 and P7 adaptors in the Illumina sequencing primers (the so-called i5 and i7 sample barcodes), respectively, so that each amplicon carries two independent barcodes (Figure 2B). The barcoded primers were designed to amplify the SARS-CoV-2 S gene. Every barcode is 10 base pairs long and contains no homopolymer repeats longer than 2 nucleotides. The barcodes are separated by a minimum Levenshtein distance of 3[52], which allows demultiplexing even in the presence of sequencing errors. Swab-Seq has been validated on the bench with RNA purified from nasopharyngeal samples and with extraction-free saliva specimens, showing extremely high sensitivity and specificity for the detection of viral RNA[50,53]. Swab-Seq has been scaled up since 2021 to support asymptomatic screening. To date, over 80,000 tests have been performed with this high-throughput protocol, at a scale of ~10,000 samples per week. Further methods[53–55] that use a similar molecular barcoding strategy for sample multiplexing, and thereby make population-scale testing feasible, are expected to follow. At present, the molecular barcoding strategies used in SARS-CoV-2 studies focus mainly on scaling up testing capacity efficiently. How molecular barcodes can be used to gain more insight into SARS-CoV-2 virology merits further investigation.
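The barcode design constraints described for Swab-Seq (10 bases, no homopolymer run longer than 2, minimum Levenshtein distance of 3) can be expressed as a short Python sketch. The helper names and the greedy selection strategy are assumptions chosen for illustration; the source does not state how the published barcode sets were actually generated.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def max_homopolymer(seq: str) -> int:
    """Length of the longest run of identical consecutive bases."""
    if not seq:
        return 0
    longest = run = 1
    for x, y in zip(seq, seq[1:]):
        run = run + 1 if x == y else 1
        longest = max(longest, run)
    return longest


def is_valid_barcode(seq: str, length: int = 10, max_run: int = 2) -> bool:
    """Check the length, alphabet, and homopolymer constraints."""
    return (len(seq) == length
            and not set(seq) - set("ACGT")
            and max_homopolymer(seq) <= max_run)


def select_barcodes(candidates, min_dist: int = 3):
    """Greedily keep valid candidates that are far enough from all kept ones.

    Greedy selection is an assumed strategy; it yields a valid set but not
    necessarily the largest possible one.
    """
    chosen = []
    for c in candidates:
        if is_valid_barcode(c) and all(levenshtein(c, s) >= min_dist
                                       for s in chosen):
            chosen.append(c)
    return chosen


candidates = ["ACGTACGTAC", "ACGTACGTAG", "TTGGTTGGTT", "AAACGTACGT"]
print(select_barcodes(candidates))  # -> ['ACGTACGTAC', 'TTGGTTGGTT']
```

In this example the second candidate is rejected because it is a single substitution away from the first, and the fourth because it contains a run of three identical bases. With a minimum distance of 3, a read carrying one error still lies closer to its true barcode than to any other barcode in the set.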