Figure 4. ORFeome capture using LASSO probes. a, Schematic of
the workflow. b, Post-capture PCR of circles obtained from the
capture of 3,078 ORFs of E. coli K12 performed using the LASSO probe
library. The inset is a histogram showing the size distribution of the
targeted ORFs in 40 bp bins. Captured ORFs carry an additional 140 bp
of residual LASSO sequence when run on a gel. c, Median RPKM enrichment
ratios of targeted ORFs versus non-targeted genetic elements for LASSO
probe libraries obtained by DNA Recombinase Mediated Assembly (blue) and
by the assembly method developed by Tosi and coworkers in 2017 (red).
d, Bee swarm plot combined with box plot of the average depth of
sequencing per kilobase for each targeted ORF (n=3087) and non-targeted
ORF (n=905).
Center lines show the medians; box limits indicate the 25th and 75th
percentiles as determined by R software; whiskers extend 1.5 times the
interquartile range from the 25th and 75th percentiles; outliers are
represented by dots. n = 3057, 1004 sample points. e, Normalized read
depth of targeted ORFs as a function of ORF length.

Post-capture PCR products of circles obtained from the capture of 3,078
E. coli K12 ORFs were run on a 1.2% agarose gel (Fig. 4b), and their
apparent size distribution corresponded well with that of the targeted
ORFs. The post-capture PCR amplicons were enzymatically fragmented and
sequenced on an Illumina NextSeq instrument to obtain 150-nucleotide
paired-end reads.

For reads mapping to the E. coli genome, we calculated target enrichment
factors, defined as the ratio of the reads per kilobase of genetic
element per million reads (RPKM) for targeted ORFs to that for
non-targeted ORFs. RPKM targeted/non-targeted ratios were also analyzed
for genetic elements of different lengths by binning (Fig. 4c). In this
experiment, LASSO-targeted ORFs were enriched in all bins (up to ~250×
for ORFs < 1 kb), representing an 8-fold improvement over the enrichment
previously measured by Tosi and coworkers (2017).
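
The enrichment calculation described above can be sketched as follows. This is a minimal illustration, not the analysis pipeline used for Fig. 4c: it assumes the standard RPKM formula, a per-bin ratio of median RPKM values, and hypothetical names (`rpkm`, `binned_enrichment`, `bin_edges`) with each genetic element given as a `(length_bp, read_count)` pair.

```python
from statistics import median


def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of genetic element per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def binned_enrichment(targets, non_targets, total_mapped_reads, bin_edges):
    """Median target RPKM over median non-target RPKM for each length bin.

    targets, non_targets: lists of (length_bp, read_count) pairs (assumed layout).
    bin_edges: ascending length boundaries in bp, e.g. [0, 1000, 2000].
    Bins lacking targets or non-targets, or whose non-target median is
    zero, are skipped rather than reported.
    """
    ratios = {}
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        t = [rpkm(c, l, total_mapped_reads) for l, c in targets if lo <= l < hi]
        n = [rpkm(c, l, total_mapped_reads) for l, c in non_targets if lo <= l < hi]
        if t and n and median(n) > 0:
            ratios[(lo, hi)] = median(t) / median(n)
    return ratios
```

Taking the median within each bin keeps a few very deeply sequenced ORFs from dominating the ratio; a mean-based ratio would be an equally valid choice but is more sensitive to outliers.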

Fig. 4d shows box plots of the average depth of sequencing per kilobase
for each targeted and each non-targeted ORF. The targeted ORFs were
significantly enriched compared with the non-targeted ORFs (Welch
two-sample t-test). The mean and median RPKM were 2476 and 264,
respectively, for the targets, and 31 and 1.26, respectively, for the
non-targets.
Fold-enrichment of targets was calculated to be between 60- and 200-fold
(by the median or mean of the target RPKM, respectively, over the mean
non-target RPKM). At a cutoff of three times the median non-target RPKM,
around 70% of the targeted ORFs were successfully captured. The
normalized abundance of each target ORF was negatively correlated with
ORF length (Fig. 4e). This length bias was previously reported (Tosi et
al. 2017) and it reflects target length-dependent capture efficiency,
post-capture PCR bias or a combination of the two effects.
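
The two summary statistics used in this paragraph, the Welch two-sample t-test and the capture rate at a cutoff of three times the median non-target RPKM, can be sketched as below. This is an illustrative sketch under stated assumptions, not the code used for the analysis: the function names are hypothetical, the inputs are plain lists of per-ORF RPKM values, and an ORF is assumed to count as captured when its RPKM strictly exceeds the cutoff.

```python
from math import sqrt
from statistics import mean, median, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Does not assume equal variances in the two groups. The p-value is
    left to a statistics package and is not computed here.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def fraction_captured(target_rpkm, non_target_rpkm, multiple=3.0):
    """Fraction of targets whose RPKM exceeds multiple x median non-target RPKM."""
    cutoff = multiple * median(non_target_rpkm)
    # Strict inequality is an assumption; the text does not specify it.
    return sum(r > cutoff for r in target_rpkm) / len(target_rpkm)
```

For example, `fraction_captured([1, 5, 10, 20], [1, 2, 3])` uses a cutoff of 6 and returns 0.5, since two of the four target values exceed it.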