Results

Experiment 1: Evaluation of Illumina TruSeq

In our hands, the Illumina TruSeq protocol has performed extremely reliably with samples on the scale of  100ng of total RNA, the manufacturer recommended lower limit of the protocol. However, attempts to create libraries from much smaller samples yielded low complexity libraries, corresponding to as much as 30-fold PCR duplication of fragments. Anecdotally, less than 5% of libraries made with at least 90ng of total RNA yielded abnormally low concentrations, which we observed correlated with low complexity (Data not shown). To determine the lower limit of input needed to reliably produce libraries, we attempted to make libraries from 40, 50, 60, 70, and 80 ng of Drosophila total RNA, each in triplicate.

Total TruSeq cDNA library yields made with a given amount of input total RNA. Yields measured by Nanodrop of cDNA libraries resuspended in 25\(\mu L\) of EB. The italicized samples were unusually low, and when analyzed with a Bioanalyzer, showed abnormal size distribution of cDNA fragments.
Amount Input RNA Replicate A Replicate B Replicate C
40 ng 57 ng 425 ng 672 ng
50 ng 435 ng 768 ng 755 ng
60 ng 115 ng 663 ng 668 ng
70 ng 300 ng 593 ng 653 ng
80 ng 468 ng 550 ng 840 ng

\label{table:truseqtitration}

We considered the two libraries with lower than usual concentration to be failures. While a failure rate of approximately 1 in 3 might be acceptable for some purposes, we ultimately wanted to perform RNA sequencing on precious samples, where a failure in any one of a dozen or more libraries would necessitate regenerating all of the libraries. Furthermore, due to the low sample volumes involved (less than approximately 500pg of poly-adenylated mRNA), common laboratory equipment is not able to determine the particular point in the protocol where the failures occurred.

Thus, we consider 70 ng of total RNA to be the conservative lower limit to the protocol. While this is about 30% smaller than the manufacturer suggests, it is still several orders of magnitude larger than we needed it to be. We therefore considered using other small-volume and “single-cell” RNA-seq kits, which we had less experience with and less faith in the data.

Experiment 2: Competitive Comparison of Low-volume RNAseq protocols

We first sought to determine whether the low-volume RNAseq protocols available faithfully recapitulate linear changes in abundance of known inputs. We generated synthetic spike-ins by combining D. melanogaster and D. virilis total RNA in known, predefined proportions of 0, 5, 10, and 20% D. virilis RNA. For each of the low-volume protocols, we used 1ng of total RNA as input, whereas for the TruSeq protocol we used 100ng.

Although pre-defined mixes of spike-in controls have been developed and are commercially available \cite{Jiang_2011}, we felt it was important to ensure that a given protocol would function reproducibly with natural RNA, which almost certainly has a different distribution of 6-mers, which could conceivably affect random cDNA priming and other amplification effects. Furthermore, our spike-in sample more densely covers the approximately \(10^5\) fold coverage typical of RNA abundances. It should be noted, however, that our sample is not directly comparable to any other standards, nor is the material of known strandedness. We assumed that the majority of each sample is from the standard annotated transcripts, but did not verify this prior to library construction and sequencing.

The different protocols had a variation in yield of libraries from between 6 fmole (approximately 3.6 trillion molecules) and 2,400 femtomoles, with the TruSeq a clear outlier at the high end of the range, and the other protocols all below 200 fmole (Table \ref{tab:protocols}). All of these quantities are sufficient to generate hundreds of millions of reads—far more than is typically required for an RNA-seq experiment. We pooled the samples, attempting equimolar fractions in the final pool; however, due to a pooling error, we generated significantly more reads than intended for the TruSeq protocol, and correspondingly fewer in the other protocols. Unless otherwise noted, we therefore sub-sampled the mapped reads to the lowest number of mapped reads in any sample in order to provide a fair comparison between protocols.

We were interested in the fold-change of each D. virilis gene across the four samples, rather than the absolute abundance of any particular gene. Therefore, after mapping and gene quantification, we normalized the abundance \(A_{ij}\) of every gene \(i\) across the \(j=4\) samples by a weighted average of the quantity \(Q_j\) of D. virilis in sample \(j\), as show in equation \ref{eqn:norm}. Thus, within a given gene, a linear fit of \(\hat{A}_{ij}\) vs \(Q_j\) should have a slope of one and an intercept of zero.

\[\label{eqn:norm} \hat{A}_{ij} = A_{ij} \div \frac{\sum_j Q_j A_{ij}}{\sum_j (Q_j)^2}\]

We filtered the D. virilis genes for those with at least 20 mapped fragments in the sample with 20% D. virilis, then calculated an independent linear regression for each of those genes. As expected, for every protocol, the mean slope was 1 (\(t\)-test, \(p<5\times10^{-7}\) for all protocols). Similarly, the average intercepts for all protocols was 0 (\(t\)-test, \(p<5\times10^{-7}\) for all protocols). Also unsurprisingly, the TruSeq protocol had a noticeably higher mean correlation coefficient (\(0.98 \pm 0.02\)) than any of the other protocols (\(0.95 \pm 0.06\), \(0.92\pm0.09\), and \(0.95 \pm 0.06\) for Clontech, TotalScript, and SMART-seq2, respectively). The mean correlation coefficient was statistically and practically indistinguishable between the Clontech samples and the SMART-seq2 samples (\(t\)-test \(p = .11\), Figure \ref{fig:fithists}).

Indeed, the only major differentiator we could find between the low-volume protocols we measured was cost. For only a handful of libraries, the kit-based all inclusive model of the Clontech and TotalScript kits could be a significant benefit, allowing the purchase of only as much of the reagents as required. By contrast, the Smart-seq2 protocol requires the a la carte purchase of a number of reagents, some of which are not available or more expensive per unit for smaller quantities. Furthermore, there could potentially be a “hot dogs and buns” problem, where reagents are sold in non-integer multiples of each other, leading to leftovers. Many of these reagents are not single-purpose, however, so leftovers could in principle be repurposed in other experiments.