David Coil edited Library Preparation and Sequencing .md  almost 10 years ago

Commit id: 7577ec340a8f1dd312be88befca640458da0999b


Given the overcapacity of Illumina sequencing for bacterial genomes, sequencing a single genome presents a problem (unless you are willing to pay the ~$2000 total cost and throw away most of the data). Sequencing facilities will typically not "pool" samples from multiple groups because they don't want to oversee the pooling or deal with the associated accounting hassles. However, collaborating with other groups can be a great option. Many labs sequence genomes or metagenomes on a regular basis; adding one additional sample isn't technically very difficult, but it will entail oversight of the pooling and the associated accounting hassles. It will also entail a discussion of barcode compatibility, to ensure that all barcodes are sufficiently unique for demultiplexing.

##Downsampling

Coverage (read depth) is the average number of reads representing a given nucleotide and is a function of the number and size of genomes pooled onto a run. The optimal amount of coverage depends on the read length, the assembler being used, and other factors. For Illumina data assembled using this workflow we recommend that this number be between 20x and 200x. See our more detailed discussion in the section ??? "Interpretation of A5-miseq stats". If you have coverage significantly higher than 200x and wish to downsample your data, we have written a script (sub\_sample\_reads) for this purpose. You will first need to calculate how many reads you want the script to sample. We recommend determining how many reads would be equivalent to 100x coverage (divide the genome size by the average read length and multiply by 100). You can download the script using the curl command.
Create a new directory containing the reads you wish to downsample. In the terminal, navigate to the directory you just created and download the script using the following syntax: `curl https://raw.githubusercontent.com/gjospin/scripts/master/subsample_reads.pl > sub_sample_reads.pl`
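
The read-count calculation described above (genome size divided by average read length, multiplied by the target coverage) can be sketched in a few lines of Python. This is only an illustration: the function names and the example genome size and read length are hypothetical and are not part of the workflow or of the downsampling script.

```python
import math


def estimated_coverage(num_reads, read_length, genome_size):
    """Average coverage = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_for_coverage(genome_size, read_length, target_coverage=100):
    """Reads needed = genome size / average read length * target coverage.

    Rounded up to a whole number of reads.
    """
    return math.ceil(genome_size / read_length * target_coverage)


# Hypothetical example values (assumptions, not taken from the text):
# a 5 Mbp bacterial genome and an average read length of 250 bp.
genome_size = 5_000_000
read_length = 250

target_reads = reads_for_coverage(genome_size, read_length)
print(target_reads)
print(estimated_coverage(target_reads, read_length, genome_size))
```

With paired-end data, check whether the number you pass to the script is expected to count individual reads or read pairs; the text above does not specify this, so the sketch simply counts reads.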