Christopher Medway edited Results and Discussion.tex  over 8 years ago

Commit id: 85d22c111b806aa914a438bed57d8c3269a24f52


\section{Results and Discussion}

\subsection{Assembly of a Novel Yersinia Genome}

Quality metrics reported by FASTQC for the 5,260,610 75 bp paired-end Illumina MiSeq reads indicated that the data were of high quality; for both reads of each pair, median Phred scores were above 30 across the entire read length (Figure 1). Residual Illumina adapter sequence was detected and removed using fastx\_clipper. Forty contigs larger than 1 kbp were assembled (optimal k-mer = 53), with an average length of 118,677 bp (N50 = 276,703 bp). The total length of all contigs was 4,747,089 bp and the largest contig was 563,205 bp. This is broadly consistent with known \textit{Yersinia} genome sizes.

To perform a phylogenetic analysis of \textit{Yersinia} species, 16S ribosomal subunit sequences were downloaded from GenBank. In total, 34 RefSeq sequences from 17 different \textit{Yersinia} species were analysed (Table 1). To supplement the analysis with \textit{Y. enterocolitica} strains from the full spectrum of biovars (1A, 1B and 2-5), contigs from fifteen \textit{Y. enterocolitica} samples reported by Reuter et al. were downloaded from the European Nucleotide Archive (Reuter 2014) (Table 2). Where not already available, 16S ribosomal subunit nucleotide sequences were extracted from assembled contigs using the RNAmmer server (v1.2) (Lagesen 2007). 16S FASTA sequences were aligned using Clustal Omega (Sievers 2011, Goujon 2010, McWilliam 2013), and the alignment files were used to construct phylogenetic trees in Seaview using a parsimony model with 100 bootstrap replicates (Gouy 2010).

Contigs were scaffolded to a reference \textit{Y. enterocolitica} genome (GenBank: AM286415.1) using the Contiguator web application (http://combo.dbe.unifi.it/contiguator) (Galardini 2011). Scaffold and contig assemblies were annotated with gene features using two independent tools: PROKKA (Seemann 2014) and the RAST server (Aziz 2008, Overbeek 2014). Annotations in GenBank format were uploaded to Artemis for visualisation (Rutherford 2000).
PathogenFinder v1.1 and ResFinder v2.1 were used to identify and rank pathogenic genes and antibiotic resistance genes, respectively (Cosentino 2013, Zankari 2012). Bacterial insertion sequences were identified using the ISFinder website (https://www-is.biotoul.fr/) (Siguier 2006).

A total of 5,260,610 Illumina paired-end reads were assembled into 40 large contigs ($>$1 kbp) totaling 4,747,089 bp. A server-based nucleotide BLAST search of each contig revealed that, in every case, the closest match was to \textit{Y. enterocolitica}.

Gene prediction was broadly consistent between the two annotation tools, PROKKA and RAST. PROKKA predicted 4341 gene features, whereas RAST predicted slightly more, with 4494 gene features. A core set of 84 \textit{Yersinia} housekeeping genes was identified by both annotation methods.
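
For reference, the assembly summary statistics quoted above (average contig length and N50) can be written out explicitly. The text does not state how they were computed, so the following is a sketch that assumes the usual definitions; the symbols $L_i$, $n$ and $k$ are introduced here for illustration only. With the $n = 40$ contig lengths sorted so that $L_1 \geq L_2 \geq \dots \geq L_n$,
\begin{equation}
\bar{L} = \frac{1}{n}\sum_{i=1}^{n} L_i = \frac{4{,}747{,}089}{40} \approx 118{,}677\ \mathrm{bp},
\end{equation}
\begin{equation}
\mathrm{N50} = L_k, \qquad k = \min\Bigl\{\, j : \sum_{i=1}^{j} L_i \geq \tfrac{1}{2}\sum_{i=1}^{n} L_i \Bigr\}.
\end{equation}
Under this definition, an N50 of 276,703 bp means that contigs of at least that length together account for at least half of the assembled bases.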