Ino de Bruijn Add N/A contig alignment and low purity alignment investigation.  almost 10 years ago

Commit id: d97e302a89d4aa87aee6ca64bf4f97b5a8cf5549


{  "metadata": {  "name": "",  "signature": "sha256:258caa1f79a9a4cfbd5435b6b85d59cd1f8056295811a51f8ae691874b8ac8d1"  },  "nbformat": 3,  "nbformat_minor": 0, 

"language": "python",  "metadata": {},  "outputs": [],  "prompt_number": 18 52  },  {  "cell_type": "code", 

"There are none in the filtered pairs though, since we kept only properly aligned pairs, and bowtie2 does not set the properly aligned flag for pairs in which one mate aligns to the beginning of a genome and the other to the end."  ]  },  {  "cell_type": "markdown",  "metadata": {},  "source": [  "## Investigate low purity and N/A contigs\n",  "### N/A contigs\n",  "There are still some contigs left that do not map anywhere. Most of them are really short (< 100 bp) and are currently not filtered out by masmvali. For velvetnoscaf31, only 4 contigs did not map. They were then aligned with blast:\n",  "\n",  "```\n",  "$ cd /gulo/glob/inod/projects/masmvali-partdeux/reassembly-filtered-reads/Sample_1ng_even/metassemble/assemblies/velvet/noscaf/noscaf_31/val/blastNAcontigs/blast\n",  "$ blastn -db nt -num_threads 1 -max_target_seqs 1 -outfmt '6 std stitle' -query ../contigs-min500-purity-below-09.fa > contigs-min500-purity-below-09-megablast.tsv\n",  "```\n",  "It turned out that all of them aligned to Sulfurihydrogenibium sp. YO3AOP1. The nucmer minimum cluster length '-c 65' was too large for MUMmer to find the alignment. For each of these contigs there would be only a single exact match meeting the minimum match length '-l 20', which is too short to form a cluster and would therefore not be extended (I assume). 
We could either only use larger contigs to mitigate this problem or change the minimum cluster parameter to something smaller for more sensitivity.\n",  "\n",  "### Low purity contigs\n",  "For velvetnoscaf31 there were 15 contigs with length > 500 and purity below 90%:\n",  "\n",  "```\n",  "$ less contig-purity.tsv | awk '$2 > 500 && $4 < 0.9' | cut -f1-4 | sort -k4,4 | cut -f1 > naids.txt\n",  "$ wc -l naids.txt\n",  "15 naids.txt\n",  "```\n",  "\n",  "#### Blast results\n",  "```\n",  "blastn -db nt -num_threads 1 -outfmt '6 qseqid sseqid qstart qend qlen sstart send slen evalue bitscore length pident nident sgi sacc staxids stitle' -query ../contigs-min500-purity-below-09.fa > contigs-min500-purity-below-09-megablast.tsv\n",  "```"  ]  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "import pandas as pd\n",  "blastr = pd.read_csv(\"/media/milou/glob/projects/masmvali-partdeux/reassembly-filtered-reads/Sample_1ng_even/metassemble/assemblies/velvet/noscaf/noscaf_31/val/blast-low-purity-contigs/blast/contigs-min500-purity-below-09-megablast.tsv\", sep=\"\\t\",\n",  " names=\"qseqid sseqid qstart qend qlen sstart send slen evalue bitscore length pident nident sgi sacc staxids stitle\".split())"  ],  "language": "python",  "metadata": {},  "outputs": [],  "prompt_number": 13  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "blastr.head()"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "html": [  "
<div>\n",
  "<table>\n",
  "<tr><th></th><th>qseqid</th><th>sseqid</th><th>qstart</th><th>qend</th><th>qlen</th><th>sstart</th><th>send</th><th>slen</th><th>evalue</th><th>bitscore</th><th>length</th><th>pident</th><th>nident</th><th>sgi</th><th>sacc</th><th>staxids</th><th>stitle</th></tr>\n",
  "<tr><th>0</th><td>NODE_6298_length_1226_cov_6.613377</td><td>gi|6626247|gb|AE000782.1|</td><td>1</td><td>633</td><td>1256</td><td>1880117</td><td>1880749</td><td>2178400</td><td>0.000000e+00</td><td>1170</td><td>633</td><td>100.00</td><td>633</td><td>6626247</td><td>AE000782</td><td>224325</td><td>Archaeoglobus fulgidus DSM 4304, complete genome</td></tr>\n",
  "<tr><th>1</th><td>NODE_6298_length_1226_cov_6.613377</td><td>gi|6626247|gb|AE000782.1|</td><td>624</td><td>1256</td><td>1256</td><td>1881964</td><td>1882596</td><td>2178400</td><td>0.000000e+00</td><td>1170</td><td>633</td><td>100.00</td><td>633</td><td>6626247</td><td>AE000782</td><td>224325</td><td>Archaeoglobus fulgidus DSM 4304, complete genome</td></tr>\n",
  "<tr><th>2</th><td>NODE_31545_length_1101_cov_9.786558</td><td>gi|157914509|gb|CP000850.1|</td><td>257</td><td>1131</td><td>1131</td><td>5293414</td><td>5294288</td><td>5786361</td><td>0.000000e+00</td><td>1616</td><td>875</td><td>100.00</td><td>875</td><td>157914509</td><td>CP000850</td><td>391037</td><td>Salinispora arenicola CNS-205, complete genome</td></tr>\n",
  "<tr><th>3</th><td>NODE_31545_length_1101_cov_9.786558</td><td>gi|157914509|gb|CP000850.1|</td><td>1</td><td>275</td><td>1131</td><td>5293095</td><td>5293369</td><td>5786361</td><td>4.000000e-140</td><td>508</td><td>275</td><td>100.00</td><td>275</td><td>157914509</td><td>CP000850</td><td>391037</td><td>Salinispora arenicola CNS-205, complete genome</td></tr>\n",
  "<tr><th>4</th><td>NODE_31545_length_1101_cov_9.786558</td><td>gi|145301903|gb|CP000667.1|</td><td>1005</td><td>1131</td><td>1131</td><td>4807660</td><td>4807787</td><td>5183331</td><td>2.000000e-34</td><td>158</td><td>128</td><td>89.06</td><td>114</td><td>145301903</td><td>CP000667</td><td>369723</td><td>Salinispora tropica CNB-440, complete genome</td></tr>\n",
  "</table>\n",
  "<p>5 rows \u00d7 17 columns</p>\n",
  "</div>"
  ],  "metadata": {},  "output_type": "pyout",  "prompt_number": 22,  "text": [  " qseqid sseqid qstart \\\n",  "0 NODE_6298_length_1226_cov_6.613377 gi|6626247|gb|AE000782.1| 1 \n",  "1 NODE_6298_length_1226_cov_6.613377 gi|6626247|gb|AE000782.1| 624 \n",  "2 NODE_31545_length_1101_cov_9.786558 gi|157914509|gb|CP000850.1| 257 \n",  "3 NODE_31545_length_1101_cov_9.786558 gi|157914509|gb|CP000850.1| 1 \n",  "4 NODE_31545_length_1101_cov_9.786558 gi|145301903|gb|CP000667.1| 1005 \n",  "\n",  " qend qlen sstart send slen evalue bitscore length \\\n",  "0 633 1256 1880117 1880749 2178400 0.000000e+00 1170 633 \n",  "1 1256 1256 1881964 1882596 2178400 0.000000e+00 1170 633 \n",  "2 1131 1131 5293414 5294288 5786361 0.000000e+00 1616 875 \n",  "3 275 1131 5293095 5293369 5786361 4.000000e-140 508 275 \n",  "4 1131 1131 4807660 4807787 5183331 2.000000e-34 158 128 \n",  "\n",  " pident nident sgi sacc staxids \\\n",  "0 100.00 633 6626247 AE000782 224325 \n",  "1 100.00 633 6626247 AE000782 224325 \n",  "2 100.00 875 157914509 CP000850 391037 \n",  "3 100.00 275 157914509 CP000850 391037 \n",  "4 89.06 114 145301903 CP000667 369723 \n",  "\n",  " stitle \n",  "0 Archaeoglobus fulgidus DSM 4304, complete genome \n",  "1 Archaeoglobus fulgidus DSM 4304, complete genome \n",  "2 Salinispora arenicola CNS-205, complete genome \n",  "3 Salinispora arenicola CNS-205, complete genome \n",  "4 Salinispora tropica CNB-440, complete genome \n",  "\n",  "[5 rows x 17 columns]"  ]  }  ],  "prompt_number": 22  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "len(blastr)"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "metadata": {},  "output_type": "pyout",  "prompt_number": 25,  "text": [  "441"  ]  }  ],  "prompt_number": 25  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "bg = blastr.groupby([\"qseqid\", \"stitle\"])\n",  "bg.count()"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "html": [  "
<div>\n",
  "<table>\n",
  "<tr><th></th><th></th><th>qseqid</th><th>sseqid</th><th>qstart</th><th>qend</th><th>qlen</th><th>sstart</th><th>send</th><th>slen</th><th>evalue</th><th>bitscore</th><th>length</th><th>pident</th><th>nident</th><th>sgi</th><th>sacc</th><th>staxids</th><th>stitle</th></tr>\n",
  "<tr><th>qseqid</th><th>stitle</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
  "<tr><th>NODE_100676_length_9011_cov_31.258905</th><th>Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Arcobacter sp. MDC1767 heat shock protein 60 (cpn60) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Bacillus subtilis subsp. subtilis RO-NN-1, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Arthromitus sp. SFB-mouse-NL, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Arthromitus sp. SFB-mouse-Yit DNA, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique HTCC1062, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Cellulophaga algicola DSM 14237, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridiales sp. SS3/4 draft genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium difficile 630 complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium difficile BI1 chromosome, complete sequence</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium difficile BI9 chromosome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium difficile CD196 complete genome, strain CD196</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium difficile R20291 complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium difficile complete genome, strain 2007855</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium difficile groESL operon, complete sequence</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Clostridium sticklandii str. DSM 519 chromosome, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Deferribacter desulfuricans SSM1 DNA, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Desulfurobacterium thermolithotrophum DSM 11699, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Francisella cf. novicida 3523, complete genome</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
  "<tr><th></th><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
  "</table>\n",
  "<p>380 rows \u00d7 17 columns</p>\n",
  "</div>"
  ],  "metadata": {},  "output_type": "pyout",  "prompt_number": 28,  "text": [  " qseqid \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " sseqid \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " qstart \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " qend \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " qlen \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " sstart \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " send \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " slen \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " evalue \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " bitscore \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " length \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " pident \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " nident \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " sgi \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " sacc \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " staxids \\\n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  " stitle \n",  "qseqid stitle \n",  "NODE_100676_length_9011_cov_31.258905 Aquificales str. CIR30126 chaperonin GroEL (groEL) gene, partial cds 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, strain F118-4 1 \n",  " Arcobacter bivalviorum partial hsp60 gene for Heat shock protein 60KD, type strain F4T 1 \n",  " Arcobacter sp. MDC1641 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. MDC1747 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter sp. 
MDC1767 heat shock protein 60 (cpn60) gene, partial cds 1 \n",  " Arcobacter suis CECT 7833 partial cpn60 gene for heat shock protein 60 family chaperone, type strain F41 1 \n",  " Arcobacter venerupis partial hsp60 gene for Heat shock protein 60KD, type strain F67-11T 1 \n",  " B.napus plastid 60-kDa chaperonin-60 beta-polypeptide (cpn-60 beta) mRNA, partial cds 1 \n",  " Bacillus firmus 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Bacillus firmus partial hsp60 gene for 60 kDa chaperonin, strain W 1527 1 \n",  " Bacillus subtilis strain B-14821 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23051 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23053 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. spizizenii strain B-23055 GroEL (groEL) gene, partial cds 1 \n",  " Bacillus subtilis subsp. subtilis RO-NN-1, complete genome 1 \n",  " Biomphalaria glabrata heat shock protein 60 (HSP60) mRNA, complete cds 1 \n",  " Borrelia hispanica strain Sp1 GroEL (groEL) gene, partial cds 1 \n",  " Borrelia hispanica strain Sp3 GroEL (groEL) gene, partial cds 1 \n",  " Butyrivibrio proteoclasticus B316 chromosome 1, complete sequence 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-Japan DNA, complete genome 1 \n",  " Candidatus Arthromitus sp. SFB-mouse-NL, complete genome 1 \n",  " Candidatus Arthromitus sp. 
SFB-mouse-Yit DNA, complete genome 1 \n",  " Candidatus Pelagibacter ubique HTCC1002 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique HTCC1062, complete genome 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1013 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1016 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1025 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1040 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1051 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1057 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1061 HSP60 gene, partial cds 1 \n",  " Candidatus Pelagibacter ubique strain HTCC1062 HSP60 gene, partial cds 1 \n",  " Cellulophaga algicola DSM 14237, complete genome 1 \n",  " Clostridiales sp. SS3/4 draft genome 1 \n",  " Clostridium celatum strain ATCC 27791 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium difficile 630 complete genome 1 \n",  " Clostridium difficile BI1 chromosome, complete sequence 1 \n",  " Clostridium difficile BI9 chromosome 1 \n",  " Clostridium difficile CD196 complete genome, strain CD196 1 \n",  " Clostridium difficile R20291 complete genome 1 \n",  " Clostridium difficile complete genome, strain 2007855 1 \n",  " Clostridium difficile groESL operon, complete sequence 1 \n",  " Clostridium difficile heat shock protein GroEL (groEL) gene, complete cds 1 \n",  " Clostridium drakei strain SL1 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium scatologenes strain ATCC 25775 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Clostridium sticklandii str. 
DSM 519 chromosome, complete genome 1 \n",  " Deferribacter desulfuricans SSM1 DNA, complete genome 1 \n",  " Desulfurobacterium thermolithotrophum DSM 11699, complete genome 1 \n",  " Francisella cf. novicida 3523, complete genome 1 \n",  " Helicobacter bilis ATCC 51632 60 kDa chaperonin (cpn60) gene, partial cds 1 \n",  " Helicobacter bilis hsp60 gene for heat shock protein 60, partial cds, strain: FR106 1 \n",  " Helicobacter bilis partial hsp60 gene for heat shock protein 60, strain ATCC 51630 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO220 1 \n",  " Helicobacter sp. 'Flexispira str. FL56' partial hsp60 gene for heat shock protein 60, strain KO534B 1 \n",  " Helicobacter sp. 'Flexispira taxon 2' partial hsp60 gene for heat shock protein 60, strain ATCC 49314 1 \n",  " Helicobacter sp. 'Flexispira taxon 3' partial hsp60 gene for heat shock protein 60, strain ATCC 49320 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 43968 1 \n",  " Helicobacter trogontum partial hsp60 gene for heat shock protein 60, strain ATCC 49310 1 \n",  " ... \n",  "\n",  "[380 rows x 17 columns]"  ]  }  ],  "prompt_number": 28  },  {  "cell_type": "markdown",  "metadata": {},  "source": [  "Unfortunately, BLAST against nt returns too many hits per sequence to interpret. It is better to look at the nucmer results per reference (or to change the BLAST parameters, or to search only the reference genomes instead of nt)."  ]  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "import pandas as pd\n",  "nucmerpd = pd.read_csv(\"/media/milou/glob/projects/masmvali-partdeux/reassembly-filtered-reads/Sample_1ng_even/metassemble/assemblies/velvet/noscaf/noscaf_31/val/blast-low-purity-contigs/nucmer/nucmer.coords\", sep=\"\\t\",\n",  " names=\"S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR COVQ REFID QRYID\".split(), index_col=False)"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "output_type": "stream",  "stream": "stdout",  "text": [  "ERROR! 
Session/line number was not unique in database. History logging moved to new session 246\n"  ]  }  ],  "prompt_number": 131  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "nucmerpd.head()"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "html": [  "
S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR COVQ REFID QRYID
0 1880117 1880749 1 633 633 633 100.0 2178400 1256 0.03 50.40 Archaeoglobus_fulgidus_DSM_4304 NODE_6298_length_1226_cov_6.613377
1 1881964 1882596 624 1256 633 633 100.0 2178400 1256 0.03 50.40 Archaeoglobus_fulgidus_DSM_4304 NODE_6298_length_1226_cov_6.613377
2 521935 525786 1 3852 3852 3852 100.0 2970275 5492 0.13 70.14 Caldisaccharolyticus_DSM_8903 NODE_144567_length_5462_cov_19.081656
3 526902 528554 3840 5492 1653 1653 100.0 2970275 5492 0.06 30.10 Caldisaccharolyticus_DSM_8903 NODE_144567_length_5462_cov_19.081656
4 4263530 4272810 1 9281 9281 9281 99.9 5258541 11013 0.18 84.27 Chloroflexus_aurantiacus_J-10-fl NODE_163185_length_10983_cov_16.791224
\n",
  "

5 rows \u00d7 13 columns

\n",
  "
"
  ],  "metadata": {},  "output_type": "pyout",  "prompt_number": 132,  "text": [  " S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR \\\n",  "0 1880117 1880749 1 633 633 633 100.0 2178400 1256 0.03 \n",  "1 1881964 1882596 624 1256 633 633 100.0 2178400 1256 0.03 \n",  "2 521935 525786 1 3852 3852 3852 100.0 2970275 5492 0.13 \n",  "3 526902 528554 3840 5492 1653 1653 100.0 2970275 5492 0.06 \n",  "4 4263530 4272810 1 9281 9281 9281 99.9 5258541 11013 0.18 \n",  "\n",  " COVQ REFID \\\n",  "0 50.40 Archaeoglobus_fulgidus_DSM_4304 \n",  "1 50.40 Archaeoglobus_fulgidus_DSM_4304 \n",  "2 70.14 Caldisaccharolyticus_DSM_8903 \n",  "3 30.10 Caldisaccharolyticus_DSM_8903 \n",  "4 84.27 Chloroflexus_aurantiacus_J-10-fl \n",  "\n",  " QRYID \n",  "0 NODE_6298_length_1226_cov_6.613377 \n",  "1 NODE_6298_length_1226_cov_6.613377 \n",  "2 NODE_144567_length_5462_cov_19.081656 \n",  "3 NODE_144567_length_5462_cov_19.081656 \n",  "4 NODE_163185_length_10983_cov_16.791224 \n",  "\n",  "[5 rows x 13 columns]"  ]  }  ],  "prompt_number": 132  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "print len(nucmerpd)"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "output_type": "stream",  "stream": "stdout",  "text": [  "35\n"  ]  }  ],  "prompt_number": 133  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "nucmerpd.groupby([\"QRYID\",\"REFID\"]).count()"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "html": [  "
S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR COVQ REFID QRYID
QRYID REFID
NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 3 3 3 3 3 3 3 3 3 3 3 3
NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 2 2 2 2 2 2 2 2 2 2 2 2
Shewanella_baltica_OS223 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 2 2 2 2 2 2 2 2 2 2 2 2
Salinispora_tropica_CNB-440 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 2 2 2 2 2 2 2 2 2 2 2 2
NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 2 2 2 2 2 2 2 2 2 2 2 2
\n",
  "

17 rows \u00d7 13 columns

\n",
  "
"
  ],  "metadata": {},  "output_type": "pyout",  "prompt_number": 134,  "text": [  " S1 \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " E1 \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 
\n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " S2 \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " E2 \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  
"NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " LEN1 \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 
\n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " LEN2 \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " IDY \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 
Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " LENR \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  
"NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " LENQ \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " COVR \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 
Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " COVQ \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  
"NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " REFID \\\n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 \n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  " QRYID \n",  "QRYID REFID \n",  "NODE_100676_length_9011_cov_31.258905 Persephonella_marina_EX-H1 2 \n",  "NODE_114641_length_4007_cov_12.858996 Gemmatimonas_aurantiaca_T-27_DNA 2 \n",  "NODE_116750_length_15171_cov_18.040604 Rhodopirellula_baltica_SH_1 3 \n",  "NODE_144567_length_5462_cov_19.081656 Caldisaccharolyticus_DSM_8903 2 \n",  "NODE_163185_length_10983_cov_16.791224 Chloroflexus_aurantiacus_J-10-fl 2 \n",  "NODE_169705_length_2232_cov_20.505377 Porphyromonas_gingivalis_ATCC_33277_DNA 2 \n",  "NODE_179244_length_804_cov_7.087065 Desulfovibrio_piger_ATCC_29098 2 \n",  "NODE_184936_length_645_cov_5.525581 Salinispora_arenicola_CNS-205 2 
\n",  "NODE_233246_length_477_cov_10.846960 Treponema_denticola_ATCC_35405 2 \n",  "NODE_304540_length_2772_cov_50.233044 Shewanella_baltica_OS185 2 \n",  " Shewanella_baltica_OS223 2 \n",  "NODE_31545_length_1101_cov_9.786558 Salinispora_arenicola_CNS-205 2 \n",  " Salinispora_tropica_CNB-440 2 \n",  "NODE_344588_length_5230_cov_9.675526 Deinococcus_radiodurans_R1 2 \n",  "NODE_346358_length_6782_cov_12.580213 Nitrosomonas_europaea_ATCC_19718 2 \n",  "NODE_6298_length_1226_cov_6.613377 Archaeoglobus_fulgidus_DSM_4304 2 \n",  "NODE_86907_length_3816_cov_11.066299 Deinococcus_radiodurans_R1 2 \n",  "\n",  "[17 rows x 13 columns]"  ]  }  ],  "prompt_number": 134  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "nucmerpd[nucmerpd[\"QRYID\"] == \"NODE_304540_length_2772_cov_50.233044\"]"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "html": [  "
\n",  "\n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  "
S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR COVQ REFID QRYID
29 1 1293 1510 2802 1293 1293 99.77 5229686 2802 0.02 46.15 Shewanella_baltica_OS185 NODE_304540_length_2772_cov_50.233044
30 5228178 5229686 1 1509 1509 1509 99.93 5229686 2802 0.03 53.85 Shewanella_baltica_OS185 NODE_304540_length_2772_cov_50.233044
31 1 988 1815 2802 988 988 99.39 5145902 2802 0.02 35.26 Shewanella_baltica_OS223 NODE_304540_length_2772_cov_50.233044
32 5144089 5145902 1 1814 1814 1814 99.83 5145902 2802 0.04 64.74 Shewanella_baltica_OS223 NODE_304540_length_2772_cov_50.233044
\n",
  "

4 rows \u00d7 13 columns

\n",
  "
"
  ],  "metadata": {},  "output_type": "pyout",  "prompt_number": 135,  "text": [  " S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR \\\n",  "29 1 1293 1510 2802 1293 1293 99.77 5229686 2802 0.02 \n",  "30 5228178 5229686 1 1509 1509 1509 99.93 5229686 2802 0.03 \n",  "31 1 988 1815 2802 988 988 99.39 5145902 2802 0.02 \n",  "32 5144089 5145902 1 1814 1814 1814 99.83 5145902 2802 0.04 \n",  "\n",  " COVQ REFID QRYID \n",  "29 46.15 Shewanella_baltica_OS185 NODE_304540_length_2772_cov_50.233044 \n",  "30 53.85 Shewanella_baltica_OS185 NODE_304540_length_2772_cov_50.233044 \n",  "31 35.26 Shewanella_baltica_OS223 NODE_304540_length_2772_cov_50.233044 \n",  "32 64.74 Shewanella_baltica_OS223 NODE_304540_length_2772_cov_50.233044 \n",  "\n",  "[4 rows x 13 columns]"  ]  }  ],  "prompt_number": 135  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "nucmerpd[nucmerpd[\"QRYID\"] == \"NODE_346358_length_6782_cov_12.580213\"]"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "html": [  "
\n",  "\n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  "
S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR COVQ REFID QRYID
14 495650 498088 1 2439 2439 2439 100 2812094 6812 0.09 35.8 Nitrosomonas_europaea_ATCC_19718 NODE_346358_length_6782_cov_12.580213
15 498085 502539 2358 6812 4455 4455 100 2812094 6812 0.16 65.4 Nitrosomonas_europaea_ATCC_19718 NODE_346358_length_6782_cov_12.580213
\n",
  "

2 rows \u00d7 13 columns

\n",
  "
"
  ],  "metadata": {},  "output_type": "pyout",  "prompt_number": 136,  "text": [  " S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR COVQ \\\n",  "14 495650 498088 1 2439 2439 2439 100 2812094 6812 0.09 35.8 \n",  "15 498085 502539 2358 6812 4455 4455 100 2812094 6812 0.16 65.4 \n",  "\n",  " REFID QRYID \n",  "14 Nitrosomonas_europaea_ATCC_19718 NODE_346358_length_6782_cov_12.580213 \n",  "15 Nitrosomonas_europaea_ATCC_19718 NODE_346358_length_6782_cov_12.580213 \n",  "\n",  "[2 rows x 13 columns]"  ]  }  ],  "prompt_number": 136  },  {  "cell_type": "code",  "collapsed": false,  "input": [  "nucmerpd[nucmerpd[\"QRYID\"] == \"NODE_31545_length_1101_cov_9.786558\"]"  ],  "language": "python",  "metadata": {},  "outputs": [  {  "html": [  "
\n",  "\n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  " \n",  "
S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR COVQ REFID QRYID
25 5293095 5293369 1 275 275 275 100.00 5786361 1131 0.00 24.31 Salinispora_arenicola_CNS-205 NODE_31545_length_1101_cov_9.786558
26 5293414 5294288 257 1131 875 875 100.00 5786361 1131 0.02 77.37 Salinispora_arenicola_CNS-205 NODE_31545_length_1101_cov_9.786558
27 4806766 4807258 1 482 493 482 83.56 5183331 1131 0.01 42.62 Salinispora_tropica_CNB-440 NODE_31545_length_1101_cov_9.786558
28 4807664 4807787 1008 1131 124 124 89.52 5183331 1131 0.00 10.96 Salinispora_tropica_CNB-440 NODE_31545_length_1101_cov_9.786558
\n",
  "

4 rows \u00d7 13 columns

\n",
  "
"
  ],  "metadata": {},  "output_type": "pyout",  "prompt_number": 137,  "text": [  " S1 E1 S2 E2 LEN1 LEN2 IDY LENR LENQ COVR \\\n",  "25 5293095 5293369 1 275 275 275 100.00 5786361 1131 0.00 \n",  "26 5293414 5294288 257 1131 875 875 100.00 5786361 1131 0.02 \n",  "27 4806766 4807258 1 482 493 482 83.56 5183331 1131 0.01 \n",  "28 4807664 4807787 1008 1131 124 124 89.52 5183331 1131 0.00 \n",  "\n",  " COVQ REFID QRYID \n",  "25 24.31 Salinispora_arenicola_CNS-205 NODE_31545_length_1101_cov_9.786558 \n",  "26 77.37 Salinispora_arenicola_CNS-205 NODE_31545_length_1101_cov_9.786558 \n",  "27 42.62 Salinispora_tropica_CNB-440 NODE_31545_length_1101_cov_9.786558 \n",  "28 10.96 Salinispora_tropica_CNB-440 NODE_31545_length_1101_cov_9.786558 \n",  "\n",  "[4 rows x 13 columns]"  ]  }  ],  "prompt_number": 137  },  {  "cell_type": "markdown",  "metadata": {},  "source": [  "As these alignments show, the contigs still align mostly to the same genome, so there appear to be no chimeras with X% from one genome and Y% from another for large values of X and Y. The differences from the reference are mostly rearrangements and some indels. It is tricky to inspect the alignments in tabular form; a visualization would help, perhaps using a BLAST visualizer."  ]  },  {  "cell_type": "code",  "collapsed": false,