Ethics
The human brain samples sourced from the human brain tissue repository
were collected post-mortem following written, informed consent from next
of kin of the deceased, and their use for research was approved by the
NIMHANS Institutional Ethics Committee. All canine brain tissues used in
this study were collected at necropsy from dogs that died of suspected
rabies, through the routine rabies surveillance programme conducted by
Mission Rabies, a non-governmental organization and the Department of
Animal Husbandry and Veterinary Services, Government of Goa and sent to
our rabies referral laboratory for diagnostic confirmation. Therefore,
approval from Institutional Animal Ethics Committee was not required.
The study protocol was approved by the NIMHANS Institutional Ethics
Committee (NIMH/DO/ETHICS SUB-COMMITTEE MEETING/2017, dated June 19,
2017).
RNA extraction
Approximately, 300 mg of each brain tissue was cut into small pieces
with a sterile blade, resuspended in 1 ml of nuclease free water and
homogenized. The homogenate was used for nucleic acid purification using
QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany #52904). Briefly, on
column DNase digestion (NEB #M0303L) was performed for 30 min at room
temperature following lysis and flow through on Qiagen column. This was
followed by the wash steps as recommended by the manufacturer. RNA was
eluted in 50 µl of elution buffer and stored at -80◦C until use. 8µl of
RNA from each sample was taken for cDNA preparation using LunaScript RT
SuperMix Kit (NEB#E3010L).
Amplicon primer design and sequencing
All the samples used in the study were positive for RABV by real time
reverse transcriptase PCR (Ct values 14-25). We used an amplicon-based
sequencing method using tiling primer-based approach for whole genome
sequencing using Primal Scheme 29 to recover complete
RABV genomes from the samples. A set of 5 reference sequences belonging
to the arctic lineage from datasets curated and filtered in RABV-GLUE
were chosen to represent the known circulating RABV diversity in India.
The primer sets included 38 pairs with amplicons of about 400 base pairs
(bp) spanning the whole genome except the 5’ and the 3’UTR
(Supplementary Table S1). The primer sets used in this study can detect
the Indian subcontinent as well as all other Arctic lineages. The primer
coverage for other RABV lineages reported in India is provided in
Supplementary Figure-1. PCR was performed by pooling overlapping primers
into different pools to prevent preferential amplification of short
fragments between adjacent primer pairs. The final concentration of 10
mM of primers was used for PCR. The resulting PCR amplicons were used
for preparing libraries for nanopore sequencing using the native
barcoding (NBD 104/114) approach combined with the ligation sequencing
kit (SQK-LSK109). About 12-24 samples were barcoded and included in a
single run. The resulting DNA was cleaned up, quantified on a Qubit 1.0
fluorimeter (Thermo Fisher Scientific), followed by loading onto MinION
flow cell (FLO-MIN106) for sequencing (ONT, UK). Acquisition was done
using the MinKNOW Software v.1.4.2 and the sequencing run was stopped
after 24 hours. The detailed step by step protocol is available atdx.doi.org/10.17504/protocols.io.3byl4jzb8lo5/v1 .
Data analysis
Sequences were basecalled and demultiplexed using guppy (v3.6), and read
lengths between 100–600bp were considered for further analysis.
Sequencing reads were trimmed by 25 bp on both ends and the specific
removal of primer sequences was performed using BBDuk (v38.37).
Resulting reads were mapped to LT909541 as this genome was found to have
higher similarity to the consensus genomes based on percentage identity,
using Minimap2 (v2.17) within Geneious Prime (Geneious Prime 2020.0.3).
For each sample, a consensus genome was generated with a minimum
coverage of 10× for each base and by calling the most frequently
occurring base at each position. Consensus sequences were manually
examined, edited, and aligned to the reference genome to ensure the
correct reading frame, followed by transfer of annotations from the
reference sequence.
Phylogenetic analysis
Phylogenetic trees were constructed using Maximum likelihood method with
the complete coding sequences. The available RABV whole genome rabies
sequences from all around the globe were retrieved from the NIAID Virus
Pathogen Database and Analysis Resource (ViPR) through the web site at
http://www.viprbrc.org/and used for the analysis. Sequences were
uploaded to RABV-GLUE for the lineage determination.
All the sequences were aligned using MUSCLE algorithm (Edgar 2004) in
Aliview program (V.2018). Only the sequences adequately covering the
regions of interest (n=55) were taken for further analysis. A maximum
likelihood tree was constructed with 1000 bootstraps, with GTR+I+G4
method based on Bayesian information criteria for whole genome sequences
using the iqtree (v1.3.11.1) software. The tree was visualized using
Figtree (v1.4.2) 30.
Phylogeny and divergence estimation
To infer the phylogenetic relationship and divergence time between
Indian genomes generated in this study and other known rabies genomes,
we created a dataset with 40 whole genomes from neighboring and other
south Asian countries; Nepal, Pakistan, Bangladesh, Afghanistan, China
and Sri Lanka. Sequences were aligned using MUSCLE 31in MEGA 7.0.26 32 and missing positions from both end
of the alignment were trimmed to 11700 bp. This alignment was used to
infer phylogenetic relationships and divergence estimation using BEAST
1.8.4 33. Sequence isolation dates were used to
perform tip dating and calibrate the phylogenies. The General
Time-Reversible (GTR) substitution with gamma (G) heterogeneity model
with a strict molecular clock were used for the analysis. Each phylogeny
was run for 3x108 Markov Chain Monte Carlo (MCMC)
cycles. Effective sample sizes (ESS) and parameter convergence of MCMC
cycles were assessed using Tracer 1.6 34. A maximum
clade credibility consensus tree was generated after 25% burnin. The
resultant phylogeny was visualized and annotated using FigTree 1.4.330. The same analysis was separately performed for the
south Indian (Karnataka, Tamil Nadu, Andhra Pradesh, and Goa) genomes
generated in this study to understand their phylogenetic relationship
and divergence. Internal nodes of the phylogeny were marked and colored
based on the Posterior Probability support (PP).
Results
Complete RABV genomes were recovered from all the 20 archived human and
canine brain tissue samples (100%) sequenced using an amplicon-based
approach in this study. All sequences have been deposited in GenBank;
Accession numbers are given in Table 3. The depth and breadth of the
sequences and the pairwise identity with the reference genome is given
in the Table 2 and Supplementary Figure S2. A median of 0.08 million
reads per sample was obtained. Phylogenetic analysis of the complete
concatenated coding sequences of 75 RABV from across the world
(including 20 sequences from this study) showed that all the RABV
sequences obtained from human and canine brain tissues in the study
belonged to the Artic-like lineage, sub-lineage AL1a (Figure 1).
Our analysis revealed the phylogenetic position of south Indian
sequences to other Arctic-like lineages (AL1a) from India, Nepal,
Bangladesh and Pakistan (Figure 2A). All the 20 sequences generated in
this study were part of this AL1a lineage which is sister to other
Arctic-like lineage-AL1b from Afghanistan (PP =1). These two lineages
were estimated to have diverged from an ancestral sequence around 1825
(95% HPD 1696-1895). Within the south Indian states (Figure 2B), rabies
sequences showed two strong monophyletic lineages (PP=1, Figure 2C),
separated around 1956 (95% HPD 1930-1978). Clustering was not
associated with state. The first monophyletic lineage (Figure 2C top)
primarily included human sequences from Karnataka (n=5) and Tamil Nadu
(n=2) along with sequence EF437215 reported earlier from Karnataka.
Canine sequences, CR1 and CR2, collected from Goa were also part of the
lineage. All the other canine samples were part of the second
monophyletic lineage (Figure 2C bottom) which clustered with three human
samples (one from Andhra Pradesh and two from Karnataka). However, we
observed CR3, a canine sequence from this lineage, clustering with the
first lineage in the larger phylogeny (Figure 2A). Since the posterior
probability support of the CR3 divergence in the second phylogeny is low
(PP=0.65), this minor topological variation will not affect our result
interpretation of divergence estimation.