Abstract
Group A Rotavirus (RVA), which causes acute gastroenteritis (AGE) in
children worldwide, is categorized mainly based on VP7 (genotype G) and
VP4 (genotype P) genes. Genotypes that circulate at <1% are
considered unusual. The VP6 (genotype I) and NSP4 (genotype E) genes are
also important: VP6 establishes the group and affects immunogenicity,
while NSP4, as an enterotoxin, is responsible for the clinical symptoms.
The aim of this study was to genotype and molecularly characterize the VP6
and NSP4 genes of unusual RVA. Unusual RVA strains isolated from fecal
samples of children ≤16 years with AGE were genotyped for the VP6 and NSP4
genes by Sanger sequencing. Phylogenetic analysis was performed using the MEGA
11 program. Over a 15-year period (2007-2021), 54.8% (34/62) of unusual
RVA strains were successfully I and E genotyped. Three different I and E
genotypes were identified; I2 (73.5%, 25/34) and E2 (35.3%, 12/34)
were the commonest. E3 genotype was detected from 2017 onwards. The
uncommon combination of I2-E3 was found in 26.5% (9/34) of the strains
and G3-P[9]-I2-E3 was the most frequent G-P-I-E combination (20.6%,
7/34). Statistical analysis showed that children infected with E2
strains had a higher relative frequency of dehydration (50%) compared
to those with the E3 genotype (p = 0.019). Multiple substitutions
were detected in both genes, but their functional effect remains
unknown. The results of this study highlight the genetic diversity of
RVA strains. Continuous whole-genome surveillance of RVA will provide
better knowledge of its evolution.
Keywords: Rotavirus, acute gastroenteritis, children, NSP4,
VP6, genotyping, phylogenetic
Introduction
Group A Rotavirus (RVA) is one of the most common etiological agents of
acute gastroenteritis (AGE) in infants and young children, especially in
developing countries. Children with RVA AGE can present with severe
dehydration that can even lead to death if left untreated. RVA is
responsible for more than 100,000 deaths each year
worldwide.1
RVA is a non-enveloped, icosahedral, double-stranded RNA (dsRNA) virus
and is a member of the Reoviridae family. Its genome consists of
11 linear dsRNA segments which encode six structural viral proteins
(VP1-VP4, VP6 and VP7) and six non-structural viral proteins
(NSP1-NSP6).2
The viral particles consist of a triple-layered capsid. The outer capsid
consists of the glycoprotein VP7 and the protease-sensitive spike
protein VP4. The middle layer consists of VP6, and the core layer
consists of VP2, which encapsulates the genomic RNA and the viral
replication components.2 The abundant VP6 protein is
commonly used for the detection and classification of rotaviruses.
Currently, ten rotavirus species have been identified, A-J, but only A,
B, C and H can infect humans, including through animal-to-human
transmission.2–5
RVAs are further classified based on the outer layer proteins VP7 and
VP4 into G and P genotypes, respectively. Although many different G and P
types have been identified so far, the most common circulating genotypes
are G1P[8], G2P[4], G3P[8], G4P[8], G9P[8] and
G12P[8].6,7 Genotyping can also be applied to the
whole virus genome using the notation Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx, where “x”
indicates the number of the corresponding genotype and the letters represent
the genotypes of the genes
VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5.8
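The correspondence between the constellation notation and the 11 genes can be illustrated with a short sketch. The helper name `parse_constellation` is hypothetical and only handles complete 11-segment constellations; the example string is the feline-origin strain constellation cited in the Discussion.

```python
# Gene order of the genotype constellation, as given in the text.
GENES = ["VP7", "VP4", "VP6", "VP1", "VP2", "VP3",
         "NSP1", "NSP2", "NSP3", "NSP4", "NSP5"]
PREFIXES = ["G", "P", "I", "R", "C", "M", "A", "N", "T", "E", "H"]


def parse_constellation(text):
    """Map each gene to its genotype, e.g. 'VP6' -> 'I2'.

    Hypothetical helper for illustration; it accepts only complete
    constellations written as Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx.
    """
    parts = text.split("-")
    if len(parts) != len(GENES):
        raise ValueError("expected 11 genotypes, got %d" % len(parts))
    result = {}
    for gene, prefix, part in zip(GENES, PREFIXES, parts):
        if not part.startswith(prefix):
            raise ValueError("%s genotype should start with %s: %r"
                             % (gene, prefix, part))
        result[gene] = part
    return result


strain = parse_constellation("G3-P[9]-I2-R2-C2-M2-A3-N2-T3-E3-H3")
print(strain["VP6"], strain["NSP4"])  # prints: I2 E3
```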
In addition to the full-length protein, the NSP4 gene encodes a toxic peptide
(amino acids 114-135); both act as enterotoxins that can stimulate
Ca2+ release from the endoplasmic reticulum into the
cytoplasm.9,10 NSP4 enterotoxin activity aggravates
the symptoms of gastroenteritis, especially diarrhoea and
vomiting.11,12
Since 2006, several RVA vaccines have been released worldwide. The most
widely used are the two-dose monovalent vaccine Rotarix (GlaxoSmithKline
Biologicals, Belgium) and the three-dose pentavalent vaccine RotaTeq
(Merck, United States), which cover the most common G and P
genotypes.13,14 After their release, notable changes
in genotype distribution have been described worldwide, such as the
increase in unusual G and P genotypes.2,15 The aim of
this study was the molecular and phylogenetic characterization of VP6
and NSP4 genes of previously described16 unusual G and
P RVA strains isolated from children ≤16 years with AGE.
Materials and Methods
Study design
This is a retrospective study involving RVA-positive fecal samples
with previously described unusual G (G6, G8, G10) and/or P (P[6],
P[9], P[10], P[11], P[14]) genotypes collected from
children ≤16 years hospitalized with AGE.16 In the
present study, these strains were further genotyped for the VP6 (I
genotype) and NSP4 (E genotype) genes.
Demographic and epidemiological data such as age, gender, residence,
and RVA vaccination status were also collected from the children
infected with an unusual RVA genotype. Clinical symptoms (diarrhoea,
vomiting, fever and dehydration) and laboratory data (measurements of
potassium (K+), sodium (Na+),
calcium (Ca2+), chloride (Cl-),
C-reactive protein (CRP), urea, creatinine, white blood cells (WBC),
polymorphonuclear leukocytes and lymphocytes) were also recorded.
The Scientific and Bioethics Committee of “Aghia Sophia” Children’s
Hospital approved this study (No 6261).
Reverse transcription and amplification of VP6 and NSP4 genes
Nucleic acid extraction and reverse transcription (RT) were
performed as previously described.16 PCR
amplification was performed using GoTaq DNA polymerase (Promega;
Madison, WI, USA) and primers F: 5’-GAC GGV GCR ACT ACA TGG T-3’
and R: 5’-GTC CAA TTC ATN CCT GGT G-3’ for the VP6 gene and F:
5’-GGC TTT TAA AAG TTC TGT TCC GAG-3’ and R: 5’-GTC ACA YTA AGA CCR
TTC CTT CCA T-3’ for the NSP4 gene.17,18 The PCR
amplification was carried out with an initial denaturation at
94 °C for 2 minutes (min), followed by 40 cycles of
denaturation for 1 min at 94 °C, annealing for 1
min at 55 °C for the VP6 gene or
48 °C for the NSP4 gene, and extension for 1 min at 72 °C, with a final
extension for 10 min at 72 °C. The amplification products were analyzed by
2% agarose gel electrophoresis using a 50 bp DNA ladder (N3236S; New
England Biolabs, Massachusetts, USA) and ethidium bromide staining.
The expected size was 379 bp for the VP6 gene and 749 bp for the
NSP4 gene.
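Three of the four primers contain IUPAC ambiguity codes (V, R, N, Y), so each represents a small pool of concrete oligonucleotides. The following minimal sketch expands the primers listed above using the standard IUPAC code table; the helper name `expand_primer` is introduced only for illustration and is not part of the study's workflow.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


# Primer sequences as listed in the text (5' to 3').
primers = {
    "VP6 forward": "GAC GGV GCR ACT ACA TGG T",
    "VP6 reverse": "GTC CAA TTC ATN CCT GGT G",
    "NSP4 forward": "GGC TTT TAA AAG TTC TGT TCC GAG",
    "NSP4 reverse": "GTC ACA YTA AGA CCR TTC CTT CCA T",
}
for name, seq in primers.items():
    length = len(seq.replace(" ", ""))
    print(name, length, "nt,", len(expand_primer(seq)), "variant(s)")
```

Under this expansion the VP6 forward primer corresponds to six sequences, the VP6 reverse and NSP4 reverse primers to four each, and the NSP4 forward primer to one.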
Sequencing and phylogenetic analysis
The I and E genotypes were determined by Sanger sequencing
with the BigDye Terminator v3.1 Cycle Sequencing Kit on an Applied
Biosystems 3500 genetic analyser (Applied Biosystems, Waltham, MA, USA),
followed by sequence comparison using the BLAST bioinformatic tool
(https://blast.ncbi.nlm.nih.gov/Blast.cgi).
Phylogenetic evolutionary analysis was performed on VP6 and NSP4 genes
using the MEGA 11 software (Molecular Evolutionary Genetics Analysis;
www.megasoftware.net). Multiple sequence alignment was performed using
MUSCLE software (Multiple Sequence Comparison by Log-Expectation). The
nucleotide substitution model was selected based on the BIC (Bayesian
Information Criterion) scores using MEGA11. The model used in this study
was Tamura 3-parameter (T92) using a discrete Gamma distribution (+G)
with five rate categories and assuming that a certain fraction of sites
is evolutionarily invariable (+I). The evolutionary trees were
constructed using the Maximum Likelihood method and bootstrap resampling
with 1000 replicates.
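Model selection by BIC can be sketched as follows, using the usual definition BIC = k·ln(n) − 2·lnL, where k is the number of free parameters, n the sample size (here assumed to be the number of alignment sites) and lnL the maximized log-likelihood. The log-likelihoods and parameter counts below are hypothetical placeholders, not values from this study, and MEGA's internal bookkeeping may differ in detail.

```python
import math


def bic(log_likelihood, n_parameters, n_sites):
    """Bayesian Information Criterion: k * ln(n) - 2 * lnL (lower is better)."""
    return n_parameters * math.log(n_sites) - 2.0 * log_likelihood


# HYPOTHETICAL fits for illustration only, NOT results of this study:
# model name -> (maximized log-likelihood, number of free parameters).
candidate_fits = {
    "JC": (-2150.0, 65),
    "T92+G+I": (-2050.0, 69),
    "GTR+G+I": (-2045.0, 75),
}
n_sites = 378  # assumed sample size: length of the VP6 fragment in the text

scores = {name: bic(lnl, k, n_sites)
          for name, (lnl, k) in candidate_fits.items()}
best = min(scores, key=scores.get)
for name in sorted(scores, key=scores.get):
    print(name, round(scores[name], 1))
print("lowest BIC:", best)
```

The penalty term k·ln(n) is what allows a simpler model to be preferred over a richer one whose likelihood is only marginally higher.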
Statistical analysis
Statistical analysis was carried out using SPSS software (IBM
Statistical Package for Social Sciences for Windows, Version 25.0.
Armonk, NY: IBM Corp). A p-value ≤ 0.05 was considered
statistically significant. Pearson’s chi-square test
(χ2 test) was applied to determine differences for
the variables that met the criteria for its application. Fisher’s exact
test was used for variables that did not meet the criteria of the χ2 test
or when both variables had two categories (2x2 contingency table).
Nucleotide sequence accession numbers
The nucleotide sequences of this study were deposited in GenBank
database (https://www.ncbi.nlm.nih.gov/genbank/) with accession
numbers OM281957-59, OM287400, OM303085, OM303088, OM333185, OM333186,
OM323986, OM461377, OM461378, OM972707-OM972710, ON004913, ON009342,
ON156796, ON156797, ON185611-18, ON206978, ON206980-83, ON971933,
ON971934 for the VP6 gene and OM281953, OM281956, OM283121-26, OM287398,
OM287399, OM362404, OM948988-91, ON004914, ON156785-93, ON564370,
ON564371, ON971935 for the NSP4 gene.
Results
I and E genotyping and genetic linkage with G and P genotypes
From 2007 to 2021, 54.8% (34/62) of the unusual RVA strains were
successfully I and E genotyped, and they consisted of 5.9% (2/34)
unusual G (G8 and G10), 64.7% (22/34) unusual P (P[6], P[9],
P[10] and P[11]) and 29.4% (10/34) unusual G and P (G6P[9],
G6P[14] and G8P[14]).
Three I genotypes and three E genotypes were identified: I1 (7/34, 20.6%), I2
(25/34, 73.5%) and I3 (2/34, 5.9%), and E1 (5/34, 14.7%), E2 (12/34,
35.3%) and E3 (11/34, 32.4%). The most common genotypes were I2 and E2.
The E3 genotype was first detected in samples from 2017 and was the second
most common E genotype (Figure 1). E3 was detected in strains
with the P[9] genotype combined with the G3 (n=8), G4 (n=1), G6 (n=1), and
G9 (n=1) genotypes (Table 1).
Six (6/34, 17.6%) samples were not successfully genotyped for the NSP4 gene
and were characterized as EUD (unidentified E genotype). These
RVA strains were the following: G2-P[6]-I2-EUD (n=1),
G6-P[9]-I2-EUD (n=2), G8-P[8]-I1-EUD (n=1), G8-P[14]-I2-EUD
(n=1), and G12-P[6]-I1-EUD (n=1) (Table 1).
The most frequent G-P-I-E combinations were G3-P[9]-I2-E3 and
G8-P[14]-I2-E2, accounting for 20.6% (7/34) and 11.8% (4/34) of the
samples, respectively (Table 1).
Association of I and E genotypes with patient characteristics
Statistical analysis of demographic, clinical, and laboratory data from
children according to RVA I genotype showed no significant association.
The corresponding analysis for E genotypes showed that children
infected with E2 RVA strains had a higher relative frequency of
dehydration (6/12, 50%) than those infected with the E3 genotype (0/9,
0%) (p = 0.019).
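The reported p-value can be checked from the counts alone. The sketch below assumes that the two-sided Fisher’s exact test described in the Methods was the test applied to this 2x2 comparison; the function is a plain-Python illustration and not the SPSS procedure itself.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Rows: E2, E3. Columns: dehydrated, not dehydrated (counts from the text).
p = fisher_exact_two_sided(6, 6, 0, 9)
print(round(p, 3))  # prints: 0.019
```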
Molecular characterization and phylogenetic analysis of VP6
Molecular characterization was performed on a 378 bp fragment of the VP6 gene,
which encodes amino acids (aa) 243-368 of the protein. This VP6 sequence
was compared to reference strains from the Wa (K02086.1), DS-1
(DQ870507.1), and AU-1 (DQ490538.1) constellations,
depending on the I genotype, to detect substitutions. This comparison showed
four homozygous missense substitutions [V252I (n=7/7), I281V (n=3/7),
A287T (n=7/7), L291S (n=7/7)] in strains carrying the I1 genotype,
four [V281I (n=19/25), S303A (n=25/25), M342L (n=1/25), V349I
(n=1/25)] in strains carrying the I2 genotype and one (V330I) in both
strains carrying the I3 genotype. However, none of these substitutions
was novel when the VP6 sequences were compared with the 100 most
similar strains using BLAST.
The sequenced fragment of the VP6 gene included part of antigenic region
III (aa 208-274). Genetic analysis revealed
three already known substitutions: the homozygous Y248F carried by
all I1, Wa and Rotarix strains, the homozygous V252I carried by all I1
strains, and the homozygous I253V carried by I3 (n=1) and AU-1 strains.
Phylogenetic analysis of the VP6 gene in 34 unusual RVA strains revealed
three distinct groups corresponding to the I1, I2, and I3 genotypes, with
100% bootstrap support for the I1 and I3 groups and 88% for the I2
group. Among the unusual RVA strains carrying the I2 genotype, two
distinct clades (I2-A and I2-B) were identified with 94% and 80%
bootstrap support, respectively (Figure 2). These
clades are separated by nine synonymous substitutions
(L265L/c.793T>C, N266N/c.798T>C,
Y273Y/c.819T>C, T287T/c.861T>A,
L294L/c.882A>G, V304V/c.912G>A,
L324L/c.972A>G/T, A344A/c.1032T>A,
T347T/c.1041G>A). Clade I2-B is further divided into two
subclades (I2-B1 and I2-B2), which are separated by the
missense substitution V281I/c.841G>A (carried by 5
strains) and four synonymous substitutions (N25N/c.75T>C,
A275A/c.825A>T, T323T/c.969G>A,
L324L/c.972T>G).
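The substitution labels pair an amino acid position with a coding-sequence position (for example V281I/c.841G>A). A minimal sketch, assuming 1-based numbering from the first base of the coding sequence, shows how the two numbers relate and how the 378 bp fragment length follows from aa 243-368; the helper names are hypothetical.

```python
import re

# Matches labels such as 'V281I/c.841G>A' or 'I169S/c.505_506AT>TC'.
ANNOTATION = re.compile(r"[A-Z](\d+)[A-Z]/c\.(\d+)(?:_\d+)?[ACGT]+>[ACGT/]+")


def codon_number(c_position):
    """1-based codon index containing a 1-based coding position."""
    return (c_position - 1) // 3 + 1


def is_consistent(annotation):
    """True if the residue number matches the codon of the c. position."""
    match = ANNOTATION.fullmatch(annotation)
    if match is None:
        raise ValueError("unrecognised annotation: %r" % annotation)
    residue, c_position = int(match.group(1)), int(match.group(2))
    return codon_number(c_position) == residue


# Fragment length implied by amino acids 243-368 of VP6.
print((368 - 243 + 1) * 3)  # prints: 378

for label in ["V281I/c.841G>A", "L265L/c.793T>C", "L324L/c.972A>G/T"]:
    print(label, is_consistent(label))
```

For instance, c.841 is the first base of codon 281, which is why a G>A change there is reported as V281I.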
Molecular characterization and phylogenetic analysis of NSP4
Molecular characterization was performed on the whole NSP4 gene by
comparing the sequences of this study to the Wa (AF093199.1), DS-1
(EF672582.1), and AU-1 (D89873.1) reference strains. This
comparison identified 13 homozygous missense substitutions in strains
carrying the E1 genotype, 21 homozygous and two heterozygous missense
substitutions in strains carrying the E2 genotype, and 16 homozygous and
one heterozygous missense substitution in strains carrying the E3
genotype (Figure 3). Most of these substitutions (n=23) were
located in the VP4 binding region (aa 112-148).
The NSP4 gene sequences were compared with the 100 most similar strains
using BLAST, and eight possibly novel substitutions were
identified: D140N in one E1
strain; L25I (n=1), T78A (n=1) and D140N (n=2) in four E2 strains;
and D19G (n=1), I24V (n=1),
V102I (n=1), K141R (n=2) and T155M (n=2) in six E3 strains. Four of
these substitutions were located within significant domains of NSP4.
Specifically, D19G and T78A were in the conserved hydrophobic domains 1
and 3 (H1, H3), respectively, and D140N and K141R were located in
the VP4 binding domain.
In the toxic peptide region, three already known homozygous
substitutions were detected. H131Y was found in 1/5 (20.0%) E1
strains, 4/12 (33.3%) E2 strains and 1/11 (9.1%) E3 strains.
M133V was found in 7/11 (63.6%) E3 strains and M135V was detected
in 1/12 (8.3%) E2 strains (Figure 3).
Phylogenetic analysis of the NSP4 gene in 28 unusual RVA strains
revealed three distinct groups corresponding to the E1, E2, and E3 genotypes
with 100% bootstrap support. Among the unusual RVA strains carrying the E2
genotype, three distinct clades (E2-A, E2-B and E2-C) were identified
(Figure 4). The E2-A clade is separated from E2-B and E2-C
by four synonymous substitutions (L21L/c.63A>G,
I56I/c.168A>T, L116L/c.346C>T,
V124V/c.372A>T). The E2-A clade differed from the
E2-B clade by one missense (A45T/c.133G>A) and
five additional synonymous (N18N/c.54T>C,
Q109Q/c.327A>G, L110L/c.330A>G,
I130I/c.390C>T, S138S/c.414G>A) substitutions,
and from the E2-C clade by one missense
(G140D/c.419G>A) and another five synonymous
(P34P/c.102C>T, E125E/c.375G>A,
I130I/c.390A>T, P168P/c.504G>A) substitutions.
The E2-B clade differed from the E2-C clade by two
missense (A45T/c.133G>A, G140D/c.419G>A) and
nine synonymous substitutions.
Unusual strains carrying the E1 and E3 genotypes were also divided into
three (E1-A, E1-B, E1-C) and two (E3-A, E3-B) distinct clades, respectively
(Figure 4). The E1-A and E1-B strains were separated
by three missense (I141V/c.421A>G,
T145S/c.433A>T, I169S/c.505_506AT>TC) and 22
synonymous substitutions. The E1-C clade differed from both the E1-A and
E1-B clades by two missense (I76V/c.226A>G,
S161N/c.482G>A) and three synonymous
(K3K/c.9G>A, L82L/c.244_246TTG>CTA,
P138P/c.414A>G) substitutions. Furthermore, the E1-C clade
differed from the E1-A clade by three missense
(V141T/c.421_422GT>AC, S145T/c.433T>A,
S169I/c.505_506TC>AT) and 21 synonymous substitutions and
from the E1-B clade by one missense (I141T/c.422T>C) and
seven synonymous substitutions. The division within the E3 cluster
was due to six missense (I51V/c.151A>G,
R59K/c.176G>A, R141K/c.422G>A,
F148I/c.442T>A, R151K/c.452G>A,
Q152H/c.456A>C) and 25 synonymous substitutions (Figure 4).
Discussion
Few studies have
investigated the molecular characterization of the VP6 and NSP4 genes in
human RVA strains worldwide, as interest has mainly focused on G and
P distribution. This 15-year study focuses on the genotyping and
molecular characterization of the VP6 and NSP4 genes of unusual G and P
RVA strains isolated from children hospitalized with AGE.
Genotyping revealed three different I (I1, I2, I3) and E (E1, E2, E3)
genotypes in unusual RVA strains, I2 and E2 being the most common.
According to the Rotavirus Classification Working Group, 32 I and E
genotypes are known so far, with I1, I2 and E1, E2 being the most
commonly detected genotypes among humans.9,21–23 I1-E1 are strongly
associated with G1/G3/G4/G5/G9-P[8] and follow
the Wa-like genotype constellation, I2-E2 are associated with
G2-P[4], typical of the DS-1-like genotype constellation, and I3-E3
are associated with G3-P[9], typical of the AU-1-like
constellation.22,24
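The typical I-E pairings described above can be expressed as a simple lookup, which also makes it easy to flag pairs that fit none of the three major constellations. This is only an illustrative sketch: the function name is hypothetical, and a flagged pair is a candidate for further analysis, not proof of reassortment.

```python
# Typical VP6 (I) / NSP4 (E) pairings of the three major constellations,
# as described in the text.
CONSTELLATION_BY_PAIR = {
    ("I1", "E1"): "Wa-like",
    ("I2", "E2"): "DS-1-like",
    ("I3", "E3"): "AU-1-like",
}


def classify_pair(i_genotype, e_genotype):
    """Label an I-E pair; unmatched pairs are flagged, not explained."""
    return CONSTELLATION_BY_PAIR.get(
        (i_genotype, e_genotype), "mixed (possible reassortant)")


for pair in [("I1", "E1"), ("I2", "E2"), ("I3", "E3"), ("I2", "E3")]:
    print("-".join(pair), "->", classify_pair(*pair))
```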
Similarly, an 11-year study (1996-2006) conducted in Brazil, which
included both common and unusual strains, found the same three I
and E genotypes. In that study the most prevalent I and E genotypes
were I1 (82.7%) and E1 (81.5%), respectively. However, among strains
with an unusual G (G6, G8, G10) and/or P (P[6], P[9], P[10],
P[11], P[14]) genotype, I2 and E2 were the most prevalent I and
E genotypes (n=7/13),24 as in the present study. In
other epidemiological studies such as a 4-year study conducted in the
Democratic Republic of Congo, although the number of unusual G and/or P
strains recorded was substantial, only two I (I1, I2) and E (E1, E2)
genotypes were recorded.25
In this study, I3 and E3 were detected from 2019 and 2017 onwards,
respectively. Strains carrying the E3 genotype showed a significant
increase between 2019 and 2021, during the COVID-19 pandemic period, and
were mostly found in combination with G3-P[9]-I2. G3-P[9]-I2-E3
was the most prevalent G-P-I-E genotype combination throughout this
study. The increase in E3 was observed in the same period as the
increase of P[9] strains in Greece, as recorded by Tatsi et
al.16 The first record of the G3-P[9]-I2-E3
genotype in humans was in 2012 in Korea, where it was isolated from a
9-year-old female.26 However, a similar strain
(G3-P[9]-I2-R2-C2-M2-A3-N2-T3-E3-H3) was recently identified, in 2021
in Thailand, and originated from a feline with
diarrhoea.27
The rare combination of G3-P[9]-I2-E3 that was detected in this
study is possibly derived from a reassortment event. Reassortment is
common among RVs and is a crucial mechanism for the evolution of the
virus. Molecular characterization of multiple RVA genes is important, as
it may help to detect strains that do not fit into any of the
major constellations (Wa, DS-1 and AU-1) and are probably products of
reassortment events. Furthermore, this finding supports the view that VP6 and
NSP4 can segregate independently, contradicting a 2003 study that
reported a genetic linkage between these two proteins in common, unusual
and reassortant human strains.22 Consistent with our
observation, many studies have reported such reassortment events at VP6 and
NSP4, but at a lower rate. In an 11-year study (1996-2006) in Brazil,
the unusual I1-E2 genotype combination was found in 1.2% of
circulating strains.24 The combinations I2-E1 and
I1-E2 were detected in 15.4% of RVA strains in India during 1990-2000
and in 6.5% in Iran during 2021-2022.28,29
Multiple amino acid substitutions were detected in both the VP6 and NSP4 genes. While
many of these variants occupy key positions in the proteins, their
functional impact remains unknown. The VP6 protein is crucial, as it is used
in molecular and serological diagnostic tests for RVA due to its high
conservation.30,31 The genetic analysis in this study
showed that I2 was more conserved than I1 and I3, since only 16%
of the I2 strains carried substitutions. This finding is in concordance
with a similar study in South Africa, in which only the I1 and I2 genotypes
were described and I2 was found to be more conserved than I1, as it carried
only two substitutions.31
VP6 also contains four major antigenic regions.32 Nyaga et al. described many substitutions in I1 antigenic region III,
three of which (Y248F, V252I and I253V) were also present in our
samples, but their functional effect is unknown.31 Changes in the antigenic regions should be closely monitored, since they
could potentially affect the efficiency of rotavirus detection
methods and the future development of a VP6-based vaccine, as VP6 also
induces the development of neutralizing antibodies, like the capsid
proteins VP7 and VP4.33
NSP4 is an essential protein for virus morphogenesis and pathogenesis.
In the present study, eight possibly novel substitutions were
found in the NSP4 gene. Most substitutions were detected in the VP4-binding
domain, which also contains the toxic peptide and the interspecies
variable domain (ISVD). According to other studies characterizing the
nucleotide sequence of the NSP4 gene, the ISVD region shows great
heterogeneity and its amino acids vary according to
genotype.18,34–37 Few functional studies exist,
and therefore the effects of these variants on the functionality and
immunogenicity of the corresponding protein remain unknown.
Of interest are the substitutions at amino acid 131 in the region of the
toxic peptide: the majority of the strains in this study
carried H131, while Y131 was found mainly in E2 strains. Ball et al.
conducted a functional study of this amino acid in infant mice and
found that substitutions at amino acid 131 affect the
enterotoxin properties of NSP4.38 Specifically, they
reported that the Y131K substitution resulted in the absence of
diarrhoea. Studies from Brazil covering 1990-2000 and 1987-2003 have
reported that Y131 was detected only in E2 strains, while E1 strains had
H131, and there were no data regarding E3 strains.39,40 Srivastava et al. showed that patients infected with a strain carrying
Y131 experienced more severe diarrhoea.34 Even though
the severity of symptoms was not evaluated in the present study,
statistical analysis showed that children infected with an unusual
strain carrying the E2 genotype were more likely to exhibit
dehydration, which may indicate more severe diarrhoea. This result may
also be related to the fact that Y131 was detected more often in E2 strains.
Limitations of the present study included partial sequencing of the VP6
gene and moderate detection rates of both VP6 and NSP4 genes in
RVA-positive fecal samples. However, similar detection rates have also
been reported in other studies, possibly due to poor sample storage
conditions or the presence of RNases resulting in fragmentation of the
viral RNA genome, presence of PCR inhibitors or inability of primers to
hybridize.9,41 Another limitation of our study was
that the analysis was based on only four genes (VP7, VP4, VP6 and NSP4)
and not on the complete genotype constellation, which would provide more
information about the genetic evolution of the strains.
This is the first study of VP6 and NSP4 epidemiology and molecular
characterization of unusual RVA strains in Greece, in which the unusual
I3 and E3 genotypes, the reassortant I2-E3 human strains and many
substitutions in significant domains of VP6 and NSP4 genes were
detected. Furthermore, a significant clinical association between
dehydration and E2 genotype was described.
In this study, the genotype distribution of the VP6 and NSP4 genes in
unusual rotavirus strains was described. The association between RVA
genotype and the severity of symptoms needs to be further
investigated. Continuous whole-genome surveillance of the distribution of RVA
genotypes, together with molecular characterization and
association with epidemiological and clinical data, is important
for better knowledge of the virus’ evolution, disease prognosis
and the updating of RVA vaccines.