Spike protein mutation sites
The spike (S) glycoprotein, which mediates entry into host cells and
therefore determines host specificity, is the most intensively
investigated coronavirus protein. The S protein is composed of a
putative N-terminal signal peptide, S1, which contains the
receptor-binding domain (RBD), and S2. Because there are many sporadic
mutations, we show only some representative mutations that occurred
frequently in the early submitted genomes. Thanks to the cryo-EM
structure of the SARS-CoV-2 S protein (PDB ID: 6vsb), all of these
mutated sites could be analyzed in the context of the 3D structure.
Twelve mutations were mapped onto the structure (Figure 4)
and six more mutations (L5F, N74K, Y144del., G181V, S247R, G476S) were
not shown in the structure because of its limited resolution and
sequence coverage. In addition to one mutation (L5F) in the signal
peptide and three in the S2 fragment (F797C, A930V, D936Y), fourteen
mutations appeared in the S1 fragment. Specifically, four mutations
(A348T, R408I, D428E, G476S) were found in the RBD (upper left corner)
and ten mutations (Y28N, H49Y, L54F, N74K, Y144del., F157L, G181V,
S221W, S247R, D614G) were found in other parts of S1.
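
The bookkeeping in the paragraph above can be checked with a short script. This is only an illustrative sketch and not part of the original analysis: the region labels, variable names, and the `parse_mutation` helper are hypothetical, and the mutation lists are copied from the text (with "Y144del." written as "Y144del"). It parses each label into reference residue, position, and change, then tallies mutations per region and separates those mapped onto the structure from those not shown.

```python
import re

# Mutations grouped by the region reported in the text.
# Region names are illustrative labels, not identifiers from the paper.
MUTATIONS_BY_REGION = {
    "signal peptide": ["L5F"],
    "S1 (RBD)": ["A348T", "R408I", "D428E", "G476S"],
    "S1 (outside RBD)": [
        "Y28N", "H49Y", "L54F", "N74K", "Y144del",
        "F157L", "G181V", "S221W", "S247R", "D614G",
    ],
    "S2": ["F797C", "A930V", "D936Y"],
}

# Mutations the text says are not shown in the structure.
NOT_SHOWN = {"L5F", "N74K", "Y144del", "G181V", "S247R", "G476S"}

# One-letter reference residue, position, then a substituted residue or "del".
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z]|del)$")


def parse_mutation(label):
    """Split a label such as 'D614G' into (reference, position, change)."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    ref, pos, change = match.groups()
    return ref, int(pos), change


def region_counts():
    """Return the number of listed mutations in each region."""
    return {region: len(labels) for region, labels in MUTATIONS_BY_REGION.items()}


def split_by_structure():
    """Return (mapped, not_shown) label lists in the order given in the text."""
    all_labels = [l for labels in MUTATIONS_BY_REGION.values() for l in labels]
    mapped = [l for l in all_labels if l not in NOT_SHOWN]
    not_shown = [l for l in all_labels if l in NOT_SHOWN]
    return mapped, not_shown


if __name__ == "__main__":
    for region, count in region_counts().items():
        print(f"{region}: {count}")
    mapped, not_shown = split_by_structure()
    print(f"mapped onto structure: {len(mapped)}, not shown: {len(not_shown)}")
```

The per-region tallies agree with the counts stated in the text: one in the signal peptide, fourteen in S1 (four in the RBD and ten elsewhere), and three in S2, with twelve mapped onto the structure and six not shown.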
Figure 4