3.2 Three types of hotspots: weak, moderate and strong hotspots
The difference in hotspot residue preference between the PPI and PPepI
datasets is not clearly evident from the overall frequency
distribution (Fig. 1). We therefore clustered the data obtained from
residue scanning. The clustering suggested between three and five
possible clusters. After manual examination of the trends, we considered
it reasonable to divide the hotspot residues into three types. The
differences between hotspot residues were most pronounced in three
approximate ∆∆G ranges, which we refer to as weak hotspots (loss in ∆∆G
in the 2-10 kcal/mol range), moderate hotspots (∆∆G in the 10-20
kcal/mol range) and strong hotspots (∆∆G >20 kcal/mol).
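The three ranges can be written as a simple binning rule. The following is a minimal sketch, not the procedure used in this work: the function name `classify_hotspot` is hypothetical, and the handling of the boundary values is an assumption, since the text gives only approximate ranges (here 10 kcal/mol is assigned to the moderate type, and 20 kcal/mol also falls in the moderate type because the strong type is defined as >20 kcal/mol).

```python
from typing import Optional


def classify_hotspot(ddg: float) -> Optional[str]:
    """Assign a hotspot type from the loss in ∆∆G (kcal/mol).

    Assumed boundary handling (not stated in the text):
    2 <= ∆∆G < 10 is weak, 10 <= ∆∆G <= 20 is moderate, ∆∆G > 20 is strong.
    Values below 2 kcal/mol are not treated as hotspots and return None.
    """
    if ddg < 2.0:
        return None
    if ddg < 10.0:
        return "weak"
    if ddg <= 20.0:
        return "moderate"
    return "strong"


# Illustrative values only, not taken from the dataset.
example_types = [classify_hotspot(x) for x in (1.0, 5.0, 15.0, 25.0)]
```

A different convention at the boundaries would only affect residues whose ∆∆G lies exactly on a threshold.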
Out of 3732 hotspots, a large majority, 68.7% (2565), belong to the weak
hotspot type. In the PPI dataset, Gln, Leu and Tyr are the most
preferred residues, followed by Asn, Val, Lys, Glu, Ser and Pro, which
also have a substantial presence at the PPI interface. In contrast, in
PPepI, Leu and Tyr are the most preferred hotspot residues, with Leu
making an overwhelming contribution to the distribution. Val, Thr, Pro
and Ile also occur at high frequencies (Fig. 2). Thus, among weak
hotspots in PPI, polar residues occur most often, followed by
hydrophobic residues, with a minor fraction of charged residues. In the
PPepI data, on the other hand, hydrophobic residues are preferred over
polar residues. A somewhat similar trend was observed for anchor
residues in the PPI category, even though very few data points fall in
the weak type: the frequency of Gln is the highest, followed by Asn and
Lys. In PPepI, the paucity of data precluded us from making any reliable
predictions.
The data for the moderate type (∆∆G in the 10-20 kcal/mol range) are
shown in Fig. 3. About 25.4% of the data (949) belong to the moderate
hotspot type. In contrast to the weak type, Arg is the most frequent
residue (~18%), followed by Tyr (~12%), while Lys and Leu also have
sizable frequencies (about 10%) in the distribution. Thus, the
distribution in the PPI category is dominated by charged and polar
residues, with a minor fraction of hydrophobic residues. In contrast,
the PPepI distribution shows a substantial presence of polar (Tyr),
hydrophobic (Leu, Ile) and charged (Arg) residues.
Among the anchor residues in PPI, Leu is dominant, followed by Arg, Tyr
and Gln. In PPepI, however, the highest frequencies were observed only
for the hydrophobic residues Leu, Ile, Val and Phe.
Out of 3732 hotspots, only 5.8% (218) belong to the strong hotspot
type. In PPI, the strong hotspot type is dominated by Arg, which is the
single most frequent residue at ~42%. In the PPepI category, Arg and Trp
are the dominant residues, with frequencies of ~26% and ~20%,
respectively (Fig. 4). A similar trend was again observed for anchor
residues: Arg is predominant in PPI, while Arg and Trp are preferred in
PPepI. Thus, besides Arg, the bulky hydrophobic side chain of Trp also
serves as a suitable anchor residue in the PPepI category.
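As a quick consistency check, the shares of the three hotspot types can be recomputed from the counts reported in this section (2565 weak, 949 moderate and 218 strong out of 3732). The snippet below is only an arithmetic sketch; the variable names are illustrative.

```python
# Counts of hotspots per type, as reported in the text.
counts = {"weak": 2565, "moderate": 949, "strong": 218}

# The three types should account for all 3732 hotspots.
total = sum(counts.values())

# Percentage share of each type, rounded to one decimal place.
shares = {name: round(100.0 * n / total, 1) for name, n in counts.items()}
```

The three counts sum to the reported total, and the rounded shares match the percentages quoted for each type.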
Thus, going from the weak to the strong hotspot type, the gap between
the PPI and PPepI categories tends to close. In the weak type the
differences are prominent: polar residues followed by hydrophobic
residues and a minor fraction of charged residues in PPI, versus
hydrophobic followed by polar residues in PPepI. In the moderate type,
the interactions in PPI shift towards the polar side, with charged and
polar residues dominating, whereas hotspots in PPepI are represented by
all three residue types: polar, hydrophobic and charged. Finally, in the
strong type, Arg alone dominates the distribution in PPI, while in PPepI
both Arg and Trp are overwhelmingly present (Table 3).
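The per-type residue frequencies compared above can be tabulated along the following lines. This is a sketch under stated assumptions rather than the analysis code used here: the record layout `(dataset, hotspot_type, residue)` and the function name `residue_frequencies` are hypothetical, and the toy records are made up purely to show the calculation.

```python
from collections import Counter, defaultdict


def residue_frequencies(records):
    """Return percentage frequencies of residues per (dataset, hotspot type).

    `records` is an iterable of (dataset, hotspot_type, residue) tuples;
    this layout is an assumption made for illustration.
    """
    counters = defaultdict(Counter)
    for dataset, hotspot_type, residue in records:
        counters[(dataset, hotspot_type)][residue] += 1

    frequencies = {}
    for key, counter in counters.items():
        group_total = sum(counter.values())
        # most_common() orders residues from most to least frequent.
        frequencies[key] = {
            residue: 100.0 * n / group_total
            for residue, n in counter.most_common()
        }
    return frequencies


# Toy records, not real data from the PPI or PPepI datasets.
toy_records = [
    ("PPI", "strong", "Arg"),
    ("PPI", "strong", "Arg"),
    ("PPI", "strong", "Trp"),
    ("PPepI", "strong", "Trp"),
]
toy_frequencies = residue_frequencies(toy_records)
```

Normalising within each (dataset, hotspot type) group keeps the PPI and PPepI distributions comparable even when the groups differ greatly in size.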