Accurate prediction of protein secondary structure (alpha-helix, beta-strand and coil) is a crucial step for protein inter-residue contact prediction and ab initio tertiary structure prediction. In a previous study, we developed a deep belief network-based protein secondary structure prediction method (DNSS1) and advanced the prediction accuracy beyond 80%. In this work, we developed multiple advanced deep learning architectures (DNSS2) to further improve secondary structure prediction. The major improvements over DNSS1 are (i) the design and integration of six advanced one-dimensional deep convolutional/recurrent/residual/memory/fractal/inception networks to predict secondary structure, and (ii) the use of more sensitive profile features derived from hidden Markov models (HMMs) and multiple sequence alignments (MSAs). Most of these deep learning architectures are novel for protein secondary structure prediction. DNSS2 was systematically benchmarked against eight state-of-the-art tools on two independent test datasets and consistently ranked as one of the best methods. In particular, DNSS2 was tested on the 82 protein targets of the 2018 CASP13 experiment and achieved the best Q3 score of 83.74% and SOV score of 72.46%. DNSS2 is freely available at: https://github.com/multicom-toolbox/DNSS2.
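For readers unfamiliar with the Q3 metric reported above, the following is a minimal sketch of how a three-state accuracy is commonly computed: the percentage of residues whose predicted state (H for helix, E for strand, C for coil) matches the observed state. The function name `q3_score` and the ten-residue strings are hypothetical illustrations, not part of DNSS2, and the SOV score is not covered here.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose three-state label matches.

    Both arguments are strings over the alphabet H (helix), E (strand),
    C (coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical ten-residue example: one helix residue is mispredicted as coil.
observed = "CHHHHEEECC"
predicted = "CHHHCEEECC"
print(f"Q3 = {q3_score(predicted, observed):.1f}%")  # prints "Q3 = 90.0%"
```

A benchmark figure such as the one quoted above would be obtained by aggregating this per-residue agreement over all test proteins; whether the average is taken per protein or per residue is a choice this sketch does not address.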
Protein-protein interactions (PPIs) are ubiquitous and functionally important in biological systems. Hence, accurate prediction of PPIs by protein-protein docking and scoring tools is highly desirable for characterizing their structure and biological function. Ab initio docking protocols are divided into two stages: sampling docking poses to produce at least one near-native structure, then evaluating the large set of candidate structures by scoring. Concurrent development in both sampling and scoring is crucial for the deployment of protein-protein docking software. In the present work, we apply a machine learning model to pairwise potentials to improve the detection of native protein quaternary structures among decoys. A decoy set was featurized using the Knowledge and Empirical Combined Scoring Algorithm 2 (KECSA2) pairwise potential. The highly unbalanced decoy set was then balanced using a comparison concept between native and decoy structures. The resultant comparison descriptors were used to train a logistic regression (LR) classifier. The LR model yielded the best performance for native detection among decoys compared to conventional scoring functions, but performed less well at detecting low root mean square deviation (RMSD) decoy structures. Its deployment on an independent benchmark set confirms that the scoring function performs competitively relative to other scoring functions. All data and scripts used are available at: https://github.com/TanemuraKiyoto/PPI-native-detection-via-LR.
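One way to picture the comparison concept described above is sketched below. It is an illustration under stated assumptions, not the authors' implementation: it assumes each structure is already reduced to a fixed-length feature vector (standing in for KECSA2 pairwise-potential descriptors), that a comparison descriptor is the element-wise difference between two structures, and that each native-decoy pair contributes one sample in each order so the two classes are balanced. All function names, feature values, and hyperparameters are hypothetical, and the logistic regression is a plain gradient-descent version written without external libraries.

```python
import math
import random


def comparison_descriptors(native, decoys):
    """Turn one native and many decoys into a balanced two-class set.

    Assumption: each pair yields (native - decoy) with label 1 and
    (decoy - native) with label 0.
    """
    samples, labels = [], []
    for decoy in decoys:
        diff = [n - d for n, d in zip(native, decoy)]
        samples.append(diff)
        labels.append(1)
        samples.append([-x for x in diff])
        labels.append(0)
    return samples, labels


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(samples, labels, lr=0.1, epochs=200):
    """Fit weights by batch gradient descent on the log loss (no bias term,
    because the comparison descriptors are antisymmetric by construction)."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(samples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j, xi in enumerate(x):
                grad[j] += (p - y) * xi
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad)]
    return weights


def prefers_first(weights, a, b):
    """True if the model judges structure a more native-like than b."""
    diff = [ai - bi for ai, bi in zip(a, b)]
    return sigmoid(sum(w * x for w, x in zip(weights, diff))) > 0.5


# Hypothetical toy data: the native has lower (more favourable) feature values.
rng = random.Random(0)
native = [-5.0, -3.0, -4.0]
decoys = [[v + rng.uniform(0.5, 3.0) for v in native] for _ in range(50)]

samples, labels = comparison_descriptors(native, decoys)
weights = train_logistic_regression(samples, labels)
print(prefers_first(weights, native, decoys[0]))
```

Under these assumptions, a 1:50 native-to-decoy ratio becomes a 50:50 class split, and the trained model ranks any two structures by asking which of the pair looks more native-like.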
Allostery governing two conformational states is one of the proposed mechanisms for catch-bond behavior in adhesion proteins. In FimH, a catch-bond protein expressed by pathogenic bacteria, separation of two domains disrupts inhibition by the pilin domain. Thus, tensile force can induce a conformational change in the lectin domain, from an inactive state to an active state with high affinity. To better understand allosteric inhibition in two-domain FimH (H2 inactive), we use molecular dynamics simulations to study the lectin domain alone, which has high affinity (HL active), and also the lectin domain stabilized in the low-affinity conformation by an Arg-60-Pro mutation (HL mutant). Because ligand binding induces an allostery-like conformational change in HL mutant, this more experimentally tractable version has been proposed as a “minimal model” for FimH. We find that HL mutant has larger backbone fluctuations than both H2 inactive and HL active at the binding pocket and the allosteric interdomain region. We use an internal coordinate system of dihedral angles to identify protein regions with differences in backbone and sidechain dynamics beyond the putative allosteric pathway sites. By characterizing HL mutant dynamics for the first time, we provide additional insight into the transmission of allosteric information across the lectin domain and build upon structural and thermodynamic data in the literature to further support the use of HL mutant as a “minimal model.” Understanding how to alter protein dynamics to prevent the allosteric conformational change may guide drug development to prevent infection by blocking FimH adhesion.
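The internal coordinate system mentioned above is built from dihedral angles, each defined by four consecutive atoms. The sketch below shows one standard way to compute a single dihedral angle from Cartesian coordinates; it is not taken from the study's analysis code, the function names are hypothetical, and the sign convention is simply the one produced by this particular formula. The coordinates in the example are made-up points, not FimH atoms.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees for four points, about the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)

    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)

    # Project the outer bonds onto the plane perpendicular to the central bond.
    s0 = _dot(b0, b1)
    s2 = _dot(b2, b1)
    v = (b0[0] - s0 * b1[0], b0[1] - s0 * b1[1], b0[2] - s0 * b1[2])
    w = (b2[0] - s2 * b1[0], b2[1] - s2 * b1[1], b2[2] - s2 * b1[2])

    # The angle between the projections is the dihedral angle.
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up points in a cis arrangement: both outer atoms on the same side.
cis = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
print(f"{abs(cis):.1f}")  # prints "0.0"
```

Applied to successive backbone atoms in every frame of a trajectory, a function like this yields time series of phi and psi angles, whose distributions can then be compared between protein variants.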