2.2 Bioinformatic and evolutionary analyses
To investigate changes in INH genes during plant evolution, the PpINH1 protein sequence was used to query the One Thousand Plant Transcriptomes database (https://db.cngb.org/onekp/) using BLASTP. Database sequences that aligned with expectation values (E-values) less than 10⁻¹⁰ were retained as candidates. Variations in INH gene copy number during plant evolution were investigated using the Phytozome database (https://phytozome.jgi.doe.gov/pz/portal.html). In this analysis, PpINH1 was aligned against database sequences using BLASTP with the same significance criterion (E-value < 10⁻¹⁰). To analyze INH gene conservation, the nucleic acid sequences of candidate INH genes from Amborella trichopoda, Prunus persica, Brassica rapa, Medicago truncatula, and Solanum lycopersicum were downloaded from Phytozome. Finally, 19 nucleic acid sequences were aligned with ClustalW, and phylogenetic trees were constructed using the maximum likelihood method in MEGA 7.0 with 1000 bootstrap replicates.
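The E-value criterion described above can be illustrated with a minimal sketch. It assumes the BLASTP hits were exported in the standard 12-column tabular format, in which the E-value is the eleventh field; the export format is an assumption, as the text does not state it. The function name filter_hits and the example hit identifiers are hypothetical and serve only as illustration.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-10  # significance criterion stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (subject_id, e_value) pairs whose E-value is below the cutoff.

    Assumes standard 12-column BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    hits = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines
        if not row or row[0].startswith("#"):
            continue
        subject_id = row[1]
        e_value = float(row[10])
        if e_value < cutoff:  # strictly less than, as in the text
            hits.append((subject_id, e_value))
    return hits


# Made-up rows for illustration only; these are not results from the study.
EXAMPLE = "\n".join([
    "PpINH1\thit_A\t62.5\t160\t60\t0\t1\t160\t1\t160\t3e-45\t170",
    "PpINH1\thit_B\t30.1\t90\t63\t2\t5\t94\t8\t97\t2e-08\t48",
    "PpINH1\thit_C\t35.0\t120\t78\t1\t3\t122\t4\t123\t1e-10\t60",
    "PpINH1\thit_D\t41.2\t150\t88\t1\t2\t151\t3\t152\t5e-12\t66",
])

candidates = filter_hits(EXAMPLE)
for subject_id, e_value in candidates:
    print(subject_id, e_value)
```

In this sketch a hit whose E-value is exactly equal to the cutoff is excluded, which matches the strict "less than" wording of the criterion.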
Isoelectric points and molecular masses were predicted using Compute pI/Mw (https://web.expasy.org/compute_pi/). Signal peptides and locations of putative signal peptide cleavage sites were predicted using ProP 1.0 (http://www.cbs.dtu.dk/services/ProP/). The amino acid sequences of PpINH1 and PpVIN2 were compared against entries in the Protein Data Bank (http://www.rcsb.org/) using BLASTP. The crystal structure of a cell wall invertase inhibitor from tobacco (PDB-ID 1RJ1) (Hothorn et al., 2004), which shares 40% sequence identity with PpINH1, was used as a model to predict the three-dimensional structure of PpINH1. The three-dimensional structure of PpVIN2 was predicted from the crystal structure of a 6-SST/6-SFT from Pachysandra terminalis (PDB-ID 3UGF) (Lammens et al., 2012), which shares 67% sequence identity with PpVIN2. The resulting PDB file was analyzed using PyMOL (http://www.pymol.org).
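As an illustration of how a sequence identity value such as those quoted for the structural templates can be obtained from a pairwise alignment, a minimal sketch follows. It assumes identity is defined as the number of identical residues divided by the number of aligned (gap-free) columns; other definitions use a different denominator, and the text does not state which was applied. The function name percent_identity and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of a pairwise alignment.

    Both inputs must be rows of the same alignment, so they must have
    equal length. Columns where either row has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned_columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped column: not counted in the denominator
        aligned_columns += 1
        if a.upper() == b.upper():
            matches += 1
    if aligned_columns == 0:
        return 0.0
    return 100.0 * matches / aligned_columns


# Made-up aligned fragments for illustration only.
identity = percent_identity("MKT-AY", "MRTLAY")
print(identity)
```

Because the choice of denominator changes the reported value, the definition used should be stated alongside any identity figure.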