Identifying fibroin and sericin genes in the S. ricini genome
Fib-H (BAQ55621.1) and p25 (LC001863.1, LC001864.1 and LC001865.1) of
S. ricini were already registered in GenBank. Using those sequences as
queries, we conducted a BLASTP search against the 16,702 gene models of
S. ricini with an e-value threshold of 1e-5 and the ‘-seg no’ option.
When BLASTP returned ‘No hits found,’ a TBLASTN search against the
nucleotide sequences of the S. ricini genome was conducted with the
same parameters. Throughout this report, the ‘-evalue 1e-5’ and
‘-seg no’ options were always used when BLASTP and TBLASTN searches
were conducted with silk proteins (Fib-H, Fib-L, p25, sericin) as
queries. To investigate whether a homolog of Fib-L is present in the
S. ricini genome, B. mori Fib-L (NP_001037488.1) was used as the query
for BLASTP and TBLASTN searches. In addition, we performed a TBLASTN
search against the A. yamamai genome using the B. mori Fib-L sequence
as the query.
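
The search settings described above can be sketched as a small helper that assembles the BLAST+ command line. This is only an illustration under stated assumptions: the file and database names are hypothetical, and the tabular output format (‘-outfmt 6’) is our choice for the example, not something stated in the text. Only ‘-evalue 1e-5’ and ‘-seg no’ come from the description above.

```python
from typing import List


def build_blast_command(program: str, query_fasta: str,
                        database: str, out_path: str) -> List[str]:
    """Assemble a BLAST+ argument list with the two options named in the text.

    program  : "blastp" (protein vs. gene models) or "tblastn" (protein vs. genome)
    database : name of a BLAST database built beforehand (hypothetical name here)
    """
    if program not in ("blastp", "tblastn"):
        raise ValueError("program must be 'blastp' or 'tblastn'")
    return [
        program,
        "-query", query_fasta,
        "-db", database,
        "-evalue", "1e-5",   # e-value threshold stated in the text
        "-seg", "no",        # low-complexity (SEG) filtering switched off, as in the text
        "-outfmt", "6",      # assumption: tabular output, chosen for easy parsing
        "-out", out_path,
    ]


# Hypothetical file names, for illustration only.
blastp_cmd = build_blast_command(
    "blastp", "silk_queries.faa", "sricini_gene_models", "blastp_hits.tsv")
tblastn_cmd = build_blast_command(
    "tblastn", "silk_queries.faa", "sricini_genome", "tblastn_hits.tsv")
print(" ".join(blastp_cmd))
```

The resulting list could be passed to `subprocess.run` on a machine where BLAST+ is installed. Keeping both programs behind one helper makes it easy to check that identical parameters are applied to the BLASTP and TBLASTN searches.
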
Tsubota et al. (2015) and Dong et al. (2015) reported that five and
four sericin genes are expressed in the anterior silk gland and the
middle silk gland, respectively (Table S7). The deduced amino acid
sequences of the putative sericin transcripts were used as queries in a
BLASTP search against the gene model set of S. ricini. For LC001867 and
LC001870, no corresponding gene models were found, so TBLASTN was
conducted to confirm whether both transcripts were present in the
genome.
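
The fallback from BLASTP to TBLASTN can be illustrated as follows. The sketch assumes that the BLASTP results were written in tabular format (one hit per line, with the query identifier in the first tab-separated column). The query identifiers and the hit line are hypothetical placeholders, not results from this study.

```python
from typing import Iterable, List


def queries_without_hits(query_ids: Iterable[str],
                         blast_tabular_text: str) -> List[str]:
    """Return the queries that have no hit line in tabular BLAST output.

    Assumes the query identifier is the first tab-separated field of each
    hit line; blank lines and comment lines starting with '#' are skipped.
    """
    queries_with_hits = set()
    for line in blast_tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        queries_with_hits.add(line.split("\t")[0])
    return [q for q in query_ids if q not in queries_with_hits]


# Hypothetical example: query_B has no matching gene model.
example_output = "query_A\tgene_model_1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-120\t400\n"
needs_tblastn = queries_without_hits(["query_A", "query_B"], example_output)
print(needs_tblastn)
```

Queries returned by this helper are the ones that would then be searched against the genome nucleotide sequence with TBLASTN, using the same parameters.
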
To characterise the repertoire of silk protein-encoding genes in
D. plexippus and P. xylostella, TBLASTN searches against the genome
assemblies were conducted with B. mori Fib-H (NP_001106733.1), Fib-L,
p25 (NP_001139413.1) and sericin-1, 2, 3 (AB112019.1, NP_001166287.1,
NP_001108116.1) sequences as queries, because no transcripts or amino
acid sequences had previously been reported as Fib-H, Fib-L, p25 or
sericin in P. xylostella and D. plexippus. The genome assemblies used
for the TBLASTN searches were the ones used in the BUSCO analysis
(Table S5). As the transcripts of Fib-H, Fib-L and p25 of P. xuthus
were already registered (see Table 3), those sequences were mapped to
the P. xuthus genome sequence to confirm their presence. No sericin
sequences of P. xuthus were previously registered in GenBank, so the
same procedure as for P. xylostella and D. plexippus was followed.
Phylogenetic analysis of sericin was conducted with seven S. ricini
putative sericin genes, three B. mori sericin genes and five
A. yamamai sericin genes (LC08587, LC08588, LC08589, LC08590 and
LC08591; Zurovec et al., 2016). MUSCLE was used to generate alignments
of protein sequences (Edgar, 2004). Aligned sequences were subjected to
phylogenetic analysis by the maximum likelihood method with 1,000
bootstrap replicates using MEGA X (Kumar, Stecher, Li, Knyaz, & Tamura,
2018).
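
For readers unfamiliar with bootstrapping in phylogenetics, the resampling step can be sketched as below. This is a generic illustration of the principle (alignment columns are drawn with replacement to build each replicate), not a description of how MEGA X implements it. The toy alignment, the function names and the random seed are hypothetical.

```python
import random
from typing import Dict, List


def bootstrap_alignment(alignment: Dict[str, str],
                        rng: random.Random) -> Dict[str, str]:
    """Build one bootstrap replicate by sampling columns with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    sampled = [rng.randrange(n_columns) for _ in range(n_columns)]
    # Apply the same sampled columns to every sequence so columns stay intact.
    return {name: "".join(seq[i] for i in sampled)
            for name, seq in alignment.items()}


def bootstrap_replicates(alignment: Dict[str, str],
                         n_replicates: int = 1000,
                         seed: int = 0) -> List[Dict[str, str]]:
    """Generate n_replicates resampled alignments (1,000 as in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy aligned protein fragments, invented for illustration only.
toy_alignment = {"seq1": "MKS-LV", "seq2": "MRSALV", "seq3": "MKT-LI"}
replicates = bootstrap_replicates(toy_alignment, n_replicates=1000)
print(len(replicates))
```

In a full analysis, a tree would be inferred from each replicate, and the support for a branch would be the fraction of replicate trees in which that branch appears.
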