Repetitive sequences found in the assembled genome
The RepeatMasker program (Tarailo-Graovac & Chen, 2009) estimated that
repeat elements occupy 43.5% (196,045,652 bp) of the assembled genome
(Table 1). Excluding ‘unclassified’ repeats, LINEs form the largest
superfamily of repetitive sequences in S. ricini (Figs. 2A, B).
Interestingly, although the total length of LINEs and their share of
all repetitive sequences in the genome were similar between S.
ricini and B. mori (Figs. 2A, B), the family composition of
LINEs differed. Table S8 shows the copy number of each LINE family
in the S. ricini and B. mori genomes. For example, while the
CR1-Zenon family was the largest LINE family in S. ricini, the
largest family in B. mori was Jockey. Given these results,
although both S. ricini and B. mori have larger amounts of
repetitive sequences in their genomes than other lepidopteran species do
(Fig. 2A), the expansion of repetitive sequences seems to have occurred
in parallel and independently on their respective phylogenetic branches.
Another noteworthy feature was that the S. ricini genome contains
very few SINEs (Fig. 2A). While SINEs made up a large proportion of repeats in the B. mori genome (19.4% of all repetitive
sequences), SINEs in the S. ricini genome accounted for only 0.0588%.
This finding also supports the hypothesis of parallel and independent
expansion of repetitive sequences.
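
As an illustration of how per-class proportions such as those compared above could be tallied, the following minimal Python sketch sums annotated lengths per repeat class from RepeatMasker .out-style rows and expresses each class as a percentage of all annotated repeat bases. It is not the procedure used in this study. The function names (class_lengths, shares), the scaffold name, and the repeat names in the toy rows are hypothetical, and the sketch assumes non-overlapping annotations, so overlapping hits would be counted twice.

```python
from collections import defaultdict

def class_lengths(out_lines):
    """Sum annotated bp per repeat class from RepeatMasker .out-style rows."""
    totals = defaultdict(int)
    for line in out_lines:
        fields = line.split()
        # Data rows start with an integer score; header and blank lines do not.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        begin, end = int(fields[5]), int(fields[6])   # position in query
        repeat_class = fields[10].split("/")[0]       # "LINE/CR1-Zenon" -> "LINE"
        totals[repeat_class] += end - begin + 1       # coordinates are inclusive
    return dict(totals)

def shares(totals):
    """Express each class as a percentage of all annotated repeat bases."""
    grand_total = sum(totals.values())
    return {name: 100.0 * bp / grand_total for name, bp in totals.items()}

# Toy rows with made-up values (hypothetical scaffold and repeat names).
EXAMPLE = [
    "   SW   perc perc perc  query  position in query  matching  repeat",
    "1200 10.5 1.2 0.8 scaffold1 101 400 (9600) + rep_a LINE/CR1-Zenon 1 300 (0) 1",
    " 450 15.0 0.0 2.1 scaffold1 501 600 (9400) C rep_b SINE/tRNA (0) 100 1 2",
    "2300  8.3 0.5 0.5 scaffold1 1001 1600 (8400) + rep_c LINE/Jockey 1 600 (0) 3",
]

if __name__ == "__main__":
    totals = class_lengths(EXAMPLE)
    for name, pct in sorted(shares(totals).items()):
        print(f"{name}: {totals[name]} bp ({pct:.1f}% of annotated repeats)")
```

Splitting the class/family field on the slash also makes it straightforward to tally families within a class (for example, CR1-Zenon versus Jockey within LINE) by keeping the second component instead of the first.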