Protein-coding gene prediction
The Isoseq3 pipeline (https://github.com/pacificbiosciences/isoseq) was
used to process the full-length transcriptome data of Chinese flowering
cabbage and obtain the transcript sequences. In addition,
to obtain a more complete gene annotation, we integrated the gene
annotations of B. juncea (J. Yang et al., 2016), B. napus (Chalhoub et
al., 2014), B. oleracea (Liu et al., 2014), B. rapa (Zhang et al., 2018)
and B. nigra (W. Wang et al., 2019) as the reference gene sequences,
using CD-HIT-EST (https://github.com/weizhongli/cdhit) to remove sequence
redundancy. The repeat sequences identified by EDTA (Ou et al., 2019)
and TRF (Benson, 1999) were supplied as the reference repeat library to
MAKER (Cantarel et al., 2008) for 5 rounds of gene and repeat sequence
annotation.
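
The redundancy-removal step described above can be illustrated with a
minimal sketch. It is not the exact procedure used here: the species
tags, file names, identity threshold (0.95), word length (10) and thread
count are all assumed values chosen for illustration, and the helper
functions merge_fasta and cdhit_est_command are hypothetical. Only the
cd-hit-est options -i, -o, -c, -n and -T are taken from the CD-HIT
command-line interface.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def merge_fasta(sources: Iterable[Tuple[str, Path]], merged: Path) -> int:
    """Concatenate FASTA files, prefixing each record ID with a species tag.

    Returns the number of records written. The tags are hypothetical labels
    that keep identically named genes from different species distinct.
    """
    count = 0
    with merged.open("w") as out:
        for tag, fasta in sources:
            with fasta.open() as handle:
                for line in handle:
                    if line.startswith(">"):
                        line = f">{tag}|{line[1:]}"
                        count += 1
                    # Guarantee a trailing newline so records never run together.
                    out.write(line if line.endswith("\n") else line + "\n")
    return count


def cdhit_est_command(merged: Path, clustered: Path,
                      identity: float = 0.95, word_length: int = 10,
                      threads: int = 4) -> List[str]:
    """Build a cd-hit-est command line as an argument list.

    identity, word_length and threads are assumed values, not settings
    reported in the text.
    """
    return [
        "cd-hit-est",
        "-i", str(merged),          # merged input FASTA
        "-o", str(clustered),       # non-redundant output FASTA
        "-c", f"{identity:.2f}",    # sequence identity threshold
        "-n", str(word_length),     # word length used for clustering
        "-T", str(threads),         # number of threads
    ]
```

With cd-hit-est installed, the returned list can be passed directly to
subprocess.run(cmd, check=True); building the command separately from
running it keeps the chosen parameters easy to inspect and record.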