2.6 Iso-Seq and ssRNA-Seq data processing and lncRNA
identification
The RNA preparation, library construction, and sequencing for Iso-Seq
(Li et al., 2020) and ssRNA-Seq (Li et al., 2017) were described
previously. All sequencing data were deposited with NCBI under the
BioProject IDs PRJNA198574 and PRJNA377165. For Iso-Seq, total RNA was
extracted using TRIzol reagent (Life Technologies) and enriched for mRNA
with oligo(dT) magnetic beads. The enriched mRNA was reverse transcribed
into cDNA using the Clontech SMARTer PCR cDNA Synthesis Kit. Two libraries
(Normal and Cold) were constructed and sequenced on the Pacific
Biosciences (PacBio) Sequel II platform by Gene Denovo Biotechnology
Co., Ltd. (Guangzhou, China). The raw reads were classified and
clustered into consensus transcripts using the SMRT Link v5.0.1 pipeline
(Gordon, Tseng, Salamov, Zhang, Meng, Zhao, Kang, Underwood, Grigoriev,
Figueroa, Schilling, Chen & Wang, 2015) provided by PacBio and then
mapped to the reference genome using minimap2 (Li, 2018). Long non-coding
RNA identification was performed according to the pipeline described
previously (Li et al., 2017). The intersection of the transcripts
predicted to lack protein-coding potential and the transcripts without a
protein annotation was taken as the set of lncRNA candidates. For
ssRNA-Seq data processing, clean reads from two samples (Normal and
Cold, three replicates per sample) were mapped to the full-length lncRNA
isoforms and the cassava reference genome using HISAT (Kim, Langmead &
Salzberg, 2015). Read counts for each lncRNA were estimated with RSEM (Li
& Dewey, 2011), and transcript abundance was expressed as fragments per
kilobase of exon model per million mapped reads (FPKM). Differentially
expressed lncRNAs were identified using the DESeq2 package (Love, Huber &
Anders, 2014). Significant changes were defined by the cut-offs
|log2 FC| > 1 and q-value (false discovery rate, FDR, after
multiple-testing adjustment) < 0.05.
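
The FPKM definition and the significance cut-offs above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study, where RSEM and DESeq2 performed these steps. The function names, the lncRNA identifiers, and all numeric values are hypothetical, and the sketch assumes the plain transcript length, whereas RSEM reports FPKM based on an effective length.

```python
import math


def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (length in bp * total mapped fragments).
    Assumption: the plain transcript length is used, not an effective length.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_differentially_expressed(log2_fc, fdr, min_abs_log2_fc=1.0, max_fdr=0.05):
    """Apply the cut-offs |log2 FC| > 1 and FDR < 0.05 (both strict)."""
    # A missing adjusted p-value is treated as not significant.
    if fdr is None or math.isnan(fdr):
        return False
    return abs(log2_fc) > min_abs_log2_fc and fdr < max_fdr


# Hypothetical differential-expression results (Cold vs. Normal).
results = [
    {"id": "lnc_A", "log2_fc": 2.3, "fdr": 0.001},   # passes both cut-offs
    {"id": "lnc_B", "log2_fc": -1.0, "fdr": 0.001},  # |log2 FC| is not > 1
    {"id": "lnc_C", "log2_fc": -1.8, "fdr": 0.20},   # FDR too high
    {"id": "lnc_D", "log2_fc": -3.1, "fdr": 0.049},  # passes both cut-offs
]

significant = [
    r["id"] for r in results
    if is_differentially_expressed(r["log2_fc"], r["fdr"])
]

# Hypothetical transcript: 500 fragments, 2,000 bp, 20 million mapped fragments.
example_fpkm = fpkm(500, 2000, 20_000_000)
```

Both thresholds are strict inequalities, so a transcript with |log2 FC| exactly equal to 1 is not retained.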