Data processing and analysis
We used the nf-core/chipseq pipeline, version 1.2.2 (Ewels et al.,
2020; Patel et al., 2021), to identify differentially enriched peaks. The
pipeline includes Trim Galore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) for
trimming and adapter removal. We also used BWA-MEM (Li, 2013) to
map the reads to a high-contiguity genome assembly of the painted lady
(Lohse et al., 2021). The Model-based Analysis of ChIP-Seq
(MACS2) package (Zhang et al., 2008) was applied to identify read
coverage significantly higher than the random genome-wide variation and
to construct consensus peaks. To determine the number and location of
histone mark enrichment, we used only consensus peaks called in all
four groups (treatment × replicate). The analysis of differential
activation with hostplant availability as contrast was performed with
this requirement relaxed to account for differential enrichment of peaks
absent in one of the treatments. The peak count data were
transformed with voom (Law et al., 2014) and fitted with linear models in
the R package limma (Ritchie et al., 2015) to detect
differentially enriched peaks between the two treatment groups. To
correct for multiple testing, the p-values were adjusted with the
Benjamini-Hochberg false discovery rate procedure as implemented in limma.
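
For readers unfamiliar with the correction, the Benjamini-Hochberg adjustment can be illustrated with a minimal Python sketch. This is not the code used in the analysis, which relied on the R implementation; the function name `benjamini_hochberg` and the example p-values are hypothetical and serve only to show the arithmetic.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative sketch only; the analysis itself used the R implementation.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four peaks (not results from this study).
example_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(example_p)
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotone; a peak is then reported when its adjusted value falls below the chosen false discovery rate threshold.
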
We used previously available annotation information (Shipilina et al.,
2022) to identify the gene located closest to each differentially
activated region. Potential functions of candidate genes were obtained
from the annotation in combination with BLAST homology searches against
the NCBI database using the nucleotide sequence of each
candidate gene (Altschul et al., 1990). Additional functional
information was extracted from FlyBase (https://flybase.org).
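
The assignment of each region to its closest gene can be sketched as follows. The text does not specify the tool or the distance definition used, so this Python example is an assumption throughout: it measures distance between interval edges (zero when the region overlaps a gene), and the function name `nearest_gene`, the tuple layouts, and the coordinates are all hypothetical.

```python
def nearest_gene(peak, genes):
    """Return the name of the gene closest to a peak, or None.

    peak  : (chrom, start, end)
    genes : list of (name, chrom, start, end)
    Assumption: distance is edge-to-edge, and 0 for overlapping intervals.
    """
    chrom, p_start, p_end = peak

    def distance(gene):
        _, _, g_start, g_end = gene
        if g_end < p_start:      # gene lies entirely upstream of the peak
            return p_start - g_end
        if g_start > p_end:      # gene lies entirely downstream of the peak
            return g_start - p_end
        return 0                 # intervals overlap

    # Only genes on the same chromosome or scaffold are candidates.
    candidates = [g for g in genes if g[1] == chrom]
    if not candidates:
        return None
    return min(candidates, key=distance)[0]


# Hypothetical annotation and peaks (not data from this study).
genes = [
    ("geneA", "chr1", 100, 500),
    ("geneB", "chr1", 2000, 2600),
    ("geneC", "chr2", 900, 1200),
]
closest = nearest_gene(("chr1", 1700, 1900), genes)
```

Here the peak lies 1,200 bp from geneA and 100 bp from geneB, so `closest` is `"geneB"`. Other definitions, such as distance to the transcription start site, would be equally plausible and may give different assignments.
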
Results