Typical RNA-seq data analysis workflow:
- Trimming, i.e. removal of the adapter sequences and poor-quality nucleotides. Quality control checks help to indicate whether trimming has been carried out appropriately.
- Alignment to a reference genome
- Gene quantification and normalization with reference to a file containing gene positions (GTF file)
- Differential expression (DE) analysis across conditions
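As a small illustration of the normalization step, the sketch below computes counts per million (CPM), one common library-size normalization. The notes do not say which normalization method is used, so CPM is an assumption here, and the gene names and counts are made up.

```python
def counts_per_million(counts):
    """Scale raw per-gene read counts so that the library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no counted reads")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical raw counts for a single sample (illustrative values only).
raw_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
cpm = counts_per_million(raw_counts)

for gene, value in cpm.items():
    print(gene, value)
```

CPM corrects only for sequencing depth. Methods that also account for gene length need the gene positions from the GTF file.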
The STAR option --outSAMstrandField intronMotif adds an XS strand attribute to the spliced alignments in the BAM file, which is required by Cufflinks for unstranded RNA-seq data (Dobin and Gingeras 2015).
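The option above might be passed to STAR as in the following sketch, which only assembles the command line and prints it. The index directory, FASTQ file names, output prefix, and thread count are hypothetical placeholders, and the helper `star_align_command` is introduced here purely for illustration.

```python
import shlex


def star_align_command(genome_dir, reads, out_prefix, threads=4):
    """Build a STAR alignment command that writes a coordinate-sorted BAM
    with XS strand attributes on spliced alignments."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,             # directory holding a prebuilt STAR index
        "--readFilesIn", *reads,               # one file (single-end) or two (paired-end)
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outSAMstrandField", "intronMotif",  # adds the XS attribute needed by Cufflinks
        "--outFileNamePrefix", out_prefix,
    ]


# Hypothetical paths; replace with real ones before running.
star_cmd = star_align_command(
    "star_index", ["sample_R1.fastq", "sample_R2.fastq"], "sample_"
)
print(shlex.join(star_cmd))
```

To actually run the alignment, the list could be handed to `subprocess.run(star_cmd, check=True)`, assuming STAR is installed and on the PATH.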
HISAT2’s (Kim et al. 2019) alignment algorithm is based on a graph Ferragina-Manzini (GFM) index, which makes HISAT2 faster and more memory-efficient than STAR.
HISAT2 binaries and prebuilt indexes for H. sapiens and a few model organisms are available for download from the official website.
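Once an index has been downloaded, a paired-end alignment could be launched roughly as sketched below. The index prefix, FASTQ names, and output file are hypothetical placeholders, and the helper `hisat2_align_command` is illustrative only. The code builds and prints the command without running it.

```python
import shlex


def hisat2_align_command(index_prefix, reads_1, reads_2, out_sam, threads=4):
    """Build a HISAT2 command for paired-end reads against a prebuilt index."""
    return [
        "hisat2",
        "-p", str(threads),  # number of alignment threads
        "-x", index_prefix,  # index basename, without the .N.ht2 suffix
        "-1", reads_1,       # mate 1 reads
        "-2", reads_2,       # mate 2 reads
        "-S", out_sam,       # SAM output file
    ]


# Hypothetical paths; the index prefix depends on where the download was unpacked.
hisat2_cmd = hisat2_align_command(
    "hisat2_index/genome", "sample_R1.fastq", "sample_R2.fastq", "sample.sam"
)
print(shlex.join(hisat2_cmd))
```

As with any external tool, the list could be executed with `subprocess.run(hisat2_cmd, check=True)` once HISAT2 is installed.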