Next Generation Sequencing (NGS) technologies are increasingly being used for gene expression profiling as a replacement for microarrays. The expression level given by these technologies is the number of reads in the library mapping to a given feature (gene, exon, transcript, etc.), i.e., the read counts. Most of the statistical methods for assessment of differential expression using count data rely on parametric assumptions about the distribution of the counts (Poisson, Negative Binomial, …).
NOISeq is a non-parametric approach for the identification of differentially expressed genes from raw or previously normalized count data. NOISeq empirically models the noise distribution of count changes by contrasting fold-change differences (M) and absolute expression differences (D) for all the features in samples within the same condition. This reference distribution is then used to assess whether the M-D values computed between two conditions for a given gene are likely to be part of the noise or to represent true differential expression.
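The idea of comparing between-condition M-D values against a within-condition noise distribution can be sketched in a few lines of Python. This is only an illustrative sketch, not the package's implementation: the function names, the toy counts, the replacement of zero counts by a small constant k, the averaging of replicates per condition, and the rule "both |M| and D must exceed the noise point" are assumptions made for the example.

```python
import math
from itertools import combinations


def md_values(x, y, k=0.5):
    """Return a list of (M, D) pairs for two expression vectors.

    M is the log2 fold change and D the absolute difference.
    Zero values are replaced by k (an assumption) to avoid log(0).
    """
    pairs = []
    for a, b in zip(x, y):
        a = a if a > 0 else k
        b = b if b > 0 else k
        pairs.append((math.log2(a / b), abs(a - b)))
    return pairs


def noise_distribution(replicates, k=0.5):
    """Pool (M, D) values over every pair of replicates of one condition."""
    noise = []
    for r1, r2 in combinations(replicates, 2):
        noise.extend(md_values(r1, r2, k))
    return noise


def mean_profile(replicates):
    """Average the replicates of a condition, feature by feature."""
    return [sum(col) / len(col) for col in zip(*replicates)]


def prob_above_noise(m, d, noise):
    """Fraction of noise points with both smaller |M| and smaller D."""
    hits = sum(1 for nm, nd in noise if abs(nm) < abs(m) and nd < d)
    return hits / len(noise)


# Toy data (hypothetical): 2 replicates per condition, 3 genes each.
cond_a = [[10, 200, 50], [12, 190, 55]]
cond_b = [[11, 20, 52], [9, 25, 48]]

# Noise: changes observed between replicates of the same condition.
noise = noise_distribution(cond_a) + noise_distribution(cond_b)

# Signal: changes observed between the two conditions.
signal = md_values(mean_profile(cond_a), mean_profile(cond_b))
probs = [prob_above_noise(m, d, noise) for m, d in signal]
print(probs)
```

In this toy example only the second gene changes far more between conditions than any feature changes between replicates, so it is the only one whose M-D values stand out from the noise.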
NOISeq was tested on data sets with technical replicates. There are two variants of this method: NOISeq-real uses replicates, when available, to compute the noise distribution, and NOISeq-sim simulates them in the absence of replication. It should be noted that the NOISeq-sim simulation procedure mimics technical replication and does not reproduce biological variability, which is necessary for inference at the population level.
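To make the distinction concrete, one simple way to mimic technical replication is to resample reads from the observed library, with each feature drawn in proportion to its observed count (a multinomial draw). The sketch below assumes this resampling scheme for illustration; the function name and the depth_fraction parameter are hypothetical and are not the package's own interface, and the package's exact procedure may differ.

```python
import random
from collections import Counter


def simulate_technical_replicate(counts, depth_fraction=0.2, rng=None):
    """Draw a simulated replicate by resampling reads from one sample.

    counts: observed read counts per feature.
    depth_fraction: size of the simulated library relative to the
        observed one (hypothetical parameter for this sketch).
    rng: optional random.Random instance, for reproducibility.
    """
    rng = rng or random.Random()
    n_reads = max(1, round(sum(counts) * depth_fraction))
    features = range(len(counts))
    # Each simulated read lands on a feature with probability
    # proportional to that feature's observed count.
    draws = rng.choices(features, weights=counts, k=n_reads)
    tally = Counter(draws)
    return [tally.get(i, 0) for i in features]


observed = [100, 0, 900]  # toy counts, hypothetical
simulated = simulate_technical_replicate(observed, 0.2, random.Random(0))
print(simulated)
```

Because every simulated replicate is drawn from the same observed proportions, the only variation between them is sampling noise. This is why such a simulation resembles technical replication and carries no information about biological variability.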
Please find here an outline of the NOISeq method.
Both NOISeq and NOISeqBIO are included in the R/Bioconductor NOISeq package, together with a set of graphical tools to assess the quality of sequencing count data as well as the outcome of the differential expression analysis.