NOISeq


Differences

This shows you the differences between two versions of the page.

downloads [2015/05/28 15:04]
sotacam [Fusarium oxysporum data]
downloads [2015/06/02 15:48]
sotacam [Differential expression on simulated data]
Line 10:
  
  
-====== Data used to test the tools included in R NOISeq package ======
 +====== Data used to test the R NOISeq package ======
 + 
 +===== ENCODE data ===== 
 + 
 +RNA-seq data from human B-cells (CD20+ cell line) and monocytes (CD14+ cell line) were
 +obtained by Cold Spring Harbor Laboratory for the ENCODE project. Two different RNA extraction protocols were
 +applied: the PolyA+ extraction method (Pap) and the PolyA- selection procedure (Pam). Sequencing was performed on
 +an Illumina GAIIx platform. The read files were downloaded from the ENCODE website and mapped to the reference genome downloaded from UCSC (hg19 GRCh37) (42) using TopHat v2.0.8. Gene expression was quantified using the HTSeq Python package version 0.5.3p3 and an in-house script that accounts for multihits by dividing each read that maps to several genes equally among those genes.
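
The equal division of multihit reads described above can be illustrated with a minimal sketch. This is not the in-house script, which is not published on this page; the function name `fractional_counts`, the read-to-gene dictionary layout and the gene names are assumptions made only for illustration.

```python
from collections import defaultdict


def fractional_counts(read_to_genes):
    """Hypothetical helper: split each read equally among the genes it maps to.

    read_to_genes maps a read ID to the list of gene IDs it was assigned to.
    A read hitting k distinct genes adds 1/k to each of those genes.
    """
    counts = defaultdict(float)
    for genes in read_to_genes.values():
        unique_genes = set(genes)  # a gene hit twice by one read counts once
        if not unique_genes:
            continue  # unassigned read, contributes nothing
        share = 1.0 / len(unique_genes)
        for gene in unique_genes:
            counts[gene] += share
    return dict(counts)


# Toy input (made-up IDs): one unique read and two multihit reads.
reads = {
    "read1": ["geneA"],
    "read2": ["geneA", "geneB"],
    "read3": ["geneA", "geneB", "geneC", "geneC"],
}
counts = fractional_counts(reads)
for gene in sorted(counts):
    print(gene, round(counts[gene], 3))
```

With this scheme the per-gene counts are no longer integers, but their total still equals the number of assigned reads.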
 + 
 +{{:encodecountdata.txt.zip|ENCODE count matrix}}
 +  
  
 ===== Fusarium oxysporum data =====
 
 The samples were sequenced using the Applied Biosystems SOLiD 4 system with SOLiD MM50 chemistry.
-The length of the sequencing reads is 50 bases. Two biological replicates were obtained for each condition. One of the conditions corresponds to the fungus being cultured in human blood (wt_B_30_37) and the other in minimum medium (wt_M_30_37). The reads were mapped to the reference genome downloaded from the Ensembl Fungi database (47) (release 14) using Lifescope software. CLC Bio tools were used to quantify the gene expression.
 +The length of the sequencing reads is 50 bases. Two biological replicates were obtained for each condition. One of the conditions corresponds to the fungus being cultured in human blood (wt_B_30_37) and the other in minimum medium (wt_M_30_37). The reads were mapped to the reference genome downloaded from the Ensembl Fungi database (release 14) using Lifescope software. CLC Bio tools were used to quantify the gene expression.
 + 
 +{{:fusariumcountdataqc.txt.zip|F. oxysporum count matrix}}
 +(Please note that for confidentiality reasons the gene IDs have been removed) 
 + 
 + 
 +===== Prostate cancer data ===== 
 + 
 +This RNA-seq data set was downloaded from the SRA repository (ERP000550). In this study, Ren et al. (2012) sequenced tumor and healthy prostate samples from Chinese patients. There were 11 biological replicates for tumoral prostate (T) and 12 replicates for healthy prostate (N). Sequencing was done with an Illumina HiSeq 2000, and the reads were mapped to the human reference genome downloaded
 +from Ensembl (release 68) using TopHat 1.4.1. Gene expression was quantified using the HTSeq Python package,
 +version 0.5.3p3.
 + 
 +{{:prostatecancercountdata.txt.zip|Prostate Cancer count matrix}}
 + 
 + 
 + 
 +====== R code used to test the NOISeqBIO method ====== 
 + 
 +===== Simulation algorithm ===== 
 + 
 +R scripts for simulating RNA-seq count data for two experimental conditions:
 +
 +{{:simulation_high.r|Simulation of HIGH variability scenarios}}
 +
 +{{:simulation_low.r|Simulation of LOW variability scenarios}}
 + 
 + 
 +===== Differential expression on simulated data =====
  
-Count matrix can be downloaded HERE. Please note that for confidentiality reasons the gene IDs were removed.
 +These scripts show how the differential expression methods were applied to the simulated data sets:
  
 +{{:methodsonsimulations.r|DE methods on simulations}}
 
 +{{:globalsimulationsanalysis.r|Analysis of DE results}}