

Data Formats

SEA accepts three types of data files: expression, covariates and annotation. Data files must always be tab-delimited. Special symbols (such as %, &, /, #) and quotation marks (“”) should be avoided in labels and gene names. Spaces are also not recommended.
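As a rough illustration of the naming advice above, the following sketch flags labels that contain discouraged characters. The character set is an assumption: it covers only the symbols named in the text, straight and curly quotation marks, and whitespace. The function name `problem_labels` and the sample labels are hypothetical and not part of SEA.

```python
# Assumed set: only the symbols the text names explicitly, plus quotation marks.
# The original list ends in "etc", so other symbols may also cause problems.
DISCOURAGED = set('%&/#"\'“”')


def problem_labels(labels):
    """Return the labels that contain a discouraged symbol or any whitespace."""
    return [
        label
        for label in labels
        if any(ch in DISCOURAGED or ch.isspace() for ch in label)
    ]


# Hypothetical labels, for illustration only.
flagged = problem_labels(["gene_1", "gene 2", "TNF/alpha", "sample_3", "50%_dose"])
print(flagged)
```

Running a check like this on gene and sample names before uploading may catch problems early.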
We next explain each data file in detail:

  • Expression data: text file with genes in rows and arrays in columns. The file must contain a header row with sample names and a first column with gene names:


gene_names  sample_1  sample_2  sample_3  sample_4  sample_5  sample_6  sample_7  sample_8  sample_9
gene_1      0.18      0.53      0.43      -0.99     -2.45     -1.18     1.43      2.32      0.32
gene_2      1.22      0.91      1.52      1.23      -0.33     1.58      -2.31     -1.27     -1.08
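The expression layout can be checked with a short script before uploading. This is a minimal sketch that assumes the header row starts with a label for the gene-name column, as in the example. The function name `read_expression` is hypothetical and not part of SEA, and the sample data is a shortened copy of the example.

```python
import csv
import io


def read_expression(handle):
    """Parse a tab-delimited expression table into (sample_names, {gene: values})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # the first header cell labels the gene-name column
    expression = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(
                f"{gene}: expected {len(samples)} values, got {len(values)}"
            )
        expression[gene] = [float(v) for v in values]
    return samples, expression


# First three arrays of the example above, written with real tab characters.
example = (
    "gene_names\tsample_1\tsample_2\tsample_3\n"
    "gene_1\t0.18\t0.53\t0.43\n"
    "gene_2\t1.22\t0.91\t1.52\n"
)
samples, expression = read_expression(io.StringIO(example))
print(samples, expression)
```

To check a real file, pass an open file handle instead of the `io.StringIO` wrapper.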


  • Covariates: text file with experimental design information, containing as many columns as arrays and as many rows as experimental factors. Each row begins with the factor name, followed by the value of each array for that factor:


Time       3    3    3    9    9    9    27   27   27
Treatment  Ctr  TrA  TrB  Ctr  TrA  TrB  Ctr  TrA  TrB


Experimental factors must always have more than one level, e.g. two or more time points, or two or more treatments. A factor with a single level cannot be considered an experimental factor and should not be included in the covariates file.
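The rule that every factor needs more than one level can be checked along these lines. This is a sketch that assumes each row starts with the factor name, as in the example. The function names and the single-level `Batch` row are hypothetical, added only to show a factor that would be rejected.

```python
import csv
import io


def read_covariates(handle):
    """Parse a tab-delimited covariates table into {factor_name: [value per array]}."""
    factors = {}
    for row in csv.reader(handle, delimiter="\t"):
        if row:  # skip blank lines
            factors[row[0]] = row[1:]
    return factors


def single_level_factors(factors):
    """Return the names of factors that have fewer than two distinct levels."""
    return [name for name, values in factors.items() if len(set(values)) < 2]


# Time and Treatment follow the example above; Batch is a made-up
# single-level factor that should not appear in a covariates file.
example = (
    "Time\t3\t3\t3\t9\t9\t9\t27\t27\t27\n"
    "Treatment\tCtr\tTrA\tTrB\tCtr\tTrA\tTrB\tCtr\tTrA\tTrB\n"
    "Batch\tA\tA\tA\tA\tA\tA\tA\tA\tA\n"
)
factors = read_covariates(io.StringIO(example))
invalid = single_level_factors(factors)
print(invalid)
```

Any factor reported by `single_level_factors` would need to be removed from the file.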

  • Annotation: text file with functional annotation of genes. Two columns: gene name and annotation label. If a gene has more than one annotation label, each label is given in a separate row:


gene_1  GO:0005647
gene_1  GO:0097635
gene_2  GO:0000055
gene_3  GO:0087630
gene_3  GO:0008977
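Because a gene can appear in several rows, a reader for this format typically groups labels by gene. The sketch below shows one way to do that. The function name `read_annotation` is hypothetical and not part of SEA, and the data repeats the example rows with real tab characters.

```python
import csv
import io
from collections import defaultdict


def read_annotation(handle):
    """Parse a two-column, tab-delimited annotation file into {gene: [labels]}."""
    annotation = defaultdict(list)
    for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 columns, got {len(row)}"
            )
        gene, label = row
        annotation[gene].append(label)  # one row per gene/label pair
    return dict(annotation)


example = (
    "gene_1\tGO:0005647\n"
    "gene_1\tGO:0097635\n"
    "gene_2\tGO:0000055\n"
    "gene_3\tGO:0087630\n"
    "gene_3\tGO:0008977\n"
)
annotation = read_annotation(io.StringIO(example))
print(annotation)
```

The column-count check also catches files where gene and label were separated by something other than a tab.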


formats.1273007808.txt.gz · Last modified: 2010/05/04 23:16 by aconesa
CC Attribution-Noncommercial-Share Alike 4.0 International