Data Formats

SEA accepts three types of data files: expression, covariates and annotation. Data files must always be tab-delimited. Special symbols (such as %, &, /, #) and quotation marks ("") should be avoided in labels and gene names. Spaces are also not recommended.
We next explain each data file in detail:

  • Expression data: txt file with expression data, with genes in rows and arrays in columns. The file must contain an additional header row with sample names and a first column with gene names. Column names should not start with a number and should not contain special characters:


Name    Array1  Array2  Array3  Array4  Array5  Array6  Array7  Array8  Array9
gene1   0.5     0.2     0.7     1.3     1.4     1.0     2.1     2.4     2.6
gene2   0.5     0.3     0.4     0.3     0.4     0.1     0.1     0.4     0.5
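The expression layout above can be checked before upload with a short script. The following is only a sketch in Python using the standard library; the function name `read_expression`, the sample data and the `SAFE_LABEL` pattern are illustrative assumptions and are not part of SEA. In particular, the pattern is one possible reading of "no leading digit, no special characters", not a rule that SEA itself publishes.

```python
import csv
import io
import re

# Hypothetical example content; a real file would be opened from disk instead.
EXPRESSION_TEXT = (
    "Name\tArray1\tArray2\tArray3\n"
    "gene1\t0.5\t0.2\t0.7\n"
    "gene2\t0.5\t0.3\t0.4\n"
)

# Assumed rule: a letter first, then only letters, digits, underscores or dots.
SAFE_LABEL = re.compile(r"^[A-Za-z][A-Za-z0-9_.]*$")


def read_expression(handle):
    """Return (sample_names, {gene_name: [values]}) from a tab-delimited file."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # the first header cell labels the gene-name column

    bad = [s for s in samples if not SAFE_LABEL.match(s)]
    if bad:
        raise ValueError(f"unsafe sample names: {bad}")

    genes = {}
    for row in reader:
        if not row:  # skip empty lines
            continue
        name, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(
                f"{name}: expected {len(samples)} values, got {len(values)}"
            )
        genes[name] = [float(v) for v in values]
    return samples, genes


samples, genes = read_expression(io.StringIO(EXPRESSION_TEXT))
```

For a file on disk, the same function could be called as `read_expression(open("expression.txt", newline=""))`, where the file name is a placeholder.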


  • Covariates: txt file with experimental design information, containing as many columns as arrays and as many rows as experimental factors. Each cell contains the level of the experimental factor for that array:


Time       3    3    3    9    9    9    27   27   27
Treatment  Ctr  TrA  TrB  Ctr  TrA  TrB  Ctr  TrA  TrB


Experimental factors must always have more than one level, i.e. two or more time points, two or more treatments, etc. Otherwise, the factor cannot be treated as an experimental factor and should not be included in the covariates file.
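The two covariates rules (one value per array, and at least two levels per factor) can be verified with a few lines of code. This is a minimal sketch in Python, not part of SEA; the function name `read_covariates` is hypothetical, and it assumes that the first cell of each row holds the factor name, as in the example above.

```python
import csv
import io

# Hypothetical example content matching the covariates example for nine arrays.
COVARIATES_TEXT = (
    "Time\t3\t3\t3\t9\t9\t9\t27\t27\t27\n"
    "Treatment\tCtr\tTrA\tTrB\tCtr\tTrA\tTrB\tCtr\tTrA\tTrB\n"
)


def read_covariates(handle, n_arrays):
    """Return {factor_name: [level per array]} after checking both rules."""
    factors = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:  # skip empty lines
            continue
        name, levels = row[0], row[1:]  # assumption: first cell is the factor name
        if len(levels) != n_arrays:
            raise ValueError(
                f"{name}: expected {n_arrays} values, got {len(levels)}"
            )
        if len(set(levels)) < 2:
            raise ValueError(f"{name}: a factor needs at least two levels")
        factors[name] = levels
    return factors


factors = read_covariates(io.StringIO(COVARIATES_TEXT), n_arrays=9)
```

Passing the number of arrays explicitly keeps the check independent of how the expression file was read.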

  • Annotation: txt file with functional annotation of genes. Two columns: gene and annotation. If a gene has more than one annotation label, each label is given in a separate row:


gene_1  GO:0005647
gene_1  GO:0097635
gene_2  GO:0000055
gene_3  GO:0087630
gene_3  GO:0008977
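Because a gene with several labels appears on several rows, a reader of this file usually has to group rows by gene. The sketch below shows one way to do that in Python; the function name `read_annotation` is hypothetical and the code is an illustration of the two-column layout, not SEA's own parser.

```python
import csv
import io
from collections import defaultdict

# Hypothetical example content: one gene/label pair per row, tab-delimited.
ANNOTATION_TEXT = (
    "gene_1\tGO:0005647\n"
    "gene_1\tGO:0097635\n"
    "gene_2\tGO:0000055\n"
    "gene_3\tGO:0087630\n"
    "gene_3\tGO:0008977\n"
)


def read_annotation(handle):
    """Return {gene_name: [annotation labels]} from a two-column file."""
    annotation = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if not row:  # skip empty lines
            continue
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row}")
        gene, label = row
        annotation[gene].append(label)
    return dict(annotation)


annotation = read_annotation(io.StringIO(ANNOTATION_TEXT))
```

Labels are kept in file order, so repeated rows for the same gene simply extend that gene's list.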


For data file examples, go to the Worked Examples menu.

formats.txt · Last modified: 2011/08/31 09:51 by aconesa
CC Attribution-Noncommercial-Share Alike 3.0 Unported