This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
formats [2010/05/04 23:04] aconesa |
formats [2011/08/31 09:51] (current) aconesa |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Data Formats ====== | ====== Data Formats ====== | ||
- | SEA accepts three types of data files: **expression**, **covariates** and **annotation**. Data files are always **tab-delimited** files. strange symbols (such as % , &, /, # , etc) and quotation marks ("") should be avoided in labels and gene names. Also the presence of spaces is not recommended. \\ | + | SEA accepts three types of data files: **expression**, **covariates** and **annotation**. Data files are always **tab-delimited**. **Strange symbols** (such as %, &, /, #, etc.) and quotation marks ("") should be **avoided** in labels and gene names. **Spaces** are also **not recommended**. \\ |
We next explain each data file in detail: | We next explain each data file in detail: | ||
\\ | \\ | ||
- | | + | \\ |
- | ***Expression data**: txt file with expression data, genes in rows, arrays in columns. The file must contain an additional row with sample names and a column with gene names. \\ | + | * **Expression data**: txt file with expression data, genes in rows, arrays in columns. The file must contain an additional row with sample names and a column with gene names. Column names should **not start with a number** and should also not contain strange characters: \\ |
- | |gene_names|sample_1|sample_2|sample_3|sample_4|sample_5|sample_6|sample_7|sample_8|sample_9| | + | |
- | |gene_1|0.18|0.53|0.43|-0.99|-2.45|-1.18|1.43|2.32|0.32| | + | |
- | + | ||
- | *//Covariates//: txt file with experimental design information, containing as many columns as arrays and as many rows as experimental factors. Each cell contains the value of the array in the experimental factor. E.g: | + | |
\\ | \\ | ||
+ | |Name|Array1|Array2|Array3|Array4|Array5|Array6|Array7|Array8|Array9|…| | ||
+ | |gene1|0.5|0.2|0.7|1.3|1.4|1.0|2.1|2.4|2.6|…| | ||
+ | |gene2|0.5|0.3|0.4|0.3|0.4|0.1|0.1|0.4|0.5|…| | ||
+ | |...|...|...|...|...|...|...|...|...|...|…| | ||
+ | \\ | ||
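The naming rules for the expression file can be checked before upload. The following is a minimal sketch, not part of SEA: the function name ''check_expression'' is hypothetical, and the set of "safe" characters (letters, digits, underscore, dot, hyphen) is an assumption, since the page only lists examples of symbols to avoid.

```python
import csv
import io
import re

# Assumption: a "safe" label uses only letters, digits, underscore, dot and
# hyphen. The page lists %, &, /, #, quotation marks and spaces as examples
# of what to avoid, but gives no exhaustive list.
SAFE_LABEL = re.compile(r"^[A-Za-z0-9_.\-]+$")


def check_expression(text):
    """Return a list of problems found in tab-delimited expression data."""
    # QUOTE_NONE keeps quotation marks as ordinary characters so they can be flagged.
    rows = list(csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE))
    if not rows:
        return ["file is empty"]

    problems = []
    header, body = rows[0], rows[1:]

    # The first header cell labels the gene column; the rest are array names.
    for name in header[1:]:
        if not SAFE_LABEL.match(name):
            problems.append(f"column name {name!r} contains a discouraged character")
        elif name[0].isdigit():
            problems.append(f"column name {name!r} starts with a number")

    for line_no, row in enumerate(body, start=2):
        if len(row) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} fields, found {len(row)}")
            continue
        if not SAFE_LABEL.match(row[0]):
            problems.append(f"line {line_no}: gene name {row[0]!r} contains a discouraged character")

    return problems
```

An empty list means no problems were detected under these assumptions; it does not guarantee that SEA will accept the file.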
+ | * **Covariates**: txt file with experimental design information, containing as many columns as arrays and as many rows as experimental factors. Each cell contains the value that the array takes for that experimental factor: \\ | ||
+ | \\ | ||
|Time|3|3|3|9|9|9|27|27|27|…| | |Time|3|3|3|9|9|9|27|27|27|…| | ||
|Treatment|Ctr|TrA|TrB|Ctr|TrA|TrB|Ctr|TrA|TrB|…| | |Treatment|Ctr|TrA|TrB|Ctr|TrA|TrB|Ctr|TrA|TrB|…| | ||
+ | \\ | ||
+ | Experimental factors must always have **more than one level**, i.e. two or more time-points, two or more treatments, etc. Otherwise, the experimental factor cannot be considered as such and should not be included in the covariates file. | ||
+ | \\ | ||
+ | \\ | ||
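The "more than one level" rule for covariates is easy to verify programmatically. This is a rough sketch under two assumptions: the file has no header row (as in the table above), and the first cell of each row is the factor name. The function names and the ''Batch'' factor in the usage note are hypothetical.

```python
def read_covariates(text):
    """Parse tab-delimited covariates: one row per factor, one column per array."""
    factors = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # First field is the factor name, the remaining fields are per-array values.
        name, *values = line.split("\t")
        factors[name] = values
    return factors


def single_level_factors(factors):
    """Return the names of factors that have fewer than two distinct levels."""
    return [name for name, values in factors.items() if len(set(values)) < 2]
```

For example, a row such as ''Batch'' with the value ''A'' for every array would be reported by ''single_level_factors'' and, per the rule above, should be left out of the covariates file.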
+ | * **Annotation**: txt file with functional annotation of genes. Two columns: gene and annotation label. If a gene has more than one annotation label, each label goes in a separate row: | ||
+ | \\ | ||
+ | |gene_1|GO:0005647| | ||
+ | |gene_1|GO:0097635| | ||
+ | |gene_2|GO:0000055| | ||
+ | |gene_3|GO:0087630| | ||
+ | |gene_3|GO:0008977| | ||
+ | |...|...| | ||
+ | \\ | ||
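Because a gene with several labels appears on several rows, reading the annotation file usually means grouping rows by gene. The sketch below shows one way to do that; ''read_annotation'' is a hypothetical helper, and it assumes exactly two tab-separated fields per row with no header row.

```python
from collections import defaultdict


def read_annotation(text):
    """Group a two-column, tab-delimited annotation file by gene."""
    annotation = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 tab-separated fields, found {len(fields)}"
            )
        gene, label = fields
        annotation[gene].append(label)
    # Return a plain dict mapping each gene to its list of labels.
    return dict(annotation)
```

Applied to the rows shown above, ''gene_1'' maps to two labels and ''gene_2'' to one.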
+ | For data file examples, see the [[examples| Worked Examples]] menu. | ||
+ | \\ |